Being assigned the same page more than once?
-
by ssgiris moderator
What should folks do if they are assigned the same page to classify more than once?
I've gotten pages more than once - I recognize them because of the odd illustration, or the contents of a table.
Should we skip those? Hashtag? Classify again?
@DMZ who should we contact to see if this is a bug?
Posted
-
by yshish moderator in response to ssgiris's comment.
I would report those to the developers, since it shouldn't happen and could be caused by a bug - it depends on the image ID. You can check the comments. If the image ID numbers differ, the page was just uploaded twice.
Zuzi
Posted
-
by DZM admin
The best thing to do would be to save the IDs, because it's possible that they are repeat scans rather than repeat serving of the same image.
We'll want to be sure that the system really is serving repeats before we try to figure out why it is. 😃
Thanks!!
(P.S.: It's D-Z-M. 😃 The DMZ is something else entirely!)
Posted
-
by yshish moderator
Hey,
I checked my Recents and found some images more than once there - I checked the image ID and it is the same for both! All were classified as 'no illustration' ones. But I think they shouldn't appear more than once anyway.
Here they are:
There may be more of them... I'm not about to go through them all .)
Zuzi
Posted
-
by ssgiris moderator
@DZM 😃
I've been skipping the second servings of a page I recognize, so I'm not sure how I would go about finding them.
ssgiris
Posted
-
by yshish moderator in response to ssgiris's comment.
@DZM Do skipped images appear among the 'Recent' ones?
If so, then you, @ssgiris, could find them there.
Zuzi
Edit: I ran an experiment and yes, they do! So you can go through the thumbnails (as long as there aren't 324532543 of them), open each of the two matching ones in a different tab, and compare their ID numbers.
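For anyone with a long Recents list, the ID comparison described above can be sketched in a few lines of Python. This is only an illustration: it assumes you have copied the image IDs from your Recents by hand into a list, and the function name `find_repeated_ids` and the sample list are made up for the example. It can only catch repeat servings of the same image (same ID); repeat scans of a page under different IDs will not show up.

```python
from collections import Counter


def find_repeated_ids(recent_ids):
    """Return the image IDs that occur more than once, in first-seen order."""
    counts = Counter(recent_ids)
    seen = set()
    repeated = []
    for image_id in recent_ids:
        # Report each repeated ID only once, the first time it is encountered.
        if counts[image_id] > 1 and image_id not in seen:
            seen.add(image_id)
            repeated.append(image_id)
    return repeated


# Hypothetical list, typed in by hand from the Recents thumbnails.
recent_ids = ["ASC0000c91", "ASC0000gpb", "ASC0000c91", "ASC00001u5", "ASC0000gpb"]
print(find_repeated_ids(recent_ids))
```

Any ID the function returns was served more than once and is worth reporting.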
Posted
-
by jules moderator in response to yshish's comment.
Yes, they do, but finding the first instance of a page you are sure is a repeat isn't easy when you've classified a fair few! I've had some I'm sure I've seen before, but there are too many to search.
Posted
-
by DZM admin in response to yshish's comment.
Alright, @yshish -- thanks for the report and the confirmed IDs. Keep us posted if you find others!
I will put through an issue, but I'm not going to make a wild amount of noise about it yet. I am not too concerned if this happens less than 1% of the time, but if it's becoming a more common thing, I will start raising alarms. 😃 Thanks!!
Posted
-
by yshish moderator in response to DZM's comment.
Hey,
OK. I will.
The thing I worry a bit more about is the 'No data' notification. It happens almost every time I start classifying; sometimes reloading doesn't help and it appears again. Sometimes it even pops up after I finish a page, instead of a new one loading...
It actually reminds me of the recent issue on Floating Forests, when there was no data from one of the locations. Could the same thing be happening here? (I mean that one of the documents has been completely classified and there are no unclassified pages available..?)
Just an idea:]
Zuzi
Posted
-
by jules moderator
Here's another:
ASC0000c91
First got this 13 March and then again yesterday (29 March). Same ID.
Posted
-
by yshish moderator
Another report of a repeating image: ASC0000gpb
Posted
-
by yshish moderator
I have noticed that some images display twice among my Recents even though I classified them only once. It could be caused by pausing during the classification. I'm curious whether the classification is counted as skipped when the image goes to my Recents before I finish it.
@DZM If I send you IDs, are you able to figure it out? There are more of them.
Working on tablet.
Zuzi
Posted
-
by DZM admin
The more IDs that I get, the more I can give to the devs... 😃
Posted
-
by yshish moderator in response to DZM's comment.
Ok. The last ones:
- ASC00001u5 - the same ID twice
- ASC00006gk - the same ID twice
Looks like two different scans of the same page, ID numbers are different!
Will look for some others later... the Plankton is calling! 😃
Thanks!
Posted
-
by yshish moderator
Other repeated images: ASC0000if7, ASC000057p, ASC0000a47 and ASC0003dec (each with the same ID)
Posted
-
by tfmorris
These two: ASC0000gce ASC0000gt6
are from the same journal page scanned multiple times at the Internet Archive, and that duplication was propagated all the way through the pipeline at BHL, then SG. The entire volume of the journal is probably going to end up getting processed twice.
Posted