Retired no_illustration pages being presented?
by tfmorris
I was looking at the API network traffic out of curiosity and when this page:
was displayed, it was associated with the metadata below from the API. Without digging into the API documentation, my naive reading is that this page has been classified 13 times out of 13 as having no illustrations, was "retired" for that reason on May 14, and currently has a state of "complete," yet here it is on display in my browser. What's going on?
Here's a table of the 10 pages in that batch from the API:
zooniverse_id state classification_count no_illustrations_count has_illustrations_count skip_count updated_at retire_reason
ASC0000tcf complete 13 13 2015-05-14T14:07:44Z detected_no_illustrations
ASC0000tw5 complete 24 21 1 2 2015-05-14T14:12:49Z detected_no_illustrations
ASC0000r1z complete 16 15 1 2015-05-14T13:46:39Z detected_no_illustrations
ASC0000vqe complete 6 6 2015-05-14T14:29:07Z detected_no_illustrations
ASC0000ogk complete 12 8 3 1 2015-05-14T13:22:33Z detected_no_illustrations
ASC0000rva complete 20 20 2015-05-14T13:54:40Z detected_no_illustrations
ASC0000osn complete 10 10 2015-10-15T00:21:12Z detected_no_illustrations
ASC0000wva complete 25 25 2015-05-14T14:39:02Z detected_no_illustrations
ASC0000v62 complete 13 12 1 2015-05-14T14:23:51Z detected_no_illustrations
ASC0000v4g complete 8 8 2015-05-14T14:23:27Z detected_no_illustrations
From the looks of it, none of these pages should be presented to me. If it makes a difference, I'd selected a book from the Periodicals page: "Wiltshire archaeological and natural history magazine."
{
  "id": "54f4b8a230017b04c9003685",
  "activated_at": "2015-03-07T07:05:47Z",
  "classification_count": 13,
  "coords": [],
  "created_at": "2015-03-03T19:21:00Z",
  "group": {
    "_id": "54f4b81b30017b04c9000009",
    "zooniverse_id": "GSC0000007",
    "name": "wiltshirearchaeo"
  },
  "group_id": "54f4b81b30017b04c9000009",
  "location": {
    "standard": "http:\/\/zooniverse-static.s3.amazonaws.com\/www.sciencegossip.org\/subjects\/standard\/54f4b8a230017b04c9003685.jpg",
    "thumb": "http:\/\/zooniverse-static.s3.amazonaws.com\/www.sciencegossip.org\/subjects\/thumb\/54f4b8a230017b04c9003685.jpg"
  },
  "metadata": {
    "contributor": "Natural History Museum Library, London",
    "item_id": "45554",
    "no_illustrations_count": 13,
    "original_size": {
      "width": 1680,
      "height": 2974
    },
    "page_id": "12643735",
    "page_no": null,
    "page_seq": "441",
    "sponsor": "Natural History Museum Library, London",
    "volume": "v.33=no.99-102 (1903-1904)",
    "year": "1904 - 1904",
    "retire_reason": "detected_no_illustrations"
  },
  "project_id": "54f42c0ab35d2e06bd000001",
  "random": 0.1300746338013,
  "state": "complete",
  "updated_at": "2015-05-14T14:07:44Z",
  "workflow_ids": [
    "54f42c32b35d2e06bd000002"
  ],
  "zooniverse_id": "ASC0000tcf"
},
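For anyone who wants to repeat the check, here is a minimal Python sketch of the naive reading described above. It uses a trimmed copy of the record for ASC0000tcf. The rule in is_retired (state is "complete", or a retire_reason is present in the metadata) is only my assumption about what those fields mean, not something taken from the API documentation, and the "active" state in the usage note is likewise a guess.

```python
import json

# Trimmed copy of the subject record above; only the fields used here are kept.
subject_json = """
{
  "zooniverse_id": "ASC0000tcf",
  "state": "complete",
  "classification_count": 13,
  "metadata": {
    "no_illustrations_count": 13,
    "retire_reason": "detected_no_illustrations"
  },
  "updated_at": "2015-05-14T14:07:44Z"
}
"""


def is_retired(subject: dict) -> bool:
    """Assumed rule (not from the API docs): a subject counts as retired
    if its state is "complete" or its metadata carries a retire_reason."""
    has_reason = "retire_reason" in subject.get("metadata", {})
    return subject.get("state") == "complete" or has_reason


subject = json.loads(subject_json)
print(subject["zooniverse_id"], "retired:", is_retired(subject))
```

Under that assumption the page counts as retired, which is why seeing it in the browser looks wrong. A record with a hypothetical state of "active" and no retire_reason would not be flagged.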
by eatyourgreens admin
Hi,
Thanks very much for this. The API definitely should not be sending back completed subjects, since those have been fully classified. Subjects tagged with "retire_reason": "detected_no_illustrations" are pages that were detected as having no illustrations by an OCR algorithm, so it could be that those weren't properly pulled from the active pool of classification subjects.
I could filter the API call in the browser, but I suspect that this would lead to the front-end reporting "no more data" for the project, which isn't true either.
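To make the concern concrete, here is a rough Python sketch of that kind of client-side filter. It is not the front-end's actual code: filter_active, the message strings, and the idea that an empty batch triggers "no more data" are assumptions for illustration. The IDs and states come from the batch reported above.

```python
def filter_active(subjects):
    """Drop subjects the API marks as complete (hypothetical client-side filter)."""
    return [s for s in subjects if s.get("state") != "complete"]


# The ten subjects from the reported batch, all with state "complete".
ids = ["ASC0000tcf", "ASC0000tw5", "ASC0000r1z", "ASC0000vqe", "ASC0000ogk",
       "ASC0000rva", "ASC0000osn", "ASC0000wva", "ASC0000v62", "ASC0000v4g"]
batch = [{"zooniverse_id": zid, "state": "complete"} for zid in ids]

shown = filter_active(batch)

# Assumed front-end behaviour: an empty batch is treated as the end of the data.
message = "no more data" if not shown else "showing %d subjects" % len(shown)
print(message)
```

With this batch every subject is filtered out, so the list comes back empty even though the project still has unclassified pages. That is why the fix belongs in the API rather than in the browser.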
Thanks, by the way, for raising this as a GitHub issue too.
Jim
by eatyourgreens admin
This should be fixed now, and you should be seeing more illustrated pages for Wiltshire Archaeology, Quarterly Journal of the Geological Society and Hardwicke's Science Gossip.
Jim
by yshish moderator
Great! Thanks, guys 😃