Wikipedia:Bots/Requests for approval/InternetArchiveBot 3

InternetArchiveBot 3

 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was

Operator:

Time filed: 19:17, Tuesday, June 4, 2019 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): PHP

Source code available: Not at present

Function overview: Bluelink book references wherever possible.

Links to relevant discussions (where appropriate): Unanimous discussion

Edit period(s): Continuous

Estimated number of pages affected: mainspace

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: IABot will work through a database subset and crawl recent changes to maintain an index of available books and cited books. It will use that index to determine which pages cite books that can be blue-linked, and then make the edit. The end result is that the book title and/or the page number becomes a link to the previewable book. If the title is already wikilinked elsewhere, only the page number is linked, and vice versa. If there is a page number, IABot will link directly to that page in the book; if not, the book will be linked without a page number. Links to book searches and for-profit entities will be replaced.
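The linking rules in the function details can be sketched roughly as follows. This is only an illustration in Python, not IABot's code (the source is not published and the bot is written in PHP). The helper names `book_url` and `plan_links`, the `page_linked` flag, and the `https://archive.org/details/<identifier>/page/<n>` URL layout are assumptions made for the example, as is the choice to link nothing when the title is already wikilinked and no page number is given.

```python
from typing import Dict, Optional

# Assumed URL layout for an archive.org book preview (not taken from the BRFA).
ARCHIVE_BASE = "https://archive.org/details"


def book_url(identifier: str, page: Optional[str] = None) -> str:
    """Build a link to the book, pointing at a specific page when one is known."""
    url = f"{ARCHIVE_BASE}/{identifier}"
    if page:
        url += f"/page/{page}"
    return url


def plan_links(
    identifier: str,
    page: Optional[str],
    title_wikilinked: bool,
    page_linked: bool = False,
) -> Dict[str, str]:
    """Decide which parts of a citation get linked (hypothetical helper).

    - Title already wikilinked: only the page number is linked.
    - Page number already linked: only the title is linked.
    - No page number: the book is linked without a page.
    """
    url = book_url(identifier, page)
    plan: Dict[str, str] = {}
    if not title_wikilinked:
        plan["title"] = url
    if page and not page_linked:
        plan["page"] = url
    return plan
```

For example, `plan_links("examplebook00auth", "42", title_wikilinked=True)` yields a plan that links only the page number, while a citation with no page number and an unlinked title yields a plan that links only the title.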

Discussion

 * Does the bot also work with citations that have no ISBN and/or books where there is more than one book, edition etc. under the same title? Jo-Jo Eumerus (talk, contributions) 16:05, 14 June 2019 (UTC)
 * In the future it will, but for now it only deals with ISBNs, which are unique to editions. Down the road it will handle matching of books that lack ISBNs. — CYBERPOWER  (Around ) 03:24, 15 June 2019 (UTC)

As per usual, please permalink contribs here when done & take all the time you need. -- The SandDoctor Talk 07:29, 16 June 2019 (UTC)

Trial

 * 1) Edit 1
 * 2) Edit 2
 * 3) Edit 3
 * 4) Edit 4
 * 5) Edit 5
 * 6) Edit 6
 * 7) Edit 7
 * 8) Edit 8
 * 9) Edit 9
 * 10) Edit 10
 * 11) Edit 11
 * 12) Edit 12
 * 13) Edit 13
 * 14) Edit 14
 * 15) Edit 15
 * 16) Edit 16
 * 17) Edit 17
 * 18) Edit 18
 * 19) Edit 19
 * 20) Edit 16
 * 21) Edit 20
 * 22) Edit 21
 * 23) Edit 22
 * 24) Edit 23
 * 25) Edit 24
 * 26) Edit 25
 * 27) Edit 26
 * 28) Edit 27
 * 29) Edit 28
 * 30) Edit 29
 * 31) Edit 30
 * 32) Edit 31
 * 33) Edit 32
 * 34) Edit 33
 * 35) Edit 34
 * 36) Edit 35
 * 37) Edit 36
 * 38) Edit 37
 * 39) Edit 38
 * 40) Edit 39
 * 41) Edit 40
 * 42) Edit 41
 * 43) Edit 42
 * 44) Edit 43
 * 45) Edit 44
 * 46) Edit 45
 * 47) Edit 46
 * 48) Edit 47
 * 49) Edit 48
 * 50) Edit 49
 * 51) Edit 50


 * Edits with bugs were reversed and re-attempted after the bug was fixed. This was a controlled manual run.— CYBERPOWER  ( Chat ) 13:47, 1 July 2019 (UTC)
 * Ping the .— CYBERPOWER  ( Chat ) 14:18, 1 July 2019 (UTC)
 * I have a few queries:
 * Why is the page number linked to the URL when url is also set to that same URL? That creates two links in the citation that lead to the same place, which seems strange and confusing. General practice, from what I've seen, is to add a link only through url.
 * I don't think Google Books links should be removed. For one, that would create a heavy reliance on Archive.org: if, for some reason, Archive.org had to stop offering page previews or this form of library loan, we would lose the links to Google Books previews. It would also override the decisions of individual editors to include Google Books links, and individual editorial decisions shouldn't be overridden by a bot, IMO. Citing sources is somewhat relevant.
 * Also, I'm curious, as a ballpark, how many edits you think the bot will make. There are about a million articles with ISBNs, so I'd think >~100000?
 * Galobtter (pingó mió) 19:40, 1 July 2019 (UTC)
 * Thanks for the questions. This was done after careful consideration. It makes it easier for the reader to access the book preview. As it is, in many cases the title of a book reference is wikilinked to a different article, which makes the title unavailable to be linked via url=. Also, in terms of ease of access, I believe that a reader who looks at the reference and wants direct access to the material or page is more likely to click on the page number. That was the general idea behind it. Hence I made a demo edit and proposed it as such in the linked discussion above, which got unanimous support. As for the second question, moving from a for-profit organization to a non-profit organization that gives free access to the complete book seemed like a no-brainer. The page number in a Google Books URL is not lost; it is carried over to the new URL. So, in short, readers will still see the same book and the same page, just on a different site. As for the ballpark, the initial run is calculated to be around 120,000 pages. After that, the bot monitors for additions to the archive.org library and watches recent changes. — CYBERPOWER  ( Chat ) 20:07, 1 July 2019 (UTC)
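As a rough illustration of how a page number might be carried over from a Google Books URL, here is a minimal Python sketch. It is not IABot's implementation. It assumes the page is encoded in the `pg=PA<number>` query parameter of the Google Books link and that the archive.org target follows the `https://archive.org/details/<identifier>/page/<n>` layout. The function names and the identifier argument are hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def google_books_page(url: str) -> Optional[str]:
    """Return the page number from a Google Books URL, or None if absent.

    Assumes the page is given as pg=PA<number>; other forms are ignored.
    """
    query = parse_qs(urlparse(url).query)
    pg = query.get("pg", [""])[0]
    match = re.fullmatch(r"PA(\d+)", pg)
    return match.group(1) if match else None


def to_archive_url(google_url: str, identifier: str) -> str:
    """Build an archive.org link that keeps the page from the old URL.

    `identifier` stands for the archive.org item matched to the cited book;
    how that match is made (for example by ISBN) is outside this sketch.
    """
    base = f"https://archive.org/details/{identifier}"
    page = google_books_page(google_url)
    return f"{base}/page/{page}" if page else base
```

If the old URL carries no recognizable page parameter, the sketch falls back to linking the book as a whole.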


 * Re Google Books. "Google Books is a bit of a ghost town. The Google Books blog, and Google’s library newsletter were shut down long ago. And the leading visionaries behind Google Books have all moved on to dream other dreams." There are other articles about the possible demise of Google Books. Google has shut down other favorite services (RSS reader, etc.), so its long-term viability is uncertain. OTOH, archive.org is taking this as an opportunity and devoting a lot of resources to becoming a bigger and better Google Books, as it should: a non-profit library is always preferable to a commercial bookseller. -- Green  C  15:13, 2 July 2019 (UTC)

Under normal circumstances, I would prefer to leave the close for someone else. However, given the backlog, lack of recent BAG activity (myself included), and the fact that this task is uncontroversial and based on how well the trial went, I am inclined to make an exception for this. As per usual, if amendments to - or clarifications regarding - this approval are needed, please start a discussion on the talk page and ping. -- The SandDoctor Talk 00:51, 13 July 2019 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.