Wikipedia:Bots/Requests for approval/CobraBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

CobraBot
Operator: Cybercobra

Automatic or Manually assisted: Automatic (originally Manual, at least for now; may file another BRfA for Automatic if this goes off without any hitches)

Programming language(s): Python (pywikipedia)

Source code available: User:CobraBot/Code (Will post shortly)

Function overview: Add OCLC# parameter to pages that use Template:Infobox Book based on ISBN in the infobox, if it is given.

Edit period(s): Several runs as my time permits until either task complete; periodic re-runs (e.g. quarterly) as new pages & -es added or BRfA for automatic running filed

Estimated number of pages affected: However many pages transclude Template:Infobox Book and specify an ISBN (55% maybe?); several thousand over the first week

Exclusion compliant (Y/N): Y (via pywikipedia defaults); originally N (Don't know how to easily implement it)

Already has a bot flag (Y/N): N (needs one)

Function details:
 * 1) Bot chooses an article that transcludes Template:Infobox Book
 * 2) Bot locates the template in the article
 * 3) Bot checks if the <code>oclc</code> parameter is present
 * 4) If yes, and value is non-whitespace, page is skipped (OCLC# already present). GOTO step 1.
 * 5) If yes, and value is whitespace, parameter is removed.
 * 6) If no, continue.
 * 7) Bot grabs the <code>isbn</code> parameter
 * 8) If parameter not present, page is skipped (No ISBN to use for OCLC# lookup). GOTO step 1.
 * 9) The value of the parameter is obtained, extra preceding "ISBN" text or dashes are stripped from the obtained value
 * 10) If the value is "N/A" or similar, page is skipped (No useful ISBN to use for OCLC# lookup). GOTO step 1.
 * 11) Using a proprietary process, the corresponding OCLC# is found for the given ISBN. The title of the work corresponding to the OCLC# is also obtained.
 * 12) The OCLC# is added to the infobox body using the <code>oclc</code> parameter
 * 13) (In assisted editing mode only) The bot operator is presented with the title of the WP page, ISBN, OCLC#, and OCLC title and asked to confirm the change.
 * 14) Page changes are saved.
 * 15) GOTO 1 until all pages either processed or skipped.
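Steps 3 to 10 above can be sketched roughly as follows. This is only an illustration, not the bot's posted code: the parameter names <code>oclc</code> and <code>isbn</code>, the placeholder list, and the helper names <code>clean_isbn</code> and <code>decide</code> are assumptions, and the infobox is assumed to have already been parsed into a dict. The plausible-length check reflects the change mentioned later in the discussion.

```python
import re

# Hypothetical placeholder values treated as "no usable ISBN" (step 10).
PLACEHOLDERS = {"", "n/a", "na", "none", "unknown"}


def clean_isbn(raw):
    """Strip a leading "ISBN" label, dashes and spaces (step 9).

    Returns the bare ISBN string, or None when the value is a
    placeholder or is not a plausible 10- or 13-character ISBN.
    """
    value = raw.strip()
    if value.lower() in PLACEHOLDERS:
        return None
    # Drop a leading "ISBN" label, with optional colon or spaces.
    value = re.sub(r"^isbn[\s:]*", "", value, flags=re.IGNORECASE)
    # Drop dashes and internal whitespace.
    value = re.sub(r"[\s\-]", "", value).upper()
    # ISBN-10 may end in the check character X; ISBN-13 is all digits.
    if re.fullmatch(r"\d{9}[\dX]|\d{13}", value):
        return value
    return None


def decide(params):
    """Apply steps 3-10 to a dict of infobox parameters.

    Returns ("skip", reason) or ("lookup", isbn).
    """
    oclc = params.get("oclc")
    if oclc is not None and oclc.strip():
        return ("skip", "OCLC already present")       # step 4
    if "isbn" not in params:
        return ("skip", "no ISBN parameter")          # step 8
    isbn = clean_isbn(params["isbn"])
    if isbn is None:
        return ("skip", "no usable ISBN")             # step 10
    return ("lookup", isbn)                           # on to step 11
```

For example, <code>decide({"isbn": "ISBN 0-441-32800-8"})</code> yields a lookup for the bare ISBN, while a page whose infobox already carries a non-blank OCLC value is skipped. The OCLC lookup itself (step 11) is left out because the request does not describe it.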

Discussion
After some further review of the trial edits, this bot seems to be doing just fine. No concerns, and good bot task - Kingpin13 (talk) 10:09, 25 September 2009 (UTC)
 * I've run the bot for testing without having it actually modify the pages on a decent number of articles and think all the bugs are worked out. The bot is more conservative than necessary in that the code for finding where the infobox ends is rather dumb and will think the template ends early if its body has another template within it, thus it might end up skipping some pages it otherwise could help. --Cyber cobra (talk) 00:19, 24 September 2009 (UTC)
 * Bot also skips cases where the parameter value is empty except for a comment. Also, bot has been much refactored (see updated code page) and is being re-tested. --Cyber cobra (talk) 06:28, 24 September 2009 (UTC)
 * Changed type to Automatic after not observing problems after significant testing. --Cyber cobra (talk) 07:42, 24 September 2009 (UTC)
 * Currently running code in assisted editing mode for demonstration/testing. --Cyber cobra (talk) 17:29, 24 September 2009 (UTC)
 * Assisted editing run with human oversight complete. 50 edits for examination. Only issues were an attempt to edit a talkpage (code now ensures pages are in article namespace) and one apparent deficiency in WorldCat's database (Heretics of Dune's ISBN maps to the OCLC# of its French translation). --Cyber cobra (talk) 18:22, 24 September 2009 (UTC)
 * After even more testing, code now changed to ensure ISBN is of plausible length --Cyber cobra (talk) 23:48, 24 September 2009 (UTC)
 * And with lots of further testing, a couple of rare corner cases were found and now handled. I am confident any remaining issues must be ridiculously obscure & infrequent. --Cyber cobra (talk) 05:30, 25 September 2009 (UTC)
 * Great, the edits you've done so far look pretty good, I'll look at them more thoroughly later. I think making the bot exclusion compliant is fairly easy in pywikipedia, although I don't actually know how the language works. I have this from a previous BRfA: pywikipedia default setting/<tt>ignore_bot_templates = False</tt>. Do you think you could use this to make this bot exclusion compliant please? Thanks - Kingpin13 (talk) 08:26, 25 September 2009 (UTC)
 * After grepping the source, pywikipedia appears to be exclusion-compliant by default; I confirmed by testing on Three Men in a Boat. --Cyber cobra (talk) 08:49, 25 September 2009 (UTC)
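For readers unfamiliar with exclusion compliance, the kind of check discussed above can be sketched as follows. This is a simplified illustration of the <code>{{bots}}</code>/<code>{{nobots}}</code> convention, not pywikipedia's own implementation; the function name <code>allowed_to_edit</code> is hypothetical, and only the first <code>allow=</code> or <code>deny=</code> list on the page is considered.

```python
import re


def allowed_to_edit(wikitext, bot_name):
    """Simplified {{bots}}/{{nobots}} check (an assumption-laden sketch)."""
    # A bare {{nobots}} excludes every bot.
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.IGNORECASE):
        return False
    # Look for {{bots|allow=...}} or {{bots|deny=...}}.
    match = re.search(
        r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}|]*)\}\}",
        wikitext,
        re.IGNORECASE,
    )
    if not match:
        return True  # no exclusion template found
    kind = match.group(1).lower()
    names = {n.strip().lower() for n in match.group(2).split(",")}
    listed = "all" in names or bot_name.lower() in names
    # allow: only listed bots may edit; deny: listed bots may not.
    return listed if kind == "allow" else not listed
```

A page containing <code>{{bots|deny=CobraBot}}</code> would be skipped, while one with no such template would be edited as usual.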
 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.