Wikipedia:Bots/Requests for approval/HersfoldBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

HersfoldBot
Operator: Hers fold  (t/a/c)

Automatic or Manually Assisted: Automatic, but only run when supervised.

Programming Language(s): Java, using User:MER-C/Wiki.java.

Function Overview: HersfoldBot will transwiki articles from Category:Copy to Wiktionary to Wiktionary using the Special:Import function. This bot will also need approval and admin/import rights over there before it can fully operate, however testing of the import function is possible at test.wikipedia.org.

Edit period(s): When needed, probably no more than once a day.

Already has a bot flag (Y/N): No. It should not need one unless required by policy - it would be preferred to have the bot's edits show up in RC so they can be noticed and the imported articles dealt with.

Function Details: HersfoldBot will collect the list of articles (only pages in the main namespace) from Category:Copy to Wiktionary and complete the following execution cycle for each article. The bot will ignore any article that has been tagged with TooManyRevisions; that template's function is explained later on. All of the actions the bot makes are logged to a text file on my computer so that I can review what happened, why it stopped running, and whether or not I need to go in myself to clean up some of the things it wasn't able to do (see next paragraph).
 * 1) Determine if Wiktionary already has a page existing at wikt:Transwiki: .
 * 2) If not, the bot will attempt to import the full history of the article using Special:Import at Wiktionary (through the use of the API).
 * 3) If the import is successful, the bot will replace the Transwiki template (Copy to Wiktionary or one of its redirects) on Wikipedia with TWCleanup and log the transwiki both at Transwiki log/Articles moved from here/en.wiktionary and wikt:Wiktionary:Transwiki log.
 * 4) If there is already a transwikied article by that title, the bot will not import, but simply replace the transwiki template with TWCleanup2.
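The four steps above can be read as a small decision procedure. The following is only a minimal sketch of that reading, not the bot's actual code: the `Wikis` interface, the `FakeWikis` stand-in, and every method name are hypothetical placeholders rather than anything from Wiki.java or the MediaWiki API, and the sketch assumes that a failed import leaves the article untouched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the per-article cycle. None of these names come from
// HersfoldBot's source or from Wiki.java; they are placeholders for illustration.
final class TranswikiCycle {

    enum Outcome { SKIPPED_TOO_MANY_REVISIONS, ALREADY_TRANSWIKIED, IMPORT_FAILED, IMPORTED }

    // Assumed abstraction over both wikis; a real bot would back it with API calls.
    interface Wikis {
        boolean isTaggedTooManyRevisions(String title);   // checked on Wikipedia
        boolean transwikiPageExists(String title);        // checked on Wiktionary
        boolean importFullHistory(String title);          // Special:Import through the API
        void replaceTranswikiTemplate(String title, String newTemplate);
        void logTranswiki(String title);                  // stands for both transwiki logs
    }

    static Outcome process(String title, Wikis wikis) {
        // Articles tagged TooManyRevisions are ignored outright.
        if (wikis.isTaggedTooManyRevisions(title)) {
            return Outcome.SKIPPED_TOO_MANY_REVISIONS;
        }
        // Steps 1 and 4: an existing transwikied page means no import, only a retag.
        if (wikis.transwikiPageExists(title)) {
            wikis.replaceTranswikiTemplate(title, "TWCleanup2");
            return Outcome.ALREADY_TRANSWIKIED;
        }
        // Step 2: attempt the import; on failure, leave the article as it was (assumption).
        if (!wikis.importFullHistory(title)) {
            return Outcome.IMPORT_FAILED;
        }
        // Step 3: retag the source article and record the transwiki.
        wikis.replaceTranswikiTemplate(title, "TWCleanup");
        wikis.logTranswiki(title);
        return Outcome.IMPORTED;
    }

    // In-memory stand-in used only to exercise the sketch.
    static final class FakeWikis implements Wikis {
        final Set<String> tooManyRevisions = new HashSet<>();
        final Set<String> existingTranswikiPages = new HashSet<>();
        final Set<String> failingImports = new HashSet<>();
        final Map<String, String> templateOnArticle = new HashMap<>();
        final List<String> log = new ArrayList<>();

        public boolean isTaggedTooManyRevisions(String t) { return tooManyRevisions.contains(t); }
        public boolean transwikiPageExists(String t) { return existingTranswikiPages.contains(t); }
        public boolean importFullHistory(String t) {
            if (failingImports.contains(t)) return false;
            existingTranswikiPages.add(t);
            return true;
        }
        public void replaceTranswikiTemplate(String t, String tpl) { templateOnArticle.put(t, tpl); }
        public void logTranswiki(String t) { log.add(t); }
    }

    public static void main(String[] args) {
        FakeWikis wikis = new FakeWikis();
        wikis.existingTranswikiPages.add("Example A"); // made-up titles
        for (String title : new String[] {"Example A", "Example B"}) {
            System.out.println(title + " -> " + process(title, wikis));
        }
    }
}
```

Keeping the wiki access behind one interface is a design choice of the sketch: it lets the decision logic be exercised against a fake before anything touches a live wiki.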

The bot has multiple safety checks built into it which will either stop it running or set it to ignore particular articles which have proven to be a waste of time.
 * The bot will stop editing more-or-less immediately if it has new messages at either Wiktionary or Wikipedia.
 * The bot will be unable to continue if it is blocked, and should exit cleanly if this proves to be the case.
 * The bot will stop if it is unable to create or open the text log file on my computer. This happens before it tries to log in to either wiki, and in fact before I even enter the password.
 * The bot will stop running if it encounters an IOException at any point (with one exception mentioned later), as these usually indicate a problem with the internet connection.
 * The bot will stop running if it has inadvertently been logged out or finds that it does not have access to import articles.
 * The bot will stop running if it gets a "cantimport", "badinterwiki", or unknown error back from the import API, as this indicates access has been denied, there is a problem with the hard-coded portion of the request URL, or something really bad happened.
 * The bot will stop running if it encounters ten "notempdir" errors in a row - this is a server-side error, and may be only temporary; the counter allows the server time to correct itself without the bot stopping, but then will force the bot to stop if it seems the server is really having trouble.
 * The bot will stop if I enter the wrong password or it otherwise fails to log in twice.
 * The bot will mark within its log that manual review is needed in the following circumstances, however will not stop running:
 * The bot receives an HTTP 504 error from the import API after attempting to import an article. In testing, it appears that this will sometimes occur when importing articles with high revision counts (roughly 200-300, I think), even though the import may successfully complete. The bot will also pause for five seconds to allow the server to recover.
 * The bot receives a "filetoobig" error from the API. This will cause the bot to stop importing it and add TooManyRevisions to the article on Wikipedia, which will cause it to ignore the article on future runs.
 * The bot receives ten "cantopenfile" errors from the API for the same article. This seems to occur at random for some articles, but repeatedly for articles with very high revision counts (estimated to be 300 or more, not reliably tested yet). This will cause the bot to stop importing the article and add TooManyRevisions to the article on Wikipedia, which will cause it to ignore the article on future runs.


 * The bot will stop running if a total of three or more of these errors occur during its run. While these errors do not necessarily indicate a problem by themselves (since the import API does appear to be only partially reliable at best), repeated occurrences of them could mean I need to check the code. When each of these errors occurs, the error will be noted in the text log and the article will be re-added to the bot's import queue for a later attempt.
 * The bot receives a "notoken" error from the import API.
 * The bot receives a "badtoken" error from the import API.
 * The bot receives a "nofile" error from the import API.
 * The bot receives a "partialupload" error from the import API.
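The thresholds in the list above (ten "notempdir" errors in a row, ten "cantopenfile" errors for one article, three retryable errors in total) can be sketched as a small counter-based policy. This is only an illustration under stated assumptions: the class, the `Action` values, and the rule that any other result resets the "in a row" counter are hypothetical and not taken from the bot's source. The HTTP 504 case is left out because it is not an API error code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical error policy mirroring the thresholds described above.
// Class, enum, and method names are illustrative only. Requires Java 9+ (Set.of).
final class ImportErrorPolicy {

    enum Action { STOP, RETRY, REQUEUE, TAG_TOO_MANY_REVISIONS }

    // Errors that are noted, requeued, and counted toward the limit of three.
    private static final Set<String> REQUEUE_CODES =
            Set.of("notoken", "badtoken", "nofile", "partialupload");

    private int consecutiveNoTempDir = 0;
    private int requeueErrors = 0;
    private final Map<String, Integer> cantOpenFileByTitle = new HashMap<>();

    // A successful import breaks any run of "notempdir" errors (assumption).
    void onSuccess() {
        consecutiveNoTempDir = 0;
    }

    Action onError(String title, String code) {
        // Assumption: "in a row" means any different error also resets the counter.
        if (!"notempdir".equals(code)) {
            consecutiveNoTempDir = 0;
        }
        switch (code) {
            case "notempdir":
                return ++consecutiveNoTempDir >= 10 ? Action.STOP : Action.RETRY;
            case "filetoobig":
                return Action.TAG_TOO_MANY_REVISIONS;
            case "cantopenfile": {
                int seen = cantOpenFileByTitle.merge(title, 1, Integer::sum);
                return seen >= 10 ? Action.TAG_TOO_MANY_REVISIONS : Action.RETRY;
            }
            default:
                if (REQUEUE_CODES.contains(code)) {
                    return ++requeueErrors >= 3 ? Action.STOP : Action.REQUEUE;
                }
                // "cantimport", "badinterwiki", and anything unrecognised are fatal.
                return Action.STOP;
        }
    }
}
```

A caller would invoke `onError` with the article title and the API error code after each failed import and act on the returned value; `TAG_TOO_MANY_REVISIONS` corresponds to adding TooManyRevisions to the article and noting it for manual review.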

I will be placing the source code online soon at User:HersfoldBot/Source for review; that page will be fully protected. The code contains more complete documentation, as well as a slightly more detailed listing of the various conditions that will make the bot die (there are currently 30 different exit codes that indicate an error).

I would like to get approval here first, if possible. I have been unable to test the editing functions of this bot yet, and would like to be able to test that here before trying to get approval and admin rights over at Wiktionary. The import functionality has been tested at test.wikipedia.org and appears to work fine (see testwiki:Special:Contributions/HersfoldBot). Once operational, I will also look into transferring the logs the bot produces onto Wikipedia, somewhere within the bot's userspace.

Discussion
Wow, that's one BRFA you've got there Hersfold. Anyhow, by request, a quick criteria analysis from me (as urgency probably plays second fiddle to getting it perfect here):
 * is harmless: that's what a period of debate and trial is for.
 * is useful: Yes, passes that one easily enough.
 * does not consume resources unnecessarily: Yep.
 * performs only tasks for which there is consensus: I can't see this being a problem.
 * carefully adheres to relevant policies and guidelines: I can't see any problems, and I trust an admin to know his way around them anyway.
 * uses informative messages, appropriately worded, in any edit summaries or messages left for users: I would hope so.
So in summary, no problems so far, although one gets the feeling that some time, trial and error may be needed to get everything working perfectly. - Jarry1250 (t, c) 20:25, 9 March 2009 (UTC)


 * Oh, definitely. You'll see at User:HersfoldBot/Version that the code underwent several revisions already as I worked out the weird bugs. I still don't entirely know what to expect from the import API (since by all accounts it's buggy as hell), but I think (hope) I've worked out most of the major things already. It's proven easy enough to work around those errors so far, anyway. Hers fold  (t/a/c) 21:41, 9 March 2009 (UTC)


 * Seems harmless  MBisanz  talk 23:09, 9 March 2009 (UTC)


 * Trial running now -
 * The bot will handle test.wikipedia.org as Wiktionary; edits can be viewed at testwiki:Special:Contributions/HersfoldBot.
 * The bot will use User:HersfoldBot/Wikipedia:Transwiki log/Articles moved from here/en.wiktionary as the local transwiki log since the articles aren't being transwikied to Wiktionary.
 * The bot will edit the source articles here, however all of those edits will be rollbacked on completion to keep the articles categorized. Special:Contributions/HersfoldBot
 * Once done, I will copy the text log to User:HersfoldBot/Trial run log. Hers fold  (t/a/c) 01:52, 10 March 2009 (UTC)


 * First trial run failed - seems there's an issue with the edit functions, so the bot stopped running due to repeated non-fatal errors. Taking a look to see what happened; the log will be available at the above link soon. Hers fold  (t/a/c) 01:58, 10 March 2009 (UTC)


 * Ok, trying this again after I clean up the mess on test wiki - I forgot to assign the results of some functions back into some strings, so when the bot tried to edit, it ended up not doing anything (on the articles) or overwriting the existing content (on the logs). Hers fold  (t/a/c) 02:14, 10 March 2009 (UTC)


 * And it messed up again. I'm going to try and figure out why it's not noticing these transwiki templates; the log editing seems to be correct now, but the article editing has some issues. Hers fold  (t/a/c) 02:31, 10 March 2009 (UTC)


 * The problem was that the bot's matching was fully case sensitive, whereas template and link names are case insensitive for the first letter only. This has been fixed, so I'm running the bot again for the remaining 40 edits. Hers fold  (t/a/c) 02:38, 10 March 2009 (UTC)
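For illustration only, the first-letter rule described in this comment could be handled with a regular expression along the following lines. This is a hedged sketch, not the fix that was actually applied: `TemplateMatcher` and its methods are made-up names, the sketch discards any template parameters, and it does not handle nested templates or redirects of the template.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: matches {{Name}} or {{name}}, treating only the first
// letter of the template name as case insensitive.
final class TemplateMatcher {

    static Pattern templatePattern(String name) {
        if (name.isEmpty()) {
            throw new IllegalArgumentException("template name must not be empty");
        }
        String first = name.substring(0, 1);
        String rest = name.substring(1);
        // Accept either case for the first letter; the rest must match exactly.
        String firstEitherCase = "(?:" + Pattern.quote(first.toUpperCase(Locale.ROOT))
                + "|" + Pattern.quote(first.toLowerCase(Locale.ROOT)) + ")";
        // Optional whitespace around the name and an optional parameter list.
        return Pattern.compile("\\{\\{\\s*" + firstEitherCase + Pattern.quote(rest)
                + "\\s*(?:\\|[^{}]*)?\\}\\}");
    }

    // Replaces every occurrence of the template with {{replacement}}, dropping parameters.
    static String replaceTemplate(String wikitext, String name, String replacement) {
        Matcher m = templatePattern(name).matcher(wikitext);
        return m.replaceAll(Matcher.quoteReplacement("{{" + replacement + "}}"));
    }

    public static void main(String[] args) {
        String text = "Intro {{copy to Wiktionary}} text";
        System.out.println(replaceTemplate(text, "Copy to Wiktionary", "TWCleanup"));
    }
}
```

Note that `{{Copy To Wiktionary}}` would deliberately not match here, since only the first letter is treated as case insensitive.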


 * The trial has finished. On the last run log, the bot imported 13 articles to test.wikipedia.org out of the 16 that it attempted. The 3 articles that it failed to import received "cantopenfile" errors; I'm not sure what caused Hoodrats to fail (it was probably the API being ornery), however both List of slang terms for police officers and List of terms for gay in different languages have repeatedly gotten these errors, which makes me think they have too many revisions to import. Had the program continued, it probably would have marked both of these with TooManyRevisions eventually. The bot recorded all of its actions at Transwiki log (mimicking Wiktionary's log) and at User:HersfoldBot/Wikipedia:Transwiki log/Articles moved from here/en.wiktionary (mimicking Wikipedia's log). The bot is now blocked again, pending further clearance. Hers fold  (t/a/c) 04:32, 10 March 2009 (UTC)


 * I've just made some changes to the code to allow it to run through a GUI instead of the command line; could I get another trial to make sure it still works OK? The changes made will be logged in the bot's userspace shortly, although the changes made to the bot's operating code shouldn't have a substantial effect on how it runs. Hers fold  (t/a/c) 23:29, 10 March 2009 (UTC)


 * Screw it - I've been messing around with the GUI without actually running the bot and I can't get the output to work right. Command line's not awful anyway. Hers fold  (t/a/c) 04:23, 11 March 2009 (UTC)


 * This seems pretty harmless, but could take some time to get right, so I'm moving it to . Lemme know when it is all fixed up.  MBisanz  talk 08:12, 11 March 2009 (UTC)


 * Running bot again, limited to 24 total edits, which should be roughly eight articles. Again, any edits the bot makes to actual articles here will be rollbacked. Hers fold  (t/a/c) 18:21, 11 March 2009 (UTC)
 * Oops. Forgot to unblock the bot. Hers fold  (t/a/c) 18:26, 11 March 2009 (UTC)


 * Seems to be working fine now - the bot's operation is unaffected. Hers fold  (t/a/c) 19:30, 11 March 2009 (UTC)


 * I was called to comment over at Wiktionary and left my comments there, but my main suggestions are to make it check if there are main Wiktionary articles under the same name (not just Transwiki) for duplicates, and have it have a character limit of how long articles can be that are imported to Wiktionary. Goldenrowley (talk) 02:48, 12 March 2009 (UTC)


 * I've added the check for the main namespace, however I'm still leery on the character limit for the reasons I explained on Wiktionary; some articles could be fairly sizeable but still useful to your purposes. Hers fold  (t/a/c) 05:49, 12 March 2009 (UTC)


 * (outdent) Still adding features as requested on Wiktionary; I'm going to hold off on the final trial until I get approval for test runs on their end or they stop throwing suggestions at me. Their suggestions are including a lot of stuff that a Wikipedia editor wouldn't know about simply because it's about how they deal with things on their end. Hers fold  (t/a/c) 05:29, 13 March 2009 (UTC)


 * OnHold - In order for me to do final testing on Wiktionary, I'll need use of the import flag over there. In order to get that (and approval to run) I need to go through one of their two-week vote periods, so there isn't likely to be any more information here for a while. I am still watchlisting this, so if anyone has any further comments or questions, I'll see them. Hers fold  (t/a/c) 01:20, 16 March 2009 (UTC)

BAGAssistanceNeeded - The approval vote over at Wiktionary should wrap up tomorrow, and so far it's unanimously in favor of granting the import flag. Once I (or whoever closes the vote) hunts down a steward for the flag, the bot will be ready for a final live trial between Wikipedia and Wiktionary for approval here. There have been several changes to the code since a trial was last run, so if I could get someone to review that and approve the bot for final trials that'd be great. Thanks. Hers fold  (t/a/c) 19:17, 24 March 2009 (UTC)

sounds fine to me.  MBisanz  talk 21:35, 24 March 2009 (UTC)


 * Running into some minor problems with the import API - should be running smoothly in a moment. Hers fold  (t/a/c) 05:51, 26 March 2009 (UTC)


 * After fixing the API queries (I forgot to fix that bit of the code before running), the bot imported 10 articles, marked one for manual review due to its size, and removed another from the category since it already existed at Wiktionary. The bot ran for approximately 11-12 minutes and encountered no errors. A log of the imports it made can be viewed at Transwiki log/Articles moved from here/en.wiktionary, and the bot's full operation log is available at User:HersfoldBot/Trial run log. Hers fold  (t/a/c) 06:53, 26 March 2009 (UTC)

BJ Talk 07:02, 26 March 2009 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.