User:HersfoldCiteBot/Trial run logs

This page will contain the bot's operation logs during its BAG-approved trial runs. These logs are normally saved locally on my computer and are not displayed on the wiki unless needed.

These trials are conducted within the Eclipse IDE, in "Debug" mode. If I notice a significant problem during a run, I will pause the program to investigate the cause, and may terminate the run manually if necessary.

Trial 1 - Failed
This run was terminated early when the bot failed to notice any cite web errors in a number of articles that do contain them. On suspending the program, I noticed a problem with the "displaytext" the bot receives and searches for error messages. This display text is the HTML we see when viewing the site in a web browser. Unfortunately, the bot is not getting the entire page this way; the text is truncated well before the references section, so the bot finds no errors and moves on to the next article. This needs to be fixed for the bot to do any good.

Never mind... on further investigation, it is an error in my bot's code. When testing on test.wikipedia.org, I copied the error messages in plain text. Here, the error messages are live, and so appear to the bot as HTML. I'm going to try to update the search strings, which should fix the problem.
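As an illustration of why a plain-text search string can miss a message in rendered HTML, here is a minimal Java sketch. It is not the bot's actual code: the fix described above was to update the search strings, whereas this sketch shows an alternative (stripping tags before searching). The class name, method names, and the error message text are all hypothetical.

```java
import java.util.regex.Pattern;

public class RenderedTextSearch {
    // Matches any HTML tag such as <strong class="error"> or </code>.
    private static final Pattern TAGS = Pattern.compile("<[^>]*>");

    // Reduce rendered HTML to roughly the text a reader would see.
    static String toPlainText(String html) {
        String text = TAGS.matcher(html).replaceAll("");
        // Decode only a handful of entities; &amp; goes last so that
        // "&amp;lt;" becomes "&lt;" rather than "<".
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&#124;", "|")
                   .replace("&amp;", "&");
    }

    // True if the plain-text message occurs in the visible text of the HTML.
    static boolean containsMessage(String html, String plainMessage) {
        return toPlainText(html).contains(plainMessage);
    }

    public static void main(String[] args) {
        // Hypothetical message and markup, for illustration only.
        String message = "Example error: no title given";
        String html = "<strong class=\"error\">Example error: "
                + "<code>no title</code> given</strong>";

        System.out.println(html.contains(message));         // false
        System.out.println(containsMessage(html, message)); // true
    }
}
```

The direct `contains` call fails because the markup splits the message; searching the tag-stripped text finds it.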

-- HersfoldCiteBot Operation Log Running version 1.1.2b September 23 2010, 23:01:15 UTC --

This is a trial run; the bot will make 30 edits, then stop.

23:01:15 - Attempting login...

23:01:17 - Successfully logged in as HersfoldCiteBot on en.wp.

23:01:19 - Getting articles in Category:Articles with broken citations

23:01:20 - Processing 'O Sole Mio 23:03:42 - No errors found in this article.

23:03:42 - Processing 2010 Women's Rugby World Cup squads 23:03:49 - No errors found in this article.

23:03:49 - Processing 5 Centimeters Per Second 23:03:54 - No errors found in this article.

23:03:54 - Processing AIESEC 23:03:57 - No errors found in this article.

23:03:57 - Processing ASCII art 23:04:01 - No errors found in this article.

23:04:01 - Processing Aaliyah (album) 23:04:05 - No errors found in this article.

23:04:05 - Processing Alan Ritchson 23:04:06 - No errors found in this article.

23:04:06 - Processing Alex Timbers 23:04:07 - No errors found in this article.

23:04:07 - Processing All Our Kings Are Dead 23:04:08 - No errors found in this article.

23:04:08 - Processing Amelia Reynolds (television presenter) 23:04:09 - No errors found in this article.

23:04:09 - Processing American Idiot (song) 23:04:11 - No errors found in this article.

23:04:11 - Processing American Slang 23:04:13 - No errors found in this article.

23:04:13 - Processing Amy Irving 23:04:14 - No errors found in this article.

23:04:14 - Processing Auburn University 23:04:17 - No errors found in this article.

23:04:17 - Processing Back It Up (song) 23:04:19 - No errors found in this article.

23:04:19 - Processing Bagri (clan) 23:04:21 - No errors found in this article.

23:04:21 - Processing Bandera, Texas 23:04:24 - No errors found in this article.

23:04:24 - Processing Bay Village, Ohio 23:04:25 - Logging out and shutting down.

Trial 2 - Failed
While the problems from the previous trial were fixed, there appear to be some more errors in actually correcting the templates. I have reverted all the bot's edits, but I'll need to look through the log and code to determine the exact problems and causes.

-- HersfoldCiteBot Operation Log Running version 1.1.3b September 24 2010, 00:06:07 UTC --

This is a trial run; the bot will make 10 edits, then stop.

00:06:07 - Attempting login...

00:06:08 - Successfully logged in as HersfoldCiteBot on en.wp.

00:06:10 - Getting articles in Category:Articles with broken citations

00:06:10 - Processing 'O Sole Mio 00:06:26 - Possible fixable errors found, attempting corrections 00:06:26 - Getting text of 'O Sole Mio 00:06:26 - Trying to add a title= parameter to 00:06:27 - Saving changes to 'O Sole Mio

00:06:37 - Processing 2010 Women's Rugby World Cup squads 00:06:43 - No errors found in this article.

00:06:43 - Processing 5 Centimeters Per Second 00:06:49 - No errors found in this article.

00:06:49 - Processing AIESEC 00:06:51 - No errors found in this article.

00:06:51 - Processing ASCII art 00:06:57 - Possible fixable errors found, attempting corrections 00:06:57 - Getting text of ASCII art 00:06:58 - Trying to add a title= parameter to 00:07:05 - IOException recieved when trying to access http://Simon. 00:07:05 - Saving changes to ASCII art

00:07:22 - Processing Aaliyah (album) 00:07:26 - No errors found in this article.

00:07:26 - Processing Alan Ritchson 00:07:27 - No errors found in this article.

00:07:27 - Processing Alex Timbers 00:07:28 - No errors found in this article.

00:07:28 - Processing All Our Kings Are Dead 00:07:30 - No errors found in this article.

00:07:30 - Processing Amelia Reynolds (television presenter) 00:07:31 - No errors found in this article.

00:07:31 - Processing American Idiot (song) 00:07:33 - No errors found in this article.

00:07:33 - Processing American Slang 00:07:35 - No errors found in this article.

00:07:35 - Processing Amy Irving 00:07:40 - No errors found in this article.

00:07:40 - Processing Auburn University 00:07:44 - No errors found in this article.

00:07:44 - Processing Back It Up (song) 00:07:46 - No errors found in this article.

00:07:46 - Processing Bagri (clan) 00:07:49 - No errors found in this article.

00:07:49 - Processing Bandera, Texas 00:07:51 - No errors found in this article.

00:07:51 - Processing Bay Village, Ohio 00:07:54 - No errors found in this article.

00:07:54 - Processing Belgian nationality law 00:07:55 - No errors found in this article.

00:07:55 - Processing Ben Affleck 00:08:02 - No errors found in this article.

00:08:02 - Processing Big Four (audit firms) 00:08:04 - No errors found in this article.

00:08:04 - Processing Blake LeVine 00:08:05 - No errors found in this article.

00:08:05 - Processing Blindcrake 00:08:14 - No errors found in this article.

00:08:14 - Processing Boston University 00:08:22 - No errors found in this article.

00:08:22 - Processing Britney's New Look 00:08:24 - No errors found in this article.

00:08:24 - Processing C++ 00:08:29 - No errors found in this article.

00:08:29 - Processing Cafe Antarsia Ensemble 00:08:32 - No errors found in this article.

00:08:32 - Processing California Polytechnic State University 00:08:38 - No errors found in this article.

00:08:38 - Processing Canada Bank Act 00:08:40 - Possible fixable errors found, attempting corrections 00:08:42 - Getting text of Canada Bank Act 00:08:43 - Trying to add a title= parameter to 00:08:45 - IOException recieved when trying to access http://faculty.marianopolis.edu/c.belanger/quebechistory/encyclopedia/BankinginCanada-CanadianBanks-CanadianHistory.htm=[[The . 00:08:45 - Saving changes to Canada Bank Act

00:08:55 - Processing Candice Bergen 00:08:58 - No errors found in this article.

00:08:58 - Processing Cecile B. Kremer 00:09:00 - Possible fixable errors found, attempting corrections 00:09:01 - Getting text of Cecile B. Kremer 00:09:01 - Trying to add a title= parameter to 00:09:26 - Logging out and shutting down.

Problems noted

 * Failed to identify the correct situation in 'O Sole Mio and attempted to treat the author's name as an existing title (diff). The likely cause is that the template did not in fact have a named url parameter, and the next space happened to fall in the author's name.
   * If url= cannot be found, search for http://. If that can't be found either, give up and flag for review - ✅ in 1.1.4b
   * If url= cannot be found but http:// can, add url= where needed - ✅ in 1.1.4b
   * When searching for an existing title, don't keep searching beyond the end of the parameter (stop at the pipe or end brackets) - Already does this; the apparent misbehavior was the result of other misbehavior, now fixed.
 * Lots of issues with ASCII art. First, the article was incorrectly identified as having a fixable error. Second, the bot tried to add a title where one already existed. Third, it grabbed the wrong parameter for the URL. Fourth, it saved an edit converting &lt; and &gt; codes to symbols.
   * Fix the search string for the "archiveurl missing archivedate" error; that's what caused the misidentification. - ✅ in 1.1.4b
   * Need to figure out why the template was flagged for not having a title parameter; perhaps having the template spread over multiple lines caused it? - ✅ in 1.1.4b; apparently the . character in Java regexes doesn't match newlines by default. Odd.
   * Issue #3 is likely the same problem as #2, and so needs a similar fix - ✅ per above
   * Not sure issue #4 can be fixed; the bot framework does that decoding itself, and in some cases those corrections may be necessary.
 * The problem with Canada Bank Act is to be expected; the template there is malformed, and the bot can't recognize that the URL has in fact ended, since equal signs are commonly found in URLs. Nothing to fix there; the page would have been flagged for review.
 * The bot crashed when trying to access the blank URL at Cecile B. Kremer.
   * I need to add code telling the bot to flag blank url arguments for attention and not try to connect to them. - ✅ in 1.1.4b
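Several of the fixes above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the bot's actual code: the class, field, and method names are hypothetical, and the patterns are simplified. It shows that `.` only crosses line breaks when `Pattern.DOTALL` is set, that a url= value is read only up to the next pipe or closing brace, that a bare http:// link is used as a fallback, and that a blank url= is reported for review instead of being fetched.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CiteTemplateChecks {
    // Looks for a title parameter somewhere after "{{cite web".
    private static final String TITLE_REGEX = "\\{\\{\\s*cite web.*\\|\\s*title\\s*=";

    // By default '.' does not match line terminators, so a template
    // written across several lines is never matched by this pattern.
    static final Pattern TITLE_DEFAULT =
            Pattern.compile(TITLE_REGEX, Pattern.CASE_INSENSITIVE);

    // With DOTALL, '.' matches line terminators as well.
    static final Pattern TITLE_DOTALL =
            Pattern.compile(TITLE_REGEX, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Named parameter: the value stops at the next pipe or closing brace.
    // The pipe must directly precede "url", so "archiveurl=" is not matched.
    private static final Pattern URL_PARAM =
            Pattern.compile("\\|\\s*url\\s*=([^|}]*)", Pattern.CASE_INSENSITIVE);

    // Fallback: a bare link somewhere in the template.
    private static final Pattern BARE_URL = Pattern.compile("https?://[^\\s|}]+");

    // Returns the URL to check, or empty if the template should be
    // flagged for manual review (blank url=, or no URL at all).
    static Optional<String> findUrl(String template) {
        Matcher named = URL_PARAM.matcher(template);
        if (named.find()) {
            String value = named.group(1).trim();
            if (value.isEmpty()) {
                return Optional.empty(); // blank url=: do not try to connect
            }
            return Optional.of(value);
        }
        Matcher bare = BARE_URL.matcher(template);
        if (bare.find()) {
            return Optional.of(bare.group());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String multiLine = "{{cite web\n| url = http://example.org/page\n| title = Example\n}}";
        System.out.println(TITLE_DEFAULT.matcher(multiLine).find()); // false
        System.out.println(TITLE_DOTALL.matcher(multiLine).find());  // true

        System.out.println(findUrl(multiLine));                                // Optional[http://example.org/page]
        System.out.println(findUrl("{{cite web | url= | title=Example}}"));    // Optional.empty
        System.out.println(findUrl("{{cite web | http://example.org/a | X}}")); // Optional[http://example.org/a]
    }
}
```

Note that a malformed template such as the one at Canada Bank Act would still defeat this sketch, since an equals sign followed by link brackets is not excluded from the URL.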

Trial 3 - Failed
On this run, the bot made two correct edits but one incorrect edit at Canada Bank Act. It also reported that it had edited a fourth page, although no edit was actually made, so I'm not really sure what happened there. The bot terminated its run early as a result of a runtime error while processing Fowey. The bot's log is provided below.

-- HersfoldCiteBot Operation Log Running version 1.1.4b September 29 2010, 00:17:02 UTC --

This is a trial run; the bot will make 10 edits, then stop.

00:17:02 - Attempting login...

00:17:02 - Successfully logged in as HersfoldCiteBot on en.wp.

00:17:04 - Getting articles in Category:Articles with broken citations

00:17:04 - Processing 'O Sole Mio 00:17:05 - Possible fixable errors found, attempting corrections 00:17:05 - Getting text of 'O Sole Mio 00:17:05 - Trying to add a title= parameter to 00:17:06 - Saving changes to 'O Sole Mio

00:17:16 - Processing 2010 Women's Rugby World Cup squads 00:17:22 - No errors found in this article.

00:17:22 - Processing 5 Centimeters Per Second 00:17:27 - No errors found in this article.

00:17:27 - Processing AIESEC 00:17:29 - No errors found in this article.

00:17:29 - Processing ASCII art 00:17:32 - No errors found in this article.

00:17:32 - Processing Aaliyah (album) 00:17:37 - No errors found in this article.

00:17:37 - Processing Aberdeen 00:17:49 - No errors found in this article.

00:17:49 - Processing Alan Ritchson 00:17:50 - No errors found in this article.

00:17:50 - Processing Alex Timbers 00:17:51 - No errors found in this article.

00:17:51 - Processing All Our Kings Are Dead 00:17:53 - No errors found in this article.

00:17:53 - Processing American Idiot (song) 00:17:54 - No errors found in this article.

00:17:54 - Processing American Slang 00:17:56 - No errors found in this article.

00:17:56 - Processing Amy Irving 00:17:57 - No errors found in this article.

00:17:57 - Processing Ann-Margret 00:18:01 - No errors found in this article.

00:18:01 - Processing Aranya 00:18:02 - No errors found in this article.

00:18:02 - Processing Arts in Rome 00:18:03 - No errors found in this article.

00:18:03 - Processing Auburn University 00:18:08 - No errors found in this article.

00:18:08 - Processing Axe (grooming product) 00:18:12 - No errors found in this article.

00:18:12 - Processing Bagri (clan) 00:18:14 - No errors found in this article.

00:18:14 - Processing Bandera, Texas 00:18:17 - No errors found in this article.

00:18:17 - Processing Basel 00:18:27 - No errors found in this article.

00:18:27 - Processing Bay Village, Ohio 00:18:29 - No errors found in this article.

00:18:29 - Processing Belgian nationality law 00:18:30 - No errors found in this article.

00:18:30 - Processing Big Four (audit firms) 00:18:32 - No errors found in this article.

00:18:32 - Processing Blake Harrison 00:18:33 - No errors found in this article.

00:18:33 - Processing Blake LeVine 00:18:34 - No errors found in this article.

00:18:34 - Processing Blindcrake 00:18:41 - No errors found in this article.

00:18:41 - Processing Bloggingheads.tv 00:18:45 - No errors found in this article.

00:18:45 - Processing Borders Group 00:18:49 - No errors found in this article.

00:18:49 - Processing Britney's New Look 00:18:50 - No errors found in this article.

00:18:50 - Processing Bruce Van Voorhis 00:18:51 - No errors found in this article.

00:18:51 - Processing C++ 00:18:56 - No errors found in this article.

00:18:56 - Processing Cafe Antarsia Ensemble 00:18:56 - No errors found in this article.

00:18:56 - Processing California Polytechnic State University 00:19:04 - No errors found in this article.

00:19:04 - Processing Canada Bank Act 00:19:05 - Possible fixable errors found, attempting corrections 00:19:05 - Getting text of Canada Bank Act 00:19:05 - Trying to add a title= parameter to 00:19:05 - Saving changes to Canada Bank Act

00:19:15 - Processing Candice Bergen 00:19:17 - No errors found in this article.

00:19:17 - Processing Card counting 00:19:19 - No errors found in this article.

00:19:19 - Processing Cassin Young 00:19:19 - No errors found in this article.

00:19:19 - Processing Chew Magna 00:19:26 - Possible fixable errors found, attempting corrections 00:19:26 - Getting text of Chew Magna 00:19:26 - Trying to add a title= parameter to 00:19:26 - IOException recieved when trying to access http://http:www.singstargame.com/en-gb/. 00:19:26 - Saving changes to Chew Magna

00:19:39 - Processing Child benefit 00:19:40 - No errors found in this article.

00:19:40 - Processing Chilean Army 00:19:42 - No errors found in this article.

00:19:42 - Processing Cinema of Nigeria 00:19:44 - No errors found in this article.

00:19:44 - Processing Cleon Skousen 00:19:50 - No errors found in this article.

00:19:50 - Processing Coconut cake 00:19:50 - No errors found in this article.

00:19:50 - Processing Cornetto (ice cream) 00:19:51 - Possible fixable errors found, attempting corrections 00:19:51 - Getting text of Cornetto (ice cream) 00:19:51 - Trying to add a title= parameter to 00:19:52 - Saving changes to Cornetto (ice cream)

00:20:02 - Processing Dartmouth College 00:20:24 - No errors found in this article.

00:20:24 - Processing David Miliband 00:20:32 - No errors found in this article.

00:20:32 - Processing Davy Fresh 00:20:33 - No errors found in this article.

00:20:33 - Processing Decoder Ring Theatre 00:20:34 - No errors found in this article.

00:20:34 - Processing Delta Air Lines 00:20:43 - No errors found in this article.

00:20:43 - Processing Demi Lovato 00:20:48 - No errors found in this article.

00:20:48 - Processing Demographics of Italy 00:20:52 - No errors found in this article.

00:20:52 - Processing Digital terrestrial television 00:21:08 - No errors found in this article.

00:21:08 - Processing Dr Pepper 00:21:12 - No errors found in this article.

00:21:12 - Processing Dubstar 00:21:13 - No errors found in this article.

00:21:13 - Processing Dylan Baker 00:21:14 - No errors found in this article.

00:21:14 - Processing Dynasty (TV series) 00:21:18 - No errors found in this article.

00:21:18 - Processing Eagles of Death Metal 00:21:19 - No errors found in this article.

00:21:19 - Processing Economy of Rome 00:21:20 - No errors found in this article.

00:21:20 - Processing Edward P. Jones 00:21:21 - No errors found in this article.

00:21:21 - Processing Electronic waste 00:21:25 - No errors found in this article.

00:21:25 - Processing Eliza Dushku 00:21:28 - No errors found in this article.

00:21:28 - Processing Emmanuelle Chriqui 00:21:29 - No errors found in this article.

00:21:29 - Processing Enes Mešanović 00:21:30 - No errors found in this article.

00:21:30 - Processing Esotericism 00:21:32 - No errors found in this article.

00:21:32 - Processing Eye of Horus 00:21:32 - No errors found in this article.

00:21:32 - Processing Fart 00:21:34 - No errors found in this article.

00:21:34 - Proc

Hersfold note: The bot was apparently trying to write to the log at the time it crashed; however, I am not sure why the log is so far delayed, as "Fart" is several entries up the category from the page it crashed on, "Fowey".

Problems noted

 * Rather than logging the page for manual review, as should have happened, the bot attempted to insert a null title in the wrong place at Canada Bank Act (diff).
 * I'm not sure what changes were supposed to be saved to Chew Magna, but I'll need to figure that out. The IOException there is to be expected, as the URL is malformed.
 * The bot crashed due to a StringIndexOutOfBoundsException, reporting the following:
   * Exception in thread "Thread-5" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
   * at java.lang.String.substring(Unknown Source)
   * at citation.HersfoldCiteBot.correctCiteWebErrors(HersfoldCiteBot.java:653)
   * at java.lang.Thread.run(Unknown Source)
   * at citation.HersfoldCiteBot.run(HersfoldCiteBot.java:216)
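The cause at line 653 is not known from the trace alone, but an index of -1 reaching `String.substring` very often comes from an unchecked `indexOf` result, since `indexOf` returns -1 when the search text is absent. The following Java sketch shows one defensive pattern under that assumption; the class and method names are hypothetical and are not taken from the bot.

```java
import java.util.Optional;

public class SafeSubstring {
    // Returns the text between the first 'open' and the next 'close',
    // or empty if either delimiter is missing. Checking each indexOf
    // result keeps -1 from ever being passed to substring.
    static Optional<String> textBetween(String text, String open, String close) {
        int start = text.indexOf(open);
        if (start < 0) {
            return Optional.empty();
        }
        int from = start + open.length();
        int end = text.indexOf(close, from);
        if (end < 0) {
            return Optional.empty(); // unterminated: flag rather than crash
        }
        return Optional.of(text.substring(from, end));
    }

    public static void main(String[] args) {
        System.out.println(textBetween("{{cite web|url=x}}", "{{", "}}")); // Optional[cite web|url=x]
        System.out.println(textBetween("{{cite web|url=x", "{{", "}}"));   // Optional.empty
    }
}
```

An empty result gives the caller a natural point to log the page for manual review instead of letting the worker thread die.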