Wikipedia:Bots/Requests for approval/IPLRecordsUpdateBot


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

User:IPLRecordsUpdateBot
Operator:

Time filed: 16:35, Monday May 6, 2013 (UTC)

Automatic, Supervised, or Manual: Supervised

Programming language(s): PHP

Source code available: Special:PrefixIndex/User:IPLRecordsUpdateBot/Source

Function overview: Update the tables at List of Indian Premier League records and statistics

Links to relevant discussions (where appropriate):

Edit period(s): Usually 1 or 2 edits per day (more if there are errors which need to be fixed)

Estimated number of pages affected: 1

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Updates the tables in the article List of Indian Premier League records and statistics by parsing data from stats.espncricinfo.com (the external links below each table in the article). Most of the article is currently outdated, even though a few users maintain some parts of it. They usually update the tables only when something significant happens, although some tables contain data that should be updated after every match. Not all tables will be updated by the bot, only 34 of them.

Each table to be updated is linked to a function that updates it. The function first locates the section header under which the table lies. It then loads the page at the URL given in the function and uses the DOM and XPath to extract and parse the data. The parsed data is used to construct table rows, which replace the old rows in the table, using the section header as a reference point. Translation tables convert Cricinfo names, which can be confusing, into common names. Each run will usually take about 10 to 15 minutes (http://stats.espncricinfo.com/robots.txt requires a 15-second pause after each request). A run of this length can lead to edit conflicts, so the bot saves the wikitext to a local file, which can then be used to make the edit without redoing all the updates.
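The header-anchored replacement described above can be illustrated with a minimal Python sketch. This is not the bot's PHP code: the function names, the sample translation entry and the assumption that a table's header cells precede its first `|-` row separator are all hypothetical, and fetching and XPath parsing are left out.

```python
import re

# Hypothetical translation table: source-site name -> common name.
NAME_TRANSLATIONS = {"AB Example": "Alex Example"}

# Matches any wikitext section header line, e.g. "== Title ==".
ANY_HEADER = re.compile(r"^=+.*=+[ \t]*$", re.MULTILINE)


def build_rows(records):
    """Turn (name, value) pairs into wikitext table rows."""
    rows = []
    for name, value in records:
        rows.append("|-")
        rows.append(f"| {NAME_TRANSLATIONS.get(name, name)} || {value}")
    return "\n".join(rows)


def replace_table_rows(wikitext, section_title, new_rows):
    """Replace the data rows of the first table under a section header.

    Raises ValueError when the header or its table cannot be found, so a
    caller can skip this table and move on to the next one.
    """
    header = re.search(
        r"^=+[ \t]*" + re.escape(section_title) + r"[ \t]*=+[ \t]*$",
        wikitext,
        re.MULTILINE,
    )
    if header is None:
        raise ValueError(f"section not found: {section_title}")

    # Only look for the table inside this section.
    next_header = ANY_HEADER.search(wikitext, header.end())
    section_end = next_header.start() if next_header else len(wikitext)

    table_start = wikitext.find("{|", header.end(), section_end)
    table_end = -1
    if table_start != -1:
        table_end = wikitext.find("\n|}", table_start, section_end)
    if table_end == -1:
        raise ValueError(f"no table under section: {section_title}")

    # Assumption: everything before the first "|-" is the table's
    # attributes and header cells, which are kept unchanged.
    first_row = wikitext.find("\n|-", table_start, table_end)
    if first_row == -1:
        first_row = table_end
    return wikitext[:first_row] + "\n" + new_rows + wikitext[table_end:]
```

Because the lookup depends on an exact header match, renaming the section makes `replace_table_rows` raise instead of editing the wrong table, which mirrors the skip-on-error behaviour described in the request.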

Because section headers are used to locate the tables, even a slight change in a section header will cause an error (and the bot will skip to the next function). Adding, removing or reordering columns can also result in a mismatch. For this reason the bot will not edit if it has new messages on its talk page, so that editors can inform me of such changes and I can update the bot accordingly. (I have not tested this yet, as there is no other account to post a message to my talk page, so someone should check the source code to see whether it will work.) An editnotice should be created for the article while the bot is active.
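One way to implement the new-messages check is to ask the MediaWiki API for `meta=userinfo` with `uiprop=hasmsg`, which adds a `messages` entry when the logged-in user has pending talk page messages. The Python sketch below is an illustration under that assumption, not the bot's actual code; the function names are hypothetical and `opener` is assumed to be an already logged-in `urllib` opener.

```python
import json
import urllib.parse

API_URL = "https://en.wikipedia.org/w/api.php"


def has_new_messages(response):
    """Return True if a userinfo API response reports pending messages.

    With the default JSON format the "messages" key is present (as an
    empty string) only when there are new messages; with formatversion=2
    it is a boolean. Both shapes are handled here.
    """
    userinfo = response.get("query", {}).get("userinfo", {})
    return "messages" in userinfo and userinfo["messages"] is not False


def fetch_userinfo(opener):
    """Fetch userinfo for the logged-in account (opener is hypothetical)."""
    params = urllib.parse.urlencode(
        {"action": "query", "meta": "userinfo", "uiprop": "hasmsg", "format": "json"}
    )
    with opener.open(f"{API_URL}?{params}") as resp:
        return json.load(resp)
```

A run would call `has_new_messages(fetch_userinfo(opener))` before saving and stop without editing if it returns `True`.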

Note that the bot's first edit may do things like replacing deprecated markup with its current equivalent, closing some unclosed HTML tags and assigning sort keys to some of the values in the tables so that they sort properly. The bot will use a hard limit of 5 rows per table, with a few exceptions. (The current article has a limit of 5 rows, but not a hard one: when there is a tie for the fifth spot, sometimes all rows with that value are included and sometimes the fifth row is left out, without any clear instructions or explanation.) Some minor changes may also have to be made after the bot's first edit (but not after subsequent edits).
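A hard limit of 5 rows is only deterministic if ties for the fifth spot are broken consistently. The Python sketch below shows one way to do that by sorting on several columns in priority order; it is an illustration rather than the bot's algorithm, and the function name, column names and sample figures are made up.

```python
def top_rows(rows, sort_keys, limit=5):
    """Return the best `limit` rows under a multi-column ordering.

    rows      -- list of dicts with numeric values
    sort_keys -- (column, descending) pairs, most important first
    """
    def key(row):
        # Negate values for descending columns so one ascending sort works.
        return tuple(-row[col] if desc else row[col] for col, desc in sort_keys)

    return sorted(rows, key=key)[:limit]


# Made-up data: E and F tie on runs, so fewer innings decides the fifth spot.
rows = [
    {"player": "A", "runs": 900, "innings": 30},
    {"player": "B", "runs": 850, "innings": 28},
    {"player": "C", "runs": 800, "innings": 31},
    {"player": "D", "runs": 750, "innings": 25},
    {"player": "E", "runs": 700, "innings": 29},
    {"player": "F", "runs": 700, "innings": 24},
    {"player": "G", "runs": 650, "innings": 20},
]
best = top_rows(rows, [("runs", True), ("innings", False)])
```

Here `best` contains players A, B, C, D and F: the tie on 700 runs is resolved in favour of the player with fewer innings, so the table never grows past 5 rows.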

Note that the current season of the Indian Premier League will end on 26 May, after which this bot will not edit until next year (although its code can be forked to create bots for other major tournaments). Please take this into consideration when deciding when to approve this bot.

This bot will not mark edits as minor, nor will it set the bot flag on them.

Discussion
Left notices on Wikipedia talk:WikiProject Indian Premier League and Wikipedia talk:WikiProject Cricket, who may be interested in this discussion. jfd34 ( talk ) 17:24, 6 May 2013 (UTC)

jfd34 ( talk ) 09:00, 9 May 2013 (UTC)

 MBisanz  talk 11:39, 9 May 2013 (UTC)

jfd34 ( talk ) 17:08, 19 May 2013 (UTC)

Please note the following:
 * First edit: It involves things such as assigning sort keys so that the tables sort correctly, and fixing unclosed tags and deprecated attributes, so do not mind the large size change.
 * Human editors may not know the sorting algorithm used to keep every table the bot maintains at 5 rows. Human editors have sometimes updated a few tables before the bot is scheduled to run, and sometimes add (or remove) rows when there is a tie for the fifth spot, as they do not know the rules used by the bot. However, the bot uses a hard limit of 5 (which is easier for automated programs) and, if there is a tie, uses the available parameters other than the one sorted first to determine which row is "better" (rather than treating all tied rows as equal, which causes undefined behaviour). The sorting algorithm can be found in User:IPLRecordsUpdateBot/Source/CricinfoDataParser.php and the sort orders for each function are listed in User:IPLRecordsUpdateBot/Source/StatsUpdateFunctions.php.
 * See some discussion on my talk page here. jfd34  ( talk ) 17:08, 19 May 2013 (UTC)

jfd34 ( talk ) 09:00, 9 May 2013 (UTC)

 MBisanz  talk 05:45, 30 May 2013 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.