Wikipedia:Bots/Requests for approval/Chartbot 3


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

Chartbot 3
Operator:

Time filed: 06:23, Tuesday March 19, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available:

Function overview: Correct links to Billboard stories and articles

Links to relevant discussions (where appropriate):

Edit period(s): one time run

Estimated number of pages affected: 5000

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: This is a continuation of Chartbot's mission to get our links to Billboard repaired. This bot will require interaction with Billboard's existing redirect system, which will remain in place until May.

We have links that have not worked in years because they use an older URL format that Billboard dropped in 2008. Unfortunately, several thousand of our links still use that format. Even more of our links use the format that was introduced in 2008. The article IDs held constant from the 1990s through January 2013.

Billboard currently has a redirect system in place to aid in the transition. Presented with a URL in the 2008 format, it will redirect to the corresponding link in the current format.

Links in the pre-2008 format simply return a 404: witness http://www.billboard.com/bbcom/esearch/article_display.jsp?vnu_content_id=1000808321, which used to point at http://www.billboard.com/news/chart-beat-bonus-1000808321.story.

Note that Billboard's redirection system is not sensitive to the text: http://www.billboard.com/news/dummy-noise-for-url-retrieval-1000808321.story successfully redirects to http://www.billboard.com/articles/news/64057/chart-beat-bonus, even though the text is obviously not the text originally returned.

In this bot, I will search for links in both the pre-2008 format and the 2008 format.

If it is the former style, I will synthesize the corresponding 2008-style link. I will then retrieve that link from Billboard and extract the XML field that gives the article's new location. I will validate that the link appears to point to a news story and, if it does, replace the link.
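As a rough illustration of the steps above, the sketch below pulls the article ID out of either link style, synthesizes a 2008-style link with placeholder text, and checks whether a resolved URL looks like a news story. It is not the bot's actual code (which is written in PHP): the function names and the regular expressions are assumptions inferred from the example URLs quoted in this request, and the retrieval from Billboard is deliberately left out.

```python
import re
from typing import Optional
from urllib.parse import urlparse, parse_qs

# Assumed patterns, inferred only from the example URLs in this request.
STORY_PATH_RE = re.compile(r"^/news/[^/]*-(\d+)\.story$")
NEWS_PATH_RE = re.compile(r"^/articles/news/\d+/[a-z0-9-]+/?$")
OLD_PATH = "/bbcom/esearch/article_display.jsp"


def extract_article_id(url: str) -> Optional[str]:
    """Return the numeric article ID from either link style, or None."""
    parsed = urlparse(url)
    if parsed.netloc.lower() != "www.billboard.com":
        return None
    if parsed.path == OLD_PATH:
        # Pre-2008 style: the ID is carried in the vnu_content_id parameter.
        ids = parse_qs(parsed.query).get("vnu_content_id", [])
        return ids[0] if ids and ids[0].isdigit() else None
    # 2008 style: the ID is the trailing number before ".story".
    match = STORY_PATH_RE.match(parsed.path)
    return match.group(1) if match else None


def synthesize_story_url(article_id: str, slug: str = "dummy-text") -> str:
    """Build a 2008-style link; the slug text is ignored by the redirect."""
    return f"http://www.billboard.com/news/{slug}-{article_id}.story"


def looks_like_news_story(url: str) -> bool:
    """Hypothetical validation: accept only /articles/news/<id>/<slug> URLs."""
    parsed = urlparse(url)
    return (parsed.netloc.lower() == "www.billboard.com"
            and bool(NEWS_PATH_RE.match(parsed.path)))
```

Because the redirect ignores the text portion, a pre-2008 link only needs its ID carried over into the synthesized link; the validation step is what guards against replacing a link with something that is not a story page.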

This is closely related to existing Chartbot functions. All of the supporting scripts that created BillboardURLbyName interacted with the redirection functions at Billboard to determine where things had been placed, so this is just more of the same.

Discussion
 MBisanz  talk 11:52, 19 March 2013 (UTC)

The trial was held between 16:48 and 17:32 on 21 March 2013. All the edits have a summary of "***TRIAL Chartbot function 3: Repair of article links. Contact User talk:Kww if there are problems. Edits being monitored.***"
 * This edit contains article links of both kinds contained in the original bot request.
 * This edit shows a link style I didn't anticipate in the bot request, but handling it appeared to be a harmless extension. http://www.billboard.com/news/1926761.story doesn't correctly redirect, but by changing it to http://www.billboard.com/news/dummy-text-1926761.story, the redirect works, and the bot was able to locate http://www.billboard.com/articles/news/70145/beyonce-branch-albums-storm-the-chart. Note that the original webcite title parameter was "Beyonce, Branch Albums Storm The Chart", so we can be quite confident in the replacement. I've checked several dozen proposed replacements in these cases, and all seem accurate.
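The handling of ID-only links described above can be sketched roughly as follows. This is an illustration rather than the bot's actual PHP: the helper names and the regular expression are assumptions, and the title comparison is just one possible way to automate the manual check against the citation's title parameter.

```python
import re
from urllib.parse import urlparse

# Assumed pattern for links that carry only an ID, such as /news/1926761.story.
BARE_ID_RE = re.compile(r"^(https?://www\.billboard\.com/news/)(\d+)(\.story)$")


def add_placeholder_slug(url: str, slug: str = "dummy-text") -> str:
    """Insert placeholder text before a bare ID so the redirect can resolve it."""
    match = BARE_ID_RE.match(url)
    if not match:
        return url  # Not an ID-only link; leave it unchanged.
    return f"{match.group(1)}{slug}-{match.group(2)}{match.group(3)}"


def slugify(title: str) -> str:
    """Lowercase a title and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def slug_matches_title(resolved_url: str, title: str) -> bool:
    """Compare the last path segment of a resolved URL with a citation title."""
    last_segment = urlparse(resolved_url).path.rstrip("/").rsplit("/", 1)[-1]
    return last_segment == slugify(title)
```

With the example from the trial, `add_placeholder_slug("http://www.billboard.com/news/1926761.story")` yields the `dummy-text-1926761.story` form, and the resolved article's final path segment agrees with the slugified citation title.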

Looking at the bot log, I can see that it correctly handled articles that could not be found, refusing to perform the replacements in those cases. For the technically minded, the bot is smart enough to cache replacements: if it finds the same link multiple times in the same run, it stores the canonical version on the first occurrence and doesn't query Billboard again on subsequent hits. This phase will replace 11,679 links in 5,280 articles. Today's trial replaced 352 links in 50 articles. --Kww (talk) 17:47, 21 March 2013 (UTC)
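The caching behaviour described above might look something like the sketch below. It is illustrative only and not taken from the bot's PHP source: the class name and the injected `resolve` callable are hypothetical, and caching failed lookups (as `None`) alongside successful ones is an assumption.

```python
from typing import Callable, Dict, Optional


class CachingResolver:
    """Remember each link's canonical replacement for the length of one run."""

    def __init__(self, resolve: Callable[[str], Optional[str]]) -> None:
        self._resolve = resolve  # Hypothetical function that queries Billboard.
        self._cache: Dict[str, Optional[str]] = {}
        self.queries = 0  # Number of times the resolver was actually called.

    def lookup(self, url: str) -> Optional[str]:
        """Return the canonical URL, querying only on the first occurrence."""
        if url not in self._cache:
            self.queries += 1
            self._cache[url] = self._resolve(url)
        return self._cache[url]
```

A link that appears in many articles is then resolved once per run, which keeps the number of requests sent to Billboard well below the number of links replaced.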


 *  MBisanz  talk 20:42, 23 March 2013 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.