Wikipedia:Bots/Requests for approval/MadmanBot 11


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

MadmanBot 11
Operator:

Time filed: 01:15, Monday February 13, 2012 (UTC)

Automatic-unsupervised, Automatic-supervised, or Manual: Automatic.

Programming language(s): Perl.

Source code available: Yes, updated.

Function overview: Mirror of CorenSearchBot, which has been offline since 31 December 2011.

Links to relevant discussions (where appropriate):

Edit period(s): Continuous.

Estimated number of pages affected: At most three edits per new page created, hard limit of twelve edits per minute (further edits are queued).

Exclusion compliant (Y/N): No.

Already has a bot flag (Y/N): Yes.

Function details: Mirror of CorenSearchBot, which has been offline since 31 December 2011. CorenSearchBot has become invaluable to WikiProject Copyright Cleanup and the Suspected Copyright Violations patrollers; unfortunately, no one's been able to contact Coren recently (regarding the bot or even ArbCom matters). According to the source code, the bot is not exclusion compliant, nor would it be appropriate for it to be.
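
The cap stated above (at most twelve edits per minute, with further edits queued) can be illustrated with a small sliding-window throttle. This is only a sketch in Python, not the bot's Perl code; the class name `EditThrottle`, its methods, and the choice of a sliding window are assumptions made for illustration.

```python
from collections import deque


class EditThrottle:
    """Allow at most `limit` edits per `window` seconds; queue the rest."""

    def __init__(self, limit=12, window=60.0):
        self.limit = limit
        self.window = window
        self.sent = deque()     # timestamps of edits made inside the window
        self.pending = deque()  # edits waiting for a free slot

    def submit(self, edit, now):
        """Queue an edit, then release whatever the limit currently allows."""
        self.pending.append(edit)
        return self.flush(now)

    def flush(self, now):
        """Return the queued edits that may be saved at time `now`."""
        # Forget edits that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        released = []
        while self.pending and len(self.sent) < self.limit:
            released.append(self.pending.popleft())
            self.sent.append(now)
        return released
```

Under these assumptions, submitting fifteen edits in the same instant releases the first twelve immediately and holds the remaining three until a later call to `flush` finds room in the window.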

Discussion
 MBisanz  talk 02:38, 13 February 2012 (UTC)
 * Update: Please note my updates to csb.pl (I like my Perl scripts to run with use strict and use warnings, and knowing the scope of the variables helps me understand them.) I'll try to run this trial tomorrow. &mdash; madman 03:42, 13 February 2012 (UTC)
 * So during the trial I got a notification that the API the bot was using was deprecated per . It looks like the new API, BOSS, is priced at $0.40-$0.80/1000 queries. I don't know if Coren encountered this problem or if his app ID was grandfathered in. I think I'm going to suspend the trial for a bit while I try to figure out how many queries CorenSearchBot would be making per month (my guess is quite a few) and whether I can use another search engine's API. &mdash; madman 19:00, 13 February 2012 (UTC)
 * Come to think of it, this may be why CorenSearchBot isn't working anymore; that blog post said the shutdown should be effective at end of year and CorenSearchBot's last edit was on 31 December 2011. It's probably still running, but there are no URIs in the results that are returned (just the error message). &mdash; madman 19:02, 13 February 2012 (UTC)
 * I know the WMF paid for some access to Yahoo so as to get CSB up and running again after CSB also encountered the deprecation problem (it was grandfathered for a while). That makes me think that the code may be old as I would've thought it would've been updated to the new API.  I have a feeling that Coren looked into other search engines and found they all have problematic T&Cs although I can't remember which ones he looked at.  I'll go and see if I can find the relevant thread. Dpmuk (talk) 23:20, 13 February 2012 (UTC)
 * Well here's the thread discussing WMF paying for some access. I'll keep looking for the other search engines thread. Dpmuk (talk) 23:23, 13 February 2012 (UTC)
 * And here's the thread discussing other search engine providers. Coren seems to specifically discount Bing as having incompatible conditions. Dpmuk (talk) 23:25, 13 February 2012 (UTC)
 * I see. Madman, do you know anyone at the WMF? If not, I can put you in touch with people who work there and might be able to help you replicate CSB's Yahoo deal.  MBisanz  talk 00:35, 14 February 2012 (UTC)
 * So yeah, it looks like he did update it to use BOSS but didn't update the published source code. If I recall, the published source code has a last-modified date of sometime in 2010 but I can't say exactly when as my connection to the Toolserver just in fact went down. -_- I've already modified the YahooFind function to use BOSS so if I could get an app ID from someone at WMF that would work. Or we could go for a second try at trying to get permission from Google; I do think that would yield the best results. &mdash; madman 00:44, 14 February 2012 (UTC)
 * Done at meta:User_talk:Steven_(WMF)  MBisanz  talk 03:23, 14 February 2012 (UTC)
 * You're the best! While I'm waiting on that, I'm going to continue documentation. I'd also like to update the code to use a Wagner-Fischer module on CPAN instead of one that's nine years old and takes a while to track down. :p Cheers! &mdash; madman 03:35, 14 February 2012 (UTC)
 * Yeah, he does; he knows User:Mdennis (WMF). :) It was Erik that worked that magic. FWIW, I wrote Coren over a week ago, and have not yet received a reply. Hope all is well with him! But I'll ping Erik and Philippe and see what we can do. I know that Steven would like to see the bot up and running again as well, as he's working on a study that requires evaluation of CSB notices. --Moonriddengirl (talk) 11:24, 14 February 2012 (UTC)
 * Great! I hope this works. Thanks.  MBisanz  talk 16:36, 14 February 2012 (UTC)
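
For readers unfamiliar with the Wagner-Fischer algorithm mentioned in the thread above: it computes the edit distance between two sequences by dynamic programming. The sketch below is in Python rather than the bot's Perl, and the function name `edit_distance` is illustrative. The thread does not say how the bot applies the distance, so nothing here should be read as its actual comparison logic.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

The function accepts any pair of sequences, so it can compare strings character by character or lists of words token by token.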

– 48 edits, since tagging one page involves three edits. The first page tagged only went to the article and User talk because I set an edit threshold of two edits for the first run. Buddleja 'Pink Pagoda' was tagged due to a known issue with CorenSearchBot, which I believe I've now fixed for MadmanBot. Looks good to me; I'm happy! &mdash; madman 18:35, 24 February 2012 (UTC)
 * Just for interest: will you also check the CSB/manual requests? mabdul 18:51, 24 February 2012 (UTC)
 * It will check User:MadmanBot/manual as that's how the original code works. If requested, I could also possibly code in checks to User:CorenSearchBot (and transclude User:MadmanBot/results there). &mdash; madman 19:57, 24 February 2012 (UTC)
 * We could simply redirect CSB/manual to your new page (easy to revert if Coren comes back), and the waiting requests could easily be added to your new manual page by copy and paste. So it's good to know where to put new requests ;) mabdul 20:08, 24 February 2012 (UTC)
 * Actually, I already coded in a $FATHER variable, and if it's set it'll check the $FATHER/manual page and post to $FATHER/results. I don't want to make it too hard for Coren to bring CSB back online. &mdash; madman 20:18, 24 February 2012 (UTC)
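
The `$FATHER` behaviour described in the last reply can be pictured as a simple page-name fallback. This is a Python sketch, not the bot's Perl: the function name `work_pages` is hypothetical, and both the fallback to the bot's own userspace when no father is set and the assumption that the variable holds a full page name such as `User:CorenSearchBot` are inferred from the thread rather than confirmed by it.

```python
def work_pages(own_page, father=None):
    """Return the manual-request and results pages the bot would use."""
    # Assumption: with no father set, the bot works inside its own userspace.
    base = father if father else own_page
    return {"manual": base + "/manual", "results": base + "/results"}
```

Called as `work_pages("User:MadmanBot", father="User:CorenSearchBot")`, the sketch points both pages at CorenSearchBot's userspace, which would let the original bot resume without its request pages having moved.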

Manual check functionality has been tested successfully (and is continuing to take requests within MadmanBot's userspace). I fixed a bug with HTML entities in titles a while back; a similar bug with underscores in titles has been fixed. (There also was a small bug with Coren's regexes that he probably fixed but the fix wasn't in the code he posted; I've fixed it independently.) Not seeing any further problems. &mdash; madman 20:58, 24 February 2012 (UTC)
 * One question, did you get the WMF user ID you needed or are you still using the other search provider?  MBisanz  talk 21:47, 25 February 2012 (UTC)
 * I got login information from the WMF. :) &mdash; madman 21:48, 25 February 2012 (UTC)
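
The thread above mentions bugs with HTML entities and underscores in titles but does not describe them. One plausible reading is that titles arrived in an encoded form and had to be normalized before use. The Python sketch below shows that kind of normalization under that assumption; the function name `normalize_title` and the sample title are illustrative, and this is not the bot's Perl code.

```python
import html


def normalize_title(raw):
    """Decode HTML entities and turn underscores into spaces in a page title."""
    # Assumption: entities are decoded first, then underscores become spaces.
    return html.unescape(raw).replace("_", " ").strip()
```

For example, `normalize_title("Tom_&amp;_Jerry")` yields `Tom & Jerry`.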

 MBisanz  talk 21:50, 25 February 2012 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.