Wikipedia:Bots/Requests for approval/ClueBot


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

ClueBot
Operator: Winbots

Automatic or Manually Assisted: Manually assisted until I have corrected all bugs and false positives are few and far between.

Programming Language(s): PHP, my own classes to interact with Wikipedia's query.php, api.php, and index.php

Function Summary: Reverting vandalism.

Edit period(s) (e.g. Continuous, daily, one time run): Continuous

Edit rate requested: Only as often as it finds vandalism via Special:RecentChanges, but no faster than 10 edits per minute.

Already has a bot flag (Y/N): N

Function Details: Request RecentChanges and perform basic analysis on the data returned. If it finds something suspicious, request more information about that page and perform more in-depth analysis on that information. If it determines that the edit looks like vandalism and it is in read-only mode, send information to owner. If it determines that the edit looks like vandalism and it is in read-write mode, request more information on the user. If the user has a long history on Wikipedia, log the incident, but do nothing. If the user is new or an IP address, revert, log the incident, and post a notice on the user's talk page.
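The decision flow described under Function Details can be sketched roughly as follows. This is only an illustration in Python (the bot itself is stated to be written in PHP); the names `Action` and `decide` and the three boolean inputs are hypothetical and do not come from the bot's source.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for the outcomes named in Function Details."""
    IGNORE = "ignore"
    NOTIFY_OWNER = "notify_owner"
    LOG_ONLY = "log_only"
    REVERT_LOG_WARN = "revert_log_warn"


def decide(looks_like_vandalism: bool, read_only: bool,
           established_user: bool) -> Action:
    """Map the analysis result to one of the described outcomes."""
    if not looks_like_vandalism:
        return Action.IGNORE
    if read_only:
        # Read-only mode: only tell the owner.
        return Action.NOTIFY_OWNER
    if established_user:
        # Long-standing user: log the incident, take no action.
        return Action.LOG_ONLY
    # New account or IP address: revert, log, and warn on the talk page.
    return Action.REVERT_LOG_WARN
```

For example, `decide(True, False, False)` gives `Action.REVERT_LOG_WARN`, the only branch in which the bot edits anything.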

The bot will also set its user page and its source code page when starting.

Discussion
A couple questions.
 * 1) Why not use the recent changes IRC feed? It's a lot less bandwidth-intensive, both on the server and on the client.
 * 2) Why only report potential vandalism to the owner? Why not open up the communication to the community?
 * 3) Are you aware that bots like this already exist, such as AntiVandalBot and MartinBot?
 * 4) Do you envision this bot being a replacement for bots like AntiVandalBot and MartinBot, or a supplement?
 * 5) Are you confident that as a relatively-new user (57 contributions), that you fully understand Vandalism and the Bot policy?

Thanks! — Madman bum and angel (talk – desk) 21:14, 24 July 2007 (UTC)
 * A couple of answers.
 * The bot does not request Special:RecentChanges but rather uses the api.php page specifically designed to be used by bots.
 * I did add a /PossibleVandalism page, but what I meant by reporting it to its owner is on the command line, which is inaccessible to other users for obvious reasons.
 * Yes, I am. But, this being my first Wikipedia bot, I thought I should try something reasonably easy to code. I do plan on expanding (after getting permission, of course) to make the bot more useful.
 * More as a supplement. I don't see the need to deactivate the current bots.  Their rules are likely much more refined.  I am making this bot because I enjoy creating automated systems in my spare time, and thought I could contribute to Wikipedia.
 * Yes, I am. I have read both pages, I have corrected a fair amount of vandalism during the last couple of days (as you can see here).  I have implemented all the items under the "Good form" header.  And I intend to fix any errors my bot makes.


 * Thanks! Winbots 22:33, 24 July 2007 (UTC)
 * What algorithm will your bot use to determine vandalism? Will it be using a scoring system, a dictionary? —  xaosflux  Talk 22:27, 24 July 2007 (UTC)
 * A scoring system of sorts. I have coded a basic scoring system, see here. Winbots 22:35, 24 July 2007 (UTC)
 * Providing a diff on the /PossibleVandalism page would be helpful. ~   Wi ki  her mit  04:07, 25 July 2007 (UTC)
 * Done. New entries' "changed" text is now a link to the diff. Winbots 04:39, 25 July 2007 (UTC)

As you're probably aware, it took quite a bit of work to get our current antivandalism bots to where they are today. False positives obviously aren't the end of the world but they are still a problem. Having said that, I'd be interested in seeing what kind of scoring system you plan on implementing. -- S up? 08:35, 25 July 2007 (UTC)


 * How about the bot removing reports after a certain amount of time on the page? ~   Wi ki  her mit  08:49, 25 July 2007 (UTC)
 * Currently it looks like the bot searches for large deletions of text from articles. You could also create a blacklist of words, and have the bot scan recent changes for edits that include words on that list. You should add a whitelist if you plan to do that, to help cut down on false positives. ~   Wi ki  her mit  08:58, 25 July 2007 (UTC)
 * It now does remove reports after 5 hours, otherwise the page gets very large. About the searching for large deletions, yes, that is one of the things the code currently does. It also searches for massive additions (and runs them through the scoring system), page blanks, and page replaces. I have made a blacklist/whitelist and a scoring system (documentation) (see above). Winbots 13:49, 25 July 2007 (UTC)
 * The bot is having some problems. See diff. It also had a problem when the vandal replaced the page with Image:Example.jpg. It reported it as [[Image:Example.jpg]] which made the image appear on the page. See diff ~   Wi ki  her mit  15:53, 26 July 2007 (UTC)
 * Yes, sorry, I have fixed that problem now. Winbots 17:43, 26 July 2007 (UTC)

I have now implemented the maxlag feature. The bot will sleep 10 seconds, then abort the edit if the server is lagged more than 2 seconds. Winbots 01:24, 27 July 2007 (UTC)
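A minimal sketch of the behaviour described above, assuming the MediaWiki API's `maxlag` request parameter and its `maxlag` error code. The function name `try_edit` and the injected `send` callable are hypothetical stand-ins for the bot's own HTTP layer, which is not shown in this discussion.

```python
import time
from typing import Any, Callable, Dict

MAXLAG_SECONDS = 2    # lag threshold stated in the comment above
BACKOFF_SECONDS = 10  # sleep duration stated in the comment above


def try_edit(send: Callable[[Dict[str, Any]], Dict[str, Any]],
             params: Dict[str, Any],
             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Attempt one edit; sleep and give up if the server reports lag.

    `send` is assumed to post the parameters to api.php and return the
    decoded JSON response as a dict.
    """
    request = dict(params, maxlag=MAXLAG_SECONDS)
    response = send(request)
    if response.get("error", {}).get("code") == "maxlag":
        sleep(BACKOFF_SECONDS)
        return False  # edit aborted
    return True
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting.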

I have now coded the reverting/warning feature. It will remain disabled until such time as the bot is given a trial run, though. Winbots 16:57, 28 July 2007 (UTC)

I have also coded a feature to ask me about each revert before proceeding, if enabled. I plan on enabling this feature during the trial run, when the trial run is granted. Winbots 22:43, 28 July 2007 (UTC)


 * MartinBot (AVB is out of commission now) uses a custom warning. I suggest that you do the same, to make it clear that the warning has come from a bot.  Also, will the bot be able to increment warnings and/or report users to AIV (the bot section, ideally)?  Thanks, Martinp23 23:01, 28 July 2007 (UTC)


 * The bot does use a custom warning, see here. It also increments warnings (only from its own warnings, it doesn't detect others' warnings) and reports to AIV under the bot section after 4 warnings have been issued. Winbots 23:13, 28 July 2007 (UTC)
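The escalation described in this reply could look something like the sketch below. It assumes the bot counts the warnings already on a user's talk page; the name `next_action` and the tuple return values are hypothetical, and the exact point at which the bot reports is an interpretation of "after 4 warnings have been issued".

```python
from typing import Optional, Tuple

MAX_WARNINGS = 4  # from the reply above: report after 4 warnings


def next_action(prior_warnings: int) -> Tuple[str, Optional[int]]:
    """Return the next step for a user with `prior_warnings` warnings.

    Either ("warn", level) to issue the next warning level, or
    ("report_to_aiv", None) once all warning levels are used up.
    """
    if prior_warnings >= MAX_WARNINGS:
        return ("report_to_aiv", None)
    return ("warn", prior_warnings + 1)
```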


 * Any particular reason it doesn't detect others? If your bot doesn't pick up one or two, and others do, and get to final warning, then you give another level 1 warning, it's unusual and they should be reported to AIV. I don't see any good in not detecting other warnings. Matt/TheFearow (Talk) (Contribs) (Bot) 23:15, 28 July 2007 (UTC)


 * It now does detect others' warnings. It also will not revert the same title more than once per day. Winbots 00:02, 29 July 2007 (UTC)
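One way to enforce a once-per-day limit per title is a small in-memory throttle like the sketch below. The class name `RevertThrottle` and its `allow` method are hypothetical, and the 24-hour window is an assumption about what "once per day" means here.

```python
from datetime import datetime, timedelta
from typing import Dict


class RevertThrottle:
    """Remember when each title was last reverted."""

    def __init__(self, window: timedelta = timedelta(hours=24)) -> None:
        self.window = window
        self._last_revert: Dict[str, datetime] = {}

    def allow(self, title: str, now: datetime) -> bool:
        """Return True and record the revert if the title is not throttled."""
        last = self._last_revert.get(title)
        if last is not None and now - last < self.window:
            return False
        self._last_revert[title] = now
        return True
```

A refused call does not update the stored time, so the window always runs from the last revert that was actually allowed.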


 * It also will revert back more than one edit if the vandal has made several consecutive edits, until it has found an edit not made by the vandal. It will not try to go back further than 5 edits.  Winbots 00:40, 29 July 2007 (UTC)
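The search for a revision to revert to might be sketched as follows. It assumes the page history is available newest-first as `(revision_id, username)` pairs and reads the 5-edit limit as "look at no more than 5 edits before the latest one"; the function name `find_revert_target` is hypothetical.

```python
from typing import List, Optional, Tuple


def find_revert_target(history: List[Tuple[int, str]],
                       max_depth: int = 5) -> Optional[int]:
    """Return the newest revision id not made by the latest editor.

    `history` is ordered newest first. Returns None when no such
    revision is found within `max_depth` earlier edits.
    """
    if not history:
        return None
    vandal = history[0][1]
    for rev_id, user in history[1:max_depth + 1]:
        if user != vandal:
            return rev_id
    return None
```

Returning `None` instead of guessing leaves the page alone when every recent edit comes from the same user.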


 * ("undenting") We don't flag bots that deal with vandalism, so the edit rate will have to be lower to not clog up recentchanges. Otherwise it seems we ironed out all of the problems. ~   Wi ki  her mit  00:43, 29 July 2007 (UTC)

I'm approving you for a 50 revert trial. Each diff should be manually checked before allowing the bot to revert, so I expect an edit rate of no more than 2 epm. For the final 10 reverts, presuming you have had no problems, please set the bot to full-auto mode and (while watching) let it go at an edit rate of no more than 6epm (a 10 second sleep). Thanks, Martinp23 00:57, 29 July 2007 (UTC)


 * Before this is approved, I'd really like to see it go through a longer trial (perhaps a fortnight) so that we can pick up on any problems raised by the community as they come to see it. Of course, approval for that trial can only come after the current one! Martinp23 23:32, 30 July 2007 (UTC)
 * I agree. ((BotTrial|days=14|editrate=6)). Matt/TheFearow (Talk) (Contribs) (Bot) 01:47, 31 July 2007 (UTC)
 * No - this trial can only come when the current one is completed and we've been able to have a look over the results. For now, ignore that approval for a 2 week trial. Martinp23 11:28, 31 July 2007 (UTC)


 * Trial completed. All 50 edits. The 40 assisted edits at 2epm. The 10 fully automated edits at 6epm.  Winbots 05:51, 1 August 2007 (UTC)
 * Not completely done looking through, but it looks great! One note: it needs to detect other users' final warnings, as I have seen several cases where it should have reported to AIV. I'll comment more when I check more. Matt/TheFearow (Talk) (Contribs) (Bot) 05:55, 1 August 2007 (UTC)
 * Everything else looks good, and no false positives. If a higher edit rate would be useful, I will approve for that. Matt/TheFearow (Talk) (Contribs) (Bot) 05:58, 1 August 2007 (UTC)

Matt, this is a major bot, so it helps to have more than one person look over the trial results before jumping ahead. As it is, there are a few instances of things like this, which don't make sense. Also, I note that the bot is reverting to the same page more than once, which I believe I explained on IRC to not be a good idea - we often get IPs or new users blanking massive sections of articles completely legitimately, where those articles violate BLP policy. There are similar cases which mean that a bot edit war is completely undesirable (by all means have a mode, manually set, which allows you to give the bot permission to make more than one revert, but by default it should be off).

For warnings, I'd suggest that you craft your own, and issue only two before reporting the user (so, report on the third offence). It would be nice if the bot could recognise other peoples' warnings, but not essential. Thanks, Martinp23 12:21, 1 August 2007 (UTC)
 * The bot will not revert more than once per article per 24 hour period. As I have watched it, it hasn't (that I am aware of) reverted more than once in a 24 hour period.  The bot adds vandalism to /PossibleVandalism whether or not it reverts it due to certain constraints.  I agree, I need to make it check more thoroughly that it actually corrected it before saying it corrected it on the /PossibleVandalism page.  That link you provided was because of a strange coincidence where the edit was within the last second of the recent changes the bot requested the first time, so when it requested all articles since then, at a later time, the recent changes page still gave it that article.  Winbots 17:06, 1 August 2007 (UTC)
 * OK. I've seen the "reverted by Cluebot before I saw it" message quite often now - not sure if it's something you can fix or not (it's only a cosmetic issue anyway). Martinp23 17:14, 1 August 2007 (UTC)
 * Yeah, I'll see if I can fix that. Winbots 17:30, 1 August 2007 (UTC)
 * Also, it does use its own warnings, based off of the official warning templates. And it does detect others' warnings.  Just, it only honors others' warnings made within the last 48 hours. Winbots 17:29, 1 August 2007 (UTC)
 * Martin, sorry about that. I went through, and saw nothing wrong, so I re-approved. I'll wait a bit longer on these in the future. A single feature I would recommend, that would probably be a different bot, but it should be easy to implement, is to make it report a page to RFPP if it recognises more than 30 pieces of vandalism on it from several different names in a 48 hour period - it would be incredibly useful, and it would find out a lot of pages subject to heavy vandalism. This is probably good as a different bot entirely, but it would be quite possible using this bot and its already good vandalism detecting features. Matt/TheFearow (Talk) (Contribs) (Bot) 21:35, 1 August 2007 (UTC)

Looks like it's working well, I don't see why not to. --ST47 Talk·Desk 22:05, 12 August 2007 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.