Wikipedia:Bots/Requests for approval/PxBot II


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Denied.

PxBot II
Operator:  PxMa

Automatic or Manually Assisted: auto

Programming Language(s): .NET languages

Function Summary: revert vandalism

Edit period(s) (e.g. Continuous, daily, one time run): continuous

Edit rate requested: Not sure what this should be set at...

Already has a bot flag (Y/N): No

Function Details: Revert vandalism. The bot feeds from recent changes (API or RSS, if I can figure out the latter) and filters out registered users, so it only reverts IPs. I can also filter by namespace. For now it will only revert in the articlespace. I might make an option list where you can "opt in", like ClueBot, so it will watch that page regardless of namespace. The bot will be able to detect others' warnings using   and other HTML tags. It will also warn according to the last warning (for example, it won't add uw-vandalism4 if the last uw-vandalism3 was two months ago). I'm also working on a function so it will report to AIV after four warnings. I am considering two ways of finding vandalism. The first is bad words, and the second is a certain number of bytes changed (XXX number of bytes added or deleted). I've looked over other bots and I think I've generally covered all areas. I'll be making a set of custom warnings for the bot, so users know that it was a bot that reverted. The bot won't be too aggressive; if it's not sure about a change, it doesn't revert. Any more ideas?
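
The warning rule described above (escalate from the last warning, but start over if that warning is old) could be sketched as follows. This is a minimal Python illustration, not the bot's .NET code; the function name `next_warning_level` and the 30-day staleness window are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed window: a warning older than this no longer counts toward escalation.
STALE_AFTER = timedelta(days=30)
MAX_LEVEL = 4  # uw-vandalism1 through uw-vandalism4


def next_warning_level(last_level: Optional[int],
                       last_warned: Optional[datetime],
                       now: datetime) -> int:
    """Return the uw-vandalism level (1-4) to issue next."""
    # No earlier warning on the talk page: start at level 1.
    if last_level is None or last_warned is None:
        return 1
    # The earlier warning is stale: start over instead of escalating.
    if now - last_warned > STALE_AFTER:
        return 1
    # Otherwise escalate by one level, capped at the final warning.
    return min(last_level + 1, MAX_LEVEL)
```

With these assumptions, a uw-vandalism3 issued two months ago leads to a fresh uw-vandalism1, while one issued yesterday leads to uw-vandalism4.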

Discussion
You can't get an RSS feed of recent changes, as far as I know. Will the bot revert to itself, or to another bot? — madman bum and angel 02:42, 17 October 2007 (UTC)
 * Ah, looks like you can't :(. What do you mean by "Will the bot revert to itself, or to another bot?"  PxMa 02:52, 17 October 2007 (UTC)
 * If the bot reverts a user, then the user "vandalizes" again, however the bot may define that, will the bot revert again to its own version? I ask only because such behavior is discouraged.  — madman bum and angel 02:54, 17 October 2007 (UTC)
 * I could make it do that, unless it's unwanted.  PxMa 11:47, 17 October 2007 (UTC)
 * RSS recent changes feed A le_Jrb talk 21:10, 17 October 2007 (UTC)
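
The concern raised in this thread, that the bot should not keep reverting to its own version or fight with other bots, could be handled with a check along these lines. This is a Python sketch under assumptions: `should_revert` and the `KNOWN_BOTS` list are hypothetical, and the page history is assumed to be available as a list of editor names, newest first.

```python
from typing import FrozenSet, Sequence

# Assumed list of bot accounts; a real bot would need a maintained list.
KNOWN_BOTS: FrozenSet[str] = frozenset({"PxBot II", "ClueBot", "AntiVandalBot"})


def should_revert(editors_newest_first: Sequence[str],
                  bots: FrozenSet[str] = KNOWN_BOTS) -> bool:
    """Decide whether a suspect edit may be reverted automatically.

    editors_newest_first[0] is the author of the suspect edit and
    editors_newest_first[1] is the author of the revision that a
    revert would restore.
    """
    if len(editors_newest_first) < 2:
        return False  # there is no earlier revision to restore
    suspect, previous = editors_newest_first[0], editors_newest_first[1]
    if suspect in bots:
        return False  # never revert another bot (or itself)
    if previous in bots:
        return False  # would restore a bot's version; leave it to a human
    return True
```

The second check is deliberately conservative: if the page was last touched by a bot and is edited again, the sketch leaves the decision to human editors.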

I've had a quick scan through your code. Unfortunately I am not familiar with DotNetWikiBot; however, I am presuming that "i.text" contains the changed content of the page rather than the whole page itself? Overall, the method you're using for working out whether a revert is needed is currently very basic and will need more work, otherwise you're going to be reverting some obvious articles over and over :) Good luck though! Lloydpick 23:18, 18 October 2007 (UTC)
 * Yes, the i.text only contains the changes, not the entire article. It is quite simple right now, and I plan to remove the most obvious pages from being watched. The scoring system needs a lot of work, but the basics work for now.  PxMa 00:20, 19 October 2007 (UTC)
 * I would say attempting to whitelist pages which you don't want scanned will take considerably longer than constructing a proper scoring method. As you say, this is still a work in progress, but I really wouldn't try to eliminate false positives by whitelisting articles; your list would be unmanageable. Lloydpick 01:02, 19 October 2007 (UTC)
 * I've only white listed the pages with the highest chance of false positives, like this and this.  PxMa 01:35, 19 October 2007 (UTC)
 * Update: I've created a system for identifying old warnings by their timestamps, the bot can warn users at the right level, rollback is working, and the basic scoring system is in place. I should be ready for a trial run fairly soon.  PxMa 16:39, 19 October 2007 (UTC)
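
For illustration, a scoring method of the kind discussed in this thread might combine weighted whole-word matches with the size of the change. This Python sketch is not the bot's scoring code; `score_edit`, the weights, and the removal threshold are all assumptions chosen for the example.

```python
import re
from typing import Mapping


def score_edit(added_text: str,
               bytes_delta: int,
               weights: Mapping[str, int],
               large_removal: int = -2000) -> int:
    """Return a suspicion score for one edit; higher means more suspect.

    added_text    -- the text the edit added (not the whole page)
    bytes_delta   -- change in page size, negative for removals
    weights       -- hypothetical per-word scores
    large_removal -- assumed threshold for a suspiciously large removal
    """
    score = 0
    # Split into whole words so that a listed word inside a longer
    # word (for example "hell" inside "shell") does not match.
    for word in re.findall(r"[a-z']+", added_text.lower()):
        score += weights.get(word, 0)
    # Large removals add a fixed amount to the score.
    if bytes_delta <= large_removal:
        score += 3
    return score
```

A bot built this way would revert only when the score passes some threshold, which is easier to tune than a growing whitelist of pages.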


 * Personally, I would like to see what would trigger bot reverts first. Also, I'm a little worried about your hardcoded dates: if ((t.text.Contains("19 October 2007")) || (t.text.Contains("20 October 2007")) || (t.text.Contains("18 October 2007"))) Do you intend on manually keeping those up to date every time you run the bot?  What if the bot runs past midnight?  — Coren (talk) 23:42, 24 October 2007 (UTC)
 * On a similar note, I have a question about the logic it uses to determine whether to report someone to AIV, if I'm reading it correctly. If the user was given a "uw-vandalism4" at any time in the past on their page and is then issued a "uw-vandalism1", say, today, would you report the user to AIV? Lloydpick 00:21, 25 October 2007 (UTC)
 * I haven't uploaded the newest source recently, but those issues were fixed :)  C O  02:41, 25 October 2007 (UTC)
 * ? — <tt>madman bum and angel</tt> 13:47, 25 October 2007 (UTC)
 * Are you the author of the code PxMa proposes to operate, CO? — Coren (talk) 14:19, 25 October 2007 (UTC)
 * I'm confused too... did you find a fix?  PxMa 14:40, 25 October 2007 (UTC)
 * Sorry for the confusion. I've been following your source and I've been messing around with it on my local wiki, and I've figured out a way so you don't have to "hardcode" the dates in. I'll email you what I got so far.  C O  15:58, 25 October 2007 (UTC)
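
One way to avoid the hardcoded dates discussed in this thread is to parse the signature timestamps on the talk page and compare them with the current time. The Python sketch below is not the fix that was emailed. The helper names, the two-day window, and the `(level, time)` representation of warnings are assumptions for illustration.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Matches signature timestamps such as "23:42, 24 October 2007 (UTC)".
SIG_RE = re.compile(r"(\d{2}):(\d{2}), (\d{1,2}) ([A-Z][a-z]+) (\d{4}) \(UTC\)")


def signature_times(text: str) -> List[datetime]:
    """Return every signature timestamp found in the text."""
    times = []
    for hh, mm, day, month, year in SIG_RE.findall(text):
        if month not in MONTHS:
            continue
        try:
            times.append(datetime(int(year), MONTHS[month], int(day),
                                  int(hh), int(mm)))
        except ValueError:
            continue  # skip impossible dates such as 31 February
    return times


def warned_recently(text: str, now: datetime,
                    window: timedelta = timedelta(days=2)) -> bool:
    """True if any signature in the text falls within the window before now."""
    return any(timedelta(0) <= now - t <= window
               for t in signature_times(text))


def should_report_to_aiv(warnings: Iterable[Tuple[int, datetime]],
                         now: datetime,
                         window: timedelta = timedelta(days=2)) -> bool:
    """Report only if a level-4 warning was issued within the window."""
    return any(level == 4 and timedelta(0) <= now - t <= window
               for level, t in warnings)
```

Because the comparison uses elapsed time rather than date strings, running past midnight needs no special handling, and an old uw-vandalism4 does not by itself trigger a report.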

This bot seems to me redundant to other similar countervandalism bots presently in operation on Wikipedia. How is this one's task different or how is its implementation better? Is there a real need for yet another countervandalism bot? I've looked through the source, and I have to say that the algorithm used to determine if an edit is vandalism is overly simplistic and appears quite prone to producing false positives. In contrast to the complexity of algorithms used by, for instance, User:AntiVandalBot (an algorithm that even after over a year of maturation and expansion still turns up the occasional false positive), the entirety of this bot's algorithm fits on three lines in an if-statement. A simple assessment of whether an edit introduces a "bad word" or not is a) going to miss a hell of a lot of vandalism, making its use of server resources quite inefficient, and b) match many edits that are not in bad faith (including, if "hell" is considered a bad word, this one). I also see no built-in safeguards. What will cause this bot to stop editing? Do you plan to implement a mechanism to stop it from editing when it receives a new message on its talk page or to have it only stop when it is blocked? (And if the latter, will it die gracefully or explode?) The ideas you're exploring seem to me too immature to warrant a trial run of the bot at this time, and I would strongly suggest you review the work of other bot ops and consider how you can apply what they have discovered to your own work here. Once the idea is more developed and you've had some time to work out the kinks, then you should start exploring getting approval for a trial run or permanent operation of the bot. AmiDaniel (talk) 03:00, 26 October 2007 (UTC)
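
The "stop on new talk page message" safeguard asked about above could be sketched as a check before every edit. This Python example makes several assumptions: that the bot queries the MediaWiki API with `meta=userinfo` and `uiprop=hasmsg`, that a `messages` key in the reply signals pending messages (this should be verified against the API documentation), and that `fetch_userinfo` and `handle_edit` are hypothetical callables supplied by the bot.

```python
from typing import Any, Callable, Dict, Iterable


def has_new_messages(api_response: Dict[str, Any]) -> bool:
    """True if a parsed userinfo reply signals pending talk page messages.

    Assumes the reply has the shape {"query": {"userinfo": {...}}} and
    that a "messages" key (not set to False) means new messages.
    """
    userinfo = api_response.get("query", {}).get("userinfo", {})
    return "messages" in userinfo and userinfo["messages"] is not False


def run(edits: Iterable[Any],
        fetch_userinfo: Callable[[], Dict[str, Any]],
        handle_edit: Callable[[Any], None]) -> int:
    """Process edits until the talk page gets a message; return the count handled."""
    handled = 0
    for edit in edits:
        # Check before every action so that anyone can halt the bot
        # without needing an administrator to block it.
        if has_new_messages(fetch_userinfo()):
            break
        handle_edit(edit)
        handled += 1
    return handled
```

Stopping cleanly in the loop also answers the "die gracefully" question: the bot finishes nothing half-done and simply returns.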


 * I would tend to agree. Regardless of my technical misgivings (which  states he has addressed), I would want to see which vandalism this bot would catch that the others would not before a test run.  Other important worries:
 * This bot has no code to make it play nice with other bots (not revert to another AV bot, or for that matter not reverting an AV bot!) As it now stands, if someone reverts someone blanking an article containing a "bad" word, it will revert the fix (and eventually report it to AIV!)
 * This bot is not exclusion compliant, has no opt-out mechanism, and no non-blocking way to stop it
 * This bot has a detection algorithm much too primitive to be useful, with no scoring whatsoever. As it stands, it's little but a "censor bot", and not a very good one at that (does that framework even handle "bad" words that are part of bigger words?)
 * While fighting vandalism is a laudable goal, at this point I feel requesting approval for a new AV bot is premature, especially since its code base is not yet well-developed. I would recommend that you refine the bot, preferably by testing it on a Wiki, and get consensus from the community that another such bot is useful and desired first.  With no prejudice to a new request once the bot is mature,  — Coren (talk) 16:20, 26 October 2007 (UTC)
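
The exclusion-compliance point in the list above could be addressed with a check of the page text before editing. This Python sketch is a simplified illustration: `is_excluded` is a hypothetical helper, and it only recognizes `{{nobots}}` and a `deny=` list in `{{bots}}`, not the full set of parameters the templates support.

```python
import re

NOBOTS_RE = re.compile(r"\{\{\s*nobots\s*\}\}", re.IGNORECASE)
DENY_RE = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=\s*([^}|]*)", re.IGNORECASE)


def is_excluded(page_text: str, bot_name: str) -> bool:
    """True if the page asks this bot (or all bots) to stay away."""
    # {{nobots}} excludes every bot.
    if NOBOTS_RE.search(page_text):
        return True
    # {{bots|deny=...}} excludes the listed bots, or all of them.
    match = DENY_RE.search(page_text)
    if match:
        denied = [name.strip().lower() for name in match.group(1).split(",")]
        return "all" in denied or bot_name.lower() in denied
    return False
```

A check like this gives editors an opt-out that does not require blocking the bot.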


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.