User:Emijrp/AVBOT

An effort to rewrite the Anti-Vandalism BOT code.

Users to watch:
 * Anons
 * Registered users with < 10 edits

Users to exclude:
 * Bots
 * Admins
 * Registered users with >= 10 edits

Edits to watch:
 * Article edits

Edits to exclude:
 * Talk edits
 * Non-article edits
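
The watch and exclude rules above could be combined into a single filter. This is only a sketch: the `Edit` record and its field names are hypothetical and not part of pywikipediabot. It assumes the usual MediaWiki conventions that namespace 0 holds articles and that bots and admins belong to the `bot` and `sysop` groups.

```python
from dataclasses import dataclass

MIN_TRUSTED_EDITS = 10                      # threshold taken from the lists above
ARTICLE_NAMESPACE = 0                       # MediaWiki main (article) namespace
EXCLUDED_GROUPS = frozenset({"bot", "sysop"})


@dataclass(frozen=True)
class Edit:
    """Hypothetical record describing one recent change."""
    user: str
    is_anon: bool
    user_edit_count: int
    user_groups: frozenset
    namespace: int


def should_watch(edit):
    """Return True when the edit matches the watch rules and no exclude rule."""
    if edit.namespace != ARTICLE_NAMESPACE:
        return False                        # talk and other non-article edits
    if edit.user_groups & EXCLUDED_GROUPS:
        return False                        # bots and admins
    if edit.is_anon:
        return True                         # anons are always watched
    return edit.user_edit_count < MIN_TRUSTED_EDITS
```

For example, `should_watch(Edit("1.2.3.4", True, 0, frozenset(), 0))` is true, while the same edit in namespace 1 (a talk page) is skipped.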

Classes of edits to revert:
 * Mass blanking
 * Mass blanking + adding vandalism
 * Mass blanking + mass adding (text replacing)
 * Section blanking (be careful with BLPs: count refs to avoid reverting legitimate removals?)
 * Section blanking + adding vandalism
 * Section blanking + mass adding
 * Mass adding
 * Insults, bad words, word replacing, etc.
 * Nonsense
 * Test edits
 * NPOV
 * Blacklisted URLs: youporn.com, etc.
 * ASCII Art
 * Long-term vandals
 * Date changers (impossible birth/death dates)
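
A rough way to tell the bulk classes above apart (mass blanking, mass adding, text replacing) is to diff the old and new text word by word. The sketch below uses the standard `difflib` module; the function name and the thresholds `min_words` and `blank_fraction` are assumptions chosen for illustration, not tuned values. The other classes (insults, nonsense, NPOV, dates, etc.) would need their own checks.

```python
import difflib


def classify_bulk_change(old_text, new_text, min_words=100, blank_fraction=0.8):
    """Label an edit as mass blanking and/or mass adding, or return None.

    min_words and blank_fraction are illustrative thresholds (assumptions).
    """
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)

    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed += i2 - i1              # words dropped from the old text
        if tag in ("insert", "replace"):
            added += j2 - j1                # words new in the edited text

    blanked = (len(old_words) >= min_words
               and removed >= blank_fraction * len(old_words))
    mass_added = added >= min_words

    if blanked and mass_added:
        return "mass blanking + mass adding"
    if blanked:
        return "mass blanking"
    if mass_added:
        return "mass adding"
    return None
```

Counting words instead of bytes keeps the diff cheap, but it ignores where in the page the change happened, so section blanking would still need a section-aware comparison.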

Training reverts:
 * Edits made by anons
 * Reverts made by users with > 1000 edits
 * Reverted in < 3 minutes
 * The edit before the vandalism and the edit after the revert are by two users other than the vandal and the reverter
 * Use MD5 hashes to verify that the revert restores the earlier text exactly
 * Exclude reverts with the summary: good faith edits by...
 * The edit is not re-inserted after being reverted: compare all MD5 hashes in the history?
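
The MD5 checks above might look like this. It is a minimal sketch that assumes the page history is already available as a list of `(user, wikitext)` pairs, oldest first; the function names are hypothetical. Only the exact-restore, different-user and no-re-insertion criteria are covered; the edit-count, timing and edit-summary criteria are left out.

```python
import hashlib


def md5_of(text):
    """Hex MD5 digest of a revision's wikitext."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def find_clean_reverts(history):
    """Return indexes of revisions that were reverted exactly and never came back.

    history: list of (user, wikitext) tuples, oldest revision first.
    """
    digests = [md5_of(text) for _, text in history]
    reverted = []
    for i in range(1, len(history) - 1):
        before, bad, after = digests[i - 1], digests[i], digests[i + 1]
        if before != after or bad == before:
            continue                        # next revision did not restore the text
        vandal, reverter = history[i][0], history[i + 1][0]
        if vandal == reverter:
            continue                        # self-revert, not useful for training
        if bad in digests[i + 2:]:
            continue                        # same text was re-inserted later
        reverted.append(i)
    return reverted
```

With `[("Alice", "good text"), ("1.2.3.4", "POOP"), ("Bob", "good text")]` as history, the function returns `[1]`.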

Stuff to study:
 * Position of the added text: beginning, end, after templates, inline in a section, section beginning, section end (be careful with templates, interwikis, categories, etc.; a lot of anons add interwikis)
 * Renaming sections
 * Make a ranking of the words most often inserted and quickly deleted
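
The word ranking could start from something like the sketch below, assuming the training reverts are available as `(text_before, text_after_vandalism)` pairs. The function names are hypothetical, and splitting on `\w+` is a simplification that ignores wiki markup.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"\w+")


def added_words(old_text, new_text):
    """Count words that appear more often in new_text than in old_text."""
    old_counts = Counter(WORD_RE.findall(old_text.lower()))
    new_counts = Counter(WORD_RE.findall(new_text.lower()))
    return new_counts - old_counts          # Counter subtraction keeps positive counts only


def rank_reverted_words(reverted_edits, top=10):
    """Rank words inserted by edits that were quickly reverted.

    reverted_edits: iterable of (text_before, text_after_vandalism) pairs.
    """
    totals = Counter()
    for old_text, new_text in reverted_edits:
        totals.update(added_words(old_text, new_text))
    return totals.most_common(top)
```

Words that rank high here but are also common in edits that survive would make poor signals, so the same count over non-reverted edits would be a useful baseline.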

Desired features:
 * Easy to install
 * i18n
 * No dependencies (except pywikipediabot, obviously)
 * Load recent changes using IRC channels or the API feed
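
For the API feed option, the request could be built with the standard library alone, which fits the no-dependencies goal. This sketch only constructs the URL for the MediaWiki `list=recentchanges` query; the function name and the chosen `rcprop` fields are assumptions, and the endpoint is passed in by the caller.

```python
from urllib.parse import urlencode


def recent_changes_url(api_endpoint, limit=50):
    """Build a MediaWiki API URL listing recent non-bot article edits."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcnamespace": 0,                   # articles only
        "rctype": "edit",                   # skip page creations and log entries
        "rcshow": "!bot",                   # exclude bot edits
        "rcprop": "title|ids|user|sizes|timestamp|comment",
        "rclimit": limit,
        "format": "json",
    }
    return api_endpoint + "?" + urlencode(params)
```

For example, `recent_changes_url("https://example.org/w/api.php")` gives a URL that can be fetched and decoded as JSON; polling it is simpler than the IRC feed but less immediate.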