User talk:とある白い猫/bots

General Working
Reads the IRC feed from Wikipedia and uses regular expressions to divide each message into its constituent parts. The only thing it ignores is the protection log, since there doesn't seem (or didn't seem) to be a reliable way of parsing it.

Each event from Wikipedia is then categorised, possibly into multiple categories, based on the configuration and lists.

The routing information lists the categories which should be reported to each channel. After categorising, each change is compared against these routes and reported to the channels as appropriate.
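The parse, categorise and route steps described above could look roughly like the sketch below. It is only an illustration: the feed line format is a simplified, hypothetical one (the real feed carries colour codes and a URL), and the names `LINE_RE`, `parse`, `categorise` and `route`, the category names and the threshold value are all assumptions, not pgkbot's actual code.

```python
import re

# Hypothetical, simplified feed line: "[[Title]] flags * user * (+delta) comment".
# The real feed also carries colour codes and a URL, which are left out here.
LINE_RE = re.compile(
    r"\[\[(?P<title>.+?)\]\]\s+(?P<flags>\S*)\s*\*\s+(?P<user>.+?)\s+\*\s+"
    r"\((?P<delta>[+-]\d+)\)\s*(?P<comment>.*)"
)


def parse(line):
    """Split one feed line into its parts, or return None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["delta"] = int(event["delta"])
    return event


def categorise(event, big_removal=-2000, blacklist=frozenset()):
    """Return every category the event falls into (possibly several)."""
    categories = set()
    if "N" in event["flags"]:
        categories.add("new_page")
    if event["delta"] <= big_removal:
        categories.add("big_removal")
    if event["user"] in blacklist:
        categories.add("blacklist")
    return categories


def route(categories, routes):
    """Return the channels whose route list shares a category with the event."""
    return sorted(ch for ch, wanted in routes.items() if categories & wanted)
```

An event can land in several categories at once, so a channel receives it as soon as any one of its routed categories matches.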

Multiple wikis
Supports multiple wikis (untested). Each runs under the same nick and the same running code, but the configuration, lists and alternate name are all unique to that wiki.
 * 1) * [[Image:Yes_check.svg|15px|Yes]] I have this. However, one problem exists: "special" events are generally in native languages. Also, my bot can't support Unicode, another reason why mIRC is less preferable. However, people using the bot (Japanese guys, for example) use clients that also lack Unicode support; they use some other encoding, though. Unicode characters are corny. -- Cool CatTalk 00:20, 6 February 2006 (UTC)

Configuration
Apart from IRC details most configuration is dynamic.
 * 1) Threshold for small pages
 * 2) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:20, 6 February 2006 (UTC)
 * 3) Threshold for big new pages
 * 4) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:20, 6 February 2006 (UTC)
 * 5) Threshold for big removals
 * 6) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:20, 6 February 2006 (UTC)
 * 7) Threshold for big additions
 * 8) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:20, 6 February 2006 (UTC)
 * 9) Name to respond to as well as nickname
 * 10) * ? What is this? -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 11) ** This is so that, as well as responding to its own nick, it can respond to another name, e.g. "computer". This is per language, so the en wiki could use "computer", then the fr wiki "ordinateur", or whatever they want really. (The main point of this was so it could listen for wl, bl etc. in the same channel your bot was running in.) --pgk( talk ) 22:11, 7 February 2006 (UTC)
 * Comment: what I think (correct me if I'm mistaken) computer is missing is the ability to change those values on the fly (by channel operators, for instance) without having to edit configuration files locally. -- ( drini's page ☎  ) 22:13, 7 February 2006 (UTC)
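As a rough illustration of the per-wiki settings listed above, and of changing them on the fly from a channel, here is a minimal sketch. The class `WikiConfig`, the functions `set_option` and `is_addressed`, and all default values are hypothetical; neither bot is known to store its configuration this way.

```python
from dataclasses import dataclass, fields


@dataclass
class WikiConfig:
    """Per-wiki settings; the default numbers are placeholders, not real values."""
    small_page: int = 50
    big_new_page: int = 10000
    big_removal: int = 2000
    big_addition: int = 5000
    alt_name: str = "computer"


def set_option(config, key, raw_value, is_operator):
    """Change one option at runtime; return True only if the change was applied."""
    if not is_operator or key not in {f.name for f in fields(config)}:
        return False
    current = getattr(config, key)
    try:
        value = type(current)(raw_value)  # keep the option's existing type
    except ValueError:
        return False
    setattr(config, key, value)
    return True


def is_addressed(message, nick, config):
    """True if the message starts with the bot's nick or its alternate name."""
    words = message.split()
    if not words:
        return False
    first = words[0].rstrip(":,").lower()
    return first in (nick.lower(), config.alt_name.lower())
```

Keeping one `WikiConfig` per wiki would let the en wiki answer to "computer" and the fr wiki to "ordinateur" while sharing the same running code.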

Routing
Events can be routed to multiple channels if need be, and any channel can be set to accept commands. When no events are left for a channel and it is not required to accept commands, the bot automatically leaves the channel. The only exception is the channel defined as the wiki's control channel: the bot never leaves it and always accepts commands there.
 * [[Image:X_mark.svg|15px|No]] I kinda have something like this: new users are dumped to one channel, but everything else is dumped to #wikipedia-en-vandalism -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * The interesting bit is changing the destinations on the fly. -- ( drini's page ☎  ) 22:20, 7 February 2006 (UTC)
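The routing behaviour described above (destinations changeable on the fly, leaving a channel once nothing is routed to it, never leaving the control channel) can be sketched as follows. The `Router` class and its method names are invented for illustration, and the channel names in any usage are placeholders.

```python
class Router:
    """Tracks which event categories go to which channel (illustrative only)."""

    def __init__(self, control_channel):
        self.control_channel = control_channel
        self.routes = {}               # channel -> set of event categories
        self.command_channels = set()  # extra channels that accept commands

    def add_route(self, channel, category):
        self.routes.setdefault(channel, set()).add(category)

    def remove_route(self, channel, category):
        categories = self.routes.get(channel, set())
        categories.discard(category)
        if not categories:
            self.routes.pop(channel, None)

    def accepts_commands(self, channel):
        # The control channel always accepts commands.
        return channel == self.control_channel or channel in self.command_channels

    def should_stay(self, channel):
        # Stay while the channel takes commands or still has events routed to it.
        return self.accepts_commands(channel) or bool(self.routes.get(channel))

    def channels_for(self, category):
        return sorted(ch for ch, cats in self.routes.items() if category in cats)
```

After each `remove_route` call the bot would check `should_stay` and part the channel when it returns `False`.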

Lists
Lists are built around a generic mechanism, so each can have
 * 1) An Expiry time, after which the entry is automatically deleted
 * 2) * [[Image:X_mark.svg|15px|No]] I didn't do this (lazy me) -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 3) A reason for adding to or removing from the list (removals aren't stored yet)
 * 4) * [[Image:Yes_check.svg|15px|Yes]] and my removals are stored (the entry is commented out and timestamped, and the person committing the removal is recorded) -- Cool CatTalk 00:21, 6 February 2006 (UTC)

The lists can all be manipulated online (except the greylist).
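A generic list with a per-entry expiry time and reason, as described above, might be sketched like this. `ExpiringList` and its methods are hypothetical names, and the injectable `clock` parameter is only there to make the expiry behaviour easy to test.

```python
import time


class ExpiringList:
    """Case-sensitive list whose entries carry a reason and an optional expiry."""

    def __init__(self, clock=time.time):
        self._entries = {}  # name -> (reason, expiry timestamp or None)
        self._clock = clock

    def add(self, name, reason, ttl=None):
        """Add or replace an entry; ttl is a lifetime in seconds, None = forever."""
        expires = None if ttl is None else self._clock() + ttl
        self._entries[name] = (reason, expires)

    def remove(self, name):
        """Delete an entry; return True if it was present."""
        return self._entries.pop(name, None) is not None

    def __contains__(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return False
        _, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._entries[name]  # expired entries are dropped lazily
            return False
        return True

    def reason(self, name):
        return self._entries[name][0] if name in self else None
```

Expired entries are removed lazily on lookup here, which avoids a background timer; a real bot might equally sweep the list on a schedule.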

Users
These are all case-sensitive, like Wikipedia is. User lists can contain either IPs or named users.


 * 1) adminlist - List of Wikipedia administrators
 * 2) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 3) botlist - List of bots (so they can be dropped from the output)
 * 4) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 5) blacklist - List of users to watch all edits
 * 6) * [[Image:Yes_check.svg|15px|Yes]] (not sure if it works for users) -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 7) whitelist - List of trusted users
 * 8) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 9) greylist* - List of users who have been rolled back by an admin in the last 10 minutes
 * 10) * [[Image:X_mark.svg|15px|No]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)

These are mutually exclusive: being on one list stops you being on the others. The exception is the greylist, which any user can go on in addition to another list, apart from admins and blacklisted users.
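The exclusivity rule could be enforced along the lines of the sketch below. It assumes that adding a user to one list moves them off the others (the page does not say whether the bot moves the user or refuses the request), and the `UserLists` class is a made-up name. The greylist's 10-minute expiry is left out for brevity.

```python
EXCLUSIVE = ("adminlist", "botlist", "blacklist", "whitelist")


class UserLists:
    """Mutually exclusive user lists plus a separate greylist (illustrative)."""

    def __init__(self):
        self.lists = {name: set() for name in EXCLUSIVE}
        self.greylist = set()

    def add(self, list_name, user):
        # Assumption: adding to one list removes the user from the others.
        for name in EXCLUSIVE:
            self.lists[name].discard(user)
        self.lists[list_name].add(user)
        # Admins and blacklisted users are never kept on the greylist.
        if list_name in ("adminlist", "blacklist"):
            self.greylist.discard(user)

    def greylist_add(self, user):
        """Greylist a user unless they are an admin or already blacklisted."""
        if user in self.lists["adminlist"] or user in self.lists["blacklist"]:
            return False
        self.greylist.add(user)
        return True

    def lists_for(self, user):
        found = [name for name in EXCLUSIVE if user in self.lists[name]]
        if user in self.greylist:
            found.append("greylist")
        return found
```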

Articles
These are all case-sensitive, like Wikipedia is.


 * 1) CVP - Commonly vandalised pages - always reported (including deletion and creation)
 * 2) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 3) CNVP - Commonly not vandalised pages - never reported (Don't think this has been used yet)
 * 4) * [[Image:X_mark.svg|15px|No]] (is this necessary, since all pages can be vandalised? Is this to ignore sandboxes etc.?) -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 5) ** I'm not convinced it's a requirement at all; as I say, I don't think it has been used, and it is only really suitable for sandboxes etc. --pgk( talk ) 22:13, 7 February 2006 (UTC)

Others

 * 1) CVI - Commonly vandalised images - reports uploads/deletions of the image (case sensitive)
 * 2) * [[Image:X_mark.svg|15px|No]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 3) BNU - "words" suspicious in new user names, these are actually not words but regular expressions, case insensitive.
 * 4) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)
 * 5) Wheeler - "words" to look for in moved pages to guess if it's a wheeler or not, again regular expressions and case insensitive.
 * 6) * [[Image:Yes_check.svg|15px|Yes]] -- Cool CatTalk 00:21, 6 February 2006 (UTC)
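Since the BNU and Wheeler entries are regular expressions matched case-insensitively, checking a new user name or a move target against them could look like this sketch. The function names and the example patterns are made up for illustration and are not taken from either bot's lists.

```python
import re


def compile_patterns(patterns):
    """Compile list entries as case-insensitive regular expressions."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]


def first_match(compiled, text):
    """Return the source of the first pattern found in text, or None."""
    for pattern in compiled:
        if pattern.search(text):
            return pattern.pattern
    return None


# Hypothetical entries; the real lists are maintained from the channel.
wheeler = compile_patterns([r"on\s+wheels"])
bnu = compile_patterns([r"admin", r"v[a4]ndal"])
```

Using `search` rather than `match` lets an entry hit anywhere in the name, which suits "words" that may appear mid-title.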

Security
Security is based on the OP and Voice status of channel users. Private messages are not accepted.
 * 1) * [[Image:Yes_check.svg|15px|Yes]] indirectly that's true (given only voiced and opped people can talk). -- Cool CatTalk 00:22, 6 February 2006 (UTC)

Misc

 * 1) Quiet/Speak: can be told to stop talking and to start talking again.
 * 2) * [[Image:X_mark.svg|15px|No]] (thinking about implementing, but don't think it will be necessary with the merge) -- Cool CatTalk 00:22, 6 February 2006 (UTC)
 * 3) Status: records the last time an event was seen on Wikipedia and reports that as part of the status
 * 4) * [[Image:X_mark.svg|15px|No]] -- Cool CatTalk 00:22, 6 February 2006 (UTC)
 * 5) Reader reset: the RC reader connection can be reset on command, so that it drops the connection to Wikimedia and rejoins
 * 6) * [[Image:X_mark.svg|15px|No]] -- Cool CatTalk 00:22, 6 February 2006 (UTC)
 * 7) Can load files (whitelist, CVP etc.) from Cool Cat's bot into its "database"
 * 8) * [[Image:X_mark.svg|15px|No]] (by nature) :P -- Cool CatTalk 00:22, 6 February 2006 (UTC)
 * 9) Can download an admin list from Wikipedia (offline only at the moment)
 * 10) * [[Image:X_mark.svg|15px|No]] I thought about using a socket link to read the Wikipedia admin list page but backed down due to bandwidth concerns; this is quite easy on mIRC. -- Cool CatTalk 00:22, 6 February 2006 (UTC)
 * 11) Autoblacklist works against any editor. It tries (sometimes not too successfully) to work out the length of the block imposed and tailors the length of the blacklisting to that.
 * 12) * [[Image:X_mark.svg|15px|No]] I only auto-blacklist IPs. Blacklistable users should be blocked indefinitely in my POV. -- Cool CatTalk 00:22, 6 February 2006 (UTC)
 * 13) ** Yes, this is a debatably useful feature; I often end up de-blacklisting them. In the original discussions I had, one or two people felt we should keep it, but I've never been that convinced. --pgk( talk ) 22:22, 7 February 2006 (UTC)
 * 14) Blockconflicts - keeps track of blocks and checks whether the end times of blocks imposed on the same editor differ by more than a set number of hours.
 * 15) * [[Image:X_mark.svg|15px|No]] excellent idea -- Cool CatTalk 00:22, 6 February 2006 (UTC)
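The Blockconflicts check described above compares block end times for the same editor. A minimal sketch, assuming blocks arrive with an already-parsed end time (the page notes that working out block lengths is not always successful), with `BlockTracker` and its one-hour default as invented placeholders:

```python
from datetime import datetime, timedelta


class BlockTracker:
    """Remembers blocks per editor and flags differing end times (illustrative)."""

    def __init__(self, max_difference_hours=1):
        self.max_difference = timedelta(hours=max_difference_hours)
        self.blocks = {}  # editor -> list of (admin, end_time)

    def record(self, editor, admin, end_time):
        """Store a block and return earlier blocks whose end time conflicts."""
        earlier = self.blocks.setdefault(editor, [])
        conflicts = [
            (other_admin, other_end)
            for other_admin, other_end in earlier
            if abs(end_time - other_end) > self.max_difference
        ]
        earlier.append((admin, end_time))
        return conflicts
```

A non-empty return value is what would be reported to the channel as a block conflict.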

Future

 * 1) Quiet/Speak per channel
 * 2) * [[Image:X_mark.svg|15px|No]] I think you can do this with a few variables :P -- Cool CatTalk 00:23, 6 February 2006 (UTC)
 * 3) Admin list update - either enable this as a regular job, or add people as admins if it sees them doing admin only things (deletes, blocks), or a combination - download the list if it sees someone it doesn't know about doing a delete or block.
 * 4) * [[Image:X_mark.svg|15px|No]] Might be better to do this semi-automatic. -- Cool CatTalk 00:23, 6 February 2006 (UTC)
 * 5) Routing presets to enable fast switching between what is being reported on a given channel
 * 6) * ? What is this? -- Cool CatTalk 00:23, 6 February 2006 (UTC)
 * 7) ** This is to do with the routing of events to different channels. At the moment, reporting on vandalism2 has a list of, say, 10-15 different items to report; although this is flexible, it is annoying when it needs to report somewhere else. The idea is to allow a list of events to be grouped into a preset, so that a channel can be set to receive that group of events very quickly. --pgk( talk ) 22:16, 7 February 2006 (UTC)
 * So for instance, instead of having "quiet" or "speak", you could do pgkbot dest preset "minimal" to only report greylist and blockconflict, I assume. Routing is explained above. -- ( drini's page  ☎  ) 22:18, 7 February 2006 (UTC)
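The preset idea discussed above amounts to naming a group of event categories and assigning the whole group to a channel in one step. A small sketch, where the `PRESETS` table, the `apply_preset` function and the "newusers" preset are hypothetical; only the "minimal" preset's contents follow the example given in the discussion:

```python
# Named groups of event categories. "minimal" follows the example in the
# discussion; "newusers" is invented for illustration.
PRESETS = {
    "minimal": {"greylist", "blockconflict"},
    "newusers": {"new_user", "bnu"},
}


def apply_preset(routes, channel, preset_name, presets=PRESETS):
    """Replace a channel's routed categories with those of a named preset."""
    if preset_name not in presets:
        raise KeyError(f"unknown preset: {preset_name}")
    routes[channel] = set(presets[preset_name])  # copy, so presets stay intact
    return routes[channel]
```

Copying the preset into the channel's route set means later per-channel tweaks do not alter the preset itself.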

Stuff your pgkbot lacks and computer has

 * 1) Statistics showing how active a Wikipedia is. Could be used to determine what percentage of the edits trigger the bot's thresholds -- Cool CatTalk 00:23, 6 February 2006 (UTC)
 * 2) Multi-lingual support. (my bot covers 10 languages) -- Cool CatTalk 00:23, 6 February 2006 (UTC)
 * 3) * Although pgkbot can watch over several languages, it depends on which ones you put as parameters when you start it. -- ( drini's page ☎  ) 22:19, 7 February 2006 (UTC)
 * 4) * In theory it does support the concept, but would require translation of the message and help files. I would also need to make sure the parsing of the events from WP also worked for the different languages. --pgk( talk ) 22:20, 7 February 2006 (UTC)