Wikipedia:Bots/Requests for approval/XLinkBot 2


 * The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

XLinkBot 2
Operator: User:Versageek, co-operated/programmed by User:Beetstra

Automatic or Manually Assisted: Automatic

Programming Language(s): Perl

Function Summary: Reverting addition of external links which probably do not comply with policy and/or guideline; reverting spam

Edit period(s) (e.g. Continuous, daily, one time run): continuously

Already has a bot flag (Y/N): (runs without botflag)

Function Details: XLinkBot has been approved to revert external link additions by new editors and/or IP editors where the addition is probably in violation of policy and/or guideline, using an on-wiki revertlist. External links are detected using the rule '(?:https?|ftp|irc|gopher|telnet|nntp|worldwind):\/\/[^\s\]\[\{\}\\\|^`<>]+', and the rules on the revertlist are matched against specific parts of each detected link. The link-parsing bots report links that match rules on the revertlist to XLinkBot, which determines whether the edit needs to be reverted as defined in the original BRFA.
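
For illustration, a minimal Python sketch of the two-stage matching described above. This is not XLinkBot's Perl code: the function names and the sample revertlist rule '\bexample\.com\b' are hypothetical, and only the link-detection rule is copied from the text.

```python
import re

# Link-detection rule copied verbatim from the function details above.
URL_RULE = re.compile(
    r"(?:https?|ftp|irc|gopher|telnet|nntp|worldwind):\/\/[^\s\]\[\{\}\\\|^`<>]+"
)


def find_links(added_text):
    """Return every external link found in the text added by an edit."""
    return URL_RULE.findall(added_text)


def links_matching_rules(added_text, revert_rules):
    """Return (rule, link) pairs where a revertlist rule matches part of a link.

    revert_rules is a list of regex strings; its format is an assumption made
    for this sketch, not the format of the real on-wiki revertlist.
    """
    hits = []
    for link in find_links(added_text):
        for rule in revert_rules:
            if re.search(rule, link):
                hits.append((rule, link))
    return hits


if __name__ == "__main__":
    sample = "See [http://www.example.com/buy Example] and ftp://files.example.org/x"
    print(find_links(sample))
    print(links_matching_rules(sample, [r"\bexample\.com\b"]))  # hypothetical rule
```

In this sketch a link is only reported when a revertlist rule matches it; whether the edit is actually reverted is a separate decision, as described above.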

The bots that do the actual parsing of articles also detect the addition of email addresses, using the rule '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}'. These additions are (together with the above-mentioned URLs) reported to #wikipedia-en-spam (all en.wikipedia additions), and e-mail addresses are also reported to #wikimedia-alerts (for all monitored wikis). I have given some examples in Bots/Requests_for_approval/XLinkBot_2/catches. I mention here that the bots also TRY to catch the addition of telephone numbers, but that sequence is not suitable for automatic reverting.
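
As a rough illustration of what the e-mail rule catches, here is a minimal Python sketch. The rule is copied verbatim from the paragraph above; the function name and the first and third sample strings are hypothetical. The second sample is the string from the trial that the rule misread as an address.

```python
import re

# E-mail detection rule copied verbatim from the paragraph above.
EMAIL_RULE = re.compile(
    r"(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}"
)


def find_email_like(added_text):
    """Return every substring of the added text that the rule treats as an e-mail address."""
    return EMAIL_RULE.findall(added_text)


if __name__ == "__main__":
    # An ordinary address preceded by whitespace is matched.
    print(find_email_like("Mail info@example.org for details"))
    # Not an address, but it has the same shape, so the rule matches it too.
    print(find_email_like("the ''live@wembley.2000 tour"))
    # The (?!\.htm) lookahead skips page-like names such as index@foo.htm.
    print(find_email_like("see index@foo.htm"))
```

The rule tests shape only, so any token of the form text@word.word preceded by whitespace or a colon is reported, whether or not it is a deliverable address.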

Generally, there are hardly any cases where e-mail addresses should be added to mainspace on the English Wikipedia. OTRS regularly gets complaints about third parties mentioning their e-mail addresses on Wikipedia, and other additions are often promotional. Moreover, Biographies of living persons states (in the Privacy of personal information section):

... Wikipedia articles should not include addresses, e-mail addresses, telephone numbers, or other contact information for living persons, ...

The original BRFA does not include reverting the addition of email addresses to mainspace. As I feel that the addition of the rule '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' to the revertlist would be a significant extension of the scope of the bot, I'd rather discuss that addition as a 'new task' than as a simple discussion on the talkpage of the revertlist. --Dirk Beetstra T C 13:39, 17 January 2009 (UTC)

Addition: another possibility would be to allow reverting specific e-mail addresses by using rules starting with @ (e.g. '@domain\.com\b', so that all domain.com email addresses added to mainspace would be reverted, or '\bsomeone@domain\.com\b' for that specific email address). This would still involve an expansion of the task, though it would give more control. --Dirk Beetstra T C 15:47, 17 January 2009 (UTC)
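
A minimal Python sketch of how such narrower rules would behave, assuming they are applied to an address that has already been detected. The two rules are the examples given above; the function name and the sample addresses are hypothetical.

```python
import re

# The two example rules from the comment above.
DOMAIN_RULE = r"@domain\.com\b"            # any address at domain.com
ADDRESS_RULE = r"\bsomeone@domain\.com\b"  # one specific address


def matching_rules(address, rules=(DOMAIN_RULE, ADDRESS_RULE)):
    """Return the rules, in order, that match the given e-mail address."""
    return [rule for rule in rules if re.search(rule, address)]


if __name__ == "__main__":
    for address in (
        "anyone@domain.com",         # matched by the domain rule only
        "someone@domain.com",        # matched by both rules
        "someone@domain.community",  # \b after "com" prevents a match
    ):
        print(address, matching_rules(address))
```

The trailing \b is what stops '@domain\.com\b' from also matching longer domains that merely begin with domain.com.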

Discussion
I have had to request revision deletion on three cases (if I remember correctly). There should be a way to add legitimate emails, but overall, I don't see why they would be in the mainspace. Should be. NonvocalScream (talk) 19:35, 17 January 2009 (UTC)
 * Emails should not be in articles, and a bot to revert their addition would certainly be a good thing. Let's see it in action.  Richard 0612  16:52, 20 January 2009 (UTC)
 * Rule added diff. Thanks.  Please poke me if I forget to remove the rule again .. --Dirk Beetstra T  C 17:07, 20 January 2009 (UTC)

... The addition of the rule killed the bot (and I have been away since that moment), and it has been offline for nearly two days now (and I don't see my error yet, so now testing). --Dirk Beetstra T C 11:45, 22 January 2009 (UTC)


 * I have removed the rule again; the trial is about over. Below is a list of all reverts regarding the new rule in XLinkBot's last 500 edits: 16 reverts out of approximately 250-275 reverts (2 edits per revert; some contributions are now on deleted pages). I have not copied the difflinks; if needed, they are in the warnings the bot left on the talkpages:


 * 16:10, 25 January 2009 (hist) (diff) N User talk:117.96.56.78 ‎(BOT - Notifying 117.96.56.78 of reverted link additions to List of schools in India (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> info@jalaninternationalschool.com)) (top) [rollback]
 * 16:09, 25 January 2009 (hist) (diff) List of schools in India ‎(BOT--Reverting link addition(s) by 117.96.56.78 to revision 266240537 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> info@jalaninternationalschool.com)) (top) [rollback]
 * 16:05, 25 January 2009 (hist) (diff) User talk:Samcgtantra ‎(BOT - Notifying Samcgtantra of reverted link additions to Cgtantra.com (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> support@cgtantra.com))
 * 14:25, 25 January 2009 (hist) (diff) User talk:BoruB ‎(BOT - Notifying BoruB of reverted link additions to Open Spaces Society (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> (raoxon@yahoo.co.uk)) (top) [rollback]
 * 14:25, 25 January 2009 (hist) (diff) Open Spaces Society ‎(BOT--Reverting link addition(s) by BoruB to revision 266312013 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> (raoxon@yahoo.co.uk)) (top) [rollback]
 * 13:21, 25 January 2009 (hist) (diff) User talk:119.95.229.254 ‎(BOT - Notifying 119.95.229.254 of reverted link additions to Steps (group) (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> ''live@wembley.2000)) (top) [rollback]
 * 13:21, 25 January 2009 (hist) (diff) Steps (group) ‎(BOT--Reverting link addition(s) by 119.95.229.254 to revision 266301604 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> ''live@wembley.2000))
 * 11:58, 25 January 2009 (hist) (diff) User talk:Jimclose2000 ‎(BOT - Notifying Jimclose2000 of reverted link additions to Riz Pardaz Salsabil (SMP Co.) (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> info@smpict.com)) (top) [rollback]
 * 11:12, 25 January 2009 (hist) (diff) N User talk:81.153.180.33 ‎(BOT - Notifying 81.153.180.33 of reverted link additions to Watton, Norfolk (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> info@ngbc.co.uk))
 * 11:12, 25 January 2009 (hist) (diff) Watton, Norfolk ‎(BOT--Reverting link addition(s) by 81.153.180.33 to revision 261833731 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> info@ngbc.co.uk))
 * 08:06, 25 January 2009 (hist) (diff) N User talk:24.127.204.148 ‎(BOT - Notifying 24.127.204.148 of reverted link additions to Berlin Wall (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> luisga1990@yahoo.com)) (top) [rollback]
 * 08:06, 25 January 2009 (hist) (diff) Berlin Wall ‎(BOT--Reverting link addition(s) by 24.127.204.148 to revision 266167021 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> luisga1990@yahoo.com))
 * 08:01, 25 January 2009 (hist) (diff) N User talk:173.18.2.247 ‎(BOT - Notifying 173.18.2.247 of reverted link additions to 7 Up (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> dmickelson@mchsi.com)) (top) [rollback]
 * 08:00, 25 January 2009 (hist) (diff) 7 Up ‎(BOT--Reverting link addition(s) by 173.18.2.247 to revision 265790262 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> dmickelson@mchsi.com))
 * 01:04, 25 January 2009 (hist) (diff) N User talk:75.19.35.128 ‎(BOT - Notifying 75.19.35.128 of reverted link additions to Larissa (given name) (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> sweetbutwild@wild4music.com)) (top) [rollback]
 * 01:04, 25 January 2009 (hist) (diff) Larissa (given name) ‎(BOT--Reverting link addition(s) by 75.19.35.128 to revision 266222070 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> sweetbutwild@wild4music.com)) (top) [rollback]
 * 22:34, 24 January 2009 (hist) (diff) N User talk:72.203.144.248 ‎(BOT - Notifying 72.203.144.248 of reverted link additions to Anna Hutchison (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> contact:powerrangerjunglefury@gmail.com)) (top) [rollback]
 * 22:34, 24 January 2009 (hist) (diff) Anna Hutchison ‎(BOT--Reverting link addition(s) by 72.203.144.248 to revision 265800899 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> contact:powerrangerjunglefury@gmail.com))
 * 22:00, 24 January 2009 (hist) (diff) N User talk:88.107.39.3 ‎(BOT - Notifying 88.107.39.3 of reverted link additions to Pikelet (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> masacoola@hotmail.com))
 * 22:00, 24 January 2009 (hist) (diff) Pikelet ‎(BOT--Reverting link addition(s) by 88.107.39.3 to revision 266166683 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> masacoola@hotmail.com))
 * 17:21, 24 January 2009 (hist) (diff) N User talk:67.184.42.155 ‎(BOT - Notifying 67.184.42.155 of reverted link additions to Ashlie Michelle Cebak (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> signings08@live.com)) (top) [rollback]
 * 17:21, 24 January 2009 (hist) (diff) Ashlie Michelle Cebak ‎(BOT--Reverting link addition(s) by 67.184.42.155 to revision 259674128 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> signings08@live.com)) (top) [rollback]
 * 16:00, 24 January 2009 (hist) (diff) User talk:218.248.67.35 ‎(BOT - Notifying 218.248.67.35 of reverted link additions to Syntax (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> ravi.india87@yahoo.com)) (top) [rollback]
 * 16:00, 24 January 2009 (hist) (diff) Syntax ‎(BOT--Reverting link addition(s) by 218.248.67.35 to revision 264972476 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> ravi.india87@yahoo.com)) (top) [rollback]
 * 14:43, 24 January 2009 (hist) (diff) N User talk:119.154.2.98 ‎(BOT - Notifying 119.154.2.98 of reverted link additions to Sunidhi Chauhan (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> sh_726@hotmail.com))
 * 14:43, 24 January 2009 (hist) (diff) Sunidhi Chauhan ‎(BOT--Reverting link addition(s) by 119.154.2.98 to revision 265526020 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> sh_726@hotmail.com))
 * 14:06, 24 January 2009 (hist) (diff) N User talk:114.130.8.194 ‎(BOT - Notifying 114.130.8.194 of reverted link additions to Chunky Pandey (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> :eng.apurba@yahoo.com)) (top) [rollback]
 * 14:05, 24 January 2009 (hist) (diff) Chunky Pandey ‎(BOT--Reverting link addition(s) by 114.130.8.194 to revision 263735119 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> :eng.apurba@yahoo.com)) (top) [rollback]
 * 11:29, 24 January 2009 (hist) (diff) User talk:125.17.148.2 ‎(BOT - Notifying 125.17.148.2 of reverted link additions to Suya (good faith remark) (matching '(?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3}' -> gajender_bisht@yahoo.co.in)) (top) [rollback]
 * 11:29, 24 January 2009 (hist) (diff) Suya ‎(BOT--Reverting link addition(s) by 125.17.148.2 to revision 265970931 (matching (?<![^\s:])[^\s\]\[\{\}\\\|^\/`<>@:]+@\w+(?!\.htm)(?:\.\w+){1,3} -> gajender_bisht@yahoo.co.in))

Going through these edits, the only edit which looks appropriate (and hence was 'wrongly' reverted) is diff by an IP, where "live@wembley.2000" was misinterpreted as an email address (edits 6 and 7 in the list above). The IP seems to have followed XLinkBot's advice to revert the bot-revert, and the edit still stands (bot does not revert twice in a row, etc. etc.). All the others are either of the form 'email us for more info' or 'I made this edit, email me if you have questions' and should be reverted. Note that the warning/remark that is left on the user talkpage can be configured (by any admin) in the settings (User:XLinkBot/Settings). --Dirk Beetstra T C 18:48, 25 January 2009 (UTC)
 * Everything looks good here, this regex will certainly help with cleaning up the mass of spam Wikipedia seems to get every day!  Richard  0612  20:48, 25 January 2009 (UTC)


 * The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.