Wikipedia:Bots/Requests for approval/BattyBot 18


 * The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.

BattyBot 18
Operator: GoingBatty

Time filed: 01:33, Tuesday February 19, 2013 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: Create new sockpuppet categories

Links to relevant discussions (where appropriate): Bot requests/Archive 53

Edit period(s): Weekly runs

Estimated number of pages affected: about 200 during the first run, then a couple per week

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Example category created: Category:Suspected Wikipedia sockpuppets of Paliku
 * Copy the list of categories from WP:Database reports/Red-linked categories with incoming links into AWB
 * Filter the list to only keep those that contain "sockpuppets of"
 * If the category does not exist, create it with the template "Sockpuppet category"
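The three steps listed above can be sketched in Python. This is only an illustration under stated assumptions, not the AWB setup actually proposed: the function names are hypothetical, the set of existing titles stands in for a real page-existence check, and the wikitext `{{Sockpuppet category}}` is assumed from the template name given in the function details.

```python
SOCK_PHRASE = "sockpuppets of"
# Assumed wikitext for the template named in the function details.
TEMPLATE_WIKITEXT = "{{Sockpuppet category}}"


def select_sock_categories(titles):
    """Keep only the red-linked category titles that contain the sock phrase."""
    return [title for title in titles if SOCK_PHRASE in title]


def plan_creations(titles, existing_titles):
    """Map each missing sock category to the page text it would be created with.

    `existing_titles` is a stand-in for a real existence check against the wiki.
    """
    existing = set(existing_titles)
    return {
        title: TEMPLATE_WIKITEXT
        for title in select_sock_categories(titles)
        if title not in existing
    }
```

Calling `plan_creations` with the titles copied from the database report and the titles already known to exist returns a dictionary of pages to create; nothing is written to the wiki by this sketch.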

Discussion

 * Seems pretty straightforward and uncontroversial. I don't see any issues yet.— cyberpower ChatOnline 02:58, 19 February 2013 (UTC)
 * Comment as the original suggestor - the first run will create about 200 categories to clear the backlog, after that it should be no more than a couple per week. Should probably test for case where the name is blank and ignore it, we can do that manually. Le Deluge (talk) 12:29, 19 February 2013 (UTC)
 * Seems rather useless to me. Why would we create Category:Wikipedia sockpuppets of Scrotel, which only contains an IP address (open proxy)? Category:Wikipedia sockpuppets of Joestieg has been a populated redlink for over 6 years, what's the use of creating it now? Category:Wikipedia sockpuppets of Rexmorgan also has been a redlink since 2006, for a 2006 sock of an editor who hasn't edited since 2008 (and apparently hasn't socked apart from the once in 2006). Category:Wikipedia sockpuppets of 74.138.237.75 is a cat labeling an IP (probably no longer used by the same person anyway) as the master of one editor, User:Rlholt, who has a tag claiming that he is blocked indefinitely while in reality he was blocked for 48 hours only, and has made only one edit in 2009. Category:Suspected Wikipedia sockpuppets of Ewok Slayer seems problematic (and very stale anyway). Category:Suspected Wikipedia sockpuppets of Avineshjose only has an IP (open proxy) which probably is no longer in use by the original sockmaster.
 * This task will create thousands of new categories, with very little benefit (the only purpose seems to be the removal of these pages from the database report, no indication that the vast, vast majority will ever be used). Removing the "sock" tag from every user (talk) page where it has sat for more than 6 months or a year without the creation of the accompanying category, seems like a more useful proposal which will achieve the same result (they will get removed from the database report) without the needless category creation. See this and the next pages for the thousands (not 200) of redlinks that match this bot task. Fram (talk) 13:01, 19 February 2013 (UTC)
 * So what if thousands of new categories are created in the initial clearing of the backlog, doesn't WP:Don't worry about performance apply? And why should these socks be treated any differently to those that have had categories created? If you're worrying about filling up the sock categories with subcategories, how many is too many? As for suddenly deciding that a user is no longer a sock after x months (I'd go for at least 12) - that's maybe useful for IPs, but not usernames. Still, it's a separate task that applies equally to those in categories and those which are not. For those with usernames, are you arguing for some kind of archiving system by date of last known sock activity or something? I'm not sure that's particularly useful, but that's where your argument ends up. Le Deluge (talk) 14:03, 19 February 2013 (UTC)
 * And once the one-off backlog is cleared, this seems to be something genuinely helpful to the anti-sock community: it raises the profile of socks that might otherwise go unnoticed by putting the cat in the appropriate place in the cat hierarchy, and the template gives them various tools to use against the sock. Le Deluge (talk) 14:16, 19 February 2013 (UTC)
 * "Don't worry about performance" is applicable if what you are creating is actually useful. What I question is the actual usefulness of these cats, many of them years old where neither the socks nor the masters have edited for years, and no one seems to have felt the need to create these cats for years either. Whether the existing ones should be kept around for ever is a different discussion, and not very relevant for this bot discussion. But actively creating categories that for 99% will never be used by anyone, and which are (in the case of IPs) outdated at the moment of creation, is an utter waste. Plus it has the risk of creating nonsense or attack cats: we once had a user creating all these cats, but this included Category:Suspected Wikipedia sockpuppets of Ghirlandajo. Someone had tagged an IP incorrectly (or maliciously) as a sock of Ghirlandajo. No one noticed this, and Ghirlandajo had no real method of knowing this: once you create such a category though, a user gets linked to sockpuppetry in a much more visual and direct way.
 * As an example, this bot would create Category:Suspected Wikipedia sockpuppets of Elfguy: as far as I can tell, the tagged user and the suspected master have nothing to do with each other. This cat is from 2005, Elfguy is still editing. What is the purpose, the benefit, of creating a cat that (probably incorrectly) links him to sockpuppetry? Fram (talk) 14:25, 19 February 2013 (UTC)
 * I've invited the SPI people to comment here. I don't think the usefulness is in question for newly-tagged users. I think the usefulness for old categories is no greater or worse than existing sock categories from 2005 or whenever. The bot is neutral - any "attack" is made by the person who applies the category tag, not by the user who creates the category in response. I agree that if Elfguy is innocent then anything linking him to socking is A Bad Thing - the fact that he's been tagged as a sock for 8 years suggests current mechanisms for clearing the innocent are inadequate. Bringing that accusation into the mainstream of SPI visibility should help it be resolved one way or the other - sunlight is the best disinfectant. Meanwhile I don't think you can argue the utility of this bot for new cases? Le Deluge (talk) 14:58, 19 February 2013 (UTC)
 * But you aren't suggesting doing this only for new cases, you are suggesting doing this for all cases, including those of 6 years ago or more. If the correct old ones will never be used, and the incorrect old ones are a problem, then why would we create these cats instead of removing the tags? You haven't given any argument yet why these old ones need to be created. Fram (talk) 15:06, 19 February 2013 (UTC)
 * Just to be clear, you're saying that the bot does a useful task but we're just debating the timing? I've mentioned several times why old socks are of interest - it's no different to existing categories of old socks, sometimes you may need to investigate the old ones if a sock starts editing after a Wikibreak. Plus your problems are not caused by the bot - any attack editing is inherent in the original classification, by making that attack more visible and providing the anti-sock template in the cat, it gives ready access to the tools that would allow false accusations of socking to be cleared up. As for IPs, you can either purge them after 12 months - or I can see where it would be useful to know about IP socks for much longer than that if you're poking around an article's history. PS Sorry about blanking the top earlier, it was accidental - I've been having problems with WP timing out today and obviously my precautionary copy didn't include the whole page when it came to pasting back in after the timeout. Le Deluge (talk) 15:39, 19 February 2013 (UTC)
 * I'm not saying that the bot does a useful task. I'm saying that the bot does a clearly useless task when it does this for older sockpuppet cats. Considering that more than half of such cats never get created without anyone having problems with this, it follows that for over half of the new ones it would create, the task would be pretty useless as well. And I don't see the benefit of anything you bring forward here; when are any of these actually used for that purpose? Or is that purely hypothetical? Where are the examples of sock cats that were redlinks for more than 6 or 12 months, which then suddenly became of interest? The benefits seem purely theoretical to me. Creating categories to make it easier to weed out the incorrect ones afterwards is rather backwards as well; make sure that you only create the correct ones instead. A bot can't do that? Tough luck, then this isn't a bot task. Fram (talk) 15:56, 19 February 2013 (UTC)
 * "A clearly useless task"? Sounds like argument #3 of here. All this is doing is automating a task which the anti-sock community are doing already and have spent some considerable time designing a complex template to achieve. If you think that adding that template to newly-identified socks is "clearly useless", then your argument is with the guys at WP:SPI as a whole. I'm trying to establish some common ground here, and it seems that establishing the merits of the bot for new socks is where we're likely to find that common ground. You seem keen to establish some arbitrary cut-off as though sockpuppets never operate for more than a few weeks. On the contrary, there's a whole department dedicated to those operating over longer timescales. I'm not sure what the record is and haven't time to do a detailed search but to pick one example, Ananny started in 2006 and was still going in 2011. So 5+ years is a meaningful period for sock investigations, despite your personal disbelief. As for weeding out - the thing that needs weeding out is the original categorisation of innocents as socks, not the category itself. Le Deluge (talk) 16:55, 19 February 2013 (UTC)
 * No, nothing to do with "Idon'tlikeit". I have nothing against the template, I have nothing against creating the category, where needed. This is not the same as replacing this with mindless creation of very old ones for the sake of one database report. I am also not saying that all socks tagged in year X should be detagged; I am quite aware of the long range of some sock investigations. I am talking about the vast majority of these sock tags though, where a person used a sock once or twice years ago, and then either never edited again, or didn't cause any sock problems again. The long range abuse cases are not redlinks, they are bluelinks. Please give me an example of any of the redlinks, tagged over a year ago, where it would be useful to have it as a bluelink instead (or even to keep it as a redlink). Fram (talk) 08:08, 20 February 2013 (UTC)


 * It seems to me that we are discussing multiple issues.
 * Red-linked vs. blue-linked categories: who cares? doesn't do any more good than . Both collect the list of tagged sockpuppets.
 * Removing sockpuppet tags from old accounts: Vigorously, emphatically, unyieldingly oppose: I use the categories (red-linked or blue-linked) all the time, and don't want data taken away.
 * Removing sockpuppet tags from old IPs: Oppose, but not so emphatically: It is useful to be able to see the IP history. When Soccermeco returned recently, part of what confirmed him for me was that he hadn't moved. I can at least understand the argument that it somehow "taints" the IP, but my feeling about IP editors is that if they don't want to be tainted, they should get an account. The control is completely in their hands.
 * -- Kww (talk) 15:17, 19 February 2013 (UTC)


 * There are advantages of blue over red - the cat in turn gets catted into hierarchies such as which increases visibility, connects confirmed and suspected socks of the same user, and makes the category amenable to bots and AWB cleverness. Plus it has the standard sock template at the top with links to SPI etc so that the sock can be processed more quickly. Le Deluge (talk) 15:39, 19 February 2013 (UTC)
 * And what processing, bot editing, AWB editing, ... would be needed for socks and editors that haven't been around for years? Fram (talk) 15:56, 19 February 2013 (UTC)
 * Studying the patterns of socking from an IP block over time, perhaps if an institution has moved from IPv4 to IPv6; having it in the category list allows easy regex searches on similar category names (and hence user names) which may inspire investigation of similar usernames; generally gathering statistics on the total population of socks - that's just a few that come to mind. Le Deluge (talk) 16:23, 19 February 2013 (UTC)
 * (ec)If you use the cats, why don't you create them? Having the ones that are in use created is fine (and this discussion does not suggest deleting any existing cats); having ones that aren't in use at all (and this will be the vast majority anyway, seeing that most of these are about editors that don't return at all or haven't returned (or been identified) in years) only makes clutter. What is the last time you needed a redlinked cat for someone tagged over a year ago? Just like blocks and warnings, old sock tags can still be seen in the history (they usually have a sock block anyway): but why tag them as well for eternity? What is the possible benefit of Category:Suspected Wikipedia sockpuppets of 78.149.176.189? Or of Category:Suspected Wikipedia sockpuppets of Alvin Stone IP 66.82.9.109, when there is no User: Alvin Stone IP 66.82.9.109? Fram (talk) 15:50, 19 February 2013 (UTC)
 * I've never blue-linked a sockpuppet category, or paid any attention to whether it was red or blue. I use them all the time: my memory is good, just not perfect. Being in the history is completely useless. Usually what happens is I'll look at an edit and say "Hmmm... that looks a lot like the guy that used to cause so much trouble on x", dig up a user name from the history of x, and go from his sockpuppet tag to the suspected sockpuppet category and start my comparisons. Whether the category is red or blue doesn't matter, so a bot can play with the category creation all it wants. If you delete the tags and the categorizations that they generate, you will cause me trouble. The sockpuppet tags should not be archived, treated as stale, or anything of the like because of age or inactivity. If you want to delete them because you think the suspicions are inaccurate, go ahead. -- Kww (talk) 16:08, 19 February 2013 (UTC)
 * Some are inaccurate, others are inactionable. No one is going to (or at least should) give a one-off socking of five years ago any value in any discussion. Long-term abuse is about recurring problems with many sock accounts, not about the ones we are discussing here, with one or at most two accounts from years ago. This bot would be creating thousands of categories with very little actual benefit. Fram (talk) 08:08, 20 February 2013 (UTC)
 * As I've said before, the fact that it's thousands of categories is irrelevant. Just because there's only one or two members of a red-link category doesn't mean that only one or two accounts are involved - to take a random example, this SPI blocked six accounts, but only two of them were in the red-link . It's an example of multiple small "clusters" of socks being identified initially, which then see SPI join the dots between them. It's inevitable that the kind of thing you're talking about will be fairly short-lived, as once an old red-link cat proves itself to be useful in a "live" investigation it has a good chance of the sock-cat template being added (to generate the links to the tools), with the side effect that the category is created. So you're talking about a fleeting moment that will make it hard to point to any one cat that lives up to your demands - certainly without spending a lot more time on it than I have available. But glancing at current investigations, this socking career started with User:Didier Cochon making his two and only edits back in 2009 and could potentially have been linked to User:Elias Jack in a red-link suspected sock cat three years ago. That cat would have been useful in subsequent investigations and might have led to the sock being intercepted earlier than February 2013. Hard to prove a hypothetical though. Number of edits isn't always a reliable guide, as a non-admin I can only see a handful of edits eg here but the discussion implies there's been a lot more posts that have been nuked during the SPI process. I guess you would say that most of these accounts have "made only one edit back in 2010" between them. Incidentally, this is not just for the purpose of one report, without the cat there's a red link on the main sock template (eg User:Akhil.s.vijayan), which would appear to be covered by WP:REDLINK in relation to template links that "are meant to serve a navigational purpose. Red links are useless in these contexts; if possible they should be replaced by a functioning link, or else be removed". Le Deluge (talk) 19:19, 20 February 2013 (UTC)
 * Baby P's mum is not a good example, since the two accounts in the "suspected" cat are the sockpuppeteer (who shouldn't be in that cat) and an account which is already included in Category:Wikipedia sockpuppets of Baby P's mum anyway. Creating that redlinked cat would serve no purpose at all. The rest are rather hypothetical (a sock that wasn't discovered then, and is stale now, and doesn't even overlap with the other socks, is hardly a sockpuppet: he never had a talk page discussion, no warnings, no problems, so "Didier Cochon" had every right to start a new account the next year anyway). Wanting to tie these together shows a misunderstanding of the sockpuppet policy. As for your last argument: "if possible they should be replaced by a functioning link, or else be removed": that's what I suggest, "be removed". The page WP:REDLINK is hardly applicable in any case, it deals with the notability of topics, not with the background processes of Wikipedia anyway. Fram (talk) 21:26, 20 February 2013 (UTC)
 * I still have a problem with having a bot remove the tags from accounts, or tags from IPs when they are tied to an account. If you wanted to get rid of categories where people invalidly claimed that an IP was the sockmaster, I'd be a relatively enthusiastic supporter. Think we could compromise there? -- Kww (talk) 21:42, 20 February 2013 (UTC)
 * I also have a problem with tying years-old IPs to named accounts as well, since a) the IP probably no longer has anything to do with the editor, and b) it outs the editor in a minimal way, based on problems from long ago. I like your compromise (and certainly thank you for attempting one), but would like to see it extended to all IPs (whether as master or sock) from over a year old. Linking named accounts together, which also has some problems, is the least problematic of the three cases, so in the spirit of cooperation, I'm willing to compromise on these. Fram (talk) 22:00, 20 February 2013 (UTC)
 * Removing tags for IPs on red-linked sockpuppet categories is something I could countenance, but not when someone has processed it to the point of creating the category. Like I said, I actually made good use of the IP information in quite recently, and those IPs are over a year old. -- Kww (talk) 22:11, 20 February 2013 (UTC)
 * Yes, I meant only for redlinked cats. All my comments and proposals here are only intended to apply to the not-yet-created cats, not to existing (bluelinked) cats. Fram (talk) 22:21, 20 February 2013 (UTC)
 * OK, so I think the two of us (at least) could agree on a bot that blue-linked all redlinked categories that were named after an account and contained at least one other named account, and deleted tags from any IP account that was associated with a red-linked category after that creation was completed. -- Kww (talk) 23:41, 20 February 2013 (UTC)
 * I'm not technically savvy enough to take Database reports/Red-linked categories with incoming links and use your criteria to determine if the category should be created or not. Is there another way to compile the list of categories to be created? GoingBatty (talk) 21:42, 24 February 2013 (UTC)
 * I'm not sure what you are writing in or where your technical savvy fails you. After getting the category name, trim "Wikipedia sockpuppets of " or "Suspected Wikipedia sockpuppets of " off the front. If what remains matches the pattern of an IP address, the category is named after an IP address and shouldn't be created. If it was named after an account, http://www.mediawiki.org/wiki/API:Categorymembers shows you how to find the members of the category, and if any of those members aren't an IP address, create it. -- Kww (talk) 22:13, 24 February 2013 (UTC)
 * Sorry I wasn't clear. Yes, I can get the list of all the categories that contain "Wikipedia Sockpuppets of" and filter out those with an IP address in the category title.  The part I don't know how to do is get AWB to look at the members of the category to determine if any are not an IP address.  GoingBatty (talk) 22:57, 24 February 2013 (UTC)
 * I don't have enough experience with AWB to answer. -- Kww (talk) 23:03, 24 February 2013 (UTC)
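
The check described in this thread (trim the prefix, skip categories named after an IP address, then create only if at least one member is a named account) could look roughly like the Python below. It is a sketch under assumptions, not something any participant wrote: all function names are hypothetical, member titles are assumed to be "User:" or "User talk:" pages, and the `en.wikipedia.org` endpoint is assumed. IP detection uses the standard `ipaddress` module instead of a regular expression. The query parameters follow the API:Categorymembers page linked above, and the sketch only builds the request URL without sending it.

```python
import ipaddress
from urllib.parse import urlencode

PREFIXES = ("Suspected Wikipedia sockpuppets of ", "Wikipedia sockpuppets of ")
USER_NAMESPACES = ("User talk:", "User:")


def is_ip(name):
    """Return True if the name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name.strip())
        return True
    except ValueError:
        return False


def master_name(category_title):
    """Trim 'Category:' and the sock prefix; return None if no prefix matches."""
    title = category_title
    if title.startswith("Category:"):
        title = title[len("Category:"):]
    for prefix in PREFIXES:
        # Compare case-insensitively, since capitalisation varies in practice.
        if title.lower().startswith(prefix.lower()):
            return title[len(prefix):].strip()
    return None


def member_name(page_title):
    """Strip a leading user namespace, e.g. 'User talk:Example' -> 'Example'."""
    for namespace in USER_NAMESPACES:
        if page_title.startswith(namespace):
            return page_title[len(namespace):]
    return page_title


def should_create(category_title, member_titles):
    """Create only if the master is a named account and some member is not an IP."""
    master = master_name(category_title)
    if not master or is_ip(master):
        return False
    return any(not is_ip(member_name(title)) for title in member_titles)


def categorymembers_url(category_title, api="https://en.wikipedia.org/w/api.php"):
    """Build (but do not send) a list=categorymembers query for one category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category_title,
        "cmlimit": "max",
        "format": "json",
    }
    return api + "?" + urlencode(params)
```

Note that a blank master name also returns False here, which covers the earlier suggestion to ignore categories whose name is blank. The stricter wording "at least one other named account" would additionally need the master's own page excluded from the member list before the `any` test.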


 * Note - Having the redlinked sockpuppet accounts in Database reports/Red-linked categories with incoming links and Database reports/Deleted red-linked categories makes cleaning up the more important pages in those listings somewhat harder. See also User_talk:BernsteinBot. -- Alan Liefting (talk - contribs) 19:30, 19 February 2013 (UTC)
 * Yes, both the suggested bot and my suggested alternative would remove the sock cats from that list. Fram (talk) 08:08, 20 February 2013 (UTC)

- The requirements have changed since I filed the bot request, and my technical skills are not sophisticated enough to handle them. I hope another bot operator will pick up this task instead. GoingBatty (talk) 01:42, 26 February 2013 (UTC)
 * The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.