Wikipedia talk:Requests for comment/Community expectation of Checkuser

Other privacy essays
I know there are other privacy essays besides mine but have forgotten who the authors are. If any are especially pertinent to the use of checkuser, I would not object to adding them to the Further information section, or at least listing here. Thatcher 04:18, 27 January 2009 (UTC)
 * One I'm aware of is: On privacy, confidentiality and discretion. That has this section on checkuser, which should be checked as always to see if it is correct and up-to-date. The new 'intitle' search function is very handy (when restricted by namespace) for searching for stuff like this. See here and here. Gets a bit swamped with AfD results, but I'm sure someone can figure out how to filter those out or improve the search. Once that is done, starting a category of "privacy" related stuff might help, as I did for BLP essays, Civility essays, and IRC essays. Carcharoth (talk) 22:49, 27 January 2009 (UTC)

"Innocence" checks
In this reverted edit banned user Moulton asks about a check run against Filll by Raul654 at Filll's request. The edit itself is not welcome here so I reverted it. However the question is a valid one to be asked... what of innocence checks? Should policy be explicit about them one way or the other? I've always said in response to these "CU is not a way to prove innocence" and usually declined to run them, or if I did, reported results without any "exoneration" style phrasing. Thoughts? ++Lar: t/c 05:37, 27 January 2009 (UTC)
 * Ditto. Checkuser cannot prove a negative. Mackensen (talk) 12:23, 27 January 2009 (UTC)
 * When Simple Wiki suspected I was grawp, they did run an innocence-style CU to determine I was on the other side of the country from the fellow, but yes, generally Lar's approach is sound.  MBisanz  talk 12:51, 27 January 2009 (UTC)
 * CU can sometimes be nearly definitive on the issue of "innocence". I have no objection to CUs answering innocence checks as long as they explain the caveats in sufficient detail.  I think it should be up to the discretion of the CU; there should not be an expectation that innocence checks will be answered but I don't think a prohibition should be written in to the policy. Thatcher 13:25, 27 January 2009 (UTC)
 * I actually had two innocence checks run on me when I came back from my break - which apparently confirmed I was not Poetlister et al.--Tznkai (talk) 18:12, 27 January 2009 (UTC)
 * Well, that's a relief. ;-) Risker (talk) 18:14, 27 January 2009 (UTC)


 * I concur with Thatcher. Sometimes so-called 'innocence' checks can be appropriate, but it's very much down to the technical evidence, circumstances, etc. Checkuser discretion, really. One notable one I recall, which was the proper thing to do at the time, was the Piperdown check I did during the unblock debate. If these are to be performed, the rationale and conclusions should be detailed in as full a manner as possible within policy, however. They are not the time for vague, hand-wavy 'exonerations' - A l is o n  ❤ 18:34, 27 January 2009 (UTC)
 * The main trouble with "innocence" checks is that they give the wrong impression of the CheckUser function. Doing "innocence" checks has the potential to (a) give the impression that CU is infallible (though skilled caveats mitigate this to some extent) and (b) they give the impression that CheckUser has the final say in all sockpuppeting cases.  (b) is an especially dangerous attitude -- it should be understood that determining the outcome of sockpuppet cases is almost always a matter of balancing probabilities, not of getting handily-packaged certainty.  The CheckUser team must be very careful to keep expectation low.  This entails things such as labelling cases  rather than ✅ unless things are stone-cold certain, and, especially, being careful about (a) which "innocence" checks to run and (b) how they report the results of the checks.   [[Sam Korn ]] (smoddy) 18:45, 27 January 2009 (UTC)
 * If the two parties are innocent, then there are two people whose privacy will be compromised by the check, and both should consent. If one is clearly not innocent (MBisanz's example above), the potentially innocent party should consent.  A useless (or worse) report would be "they are not the same editor" with no supporting details.  A useful report would be "they edit from widely different geographic regions and don't use proxies".  However, if one of the editors is using proxies or the two are unrelated but don't know that they are in the same geographic region then a report with real data could look damning, and an answer without data will increase whatever suspicions existed to cause the check to be run in the first place.  When I was first active at WP:DRV, about a third of the regulars were located in or within an hour or two's drive of my city of residence - such an innocence check on us would have looked damning.  GRBerry 18:54, 27 January 2009 (UTC)
 * I was checked without consent, and it didn't bother me overmuch, but I can imagine others would be very upset. Being checkusered will certainly feel to others very similar to being outed - it's invasive to have details you thought obscured suddenly laid bare.--Tznkai (talk) 19:08, 27 January 2009 (UTC)


 * Depends on circumstances. A couple of years ago, an RfA was nearly derailed on the first night by a good-faith, but completely false, allegation that the candidate was the sockpuppet of a very problematic user. A quick checkuser requested by me confirmed that the two editors were editing from different continents (and through completely conventional ISPs, not proxies or anything), which was universally regarded as disproving the allegation. Newyorkbrad (talk) 17:58, 29 January 2009 (UTC)

Checkuser needs to show as little information as needed to the person doing the checking
I'm not a CU so maybe this has already been done, but IMHO CU needs to be able to do a very limited number of things:


 * Tell if two or more users could be from the same ISP. Give a yes/probably/maybe/probably not/no response.
 * Tell if two or more users could be from the same geography, organization, proxy, ISP, etc.
 * Tell if a user could be from given netblock, or IP address.
 * Tell if a user could be from a given geography, organization, proxy, ISP, etc.

Most of these do NOT require revealing the IP address to the checkuser. If a checkuser could type in "name1, name2" and get back "probable match: both users are in the same /24 netblock and neither is a known widely-used proxy" or "uncertain: one or both users use an IP shared by many people" that is good enough.

If such a system were in place, most checkuser requests would not reveal IP or other private information. Some, such as checkusers comparing registered users to IP-editors, would.
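
The kind of lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the CheckUser tool works: the function names, the fixed /24 prefix, the `shared_ips` set and the verdict wording are all hypothetical, and the addresses come from the reserved documentation ranges.

```python
import ipaddress


def netblock(ip, prefix=24):
    """Return the network of the given prefix length that contains `ip`."""
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def compare_users(ips_a, ips_b, shared_ips=frozenset(), prefix=24):
    """Hypothetical comparison: returns a verdict string, never the IPs."""
    # Assumption: `shared_ips` is a maintained list of proxies/shared addresses.
    if any(ip in shared_ips for ip in list(ips_a) + list(ips_b)):
        return "uncertain: one or both users use an IP shared by many people"
    blocks_a = {netblock(ip, prefix) for ip in ips_a}
    blocks_b = {netblock(ip, prefix) for ip in ips_b}
    if blocks_a & blocks_b:
        return f"probable match: both users are in the same /{prefix} netblock"
    return "probably not: no netblock in common"


if __name__ == "__main__":
    print(compare_users(["192.0.2.10"], ["192.0.2.200"]))
    print(compare_users(["192.0.2.10"], ["198.51.100.7"]))
```

The person running the check sees only the verdict string. Whether a single fixed prefix length is a sensible stand-in for a real netblock is exactly the sort of question such a sketch leaves open.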

Implementation
I assume every login records the actual IP address. If in addition to the IP address, there was a random-number code associated with the netblock, where all IP addresses in the same netblock got the same code, then CU could start by matching these codes and only match IP addresses when necessary. While netblocks can be arbitrary in size, in almost all cases they would be the same as the information published by IP-address registration authorities, with exceptions made on an as-needed basis. davidwr/ (talk)/(contribs)/(e-mail)  17:57, 27 January 2009 (UTC)
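
A minimal sketch of the random-code idea, assuming for simplicity that every netblock is a fixed /24 (the proposal itself would follow registry allocations instead). The class name `NetblockCodes` and its method are hypothetical and do not correspond to anything in MediaWiki.

```python
import ipaddress
import secrets


class NetblockCodes:
    """Hypothetical table assigning one random opaque code per netblock."""

    def __init__(self, prefix=24):
        self.prefix = prefix   # assumption: fixed-size blocks
        self._codes = {}       # network -> random code

    def code_for(self, ip):
        """Return the code for the netblock containing `ip`, creating it if new."""
        net = ipaddress.ip_network(f"{ip}/{self.prefix}", strict=False)
        if net not in self._codes:
            self._codes[net] = secrets.token_hex(8)
        return self._codes[net]


if __name__ == "__main__":
    codes = NetblockCodes()
    # Two logins from the same /24 share a code; the code reveals no address.
    print(codes.code_for("192.0.2.10") == codes.code_for("192.0.2.200"))
```

Because the code is random rather than derived from the address, matching two codes shows only that the logins came from the same block, not which block it is.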


 * A description of how the tool works and what sort of information is returned is given at m:Help:CheckUser. I think you are asking for the software to do the analysis and interpretation instead of having human beings do it.  I don't know if that is even possible, much less effective.  Except in the simplest of cases (say, 6 Grawp sockpuppets created on the same IP at the same time) there is a fair amount of analysis required that is beyond the capabilities of an SQL database.  Certainly, restricting the information presented to the checkusers will restrict their effectiveness.
 * I also think there is a fundamental error in this approach. I want to deal with the rare problem of a Checkuser who makes an error in judgement by identifying that person, correcting their behavior, and if necessary, removing their access.  You want to deal with the same problem by restricting the data available to everyone.  It would be like arguing that because some admins occasionally make errors in judgement when deleting articles, the deletion function should be withdrawn from all admins. Thatcher 18:07, 27 January 2009 (UTC)
 * My motivation is something else entirely. My motivation is that private information should be available to those who need to know, but only those who need to know and only when they need to know it.  If part of the process can be automated or information that is not needed can be removed before presenting it to a human, that is a good thing.  The trick is the technology:  If the technology doesn't exist to do what I propose, then it doesn't exist.  But if there is demand, we can build it.  I once heard of a locally-customized enterprise e-mail system where the e-mail administrators had to deal with bounced or misdirected messages.  The tools they used to manually resend mail obscured the contents of the messages.  This was done because of a philosophical decision that the contents of mail was private and administrators didn't need to see it without a darn good reason, and rerouting misdirected mail wasn't a good enough reason.  I'm applying the same principle here.  davidwr/  (talk)/(contribs)/(e-mail)  20:21, 27 January 2009 (UTC)
 * The trouble with implementing what you suggest is that it requires programming a very high degree of artificial intelligence. It is actually quite rare for a CheckUser investigation to be limited to is X also Y?  Furthermore, CU investigations are not limited to asking do X and Y edit from the same netblock?  I am highly sceptical that any program can be built that would replicate the ability even of someone with relatively little knowledge about the architecture of the Internet to run CU investigations.   [[Sam Korn ]] (smoddy) 21:23, 27 January 2009 (UTC)
 * IMO human attention is needed - restricting it like this isn't going to be helpful, as pointed out above. Getting a system like that wouldn't be something I'd rely on. I think a human evaluating the info given from a CU check is needed, not some program evaluating whether X is Y, or vice versa. Remember that the CU tool is not magic pixie dust. -- Kanonkas : Talk  13:46, 28 January 2009 (UTC)
 * There is part of checkuser that does require a human element. Analyzing editing patterns, shared attitudes, etc. However, in principle saying whether or not a given logged-in account's IP addresses match another editor's IP addresses, either by netblock, organization, or geography, could be automated.  If the technology isn't there to do that now, then we have a decision to make:  Invest in the tech to do it, or don't.  So far in this thread, people are saying "don't."  Even if this were done, there will be some times when the person doing the checkuser will have to look at the actual IP, company, or country and compare it to the editing patterns, but this will be far from all the time.  If we can reduce the overall amount of confidential information checkusers look at without compromising the accuracy of their reports, this is a very good thing in my book. davidwr/  (talk)/(contribs)/(e-mail)  14:06, 28 January 2009 (UTC)
 * You just don't get it; that's not entirely your fault as you have not actually been a checkuser and it is difficult to give concrete examples without giving out personal information. Let me give you one example.  British Telecom has highly dynamic IP addressing, even their DSL customers can sometimes have a new IP every few days.  They also recycle IPs quite often.  It is common to check a BT IP address and find that it was used today by Smith and 3 weeks ago by Jones, but an analysis of contemporaneous edits and IP use patterns shows that they are entirely different people.  Editors who use BT can appear on multiple /16 ranges.  So, your software solution would report, when checking Smith, that Jones was a sockpuppet, when he was not.  Your software would also fail to identify sockpuppets of Smith (if the netblock was too small) or would falsely report hundreds or thousands of sockpuppets (if the range was too big).  How would your software handle editors from schools, universities and businesses with corporate proxies or firewalls?  They will have the same IP (or netblock) at school/work but different IPs at home?  And so on.  Editors who have IPBE and are allowed to use tor proxies will come up as false positives.  BestBuy uses a single /24 range for their official corporate network as well as all the computers in their store that are publicly available for demonstrations.  All iPhones use the same netblock.  All AOL customers use the same netblock.  Trusted vs non-trusted xff headers will be another source of false positives or failures to locate.  Again as I said above, you want a technical solution to a human problem.  Would you argue that because some people use VCRs to make bootleg copies of movies, that therefore all VCRs should be outlawed, or that because some admins make irresponsible blocks, that the blocking function should be disabled for all admins?  
The answer to CU privacy is a people answer, not a technological answer (unless you want to retire the CU function entirely and do without). Thatcher 17:02, 28 January 2009 (UTC)
 * The poster thought the data should only be available when there is a need to know. This is a reasonable position - but the recommended solution isn't viable.  I've never been a checkuser, but I've seen some of their reports.  User "only uses proxies" is sometimes an issue, and can only be detected by testing the IP addresses that a user has used.  So realistically, in every single case the checkuser needs to see the IP addresses, even if the net blocks are different.  Thatcher is right, the answer to privacy is not technological masking - it is having good people that only run checks when appropriate.  Running the check only when there is a need to know is the key control.  How we ensure that we achieve that is an interesting question we should be focusing on.  GRBerry 17:10, 28 January 2009 (UTC)
 * In case folks aren't aware, this has been discussed multiple times previously, and the outcome has always been the same: Analysis of technical data cannot be automated to a degree which would yield useful results; interpretation by a knowledgeable and trusted human is required. This is one of the reasons that CheckUsers generally have a fair amount of technical knowledge in addition to being trusted - we don't pick just anyone to do the job because it's highly complex. The nature of the analysis almost entirely precludes the possibility of a useful automated analysis, and nothing is going to change that. - Mike.lifeguard | @en.wb 00:38, 29 January 2009 (UTC)
 * As far as "artificial intelligence" goes: Might be a good idea to talk to the people at chess servers such as playchess.com who employ fairly sophisticated methods to detect cheating. Running commonly used chess-playing software programs in the background and comparing players' moves is part of it but not all. Then again, they probably won't tell anyone about their trade secrets. --Goodmorningworld (talk) 18:20, 2 February 2009 (UTC)
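
The British Telecom objection raised in this thread can be illustrated with a toy example. It is a sketch under invented assumptions: the log entries, the dates and both function names are hypothetical, and the address is from a reserved documentation range. It shows only why a bare "same IP" test misleads, not how a real investigation proceeds.

```python
from datetime import date

# Hypothetical log entries: (account, IP, date of use).
log = [
    ("Smith", "192.0.2.44", date(2009, 1, 28)),
    ("Jones", "192.0.2.44", date(2009, 1, 7)),
]


def naive_same_ip(log, a, b):
    """Flag two accounts if they ever appear on the same IP, ignoring when."""
    def ips(user):
        return {ip for u, ip, _ in log if u == user}
    return bool(ips(a) & ips(b))


def days_apart_on_shared_ips(log, a, b):
    """For each shared IP, the smallest gap in days between uses by a and b."""
    gaps = {}
    for ua, ip_a, da in log:
        for ub, ip_b, db in log:
            if ua == a and ub == b and ip_a == ip_b:
                gap = abs((da - db).days)
                gaps[ip_a] = min(gap, gaps.get(ip_a, gap))
    return gaps


if __name__ == "__main__":
    print(naive_same_ip(log, "Smith", "Jones"))
    print(days_apart_on_shared_ips(log, "Smith", "Jones"))
```

The naive test flags the pair even though the uses are three weeks apart on a recycled dynamic address. The gap alone settles nothing either, which is the point made above about needing a person to weigh contemporaneous edits and usage patterns.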

Discussion of View by Goodmorningworld
Well, the response that a rigid one-strike policy is too harsh has some merit. My idea was based on the notion that a bad block can be rescinded, ruffled feathers can be smoothed, and life can go on. I can't imagine off the top of my head a situation in which an administrator should be desysopped, temporarily or permanently, for a single bad block. Release of identifying info, on the other hand, can't be undone. You can't erase people's memory of what they read. Hence, in my opinion, an awareness that bad decisions carry potentially more serious consequences should be reinforced by stricter sanctions. While I have you, Thatcher, I wonder if you would agree that this edit of yours came close to crossing the line? (It's in AN/I archive 491, thread begins with the word ACORN.)--Goodmorningworld (talk) 17:53, 2 February 2009 (UTC)

Note Checkuser results show Curious bystander and Marx0728 are the same person but probably different from the others, although from the same city. 300wackerdrive edits exclusively from a workplace previously associated with BryanfromPalatine; Kossack4Truth edits exclusively from a residential IP in the same city, and WorkerBee74 edits exclusively from a Sprint PCS mobile device of some kind. Thatcher 12:20, 12 November 2008 (UTC)


 * The only thing that strikes me as immediately concerning is the comment that 300w edits from BFP's workplace, although that depends in part on whether BFP's workplace was general knowledge or not. Saying that two people edit from the same city, or that one uses Sprint mobile, is sufficiently vague that by itself, it does not give away identifying information.
 * That discussion is about the possible return of an indefinitely banned user, including discussion of various block/ban proposals (which users and durations). There were 9600 words in discussion, 2700 before I posted the checkuser results.  In situations like these the community needs to know enough about the situation to make an informed decision.  Based on the findings I could have said  for all of them and let that be that.  But I have seen numerous complaints on-wiki and off about high-handed checkusers for making false conclusions and banning people for no reason.  Two accounts at the same workplace might be innocent or they might be socks, an analysis of behavior is required and I felt a simple yes/no/maybe answer would be insufficient to help the admins resolve the situation.  Likewise, would a simple "maybe" give enough information to evaluate the mobile device and residential users?  How long would the accounts in question have dragged out the situation, arguing their innocence and complaining about checkuser abuse, if I had simply said "maybe"?  Or I could have said ✅ and blocked them all myself and no one would have been able to question me, and no information at all would have been revealed.  Would that have been an improvement?  Most cases are answered simply.  Complex cases involving long-term abusers (particularly smart abusers who take steps to conceal themselves) often require more detailed answers to enable good admin decision making. Thatcher 18:33, 2 February 2009 (UTC)
 * I appreciate the challenges in what you do, glad I don't have to.--Goodmorningworld (talk) 18:45, 2 February 2009 (UTC)
 * I'm always open to outside views of my own actions. Thatcher 18:50, 2 February 2009 (UTC)


 * I try to avoid rigid, single-strike policies (to borrow your wording), as in my own opinion they seem to risk depriving us of an ability to exercise good decision making in context, after something happens; that said, I do agree that loss of privacy is a particularly grave problem because it is often impossible to reverse, that users entrusted with private data should know this and display appropriately high levels of care, and that -- putting it gently -- our patience for problems in this area should be very low. – Luna Santin  (talk) 06:53, 8 February 2009 (UTC)
 * Once again quoting Tznkai, the phrase "culture of caution" sounds like a good mantra. – Luna Santin  (talk) 21:14, 9 February 2009 (UTC)

Discussion of view by MZMcBride
All CheckUsers should be required to be reconfirmed every 18 months by the community using the same standards as the current ongoing elections. --MZMcBride (talk) 11:53, 6 February 2009 (UTC)


 * Possibly a good idea, but perhaps we should wait until after the elections to make any firm decision (it may have gone well so far, but it's only 12 hours old!). If we really think elections are the best way of selecting CheckUsers (I don't, but I can see that almost everyone else does), then I can see strong arguments for having confirmation elections.  We should also, I think, consider whether this can be more effectively managed through oversight bodies such as the AC (if it takes on the job of CU-monitoring, which I strongly feel it should) or the proposed review panel without the hassle and potential caprice of an electorate.   [[Sam Korn ]] (smoddy) 12:42, 6 February 2009 (UTC)


 * And assuming elections are the right way to choose checkusers, how could people cast informed votes in a reconfirmation election? For example, I know of some people who don't trust Jayjg and assume he uses checkuser improperly on Israel-Palestine editors and issues.  I can't say with absolute certainty, but I did look over his logged actions for 2008 and found nothing that sparked my concern.  How can a reconfirmation election be informed as to checkuser issues and not simply be a popularity contest?  How do you measure a checkuser's effectiveness?  Raw number of logged checks, RFCUs responded to, number of accounts blocked?  Are you going to ask the Review Board (or Arbcom if Sam gets his way) to produce a comprehensive election scorecard on 10 checkusers every 6 months? That's a significant expansion of their workload and scope of the proposed board.   Informed elections are often useful, uninformed elections are useless and probably harmful.  How would you deal with this? Thatcher 14:04, 6 February 2009 (UTC)
 * Have current CheckUsers and Arbs (and possibly stewards) review the logs and make recommendations. Concurrently allow the community to express its concerns of things it may have heard. This notion that because people aren't CheckUsers, they have no idea what goes on behind the curtain, well, that's simply not true. Having current CUs and the community comment allows any rumors or gossip to be dispelled while also revealing any misuses or abuses. Trust me, if the community is allowed to openly comment, any missteps will be noted. If there's one thing Wikipedians are good at, it's finding faults. ;-) As to your specific example, if a CheckUser no longer has the trust of the community, they should go. This role is one where nearly the entire position is predicated on trust by the community. --MZMcBride (talk) 17:48, 6 February 2009 (UTC)


 * This is a good idea but Thatcher's concerns are valid. Obviously, a checkuser can't run on his record as a checkuser, but he can run on his record as an editor/administrator, the same as he did in his first run.  A continued good editing/administrative history plus nothing negative coming out from those looking over his shoulder should make re-confirmation relatively easy as long as he didn't make too many wiki-enemies from his public administrative decisions.  Personally, I think 18 months in such a position is enough, I'd favor a forced break of several months before running again.  A little disclaimer is in order:  I would prefer that all positions of responsibility would have regular elections with a limited number of times you could succeed yourself without taking a break.  Whether this is 2 9-month terms, 1 18-month term, or 1 or 2 terms of a different length isn't a big deal, as long as it's not excessively long. No 10-year terms, please. davidwr/  (talk)/(contribs)/(e-mail)  14:38, 6 February 2009 (UTC)
 * I debated with myself between a hard limit of 18 months and a reconfirmation. It seemed silly to lose people who are good at what they're doing if they still have the community's trust. We just need to make sure they do. --MZMcBride (talk) 17:48, 6 February 2009 (UTC)
 * Actually it's not so silly, and it prevents complacency. It also forces people to spend some time doing "something else," either for the project or off-Wiki.  It can also give people who want to quit but don't want to offend anyone an easy way to gracefully quit:  It's a lot easier to not run again as soon as your "time out period" expires than it is to say "I'm not going to run again at the end of my term."  If we had elections 2x a year with a staggered 18-month terms with a required 6-month break, we would need only 33% more active people than if we didn't require a break, and that's assuming everyone ran unopposed every time they could.  Given that you don't need a lot of checkusers, I don't think finding trustworthy people will be a major factor if we impose a short mandatory break after every so-many-months of service.  davidwr/  (talk)/(contribs)/(e-mail)  18:44, 6 February 2009 (UTC)
 * On the other hand, continuity is important. Even if they were "rested", they would still need to be consulted about on-going cases.  One benefit of having more CUs than necessary might well be that they feel more comfortable reducing workload or taking breaks of their own accord.   [[Sam Korn ]] (smoddy) 19:10, 6 February 2009 (UTC)
 * I think the problem here is we have two conflicting needs: the first being the need for community trust in CheckUsers - dealing with privacy both as "good governance" and a pure pragmatic matter requires community trust. Second we want CheckUsers to do their job right and to be rewarded/protected/encouraged for doing so. Gauging community trust is difficult to do without a wide election/poll/survey/what-have-you. On the other hand, the best CheckUser (in my mind) is so discreet and efficient, you hardly hear anything out of them at all, their name doesn't get mentioned except by the people rightly Checked and found sockpuppeting. Resolving these tensions is difficult - perhaps impossible - but there is a reason that policemen are not elected, but sheriffs are (at least around these parts).
 * In addition, we need to accept that some people, Wikipedians especially, do not trust authority of any sort, especially those they don't have direct power over. Trying to satisfy these people is an exercise in futility - in addition they will (in my opinion) disproportionately show up at elections on Wikipedia, while in "real life" they tend to stay home at elections. (Similarly, there are fawning sycophants, but sorting those out from genuine supporters can be difficult.)
 * I'm of the opinion that in balancing all those concerns, CU elections work out, but any sort of reconfirmation/unelection has to be away from the whims of the polity. The last thing we want is CUs advertising their fantastic records or even worse, turning down difficult CU jobs in fear of angering the wrong people.--Tznkai (talk) 16:14, 8 February 2009 (UTC)
 * No, the last thing we want is CheckUsers being able to maintain their access and power long after they've lost the community's trust (rightfully or not). --MZMcBride (talk) 18:30, 8 February 2009 (UTC)
 * Alright - but I'm not convinced that a reconfirmation election is a good way to prevent that.--Tznkai (talk) 22:18, 8 February 2009 (UTC)

Discussion of view by Od Mishehu
Posting the name of the account targeted by a checkuser block will lead to unacceptable release of personally identifying information under the privacy policy. IP addresses and ranges may identify a city, a school, or an employer, with unacceptable precision. Regarding Od's specific suggestions,


 * All checkuser blocks should indicate the apparent sock-puppeteer, except where this information can link the sockpuppetry to a real-world identity; and where the issue is that the original account seems to be the link (such as a real name) - then some other account should be chosen to "represent" the case.

For accounts this is usually possible but it is not possible for single IP and IP rangeblocks.
 * All accounts blocked with a checkuser block should be tagged to link to the sockpuppeteer (again, subject to the same considerations as before) - such tagging will allow sockpuppets to be linked to each other.

This has nothing to do with checkuser. Anyone can tag user pages.
 * If a sockpuppeteer is already known to the community via a confirmed account, there is no point in treating this account name as private, even if it may be the real name of the user.

Are you targeting a specific case here?

In general, I find there is very little benefit to specific tagging of blocked accounts, and I do not endorse a requirement that the checkusers themselves do anything. If I find that a vandal on [ISP redacted] has created 6 accounts a day for the past week, I can use the checkuser interface to block them all at once using very few mouse clicks, but I find no purpose in identifying them as the Avril troll, or Pope Lister, or ByAppoitmentTo, just to name 3 sockpuppeteers who use the same internet provider. Blocking established users should always be followed with a reason on the talk page and in the block log, but what is the purpose of tagging the user pages of checkuser-blocked sleeper accounts when there is another group of Wikipedians who consider such pages temporary and go around deleting them at will?

The checkuserblock template says ask a checkuser for good reason. To be honest, I think there is some other agenda behind these proposals. Thatcher 19:53, 17 February 2009 (UTC)

I concur with Thatcher, here. Mandatory disclosure of sock-puppeteers vis. IP addresses is far too 'outty' for my liking. There's a time and a place for when IP addresses can be connected to accounts but this should always be at the discretion of the checkuser involved, and upon the extent and nature of the abuse. IP addresses can disclose location/ISP/employment/personal information about an editor depending on the circumstances and publicizing this is rarely appropriate - A l is o n  ❤ 23:01, 17 February 2009 (UTC)


 * While I would almost never unblock a checkuser block without consulting a checkuser (there was one case,, where the disclosed information at RFCHU merely confirmed that which I had already understood from his own unblock request), I find that dealing with a sockpuppet of a known user is different than a sockpuppet of "some" user. This specifically came up yesterday when dealing with , a sockpuppet of - if I had known it was HR, I would have immediately fixed the block to include the talk page; since I didn't know - I had to give a generic "Please make the request from your original account" answer.
 * I found a similar issue when dealing with the account, and his IP address (which I knew, without checkuser help, had previously been Hamish Ross) - if I had known that the account TLGA was also HR - I would have handled those differently.
 * My original knowledge of the extent of HR's actions came through the categories of the suspected and confirmed HR sockpuppets, after a few unblock requests linked me to the HR account. If the other sockpuppets hadn't been linked to there, I wouldn't have known the correct way to deal with new HR socks. עוד מישהו Od Mishehu 06:19, 18 February 2009 (UTC)