User talk:Sophia/analysis

This kind of baselining is very helpful. With the spelling errors, let us distinguish between evidence and proof: it is a common misunderstanding (or at least a common rhetorical device) that if something isn't in itself proof, it's not evidence. These spelling errors are indeed common, particularly "seperate", and we also don't have many examples of it (yet - I'm still assembling the corpus), so at this point it's weak evidence. "Concensus" and "noone" are somewhat stronger, for there are many examples and the errors aren't nearly as common. Not unheard of - I think I saw MONGO himself using "concensus" at one time. No one would take these two points together as meaning the editors were the same user…but they do support that hypothesis. The same would be true if both were interested in electrical engineering, were Samoan, or shared some other trait that many people have but most people don't. They're just points of consistency, and yes, just how unusual they are or aren't should definitely be taken into account. Where did you search for these, through Google?

However, there is a big difference between "lol" or "rofl" appearing somewhere on a page, and edit summaries which consist solely of "lol" or "rofl." Tallying the former is basically useless. I'd be very interested to learn the percentages for edit summaries Wikipedia-wide - something like "rv" is obviously meaningless, "huh?" less so, "rofl" less so…but how much less so? Excellent questions. What I do know is that ZF/NU's edit summaries are remarkably similar, not just individual ones, but the overall collection. I wouldn't be surprised to see anyone else using any one of the very most common ones: "response", "fix(ed)", "typo(s)", "+1", also "please do not (remove)", "readded …", etc. What is remarkable is the consistency with which these appear in what is a very limited set of conventionalized responses. It's as if he's using identical customized drop-down menus on all his accounts (in fact, this might be precisely what's occurring, as browsers have auto-fill functions). I have no idea how one would go about collecting the data, but I'd be very surprised to see anyone else showing this particular pattern of responses.

Anyhow, I encourage you to include your analyses on User talk:MONGO/Ban evasion. Proabivouac 23:45, 21 August 2007 (UTC)
 * Hey, thanks for the move! Proabivouac 07:09, 22 August 2007 (UTC)
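
The distinction drawn above, between a term appearing somewhere in text and an edit summary consisting solely of that term, could be sketched roughly as follows in Python. This is only an illustration: the sample summaries are made up, and the function name `tally_summaries` is hypothetical, not part of any Wikipedia tool. The "contains" count uses plain substring matching, which is crude (it would also match "lol" inside a longer word).

```python
from collections import Counter


def tally_summaries(summaries, terms):
    """Count, per term, summaries that ARE the term vs. merely CONTAIN it."""
    exact = Counter()
    contains = Counter()
    for summary in summaries:
        lowered = summary.strip().lower()
        # Ignore trailing full stops and exclamation marks for the exact match.
        normalized = lowered.rstrip(".!")
        for term in terms:
            if normalized == term:
                exact[term] += 1
            if term in lowered:
                contains[term] += 1
    return exact, contains


# Made-up sample data, for illustration only.
sample = ["lol", "rv vandalism, lol", "rofl", "typo", "Lol.", "fix typo"]
exact, contains = tally_summaries(sample, ["lol", "rofl", "typo"])
```

Only the `exact` tally speaks to the pattern being discussed; the `contains` tally mixes in ordinary usage and so says much less.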


 * I am doing some analysis just to get an idea of how common these phrases are. Having sat next to my daughter on MSN watching her cheerfully type in "lol", "huh?" and "rofl" on a regular basis, that is how (despite my age) I know what they mean. I would use this evidence to pinpoint age group, as that is all the information it will reliably give you. All her friends use these terms regularly - take a look at MySpace to get some idea of how common this is. As this is an international project we need to define common by region - I'm UK. If something is uncommon in the US and someone is obviously editing from a US timezone then this is more definitive. However, again, circumstances can affect things. At that age it would have appeared that I lived in the US by my time zone editing as those were the hours I chose to work to get full access to the systems.
 * I started this as I can't spell, and I tend to copy the last instance on the page that I see of the word I want to use in the hope that someone else is better than me. I swap between "noone" and "no one" because neither looks right, and "consensus" was a nightmare when I first joined. Also, to pick up your "both interested in engineering" point - as a physicist, daughter of an engineer, married to another physicist and mother of an undergrad physicist, I can say that spelling is a known weak point with these types :-). Upgrading my web browser to Firefox has helped as I now have a spell checker in edit mode.
 * I have no idea how to get to the edit summaries wiki-wide, but we could lodge a request with the developers for a one-off dump for, say, a day. If you are going to use this stuff you need a baseline or it is meaningless. As for autofill - if they have been daft enough to use that then they need to be shot, but again, like me, they probably just picked up how other people write their edit summaries and copied the style (something I also do, as if I find something informative and clear it makes sense to use a very similar format).
 * 1 in 20 in a project with thousands of people can be considered very common. For 10,000 people at that probability this is 500 people. Sophia 07:31, 22 August 2007 (UTC)
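
The arithmetic above, together with the earlier point about taking two traits together, can be sketched as follows. The 1-in-20 rate and the population of 10,000 are the figures from the comment; the second trait, its rate, and the assumption that the traits are independent are hypothetical, added only to show how combining traits narrows the pool. The function name `expected_count` is likewise made up for this sketch.

```python
def expected_count(population, *rates):
    """Expected number of people showing every trait, assuming independence."""
    probability = 1.0
    for rate in rates:
        probability *= rate
    return population * probability


# Figures from the comment above: 1 in 20 among 10,000 people (about 500).
one_trait = expected_count(10_000, 1 / 20)

# Hypothetical second trait, also 1 in 20 and assumed independent (about 25).
two_traits = expected_count(10_000, 1 / 20, 1 / 20)
```

Real traits are rarely independent (spelling habits and occupation may well be correlated, as noted above), so the combined figure is at best a rough guide without a measured baseline.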