User talk:The Fat Man Who Never Came Back/Admin edits

Per your suggestion
I don't have much time tonight, so this is a rough draft. Cool Hand Luke 03:25, 31 October 2008 (UTC)

Initial impressions: That's as far as I got ... my computer is having a really hard time sorting the chart. How about this stat: first, remove the arbs (as they have the highest RfAR percentage). Then look at mainspace percentage minus the sum of (AN/I, RfA and RfAR percentage). Or absolute edit numbers. Or (mainspace + RFA) - (RfAR + AN/I). Sandy Georgia (Talk) 04:43, 31 October 2008 (UTC)
 * 1) The first time I tried to sort it, my computer hung and I had to restart; click and wait patiently
 * 2) Not sure what these dates mean:  Editing data gathered from 10-27 to 10-31.
 * 3) Most admins most heavily edit mainspace
 * 4) LHvU, get off of AN/I :-)
 * 5) Look at high AN/I percentage relative to low mainspace percentage.  For example, of the top 20 AN/I % admins, MastCell has 30% mainspace, but MastCell is a well-rounded content contributor.  On the other hand, Seicer shows up in the top 20, yet has a higher mainspace percentage than MastCell, so I'm not sure that's a good measure of admin/content contributor efficacy.  MastCell is usually a moderate and mature voice, and a good content contributor as well.
 * 6) Interesting: sort by high AN/I percentage, and you have to scroll long and deep to encounter the names most frequently associated with featured content.
 * 7) It would be interesting to see this data meshed with something like WP:WBFAN, User:Franamax/Ucontribs-0.1 and User:Franamax/Ucontribs-0.3a
 * 8) High RfAR percentage shows mostly arbs, but some standouts.


 * Luke, this is really amazing. I'll respond to a few of Sandy's observations.
 * I have no problem sorting the table at all; goes very smoothly, with a 2-3 second delay at most. Could you try again?
 * "data gathered from 10-27 to 10-31" simply means that any edits the admin made after those dates will not show up in the table (unless we download new data and refresh the table later on)
 * I totally agree that one cannot evaluate the performance of an admin by statistics alone (i.e., this data does not purport to ascertain the quality--as opposed to quantity--of someone's contributions to mainspace or to AN/I). Nevertheless, the table is useful for gauging an admin's primary area of focus.
 * Some of my favorite people (jehochman, NYB--who knew!) are near the top of AN/I % too!
 * Merging this data with how many Featured Articles, B-class articles, etc. each admin had created is an intriguing/promising but also problematic proposition. I need to think about this one.
 * If possible, I do plan to merge this data with a link to each admin's original RfA, so we can see when they were promoted, who voted for them, etc.; I'd also like to analyze each admin's edits after they were promoted, compared to before. This is still in the brainstorming stage.
 * I love the idea for your mainspace percentage minus the sum of (AN/I, RfA and RfAR percentage)--I think that's a fairly meaningful stat, and I think it will be added soon.
 * Thanks, Luke for your help with the data and thanks, Sandy, for the feedback.--The Fat Man Who Never Came Back (talk) 12:03, 31 October 2008 (UTC)
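For reference, the two candidate stats proposed above can be written out as a rough sketch. The function and parameter names are placeholders (nothing here comes from the actual scripts), and the sample percentages are invented for illustration, not read off the table.

```python
def focus_score(mainspace_pct, ani_pct, rfa_pct, rfar_pct):
    """Mainspace percentage minus the sum of (AN/I, RfA and RfAR percentage)."""
    return mainspace_pct - (ani_pct + rfa_pct + rfar_pct)


def alt_focus_score(mainspace_pct, ani_pct, rfa_pct, rfar_pct):
    """Variant from the same comment: (mainspace + RfA) - (RfAR + AN/I)."""
    return (mainspace_pct + rfa_pct) - (rfar_pct + ani_pct)


# Hypothetical admin: 30% mainspace, 10% AN/I, 2% RfA, 1% RfAR
sample = dict(mainspace_pct=30.0, ani_pct=10.0, rfa_pct=2.0, rfar_pct=1.0)
print(focus_score(**sample))      # first proposal
print(alt_focus_score(**sample))  # second proposal
```

Either score could be computed from absolute edit counts instead of percentages by passing counts in place of the percentage arguments.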


 * Ok, some initial reactions (I don't have much time to fix it right now).
 * I know big sortable tables are very cumbersome on some computers. I myself have successfully hung a browser trying to sort the big table at WP:ADMINSTATS. I don't want this one to get much bigger, but I will add a few niceties.
 * The dates are when I gathered the data. So they aren't up-to-the-minute. I have a second script to update everything relatively quickly, but it didn't work the first time I tried it yesterday, and I didn't have much time, so I posted this data. I will get these updated.
 * They do. In fact, I was thinking of turning that row into a "most common besides mainspace" column. I would mark the mainspace number in bold if it's the highest, and the second column in bold if it's the highest (as it is only for a minority).
 * As for the last few points, I do have some ideas. First, I would like to make a table of contributions in the last year. I think this will give a better snapshot of the current state of things. Grunt, for example, is no longer deeply involved with ArbCom (or anything). I would leave off admins with no edits in the last year, which will make the table much more compact. Probably put it into a different subpage for efficiency. As for merging with featured data, I have this idea (let me know what you think): number of edits to articles which were subsequently promoted to FA according to this log. It's imperfect, but I think it will help capture some of the work that goes into FAs. Nominators are an unfair metric, because many nominators are third parties, or involved in a truly group effort.
 * General problem with using mainspace edit counts as an indicator of article-writing: the fastest way to rack up mainspace edits is to revert vandalism. We should discover a statistical proxy for actual content edits. Perhaps # of mainspace edits with more than two edits to the same article in a one-month period. Maybe # of mainspace edits not marked "revert" or "rv," etc. That way, RC reverts will mostly not show up, but if you're making several edits to the same article (presumably by improving its content), those will.
 * Let me know. I'll use your feedback to improve this later today! Cool Hand Luke 16:02, 31 October 2008 (UTC)
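A minimal sketch of the content-edit proxy described above, combining both suggestions: drop edits whose summary looks like a revert, then count only edits to articles touched more than twice in a month. The input layout (article, "YYYY-MM", summary), the revert pattern, and the use of calendar months rather than a rolling 30-day window are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed revert markers; real edit summaries use many more conventions.
REVERT_RE = re.compile(r"\b(rv|rvv|revert(ed|ing)?|undid)\b", re.IGNORECASE)


def is_revert(summary):
    """True if the edit summary looks like a revert."""
    return bool(REVERT_RE.search(summary or ""))


def content_edit_count(edits, min_edits=3):
    """Count non-revert edits to articles edited at least `min_edits`
    times in the same calendar month ("more than two" means 3 or more).

    `edits` is an iterable of (article, "YYYY-MM", summary) tuples.
    """
    kept = [(article, month) for article, month, summary in edits
            if not is_revert(summary)]
    per_bucket = Counter(kept)
    return sum(n for n in per_bucket.values() if n >= min_edits)


# Made-up edits for illustration only
edits = [
    ("Article A", "2008-10", "expand lead"),
    ("Article A", "2008-10", "add ref"),
    ("Article A", "2008-10", "copyedit"),
    ("Article B", "2008-10", "rv vandalism"),
    ("Article B", "2008-10", "Reverted edits by X"),
    ("Article C", "2008-10", "fix typo"),
]
print(content_edit_count(edits))
```

Edit wars would still inflate this count if the reverts are not labelled, so the pattern list is the part most likely to need tuning.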

OK, I've refined this a bit. I added ArbCom talk pages to "Arb" because much more of the action is there. I also made a heading for deletion, including DRV. Definitely some specialization there.

I've restricted this page to admins with more than 365 edits in the last year, and added a subpage for the complete list. I've also made a subpage for percentages of edits in the last year. I think this shows a more accurate snapshot of the moment. For example, MastCell is not even in the top 45 on a percentage basis. Perhaps I should filter this subpage too (to >364 or some other threshold?).

I was not able to automate data mining for all of the RFA dates. I got what I could, then I did a bunch by hand, all of the redlinks (suggesting name changes), and all of the dates up through "M" or so, but I don't have enough time now. If anyone would like to pitch in, I would appreciate it. The probable RFA location is linked on the question marks. Not all RFAs pass the first time, so please also update the link to "... 2" or wherever you find the successful RFA. I find it relatively easy to open a bunch of links in separate tabs, then type the dates into notepad. Cool Hand Luke 22:47, 2 November 2008 (UTC)


 * Thank you, CHL, this is staggeringly useful. Filtering out inactive admins and creating sub-sub pages was a great idea.


 * Also, I can do the rest of the RFA dates sometime in the next few days. It will be fun.


 * I wanted to explore your idea of filtering out substantive content edits from mere vandalism reverting, templating, etc. I love the idea of incorporating statistics regarding edits to articles the user has edited repeatedly.  Some ideas for more column headings (I'm just brainstorming here):


 * [Name of] Most edited article
 * Edit count on most edited article
 * ratio of edit count of editor's most edited article to AN/I posts (Sandy calculated this for a few people and found the results amusing)
 * A variation of the above would be (edit count of most edited article) / (AN/I + RfA + RfAR posts)
 * another variation: (SUM of edit count of top 5 [or top 10] most edited articles) / (AN/I + RfA + RfAR posts)
 * We could give this stat a cute name like content-to-drama ratio or something like that
 * % of articles edited that were edited more than once (or twice or 5 times or 10 times, or whatever)--this will mostly filter out the more mindless edits


 * How hard would it be to extract this sort of data from the information you already have?


 * Finally, I'm still not so sure how I want to incorporate data on the featured content/good article contributions, but I should have some ideas soon. Let me know what you think of the above.--The Fat Man Who Never Came Back (talk) 02:50, 7 November 2008 (UTC)
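The column ideas above could be computed roughly as follows. This is a sketch only: it assumes a per-admin dict mapping article title to edit count (a hypothetical layout, not the real data format), and the sample numbers are invented.

```python
def most_edited(article_counts):
    """Return (title, edit count) of the most edited article."""
    title = max(article_counts, key=article_counts.get)
    return title, article_counts[title]


def content_to_drama(article_counts, ani, rfa, rfar, top_n=1):
    """(Sum of edits to the top_n most edited articles) / (AN/I + RfA + RfAR).

    Returns None when the admin has no posts in those venues.
    """
    drama = ani + rfa + rfar
    if drama == 0:
        return None
    top = sorted(article_counts.values(), reverse=True)[:top_n]
    return sum(top) / drama


def pct_articles_edited_more_than(article_counts, threshold=1):
    """Percentage of edited articles with more than `threshold` edits."""
    if not article_counts:
        return 0.0
    repeated = sum(1 for n in article_counts.values() if n > threshold)
    return 100.0 * repeated / len(article_counts)


# Made-up counts for illustration only
counts = {"Article A": 100, "Article B": 50, "Article C": 1, "Article D": 1}
print(most_edited(counts))
print(content_to_drama(counts, ani=20, rfa=3, rfar=2))           # top article only
print(content_to_drama(counts, ani=20, rfa=3, rfar=2, top_n=5))  # top 5 variation
print(pct_articles_edited_more_than(counts, threshold=1))
```

Setting `top_n` to 5 or 10 gives the "top 5 [or top 10]" variation, and raising `threshold` gives the "twice or 5 times or 10 times" variants of the last column.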

Now that I have the data on my hard drive, additional reports are easy. I could include all the statistics you and SG have suggested, but it will be very large. Therefore, I'll limit them to very active admins so that the table doesn't crash systems while sorting (maybe >3600 edits in the last year), and we can decide which statistics are useful in this set, then add them to all the tables. Cool Hand Luke 21:06, 7 November 2008 (UTC)


 * That's excellent to hear. After I look at the resulting stats, I'll try to envision a "main" table that includes just a few of the columns that I find the most interesting, then we could relegate the full data to subpages.--The Fat Man Who Never Came Back (talk) 23:50, 7 November 2008 (UTC)

Multiple accounts
Sorting by ArbCom edits, I was surprised that User:Tony Sidaway is an all-time #2, only behind Fred Bauder. But this is without including his newer two accounts. How should we handle users who had more than one account? For example, the numbers for Bishzilla look strange because most of her editing is with Bishonen. Also, some users specifically broke with their old accounts and had their bit transferred. I did not link to their original RFAs where it appeared there was some effort to obscure their former identity, but perhaps we should include the edits somehow?

On the other hand, I doubt I could make a comprehensive list of alternate accounts, and sticking to the account that has admin privileges is conceptually clean. Cool Hand Luke 00:23, 3 November 2008 (UTC)


 * Did you say you're surprised Tony Sidaway has more ArbCom edits than almost anyone else? Please.


 * In any case, I don't think we should attempt to track alternate non-admin accounts because, among other difficulties, I'm sure a significant number of admins and ex-admins do not disclose or publicize their alternate accounts.


 * I can write footnotes for the very small number of admins who had the bit transferred to a new account and will likewise do my best not to "out" anybody.--The Fat Man Who Never Came Back (talk) 02:21, 7 November 2008 (UTC)

Suggested format
Franamax worked his magic: the numbers I would like to see look like this. If you believe (as I do) that Tim Vickers represents the gold standard in excellence in balance between dispute resolution skills, policy understanding, civility, and mainspace editing, looking at the numbers this way shows that balance.

Tim has healthy contributions to the Administrators' noticeboards, but he has contributed more edits to 9 articles (for more than 5,000 edits total) than he has to AN. He has 12 times as many edits to those articles as he does to AN. He has 2.5 times as many edits to his highest-edited article (Evolution) as he does to AN. This shows an editor with quality mainspace contributions, significant AN input, but a priority and balance towards understanding of mainspace and avoidance of drama at the trainwreck that is AN. Sandy Georgia (Talk) 18:16, 20 November 2008 (UTC)


 * Heh, "The number of articles to which the editor has contributed more edits than to the Administrators' noticeboards." I take it then you want all AN pages, just not ANI?
 * I like that statistic. I can do that, but it'll probably have to wait until Thanksgiving. I think most admins will be underwater, and some of those who aren't would have gotten tallies through edit wars. Cool Hand Luke 20:20, 20 November 2008 (UTC)
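A rough sketch of the statistic quoted above, "the number of articles to which the editor has contributed more edits than to the Administrators' noticeboards", along with the two ratios from the example. It assumes a dict of per-article edit counts and a single combined AN total (both hypothetical inputs), and the numbers are invented rather than anyone's real tallies.

```python
def articles_above_an(article_counts, an_edits):
    """Return (number of articles with more edits than AN, total edits to them)."""
    above = [n for n in article_counts.values() if n > an_edits]
    return len(above), sum(above)


def an_ratios(article_counts, an_edits):
    """Return (edits to above-AN articles / AN edits, top article edits / AN edits).

    Returns None when there are no AN edits or no article edits to compare.
    """
    if an_edits == 0 or not article_counts:
        return None
    _, total_above = articles_above_an(article_counts, an_edits)
    return total_above / an_edits, max(article_counts.values()) / an_edits


# Made-up counts for illustration only
counts = {"Article A": 250, "Article B": 120, "Article C": 90}
print(articles_above_an(counts, an_edits=100))
print(an_ratios(counts, an_edits=100))
```

As noted in the reply above, tallies built up through edit wars would pass this test too, so the figure is probably best read alongside a revert filter.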


 * Franamax has already done all the numbers, and conglomerated them by family (for example, all AN boards and subboards). They only need to be pulled into a table. Maybe Franamax's numbers suggest other options to evaluate? Sandy Georgia (Talk) 21:54, 20 November 2008 (UTC)

Oh. Cool Hand Luke 22:28, 20 November 2008 (UTC)
 * Prob is, it's a lot to wade through: we need a meaningful conglomeration/summary. If you look at the samples I already added, it gives a lot to chew on (for starters).  What is "enough" AN participation?  What is "enough" mainspace participation?  I think Tim Vickers shows a good balance.  Sandy Georgia  (Talk) 22:32, 20 November 2008 (UTC)


 * Dunno, but I got over 30 million edit summaries on my hard drive. I just don't have time. Cool Hand Luke 22:34, 20 November 2008 (UTC)


 * I may chip away at them, but Franamax has a way of surprising me with done deals :-) I'm not sure, though, if I'm looking at the right combo of factors.  The goal is to strike a balance between excellence in editing and dispute resolution.  I've already shown several clearly unbalanced candidates.  Sandy Georgia  (Talk) 22:55, 20 November 2008 (UTC)