Wikipedia:AFD 100 days

A computer script designed by Dragons flight was used to parse 100 days of AFD logs, from June 1, 2005 to September 8, 2005, searching for bolded keywords (e.g. delete, keep, merge, redirect, kill, cleanup) in signed comments. This produced a large statistical sample from which important patterns in voting and article deletion behavior might be identified.

Methodology
The script parsed the 100 days of AFD logs, extracting votes and usernames from lines of the following kind:

* ... Vote text ... ... ... {end of line}

The extracted lines were grouped into individual discussions by searching for the === Nomination Headers ===. The first link to User space after the nomination header was assumed to belong to the nominator and was recorded in a separate category.

There are many ways this script can be fooled when people format their comments in unusual ways (or even with some fairly mundane variations). The hope is that a large enough sample still yields meaningful patterns, even though people who forget to bold their vote or who sign in an unusual place are ignored. (Note: if more than one link to User space was present, the last one on the line was assumed to be the signature.)
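The grouping and extraction rules described above can be sketched roughly as follows. This is a minimal illustration and not the original script: the regular expressions, the function name parse_log, and the sample log are all assumptions, and it presumes the usual wikitext conventions of '''bold''' text and [[User:Name]] links.

```python
import re

# Assumed wikitext conventions (not taken from the original script).
HEADER_RE = re.compile(r"^===([^=].*?)===\s*$")        # level-3 nomination header
BOLD_RE = re.compile(r"'''(.+?)'''")                   # first bolded phrase on a line
USER_RE = re.compile(r"\[\[User:([^|\]#/]+)", re.IGNORECASE)


def parse_log(wikitext):
    """Split a log into discussions and collect (vote_text, username) pairs."""
    discussions = []
    current = None
    for line in wikitext.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = {"title": header.group(1).strip(), "nominator": None, "votes": []}
            discussions.append(current)
            continue
        if current is None:
            continue  # text before the first nomination header
        users = USER_RE.findall(line)
        if not users:
            continue  # unsigned lines are ignored
        if current["nominator"] is None:
            # First User link after the header is assumed to be the nominator.
            current["nominator"] = users[0].strip()
            continue
        bold = BOLD_RE.search(line)
        if line.lstrip().startswith("*") and bold:
            # Last User link on the line is assumed to be the signature.
            current["votes"].append((bold.group(1), users[-1].strip()))
    return discussions


# Hypothetical sample log for illustration only.
SAMPLE = """===[[Example article]]===
Not notable. [[User:Alice|Alice]] 12:00, 1 June 2005 (UTC)
* '''Delete''' per nom. [[User:Bob|Bob]] 13:00, 1 June 2005 (UTC)
* '''Strong keep''', disagree with [[User:Bob|Bob]]. [[User:Carol|Carol]] 14:00, 1 June 2005 (UTC)
* Forgot to bold my vote. [[User:Dave|Dave]] 15:00, 1 June 2005 (UTC)
"""

parsed = parse_log(SAMPLE)
```

In this sketch the unbolded comment from the hypothetical user Dave is dropped, which mirrors the limitation noted above.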

The "vote text" was interpreted by removing a long list of modifiers (e.g. strong, weak, super, borderline, massive) and creating lists of common synonyms (e.g. delete = kill, nuke, destroy; keep = cleanup, revise, expand, don't delete). In this way it was possible to categorize 96% of all vote text as one of: keep, delete, merge, speedy, speedy keep, redirect, bjaodn, rename, transwiki, or comment. The remaining 4% consists of ambiguous statements that could not be interpreted and rarely used phrases (e.g. "grind into a pulp") that were not frequent enough to be worth teaching to the parser, even though their meaning may have been clear. If someone used multiple keywords, e.g. "delete or merge", the vote was usually recorded based on the first recognized word. Some inversion terms (e.g. "don't" in "don't delete") were also processed to handle cases where a keyword did not carry its normal meaning.
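A rough sketch of this interpretation step is given below. The word lists contain only the examples quoted above (the real lists were longer), and the function name classify, the treatment of "speedy keep", and the handling of inversion terms are assumptions made for illustration.

```python
import re

# Only the examples quoted in the text; the original lists were longer.
MODIFIERS = {"strong", "weak", "super", "borderline", "massive"}
SYNONYMS = {
    "delete": "delete", "kill": "delete", "nuke": "delete", "destroy": "delete",
    "keep": "keep", "cleanup": "keep", "revise": "keep", "expand": "keep",
    "merge": "merge", "redirect": "redirect", "speedy": "speedy",
    "bjaodn": "bjaodn", "rename": "rename", "transwiki": "transwiki",
    "comment": "comment",
}
INVERSIONS = {"don't", "dont", "not"}              # assumed inversion terms
OPPOSITE = {"keep": "delete", "delete": "keep"}


def classify(vote_text):
    """Map free-form vote text to a category, or None if it cannot be parsed."""
    words = [w.strip("'") for w in re.findall(r"[a-z']+", vote_text.lower())]
    words = [w for w in words if w and w not in MODIFIERS]
    negate = False
    for i, word in enumerate(words):
        if word in INVERSIONS:
            negate = True
            continue
        category = SYNONYMS.get(word)
        if category is None:
            continue  # unknown word, keep scanning
        next_word = words[i + 1] if i + 1 < len(words) else ""
        if category == "speedy" and SYNONYMS.get(next_word) == "keep":
            return "speedy keep"
        if negate:
            category = OPPOSITE.get(category, category)
        return category  # first recognized word wins
    return None
```

For example, classify("delete or merge") yields "delete" because the first recognized word wins, while classify("grind into a pulp") yields None and would fall into the unparsed 4%.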

This can never be as accurate or as complete as processing the AFD votes by hand, and a variety of mistakes and misinterpretations were likely made, but I believe this methodology is more than sufficient to give a broad understanding of AFD patterns.

Overview patterns

 * Percent deleted counts content removal outcomes (delete, speedy, bjaodn and redirect) against content retaining outcomes (keep, no consensus, merge, speedy keep, move, and transwiki).

Condensed voting patterns

 * Content removal options (delete, speedy, nominate, bjaodn and redirect) consolidated under "delete".
 * Content preserving options (keep, merge, speedy keep, move, and transwiki) consolidated under "keep".
 * Comments and unparsed options are ignored and removed from counts.
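The consolidation rules listed above amount to a small lookup. The following sketch assumes votes have already been categorized. The function names and the sample votes are hypothetical.

```python
from collections import Counter

# Category groupings as listed in the notes above.
DELETE_LIKE = {"delete", "speedy", "nominate", "bjaodn", "redirect"}
KEEP_LIKE = {"keep", "merge", "speedy keep", "move", "transwiki"}


def condense(category):
    """Collapse a detailed category to 'delete', 'keep', or None (ignored)."""
    if category in DELETE_LIKE:
        return "delete"
    if category in KEEP_LIKE:
        return "keep"
    return None  # comments and unparsed text are dropped from the counts


def condensed_counts(votes):
    """votes: iterable of (username, category). Returns {username: Counter}."""
    counts = {}
    for user, category in votes:
        side = condense(category)
        if side is None:
            continue
        counts.setdefault(user, Counter())[side] += 1
    return counts


# Hypothetical votes for illustration only.
example_votes = [
    ("Alice", "nominate"), ("Alice", "delete"), ("Alice", "merge"),
    ("Bob", "comment"), ("Bob", "speedy keep"), ("Bob", None),
]
example_counts = condensed_counts(example_votes)
```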

Expanded low voter count

 * Same as above, but for the very infrequent voters. Done at the request of User:Fubar Obfusco.

Deletionist vs. Inclusionist tendencies
Expressed in terms of how often they vote delete, this table summarizes the tendencies of AFD regulars.

AFD outcomes

 * "uncertain" represents all of the AFDs whose outcome the program was unable to parse.
 * no consensus results are included under keep.

Note: Combining the content removal options (delete, speedy, redirect, and bjaodn) and discounting the 6.4% of "uncertain" conclusions indicates that 75.2% of AFDs result in content being "deleted", versus 24.8% with content preserving conclusions (keep, speedy keep, merge, move, transwiki).
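The renormalization behind this note can be sketched as below. It uses only the three percentages quoted in the note. The function names are hypothetical, and the shares of all AFDs are back-computed here, not read from the outcome table.

```python
UNCERTAIN = 0.064  # share of AFDs whose outcome could not be parsed


def share_of_parsed(share_of_all, uncertain=UNCERTAIN):
    """Re-express a share of all AFDs as a share of the parsed AFDs only."""
    return share_of_all / (1.0 - uncertain)


def share_of_all(parsed_share, uncertain=UNCERTAIN):
    """Inverse: convert a share of parsed AFDs back to a share of all AFDs."""
    return parsed_share * (1.0 - uncertain)


# Back-computed from the quoted 75.2% and 24.8% (an inference, not table data).
removed_of_all = share_of_all(0.752)    # about 70.4% of all AFDs
retained_of_all = share_of_all(0.248)   # about 23.2% of all AFDs
```

Under this reading, the removed, retained, and uncertain shares together account for all AFDs.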

User patterns
The voting patterns for the top 30 participants on AFD.

See also: An expanded list for all participants averaging more than one vote per day.

Condensed voting patterns
Simplified keep/delete count (see overview for description)

Most frequent closers

 * Estimated threshold is the percentage of delete votes this person typically requires before closing an AFD as a deletion. This threshold can be significantly distorted for closers who avoid controversial votes (see deviation below).
 * Adherence is the fraction of closes performed that appear consistent with this admin's threshold.
 * Estimated deviation is the estimated number of AFD results that would have to be changed if this admin adopted the average threshold of 63.5%. Admins who rarely close controversial votes (e.g. nothing in the 50-80% range) may have a deviation of 0 even if their estimated threshold is substantially displaced.
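The page does not say how these three quantities were computed, so the following is only one plausible reading. It assumes the threshold is the cut-off that makes the most closes consistent with the rule "delete when the delete fraction reaches the threshold", and that deviation counts the closes whose delete fraction lies between the admin's threshold and the 63.5% average. All function names and the sample closes are hypothetical.

```python
AVERAGE_THRESHOLD = 0.635  # average threshold quoted above


def predicted_delete(fraction, threshold):
    return fraction >= threshold


def consistent_closes(closes, threshold):
    """closes: list of (delete_fraction, was_deleted)."""
    return sum(1 for fraction, deleted in closes
               if predicted_delete(fraction, threshold) == deleted)


def estimate_threshold(closes):
    """Pick the candidate cut-off that agrees with the most closes.

    Candidates are 0, 1, and the midpoints between adjacent observed
    delete fractions; ties go to the lowest candidate (an arbitrary choice).
    """
    fractions = sorted({fraction for fraction, _ in closes})
    midpoints = [(a + b) / 2 for a, b in zip(fractions, fractions[1:])]
    candidates = [0.0] + midpoints + [1.0]
    return max(candidates, key=lambda t: consistent_closes(closes, t))


def adherence(closes, threshold):
    return consistent_closes(closes, threshold) / len(closes)


def deviation(closes, threshold, average=AVERAGE_THRESHOLD):
    """Closes that the admin's threshold and the average would decide differently."""
    return sum(1 for fraction, _ in closes
               if predicted_delete(fraction, threshold) != predicted_delete(fraction, average))


# Hypothetical closes by one admin: (fraction of delete votes, closed as delete?)
example_closes = [
    (0.30, False), (0.50, True), (0.55, False), (0.66, False),
    (0.70, False), (0.80, True), (0.90, True), (1.00, True),
]
example_threshold = estimate_threshold(example_closes)
example_adherence = adherence(example_closes, example_threshold)
example_deviation = deviation(example_closes, example_threshold)
```

With this definition, a closer whose AFDs all fall far from both thresholds gets a deviation of 0 regardless of where the estimated threshold lands, which matches the caveat in the last note.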

Deletion as a function of vote percentage

 * No compensation for sockpuppets / anon votes
 * Some of the more perverse results also reflect parser error. For example, some of the 2% of AFDs that were kept despite apparently unanimous delete votes contained strangely formatted or labeled keep votes that the parser was unable to count.
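A tabulation of this kind can be sketched by binning each AFD on its share of delete votes and recording how often each bin ended in deletion. The 10-point bin width, the choice to fold unanimous results into the top bin, the function name, and the sample data are all assumptions.

```python
def deletion_rate_by_bin(afds, bin_width=10):
    """afds: iterable of (delete_votes, keep_votes, was_deleted).

    Returns {bin_start_percent: fraction of AFDs in that bin that were deleted}.
    """
    totals = {}
    deleted_counts = {}
    for delete_votes, keep_votes, was_deleted in afds:
        votes = delete_votes + keep_votes
        if votes == 0:
            continue  # nothing countable was parsed for this AFD
        pct = 100 * delete_votes / votes
        # Fold 100% into the top bin so every bin spans bin_width points.
        bin_start = min(int(pct // bin_width) * bin_width, 100 - bin_width)
        totals[bin_start] = totals.get(bin_start, 0) + 1
        deleted_counts[bin_start] = deleted_counts.get(bin_start, 0) + int(was_deleted)
    return {b: deleted_counts[b] / totals[b] for b in sorted(totals)}


# Hypothetical AFDs for illustration only.
example_afds = [(3, 0, True), (5, 0, False), (2, 1, True), (1, 1, False), (0, 4, False)]
example_rates = deletion_rate_by_bin(example_afds)
```

In this sample the second AFD is kept despite unanimous delete votes, the kind of result the note above attributes partly to parser error.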

Votes per article

 * The nomination counts as one vote. Those AFDs with only 1 or 2 votes were typically speedy deleted shortly after listing.

Contested AFDs
Number of AFDs with at least the specified number of both keep and delete votes, and such AFDs as a percentage of all AFDs.
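As a minimal sketch of this count, assuming each AFD has been reduced to a pair of condensed keep and delete totals (the function name and sample data are hypothetical):

```python
def contested(afds, minimum):
    """afds: list of (keep_votes, delete_votes).

    Returns (count, percentage of all AFDs) for AFDs with at least
    `minimum` keep votes and at least `minimum` delete votes.
    """
    count = sum(1 for keep, delete in afds if keep >= minimum and delete >= minimum)
    percentage = 100 * count / len(afds) if afds else 0.0
    return count, percentage


# Hypothetical vote totals for illustration only.
example_afds = [(0, 5), (1, 4), (2, 2), (3, 6), (5, 5)]
contested_rows = {n: contested(example_afds, n) for n in (1, 2, 3)}
```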

AFDs with at least 2/3 delete
Shows the number of AFDs with at least the specified number of keep votes and those with at least twice as many delete votes as keep votes.
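One way to read this description is sketched below: among AFDs with at least the specified number of keep votes, count those where delete votes are at least double the keep votes, which is the same as delete votes making up at least 2/3 of the condensed total. Treating the second count as a subset of the first is an assumption, and the function name and sample data are hypothetical.

```python
def two_thirds_delete(afds, min_keep):
    """afds: list of (keep_votes, delete_votes).

    Returns (AFDs with at least min_keep keep votes,
             those among them with delete >= 2 * keep).
    """
    with_keeps = [(keep, delete) for keep, delete in afds if keep >= min_keep]
    # Integer comparison avoids rounding issues right at the 2/3 boundary.
    lopsided = [(keep, delete) for keep, delete in with_keeps if delete >= 2 * keep]
    return len(with_keeps), len(lopsided)


# Hypothetical vote totals for illustration only.
example_afds = [(0, 5), (1, 4), (2, 2), (3, 6), (5, 5)]
example_min1 = two_thirds_delete(example_afds, 1)
example_min3 = two_thirds_delete(example_afds, 3)
```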