User:Ocaasi/AGW

Nom from Ocaasi
I'm pleased to nominate User:West.andrew.g for administrator. Andrew is a University of Pennsylvania PhD candidate in his final year, writing his thesis on security in collaborative online environments. In other words, he builds computer tools that scale to protect Wikipedia and other wikis. He has demonstrated his ability in this area by developing the anti-vandal tool WP:STiki, which to date has been used to revert over 250,000 instances of vandalism. STiki is a novel approach to vandalism detection because it: uses a metadata algorithm to identify and prioritize likely vandalism, including 'subtle' vandalism; presents an interface for human review of lower-confidence but still suspect ClueBot findings (those that cannot be reverted automatically); provides a feed for reviewing external link spam; and finally, engages STiki users through regular recognition and statistical tracking of participation. Andrew has also been a rollbacker since 2010 and has used that capability to manually classify 67,000 instances of suspected vandalism himself.

Andrew has been with us since 2008, but he really came onto my radar in 2010 when he conducted a rigorous but ill-conceived breaching experiment which involved using multiple accounts to test Wikipedia's security and response to spam. Andrew was blocked and negotiated his return to good standing with ArbCom shortly thereafter. Since the breaching experiment, Andrew has shown himself to be willing to work within our community rules and to advance the interests of Wikipedia, not putting his own research priorities above those of our project. He has built tools that permit others to carry on valuable work at a massive scale. Moreover, he has contributed some of the most authoritative and useful scholarship about vandalism detection on wikis of any researcher in the field.

Andrew's scope of research is vast and includes not only vandalism and spam but also suspicious editing by IPs, copyright-violation detection, category organization, deleted content, and article popularity (see here and here). Andrew needs the Administrator tools to continue certain areas of research, part of which involves analyzing revision-deleted content in statistical detail, as well as other aspects of site operations. While he could pursue the Researcher userright, to date this has never been granted to volunteers, only WMF staff. A successful RfA will permit him the access he requires while also demonstrating community support. In all, I believe he's not only someone we want on our side, he has shown that he is on our side, and we should keep him around by enabling him to continue and expand the innovative and compelling work he has undertaken.

Co-nom from Madman
(note, we can move some of the above nom statement to here, or trim/refocus different parts to give both of us something to say. It would make sense for you to focus on the more technical aspects, of course).

Response from AGW
I thank User:Ocaasi and User:Madman for this nomination and it is one I accept. They have well summarized my contributions on the project and I would like to expand on just two main points before the community's discussion/questioning:

Without question, my link spam experiments were contrary to WP:POINT, WP:BEANS, and other community policies. To this end, all I can offer is (a) apologies, (b) evidence of good-faith intentions, and (c) a note on how the events shaped my future/ongoing interaction with the project. My goal was to obtain data on human damage responses that could be used to prevent future -- actually malicious -- incidents of the same type. My findings have since been shared internally and externally, and integrated into my classifiers/tools (see: Final report). The experiments were rigorously planned to minimize harm to human subjects and had IRB approval, and vague details were published only months after the WMF was offered code/consultation on the vulnerability. Regardless of one's stance on such practices, my conditions with ArbCom make clear that no further such experiments should take place; these are conditions I have now honored for several years. Please consider that I am transparent about my real-life identity, and I consider many longstanding community members among my professional colleagues. Also, for anyone concerned that my career and wiki-work present a conflict of interest, very soon I will be taking a research position unrelated to wikis/Wikipedia/collaboration. So, this RfA and the subsequent work that would result from it are done on purely personal/volunteer terms.
 * Regarding the link spam experiments

I think my community involvement here is not best reflected by my contribution history alone. I am a researcher/developer, and I am confident my tools (e.g., WP:STiki, WP:WikiAudit, work-in-progress) and reports (e.g., WP:5000) have enabled others to efficiently perform a volume of work that I could never approach as an individual. Countless researchers have used Wikipedia as a dataset, but I feel I distinguish myself by practically implementing my findings for the benefit of the community and continuing to improve and support these tools long after they have fulfilled their research role. Virtually all of my edits are (a) vandalism/spam reverts or (b) on talk pages in support of my tools/reports. I don't arrive with diverse experience regarding dispute resolution (though I've had to ask some users to stop using my tools, and dealt with a few angry vandals), edit warring, or many of the other oft-discussed topics here. While I may not edit in these spaces, I do understand the processes by which they operate. I follow along at WP:VPT, WP:AIV, WP:ANI, and WP:BRFA -- and subscribe to a number of related mailing lists. I've also attended the past three Wikimania events (Gdansk, Haifa, Washington D.C.).
 * Regarding my technical focus

The catalyst behind this RfA is to obtain access to the administrative toolkit for purposes of data-analysis and tool-building, not so I can use it for my editing and ongoing projects. My 2010 attempt to get the researcher user-right dissolved into a philosophical debate, and when I made a more recent request, my WMF research/legal contacts suggested this RfA was the most appropriate venue to secure the needed permissions. My previous inability to secure the user-rights made my analysis of oversight/deleted revisions a far more challenging process than it needed to be. Regardless, that research showed copyright violations were perhaps the project's biggest vulnerability (understandably, they survive long periods because they are not surface-level damage like vandalism). An autonomous means to discover copyright violations would be very exciting, and indeed, my participation with WP:Turnitin (and the need to view RD1 deleted content) is a catalyst for this request. However, the opportunities do not end there. I hope to analyze article deletion and page protection actions (among others), and to bring machine-learning to bear by creating tools that can autonomously perform/suggest some fraction of these tasks and prioritize the remainder for human review.
 * Use for Administrator tools (this should probably be moved to the question and answer section)