Wikipedia:Ethically researching Wikipedia

In social science research, issues of research ethics, informed consent, and research protocols often arise, and research on Wikipedia is no exception. Rules and laws established after controversial studies such as the Milgram experiment and the Stanford prison experiment require researchers to design their studies so that they do no harm to participants.

Researchers are expected to adhere to professional codes of ethics. Where required, they may also need to obtain permission from the appropriate bodies at their research institutions before carrying out research on Wikipedia editors.

Codes of ethics
Researchers are expected to adhere to their respective professional organizations' codes of ethics. In the context of researching Wikipedia, examples of relevant codes of ethics include:
 * American Association for Public Opinion Research code of ethics
 * International Sociological Association code of ethics
 * American Sociological Association code of ethics
 * American Psychological Association code of ethics

Obtaining permissions
In the United States, for example, any research involving human subjects that receives federal funding must be approved as ethically responsible by an Institutional Review Board (IRB) in order to protect both researchers and participants.

At institutions that have an Institutional Review Board (IRB) or a similar body, researchers should notify their IRB that they are conducting such research. It is the IRB's responsibility to determine whether a given activity is exempt, and IRBs often have forms for this purpose (e.g., http://www.irb.cornell.edu/forms/ has a Request for Exemption from IRB Review form).

Content analysis
Most research on Wikipedia does not involve ethical issues of informed consent. Because all contributions to Wikipedia are publicly released under the GNU Free Documentation License and the Creative Commons Attribution-ShareAlike 4.0 International License (see Copyright), content analysis (the analysis of publicly available pages, archives, or logs) is generally considered exempt from such requirements and does not require IRB approval.

Surveys and interviews
If researchers distribute surveys to Wikipedians or interview them privately, this does require IRB approval.

It is both easy and standard to inform participants that their responses will be made public in certain ways and kept confidential in other ways. Common elements required by IRBs for surveys and interviews include a recruitment script that gives participants information about the research project and their rights, to be emailed to an editor or posted on the editor's public wiki discussion page. An example of such a script:

I am RESEARCHER'S REAL NAME, and I am a researcher at SPECIFIC UNIVERSITY. I am conducting research on SUBJECT. The purpose of this research is to GATHER INFORMATION ON/CONTRIBUTE TO/ETC. For that reason, we will be SURVEYING/INTERVIEWING PEOPLE LIKE YOU, asking them to COMPLETE A BRIEF (X MINUTES) QUESTIONNAIRE/PARTICIPATE IN A BRIEF (X MINUTES) INTERVIEW.

If you are willing to participate, our QUESTIONNAIRE/INTERVIEW will ask you about SUBJECT. There are no foreseeable risks or benefits to you associated with this project. All responses are confidential. Your participation is voluntary and you may withdraw from this project at any time.

This study is conducted by RESEARCHER'S REAL NAME, who can be reached at RESEARCHER'S CONTACT INFO (EMAIL/PHONE) at any time.

Thank you very much for your time,
RESEARCHER'S REAL NAME

Ethnographic research
When ethnographers enter Wikipedia and interact with editors in real time, issues of informed consent emerge. Ethnographic interactions may be considered "interventions" and therefore require getting the informed consent of participants beforehand. The fact that many Wikipedians are under the age of 18 and are therefore defined as "children" in certain jurisdictions makes such research problematic as well. It would be difficult to have each participant in, say, a deletion debate sign a digital form before the ethnographer began participating. However, there must still be a way of respecting editors as they interact with the ethnographers.

It is customary for Institutional Review Boards to waive the requirement for direct informed consent if the ethnographer works with the community to develop a commonly agreed-upon research protocol that establishes an alternative way of informing participants that they are the subject of research. This page is an attempt to establish such a research protocol. It takes the form of a pledge or agreement between the ethnographer and the Wikipedian community. It is not the only possible research protocol, but it can be used by any researcher who wishes to ethically research Wikipedia by actively interacting with the community. Feel free to change any of these requirements or add another distinct set of your own.

Staeiou's ethnographic research protocol

 * 1) I will recognize that as an ethnographer, I am a guest of the Wikipedian community and the Wikimedia Foundation. As such, I will respect any decisions made by the community, the Arbitration Committee, or the Wikimedia Foundation regarding the way in which I participate in the project and collect data about my experiences.
 * 2) I will fully disclose myself as a researcher of Wikipedia on my account's userpage and user talk page. Here, I will explain who I am, what I am doing and why, my research protocols, ways to opt-out of research, and University administrators or faculty members who can be contacted if concerns arise with my research.
 * 3) I will have a signature that shows my status as a researcher of Wikipedia to let editors know that I am interacting with them in such a role. This will include a link to the above research description and my talk page. For example: "Staeiou | I'm researching Wikipedia | Questions, concerns, comments?". I will sign every contribution I make to talk or process pages.
 * 4) When collecting data and publishing results, I can refer to the specific actions of editors or quote them using their username. I can also publish information they have made public on their userpage, their edit/log history, and the results of various programs that analyse publicly available data, like Interiot's edit counter.
 * 5) I will let editors opt-out of my research. Any editor will be free to tell me that he or she does not wish to be a subject in my research. If this happens, I will not communicate with him or her further, and I will exclude from my research any existing data specifically based on my interactions with him or her.
 * 6) If my research leads me to communicate with Wikipedians off-wiki – whether via e-mail, chat, in person, or other medium outside of the public wikispace – I will use established interview-based research protocols to establish informed consent. This means that those who communicate with me off-wiki will be initially informed of my research project and asked to digitally consent to such communication being used for research purposes.
 * 7) Unless I am explicitly told otherwise, I will assume that all off-wiki conversations are off-the-record and cannot be quoted in full or in part, attributed, or alluded to either on-wiki or in published works. In each case, I will work to mutually establish the privacy level of the conversation, that is, the conditions under which such conversations can be used for research purposes.
 * 8) When archiving off-wiki conversations, I will include the level of privacy agreed to by the participant and their statement consenting to be the subject of research. If for any reason a level of privacy or statement of consent is not attached to an archived conversation, I will assume it is off-the-record and cannot be used for research purposes.
 * 9) I will archive off-wiki conversations in a password-protected, encrypted file which only I can decrypt.
 * 10) I will work to minimize risks to subjects by focusing on topics directly or indirectly related to Wikipedia, encyclopedia-building, and the community. To protect subjects, I will not discuss personally sensitive topics, such as editors' past or current illegal behavior, sexual behavior, medical or psychological care, and drug or alcohol use. If editors express these or other personally sensitive topics, I will not include them in my research.
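
As a rough illustration of how points 5, 7, and 8 above might be applied when handling archived data, the following Python sketch treats any conversation without a recorded privacy level or consent statement as off-the-record and drops data from editors who have opted out. It is only a sketch under stated assumptions: the class name, field names, function names, and privacy-level labels are all hypothetical and are not part of the protocol itself, and it does not cover the encryption required by point 9.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

# Hypothetical label for the default privacy level described in point 7.
OFF_THE_RECORD = "off-the-record"


@dataclass(frozen=True)
class ArchivedConversation:
    """One archived off-wiki conversation (all field names are illustrative)."""
    editor: str                               # participant's username
    text: str                                 # conversation content
    privacy_level: Optional[str] = None       # level agreed with the participant
    consent_statement: Optional[str] = None   # participant's statement of consent


def effective_privacy(conv: ArchivedConversation) -> str:
    """Point 8: a missing privacy level or consent statement means off-the-record."""
    if not conv.privacy_level or not conv.consent_statement:
        return OFF_THE_RECORD
    return conv.privacy_level


def usable_for_research(
    conversations: Iterable[ArchivedConversation],
    opted_out: Set[str],
) -> List[ArchivedConversation]:
    """Keep only conversations that are on the record and whose editor
    has not opted out of the research (point 5)."""
    return [
        conv
        for conv in conversations
        if conv.editor not in opted_out
        and effective_privacy(conv) != OFF_THE_RECORD
    ]
```

Given an archive of such records and a set of usernames that have opted out, `usable_for_research(archive, opted_out)` returns only the conversations that may be considered for research. Making the restrictive case the default means that an incomplete record is excluded rather than used by accident.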

Best practices
If you interact with Wikipedia, please don't disrupt the editing process. To avoid disrupting it, you will need to understand it. Help is available, and listening to feedback is strongly advised.


 * Methodology that interferes with the main goals of the encyclopedia is unlikely to get consent.
 * Studies should avoid interfering with the work of others, including by wasting volunteer time.
 * Do not disrupt Wikipedia to illustrate a point, even if that point is in the name of research. Vandalism for any reason is not tolerated, and is a very fast way to generate hostilities between the Wikipedia community and the research community.


 * In order not to unwittingly violate community rules or norms, at least one author should have become an editor and learned the culture of the community before starting.
 * If in doubt, ask for help.
 * The page "Wikipedia editing for research scientists" may be a good starting point.
 * Generally speaking, communication is mandatory. Editors should watch their notifications and respond when people talk to them, although there is no expectation that editors always be available.


 * Consult with and gain the consent of the community before beginning.
 * Wikimedia Foundation employees are analogous to civil servants; consulting with them is not equivalent to consulting with the community.
 * It is recommended that you create a page describing your project at Research (use the "Create a new project page" box). People may comment on your project on its Talk page (see "Talk" tab).
 * It would be a good idea to disclose your research project on your user page (link to your description); if you are paid to do your research, you must disclose that.
 * You may also want to consult with people at the relevant WikiProject(s).