Data Justice

Data justice is a research framework that aims to integrate social justice concerns into research about data and algorithms. It is an overarching term that is used to describe the implications of activism, policy, research, and governance under conditions of datafication. Data justice investigates how the development of smart technologies involves social and political considerations that may disproportionately impact marginalized groups. This research examines how justice concerns play a role in the way people are made visible, represented and treated through the production and use of their data. As an umbrella term, data justice investigates the impacts of how data is used, as well as responses to data harms and misuse.

Origins
Data justice originated as a response to concerns regarding datafication, the practice of turning behaviour into specific data points through the use of easily accessible information and communications technology. A key element in this process involves the feeding of data back to users in the form of advertising and specific content recommendations. In response to the increased use of behavioural data, critics questioned how this information was used and sold in the data economy. A critique of datafication emerged as a sub-criticism of how surveillance may or may not be employed to track and modify human behaviour. Shoshana Zuboff argues that these practices have become central to the modern economy, creating the condition of surveillance capitalism. This process is enabled by the existence of a platform economy, where infrastructures are developed to enable the mass collection of data.

Data justice sees technology as embedded within a political economy that involves a series of social and political decisions related to monetization and datafication. Within these practices, there has also been an increase in public-private interfaces, where technology is developed for simultaneous use in the public and private sectors. Critical perspectives on the data economy gained prominence following the Snowden leaks and the Cambridge Analytica data scandal, with critics focussing on the purpose of datafication and big data practices. As a research practice, data justice examines how the shift to data-centric infrastructures may or may not create unequal social conditions based on class, race, socioeconomic status, and/or gender. This involves investigating the intersectional consequences of the increased use of automated decision-making systems.

Justice
Data justice grows out of practices in information studies, which consider information as a beneficial resource that should be distributed equally. This involves analyzing the justices and injustices of datafication by considering how technology can enable or hinder fairness and equity. This perspective aims to determine ethical pathways for the continued development of information technology.

Data justice work is concerned with integrating social justice into data science. This focus attempts to balance distributive justice and structural justice to understand how data can play a role in upholding or challenging systemic injustices. Distributive justice involves arbitrating opposing claims through the issuing of material goods, political power, and/or social and political rights. Structural justice reflects the degree to which efforts towards self-determination and self-development are enabled or constrained by institutional conditions and values. Data justice attempts to balance both of these practices under a research agenda that affirms the need for equal distribution while acknowledging the presence of systemic injustices within datafication. Conducting data justice research involves situating technology, data and algorithms as having the potential to reinforce systemic forms of oppression.

Data justice was designed with the goal of developing alternatives to unequal participation in datafication by encouraging the use of data technology as a tool to fulfill aims of social justice and political participation. To accomplish this goal, data justice examines the impacts of institutional uses of data, while also responding to potential data harms and misuses. This brings research from data ethics, algorithmic governance and social justice together under one framework. A focus on justice enables social movements to build alliances and engage with data and algorithms in pursuit of social justice objectives.

Power
Data justice presumes that data is collected under pre-existing conditions related to power. This perspective draws from an account of the historical imperatives of data collection, in which data was collected and used as a tool to consolidate knowledge about certain populations. The collection and use of historical data has been cited as exploitative for groups such as Indigenous communities, populations from the Global South and Black people in North America due to unequal power distributions between researchers and subjects. The prominence of data universalism has been presented as a key factor that ignores the presence of power imbalances in data collection and use.

There are current calls to re-examine the role power plays in the production and use of data in these areas that have been historically exploitative. Data justice encourages this re-examination, as it sees data and technology as part of longer histories of structural inequality and systemic violence. Through investigating the presence of data discrimination within algorithms and datasets, data justice examines the role that data plays in the production of meaning. This process involves an examination of how the unequal dissemination of power in data collection and deployment may lead to the reinforcement of certain discursive frames over others. As such, data justice rejects technological utopianism and data universalism in favour of a framework that accounts for how technology may or may not be used to reflect a dominant system of power.

Intersectionality
Data justice research often involves a specific focus on intersectionality as it relates to the way data is used and collected in ways that are informed by direct experiences and systemic values. Catherine D’Ignazio and Lauren Klein employ intersectionality within their definition of data feminism as a way to examine how unequal power relations may be replicated with data, and a commitment to developing data practices that are oriented around feminist principles of accountability and equity. This perspective builds from the theory of intersectionality as a mode of critique that examines how individuals are impacted by dominant and interlocking systems of power within society. It investigates how one person can hold multiple different race, gender, class, and ability identities, leading to different experiences of systemic oppression. There is a strong body of research that investigates how artificial intelligence systems and Big Data can exacerbate this systemic oppression, bias and inequality. A data justice perspective argues that data is not objective or neutral, but often can enact systemic bias that can worsen inequalities.

Data Agency
Data agency involves connecting agency and democratic processes to data organization. A focus on agency connects to a focus on power. As there is an increasing reliance on data to complete everyday activities, these data practices acquire more power. This increase in power has led scholars to discuss the importance of citizen agency in relation to data structures. Agency involves the way an individual or collective may take action based on a reflection about the conditions of the world around them. Data agency draws from theories of agency to understand how citizens can become involved in the everyday processes of datafication, complementing the roles played by tech companies and government organizations.

A key element of data agency involves integrating data literacy practices into citizen participation initiatives. This perspective argues that in order to engage citizens with data in meaningful ways, data justice work must help citizens understand and reflect on the collection of their data and the implications that come with it. A focus on literacy demonstrates that algorithmic transparency may not provide those without a technical background with a full understanding, limiting the potential for citizen accountability to be fully realized.

Projects working to build data literacy and agency include:


 * The MyData initiative has developed a set of principles relating to how public institutions and companies can govern data according to principles of agency, including citizen involvement.
 * The Information is Beautiful Project is a visual demonstration of different concepts relating to politics, news, science, economics and culture. The project has a section dedicated to tech and digital visualizations, encouraging literacy on different issues relating to technology and Big Data.
 * The Data Scores Investigation Tool was designed to map and investigate the uses of data analytics and algorithms in public services in the UK, as a way to inform citizens about the prominence and potential pitfalls of data scoring - the practice of using online and offline activities to categorize citizens, allocate services, and predict behaviour.
 * DataKind is an initiative that encourages data literacy and interaction with data in a variety of contexts, including hackathon events and longer-term capacity-building projects.

Data Activism
Data activism involves civic engagement as an alternative to opaque datafication. It is characterized by mobilizations against existing data uses and practices. Data activism involves both reactive and proactive responses to the use of data. Proactive responses occur when activists appropriate open data practices to promote social justice and broaden participation in decision making. A reactive approach aims to challenge the presence of algorithmic control that may be utilized by government agencies and corporations. Both of these approaches emphasize the important structural role of intermediary organizations as channels through which citizens can provide feedback and organize collectively.

Examples of data activism include:


 * The role of Mexican bloggers and data mining experts in documenting and observing the rise of bots, trolls and fake profiles on various social media platforms and websites.
 * Members of Anonymous who utilize hacking techniques to leak information related to public safety.
 * Protest organization during the Arab Spring, which demonstrated the key role of social media in disseminating collective information.
 * The End the Backlog movement, which aims to use the presence of a large backlog of data on rape kits in the United States to alter current legislation relating to the processing of these kits.

De-Westernization
De-westernization promotes the inclusion of a decolonial lens in data studies, acknowledging the importance of perspectives from the Global South. By including specific and varied information from the Global South, it is argued that data can more accurately reflect the needs and experiences of marginalized groups. This perspective reflects the claim that data gathering has historically operated as a form of oppression in the Global South.

Through integrating the perspectives of historically marginalized communities, data justice works to empower communities to use data in safe and ethical ways. Practices such as counter-mapping in Indigenous communities and partnering with trusted community partners demonstrate how de-westernization can work within data justice research projects. Throughout the 1970s, Indigenous groups in Canada utilized counter-mapping, alongside visual and oral media, in formal land claim negotiations as evidence of territorial sovereignty. More recently, projects such as the Decolonial Atlas Project have engaged in counter-mapping to demonstrate a wide range of socio-political issues using data. The Big Data From the South initiative calls for a de-westernization of critical data studies through a research agenda that integrates the perspectives of marginalized communities into data studies. The initiative is oriented around five conceptual operations that relate to data justice practices:


 * 1) Moving past universalism
 * 2) Understanding the South as a composite and plural entity
 * 3) Engaging critically with decolonial approaches
 * 4) Bringing agency to the core of analyses
 * 5) Embracing imaginaries of datafication emerging from the Souths

Research Frameworks
Data justice involves a broad and interdisciplinary research agenda encompassing various perspectives in the disciplines of critical data studies, health, international development, machine learning, and public policy. These research agendas are united under a commitment to investigating the role that politics and power play in the collection and use of data. Data justice research can take many forms, and has been conceptualized using different frameworks, reflecting the broad nature of the term.

Five Levels of Data Justice Research
Richard Heeks and Satyarupa Shekhar developed a framework for data justice research that aims to understand the different levels at which data can interact with social justice goals. Their framework encourages researchers to analyze datafication through five different levels of critique:


 * 1) The procedural level reflects fairness in the way data is handled, focussing on collection and analysis procedures.
 * 2) The instrumental level aims to understand how the data is being used and for what purpose(s).
 * 3) The rights-based level reflects adherence to specific data rights such as representation, privacy, access and ownership. This level reflects calls to examine data rights as human rights.
 * 4) The structural level investigates whether the interests and power seen in wider societal practices reflect fair outcomes.
 * 5) The distributive level is an overarching dimension. It investigates fairness in each stage of the data justice research process, understanding how power and privilege are distributed within data systems.

Three Pillars of Data Justice Research
Linnet Taylor depicts data justice research as a project that can integrate three “pillars”. This framework was developed in response to the three original conceptions of data justice: as a concern for governance, as a matter of distributive justice, and as a way to connect to social justice organizations. The three pillars of data justice combine these approaches under a set of key concepts:


 * 1) Visibility focuses on the presence of privacy and representation, connecting with international development studies, human geography and legal scholarship.
 * 2) Engagement with Technology acknowledges the importance of engaging a wide range of stakeholders in data-driven activities, connecting with postcolonial theory and conceptions of digital citizenship.
 * 3) Nondiscrimination involves two dimensions: the power to identify and challenge bias in data use, and the freedom not to be discriminated against.

Four Steps for Data Scientists Engaging with Data Justice
Ben Green proposed four steps for data scientists to engage with data justice research. This perspective argues that data ethics efforts are ill-equipped to generate data science that avoids social harms and promotes social justice. For data scientists to recognize themselves as political actors, Green argues that researchers must complete four stages:


 * 1) Interest involves an initial interest in working directly with social justice concerns.
 * 2) In the reflection stage, data scientists come to recognize how politics and systemic bias may underlie their work.
 * 3) In the third stage, data scientists create applications that address these systemic biases.
 * 4) Practice, as a final stage, involves the long-term project of developing new methods and structures within computer science that focus directly on data justice.

Groups currently working within this framework include Black in AI, LatinX in AI, Queer in AI, and Women in Machine Learning, which seek to integrate the perspectives of marginalized communities into AI development.

Data Stewardship
Data stewardship is an oversight practice relating to the governance of data. It involves developing practices according to the FAIR data principles (findable, accessible, interoperable, reusable). As a data justice practice, data stewardship focuses on participation and community engagement, rejecting opaque practices of data collection, storage and sharing. It builds from the idea that public participation forms a ladder of increasing control at each level - inform, consult, involve, collaborate and empower. Integrating citizen participation into data stewardship enables citizens to gain insight into whether data activism may provide increasing levels of accountability and control. In this framework, members of a datafied system are able to access and understand their data, enabling them to improve upon it as necessary. The integration of participatory data stewardship can complement existing legal and rights-based approaches to improve digital literacy and agency.

Proposals for data stewardship

 * Citizen Data Audits build on the practice of corporate and government data audits to evaluate the standards that government agencies and corporations follow in the governance of data. As data audits became more common following the implementation of the GDPR, the proposal for citizen data audits integrates mechanisms for data literacy, agency and accountability. The BigDataSur initiative presents citizen data audits as starting from lived experiences to establish criteria for auditing the corporate use of personal data.
 * Data commons involve a legal understanding of data in the sharing economy as part of a collective ownership program. This proposal aims to utilize legal tools - such as reciprocity licenses - to emphasize the beneficial societal effects of collective ownership of data.
 * Data trusts are legal agreements that aim to ensure equitable and secure exchanges of data. They work within existing legal structures to provide ethical, architectural and governance support for trustworthy data processing. Data trusts have been proposed as a way to align trust and trustworthiness through a focus on transparency, accountability, and mutual benefit.
 * Harm Records detail the presence of algorithmic harms and/or harms relating to misappropriation of data. The Harm Record developed by the Data Justice Lab details how individuals and communities are harmed by algorithmic systems. The record details different harms according to specific categories: Exploitation; Discrimination; Loss of Privacy; Surveillance, Control and Physical Injury; Manipulation; Exclusion from necessities for life; and Injustice. The development of easily accessible harm records can enable citizens to understand the ways in which data can be used for damaging purposes.

Organizations Conducting Data Justice Research

 * Data Justice Lab: a research centre hosted at Cardiff University dedicated to examining the relationship between datafication and social justice.
 * Data for Black Lives: a social movement organization that joins activists, organizers and mathematicians to develop ways to use data science to benefit the lives of Black people.
 * Algorithmic Justice League: a non-profit organization that aims to raise awareness about the impacts of artificial intelligence, specifically related to AI risk and harm.
 * Our Data Bodies: a research group that examines the ways community information is collected, stored, and shared by government organizations and corporations.
 * The Ada Lovelace Institute: a British policy research institute that examines how data and AI can be developed to maximize social wellbeing.
 * AlgorithmWatch: a European human rights organization that focusses on the ways algorithms can be employed to enhance justice, democracy and sustainability.
 * The AI Now Institute: a research institute that examines policy developments relating to artificial intelligence, focussing on accountability and participation.
 * Detroit Digital Justice Coalition: a collective of organizations in Detroit that focus on ensuring equitable access to and participation in communications technologies.
 * Data + Feminism Lab: a research lab at the Massachusetts Institute of Technology that uses data and computational methods to advance goals of gender and racial equity.

See also:

 * Big Data Ethics
 * Algorithmic Transparency
 * Ethical AI
 * Digital Literacy
 * Data sovereignty
 * Data Activism