Datasphere

The datasphere is a multidisciplinary concept that first appeared in the 1980s. While many terms have been adopted to describe the digital world – terms such as the Internet, cyberspace, metaverse – the various concepts of the datasphere seem to address the growing dependency of human activities on data, as well as approach the digital world in a holistic manner. Related terms include data economy, data governance, data commons, and data management.

History of the term
The term "datasphere" has been used to broadly define digital space and information, particularly in relation to information flow, data, and digital platforms. Use of the concept has grown steadily since the 1980s, and it has been applied in a variety of contexts ranging from product names and conference titles to terms in science fiction.

The 'datasphere' as a concept was popularized in the 1990s by Douglas Rushkoff, a media theorist, writer, and advocate of cyberpunk culture and open-source solutions to social problems. He approached the datasphere as the "circulatory system for today's information, ideas and images", understood as "our new natural environment". Rushkoff's conceptualization, centered in media theory, was deployed to explain how 'media viruses' (ideas that capture public attention) spread rapidly. As such, Rushkoff's datasphere invokes ideas of information flow rather than focusing on structured data and its analysis.

Around the same time as Rushkoff's global datasphere concept was coined, others were writing of the 'personal datasphere', drawing more upon the idea of a datasphere as a stock of data. The personal datasphere concept envisions multiple dataspheres, each with its own center (e.g. a person, with a personal datasphere encompassing all the data about that individual; a location, such as a shopping center; or a company, and so on).

In 2004, a short paper by lawyer and cyberdefender Andrew Updegrove introduced the concept of a "Personal Data Sphere". Updegrove (2004) conceptualized the Personal Data Sphere (PDS) with a nod back to Pierre Teilhard de Chardin's 1925 concept of a 'Noosphere', a "layer of consciousness surrounding the globe, comprising all human thought and culture". Updegrove's PDS resonates with contemporary concepts such as MiData, Vendor Relationship Management, and Personal Data Stores. His concept refers to personal digital data such as birth and death records, and parental estate planning documents.

In 2015, law professor Stephen Humphreys alluded to the idea of "living in the datasphere" in his article "Conscience in the datasphere", where he attempted to reframe the debate on privacy, law, and technology, using the datasphere to communicate the public's immersion in data.

The term emerged again in 2015 when a group of doctors published their paper "The Project Data Sphere Initiative: Accelerating Cancer Research by Sharing Data" in The Oncologist journal. Acknowledging that cancer research can be advanced through access to historical data from clinical trials, the authors introduce the Data Sphere as a digital database which allows researchers to electronically share cancer clinical trial data.

In 2016, the term was also adopted in the medical arena. It was used to refer to domain-specific 'data spaces', as in Jérôme Béranger's 'Big Data and Ethics: The Medical Datasphere'. Béranger discusses the use of digital information and the need to find a balance between confidentiality and transparency. The datasphere here refers to the "massive data" and the ethical issues surrounding its design and manipulation, especially regarding personal data.

Bergé, Grumbach, and Zeno-Zencovich (2018) describe the datasphere as an emerging space, hosted primarily through digital platforms. They describe how:


 * "The notion of 'Datasphere' proposes a holistic comprehension of all the 'information' existing on earth, originating both in natural and socio-economic systems, which can be captured in digital form, flows through networks, and is stored, processed and transformed by machines."

This parallels the idea of the Infosphere introduced by Luciano Floridi (2007) as "the whole informational environment constituted by all informational entities (thus including informational agents as well), their properties, interactions, processes, and mutual relations". However, where Floridi's concept includes "offline and analog spaces of information" alongside digital data, the scope of the datasphere is more tightly defined, concerned primarily with digital representations of the world that have been "found, collected and organized".

Overall, the notion of the 'datasphere' is increasingly used and adopted to define the complex digital ecosystem we are currently navigating.

Contemporary uses of the datasphere
As a metaphor and as a complex system, the notion of 'datasphere' describes a highly complex digital ecosystem and, more broadly, addresses the question of the type of society we want to build.

Datasphere as a spatial metaphor
The term 'datasphere' is used as a spatial metaphor. For instance, it was adopted by the GEODE Center, a research and training center that studies "the strategic and geopolitical issues of the digital revolution". GEODE's objective is twofold. On one hand, it seeks to study the datasphere as a geopolitical "object in its own right". On the other hand, it uses the resources of the datasphere to conduct geopolitical analyses. As part of its research work, the Center has developed a cartography of the datasphere, which focuses not only on regional approaches to the digital space, data flows, logical and physical routes, and social networks, but also on the distribution of power across geographies.

Both Floridi and Bergé et al. see these new spheres as 'spaces we inhabit': architectures and ecosystems affecting the way daily life is lived. For Bergé in particular, the spatial metaphor of the datasphere highlights the way in which datafication reconfigures relations between "conventional institutional territories (e.g. States, towns, international and regional organizations)", and "gives rise to new territories". For some authors, the idea of 'living in the datasphere' brings to mind the 'public sphere' and the 'Hobbes's bargain' on which public institutions are arguably grounded, albeit more shakily as the expansion of an unregulated datasphere undermines the authority and effectiveness of classical institutions.

As the datasphere is increasingly perceived as an ecosystem and a space we inhabit, new collective data governance frameworks have also arisen across the globe. These frameworks might replicate some datasphere elements in their design. For example, governance tools such as data commons, data trusts, cooperatives, collaboratives, and data pools can all be used to navigate the datasphere collectively, which in turn adds to its complexity.

The datasphere could also be understood as a natural ecosystem. Just as energy flows in nature and cycles continuously among ecosystems, fast-paced and complex data flows move through the datasphere. Governance efforts now focus on leveraging free data flows while ensuring the protection of different human groups. Even if data flows naturally, innovative rules are needed to keep these cycles moving and to guarantee that the environment as a whole and its sub-systems are protected.

The notion of datasphere is related to that of cyberspace, which describes widespread, interconnected digital technology. The datasphere encompasses the notion of cyberspace while adding layers of complexity, namely human groups and norms. In addition, the datasphere considers not only digital technologies but also the different data flows produced in a hyperconnected society.

Datasphere as a complex system
The datasphere, according to the Datasphere Initiative, was first conceptualized in the paper "We Need To Talk About Data: Framing the Debate Around the Free Flow of Data and Data Sovereignty" by Bertrand De La Chapelle and Lorrayne Porciuncula (2021). The Datasphere Initiative has since defined the datasphere "as the complex system encompassing all types of data and their dynamic interactions with human groups and norms".

This formulation draws attention to the co-constituted nature of digital artifacts (datasets), constituencies and social relationships (human groups), and rules and social expectations (norms), and to the multiplicity of each. At the same time, it stops short of a detailed specification of datasets, human groups, or norms, and leaves open the question of how their interaction should be governed.


 * Datasets: All digital data including personal and non-personal, private, and public as presented in datasets which are classified differently depending on sector, use-case, or agreed standard.
 * Human groups: "Individuals and human groups of all sorts generate, collect, store, process, exchange, make accessible or access, analyze, and use data for various purposes."
 * Norms: Cultural, legal, and technical norms "including high-level principles, international agreements, laws and regulatory frameworks, but also contracts, licenses or terms of service, and even code, standards, and software underpinning technical systems (including that of supporting infrastructures)."

The Datasphere Initiative's definition of the datasphere seeks to find connections between contexts and look for policy and governance strategies that may not arise from a focus on embedded local contexts. The concept of human groups, for example, implicitly points to groups that are potentially co-constituted by data infrastructures, and may exist across conventional boundaries of geography and polity. A recognition of both global norms, and a global plurality of norms, calls for governance approaches that have appropriate levels of flexibility and adaptability. The model implies governance of one interconnected datasphere, not many isolated instances.

Just as one can speak of the atmosphere and of local atmospheric conditions, one can speak of the datasphere and of how it is experienced quite differently in different spaces and settings:


 * "The Datasphere (...) is a complex adaptive system, exhibiting the well-documented characteristics of such systems, including: a large number of interconnected agents, non-linear impacts of their actions, positive and negative feedback loops, unintended consequences, structural unpredictability, emergence and path dependencies."

Datasphere and health
The concept of the datasphere has already been applied in several publications and studies related to health. The medical datasphere was a concept adopted by Jérôme Béranger (2016) to talk about Big Data and ethics, and the impact on medical paradigms such as the Hippocratic Oath. Moreover, the Center for Clinical and Translational Science at the University of Kentucky launched a web-based application for researchers to explore de-identified patient data, which it called the DataSphere. Efforts to govern the datasphere and leverage its power for health have also been explored by some researchers. Patient-centric approaches, for example, have been highlighted as necessary for data governance in the healthcare sector.

Datasphere and agriculture
Across the globe, the concept of the datasphere has been gaining importance and prevalence. In Africa, for example, the Technical Centre for Agricultural and Rural Cooperation (CTA) referred to the different "data spheres" that could emerge to replicate the models created for agriculture in other sectors, such as mobile services. The concept was also adopted by researchers mapping livestock population with data from the Food and Agriculture Organization of the United Nations Statistical Database (FAOSTAT). In Germany in 2022, the name was given to software that helps optimize irrigation through data-driven decisions.

Datasphere and climate change
As one of the most pressing issues nowadays, climate change is one of the topics related to the governance of the datasphere. Companies are increasingly seeking to adopt sustainable models when it comes to data sharing. Seagate, an American data storage company, has framed its approach as "advancing a more sustainable datasphere", under which it seeks to power its global footprint with 100% renewable energy by 2030 and to achieve carbon neutrality by 2040. In addition, several organizations tackling climate change rely on the Worldwide IDC Global DataSphere Forecast, 2022–2026, a series of documents written by the International Data Corporation (IDC) that predicts that the Global Datasphere will grow from 33 zettabytes (ZB) in 2018 to 175 ZB by 2025, and will double in size from 2022 to 2026.
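To give a sense of scale for these forecast figures, the following is a minimal arithmetic sketch in Python. It uses only the numbers quoted above (33 ZB in 2018, 175 ZB in 2025, and a doubling between 2022 and 2026) and assumes, purely for illustration, that growth is constant from year to year. The function name and the constant-growth assumption are not taken from IDC.

```python
# Illustrative arithmetic only. The 33 ZB (2018) and 175 ZB (2025) figures are
# the ones quoted in the text; the constant yearly growth rate is an assumption
# made for this sketch, not a claim about IDC's methodology.

def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must all be positive")
    return (end / start) ** (1 / years) - 1


# Growth implied by 33 ZB (2018) -> 175 ZB (2025), a span of 7 years.
rate_2018_2025 = compound_annual_growth_rate(33.0, 175.0, 2025 - 2018)

# Growth implied by a doubling between 2022 and 2026, a span of 4 years.
rate_2022_2026 = compound_annual_growth_rate(1.0, 2.0, 2026 - 2022)

print(f"Implied growth 2018-2025: {rate_2018_2025:.1%} per year")
print(f"Implied growth 2022-2026: {rate_2022_2026:.1%} per year")
```

Under the constant-growth assumption, the first forecast corresponds to roughly 27% growth per year and the doubling to roughly 19% per year.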

Datasphere and gender
Since the datasphere contains data from and about different human groups, ensuring that it is diverse and equitable is fundamental to breaking gender bias within the datasphere. Diversity has been highlighted as key for effective and future-oriented data management, and thus for a more inclusive datasphere. Companies such as Seagate have also pointed to the importance of crafting an inclusive datasphere, not only in terms of gender but also of ethnicity, nationality, and other characteristics. Gender data and gender data governance are now widely promoted, as are data management and data governance frameworks that take inclusive perspectives.

Datasphere and metaverse
The datasphere is related to recent metaverse developments. Although it emerged as a science fiction concept, the metaverse has become real through the work of private and big tech companies, as well as other actors. The existence and increasing use of metaverse spaces have been highly criticized, and people around the world have expressed concerns about governance within the datasphere, as well as access to it. Several researchers have insisted that there is a link between the efforts to govern the datasphere and the regulation of the metaverse, which will change the language of regulation and potentially create new digital rights.

Use of the datasphere in popular culture
The datasphere has been utilized to refer to science fiction concepts. For instance, the Historical Dictionary of Science Fiction defines the datasphere as "the notional environment in which digital data is stored; esp. the internet viewed in this way; (also) the realm of virtual reality; cyberspace n.". The Historical Dictionary of Science Fiction provides a timeline of references to the term in popular science fiction media including books, articles, and online forums.

Hyperion by Simmons described the datasphere as a constant delight: "I called up information almost constantly, living in a frenzy of full interface". In Cyberia, Rushkoff mentioned that anyone could access the datasphere, "a web of telecommunications and computer networks stretching around the world and into outer space", through a personal computer and a modem. The Year's Best Science Fiction: Twenty-Sixth Annual Collection by Dozois mentions the datasphere of Dione, which was "crawling with agents".

C. A. Mason's science fiction book "Datasphere: The New Epic Sci-fi Virtual Reality Adventure" (2016) uses the term again. In this context, the datasphere is a corrupt virtual reality created by "a program that became a world" whose soil is defined as computer circuitry. When defining the term cyberpunk, Lawrence Person of Nova Express also adopted the term datasphere, saying that:


 * "Classic cyberpunk characters were marginalized, alienated loners who lived on the edge of society in generally dystopic futures where daily life was impacted by rapid technological change, an ubiquitous datasphere of computerized information, and invasive modification of the human body."

"Voices of the Datasphere", for the tabletop role-playing game Numenera, allows players to "reach and explore" the datasphere. In Numenera, the datasphere is described as:


 * "[A] strange new metaspace [that] is not just an ancient alien data network but in fact multiple such networks, some damaged, incomplete, or aged and evolved past their original purpose—created by myriad civilizations and interacting in unexpected ways. The datasphere offers knowledge to be learned, treasures to be discovered, and wholly new challenges to be overcome."

Finally, a GitHub repository named 'data-sphere' provides code that allows a user to view data visualizations in Virtual Reality.