Research transparency

Research transparency is a major aspect of scientific research. It covers a variety of scientific principles and practices: reproducibility, data and code sharing, citation standards or verifiability.

The definitions and norms of research transparency significantly differ depending on the disciplines and fields of research. Due to the lack of consistent terminology, research transparency has frequently been defined negatively by addressing non-transparent usages (which are part of questionable research practices).

After 2010, recurrent issues of research methodology have been increasingly acknowledged as a structural crisis that involves deep changes at all stages of the research process. Transparency has become a key value of the open science movement, which evolved from an initial focus on publishing to encompass a large diversity of research outputs. New common standards for research transparency, like the TOP Guidelines, aim to build and strengthen an open research culture across disciplines and epistemic cultures.

Confused terminologies
There is no widespread consensus on the definition of research transparency.

Differences between disciplines and epistemic cultures have largely contributed to divergent meanings. The reproduction of past research has been a leading source of dissent. In an experimental setting, reproduction relies on the same set-up and apparatus, while replication only requires the use of the same methodology. Conversely, computational disciplines use reversed definitions of the terms replicability and reproducibility. Alternative taxonomies have proposed to do away entirely with the ambiguity of reproducibility/replicability/repeatability. Goodman, Fanelli and Ioannidis recommended instead a distinction between method reproducibility (same experimental/computational setup) and result reproducibility (different setup but same overall principles).

Core institutional actors continue to disagree on the meaning and usage of key concepts. In 2019, the National Academies of Science of the United States retained the experimental definition of replication and reproduction, which remains "at odds with the more flexible way they are used by [other] major organizations". The Association for Computing Machinery opted in 2016 for the computational definition and also added an intermediary notion of repeatability, where a different team of researchers uses exactly the same measurement system and procedure.

Debate over research transparency has also created new convergences between different disciplines and academic circles. In The Problem with Science (2021), Rufus Barker Bausell argues that all disciplines, including the social sciences, currently face similar issues to medicine and the physical sciences: "The problem, which has come to be known as the reproducibility crisis, affects almost all of science, not one or two individual disciplines."

Negative definitions
Due to the lack of consistent terminology on research transparency, scientists, policy-makers and other major stakeholders have increasingly relied on negative definitions: which practices and forms harm or disrupt any common ideal of research transparency.

The taxonomy of scientific misconduct has been gradually expanded since the 1980s. The concept of questionable research practices (or QRP) was first introduced in a 1992 report of the Committee on Science, Engineering, and Public Policy as a way to address potentially non-intentional research failures (such as inadequacies in the research data management process). Questionable research practices cover a large grey area of problematic practices, which are frequently associated with deficiencies in research transparency. In 2016, a study identified as many as 34 questionable research practices or "degree of freedom" that can occur at all the steps of a project (the initial hypothesis, the design of the study, the collection of the data, the analysis and the reporting).

Surveys of disciplinary practices have shown large differences in the admissibility and spread of questionable research practices. While data fabrication and, to a lesser extent, the rounding of statistical indicators like the p value are largely rejected, the non-publication of negative results or the addition of supplementary data are not identified as major issues.

In 2009, a meta-analysis of 18 surveys estimated that less than 2% of scientists "admitted to have fabricated, falsified or modified data or results at least once". The real prevalence may be underestimated due to self-reporting: regarding "the behaviour of colleagues admission rates were 14.12%". Questionable research practices are more widespread, as more than one third of the respondents admitted to having engaged in one at least once. A large 2021 survey of 6,813 respondents in the Netherlands found significantly higher estimates, with 4% of the respondents engaging in data fabrication and more than half of the respondents engaging in questionable research practices. Higher rates can be attributed either to a deterioration of ethical norms or to "the increased awareness of research integrity in recent years".

A new dimension of open science?
Transparency has been increasingly acknowledged as an important component of open science. Until the 2010s, definitions of open science were mostly focused on technical access and on enhanced participation and collaboration between academics and non-academics. In 2016, Liz Lyon identified transparency as a "third dimension" of open science, due to the fact that "the concept of transparency and the associated term ‘reproducibility’, have become increasingly important in the current interdisciplinary research environment." According to Kevin Elliott, the open science movement "encompasses a number of different initiatives aimed at somewhat different forms of transparency."

First drafted in 2014, the TOP Guidelines have significantly contributed to bringing transparency onto the agenda of the open science movement. They aim to promote an "open research culture" and implement "strong incentives to be more transparent". They rely on eight standards, with different levels of compliance. While the standards are modular, they also aim to articulate a consistent ethos of science as "they also complement each other, in that commitment to one standard may facilitate adoption of others."

This open science framework of transparency has in turn been co-opted by leading contributors and institutions on the topic of research transparency. After 2015, contributions from historians of science underlined that there has been no significant deterioration of research quality, as past experiments and research designs were not significantly better conceived and the rate of false or partially false findings has likely remained approximately constant over the last decades. Consequently, proponents of research transparency have come to embrace more explicitly the discourse of open science: the culture of scientific transparency becomes a new ideal to achieve rather than a fundamental principle to re-establish. The concept of transparency has helped create convergences between open science and other open movements in areas such as open data or open government. In 2015, the OECD described transparency as a common "rationale for open science and open data".

Discourse and practices of research transparency (before 1945)
Transparency has been a fundamental criterion of experimental research for centuries. Successful replications became an integral part of the institutional discourse of the natural sciences (then called natural philosophy) in the 17th century. An early scientific society of Florence, the Accademia del Cimento, adopted in 1657 the motto provando e riprovando as a call for "repeated (public) performances of experimental trials". A key member of the Accademia, the naturalist Francesco Redi, described extensively the forms and benefits of procedural experimentation, which made it possible to check for random effects, the soundness of the experiment design, or causal relationships through repeated trials. Replication and the open documentation of scientific experiments became a key component of the diffusion of scientific knowledge in society: once they attained a satisfying rate of success, experiments could be performed in a variety of social spaces such as courts, marketplaces or learned salons.

Although transparency was acknowledged early on as a key component of science, it was not defined consistently. Most concepts associated today with research transparency arose as terms of art with no clear and widespread definitions. The concept of reproducibility appeared in an article on the "Methods of illuminations" first published in 1902: one of the methods examined was deemed limited regarding "reproducibility and constancy". In 2019, the National Academies underlined that the distinction between reproduction, repetition and replication has remained largely unclear and unharmonized across disciplines: "What one group means by one word, the other group means by the other word. These terms — and others, such as repeatability — have long been used in relation to the general concept of one experiment or study confirming the results of another."

Beyond this lack of formalization, there was a significant gap between the institutional and disciplinary discourse on research transparency and the reality of research work, which has persisted into the 21st century. Due to the high cost of the apparatus and the lack of incentives, most experiments were not reproduced by contemporary researchers: even a committed proponent of experimentalism like Robert Boyle had to resort to a form of virtual experimentalism, by describing in detail a research design that had only been run once. For Friedrich Steinle, the gap between the postulated virtue of transparency and the material conditions of science has never been resolved: "The rare cases in which replication actually is attempted are those that either are central for theory development (e.g., by being incompatible with existing theory) or promise broad attention due to major economical perspectives. Despite the formal ideal of replicability, we do not live in a culture of replication."

Preconditions of the transparency crisis (1945–2000)
The development of big science after the Second World War has created unprecedented challenges for research transparency. The generalization of statistical methods across a large number of fields, as well as the increasing breadth and complexity of research projects, entailed a series of concerns about the lack of proper documentation of the scientific process.

Due to the expansion of the published research output, new quantitative methods for literature surveys have been developed under the label of meta-analysis or meta-science. These rely on the assumption that quantitative results and the details of the experimental and observational framework (such as the size or the composition of the sample) are sound. In 1966, Stanley Schor and Irving Karten published one of the first generic evaluations of statistical methods in 67 leading medical journals. While few outright problematic papers were found, "in almost 73% of the reports read (those needing revision and those which should have been rejected), conclusions were drawn when the justification for these conclusions was invalid".

In the 1970s and the 1980s, scientific misconduct gradually ceased to be presented as an individual failing and became a collective problem that needed to be addressed by scientific institutions and communities. Between 1979 and 1981, several major cases of scientific fraud and plagiarism drew a larger focus to the issue from researchers and policy-makers in the United States. In a well-publicized investigation, Betrayers of the Truth, two science journalists described scientific fraud as a structural problem: "As more cases of frauds broke into public view (…) we wondered if fraud wasn't a quite regular minor feature of the scientific landscape (…) Logic, replication, peer review — all had been successfully defied by scientific forgers, often for extended periods of time". The codification of research integrity has been the main institutional answer to this increased public scrutiny with "numerous codes of conduct field specific, national, and international alike."

The reproducibility/transparency debate (2000–2015)


In the 2000s, long-standing issues on the standardization of research methodology have been increasingly presented as a structural crisis which "if not addressed the general public will inevitably lose its trust in science." The early 2010s is commonly considered to be a turning point: "it wasn’t until sometime around 2011–2012 that the scientific community’s consciousness was bombarded with irreproducibility warnings".

An early significant contribution to the debate was the controversial and influential claim made by John Ioannidis in 2005: "most published research findings are false". The main argument was based on the excessively lax experimental standards in place, with numerous weak results being presented as solid research: "the majority of modern biomedical research is operating in areas with very low pre- and post-study probability for true findings".

Due to its publication in PLOS Medicine, the study of Ioannidis had a considerable echo in psychology, medicine and biology. In the following decades, large-scale projects attempted to assess experimental reproducibility. In 2015, the Reproducibility Project: Psychology attempted to reproduce 100 studies from three top psychology journals (Journal of Personality and Social Psychology, Journal of Experimental Psychology: Learning, Memory, and Cognition, and Psychological Science): while nearly all of the original papers reported significant effects, only 36% of the replications yielded significant results (p value below the common threshold of 0.05). In 2021, another Reproducibility Project, Cancer Biology, analyzed 53 top papers about cancer published between 2010 and 2012 and established that the effect sizes were 85% smaller on average than the original findings.

During the 2010s, the concept of a reproducibility crisis was expanded to a wider array of disciplines. The distribution of yearly citations to John Ioannidis's seminal paper, Why Most Published Research Findings Are False, across the main fields of research, according to the metadata recorded by the academic search engine Semantic Scholar (6,349 citations as of June 2022), shows how this framing has especially expanded to the computing sciences. In economics, a replication of 18 experimental studies in two major journals found a failure rate comparable to psychology or medicine (39%).



Several global surveys have reported a growing uneasiness of scientific communities over reproducibility and other issues of research transparency. In 2016, Nature highlighted that "more than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments". The survey also found "no consensus on what reproducibility is or should be", in part due to disciplinary differences, which makes it harder to assess the steps needed to overcome the issue at hand. The Nature survey has also been criticized for its paradoxical lack of research transparency, since it was not based on a representative sample but on an online survey: it has "relied on convenience samples and other methodological choices that limit the conclusions that can be made about attitudes among the larger scientific community". Despite mixed results, the Nature survey has been widely disseminated and has become a common starting point for any study of research transparency.

The reproducibility crisis and other issues of research transparency have become a public topic addressed in the general press: "Reproducibility conversations are also unique compared to other methodological conversations because they have received sustained attention in both the scientific literature and the popular press".

Research transparency and open science (2015–)
Since 2000, the open science movement has expanded beyond access to scientific outputs (publications, data or software) to encompass the entire process of scientific production. In 2018, Vicente-Saez and Martinez-Fuentes attempted to map the common values shared by the standard definitions of open science in the English-speaking scientific literature indexed on Scopus and the Web of Science. Access is no longer the main dimension of open science, as it has been extended by more recent commitments toward transparency, collaborative work and social impact. Through this process, open science has been increasingly structured around a consistent set of ethical principles: "novel open science practices have developed in tandem with novel organising forms of conducting and sharing research through open repositories, open physical labs, and transdisciplinary research platforms. Together, these novel practices and organising forms are expanding the ethos of science at universities."

The global scale of the open science movement and its integration in a large variety of technical tools, standards and regulations make it possible to overcome the "classic collective action problem" embodied by research transparency: there is a structural discrepancy between the stated objectives of scientific institutions and the lack of incentives to implement them at an individual level.

The formalization of open science as a potential framework to ensure research transparency was initially undertaken by institutional and community initiatives. The TOP Guidelines were elaborated in 2014 by a committee for Transparency and Openness Promotion that included "disciplinary leaders, journal editors, funding agency representatives, and disciplinary experts largely from the social and behavioral sciences". The guidelines rely on eight standards, with different levels of compliance. While the standards are modular, they also aim to articulate a consistent ethos of science as "they also complement each other, in that commitment to one standard may facilitate adoption of others."

After 2015, these initiatives partly influenced new regulations and codes of ethics. The European Code of Conduct for Research Integrity from 2017 is strongly structured around open science and open data: it "pays data management almost an equal amount of attention as publishing and is also in this sense the most advanced of the four CoCs." First adopted in July 2020, the Hong Kong principles for assessing researchers acknowledge open science as one of the five pillars of scientific integrity: "It seems clear that the various modalities of open science need to be rewarded in the assessment of researchers because these behaviors strongly increase transparency, which is a core principle of research integrity."

Forms of research transparency
Research transparency takes a large variety of forms depending on the disciplinary culture, the material conditions of research and the interaction between scientists and other social circles (policy-makers, non-academic professionals, general audience). For Lyon, Jeng and Mattern, "the term ‘transparency’ has been applied in a range of contexts by diverse research stakeholders, who have articulated and framed the concept in a number of different ways." In 2020, Kevin Elliott introduced a taxonomy of eight dimensions of research transparency: purpose, audience, content, timeframe, actors, mechanism, venues and dangers. For Elliott, not all forms of transparency are achievable and desirable, so that a proper terminology can help to make the most appropriate decisions: "While these are important objections, the taxonomy of transparency considered here suggests that the best response to them is typically not to abandon the goal of transparency entirely [but] to consider what forms of transparency are best able to minimize them."

Method reproducibility
Goodman, Fanelli and Ioannidis define method reproducibility as "the provision of enough detail about study procedures and data so the same procedures could, in theory or in actuality, be exactly repeated." This meaning is largely synonymous with replicability in a computational context or reproducibility in an experimental context. In the report of the National Academies of Science, which opted for an experimental terminology, the counterpart of method reproducibility was described as "obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis".

Method reproducibility is more attainable in computational sciences: as long as it behaves as expected, the same code should produce the same output. Open code, open data and, more recently, research notebooks are common recommendations to enhance method reproducibility. In principle, the wider availability of research outputs makes it possible to assess and audit the process of analysis. In practice, Roger Peng already underlined in 2011 that many projects require "computing power that may not be available to all researchers". This issue has worsened in some areas such as artificial intelligence or computer vision, as the development of very large deep learning models makes it nearly impossible to recreate them (or only at a prohibitive cost), even when the original code and data are open. Method reproducibility can also be affected by library dependencies, as the open code can rely on external programs which may not always be available or compatible. Two studies in 2018 and 2019 have shown that a large share of the research notebooks hosted on GitHub are no longer usable, either because required extensions are no longer available or because of issues in the code.
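The expectation that the same code should produce the same output can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the toy analysis, the function names (run_analysis, fingerprint, environment_record) and the seed values are hypothetical and do not come from any of the studies cited here. The sketch fixes the random seed, hashes the result so that two runs can be compared, and records the software environment, since identical digests on another machine also depend on library and interpreter versions.

```python
import hashlib
import json
import platform
import random
import sys


def run_analysis(seed: int, n: int = 1000) -> dict:
    """Toy stand-in for an analysis pipeline (hypothetical example)."""
    rng = random.Random(seed)  # local generator: no hidden global state
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return {"n": n, "mean": round(mean, 12), "variance": round(variance, 12)}


def fingerprint(result: dict) -> str:
    """SHA-256 digest of the result, serialized with sorted keys."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def environment_record() -> dict:
    """Minimal record of the software environment the result depends on."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": sys.platform,
    }


first = fingerprint(run_analysis(seed=42))
second = fingerprint(run_analysis(seed=42))
other = fingerprint(run_analysis(seed=43))

print("same seed, same digest:", first == second)
print("different seed, different digest:", first != other)
print("environment:", environment_record())
```

Publishing the digest and the environment record alongside the code gives a later reader something concrete to compare against. A mismatch does not say which dependency changed, only that the rerun is no longer exact.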

In experimental sciences, there is no commonly agreed criterion of method reproducibility: "in practice, the level of procedural detail needed to describe a study as 'methodologically reproducible' does not have consensus."

Result reproducibility
Goodman, Fanelli and Ioannidis define result reproducibility as "obtaining the same results from the conduct of an independent study whose procedures are as closely matched". Result reproducibility is comparable to replication in an experimental context and reproducibility in a computational context. The definition of replicability retained by the National Academies of Science largely applies to it: "obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data." The reproducibility crisis encountered in experimental disciplines like psychology or medicine is mostly a crisis of "result reproducibility", since it concerns research that cannot simply be re-executed but involves the independent recreation of the experimental design. As such, it is arguably the most debated form of research transparency in recent years.

Result reproducibility is harder to achieve than other forms of research transparency. It involves a variety of issues that may include computational reproducibility, accuracy of scientific measurement and diversity of methodological approaches. There is no universal standard to determine how closely the original procedures are matched, and criteria may vary depending on the discipline or even on the field of research. Consequently, meta-analyses of reproducibility have faced significant challenges. A 2015 study of 100 psychology papers conducted by the Open Science Collaboration was confronted with the "lack of a single accepted definition" which "opened the door to controversy about their methodological approach and conclusions" and made it necessary to fall back on "subjective assessments" of result reproducibility.

Observation reproducibility and verifiability
In 2018, Sabina Leonelli defined observation reproducibility as the "expectation being that any skilled researcher placed in the same time and place would pick out, if not the same data, at least similar patterns". This expectation covers a large range of scientific and scholarly practices in non-experimental disciplines: "A tremendous amount of research in the medical, historical and social sciences does not rest on experimentation, but rather on observational techniques such as surveys, descriptions and case reports documenting unique circumstances".

The development of open scientific infrastructure has radically transformed the status and the availability of scientific data and other primary sources. Access to these resources has been thoroughly transformed by digitization and the attribution of unique identifiers. Permanent digital object identifiers (or DOIs) have been allocated to datasets since the early 2000s, which settled a long-standing debate on the citability of scientific data.

Increased transparency of citations to primary sources or research materials has been framed by Andrew Moravcsik as a "revolution in qualitative research".

Value transparency
Transparency of research values has been a major focus of disciplines with strong involvement in policy-making, such as environmental studies or the social sciences. In 2009, Heather Douglas underlined that the public discourse on science has been largely dominated by normative ideals of objective research: if the procedures have been correctly applied, scientific results should be "value-free". For Douglas, this ideal remains largely at odds with the actual process of research and scientific advising, as pre-defined values may largely shape choices about the concepts, the protocols and the data used. Douglas argued instead in favor of a disclosure of the values held by researchers: "the values should be made as explicit as possible in this indirect role, whether in policy documents or in the research papers of scientists."

In the 2010s, several philosophers of science attempted to systematize value transparency in the context of open science. In 2017, Kevin Elliott emphasized three conditions for value transparency in research; the first one involved "being as transparent as possible about (…) data, methods, models and assumptions so that value influence can be scrutinized".

Review and editorial transparency
Until the 2010s, the editorial practices of scholarly publishing remained largely informal and little studied: "Despite 350 years of scholarly publishing (…) research on ItAs [Instruction to authors], and on their evolution and change, is scarce."

Editorial transparency has recently been acknowledged as a natural expansion of the debate over research reproducibility. Several principles laid out in the 2015 TOP Guidelines already implied the existence of explicit editorial standards. The unprecedented attention given to editorial transparency has also been motivated by the diversification and growing complexity of the open science publishing landscape: "Triggered by a wide variety of expectations for journals’ editorial processes, journals have started to experiment with new ways of organizing their editorial assessment and peer review systems (...) The arrival of these innovations in an already diverse set of practices of peer review and editorial selection means we can no longer assume that authors, readers, and reviewers simply know how editorial assessment operates."

Transparent by design: developing open workflow
The TOP Guidelines have set up an influential transdisciplinary standard to establish result reproducibility in an open science context. While experimental and computational disciplines remain a primary focus, the standards have strived to integrate concerns and formats more specific to other disciplinary practices (such as research materials).

Informal incentives like badges or indexes were initially advocated as a way to support the adoption of harmonized policies on research transparency. With the development of open science, regulation and standardized infrastructures or processes are increasingly favored.

Sharing of research outputs
Data sharing was identified early on as a major potential solution to the reproducibility crisis and to the lack of solid guidelines for statistical indicators. In 2005, John Ioannidis hypothesized that "some kind of registration or networking of data collections or investigators within fields may be more feasible than registration of each and every hypothesis-generating experiment."

The sharing of research outputs is covered by three standards of the TOP Guidelines: Data transparency (2), Analytic/code methods transparency (3) and Research materials transparency (4). All the relevant data, code and research materials are to be stored in a "trusted repository", and all analyses are to be reproduced independently prior to publication.

Extended citation standards
While citation standards are commonly applied to academic references, there is much less formalization for all the other research outputs, such as data, code, primary sources or qualitative assessments.

In 2012, the American Political Science Association adopted new policies for open qualitative research. They covered three dimensions of transparency: data transparency (in the sense of precise bibliographic data on the original sources), analytic transparency (with regard to claims extrapolated from the cited sources) and production transparency (in reference to the editorial choices made in the selection of the sources). In 2014, Andrew Moravcsik advocated the implementation of a transparency appendix, containing detailed quotes from original sources as well as annotations "explaining how the source supports the claim being made".

According to the TOP Guidelines, "appropriate citation for data and materials" should be provided in each publication. Consequently, scientific outputs like code or datasets are fully acknowledged as citable contributions: "Regular and rigorous citation of these materials credit them as original intellectual contributions."
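What a citable dataset reference could look like in practice can be sketched in a few lines of Python. The layout below (creators, year, title, version, publisher, DOI link) is loosely modeled on common data-citation recommendations and is an assumption, not a format prescribed by the TOP Guidelines. The DatasetRecord class, the format_citation function and every value in the example, including the DOI, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DatasetRecord:
    """Minimal metadata needed to cite a dataset (hypothetical structure)."""
    creators: Tuple[str, ...]
    year: int
    title: str
    publisher: str
    doi: str
    version: Optional[str] = None


def format_citation(record: DatasetRecord) -> str:
    """Render the record as a one-line citation ending with a DOI link."""
    authors = "; ".join(record.creators)
    parts = [f"{authors} ({record.year})", record.title]
    if record.version:
        parts.append(f"Version {record.version}")
    parts.append(record.publisher)
    parts.append(f"https://doi.org/{record.doi}")
    return ". ".join(parts)


# All values below are placeholders, including the DOI.
example = DatasetRecord(
    creators=("Doe, J.", "Roe, R."),
    year=2020,
    title="Example survey responses",
    publisher="Example Repository",
    doi="10.1234/example-dataset",
    version="1.2",
)
citation = format_citation(example)
print(citation)
```

Keeping the version and the persistent identifier in the citation is what lets a reader retrieve the exact dataset that was analyzed rather than a later revision.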

Pre-registrations
Pre-registrations are covered by two TOP standards: Preregistration of studies (6) and Preregistration of analysis plans (7). In both cases, for the highest level of compliance, journals should provide "link and badge in article to meeting requirements".

Pre-registration aims to preventively address a variety of questionable research practices. It usually takes the form of "a timestamped uneditable research plan to a public archive [that] states the hypotheses to be tested, target sample sizes". Preregistration acts as an ethical contract as it theoretically constrains "the researcher degrees of freedom that make QRPs and p-hacking work".
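The idea of a timestamped, uneditable plan can be approximated locally with a cryptographic digest, as in the following Python sketch. It is an illustration only: actual preregistration relies on a public archive, which a local hash does not replace, and the function names (freeze_plan, matches_registration) and the content of the plan are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def _canonical(plan: dict) -> str:
    """Serialize the plan deterministically so equal plans give equal digests."""
    return json.dumps(plan, sort_keys=True, separators=(",", ":"))


def freeze_plan(plan: dict) -> dict:
    """Return a registration record: UTC timestamp, digest and a copy of the plan."""
    canonical = _canonical(plan)
    return {
        "registered_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "plan": json.loads(canonical),  # independent copy of the plan
    }


def matches_registration(plan: dict, record: dict) -> bool:
    """Check whether a plan is byte-for-byte the one that was registered."""
    digest = hashlib.sha256(_canonical(plan).encode("utf-8")).hexdigest()
    return digest == record["sha256"]


# Hypothetical research plan, for illustration only.
plan = {
    "hypotheses": ["H1: condition A yields higher scores than condition B"],
    "target_sample_size": 120,
    "primary_test": "two-sided t-test, alpha = 0.05",
}
record = freeze_plan(plan)

# Changing the sample size after the fact no longer matches the record.
altered = dict(plan, target_sample_size=80)

print("original plan matches:", matches_registration(plan, record))
print("altered plan matches:", matches_registration(altered, record))
```

The digest only proves that a plan was changed or left unchanged. It is the deposit of the record with a third party that makes the timestamp credible.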

Preregistration does not address the whole range of questionable research practices. Selective reporting of results, in particular, would still be compatible with a predefined research plan: "preregistration does not fully counter publication bias as it does not guarantee that findings will be reported." It has been argued that preregistration may also in some cases harm the quality of the research output by creating artificial constraints that do not fit with the reality of the research field: "Preregistration may interfere with valid inference because nothing prevents a researcher from preregistering a poor analytical plan."

While advocated as a relatively cost-free solution, preregistration may in reality be harder to implement, as it relies on a significant commitment on the part of the researchers. An empirical study of the adoption of open science practices in psychology journals has shown that "Adoption of pre-registration lags relative to other open science practices (…) from 2015 to 2020". Consequently, "even within researchers who see field-wide benefits of pre-registration, there is uncertainty surrounding the costs and benefits to individuals."

Replication studies
Replication studies or assessments of replicability aim to redo one or several original studies. Although the concept only appeared in the 2010s, replication studies have existed for decades without being acknowledged as such. The 2019 report of the National Academies includes a meta-analysis of 25 replications published between 1986 and 2019. It finds that the majority of the replications concern the medical and social sciences (especially psychology and behavioral economics) and that there are for now no standardized evaluation criteria: "methods of assessing replicability are inconsistent and the replicability percentages depend strongly on the methods used." Consequently, at least as of 2019, replication studies cannot be aggregated to extrapolate a replicability rate: they "are not necessarily indicative of the actual rate of non-replicability across science for a number".

The TOP Guidelines have called for an enhanced recognition and valorization of replication studies. The eighth standard states that compliant journals should use "registered Reports as a submission option for replication studies with peer review".

Open editorial policies
In July 2018, several publishers, librarians, journal editors and researchers drafted a Leiden Declaration for Transparent Editorial Policies. The declaration underlined a lack of transparency, noting that journals "often do not contain information about reviewer selection, review criteria, blinding, the use of digital tools such as text similarity scanners, as well as policies on corrections and retractions". The declaration identifies four main publication and peer review phases that should be better documented:
 * At submission: details on the governance of the journal, its scope, the editorial board or the rejection rates.
 * During review: criteria for selection, timing of the review and model of peer review (double blind, single blind, open).
 * Publication: disclosure of the "roles in the review process".
 * Post-publication: "criteria and procedures for corrections, expressions of concern, retraction" and other changes.

In 2020, the Leiden Declaration was expanded and supplemented by a Platform for Responsible Editorial Policies (PREP). This initiative also aims to address the structural scarcity of data and empirical information on editorial policies and peer review practices. As of 2022, this database contains partially crowdsourced information on the editorial procedures of 490 journals, from an initial base of 353 journals. The procedures evaluated include especially "the level of anonymity afforded to authors and reviewers; the use of digital tools such as plagiarism scanners; and the timing of peer review in the research and publication process". Despite these developments, research on editorial policies still highlights the need for "a comprehensive database that would allow authors or other stakeholders to compare journals based on their (…) requirements or recommendations".