Scientific integrity

Research integrity or scientific integrity is an aspect of research ethics that deals with best practice or rules of professional practice of scientists.

First introduced in the 19th century by Charles Babbage, the concept of research integrity came to the fore in the late 1970s. A series of publicized scandals in the United States led to heightened debate on the ethical norms of science and the limitations of the self-regulation processes implemented by scientific communities and institutions. Formalized definitions of scientific misconduct, and codes of conduct, became the main policy response after 1990. In the 21st century, codes of conduct or ethics codes for research integrity are widespread. Along with codes of conduct at institutional and national levels, major international texts include the European Charter for Researchers (2005), the Singapore statement on research integrity (2010), the European Code of Conduct for Research Integrity (2011 and 2017) and the Hong Kong principles for assessing researchers (2020).

Scientific literature on research integrity falls mostly into two categories: first, mapping of the definitions and categories, especially in regard to scientific misconduct, and second, empirical surveys of the attitudes and practices of scientists. Following the development of codes of conduct, taxonomies of non-ethical uses have been significantly expanded, beyond the long-established forms of scientific fraud (plagiarism, falsification and fabrication of results). Definitions of "questionable research practices" and the debate over reproducibility also target a grey area of dubious scientific results, which may not be the outcome of voluntary manipulations.

The concrete impact of codes of conduct and other measures put in place to ensure research integrity remains uncertain. Several case studies have highlighted that while the principles of the codes of conduct adhere to common scientific ideals, they are seen as remote from actual work practices and their effectiveness is criticized.

Since 2010, debates on research integrity have been increasingly linked to open science. International codes of conduct and national legislation on research integrity have officially endorsed open sharing of scientific output (publications, data or code) as a way to limit questionable research practices and to enhance reproducibility. References to open science have incidentally opened up the debate over scientific integrity beyond academic communities, as it increasingly concerns a wider audience of scientific readers.

Definition and history
Research integrity or scientific integrity became an autonomous concept within scientific ethics in the late 1970s. In contrast with other forms of ethical misconduct, the debate over research integrity is focused on a "victimless offence" that only hurts "the robustness of scientific record and public trust in science". Infractions of research integrity chiefly include "data fabrication, falsification, or plagiarism". In that sense, research integrity mostly deals with the internal process of science. It can be treated as a community issue that should not involve external observers: "research integrity is more autonomously defined and regulated by the community, while research ethics (again, a narrow definition) has closer links to legislation".

Emergence of the issue (1970–1980)
Before the 1970s, ethical issues were largely focused on the conduct of medical experiments, especially in regard to tests on humans. In 1803, the "code" of Thomas Percival created a moral foundation for experimental treatments that "was built upon fairly regularly" throughout the next two centuries, notably by Walter Reed in 1898 and by the Berlin code in 1900. After the Second World War, Nazi human experimentation motivated the development of international, widely acknowledged codes of research ethics, such as the Nuremberg code (1947) and the World Medical Association Declaration of Helsinki.

According to Kenneth Pimple, Charles Babbage was the first author to single out the specific issue of scientific integrity. In Reflections on the Decline of Science in England, and on Some of its Causes, first published in 1830, Babbage identified four classes of scientific fraud, from outright forgery to varying degrees of arranging and cooking of the data or the methods.

Research integrity became a major topic of debate in the biological sciences after 1970, due to a combination of factors: the development of advanced data analysis methods, the growing commercial relevance of fundamental research, and the increased focus of federal funding agencies in the context of big science. In 1974, the "painted mouse incident" attracted unprecedented media attention: William Summerlin inked a black dot on a mouse to claim that a treatment had been a success. Between 1979 and 1981, several major cases of scientific fraud and plagiarism drew greater attention to the issue from researchers and policymakers in the United States: as many as four important frauds occurred in the summer of 1980.

At the time, the "scientific community responded to reports of 'scientific fraud' (as it was often called) by asserting that such cases are rare and that neither errors nor deception can be hidden for long because of science's self-correcting nature". A journalist at Science, William Broad, took the opposite position and made an influential contribution to the issue of research integrity. In a response to the US House of Representatives Science and Technology subcommittee, he highlighted that "cheating in science was nothing new" but, until recently, "had been handled as an internal affair". In a detailed investigation co-authored with Nicholas Wade, Betrayers of the Truth, Broad described scientific fraud as a structural problem: "As more cases of frauds broke into public view (…) we wondered if fraud wasn't a quite regular minor feature of the scientific landscape (…) Logic, replication, peer review — all had been successfully defied by scientific forgers, often for extended periods of time." Other early assessments of how systematic scientific fraud was presented a more nuanced picture. For Patricia Wolff, along with a few obvious manipulations, there was a wide range of grey areas, which were due to the complexity of fundamental research: "the boundaries between egregious self-deception, culpable carelessness, fraud, and just plain error, can be very blurred indeed". Characteristically, the debate led to a re-evaluation of past scientific practices. In 1913, a well-known scientific experiment on electron charge by Robert Millikan was explicitly based on discarding some results that would not agree with the underlying theory: while well received at the time, by the 1980s this work had come to be considered a textbook example of scientific manipulation.

Formalization of research integrity (1990–2020)
By the end of the 1980s, the amplification of misconduct scandals and the heightened political and public scrutiny put scientists in a difficult position in the United States and elsewhere: "The tone of the 1988 US congressional oversight hearings, chaired by Rep. John Dingell (D-MI), that investigated how research institutions were responding to misconduct allegations reinforced many scientists’ view that both they and scientific research itself were under siege." The main answer was procedural: research integrity has "been codified into numerous codes of conduct field specific, national, and international alike." This policy response largely stemmed from research communities, funders and scientific administrators. In the United States, the United States Public Health Service and the National Science Foundation adopted "similar definitions of misconduct in science" in 1989 and 1991. The concepts of research integrity and its reverse, scientific misconduct, were especially relevant from the perspective of funding bodies, since they made it possible to "delineate the research-related practices that merit intervention": a lack of integrity leads to research that is not only unethical but also inefficient, and the funds would be better allocated elsewhere.

After 1990, there was a "veritable explosion of scientific codes of conduct". In 2007 the OECD published a report on best practices for promoting scientific integrity and preventing misconduct in science (Global Science Forum). Major international texts include:
 * the European Charter for Researchers (2005)
 * the Singapore statement on research integrity (2010)
 * the European Code of Conduct for Research Integrity by All European Academies (ALLEA) and the European Science Foundation (ESF) (2011, revised in 2017).

There are no global estimates of the total number of codes of conduct related to research integrity. A UNESCO project, the Global Ethics Observatory (no longer accessible after 2021), referenced 155 codes of conduct but "this is probably just a fraction of the total number of codes produced in recent years." Codes have been created in highly diverse settings and show a wide variation in scale and ambition. Along with national-scale codes, there are codes for scientific societies, institutions and R&D services. While these normative texts may frequently share a core of common principles, there has been growing concern over "fragmentation, lack of interoperability and varying understandings of central terms".

Taxonomy and classification
In codes of conduct, the definition of research integrity is usually negative: the collection of norms aims to single out different forms of unethical research and scientific misconduct with varying degrees of gravity.

The multiplication of codes of conduct has also corresponded with an expansion of scope. While the initial debate was focused on "three deadly sins of scientific and scholarly research: fabrication, falsification and plagiarism", attention later shifted "to the lesser breaches of research integrity". In 1830, Charles Babbage introduced the first taxonomy of scientific frauds, which already encompassed some forms of questionable research practices: hoaxing (a voluntary fraud "far from justifiable"), forging ("whereas the forger is one who, wishing to acquire a reputation for science, records observations which he has never made"), trimming (which "consists in clipping off little bits here and there from those observations which differ most in excess from the mean") and cooking. Cooking is Babbage's main focus, as an "art of various forms, the object of which is to give to ordinary observations the appearance and character of those of the highest degree of accuracy". It breaks down into several sub-cases such as data selection ("if a hundred observations are made, the cook must be very unlucky if he cannot pick out fifty or twenty to do the serving up"), model or algorithm selection ("another approved receipt (…) is to calculate them by two different formulae") or use of different constants.
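
The data selection form of cooking described above can be made concrete with a short sketch. It is only an illustration under stated assumptions: the readings are invented, the function name `cook_by_selection` is hypothetical, and keeping the observations closest to the mean is just one plausible reading of what Babbage's "cook" does.

```python
from statistics import mean, stdev

def cook_by_selection(observations, keep):
    """Return the `keep` observations closest to the overall mean.

    Hypothetical helper: it mimics a 'cook' who serves up only the
    readings that agree best with one another.
    """
    centre = mean(observations)
    ranked = sorted(observations, key=lambda x: abs(x - centre))
    return ranked[:keep]

# Invented readings of a single quantity (illustrative data only).
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 11.5, 8.7, 10.0]
served = cook_by_selection(readings, keep=6)

# The selected subset looks far more precise than the full record.
print("spread of all readings:", round(stdev(readings), 3))
print("spread after selection:", round(stdev(served), 3))
```

The apparent precision improves only because the two discordant readings were silently dropped, which is why undisclosed selection of this kind is treated as a questionable research practice.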

In the late 20th century, this classification was greatly expanded and came to encompass a wider range of deficiencies than intentional fraud. The formalization of research integrity entailed a structural change in the vocabulary and the concepts associated with it. By the end of the 1990s, use of the expression "scientific fraud" was discouraged in the United States, in favor of a "semi-legal term": scientific misconduct. The scope of scientific misconduct is expansive: along with data fabrication, falsification and plagiarism, it includes "other serious deviations" that are demonstrably done in bad faith. The associated concept of questionable research practice, first introduced in a 1992 report of the Committee on Science, Engineering, and Public Policy, has an even broader scope, as it also encompasses potentially non-intentional research failures (such as inadequacies in the research data management process). In 2016, a study identified as many as 34 questionable research practices or "degrees of freedom" that can occur at every step of a project (the initial hypothesis, the design of the study, the collection of the data, the analysis and the reporting).

After 2005, research integrity was additionally redefined through the perspective of research reproducibility and, more specifically, of the "reproducibility crisis". Studies of reproducibility suggest that there is a continuum between irreproducibility, questionable research practices and scientific misconduct: "Reproducibility is not just a scientific issue; it is also an ethical one. When scientists cannot reproduce a research result, they may suspect data fabrication or falsification." In this context, ethical debates are less focused on a few highly publicized scandals and more on the suspicion that the standard scientific process is broken and fails to meet its own standards.

Prevalence of ethical issues
In 2009, a meta-analysis of 18 surveys estimated that less than 2% of scientists "admitted to have fabricated, falsified or modified data or results at least once". Real prevalence may be underestimated due to self-reporting: regarding "the behaviour of colleagues admission rates were 14.12%". Questionable research practices are more widespread: more than one third of the respondents admitted to having engaged in them at least once. A large 2021 survey of 6,813 respondents in the Netherlands found significantly higher estimates, with 4% of the respondents engaging in data fabrication and more than half of the respondents engaging in questionable research practices. Higher rates can be attributed either to a deterioration of ethical norms or to "the increased awareness of research integrity in recent years". The highest rates of self-declared scientific misconduct are found in the medical and life sciences, with as many as 10.4% of respondents surveyed in the Netherlands admitting to scientific fraud (either fabrication or falsification of data).

Other forms of scientific misconduct or questionable research practices are both less problematic and much more widespread. A 2012 survey of 2,000 psychologists found that "the percentage of respondents who have engaged in questionable practices was surprisingly high", especially in regard to selective reporting. A 2018 survey of 807 researchers in ecology and evolutionary biology showed that 64% "did not report results because they were not statistically significant", 42% had decided to collect additional data "after inspecting whether results were statistically significant" and 51% "reported an unexpected finding as though it had been hypothesised from the start". As they come from self-reported surveys, these figures are likely to be underestimates.

Implementation and assessment of codes of conduct
Several case studies and retrospective analyses have been devoted to the reception of codes of conduct in scientific communities. They frequently highlight a discrepancy between the theoretical norms and the "lived morality of researchers".

In 2004, Caroline Whitbeck underlined that the enforcement of a few formal rules had overall failed to address a structural "erosion or neglect" of scientific trust. In 2009, Schuurbiers, Osseweijer and Kinderlerer conducted a series of interviews in the aftermath of the Dutch code of conduct on research integrity, introduced in 2005. Overall, most respondents were unaware of the code and of complementary ethical recommendations. While the principles "were seen to reflect the norms and values within science rather well", they seemed to be isolated from actual work practices, which "may lead to morally complex situations". Respondents were also critical of the underlying individualist philosophy of the code, which shifted the entire blame to individual researchers without taking into account institutional or community-wide issues. In 2015, a survey of "64 faculty members at a large southwestern university" in the United States "yielded similar results": many of the respondents were not aware of the existing ethical guidelines, and the communication process remained poor. In 2019, a case study on Italian universities noted that the proliferation of research codes "has a reactive nature because codes of ethics are drawn up in response to scandals and as a result are punitive and negative, with lists of prohibitions".

Codes of conduct on research integrity may have a more significant impact on professional identity. The development of research codes has been equated to an internalization of issues related to research integrity within scientific social circles and its close association with disputed results, which makes it a typical form of "knowledge club" governance. In contrast to a wider range of ethical issues that may overlap with more general social debates (such as gender equality), research integrity belongs to a form of professional ethics analogous to the ethical standards applied by journalists or medical practitioners. As such, not only does it create a common moral framework but also, incidentally, "justifies the existence of the profession as separate from other professions". While the impact of codes on actual ethical practices remains difficult to assess, they have a more measurable impact on the professionalization of research, by transforming informal norms and customs into a set of predefined principles: "codes in general are supported both by those pursuing them as a vehicle to encourage the greater professionalization of biologists (e.g., an initial stage to introducing professional licensing) and those seeking them to forestall any further regulation."

Research integrity and open science
In the 2000s and 2010s, scientific integrity was gradually reframed in the context of open science and of increased accessibility to scientific publications. The debate on research reproducibility has significantly contributed to this evolution.

Ethics of open science
The underlying ethical principles of open science predate the development of an organized open science movement. In 1973, Robert K. Merton theorized a normative "ethos of science" structured on a "norm of disclosure". This norm "was far from universally accepted" in the early development of scientific communities and has remained "one of the many ambivalent precepts contained in the institution of science." Disclosure was counterbalanced by the limitations of the publication and evaluation process, which tended to slow down the dissemination of research results. In the early 1990s, this norm of disclosure was reframed as a norm of "openness" or "open science".

The early open access and open science movements emerged partly as a reaction against the large corporate model that has come to dominate scientific publishing since the Second World War. Open science was not framed as a radical transformation of scientific communication but as a realization of core underlying principles, already visible at the start of the scientific revolution of the 17th and 18th centuries: the autonomy and self-governance of scientific communities and the dissemination of research results.

Since 2000, the open science movement has expanded beyond access to scientific outputs (publications, data or software) to encompass the entire process of scientific production. The reproducibility crisis has been an instrumental factor in this development, as it moved the debates over the definition of open science further from scientific publishing. In 2018, Vicente-Saez and Martinez-Fuentes attempted to map the common values shared by the standard definitions of open science in the English-speaking scientific literature indexed on Scopus and the Web of Science. Access is no longer the main dimension of open science, as it has been extended by more recent commitments toward transparency, collaborative work and social impact. These diverse conceptual dimensions "encompasses (Graph 5) the emerging trends on Open Science such as open code […] open notebooks, open lab books, science blogs, collaborative bibliographies, citizen science, open peer review, or pre-registration".

Through this process, open science has been increasingly structured around a consistent set of ethical principles: "novel open science practices have developed in tandem with novel organising forms of conducting and sharing research through open repositories, open physical labs, and transdisciplinary research platforms. Together, these novel practices and organising forms are expanding the ethos of science at universities."

Codification of open science ethics
The translation of the ethical values of open science into applied recommendations was mostly undertaken by institutional and community initiatives until the 2010s. The TOP guidelines were drawn up in 2014 by a committee for Transparency and Openness Promotion that included "disciplinary leaders, journal editors, funding agency representatives, and disciplinary experts largely from the social and behavioral sciences". The guidelines rely on eight standards, with different levels of compliance. While the standards are modular, they also aim to articulate a consistent ethos of science as "they also complement each other, in that commitment to one standard may facilitate adoption of others." The highest levels of compliance for each standard include the following requirements:

In 2018, Heidi Laine attempted to establish a nearly-exhaustive list of "ethical principles associated with open science":

This categorization has to contend with the diversity of approaches and values associated with the open science movement and their ongoing evolution, as the "term will likely remain as fluid as any other attempt to coin a complex system of practices, values and ideologies in one term". Laine identified a significant variation in the way open science principles have been embedded in four major codes of conduct and statements on research integrity: the Singapore Statement on Research Integrity (2010), the Montreal Statement on Research Integrity in Cross-Boundary Research Collaborations (2013), the Responsible Conduct of Research and Procedures for Handling Allegations of Misconduct in Finland (2012) and the European Code of Conduct for Research Integrity (2017). Access to research publications is recommended in all four codes. The integration of data sharing and reproducibility practices is less obvious, and varies from tacit approval to detailed support in the case of the European Code of Conduct: "The European code pays data management almost an equal amount of attention as publishing and is also in this sense the most advanced of the four CoCs." Yet important areas of open science are consistently ignored, especially regarding the development of open science infrastructure, increased transparency of evaluation or support for citizen science and wider social impact. Overall, Laine found "none of the evaluated CoCs to be in blatant contradiction with the ethical principles of open science, but only the European code of conduct can be said to actively support and give guidance on open science."

Since 2020, new forms of open science codes of conduct have explicitly claimed to "foster the ethos of open scientific practices". First adopted in July 2020, the Hong Kong principles for assessing researchers acknowledge open science as one of the five pillars of scientific integrity: "It seems clear that the various modalities of open science need to be rewarded in the assessment of researchers because these behaviors strongly increase transparency, which is a core principle of research integrity."

Research integrity and society
While there is still a continuum between the procedural norms of the codes of conduct and the range of values encompassed by open science, open science has significantly altered the setting and the context of the ethical debate. Open scientific productions can in theory be universally shared: their dissemination is not restricted to the classic membership model of the "knowledge club". The implications are wider as well, as potential misuses of scientific publications are no longer limited to professional scientists. The discrepancy was already visible in the late 2000s, although it was framed under "different buzzwords": in a case study on the implementation of the Dutch code of conduct, Schuurbiers, Osseweijer and Kinderlerer already identified a "shift in practices" that "goes by many names like Mode 2 science, post-normal science, or post-academic science" and that covers a diverse array of transformations such as technological evolution in the management of research, increased involvement of private actors, open innovation or open access. These structural trends were not well covered by the existing codes of conduct.

In the 1990s and the 2000s, discussions about research integrity became increasingly professionalized and detached from the public domain. The shift toward open science may potentially reverse this trend, as the range of interested parties and potential reusers of scientific production has expanded well beyond professional academic circles. In 2018, Heidi Laine underlined that established codes of conduct had not yet taken this decisive step: "The one aspect where even the European code falls short of a full recognition of open science is in crossing the traditional professional borders of the research community, i.e. citizen science, open collaboration and science communication." By not taking into account this new framework, existing codes of conduct risk becoming increasingly out of touch with the reality of scientific practices:

"If the ethical aspects of open science continue to be left out of RCR (Responsible Conduct of Research) guidance and ponderings, the research community risks losses on both fronts: open science as well as RI (Research integrity). Open science is just as much about values and ethics as it is about technology. Most of all it is about the role of science in society. It is perhaps the most all-encompassing value discussion that the research community has ever known, and the research integrity angle and community of experts risks being side-lined."

The broadened discussion about scientific integrity led to an increased involvement of political institutions and representatives, beyond specialized scientific committees and funders. In 2021, the French government passed a decree on scientific integrity, which called for the generalization of open science practices.

Initiatives
In Europe
The European Code of Conduct for Research Integrity, published in 2011 and revised in 2017, develops the concept of scientific integrity along four main lines:
 * Reliability: concerns the quality and reproducibility of research.
 * Honesty: concerns the transparency and objectivity of research.
 * Respect: for the human, cultural, and ecological environment of research.
 * Accountability: concerns the implications of publishing the research.

US Department of Health and Human Services
The US Department of Health and Human Services (HHS) has adopted the definition of scientific integrity quoted below. This policy is currently being reviewed and will be officially published in early 2024.

"Scientific integrity is the adherence to professional practices, ethical behavior, and the principles of honesty and objectivity when conducting, managing, using the results of, and communicating about science and scientific activities. Inclusivity, transparency, and protection from inappropriate influence are hallmarks of scientific integrity." (HHS)

To promote a culture of scientific integrity at HHS, the policy is outlined in seven specific areas:
 * Protecting Scientific Processes
 * Ensuring the Free Flow of Scientific Information
 * Supporting Policymaking Processes
 * Ensuring Accountability
 * Protecting Scientists
 * Professional Development for Government Scientists
 * Federal Advisory Committees

Through these areas, open science practices can be promoted to protect against bias, plagiarism, data fabrication and falsification, as well as inappropriate influence, political interference, and censorship.

National Institutes of Health
The National Institutes of Health (NIH) is part of HHS. It acts as the nation's medical research agency, which focuses on making important discoveries that improve health and save lives. The mission of NIH is to seek a fundamental understanding of the nature and behavior of living systems and to apply that understanding to improve health, extend life, and reduce illness and disability. The NIH adopts the definition of scientific integrity from the HHS Scientific Integrity Policy draft to ensure that its scientific findings are objective, credible, transparent, and readily available to the public. All NIH staff are expected to:
 * Foster an organizational Culture of Scientific Integrity
 * Protect the Integrity of the Research Process
 * Communicate Science with Integrity
 * Safeguard Scientific Integrity
