GISAID

GISAID, the Global Initiative on Sharing All Influenza Data, previously the Global Initiative on Sharing Avian Influenza Data, is a global science initiative established in 2008 to provide access to genomic data of influenza viruses. The database was expanded to include the coronavirus responsible for the COVID-19 pandemic, as well as other pathogens. The database has been described as "the world's largest repository of COVID-19 sequences". GISAID facilitates genomic epidemiology and real-time surveillance to monitor the emergence of new COVID-19 viral strains across the planet.

Since its establishment as an alternative to sharing avian influenza data via conventional public-domain archives, GISAID has facilitated the exchange of outbreak genome data during the H1N1 pandemic in 2009, the H7N9 epidemic in 2013, the COVID-19 pandemic, and the 2022–2023 mpox outbreak.

Origin
Since 1952, influenza strains had been collected by National Influenza Centers (NICs) and distributed through the WHO's Global Influenza Surveillance and Response System (GISRS). Countries provided samples to the WHO, but the data was then shared for free with pharmaceutical companies, which could patent vaccines produced from the samples. Beginning in January 2006, Italian researcher Ilaria Capua refused to upload her data to a closed database and called for genomic data on H5N1 avian influenza to be in the public domain. At a conference of the OIE/FAO Network of Expertise on Animal Influenza, Capua persuaded participants to agree to each sequence and release data on 20 strains of influenza. Some scientists had concerns about sharing their data in case others published scientific papers using the data before them, but Capua dismissed this, telling Science: "What is more important? Another paper for Ilaria Capua's team or addressing a major health threat? Let's get our priorities straight." Peter Bogner, a German in his 40s who was based in the US and had no previous experience in public health, read an article about Capua's call and helped to found and fund GISAID. At a conference, Bogner met Nancy Cox, who was then leading the influenza division of the US Centers for Disease Control, and Cox went on to chair GISAID's Scientific Advisory Council.

The acronym GISAID was coined in a correspondence letter published in the journal Nature in August 2006, putting forward an initial aspiration of creating a consortium for a new Global Initiative on Sharing Avian Influenza Data (later, "All" would replace "Avian"), whereby its members would release data in publicly available databases up to six months after analysis and validation. Initially the organisation collaborated with the Australian non-profit organization Cambia and the Creative Commons project Science Commons. Although no essential ground rules for sharing were established, the correspondence letter was signed by over 70 leading scientists, including seven Nobel laureates, because access to the most current genetic data for the highly pathogenic H5N1 zoonotic virus was often restricted, in part due to the hesitancy of World Health Organization member states to share their virus genomes and put ownership rights at risk.

Towards the end of 2006, Indonesia announced it would not share samples of avian flu with the WHO, which led to a global health crisis due to an ongoing epidemic. By October 2006, Indonesia had agreed to share its data with GISAID, which its health minister considered to have a "fair and transparent" mechanism for sharing data. It was one of the first countries to do so. In February 2007, GISAID and the Swiss Institute of Bioinformatics (SIB) announced a cooperation agreement, with the SIB building and administering the EpiFlu database on behalf of GISAID. Ultimately, GISAID was launched in May 2008 in Geneva on the occasion of the 61st World Health Assembly, as a registration-based database rather than a consortium.

2009 onwards
In 2009 SIB disconnected the database from the GISAID portal over a contract dispute, resulting in litigation. In April 2010 the Federal Republic of Germany announced during the 7th International Ministerial Conference on Avian and Pandemic Influenza in Hanoi, Vietnam, that GISAID had entered into a cooperation agreement with the German government, making Germany the long-term host of the GISAID platform. Under the agreement, Germany's Federal Ministry of Food, Agriculture and Consumer Protection was to ensure the sustainability of the initiative by providing technical hosting facilities, and the Federal Institute for Animal Health, the Friedrich Loeffler Institute, was to ensure the plausibility and curation of scientific data in GISAID. By 2021, the ministry was no longer involved with either database hosting or curation. In 2013 GISAID dissolved a nonprofit organisation based in Washington DC and the organisation began to be operated by a German association called Freunde von GISAID (Friends of GISAID).

Some of the earliest SARS-CoV-2 genetic sequences were released by the Chinese Center for Disease Control and Prevention and shared through GISAID in mid-January 2020. Since 2020, millions of SARS-CoV-2 genome sequences have been uploaded to the GISAID database.

In 2022, GISAID added mpox virus and respiratory syncytial virus (RSV) to the list of pathogens supported by its database. Indonesia's Ministry of Health announced in November 2023 the establishment of GISAID Academy in Bali, to focus on bioinformatics education, advance pathogen genomic surveillance, and increase regional response capacity.

The GISAID model of incentivizing and recognizing those who deposit data has been recommended as a model for future initiatives. Because of this work, the entity has been described as "a critical shield for humankind".

Database for SARS-CoV-2 genomes
GISAID maintains what has been described as "the world's largest repository of COVID-19 sequences", and "by far the world's largest database of SARS-CoV-2 sequences". By mid-April 2021, GISAID's SARS-CoV-2 database had reached over 1,200,000 submissions from researchers in over 170 countries. Only three months later, the number of uploaded SARS-CoV-2 sequences had doubled, to over 2.4 million. By late 2021, the database contained over 5 million genome sequences; as of December 2021, over 6 million sequences had been submitted; by April 2022, 10 million sequences had accumulated; and in January 2023 the number had reached 14.4 million.

In January 2020, the SARS-CoV-2 genetic sequence data was shared through GISAID. Throughout the first year of the COVID-19 pandemic, most of the SARS-CoV-2 whole-genome sequences that were generated and shared globally were submitted through GISAID. When the SARS-CoV-2 Omicron variant was detected in South Africa, by quickly uploading the sequence to GISAID, the National Institute for Communicable Diseases there was able to learn that Botswana and Hong Kong had also reported cases possessing the same gene sequence.

In March 2023, GISAID temporarily suspended database access for some scientists, removing raw data relevant to investigations of the origins of SARS-CoV-2. GISAID stated that they do not delete records from their database, but data may become temporarily invisible during updates or corrections. Availability of the data was restored, with an additional restriction that any analysis based thereon would not be shared with the public.

Governance
The board of Friends of GISAID consists of Peter Bogner and two German lawyers who are not involved in the day-to-day operations of the organisation. Scientific advice to the organization is provided by its Scientific Advisory Council, including directors of leading public health laboratories, such as WHO Collaborating Centres for Influenza. In 2023, GISAID's lack of transparency was criticized by some GISAID funders, including the European Commission and the Rockefeller Foundation, with long-term funding being denied by the International Federation of Pharmaceutical Manufacturers and Associations (IFPMA). In June 2023, it was reported in Vanity Fair that Bogner had said that "GISAID will soon launch an independent compliance board 'responsible for addressing a wide range of governance matters'". The Telegraph similarly reported that GISAID's in-house counsel was developing new governance processes intended to be transparent and allow for the resolution of scientific disputes without the involvement of Bogner.

Access and intellectual property
The creation of the GISAID database was motivated in part by concerns raised by researchers from developing countries, with Scientific American noting in 2009 that "a previous data-sharing system run by WHO forced them to give up intellectual property rights to their virus samples when they sent them to WHO. The virus samples would then be used by private pharmaceutical companies to make vaccines that are awarded patents and sold at a profit at prices many poor nations cannot afford". In a 2022 piece in The Lancet, it was further noted that scientists in North America and Europe sought unrestricted access, with "scientists from Africa requiring sufficient protections for those who generate and share data as per the GISAID terms and conditions". Unlike public-domain databases such as GenBank and EMBL, users of GISAID must have their identity confirmed and agree to a Database Access Agreement that governs the way GISAID data can be used. These Terms of Use are "weighted in favour of the data provider and gives them enduring control over the genetic data they upload". They prevent users from sharing any data with other users who have not agreed to them, and require that users of the data credit the data generators in published work and make a reasonable attempt to collaborate with data generators and involve them in research and analysis that uses their data.

A difficulty that GISAID's Database Access Agreement attempts to address is that many researchers fear sharing of influenza sequence data could facilitate its misappropriation through intellectual property claims by the vaccine industry and others, hindering access to vaccines and other items in developing countries, either through high costs or by preventing technology transfer. Most public interest experts and many researchers agree with GISAID that influenza sequence data should be made public. However, some researchers provide the information only after filing patent claims, while others have said that access to it should be conditional on no patents or other intellectual property claims being filed, a question that was also controversial in the Human Genome Project. GISAID's Database Access Agreement addresses this directly to promote sharing data. GISAID's procedures additionally suggest that those who access the EpiFlu database consult the countries of origin of genetic sequences and the researchers who discovered the sequences. As a result, the GISAID license has been important in rapid pandemic preparedness. However, these restrictions reflect common criticisms of an open data model.

GISAID describes itself as "open access", a description repeated in the media and in journal publications. This description aligns with the original announcement of the consortium, which also mentioned depositing the data in the databases participating in the INSDC. As of March 2023, this is not the case, as "GISAID does not offer a mechanism to release data to any other database". A few academic papers have compared GISAID's licensing model to unrestricted, open databases, highlighting the differences, while other researchers have signed an open letter calling for the use of any of the INSDC's unrestricted databases.

In 2017, GISAID's editorial board stated that "re3data.org and DataCite, the world's leading provider of digital object identifiers (DOI) for research data, affirmed the designation of access to GISAID's database and data as Open Access". However, after several researchers had their accounts suspended in March 2023 as reported by the journal Science and other news outlets, its open access status was revoked by the Registry of Research Data Repositories (re3data), which now classifies it as a "restricted access repository". In 2020 the World Health Organization chief scientist Soumya Swaminathan called the initiative "a game changer", while the co-director of the European Bioinformatics Institute (EBI) Rolf Apweiler has argued that because it does not allow sequences to be reshared publicly, it hampers efforts to understand the coronavirus and the rapid rise of new variants.

GISAID's restrictions on access have led to conflict with "labs and institutions whose priorities are academic rather than driven by the immediate priorities of public health protection". In January 2021, GISAID's restricted access led a group of scientists to write an open letter asking for SARS-CoV-2 sequences to be deposited in open databases, which was reported in the journals Nature and Science. Furthermore, the article from Science points out that the lack of transparency in access to the database also prevents many scientists from even criticising the platform. A paper from 2017 describing the success of GISAID mentions that revoking researchers' credentials was rare, but it did happen. The same publication described a "perceived merit in GISAID's formula for balancing the need for control and openness". In April 2023, Science and The Economist reported that these issues continued, along with a lack of transparency in its governance. An investigation by The Telegraph into claims made by Science noted the incentives of various potential competitors in the field, for whom GISAID is an obstacle to consolidation of control over the field, and also noted that GISAID's position inevitably places it at the center of disputes between groups of scientists, which will tend to result in the losing side blaming GISAID for that outcome.