Structural Genomics Consortium

The Structural Genomics Consortium (SGC) is a public-private partnership focusing on elucidating the functions and disease relevance of all proteins encoded by the human genome, with an emphasis on those that are relatively understudied. The SGC places all its research output into the public domain without restriction, does not file for patents, and continues to promote open science. Two recent publications revisit the case for open science. Founded in 2003, and modelled after the Single Nucleotide Polymorphism Database (dbSNP) Consortium, the SGC is a charitable company whose Members comprise organizations that contribute over €5.4 million to the SGC over a five-year period. The Board has one representative from each Member and an independent Chair, who serves one 5-year term. The current Chair is Anke Müller-Fahrnow (Germany), and previous Chairs have been Michael Morgan (U.K.), Wayne Hendrickson (U.S.A.), Markus Gruetter (Switzerland) and Tetsuyuki Maruyama (Japan). The founding and current CEO is Aled Edwards (Canada). The founding Members of the SGC Company were the Canadian Institutes of Health Research, Genome Canada, the Ontario Research Fund, GlaxoSmithKline and Wellcome Trust. The current (March 2022) Members comprise Bayer Pharma AG, Bristol Myers Squibb, Boehringer Ingelheim, the Eshelman Institute for Innovation, Genentech, Genome Canada, Janssen, Merck KGaA, Pfizer, and Takeda.

SGC research activities take place in a coordinated network of university-affiliated laboratories at Goethe University Frankfurt, Karolinska Institutet, McGill University, and the Universities of North Carolina at Chapel Hill and Toronto. The research activities are supported both by funds from the SGC Company and by grants secured by the scientists affiliated with the SGC programs. At each university, the scientific team is led by a Chief Scientist: Stefan Knapp (Goethe University Frankfurt), Michael Sundstrom (Karolinska Institutet), Ted Fon (McGill University), Tim Willson (University of North Carolina at Chapel Hill), and Cheryl Arrowsmith (University of Toronto). The SGC currently comprises ~200 scientists.

Chemical biology of human proteins
Structural biology of human proteins – Since 2003, the SGC has contributed over 2000 structures of human proteins of potential relevance for drug discovery into the public domain. The determination of structures of complexes with synthetic small molecules is aided by a partnership with the Diamond synchrotron in Oxfordshire. The chemical probe program prioritizes (members of) protein families that are relatively understudied, or which may be currently relevant to human biology and drug discovery. These families include those involved in epigenetic signaling, solute transport, proteostasis, and protein phosphorylation. The protein family approach is supported by publicly available bioinformatics tools (ChromoHub, UbiHub), family-based protein production and biochemistry, crystallography and structure determination, biophysics, and cell biology (for example, target engagement assays). The SGC has so far contributed ~120 chemical probes into the public domain over the past decade, and >25,000 samples of these probes have been distributed to the scientific community. The chemical probes conform to the now community-standard quality criteria created by the SGC and its collaborative network.


 * 1) Epigenetic chemical probes that have generated clinical interest in their targets include PFI-1 and JQ1 for the BET family, UNC0642 for G9a/GLP, UNC1999 for EZH2/EZH1, LLY-283 and GSK591 for PRMT5, and OICR-9429 for WDR5. The WDR5 chemical probe was optimized for clinical use by a company external to the SGC and is the subject of investment from Celgene.
 * 2) Kinases are the targets of 50 drugs approved by the FDA for the treatment of cancer, inflammation, and fibrosis. A review from two and a half years ago, a recent preprint, and a peer-reviewed publication highlight the low coverage of kinases both by peer-reviewed publications and by 3D structures. In the last 4 years, laboratories in Frankfurt, North Carolina and Oxford have developed chemical matter to help biologists study underrepresented kinases. In collaboration with pharmaceutical companies and academia, 15 chemical probes and version 1.0 of a set of 187 chemogenomic inhibitors (the Kinase Chemogenomic Set, KCGS) covering 215 kinases have been co-developed.
 * 3) Integral membrane proteins are permanently attached to the cell membrane. The family includes the solute carrier (SLC) proteins. The SLCs are largely unexplored therapeutically, and ~30% are considered ‘orphaned’ because their substrate specificity and biological function are unknown. In 2019, a public-private partnership comprising 13 partners, including the SGC, formed the RESOLUTE Consortium with funding from the Innovative Medicines Initiative (IMI). RESOLUTE’s goal is to encourage research on SLCs.
 * 4) The Target Enabling Package (TEP) is a collection of reagents and knowledge on a protein target, intended to catalyze the biochemical and chemical exploration and characterization of proteins with genetic linkage to key disease areas. The SGC has opened target nominations to the public.
 * 5) The Unrestricted Leveraging of Targets for Research Advancement and Drug Discovery (ULTRA-DD) program, funded by the European Commission’s Innovative Medicines Initiative (IMI), aims to identify and validate under-explored targets in auto-immune and inflammatory disease models. Patient-derived cell lines are screened against chemical modulators (including chemical probes and chemogenomic compounds) with the intention of obtaining phenotypic read-outs in a disease-relevant context.
 * 6) The Enabling and Unlocking biology in the Open (EUbOPEN) program, funded by the IMI, aims to assemble a chemogenomic library for ~1,000 proteins, discover ~100 high-quality chemical probes, establish infrastructure to characterize these compounds, disseminate robust protocols for primary patient cell-based assays, and establish the infrastructure to seed a global effort to address the entire druggable genome.

Non-human proteins
The Structure-guided Drug Discovery Coalition (SDDC) comprises the Seattle Structural Genomics Center for Infectious Disease (SSGCID), the Midwest Center for Structural Genomics, the Center for Structural Genomics of Infectious Diseases (CSGID), and drug discovery teams from academia and industry. Its work has resulted in 7 early drug leads for tuberculosis (TB), malaria, and cryptosporidiosis. The SDDC receives funding from participating academic initiatives and the Bill & Melinda Gates Foundation.

The University of North Carolina at Chapel Hill and the Eshelman Institute for Innovation launched the Rapidly Emerging Antiviral Drug Development Initiative (READDI™) and the Viral Interruption to Medicines Initiative (VIMI™). READDI™ is modelled after the Drugs for Neglected Diseases Initiative (DNDi), a non-profit drug research and development organization. READDI™ and VIMI™ are non-profit, open science initiatives that focus on developing therapeutics for all pandemic-capable viruses.

Open Science
Open science is a key operating principle. A Trust Agreement is signed before reagents are shared with researchers. These reagents include cDNA clones (Addgene), chemical probes, and 3D structures. Tools to promote open science include open lab notebooks. The open lab notebook platform is being used to share research on, for example, diffuse intrinsic pontine glioma (DIPG), fibrodysplasia ossificans progressiva, Huntington’s disease, Parkinson’s disease, and chordoma.

Open Drug Discovery
The for-profit spin-off companies M4K Pharma (Medicines for Kids), M4ND Pharma (Medicines for Neurological Diseases) and M4ID Pharma (Medicines for Infectious Diseases) do not file patents and practise open science. The M4 companies are wholly owned by a Canadian charity, Agora Open Science Trust, whose mandate is to share scientific knowledge and ensure affordable access to all medicines. M4K Pharma has the most advanced open drug discovery program. It is supported with funding from the Ontario Institute for Cancer Research, The Brain Tumour Charity, Charles River Laboratories and Reaction Biology, and with contributions from scientists at McGill University, the Universities of North Carolina, Oxford, Pennsylvania, and Toronto, the Sant Joan de Déu hospital, the University Health Network hospitals, the Hospital for Sick Children, and The Institute for Cancer Research. M4K Pharma is developing a selective inhibitor of ALK2 for DIPG, a uniformly fatal pediatric brain tumour.

The Concept
In 2000, a group of companies and Wellcome conceived the idea of forming a Structural Genomics Consortium to focus on determining the three-dimensional structures of human proteins. The consortium was required to place all structural information and supporting reagents into the public domain without restriction. This effort was designed to complement other structural genomics programs in the world.

Phase I (2004-2007)
The SGC scientific program was launched, with activities at the Universities of Oxford and Toronto, and with a mandate to contribute >350 human protein structures into the public domain. To be counted toward this goal, the proteins had to derive from a pre-defined list and the protein structures were required to meet pre-defined quality criteria. The quality of protein structures was and continues to be adjudicated by a committee of independent academic scientists. Michael Morgan was the Chair of the SGC Board, and the scientific activities were led by Cheryl Arrowsmith (Toronto) and Michael Sundstrom (Oxford). In mid-2005, VINNOVA, the Knut and Alice Wallenberg Foundation and the Foundation for Strategic Research (SSF) established the Swedish research node of the SGC. Experimental activities started at the Karolinska Institutet in Stockholm, led by Pär Nordlund and Johan Weigelt. Together, the three SGC laboratories contributed 392 human protein structures into the public domain. A pilot program in the structural biology of proteins in the malaria parasite was also initiated.

Phase II (2007-2011)
The new goal for structures was 650. The SGC focused considerable activities in the areas of ubiquitination, protein phosphorylation, small G-proteins and epigenetics, and also initiated an effort in the structural biology of integral membrane proteins. In this phase, the SGC determined the structures of 665 human proteins from its Target List. With support from Wellcome and GSK, the SGC launched a program to develop freely available chemical probes for proteins involved in epigenetic signalling, which at the time were understudied. The quality of each chemical probe was subject to two levels of review prior to its dissemination to the public. The first was internal, through a Joint Management Committee comprising representatives from each member organization. The second was provided by a group of independent experts selected from academia. This level of oversight was aimed at developing reagents that support reproducible research. It ultimately led to the creation of the Chemical Probes Portal. The SGC Membership expanded to include Merck Sharp & Dohme and Novartis. Wayne Hendrickson served as the Chair of the SGC Board.

Phase III (2011-2015)
The SGC mandate diversified to include 200 human protein structures, including 5 integral membrane proteins, and 30 chemical probes. Many of the chemical probe programs were undertaken in partnership with scientists in the pharmaceutical companies, which committed to contributing the collaborative chemical probes into the public domain, without restriction. In Phase III, the SGC, along with the SSGCID ( https://www.ssgcid.org/ ) and the CSGID ( https://csgid.org/ ), launched the SDDC. AbbVie, Bayer AG, Boehringer Ingelheim, Eli Lilly and Janssen joined as Members, while Merck Sharp & Dohme and the Canadian Institutes of Health Research left the consortium. Markus Gruetter became the Chair of the SGC Board.

Phase IV (2015-2020)
This phase built on the goals of previous phases but included well-characterized antibodies to human proteins. The SGC initiated a concerted effort to develop disease-relevant, cell-based assays using (primary) cells or tissue from patients. This phase saw the launch of research activities at Goethe University in Frankfurt, at McGill University, and at the Universities of Campinas and North Carolina, and participation in ULTRA-DD and RESOLUTE within IMI. Merck KGaA, the Eshelman Institute for Innovation, and Merck Sharp & Dohme joined as Members, while GSK and Eli Lilly left. Tetsuyuki Maruyama became the Chair of the Board.

The Future - Target 2035
Target 2035 is an open science movement with the goal of creating chemical and/or biological tools for the entire proteome by 2035. The launch event in November 2020 was free to attend, as are the monthly webinars that have followed. Supporting projects currently underway include the SGC’s epigenetics chemical probe program, the NIH’s Illuminating the Druggable Genome initiative for under-explored kinases, GPCRs and ion channels, IMI’s RESOLUTE project on human SLCs, and IMI's Enabling and Unlocking Biology in the Open (EUbOPEN). These teams are linked to SGC’s global collaborative network.