List of biophysically important macromolecular crystal structures

Crystal structures of protein and nucleic acid molecules and their complexes are central to the practice of most parts of biophysics, and have shaped much of what we understand scientifically at the atomic-detail level of biology. Their importance is underlined by the United Nations declaring 2014 the International Year of Crystallography, marking the 100th anniversary of Max von Laue's 1914 Nobel Prize for discovering the diffraction of X-rays by crystals. This chronological list of biophysically notable protein and nucleic acid structures is loosely based on a review in the Biophysical Journal. The list includes all of the first dozen distinct structures, those that broke new ground in subject or method, and those that became model systems for later areas of biophysical research.

Myoglobin
1958 – Myoglobin was the first protein molecule to have its crystal structure determined. Myoglobin cradles an iron-containing heme group that reversibly binds oxygen for use in powering muscle fibers, and those first crystals were of myoglobin from the sperm whale, whose muscles need copious oxygen storage for deep dives. The myoglobin 3-dimensional structure is made up of 8 alpha-helices, and the crystal structure showed that their conformation was right-handed and very closely matched the geometry proposed by Linus Pauling, with 3.6 residues per turn and backbone hydrogen bonds from the peptide CO of residue i to the peptide NH of residue i+4. Myoglobin is a model system for many types of biophysical studies, especially those involving the binding of small ligands such as oxygen and carbon monoxide.

Hemoglobin
1960 – The hemoglobin crystal structure showed a tetramer of two related chain types and was solved at much lower resolution than the monomeric myoglobin, but it clearly had the same basic 8-helix architecture (now called the "globin fold"). Further hemoglobin crystal structures at higher resolution (PDB 1MHB, 1DHB) soon showed the coupled change of both local and quaternary conformation between the oxy and deoxy states of hemoglobin, which explains the cooperativity of oxygen binding in the blood and the allosteric effect of factors such as pH and DPG. For decades hemoglobin was the primary teaching example for the concept of allostery, as well as an intensive focus of research and discussion on it. Much earlier, a 1909 book used hemoglobin crystals from more than 100 species to relate taxonomy to molecular properties. That book was cited by Perutz in the 1938 report of horse hemoglobin crystals that began his long saga to solve the crystal structure. Hemoglobin crystals are pleochroic, dark red in two directions and pale red in the third, because of the orientation of the hemes, and the bright Soret band of the heme porphyrin groups is used in spectroscopic analysis of hemoglobin ligand binding.



Hen-egg-white lysozyme
1965 – Hen-egg-white lysozyme (PDB file 1lyz) was the first crystal structure of an enzyme (it cleaves small carbohydrates into simple sugars), and was used for early studies of enzyme mechanism. It contained beta sheet (antiparallel) as well as helices, and was also the first macromolecular structure to have its atomic coordinates refined (in real space). The starting material for preparation can be bought at the grocery store, and hen-egg lysozyme crystallizes very readily in many different space groups; it is the favorite test case for new crystallographic experiments and instruments. Recent examples are nanocrystals of lysozyme for free-electron laser data collection and microcrystals for micro electron diffraction.



Ribonuclease
1967 – Ribonuclease A (PDB file 2RSA) is an RNA-cleaving enzyme stabilized by 4 disulfide bonds. It was used in Anfinsen's seminal research on protein folding, which led to the concept that a protein's 3-dimensional structure is determined by its amino-acid sequence. Ribonuclease S, the cleaved, two-component form studied by Fred Richards, was also enzymatically active and had a nearly identical crystal structure (PDB file 1RNS). It was shown to be catalytically active even in the crystal, helping dispel doubts about the relevance of protein crystal structures to biological function.



Serine proteases
1967 – The serine proteases are a historically very important group of enzyme structures, because collectively they illuminated catalytic mechanism (in their case, by the Ser-His-Asp "catalytic triad"), the basis of differing substrate specificities, and the activation mechanism by which a controlled enzymatic cleavage buries the new chain end to properly rearrange the active site. The early crystal structures included chymotrypsin (PDB file 2CHA), chymotrypsinogen (PDB file 1CHG), trypsin (PDB file 1PTN), and elastase (PDB file 1EST). They also were the first protein structures that showed two near-identical domains, presumably related by gene duplication. One reason for their wide use as textbook and classroom examples was the insertion-code numbering system, which made Ser195 and His57 consistent and memorable despite the protein-specific sequence differences.

Papain
1968 – Papain



Carboxypeptidase
1969 – Carboxypeptidase A is a zinc metalloprotease. Its crystal structure (PDB file 1CPA) showed the first parallel beta structure: a large, twisted, central sheet of 8 strands with the active-site Zn located at the C-terminal end of the middle strands and the sheet flanked on both sides by alpha helices. It is an exopeptidase that cleaves peptides or proteins from the carboxy-terminal end rather than internal to the sequence. Later, the structure of a small protein inhibitor of carboxypeptidase was solved (PDB file 4CPA); it mechanically stops catalysis by presenting its C-terminal end just sticking out from a ring of disulfide bonds with tight structure behind it, preventing the enzyme from drawing in the chain past the first residue.



Subtilisin
1969 – Subtilisin (PDB file 1sbt) was a second type of serine protease with a near-identical active site to the trypsin family of enzymes, but with a completely different overall fold. This gave the first view of convergent evolution at the atomic level. Later, an intensive mutational study on subtilisin documented the effects of all 19 other amino acids at each individual position.

Lactate dehydrogenase
1970 – Lactate dehydrogenase



Trypsin inhibitor
1970 – Basic pancreatic trypsin inhibitor, or BPTI (PDB file 2pti), is a small, very stable protein that has been a highly productive model system for the study of super-tight binding, disulfide bond (SS) formation, protein folding, molecular stability as probed by amino-acid mutations or hydrogen-deuterium exchange, and fast local dynamics by NMR. Biologically, BPTI binds and inhibits trypsin while it is stored in the pancreas, allowing activation of protein digestion only after trypsin is released into the small intestine.



Rubredoxin
1970 – Rubredoxin (PDB file 2rxn) was the first redox structure solved, a minimalist protein with the iron bound by 4 Cys sidechains from 2 loops at the tops of β hairpins. It diffracted to 1.2 Å, enabling the first reciprocal-space refinement of a protein (PDB files 4rxn and 5rxn; note that 4rxn was refined without geometry restraints). Archaeal rubredoxins account for many of the highest-resolution small structures in the PDB.



Insulin
1971 – Insulin (PDB file 1INS) is a hormone central to the metabolism of sugar and fat storage, and important in human diseases such as obesity and diabetes. It is biophysically notable for its Zn binding, its equilibrium between monomer, dimer, and hexamer states, its ability to form crystals in vivo, and its synthesis as a longer "pro" form which is then cleaved to fold up as the active 2-chain, SS-linked monomer. Insulin was a success of NASA's crystal-growth program on the Space Shuttle, producing bulk preparations of very uniform tiny crystals for controlled dosage.

Staphylococcal nuclease
1971 – Staphylococcal nuclease

Cytochrome C
1971 – Cytochrome C

T4 phage lysozyme
1974 – T4 phage lysozyme

Immunoglobulins
1974 – Immunoglobulins

Superoxide dismutase
1975 – Cu,Zn Superoxide dismutase

Transfer RNA
1976 – Transfer RNA



Triose phosphate isomerase
1976 – Triose phosphate isomerase

Pepsin-like aspartic proteases

 * 1976 – Rhizopuspepsin
 * 1976 – Endothiapepsin
 * 1976 – Penicillopepsin

Later structures (1978 onwards)

 * 1978 – Icosahedral virus
 * 1981 – Dickerson B-form DNA dodecamer
 * 1981 – Crambin
 * 1985 – Calmodulin
 * 1985 – DNA polymerase


 * 1985 – Photosynthetic reaction center: Pairs of bacteriochlorophylls inside the membrane capture energy from sunlight, which then travels by many steps to become available at the heme groups in the cytochrome-C module. This was the first crystal structure solved for a membrane protein, a milestone recognized by a Nobel Prize to Hartmut Michel, Johann Deisenhofer, and Robert Huber.


 * 1986 – Repressor/DNA interactions
 * 1987 – Major histocompatibility complex
 * 1987 – Ubiquitin
 * 1987 – ROP protein
 * 1989 – HIV-1 protease
 * 1990 – Bacteriorhodopsin
 * 1991 – GCN4 coiled coil
 * 1991 – HIV-1 reverse transcriptase
 * 1993 – Beta helix of Pectate lyase
 * 1994 – Collagen
 * 1994 – Barnase/barstar complex
 * 1994 – F1 ATPase
 * 1995 – Heterotrimeric G proteins
 * 1996 – Green fluorescent protein
 * 1996 – CDK/cyclin complex


 * 1996 – Kinesin motor protein
 * 1997 – GroEL/ES chaperone
 * 1997 – Nucleosome
 * 1998 – Group I self-splicing intron


 * 1998 – DNA topoisomerases perform the essential job of untangling DNA strands or helices that become entwined with each other or twisted too tightly during normal cellular processes such as the transcription of genetic information.
 * 1998 – Tubulin alpha/beta dimer
 * 1998 – Potassium channel
 * 1998 – Holliday junction
 * 2000 – Ribosomes, which are central to biology and biophysics, first became structurally accessible in 2000.


 * 2000 – AAA+ ATPase
 * 2002 – Ankyrin repeats
 * 2003 – TOP7 protein design
 * 2004 – Cyanobacterial Circadian clock proteins
 * 2004 – Riboswitch
 * 2006 – Human exosome


 * 2007 – G-protein-coupled receptor


 * 2009 – The vault particle was an intriguing discovery: a large hollow particle common in cells, for which several different biological functions have been suggested. The crystal structures (PDB files 2zuo, 2zv4, 2zv5 and 4hl8) show that each half of the vault is made up of 39 copies of a long 12-domain protein that swirl together to form the enclosure. Disorder at the very top and bottom ends suggests openings for possible access to the interior of the vault.