ProtCID

The Protein Common Interface Database (ProtCID) is a database of similar protein-protein interfaces in crystal structures of homologous proteins.

Its main goal is to identify and cluster homodimeric and heterodimeric interfaces observed in multiple crystal forms of homologous proteins. Such interfaces, especially of non-identical proteins or protein complexes, have been associated with biologically relevant interactions.

A common interface in ProtCID indicates chain-chain or domain-domain interactions that occur in different crystal forms. All protein sequences of known structure in the Protein Data Bank (PDB) are assigned a "Pfam chain architecture", which denotes the ordered Pfam assignments for that sequence, e.g. (Pkinase) or (Cyclin_N)_(Cyclin_C). Homodimeric interfaces in all crystals that contain particular domain or chain architectures are compared, regardless of whether there are other protein types in the crystals. All interfaces between two different Pfam domains or Pfam architectures in all PDB entries that contain them are also compared (e.g., (Pkinase) and (Cyclin_N)_(Cyclin_C)). For both homodimers and heterodimers, the interfaces are clustered into common interfaces based on a similarity score.
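The architecture notation and the clustering step can be illustrated with a minimal Python sketch. It is not ProtCID's implementation: the function names are hypothetical, the Jaccard overlap of contacting residue pairs is only a stand-in for ProtCID's similarity score (which is not specified here), and the 0.5 threshold and single-linkage grouping are assumptions chosen for illustration.

```python
from itertools import combinations


def chain_architecture(pfam_domains):
    """Join ordered Pfam family names into an architecture string,
    e.g. ["Cyclin_N", "Cyclin_C"] -> "(Cyclin_N)_(Cyclin_C)"."""
    return "_".join(f"({name})" for name in pfam_domains)


def jaccard(a, b):
    """Stand-in similarity (an assumption): fraction of contacting
    residue pairs shared by two interfaces."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_interfaces(contacts, threshold=0.5):
    """Single-linkage grouping: two interfaces end up in the same
    cluster if a chain of pairwise similarities >= threshold links them.

    `contacts` maps a hypothetical interface label to its set of
    contacting residue-number pairs."""
    parent = {name: name for name in contacts}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(contacts, 2):
        if jaccard(contacts[a], contacts[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in contacts:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Made-up contact sets, for illustration only.
example_contacts = {
    "if1": {(10, 20), (11, 21), (12, 22)},
    "if2": {(10, 20), (11, 21), (13, 23)},
    "if3": {(50, 60)},
}
example_clusters = cluster_interfaces(example_contacts)
```

In this toy input, "if1" and "if2" share two of four distinct contacts and are grouped together, while "if3" forms its own cluster. A real comparison would first need the residues of homologous proteins to be aligned so that contacts are comparable across entries.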

ProtCID reports the number of crystal forms that contain a common interface, the number of PDB entries, the number of PDB and PISA biological assembly annotations that contain the same interface, the average surface area, and the minimum sequence identity of proteins that contain the interface. ProtCID provides an independent check on publicly available annotations of biological interactions for PDB entries.
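The per-cluster statistics listed above could be derived from a list of interface records roughly as follows. This is a sketch under stated assumptions: the record fields, the function name, and the sample values are hypothetical and do not reflect ProtCID's actual schema or data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class InterfaceRecord:
    """One interface observation in a cluster (hypothetical fields)."""
    pdb_id: str             # placeholder entry identifier
    crystal_form: str       # label of the crystal form the entry belongs to
    surface_area: float     # interface surface area, square angstroms
    in_pdb_assembly: bool   # interface present in the PDB biological assembly
    in_pisa_assembly: bool  # interface present in the PISA assembly


def summarize_cluster(records, pairwise_identities):
    """Compute the summary values reported for a common interface.

    `pairwise_identities` holds percent sequence identities between the
    proteins in the cluster; how these are computed is left open here."""
    return {
        "n_crystal_forms": len({r.crystal_form for r in records}),
        "n_entries": len({r.pdb_id for r in records}),
        "n_pdb_assemblies": len({r.pdb_id for r in records if r.in_pdb_assembly}),
        "n_pisa_assemblies": len({r.pdb_id for r in records if r.in_pisa_assembly}),
        "avg_surface_area": mean(r.surface_area for r in records),
        "min_seq_identity": min(pairwise_identities),
    }


# Made-up records, for illustration only.
example_records = [
    InterfaceRecord("AAAA", "form1", 800.0, True, True),
    InterfaceRecord("BBBB", "form1", 900.0, True, False),
    InterfaceRecord("CCCC", "form2", 1000.0, False, False),
]
example_summary = summarize_cluster(example_records, [95.0, 42.0, 40.0])
```

A cluster seen in many crystal forms, with a low minimum sequence identity, indicates an interface that recurs across distantly related homologs, which is the kind of evidence the reported values are meant to expose.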

ProtCID also contains interface clusters between protein domains and peptides, nucleic acids, and ligands.