Complementarity plot



The complementarity plot (CP) is a graphical tool for structural validation of atomic models of both folded globular proteins and protein-protein interfaces. It is based on a probabilistic representation of preferred amino acid side-chain orientations, analogous to the preferred backbone orientations of the Ramachandran plot. It can potentially serve to elucidate protein folding as well as binding. Upgraded versions of the software suite are available and maintained on GitHub for both folded globular proteins and inter-protein complexes. The software is included in the bioinformatic tool suites OmicTools and Delphi tools.

Background
Validation of three-dimensional protein crystal structures is traditionally based on a multitude of parameters: (i) the distribution of residues in the Ramachandran plot, (ii) deviations from ideality for bond lengths and angles, (iii) atomic short contacts (steric clash scores), (iv) the distribution of the side-chain conformers (rotamers), and (v) hydrogen bonding parameters. The complementarity plot, as a structural validation tool for proteins, essentially combines these traditional approaches. CP detects local errors in atomic coordinates and also correctly matches an amino acid sequence to its native three-dimensional fold situated amid decoys. The complementarity plot is based on the combined use of shape and electrostatic complementarity of completely or partially buried residues with respect to their environment, which is constituted by the rest of the polypeptide chain. It is a sensitive indicator of the harmony or disharmony of interior residues with regard to the short- and long-range forces sustaining the native fold. The term 'Complementarity Plot' (CP) is perhaps a misnomer, as there are actually three plots, each serving a given range of solvent exposure of the plotted residues (CP1, CP2, CP3 for burial bins 1, 2, 3).
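The mapping from a residue's burial to one of the three plots can be sketched as a simple binning step. The following is a minimal illustration only: the function name `select_plot` and the bin boundaries are placeholders chosen for this example and are not taken from the CP software, whose actual burial measure and cutoffs should be checked against its documentation.

```python
from bisect import bisect_left

# Hypothetical upper bounds of burial bins 1, 2 and 3 (placeholder values,
# assumed for illustration; not taken from the CP software).
BURIAL_UPPER_BOUNDS = (0.05, 0.15, 0.30)


def select_plot(burial, upper_bounds=BURIAL_UPPER_BOUNDS):
    """Return 'CP1', 'CP2' or 'CP3' for a residue's burial value.

    Returns None when the residue is more exposed than the last bin allows,
    i.e. when it would not be plotted at all.
    """
    if burial < 0:
        raise ValueError("burial must be non-negative")
    # bisect_left finds the first bin whose upper bound is >= burial,
    # so each upper bound belongs to its own bin.
    index = bisect_left(upper_bounds, burial)
    if index >= len(upper_bounds):
        return None
    return f"CP{index + 1}"
```

With these placeholder bounds, a residue with burial 0.10 would be placed on CP2, while one with burial 0.50 would be left off all three plots.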

Pictorial description
The design of the complementarity plot has been largely inspired by the Ramachandran plot, although its physicochemical attributes are different. The Ramachandran plot is deterministic in nature, whereas CP is probabilistic. The Ramachandran plot deals with main-chain torsion angles, and errors in these parameters are essentially locally restricted. In contrast, CP deals with the geometric and electrostatic fit of the interior side-chains with their local and non-local neighborhood. Disharmony (misfit) in these conjugated parameters may arise from a plethora of errors in bond angles or torsions anywhere in the folded polypeptide chain. However, analogous to the Ramachandran plot, the region within the first contour is termed 'probable' (analogous to the 'allowed' region), the region between the first and second contours 'less probable' ('partially allowed'), and the region outside the second contour 'improbable' ('disallowed').
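The three-region labelling can be sketched as a lookup on a gridded probability density. This is a rough illustration under stated assumptions rather than the CP implementation: it assumes that each contour is an iso-line of a probability density tabulated on a square grid, that both complementarity axes run from -1 to 1, and that the grid spacing and contour levels are supplied by the caller. The names `grid_cell` and `classify_point` are hypothetical.

```python
import math


def grid_cell(shape_c, elec_c, cell=0.05, origin=-1.0):
    """Map a (shape, electrostatic) complementarity pair to a grid cell index.

    The cell size and origin are assumptions made for this sketch.
    """
    return (
        math.floor((shape_c - origin) / cell),
        math.floor((elec_c - origin) / cell),
    )


def classify_point(shape_c, elec_c, density, first_level, second_level):
    """Label a plotted residue as probable, less probable or improbable.

    density maps grid cells to probabilities; cells absent from it are
    treated as having zero probability. first_level is the density at the
    first (inner) contour and must exceed second_level (outer contour).
    """
    if first_level <= second_level:
        raise ValueError("first_level must be greater than second_level")
    p = density.get(grid_cell(shape_c, elec_c), 0.0)
    if p >= first_level:
        return "probable"        # inside the first contour
    if p >= second_level:
        return "less probable"   # between the first and second contours
    return "improbable"          # outside the second contour
```

A residue falling in a cell that the reference distribution never populates is labelled 'improbable', which mirrors the 'disallowed' region of the Ramachandran plot.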

Applications
CP has a multitude of applications in experimental as well as computational structural biology. Thorough investigation of the effect of small errors in both main- and side-chain bond angles and torsions on the overall fold shows that CP is effective in detecting these errors even when the other existing parameters, which are based on prohibition of local steric overlap and deviation from ideality, fail to do so. The consequences of such small angular errors are not restricted locally. They result in geometric and electrostatic misfit of interior residues throughout the fold, which is potentially detectable by the CPs. These errors may arise from (i) misfitting of side-chain torsions or wrong rotamer assignments (especially relevant for low-resolution structures), and (ii) incorrect tracing of the main-chain trajectories during refinement (resulting in low-intensity errors diffused over the entire polypeptide chain). CP can also detect packing anomalies and, in particular, can potentially signal unbalanced partial charges within protein interiors. It is useful in homology modeling and protein design. A version of the plot (CPint) has also been built and made available to probe similar errors in protein-protein interfaces.

CPdock
In contrast to the residue-wise plots, a variant of the complementarity plot named CPdock plots a single pair of Sc and EC values for a protein-protein interface and thereby assesses the quality of the atomic structure of the complex, whether experimentally solved or computationally built. Sc and EC are the shape and electrostatic complementarities computed for 'interacting protein-protein surfaces', originally proposed by Peter Colman and co-workers in the 1990s. CPdock was primarily developed as a scoring function to serve as an initial filter in protein-protein docking and can be a very helpful tool in protein design. This has lately been demonstrated in COVID research, both in scoring and in the evaluation of docked complexes to eliminate the effect of co-substrate binding in targeted inhibitor binding.

Software
CP@SINP: http://www.saha.ac.in/biop/www/sarama.html

CP: https://github.com/nemo8130/SARAMA-updated

CPint: https://github.com/nemo8130/SARAMAint-updated

CPdock: https://github.com/nemo8130/CPdock

EnCPdock (web-server): https://scinetmol.in/EnCPDock/