Computer Atlas of Surface Topography of Proteins

Computer Atlas of Surface Topography of Proteins (CASTp) is a server that aims to provide comprehensive and detailed quantitative characterization of the topographic features of proteins. It has now been updated to version 3.0. Since its release in 2006, the CASTp server has received ≈45000 visits and fulfilled ≈33000 calculation requests annually. CASTp has proven to be a reliable tool for a wide range of research, including investigations of signaling receptors, discovery of cancer therapeutics, understanding of the mechanisms of drug action, studies of immune disorders, analysis of protein–nanoparticle interactions, inference of protein functions and development of high-throughput computational tools. The server is maintained by Jie Liang's lab at the University of Illinois at Chicago.

Geometric Modeling Principles
CASTp applies alpha-shape and discrete-flow methods to identify protein binding sites and to measure pocket size. The underlying program, CAST, was introduced by Liang et al. in 1998 and updated by Tian et al. in 2018. CAST first identifies the atoms that form each protein pocket, then calculates the pocket's volume and area, identifies the atoms forming the rim of the pocket mouth, counts the mouth openings of each pocket, computes the area and circumference of those openings, and finally locates cavities and calculates their size. Secondary structures are calculated by DSSP. Single amino acid annotations are fetched from the UniProt database and then mapped to PDB structures using residue-level information from the SIFTS database.
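The alpha-shape idea mentioned above can be illustrated with a deliberately simplified sketch. It assumes an unweighted, two-dimensional setting in which a Delaunay triangulation has already been computed elsewhere, and it uses the convention that alpha is a radius: a Delaunay triangle is kept in the alpha complex when its circumradius is at most alpha. The function names are hypothetical, and this is not the CAST implementation, which has to work in 3D and account for atoms of different radii.

```python
import math


def circumradius(a, b, c):
    """Return the circumradius of the 2D triangle with vertices a, b, c."""
    ab = math.dist(a, b)
    bc = math.dist(b, c)
    ca = math.dist(c, a)
    # The 2D cross product gives twice the signed area of the triangle.
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    area = abs(cross) / 2.0
    if area == 0.0:
        return math.inf  # collinear points: no finite circumcircle
    return ab * bc * ca / (4.0 * area)


def filter_alpha_triangles(points, triangles, alpha):
    """Keep the Delaunay triangles whose circumradius is at most alpha.

    `triangles` holds index triples into `points` and is assumed to be a
    valid Delaunay triangulation computed elsewhere.
    """
    kept = []
    for tri in triangles:
        a, b, c = (points[i] for i in tri)
        if circumradius(a, b, c) <= alpha:
            kept.append(tri)
    return kept


# Toy input: a compact triangle plus a long, thin one reaching a far point.
points = [(0, 0), (3, 0), (0, 4), (30, 40)]
triangles = [(0, 1, 2), (1, 2, 3)]
kept = filter_alpha_triangles(points, triangles, alpha=3.0)
print(kept)  # [(0, 1, 2)]
```

Triangles that fail the test correspond to empty space between points. Grouping such empty cells is, loosely speaking, where pocket and cavity detection starts in alpha-shape based approaches.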

Instructions for Protein Pocket Calculation
Input

Protein structures in PDB format, and a probe radius.
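As a rough illustration of what the PDB input contains, the sketch below reads atom coordinates from the fixed-width ATOM and HETATM records of the PDB format. The `Atom` type, the function name and the sample coordinates are assumptions made for illustration only; the sketch is not part of CASTp and ignores many details of real files (alternate locations, multiple models, element columns).

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_pdb_atoms(text):
    """Extract atoms from the ATOM/HETATM records of PDB-format text."""
    atoms = []
    for line in text.splitlines():
        # Skip other record types and lines too short to hold coordinates.
        if line[:6] not in ("ATOM  ", "HETATM") or len(line) < 54:
            continue
        atoms.append(Atom(
            name=line[12:16].strip(),      # columns 13-16
            res_name=line[17:20].strip(),  # columns 18-20
            chain_id=line[21],             # column 22
            res_seq=int(line[22:26]),      # columns 23-26
            x=float(line[30:38]),          # columns 31-38
            y=float(line[38:46]),          # columns 39-46
            z=float(line[46:54]),          # columns 47-54
        ))
    return atoms


# Build one illustrative ATOM record column by column so that the
# fixed-width layout is explicit. The coordinates are made up.
sample = (
    "ATOM  "            # columns 1-6: record name
    + f"{1:>5}"         # columns 7-11: atom serial number
    + " "               # column 12: blank
    + " N  "            # columns 13-16: atom name
    + " "               # column 17: alternate location indicator
    + "MET"             # columns 18-20: residue name
    + " "               # column 21: blank
    + "A"               # column 22: chain identifier
    + f"{1:>4}"         # columns 23-26: residue sequence number
    + "    "            # column 27 (insertion code), columns 28-30 (blank)
    + f"{27.340:8.3f}"  # columns 31-38: x coordinate
    + f"{24.430:8.3f}"  # columns 39-46: y coordinate
    + f"{2.614:8.3f}"   # columns 47-54: z coordinate
)
atoms = parse_pdb_atoms(sample)
print(atoms[0])
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in real PDB files can touch each other without a separating space.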

Searching

Users can either search for pre-computed results by 4-character PDB ID, or upload their own PDB file for a customized computation. The core algorithm finds pockets and cavities capable of housing a solvent probe of either the default size or a user-adjusted size.
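Before a lookup by PDB ID, a client script might check that the identifier is plausible. The sketch below assumes the classic PDB ID layout of one digit from 1 to 9 followed by three alphanumeric characters; the function name is hypothetical, and lowercasing the result is an arbitrary choice rather than something the CASTp server is known to require.

```python
import re

# Assumed classic PDB ID layout: a digit 1-9, then three alphanumerics.
_PDB_ID_PATTERN = re.compile(r"[1-9][A-Za-z0-9]{3}")


def normalize_pdb_id(raw):
    """Return a cleaned-up PDB ID, or raise ValueError if it looks invalid."""
    candidate = raw.strip()
    if not _PDB_ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"not a 4-character PDB ID: {raw!r}")
    return candidate.lower()


print(normalize_pdb_id(" 1UBQ "))  # 1ubq
```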

Output

CASTp identifies all surface pockets, interior cavities and cross channels, and provides a detailed delineation of all atoms participating in their formation. For each pocket or void, identified by its pocket ID, it reports the area and volume as well as the number of mouth openings, under both the solvent accessible surface model (Richards' surface) and the molecular surface model (Connolly surface), all calculated analytically. By default, the core algorithm finds pockets and cavities capable of housing a solvent probe with a radius of 1.4 Å. The online tool also supports PyMOL and UCSF Chimera plugins for molecular visualization.
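The difference between the two surface models is easiest to see for a single isolated atom, where both areas have closed forms. The sketch below treats the 1.4 Å probe size as a radius (an assumption consistent with the probe radius requested as input) and uses hypothetical function names. The solvent accessible surface is traced by the probe center, so the atom's radius is inflated by the probe radius. For an isolated atom the molecular surface has no re-entrant patches and coincides with the van der Waals sphere. Real proteins require handling overlaps between many atoms, which this sketch does not attempt.

```python
import math

PROBE_RADIUS = 1.4  # in Å; assumed default, roughly the size of a water molecule


def sas_area_isolated_atom(vdw_radius, probe_radius=PROBE_RADIUS):
    """Solvent accessible surface area (Å^2) of one isolated atom.

    The surface is the sphere traced by the probe center while the probe
    rolls over the atom, so its radius is vdw_radius + probe_radius.
    """
    return 4.0 * math.pi * (vdw_radius + probe_radius) ** 2


def ms_area_isolated_atom(vdw_radius):
    """Molecular surface area (Å^2) of one isolated atom.

    With no neighboring atoms there are no re-entrant patches, so the
    molecular surface is just the van der Waals sphere.
    """
    return 4.0 * math.pi * vdw_radius ** 2


# 1.7 Å is a commonly quoted van der Waals radius for carbon (an assumption).
carbon_radius = 1.7
print(sas_area_isolated_atom(carbon_radius))
print(ms_area_isolated_atom(carbon_radius))
```

Because the probe radius enters the accessible surface directly, changing it changes which pockets and cavities are large enough to be reported.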

Why Is CASTp Useful?
Protein science, from amino acids to sequences and structures

Proteins are large, complex molecules that play critical roles in maintaining the normal functioning of the human body. They are essential not just for the structure and function of the body's tissues and organs, but also for their regulation. Proteins are made up of hundreds of smaller units called amino acids that are attached to one another by peptide bonds, forming a long chain.

Protein active sites

The active site of a protein is usually located at its center of action and is the key to its function. The first step in studying it is to detect active sites on the protein surface and to describe their features and boundaries exactly. These specifications are vital inputs for subsequent target druggability prediction or target comparison. Most algorithms for active site detection are based on geometric modeling or on calculations of energetic features.

The role of protein pockets

The shape and properties of the protein surface determine what interactions are possible with ligands and other macromolecules. Pockets are an important yet ambiguous feature of this surface. During the drug discovery process, the first step in screening for lead compounds and potential drug molecules is usually a selection based on the shape of the binding pocket. Shape plays a role in many computational pharmacological methods. Based on existing results, most features important for predicting drug binding depend on the size and shape of the binding pocket, with chemical properties of secondary importance. The surface shape is also important for interactions between protein and water. However, defining discrete pockets or possible interaction sites remains difficult, because the shape and location of nearby pockets affect the promiscuity and diversity of binding sites. Since most pockets are open to solvent, defining the border of a pocket is the primary difficulty. Pockets that are closed to solvent are referred to as buried cavities. Because their extent, area and volume are well defined, buried cavities are more straightforward to locate. In contrast, the border of an open pocket defines its mouth, which provides the cut-off for determining the surface area and volume. Even defining the pocket as a set of residues does not define the volume or the mouth of the pocket.
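The distinction between buried cavities and solvent-open space can be illustrated with a toy flood fill on a two-dimensional grid. This is a grid-based simplification chosen for readability, not the alpha-shape and discrete-flow approach that CASTp uses, and the grid, symbols and function name are all hypothetical. Empty regions that never reach the border of the grid are reported as buried cavities. Everything else, including the pocket with a mouth, merges with the outside, which shows why open pockets need an explicit definition of where the mouth is.

```python
from collections import deque


def find_buried_cavities(grid):
    """Return the empty regions of `grid` that are sealed off from its border.

    `grid` is a list of equal-length strings where '#' marks protein and
    '.' marks empty space. Each region is a set of (row, col) cells,
    connected through shared edges (4-connectivity).
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    cavities = []
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != "." or (r0, c0) in seen:
                continue
            # Breadth-first flood fill of one connected empty region.
            region = {(r0, c0)}
            queue = deque([(r0, c0)])
            open_to_solvent = False
            while queue:
                r, c = queue.popleft()
                if r in (0, rows - 1) or c in (0, cols - 1):
                    open_to_solvent = True  # region reaches the outside
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == "."
                            and (nr, nc) not in region):
                        region.add((nr, nc))
                        queue.append((nr, nc))
            seen |= region
            if not open_to_solvent:
                cavities.append(region)
    return cavities


# Column 2 holds a sealed cavity; column 4 holds a pocket whose mouth
# opens at the top, so it merges with the outside.
toy_protein = [
    "........",
    ".###.##.",
    ".#.#.##.",
    ".#.#.##.",
    ".######.",
    "........",
]
print(find_buried_cavities(toy_protein))
```

In this toy grid only the two cells in column 2 are reported. The pocket in column 4 is not separated from the surrounding solvent, because nothing in a plain flood fill says where its mouth closes.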

Druggability role prediction

In the pharmaceutical industry, the preferred strategy for target assessment is currently high-throughput screening (HTS). NMR screening is applied against large compound datasets. The chemical characteristics of compounds that bind to specific targets are measured, so how well the compound sets cover chemical space determines the binding efficiency. The success rates of virtually docking drug-like ligands into the active sites of target proteins are used for prioritization, and most of these active sites are located in pockets.

With the benefit of a large amount of structural data, computational methods for druggability prediction have been introduced from different perspectives over the last 30 years with positive results, serving as a vital instrument for accelerating prediction. Many of these methods have since been integrated into drug discovery pipelines.

New Features in CASTp 3.0
Pre-computed results for biological assemblies

For many proteins deposited in the Protein Data Bank, the asymmetric unit differs from the biological unit, which can make the computational results biologically irrelevant. CASTp 3.0 therefore computes the topological features for biological assemblies, bridging the gap between asymmetric units and biological assemblies.

Imprints of negative volumes of topological features

In the first release of the CASTp server in 2006, only the geometric and topological features of the surface atoms participating in the formation of protein pockets, cavities, and channels were provided. The new CASTp adds the "negative volume" of the space, which refers to the space encompassed by the atoms that form these geometric and topological features.

Comprehensive annotation on single amino-acid polymorphism

The latest CASTp integrates protein annotations aligned with the sequence, including the brief feature, position, description, and reference for domains, motifs, and single amino-acid polymorphisms.

Improved user interface & convenient visualization

The new CASTp incorporates 3Dmol.js for structural visualization, allowing users to browse and interact with the protein 3D model and to examine the computational results in the latest web browsers, including Chrome, Firefox and Safari. Users can pick their own representation style for the atoms that form each topographic feature and edit the colors to their own preferences.