Solenoid protein domain



Solenoid protein domains are a highly modular type of protein domain. They consist of a chain of nearly identical folds, often simply called tandem repeats. They are extremely common among all types of proteins, though exact figures are unknown.

"Repeats" in molecular biology
In proteins, a "repeat" is any sequence block that occurs more than once in the sequence, in either an identical or a highly similar form. Repetitiveness does not in itself indicate anything about the structure of the protein. As a rule of thumb, short repetitive sequences (e.g. those shorter than 10 amino acids) may be intrinsically disordered and not part of any folded protein domain. Repeats that are at least 30 to 40 amino acids long are far more likely to be folded as part of a domain. Such long repeats frequently indicate the presence of a solenoid domain in the protein.

Examples of disordered repetitive sequences include the 7-mer peptide repeats found in the RPB1 subunit of RNA polymerase II, or the tandem beta-catenin or axin binding linear motifs in APC (adenomatous polyposis coli). Examples of short repeats exhibiting ordered structures include the three-residue collagen repeat or the five-residue pentapeptide repeat that forms a beta helix structure.
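The length-based rule of thumb can be illustrated with a short Python sketch. It is only a toy: the function names `find_tandem_repeats` and `folding_hint` are hypothetical, the search finds exact back-to-back copies only (real repeats are usually just similar, not identical), and the thresholds of 10 and 30 residues are taken from the rule of thumb above rather than from any established tool. The test sequence is synthetic, built from three copies of the heptad consensus YSPTSPS of the RNA polymerase II repeats. As the collagen and pentapeptide examples show, short repeats can still be ordered, so the output is a hint and not a prediction.

```python
def find_tandem_repeats(seq, min_unit=2, max_unit=50, min_copies=2):
    """Return (start, unit, copies) for exact, back-to-back repeats.

    Naive scan: at each position, try every unit length and keep the
    unit whose consecutive copies cover the longest stretch.
    """
    hits = []
    i = 0
    while i < len(seq):
        best = None  # (unit, copies) covering the longest span from i
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # ran off the end of the sequence
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            span = copies * k
            if copies >= min_copies and (
                best is None or span > len(best[0]) * best[1]
            ):
                best = (unit, copies)
        if best:
            hits.append((i, best[0], best[1]))
            i += len(best[0]) * best[1]  # skip past the repeat region
        else:
            i += 1
    return hits


def folding_hint(unit_length):
    """Apply the rule of thumb from the text (assumed thresholds)."""
    if unit_length < 10:
        return "possibly disordered"
    if unit_length >= 30:
        return "more likely folded"
    return "indeterminate"


# Synthetic example: three copies of a heptad flanked by unrelated residues.
sequence = "MK" + "YSPTSPS" * 3 + "GA"
for start, unit, copies in find_tandem_repeats(sequence):
    print(start, unit, copies, folding_hint(len(unit)))
# prints: 2 YSPTSPS 3 possibly disordered
```

Detecting real solenoid repeats generally requires tolerance for substitutions and insertions, which this exact-match scan does not attempt.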

Architecture of solenoid domains
Because their building blocks are nearly identical in form, solenoid domains can assume only a limited number of shapes. Two main topologies are possible: linear (or open, generally with some degree of helical curvature) and circular (or closed).

Linear (open) solenoids
If the two terminal repeats in a solenoid do not physically interact, the result is an open or linear structure. Members of this group are frequently rod- or crescent-shaped. The number of individual repeats can range from 2 to over 50. A clear advantage of this topology is that both the N- and C-terminal ends are free: repeats and folds can be added, or existing ones removed, during evolution without any gross impact on the structural stability of the entire domain. This type of domain is extremely common among extracellular segments of receptors or cell adhesion molecules. A non-exhaustive list of examples includes EGF repeats, cadherin repeats, leucine-rich repeats, HEAT repeats, ankyrin repeats, armadillo repeats and tetratricopeptide repeats. Whenever a linear solenoid domain participates in protein-protein interactions, the ligand-binding site is frequently formed by three or more repeats. Thus, while individual repeats might have a (limited) ability to fold on their own, they usually cannot perform the functions of the entire domain alone.

Circular (closed) solenoids


When the N- and C-terminal repeats of a solenoid domain lie in close physical contact, the result is a topologically compact, closed structure. Such domains typically display high rotational symmetry (unlike open solenoids, which have only translational symmetries) and assume a wheel-like shape. Because of the constraints of this structure, the number of individual repeats is not arbitrary. In the case of WD40 repeats (perhaps the largest family of closed solenoids), the number of repeats can range from 4 to 10, and is more usually between 5 and 7. Kelch repeats, beta-barrels and beta-trefoil repeats are further examples of this architecture. Closed solenoids frequently function as protein-protein interaction modules: if the ligand-binding site is located at the centre or axis of the domain "wheel", all repeats may need to be present to form it.
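The geometric constraint on closed solenoids can be made concrete with a small Python sketch. It assumes an idealized wheel in which n repeats are spaced evenly around the central axis, so that each repeat occupies 360/n degrees; real domains deviate from this ideal, and the function name `repeat_angles` is purely illustrative.

```python
def repeat_angles(n_repeats):
    """Angular position (degrees) of each repeat in an idealized,
    perfectly n-fold symmetric closed solenoid."""
    if n_repeats < 1:
        raise ValueError("a domain needs at least one repeat")
    step = 360.0 / n_repeats  # rotation relating neighbouring repeats
    return [i * step for i in range(n_repeats)]


# Repeat counts quoted in the text for WD40 domains: 4 to 10.
for n in range(4, 11):
    print(n, "repeats:", round(360.0 / n, 1), "degrees per repeat")
```

Under this idealization, the most common WD40 architectures of 5 to 7 repeats correspond to a rotation of 72 to roughly 51 degrees between neighbouring repeats.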

Repetitive supradomain modules
As is common in biology, there are several borderline cases between solenoid architectures and regular protein domains. Proteins that contain tandem repeats of ordinary domains are very common in eukaryotes. Even if these domains are perfectly capable of folding on their own, some of them might bind together and assume a rigidly fixed orientation in the full protein. These supradomain modules can perform functions that their individual constituents are incapable of. A famous example is the tandem BRCT domains found in the tumor suppressor protein BRCA1. While individual BRCT domains are found in certain proteins (e.g. some DNA ligases), where they bind DNA, these tandem BRCT domains evolved a novel function: binding phosphorylated linear motifs. In the case of BRCA1 (and MDC1), the peptide-binding groove lies in a cleft formed by the junction of the two domains. This elegantly explains why the individual constituents of this supradomain block are incapable of ligand binding, while their proper assembly endows them with a novel function. Tandem BRCT domains can therefore also be regarded as a form of single, linear solenoid domain.