MIPOL1

MIPOL1 (Mirror Image Polydactyly 1), also known as CCDC193 (Coiled-coil domain containing 193), is a protein that in humans is encoded by the MIPOL1 gene. Mutation of this gene is associated with mirror-image polydactyly (also known as Laurin-Sandrow syndrome), a rare genetic condition in humans characterized by mirror-image duplication of digits.

Gene
MIPOL1 is also known as CCDC193 (Coiled-coil domain containing 193).

Locus
The MIPOL1 gene is located at 14q13.3-q21.1 on the plus strand, spanning base pairs 37,197,888 to 37,579,207 (in the human GRCh38 primary assembly; length: 381,320 base pairs), and consists of 15 exons and 11 introns. Notable genes in its neighborhood include SLC25A21 (mutation of this gene causes synpolydactyly) and FOXA1.
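The quoted length is consistent with treating both coordinates as inclusive, 1-based positions, which is a common genome-browser convention and is assumed here. A minimal Python sketch of that arithmetic follows; the function name `span_length` is illustrative only and does not come from any library.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval (assumed convention)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# GRCh38 coordinates quoted in the text for MIPOL1 on chromosome 14.
MIPOL1_START = 37_197_888
MIPOL1_END = 37_579_207

mipol1_length = span_length(MIPOL1_START, MIPOL1_END)
print(f"{mipol1_length:,} bp")  # 381,320 bp
```

Under a 0-based, half-open convention the same interval would be written with a start one smaller, and the length would be `end - start` instead.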

mRNA
MIPOL1 has at least 15 known splice isoforms produced by alternative splicing.

Properties
The unmodified MIPOL1 protein isoform 1 in humans has an isoelectric point of 5.6 and a molecular weight of 51.5 kDa. Relative to other human proteins, MIPOL1 contains unusually low proportions of proline and glycine and higher proportions of glutamic acid and glutamine.

Isoforms
There are at least three known isoforms of this protein in humans produced by alternative splicing: isoform 1, of length 442 amino acids, isoform 2 of length 261 amino acids and isoform 3 of length 169 amino acids.



Domains and motifs
MIPOL1 contains two coiled-coil domains in its C-terminus at positions 107 – 212 and 253 – 435 (shown in Fig.1). A bipartite nuclear localization signal is predicted at position 128 – 143.



Post-translational modifications
The following post-translational modifications are predicted for MIPOL1 using bioinformatics tools. Multiple phosphorylation sites that are conserved in close orthologs are predicted for this protein, including one casein kinase 1 (CK1) site, three casein kinase 2 (CK2) sites, and three NEK2 sites.



Structure
The exact structure of the MIPOL1 protein has not yet been characterized. Homology-based and de novo predictions of its tertiary structure suggest that it may consist of intertwined alpha helices forming coiled-coil domains (see Fig. 4).

Sub-cellular localization
Immunofluorescence imaging in the human U2OS cell line (bone osteosarcoma epithelial cells) shows localization in the cytosol. Immunohistochemistry imaging of human prostate tissue also suggests cytosolic localization. However, a bipartite nuclear localization signal is predicted at position 128 – 143 and is highly conserved in mammalian orthologs (see Fig. 2), indicating possible localization in the nucleus as well.

Gene regulation
The predicted promoter sequence for this gene spans base pairs 37,196,852 to 37,198,126 (1,275 bp) and has multiple predicted binding sites for transcription factors such as GATA binding factors, SMAD3, TP63 and NRF1.
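As a worked check of these coordinates against the gene locus given earlier, the sketch below computes how much of the predicted promoter lies upstream of the annotated gene start and how much overlaps the gene body. It assumes that both coordinate pairs refer to the same assembly (GRCh38) and are 1-based and inclusive; the text states the assembly only for the gene span. The helper name `inclusive_length` is illustrative.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval; 0 if the interval is empty."""
    return max(0, end - start + 1)


# Coordinates quoted in the text (assumed to share the GRCh38 assembly).
PROMOTER = (37_196_852, 37_198_126)
GENE = (37_197_888, 37_579_207)

promoter_bp = inclusive_length(*PROMOTER)

# Bases of the promoter that precede the first base of the gene.
upstream_bp = inclusive_length(PROMOTER[0], min(PROMOTER[1], GENE[0] - 1))

# Bases shared by the promoter and the gene span.
overlap_bp = inclusive_length(max(PROMOTER[0], GENE[0]), min(PROMOTER[1], GENE[1]))

print(promoter_bp, upstream_bp, overlap_bp)  # 1275 1036 239
```

On these assumptions, the predicted promoter covers 1,036 bp upstream of the annotated gene start and extends 239 bp into the gene span.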

Gene expression
MIPOL1 is ubiquitously expressed at low levels in humans, with highest expression in the prostate.

Transcript regulation
The mRNA secondary structure is predicted (using bioinformatics tools) to be stabilized by multiple stem-loops that are conserved across closely related species. The mRNA also contains multiple binding sites for microRNAs such as MIR3163 and MIR190a, which could silence the transcript and inhibit translation.

Clinical significance
Polydactyly associated with the MIPOL1 gene follows an autosomal dominant inheritance pattern. MIPOL1 is one of six human genes known to cause non-syndromic polydactyly (i.e. polydactyly occurring in isolation, with no other associated anomalies). Mutation of this gene is associated with mirror-image polydactyly (also known as Laurin-Sandrow syndrome), a rare genetic condition characterized by mirror-image duplication of digits in the hands and feet. The gene has also been associated with central nervous system development, and its loss can cause craniofacial defects and agenesis of the corpus callosum.

The gene has been shown to function as a tumor suppressor in nasopharyngeal carcinoma (NPC) through up-regulation of the p21 (WAF1/CIP1) and p27 pathways; both proteins are cyclin-dependent kinase inhibitors that are linked with tumor suppression via cell cycle arrest. Another study investigating the role of MIPOL1 in cancer progression reported that the gene was down-regulated in NPC tumor tissues, and that artificially re-expressing it suppressed tumors by down-regulating angiogenic factors and reducing the phosphorylation of metastasis-associated proteins such as AKT, p65 and FAK. MIPOL1 also interacts with RhoB, another well-known tumor suppressor, and this interaction was confirmed to enhance RhoB activity.

In a study of pediatric high-grade glioma (pHGG), the MIPOL1 gene was found to be down-regulated 2.4-fold in high-vascularity tumors.

The protein is known to interact with Replicase polyprotein 1ab of SARS-CoV-2, a protein involved in the transcription and replication of viral RNAs.

Interacting proteins
This protein is known to interact with multiple human proteins, verified via two-hybrid screening. A few notable examples include:

LATS2: Negatively regulates YAP1 in the Hippo signaling pathway that plays a pivotal role in organ size control and tumor suppression by restricting cell proliferation and promoting apoptosis.

ZGPAT (Zinc finger CCCH-type with G patch domain-containing protein): A transcription repressor that negatively regulates expression of EGFR, a gene involved in cell proliferation, survival and migration, suggesting that it may act as a tumor suppressor.

RCOR3 (REST Corepressor 3): A protein that may act as a component of a co-repressor complex that represses transcription.

It also interacts with viral proteins such as:

Replicase polyprotein 1ab (SARS-CoV-2): A multifunctional protein involved in the transcription and replication of viral RNAs.

Protein E7 (Human Papillomavirus): Plays a role in viral genome replication by driving entry of quiescent cells into the cell cycle.

Origin and evolution
The earliest known ortholog of this protein appeared around 948 million years ago in Trichoplax adhaerens in phylum Placozoa in kingdom Animalia. The next most distant orthologs appear in phylum Cnidaria, around 824 million years ago.

Sequence homology
The MIPOL1 protein has no known paralogs in humans or in the other species for which orthologs have been found; it is therefore the only member of its gene family. There are more than 300 known orthologs of the MIPOL1 protein in Animalia, ranging from primates to corals and sea anemones in phylum Cnidaria. Orthologs of the protein were found in species as distant as Trichoplax adhaerens, a simple primitive invertebrate species. Table 2 shows a sample of the ortholog space.

Closely related orthologs are found in chordates such as mammals, reptiles, birds and amphibians, with sequence similarities greater than 70%. Sequence lengths of orthologs were similar to the human MIPOL1 protein, with no significant gene duplication observed.

Organisms with sequence similarities in the 55-70% range (moderately related orthologs) were found in bony fish, cartilaginous fish and coelacanths. Sequence length is generally longer in these species, with a longer amino acid sequence in the N-terminus (alignment with human protein occurs around amino acid 100).

Distantly related orthologs with similarities less than 50% (around 30 – 40%) are found in hemichordates, echinoderms, arthropods, molluscs, cnidaria and placozoa. Multiple sequence alignment with distant orthologs indicates poor alignment in the N-terminus of the protein.
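The three similarity bands described above can be expressed as a simple classifier. The sketch below uses only the thresholds stated in the text; the function name and the example values are hypothetical. Because the text does not assign the 50–55% range to any group, the sketch reports it as unclassified instead of guessing.

```python
def classify_ortholog(similarity_pct: float) -> str:
    """Map a percent sequence similarity to the bands described in the text."""
    if not 0 <= similarity_pct <= 100:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_pct > 70:
        return "closely related"
    if similarity_pct >= 55:
        return "moderately related"
    if similarity_pct < 50:
        return "distantly related"
    # 50 to 55 percent is not assigned to any band in the text.
    return "unclassified"


# Hypothetical similarity values, for illustration only.
for value in (85.0, 60.0, 52.0, 35.0):
    print(value, classify_ortholog(value))
```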

Two COG (Clusters of Orthologous Groups of proteins) domains were found in this protein (see Fig. 3): COG1196 at position 106 - 340 (chromosome segregation ATPase) and COG4372 at 259 - 431 (uncharacterized conserved protein containing a DUF3084 domain).



Phylogenetics
Using a linear regression analysis on a plot of corrected percent divergence (amino acid changes per 100 amino acids) as a function of date of divergence from humans for different MIPOL1 orthologs (see Fig. 5), it is estimated that a 1% change in amino acids in the MIPOL1 protein takes 5.68 million years. The MIPOL1 protein is evolving at a moderate rate relative to fast-evolving proteins such as fibrinogen alpha and slow-evolving proteins such as cytochrome c.
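The regression estimate can be read as a conversion factor between corrected divergence and time. The sketch below applies the stated figure of 5.68 million years per 1% change. It assumes a strictly proportional relationship (a fitted line through the origin), which the text does not state explicitly; the function names and the 10% input are illustrative.

```python
# Figure quoted in the text: million years (My) per 1% amino acid change.
MY_PER_PERCENT = 5.68


def estimated_divergence_time_my(corrected_divergence_pct: float) -> float:
    """Approximate time in My for a given corrected percent divergence."""
    if corrected_divergence_pct < 0:
        raise ValueError("divergence cannot be negative")
    return corrected_divergence_pct * MY_PER_PERCENT


def percent_change_per_my() -> float:
    """Inverse of the quoted figure: percent amino acid change per My."""
    return 1.0 / MY_PER_PERCENT


# Hypothetical input: an ortholog with 10% corrected divergence.
print(f"{estimated_divergence_time_my(10.0):.1f} My")  # 56.8 My
print(f"{percent_change_per_my():.3f} % per My")       # 0.176 % per My
```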