Chromosome 12 open reading frame 71

Chromosome 12 open reading frame 71 (c12orf71) is a protein that in humans is encoded by the c12orf71 gene. The protein is also known by the alias LOC728858.

Gene
The gene is located on the minus strand of chromosome 12 (12p11.23). The DNA sequence of the c12orf71 gene is 3071 base pairs long, and 8 significant structural variations have been identified, including deletions, duplications, and gain- and loss-of-function mutations. The c12orf71 gene was found to be altered (gain of 21 Mb) in the chromosomal region 12p11.21-p13.3 of a male patient with chromosomal aberrations. It was also found in a duplication (gain of 411 kb) at chromosome 12p11.23, along with c12orf70, the coding regions of STK38L and ARNTL2, and a portion of PPFIBP1. Manual inspection of alignments has determined that the c12orf71 gene is mammal-specific. Furthermore, genome-wide screening has identified c12orf71 as one of 1000 disrupted genes that are positively selected by cisplatin, a chemotherapy drug.

RNA
The mRNA of c12orf71 transcript variant 1 is 1022 nucleotides long and consists of 2 exons. There is one other, slightly longer transcript variant of c12orf71, with a length of 1087 nucleotides. The mRNA sequence of transcript variant 1 contains a coding sequence that spans both exons, as well as 2 poly-A signal sequences.

Expression
In humans, c12orf71 shows an intermediate expression level in the testis and low expression in the bone marrow, skin, spleen, lymph node and liver. Human c12orf71 is expressed after the fetal-development stage. RNA-sequencing analysis has revealed that c12orf71 was expressed at a very low level, or not at all, in osteoarthritis and non-osteoarthritis hip cartilage. A genome-engineering study of knockout mice found that c12orf71 expression is lower in humans than in mouse testis; however, the absence of c12orf71 had no effect on mouse fertilization.

Protein
The c12orf71 protein is 269 amino acids long. The unmodified precursor protein has a predicted molecular weight of 30.4 kDa and a theoretical isoelectric point of 5.21. The protein is rich in serine and aspartic acid and has relatively low amounts of valine and tyrosine.
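As a rough illustration of how a predicted molecular weight and amino acid composition can be derived from a protein sequence, the Python sketch below sums standard average residue masses and adds one water molecule. It is not the tool used to produce the values above. The function names and the toy peptide are hypothetical, and the 110 Da per residue rule of thumb is only a sanity check: for 269 residues it gives about 29.6 kDa, which is in line with the predicted 30.4 kDa.

```python
from collections import Counter

# Average masses (Da) of amino acid residues, i.e. the amino acid minus one water.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # Da, added once for the free N- and C-termini


def molecular_weight(sequence):
    """Average molecular weight (Da) of an unmodified peptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def composition(sequence):
    """Fraction of each amino acid present in the sequence."""
    counts = Counter(sequence.upper())
    return {aa: n / len(sequence) for aa, n in counts.items()}


def rough_weight_kda(length, mean_residue_mass=110.0):
    """Rule-of-thumb estimate (kDa): about 110 Da per residue."""
    return length * mean_residue_mass / 1000.0


if __name__ == "__main__":
    toy_peptide = "MSDSDK"  # hypothetical example, NOT the c12orf71 sequence
    print(round(molecular_weight(toy_peptide), 2))
    print(composition(toy_peptide))
    print(rough_weight_kda(269))
```

Running the same functions on the actual 269-residue sequence, taken from a sequence database, would give values comparable to the figures quoted above. Small differences are expected depending on the mass table used.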

Cellular localization
Cellular localization analysis showed that the human c12orf71 protein is found in the cytoplasm. All orthologs of the protein were also localized to the cytoplasm. Immunohistochemistry with a polyclonal antibody against c12orf71 localized the protein to the cytosol.

Domains
The first 21 amino acids of the coding sequence form a disordered region, followed by a domain of unknown function (DUF4640) that spans almost the whole coding sequence. The human protein also contains a vacuolar domain, which is mammal-specific and may be modulated by phosphorylation.

Post-translational modifications
The c12orf71 protein has multiple predicted phosphorylation sites, which can affect its protein interactions and subcellular localization as well as its stability and activity. The protein has one predicted SUMOylation site and one predicted ubiquitination site, which can influence many of its biological functions, such as the cellular response to stress and degradation, respectively. Five different lysine acetylation sites were predicted. Acetylation can neutralize the positive charge on the lysine, but at the same time the transfer of the acetyl group can increase the expression of the protein. Two N-glycosylation sites, along with multiple O-glycosylation and O-linked N-acetylglucosaminylation sites, were predicted, which could potentially affect protein stability. Lysine acetylation and ubiquitination compete at K130, suggesting that a deacetylase enzyme acts at this site.

Interacting proteins
There is a direct interaction between c12orf71 and AP2B1, with a moderate confidence level. Adaptor related protein complex 2 subunit beta (AP2B1) helps establish a link between clathrin and receptors in coated vesicles. The c12orf71 protein has also been found in a protein-protein interaction (PPI) network of the carboxypeptidase M (CPM) gene, along with nine other genes.

Homology and evolution
Orthologs of the c12orf71 gene have been found only in mammals, specifically in Theria (marsupials and placentals). No orthologs have been found in monotremes, birds, reptiles, amphibians, fish, invertebrates, fungi, plants, bacteria, or viruses.

Evolutionary history
It has been estimated that the c12orf71 gene first appeared in marsupials approximately 160 million years ago. Among marsupial species, sequence similarity suggests that the gene first appeared in the taxonomic group Microbiotheria, represented by the species Dromiciops gliroides (Colocolo opossum). Only one isoform of the c12orf71 protein has been found in this species.

Clinical association
Only one disease, common warts, is associated with the c12orf71 gene. A study of global gene methylation in common warts caused by HPV infection found that the c12orf71 gene is differentially methylated in Arab male patients with common warts. In particular, c12orf71 is hypomethylated in skin infected with common warts compared to normal skin. Ten SNPs from the GWAS catalog associate the c12orf71 gene with obsolete and androgenic alopecia, healing of bone mineral density and educational attainment.