C3orf62

Chromosome 3 open reading frame 62 (C3orf62) is a protein that in humans is encoded by the C3orf62 gene. C3orf62 is depleted in glycine relative to the glycine content of proteins in the rest of the genome. C3orf62 has a KKXX-like motif and is predicted to be localized in the nucleus. Expression of C3orf62 is highest in whole blood.

Locus
C3orf62 is mapped to the reverse strand of chromosome 3 at 3p21.31 and spans 9,313 bases. C3orf62 starts at 49,268,597 base pairs from the terminus of the short arm (pter) and ends at 49,277,909 base pairs from pter. The gene is known to have 3 exons, 4 transcripts, and 37 orthologues.
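The stated span is consistent with the two coordinates when both endpoints are counted. A minimal sketch of that arithmetic follows; the function and constant names are illustrative and do not come from any genome database API.

```python
def inclusive_span(start: int, end: int) -> int:
    """Return the number of bases from start to end, counting both endpoints."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text above (base pairs from pter).
C3ORF62_START = 49_268_597
C3ORF62_END = 49_277_909

span = inclusive_span(C3ORF62_START, C3ORF62_END)
print(span)  # 9313
```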

Gene neighborhood
C3orf62 is flanked by Ubiquitin Specific Protease 4 (USP4) and Coiled-Coil Domain Containing 36 (CCDC36).



Aliases
C3orf62 possesses the following alternate names and synonyms: CC062; FLJ43654.

Primary sequence
The human C3orf62 protein (Q6ZUJ4) is 267 amino acids long and has a molecular mass of 30,194 daltons. The unmodified C3orf62 protein is a “glycine depleted protein” relative to the glycine content of proteins in the rest of the genome. Glycine appears to be evenly distributed throughout the C3orf62 sequence, with no tendency to cluster in particular regions. Before post-translational modification, C3orf62 is an acidic protein with an isoelectric point of about 5.2 (5.211). No charge clusters are present in C3orf62, and no specific spacing of cysteine residues is found.
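Glycine depletion is a statement about amino acid composition. The sketch below shows one way a glycine fraction could be computed for a protein sequence. The sequence used is a made-up placeholder rather than the C3orf62 sequence, and the background glycine level used for the comparison is not given in this article, so no threshold is applied.

```python
from collections import Counter


def residue_fraction(sequence: str, residue: str = "G") -> float:
    """Return the fraction of positions in sequence occupied by residue."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return Counter(seq)[residue.upper()] / len(seq)


# Hypothetical 16-residue sequence used only to illustrate the calculation.
toy_sequence = "MSEKLRGAPLTEEDKG"

print(f"{residue_fraction(toy_sequence):.3f}")  # 0.125
```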

Domains and motifs
There are no known transmembrane domains in C3orf62. C3orf62 has a KKXX-like motif at the C-terminus, which suggests that C3orf62 may be involved in the retrieval of endoplasmic reticulum (ER) membrane proteins from the Golgi apparatus.
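A canonical KKXX motif places two lysines at the fourth and third positions from the C-terminus, followed by any two residues. The sketch below checks only for that canonical form; it does not attempt to capture whatever looser criteria the term "KKXX-like" covers. The sequences are hypothetical examples, not C3orf62 sequence.

```python
import re

# Two lysines followed by exactly two residues at the very end of the sequence.
KKXX_PATTERN = re.compile(r"KK[A-Z]{2}$")


def has_c_terminal_kkxx(sequence: str) -> bool:
    """Return True if the last four residues match the canonical KKXX pattern."""
    return KKXX_PATTERN.search(sequence.strip().upper()) is not None


# Hypothetical sequences for illustration.
print(has_c_terminal_kkxx("MAAGLSTEKKLN"))  # True: ends in K-K-L-N
print(has_c_terminal_kkxx("MAAGKKLNSTEA"))  # False: KK is not at positions -4 and -3
```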

Secondary structure
Roughly seven alpha helices are predicted for C3orf62 by PELE protein structure prediction, and this prediction is supported by secondary structure predictions for orthologous sequences made with Ali2D.

Subcellular localization
C3orf62 is predicted to be localized in the nucleus. The k-nearest neighbors algorithm predicts C3orf62 to be classified as follows: k=9/23; 69.6% nuclear, 13.0% mitochondrial, 13.0% cytoskeletal, 4.3% cytoplasmic.
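The reported percentages are consistent with a vote among 23 nearest neighbours. The sketch below assumes neighbour counts of 16, 3, 3, and 1, which are inferred from the percentages and are not stated in the source. It shows how a k-nearest neighbors vote turns neighbour labels into percentages and a predicted class; it is not the prediction tool's own implementation.

```python
from collections import Counter


def vote_percentages(neighbor_labels):
    """Return each label's share of the neighbour votes, in percent (one decimal)."""
    counts = Counter(neighbor_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one neighbour label is required")
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Assumed neighbour counts (inferred from the percentages above, 23 in total).
neighbors = (
    ["nuclear"] * 16
    + ["mitochondrial"] * 3
    + ["cytoskeletal"] * 3
    + ["cytoplasmic"] * 1
)

percentages = vote_percentages(neighbors)
prediction = max(percentages, key=percentages.get)

print(percentages)
print(prediction)  # nuclear
```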

Expression
C3orf62 is expressed in more than 30 different tissues, with the highest expression in whole blood. Expression is also high in the following tissues: lung, tonsil, trachea, small intestine, mammary gland, and salivary gland. Across various microarray studies, C3orf62 shows consistently high expression compared to the other genes tested in the datasets. C3orf62 has low expression in brain tissues.



Post-translational modifications
C3orf62 possesses two post-translational modifications, both of which are phosphorylation sites, at amino acids 210 and 224. A natural variant is found at amino acid 110 (glutamic acid (E) to lysine (K)).

C3orf62 may have a YinOYang site at residue 115, meaning that this threonine residue is predicted to be O-GlcNAcylated as well as phosphorylated. This site may be reversibly and dynamically modified by O-GlcNAc or phosphate groups at different times in the cell.

Regulation of expression
Thirteen promoters have been predicted for C3orf62.

Transcript variants
Transcription of C3orf62 produces 5 alternatively spliced variants and 1 unspliced form. Of four annotated transcripts, two are protein coding, one is subject to nonsense-mediated decay, and one retains an intron. QIAGEN lists the following transcription factor binding sites in the C3orf62 promoter: TFCP2, Pax-6, p53, MyoD, YY1, Ik-2, AREB6, IRF-7A3.

Function
The function of C3orf62 is not currently understood.

Interactions
More than 12 interacting proteins have been predicted for C3orf62. The proteins predicted to interact with C3orf62 with the strongest confidence include HAUS augmin-like complex subunit 1 (HAUS1), Inhibitor of growth protein 5 (ING5), Thioredoxin domain-containing protein 9 (TXNDC9), and MORF4-family associated proteins (MORF4L1, MRFAP1).

Chemicals known to interact with C3orf62 include the following: Aflatoxin B1, Hydralazine, Valproic acid, and Decitabine.

Clinical significance
Interstitial deletions of chromosome 3 are rare, and only a few patients with a microdeletion of 3p21.31 have been reported to date. Characteristic clinical features found in patients with a microdeletion of 3p21.31 include developmental delay and distinctive facial features (including arched eyebrows, hypertelorism, epicanthus, and micrognathia).

In the gene region, NCBI SNP identified 1,326 SNPs on the reverse (minus) strand of C3orf62. In the coding region, NCBI SNP identified 147 common SNPs.

Paralogs
There are no known paralogs of C3orf62.

Orthologs
The ortholog space of C3orf62 is fairly narrow, with the majority of orthologs found in mammals. A small fraction of orthologs have also been found in the following classes: Reptilia, Sarcopterygii, and Actinopterygii.

Nearly all Mammalia ortholog sequences of C3orf62 fall within the following ranges: E-value of 2e-94 to 1e-169; similarity of 56-84%. Mammals in this group consist largely of primates but also include the following orders: Perissodactyla, Rodentia, Carnivora, Proboscidea, Cetartiodactyla, Cingulata, Artiodactyla, Eulipotyphla, Didelphimorphia, and Afrosoricida.

More distantly related ortholog sequences of C3orf62 come from organisms in the classes Reptilia, Sarcopterygii, and Actinopterygii, with E-values ranging from 8e-10 to 3e-59 and similarity of 24-39%. Organisms in this grouping belong to the orders Testudines, Coelacanthiformes, Squamata, and Osteoglossiformes. No ortholog sequences of C3orf62 were found for the following life forms: bacteria, archaea, protists, plants, fungi, Trichoplax, invertebrates, amphibians, or birds.

Phylogeny
The most distant orthologs of C3orf62 are found in species of fish. Orthologs of C3orf62 are not seen in birds, invertebrates, or bacteria.