C13orf42

C13orf42 is a protein which, in humans, is encoded by the gene chromosome 13 open reading frame 42 (C13orf42). RNA sequencing data show low expression of the C13orf42 gene in a variety of tissues. The C13orf42 protein is predicted to localize to the mitochondria, nucleus, and cytosol. Tertiary structure predictions for C13orf42 indicate multiple alpha helices.

Summary
C13orf42 is a protein-coding gene containing 4 exons. It is also known by the aliases LINC00371 and LINC00372. RNA sequencing shows that the gene is expressed at low levels in various tissues.

Location
C13orf42 is located on the minus strand of chromosome 13 at 13q14.3 in humans. It extends from 51.08 Mb to 51.20 Mb on chromosome 13 and spans 118 kilobases.

Neighborhood
The genomic neighborhood of C13orf42 consists of several pseudogenes along with ribonuclease H2 subunit B (RNASEH2B), uncharacterized LOC107984554, and family with sequence similarity 124 member A (FAM124A).

Exons
The C13orf42 gene contains 4 exons.

Expression
RNA sequencing of C13orf42 shows expression in a variety of tissues including the spleen, kidney, heart, brain, testis, skin, esophagus, colon, small intestine, stomach, lung, placenta, salivary gland, thymus, and adipose. RNA sequencing of human fetal tissue shows C13orf42 expression starting at 20 weeks in the intestine, 16 weeks in the kidney, and 10 weeks in the lung; in the stomach, expression is seen at 16 weeks but not at 10, 18, or 20 weeks. Recorded RNA expression is very low, with all values below 0.5 reads per kilobase of transcript per million mapped reads (RPKM). Microarray data from NCBI GEO (GDS425) show expression in additional tissues including bone marrow, liver, skeletal muscle, spinal cord, and pancreas.
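The RPKM unit used above can be illustrated with a minimal sketch. The read count and library size below are hypothetical values chosen only to show the arithmetic; they are not measurements for C13orf42. The transcript length of 3075 nucleotides is the length reported for transcript variant 3.

```python
def rpkm(read_count, transcript_length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_nt / 1_000        # transcript length in kb
    millions = total_mapped_reads / 1_000_000       # library size in millions
    return read_count / (kilobases * millions)


# Hypothetical example: 30 reads mapped to a 3075-nt transcript
# in a library of 20 million mapped reads.
example_value = rpkm(30, 3075, 20_000_000)
is_low_expression = example_value < 0.5  # threshold quoted in the text
```

Under these assumed inputs the value falls just under 0.5 RPKM, which gives a sense of how few reads correspond to the expression levels described.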

Variants
C13orf42 produces four known transcript variants: variant 1, variant 2, variant 3, and variant X1. Transcript variant 3 (accession number: NM_001351589.3) is the longest high-quality mRNA at 3075 nucleotides. Transcript variant 3 contains 4 exons and encodes a 325 amino acid protein.

Transcript variants 1, 2, and X1 all lack the first exon but align with exons 2, 3, and 4 of transcript variant 3. Variants 1 and 2 are non-coding, while variants 3 and X1 are protein coding. Variant X1 is 2717 nucleotides long and encodes a 189 amino acid protein that differs in its first two amino acids and otherwise aligns with the last 187 amino acids of the longer protein encoded by transcript variant 3.

Isoforms
There are two known protein isoforms of C13orf42. Transcript variant 3 encodes the longer isoform, at 325 amino acids. Transcript variant X1 encodes a 189 amino acid protein. This shorter protein aligns with the portion of the 325 amino acid protein encoded by exons 2, 3, and 4, but lacks the portion encoded by exon 1.

Protein Composition
C13orf42 has a predicted isoelectric point of 9.3 and a predicted molecular weight of 37.4 kDa. Human C13orf42 is rich in serine and in the positively charged amino acids lysine and arginine. This composition is partially conserved in orthologs.

Tertiary Structure
The highest-confidence tertiary structure predicted for C13orf42 by I-TASSER contains many alpha helices. In the structure below, the residues that are abundant in C13orf42 (serine, lysine, and arginine) are annotated. A space-filling model and a charge model are also shown for C13orf42.

Subcellular Localization
Human C13orf42 is predicted to be localized to the mitochondria, nucleus, cytosol, and endoplasmic reticulum (ER), with the ER predicted at a low percentage (<5%). Orthologs show similar predicted subcellular localization, with mitochondria, nucleus, and cytosol being the top predicted locations; however, the predicted percentages vary.

Immunohistochemistry
C13orf42 antibody B-4 (catalog number: sc-376095) shows cytoplasmic and nuclear staining in seminiferous ducts and Leydig cells of testis tissue. C13orf42 antibody E-3 (catalog number: sc-374567) shows cytoplasmic staining in seminiferous ducts and Leydig cells of testis tissue, and cytoplasmic and nucleolar localization in HeLa cells.

Post-translational Modifications
C13orf42 is predicted to have 10 highly conserved phosphorylation sites (present in over 70% of the analyzed orthologs in the table below). These include one CK2 phosphorylation site, one tyrosine (TYR) phosphorylation site, two cAMP phosphorylation sites, and six PKC phosphorylation sites. There are three predicted O-β-GlcNAc sites and two predicted yin-yang sites in C13orf42, which are fully conserved in orthologs. A yin-yang site is a residue for which both O-β-GlcNAc modification and phosphorylation are predicted. C13orf42 is not predicted to have myristoylation sites, as it does not contain an N-terminal glycine.
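The yin-yang definition and the 70% conservation criterion above can be expressed as a short sketch. All residue positions and ortholog presence calls below are hypothetical placeholders, not the actual predicted sites for C13orf42, and the function names are illustrative rather than part of any prediction tool.

```python
def yin_yang_sites(o_glcnac_positions, phospho_positions):
    """Positions predicted for both O-GlcNAc and phosphorylation."""
    return sorted(set(o_glcnac_positions) & set(phospho_positions))


def is_highly_conserved(present_in_ortholog, threshold=0.70):
    """True if the site is found in more than `threshold` of orthologs.

    `present_in_ortholog` holds one boolean per analyzed ortholog.
    """
    if not present_in_ortholog:
        raise ValueError("at least one ortholog is required")
    fraction = sum(present_in_ortholog) / len(present_in_ortholog)
    return fraction > threshold


# Hypothetical predictions: three O-GlcNAc sites, two of which
# coincide with predicted phosphorylation sites.
o_glcnac = [12, 45, 101]
phospho = [12, 30, 101, 150]
shared = yin_yang_sites(o_glcnac, phospho)

# Hypothetical site seen in 8 of 10 orthologs (80%).
conserved = is_highly_conserved([True] * 8 + [False] * 2)
```

With these placeholder inputs, two of the three O-GlcNAc positions are shared with phosphorylation predictions, mirroring the counts reported in the text.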

Domains
C13orf42 has no identified domains with high confidence or conservation in orthologs.

Orthologs
C13orf42 has orthologs in mammals, birds, reptiles, amphibians, bony fish, and cartilaginous fish, as shown in the ortholog table below. No orthologs were found in jawless fish, invertebrates, plants, fungi, viruses, or bacteria. All mammalian orthologs contain the same 4 exons as the human C13orf42 gene, while nonmammalian orthologs are missing exon 4. Mammalian orthologs have a high percent identity to human C13orf42, each having over 62% identity. The most distant orthologs (cartilaginous fish) have sequence identities around 33%. Human C13orf42 does not have paralogs.
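For readers unfamiliar with the percent identity figures quoted above, the following is a minimal sketch of one common way to compute it from a pairwise alignment: identical aligned residues divided by the alignment length. Other conventions (for example, dividing by the shorter sequence length) exist, and the source does not state which was used. The two aligned sequences are made-up toy strings, not C13orf42 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical): 6 identical columns out of 8.
toy_identity = percent_identity("MKS-RRSL", "MKSARKSL")
```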

Phylogeny
A phylogenetic tree shows that human C13orf42 is most closely related to its mammalian orthologs and most distantly related to its cartilaginous fish orthologs.

Clinical significance
Kanagal-Shamanna et al. identified an ATM fusion with C13orf42 in a patient with chronic lymphocytic leukemia, which led to ATM inactivation.

Xiong et al. reported that SNP rs7325564 is significantly associated with nasion and pronasale face shape in humans.