User:Alimkey/sandbox

Chromosome 16 open reading frame 13 (C16orf13), also known as JFP2, is a protein-coding gene of unknown function. The JFP designation indicates its location on chromosome 16 among other genes of the jumonji family, which are known to be involved in chromatin regulation and gene expression. Though the function of this gene is unknown, microarray data have revealed that it is expressed at high levels in various cancerous tissues.[16] Underexpression of this gene has also been linked to disease in humans.

Gene Summary
C16orf13 is located on the short arm of chromosome 16 in humans, in the thirteenth open reading frame (accession number NM_032366.3). The longest cDNA transcript, which is the reference sequence, is 854 base pairs long.[2] There are five transcript variants of this gene, named 1, 2, 3, 4, and 7. The gene is composed of six exons, all of which contribute to the major domain of the protein, the NADB Rossmann domain. The primary transcript of this gene is 1919 base pairs long, with NCBI reference number NC_000016.9.

Conservation Analysis
A multiple sequence alignment reveals high sequence similarity with the Mus musculus ortholog, particularly in the region that codes for the NADB Rossmann domain. The NADB Rossmann domain appears highly conserved among many of the orthologous genes and proteins in a diverse array of species, extending to relatives as distant as Xenopus tropicalis, the western clawed frog. The 5’ UTR and 3’ UTR show sequence similarity only among close mammalian relatives, with high conservation among primates.
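As a rough illustration of how similarity can be read off an alignment, the sketch below computes percent identity between two rows of an existing alignment. The function name `percent_identity` and the toy sequences are hypothetical and are not C16orf13 data; the choice to ignore gapped columns is an assumption, since alignment tools differ in which denominator they use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either row (an assumption)
        compared += 1
        if a == b:
            matches += 1
    # Return 0.0 when no column could be compared, to avoid dividing by zero.
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy rows from an alignment, NOT real C16orf13 sequence.
human_row = "ATGC-A"
mouse_row = "ATGGTA"
print(percent_identity(human_row, mouse_row))
```

Applying the same calculation separately to the coding region and to each UTR would be one way to compare conservation between those regions.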

Protein
The protein that this gene codes for is known as UPF0585, where UPF stands for uncharacterized protein family (accession number NP_115742.3).[29] There are five isoforms of this protein, corresponding to the five splice variants of the gene. The isoforms are named a, b, c, d, and g.[24] The conserved domain detected in this amino acid sequence is the NADB Rossmann domain. The protein is thought to be a type of NAD reductase enzyme containing the NAD-binding sequence GXGXX(G/A), where X is any amino acid. The reference sequence has 204 amino acid residues.
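The GXGXX(G/A) pattern can be searched for with a regular expression. The following is a minimal sketch that assumes X matches any single residue; the function name `find_nad_motifs` and the toy peptide are hypothetical, and the peptide is not the real UPF0585 sequence.

```python
import re

# GXGXX(G/A): glycine, any residue, glycine, any two residues, then G or A.
# The lookahead lets overlapping occurrences be reported as well.
NAD_MOTIF = re.compile(r"(?=(G.G..[GA]))")


def find_nad_motifs(protein_seq):
    """Return (1-based start position, matched residues) for each motif hit."""
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group(1)) for m in NAD_MOTIF.finditer(seq)]


# Hypothetical toy peptide for illustration only.
toy_peptide = "MKVLGAGCGTGALSR"
print(find_nad_motifs(toy_peptide))  # [(7, 'GCGTGA')]
```

A six-residue pattern this loose will also match by chance in unrelated proteins, so a hit is a hint to inspect the surrounding domain and not evidence of NAD binding on its own.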

Species Distribution
C16orf13 has homologs in many species, including distant orthologs in fungi and plants.[1] There are no known paralogs of this protein.[1] The gene and its protein are very highly conserved in primates and other mammals, particularly in the functional domain and in the NAD-binding site of the NADB domain. As detailed above, the UTR and promoter regions show less conservation than the coding sequence (CDS). This may be because different organisms require species-specific uses and expression patterns.

Disease Links
Data from microarray experiments have linked overexpression of this gene to cancer in various tissues, particularly breast cancer and gastric cancer.[16] In addition, underexpression of this gene has been linked to disease, particularly connective tissue disease, nutritional and metabolic disorders, and digestive disorders.

Chromosome (Gene neighborhood)
The C16orf13 gene is located near the end of chromosome 16, making it subject to deletion mutations, as pictured below. The genes surrounding C16orf13 include hypothetical protein LOC100287175 and LOC100138285 to the right, and RAB40C and WFIKKN1 to the left.[30] C16orf13 is located on the minus strand, along with LOC100138285. The other surrounding genes lie in the opposite orientation, on the plus strand.[30] The gene neighborhood is represented in the schematic below.[30]