User:Jmnosh/sandbox

C3orf70, also known as Chromosome 3 Open Reading Frame 70, is a 250-amino-acid protein in humans that is encoded by the C3orf70 gene. The encoded protein is predicted to be nuclear; however, its exact function is currently unknown. Known aliases for C3orf70 are AK091454, UPF0524, and LOC285382.

Gene
In humans, C3orf70 is located on the reverse strand of chromosome 3 at 3q27.2. The gene starts 184,795,838 base pairs and ends 184,870,802 base pairs from PTER, the terminus of the short arm of chromosome 3. C3orf70 spans 74,964 bases and contains two exons and two introns.
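The span quoted above follows from the two coordinates by subtraction. The following is a minimal Python sketch of that arithmetic; the function name `span_bp` and the constant names are illustrative and do not come from any annotation tool. Counting both endpoint bases (the `inclusive` option) would give one more base than the quoted figure.

```python
def span_bp(start: int, end: int, inclusive: bool = False) -> int:
    """Return the distance in bases between two coordinates on one chromosome."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Add 1 only when both endpoint bases should be counted.
    return end - start + (1 if inclusive else 0)


# Coordinates quoted in the text (base pairs from PTER on chromosome 3).
C3ORF70_START = 184_795_838
C3ORF70_END = 184_870_802

gene_span = span_bp(C3ORF70_START, C3ORF70_END)
print(f"C3orf70 spans {gene_span:,} bases")
```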

mRNA
The transcribed mRNA is 5,901 base pairs long. C3orf70 has one known splice variant with two exons of 388 base pairs and 5,512 base pairs, respectively; the exon junction occurs at 67aa[C]. A single 5’ cap and three possible 3’ polyadenylation signals have been identified.

Composition
The translated protein is 250 amino acids long. The precursor protein has a predicted molecular weight of 27.8 kDa and an isoelectric point of 4.67. With 33 serines and 8 glycines, the C3orf70 protein is both serine rich and glycine poor. The C-terminus of the C3orf70 protein also contains a negative charge cluster, resulting in a coiled region.
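To put the serine and glycine counts in proportion, the sketch below converts them to fractions of the 250-residue protein and shows how the same fraction could be computed from a sequence string. The helper `residue_fraction` and the toy peptide are hypothetical; the peptide is not the real C3orf70 sequence, and no threshold for "rich" or "poor" is implied.

```python
from collections import Counter


def residue_fraction(sequence: str, residue: str) -> float:
    """Fraction of positions in `sequence` occupied by the one-letter `residue`."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return Counter(sequence)[residue.upper()] / len(sequence)


# Counts quoted in the text for the 250-residue protein.
PROTEIN_LENGTH = 250
serine_share = 33 / PROTEIN_LENGTH
glycine_share = 8 / PROTEIN_LENGTH
print(f"serine: {serine_share:.1%}, glycine: {glycine_share:.1%}")

# Hypothetical toy peptide used only to exercise the helper.
toy_peptide = "MSSGSASDEE"
print(f"toy serine fraction: {residue_fraction(toy_peptide, 'S'):.2f}")
```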

Domains
C3orf70 protein has no known signal peptides or domains.

Homology
C3orf70 has no known paralogs in humans; however, it has conserved homologs (see Figure 3). C3orf70 is highly conserved across species, excluding invertebrates, plants, fungi, and bacteria, and shows a moderate rate of evolution (see Figures 4 and 5).

Promoter
Genomatix predicts only one promoter for the C3orf70 gene. It is located on the minus strand of chromosome 3 at 184870702-184871302 bp and is therefore 600 bp long. The promoter sequence is highly conserved among mammals.

Transcription factors
Genomatix was used to generate a table of the top 20 transcription factors and their binding sites in the C3orf70 promoter (see Figure 6).

Post-translational modifications
NetPhos predicts a total of 25 phosphorylation sites (20 serines, 3 threonines, and 2 tyrosines) distributed throughout the protein, which suggests an intracellular localization. Figure 4 pinpoints the location of the 25 phosphorylation sites. Additionally, two N-myristoylation sites were identified at amino acid positions 40-45 and 210-215, indicating possible N-terminal and C-terminal membrane anchor regions. There are also 28 known missense mutations in human C3orf70.
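Six-residue N-myristoylation sites such as those reported above are typically found by scanning a sequence for a short motif. The sketch below assumes the PROSITE-style pattern G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}, which should be checked against the PROSITE entry before use, and does not reproduce the tool actually used for this article. The function name and the toy sequence are hypothetical; the toy sequence is not the C3orf70 protein.

```python
import re

# Assumed motif: G, then any residue except EDRKHPFYW, two arbitrary residues,
# one of STAGCN, then any residue except P. The lookahead allows overlapping hits.
MYRISTOYLATION = re.compile(r"(?=(G[^EDRKHPFYW].{2}[STAGCN][^P]))")


def find_myristoylation_sites(sequence: str) -> list[tuple[int, int]]:
    """Return 1-based (start, end) positions of each six-residue motif match."""
    return [
        (m.start() + 1, m.start() + 6)
        for m in MYRISTOYLATION.finditer(sequence.upper())
    ]


# Phosphorylation site counts quoted in the text.
PHOSPHO_SITES = {"serine": 20, "threonine": 3, "tyrosine": 2}
phospho_total = sum(PHOSPHO_SITES.values())

# Hypothetical toy sequence with one motif at residues 3-8.
toy_sequence = "MAGALKSAAA"
print(find_myristoylation_sites(toy_sequence))  # [(3, 8)]
print(phospho_total)  # 25
```

Positions are reported 1-based and inclusive so that they read the same way as the ranges 40-45 and 210-215 in the text.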

Subcellular localization
PSORT II predicts that C3orf70 localizes to the nucleus. In addition, the kNN prediction from SAPS in SDSC's Biology Workbench gives the human C3orf70 protein a 60.9% likelihood of localizing to the nuclear region of a cell. Homologs in chimp, mouse, alligator, and zebrafish are likewise predicted to be nuclear, each with a likelihood above 60%. No nuclear localization site has been identified, but phosphorylation or a missense mutation of amino acid 48 (serine) would create a potential N-terminal site.

Expression
According to UniGene's EST cDNA tissue abundance display, C3orf70 is not ubiquitously expressed and has relatively low expression levels, with slightly higher expression in the brain. C3orf70 expression is also notably high in brain, spinal cord, and prostate tissue, according to microarray data profile GDS426, which shows expression of C3orf70 across normal human tissue.

Function
The function of C3orf70 is unknown. It is suggested to be a nuclear protein that plays a role in neurological development. Further avenues of research pertaining to the C3orf70 gene include:

A patent identified genes associated with midbrain dopamine neurons for engraftment by examining the differentiation of hESC and/or hiPSC into floor plate midbrain progenitor cells. C3orf70 had a fold-change of 2.45, which was not found to be significant in that experiment.

Multiple sources link the C3orf70 gene to a publication titled “Genome-wide association study of major depressive disorder”. However, the gene could not be found in the article or its supplementary text and data under the name C3orf70 or any of its aliases. A SNP was reported at 3q28, but that is millions of base pairs away from the gene of interest [3]. The link between the publication on major depressive disorder and C3orf70 has not been identified.

A microdeletion spanning 3q26.33-3q27.2 has been identified. Mandrille et al. associate this microdeletion with a possible clinical syndrome characterized by features related to brain development.