User:Drpmd08/sandbox

Chromosome 20 open reading frame 111, or C20orf111, is the hypothetical protein encoded by the C20orf111 gene. C20orf111 has many common names, including Perit1 (Peroxide inducible transcript 1), HSPC207, dJ1183I21.1, OTTHUMP00000031041, and oxidative stress responsive 1. It was originally located using genomic sequencing of chromosome 20. NCBI places it at q13.11 on chromosome 20, whereas BLAT places it at q13.12, within a million base pairs of the adenosine deaminase locus.

Gene
C20orf111 is a valid protein-coding gene found on the minus strand of chromosome 20, at q13.12 according to BLAT but at q13.11 according to NCBI.



Gene Neighborhood
C20orf111 has many genes in its neighborhood, both upstream and downstream, on the minus and plus strands of the chromosome. A few of the known genes near C20orf111 are given in the box below with their known functions.

General Properties

 * Genomic DNA Length: 14,968 base pairs (bp)
 * mRNA Length: 2,260 bp with 4 exons.
 * 5' untranslated region 252 bp long.
 * 3' untranslated region 1,129 bp long.
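
The lengths listed above imply a coding-sequence length by simple subtraction. The following is a minimal sketch of that arithmetic, assuming that the mRNA and both untranslated-region figures describe the same transcript and that the coding sequence ends in a single stop codon; the function and variable names are illustrative only, and the derived values are not stated by the sources cited here.

```python
# Lengths in base pairs, taken from the property list above.
MRNA_LENGTH = 2260
UTR5_LENGTH = 252
UTR3_LENGTH = 1129


def coding_sequence_length(mrna, utr5, utr3):
    """Return the coding-sequence length: the mRNA minus both UTRs."""
    cds = mrna - utr5 - utr3
    if cds <= 0:
        raise ValueError("UTR lengths exceed the mRNA length")
    return cds


def protein_length(cds):
    """Return the residue count, assuming the CDS includes one stop codon."""
    if cds % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    return cds // 3 - 1


cds_bp = coding_sequence_length(MRNA_LENGTH, UTR5_LENGTH, UTR3_LENGTH)
residues = protein_length(cds_bp)
print(cds_bp, residues)
```

Under these assumptions the coding sequence is 879 bp, or 293 codons including the stop codon, which would correspond to a 292-residue protein.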

Transcript Variants


According to AceView, the gene has 10 splice isoforms that encode good proteins, giving 8 different protein isoforms altogether, 2 of which are complete. The image below, also from AceView, shows the 10 predicted isoforms.

Transcription Regulation
The predicted promoter sequence given by Genomatix contains no RNA polymerase II binding sites; however, it does contain a binding site for a core promoter element of TATA-less promoters. The same region of the promoter also contains a TATA-binding factor sequence, which helps position RNA polymerase II for transcription.

General Properties

 * Contains a highly conserved domain of unknown function 776 (DUF776)
 * Molecular weight 31.8 kilodaltons
 * Isoelectric point 8.57
 * Predicted to be a nuclear protein

Function
The function of C20orf111 is not well understood. The protein contains a domain of unknown function, DUF776, a large segment of which is conserved in most mammals and in amphibians such as the western clawed frog. Its expression has also been shown to increase in rat cardiomyocytes undergoing hydrogen peroxide-induced apoptosis.

Expression
According to the human EST profiles given by NCBI, normal (non-cancerous) tissue expresses C20orf111 at a level of 82 transcripts per million. An article published in Physiological Genomics showed that Perit1 expression is increased in cardiac myocytes undergoing H2O2-induced apoptosis, suggesting a role in cell death. In some cancer cells, such as breast cancer and leukemia cells, expression is higher than in normal tissue. In prostate cancer, pancreatic cancer, and lung cancer cells, however, Perit1 expression is lower than in normal tissue.



Homology
The C20orf111 gene has no true paralogs in the human genome. However, it has many orthologs in other organisms: it is highly conserved in organisms such as Xenopus tropicalis and semi-conserved at the C-terminus in Trichoplax adhaerens.

The following table presents some of the orthologs found using BLAST and BLAT searches. The list is not complete, but it shows the conservation of the Perit1 protein throughout evolutionary history.

Predicted Post-Translational Modification


Various tools at ExPASy predict the following possible post-translational modifications for Perit1:
 * Predicted propeptide cleavage site between positions R81 and S82
 * Predicted sulfation site at Y237
 * 30 predicted serine phosphorylation sites
 * 5 predicted threonine phosphorylation sites
 * 3 predicted tyrosine phosphorylation sites

Predicted Secondary Structure
PELE (Protein Secondary Structure Prediction) was used to predict the secondary structure of C20orf111. No regions are rich in either β-sheet or α-helix, but many random coils are predicted. This is shown in the C20orf111 image above.