User:Danica00/sandbox

= LOC100287387 =

LOC100287387 is a protein that in humans is encoded by the gene LOC100287387. The function of the protein is not yet understood. The gene is located on the q arm of chromosome 2 and has low expression in many tissues.



Gene
The human LOC100287387 gene is located on the minus strand of the q arm of chromosome 2 at 2q37.3. It overlaps the TWIST2 gene family on the plus strand of chromosome 2. The gene consists of three exons separated by two introns, both of which lie near the start codon.

mRNA
No alternatively spliced isoforms of the LOC100287387 transcript are known.

Protein Structure
The LOC100287387 protein is a 423-amino-acid polypeptide with a molecular mass of 44.4 kDa and an isoelectric point of 10.77. The sequence contains a G-patch domain and a short domain of unknown function (DUF308). Many modification sites are predicted within the amino acid sequence, including cAMP-dependent (CampP), casein kinase 2 (CK2), and protein kinase C (PKC) phosphorylation sites, O-linked beta-N-acetylglucosamine sites, and a sumoylation site. The predicted secondary structure includes 8 short alpha helices (15.6% of the protein) and 14 short extended strands (12.1%), with the remainder (72%) predicted as random coil.

Regulation
The promoter region of the LOC100287387 gene contains binding sites for many transcription factors that affect the gene's transcription level. These include three TFIIB binding sites (involved in transcription initiation), a cysteine-serine-rich nuclear protein 1 site (activator), a Kruppel-like zinc finger protein 219 site (repressor), and a stimulating protein 1 site (activator), among others.

Expression
In humans, LOC100287387 is expressed at low levels in all tissues. Expression is highest in the skin and in nervous system tissues such as the pons, superior cervical ganglion, trigeminal ganglion, and globus pallidus. However, expression was inconsistent among patients.

Homology
Orthologs of the human LOC100287387 gene are found only in mammals, and the protein sequence is not highly conserved. Conservation is highest in primates and falls drastically among other mammals. Across species, conservation is highest at the nuclear localization signal and toward the end of the coding sequence, at the G-patch domain and DUF308, which suggests that these are the most functionally important parts of the sequence.

There are no paralogs of the human gene LOC100287387.

Function
The protein contains a nuclear localization signal, so it most likely acts in the nucleus. There are no confirmed protein interactions or associations with diseases.