User:Kunjpatel1995/sandbox

C15orf52 is a protein which in humans is encoded by the C15orf52 gene.

Gene
C15orf52 is located on the reverse strand of chromosome 15 in Homo sapiens. The gene spans 9,516 base pairs, including introns and exons. It contains 12 distinct introns and 11 exons and produces 7 different mRNAs: 6 alternatively spliced variants and 1 unspliced form.

mRNA
The linear mRNA is 5,344 nucleotides long. It contains a short 5’ untranslated region of 15 nucleotides and a long 3’ untranslated region of 3,782 nucleotides.

General Properties
C15orf52 is a 534-amino-acid protein with a molecular weight of 57.325 kDa. It contains a domain of unknown function, DUF4594, spanning amino acids 185 to 350.

Primary
The primary amino acid sequence of the C15orf52 protein is:

MISCAEQRSRQGEAGRGPAPVAPAFLPLWLPRGCSGILSVPAVAMHSAGTPRAESPMSRQEKDAELDRRI VALRKKNQALLRRYQEIQEDRRQAEQGGMAVTTPALLQPDGLTVTISQVPGEKRVVSRNWARGTCGPRVT NEMLEDEDAEDHGGTFCLGELVELAVTMENKAEGKRIVSEKPTRARNQGIEGSPGGRVTRSPPTQVAISS DSARKGSWEPWSRPVGEPPEAGWDYAQWKQEREQIDLARLARHRDAQGDWRRPWDLDKAKSTLQDCSQLR GEGPARAGSRRGPRSHQKLQPPPLLPDGKGRGGQASRPSVAPATGSKARGKERLTGRARRWDMKEDKEEL EGQEGSQSTRETPSEEEQAQKQSGMEQGRLGSAPAASPALASPEGPKGESVASTASSVPCSPQEPDLAPL DLSLGGAGIPGPRESGCVLGLRPGAQESPVSWPEGSKQQPLGWSNHQAELEVQTCPEPQRGAGLPEPGED RSGKSGAQQGLAPRSRPTRGGSQRSRGTAGVRRRTGRPGPAGRC

Tertiary
C15orf52 has a coiled-coil domain spanning amino acids 60 to 97.

Composition
Comparison of the amino acid composition of C15orf52 with that of other Homo sapiens proteins shows that several amino acids occur at unusual frequencies. Phenylalanine, tyrosine, and asparagine are found at lower frequencies than in other proteins, while glycine and arginine are found at higher frequencies. The isoelectric point of the protein is 9.457, indicating that it is a basic protein at the normal physiological pH of 7.4.
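As an illustration of how such a composition is tallied, the following is a minimal Python sketch that counts residues in the sequence listed under Primary and reports each as a percentage of the 534 residues. The names SEQUENCE and composition are chosen here for illustration only, and the sketch does not reproduce the comparison against other human proteins, which would require reference frequencies not given in this article.

```python
from collections import Counter

# Primary sequence of C15orf52 as listed in this article, in blocks of 70.
SEQUENCE = (
    "MISCAEQRSRQGEAGRGPAPVAPAFLPLWLPRGCSGILSVPAVAMHSAGTPRAESPMSRQEKDAELDRRI"
    "VALRKKNQALLRRYQEIQEDRRQAEQGGMAVTTPALLQPDGLTVTISQVPGEKRVVSRNWARGTCGPRVT"
    "NEMLEDEDAEDHGGTFCLGELVELAVTMENKAEGKRIVSEKPTRARNQGIEGSPGGRVTRSPPTQVAISS"
    "DSARKGSWEPWSRPVGEPPEAGWDYAQWKQEREQIDLARLARHRDAQGDWRRPWDLDKAKSTLQDCSQLR"
    "GEGPARAGSRRGPRSHQKLQPPPLLPDGKGRGGQASRPSVAPATGSKARGKERLTGRARRWDMKEDKEEL"
    "EGQEGSQSTRETPSEEEQAQKQSGMEQGRLGSAPAASPALASPEGPKGESVASTASSVPCSPQEPDLAPL"
    "DLSLGGAGIPGPRESGCVLGLRPGAQESPVSWPEGSKQQPLGWSNHQAELEVQTCPEPQRGAGLPEPGED"
    "RSGKSGAQQGLAPRSRPTRGGSQRSRGTAGVRRRTGRPGPAGRC"
)


def composition(seq):
    """Return a dict mapping each residue present to its percentage of seq."""
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in counts.items()}


if __name__ == "__main__":
    comp = composition(SEQUENCE)
    print(f"length: {len(SEQUENCE)} residues")
    # Residues singled out in the text: F, Y, N (low) and G, R (high).
    for aa in "FYNGR":
        print(f"{aa}: {comp.get(aa, 0.0):.1f}%")
```

Judging whether a percentage is high or low still depends on the reference set used for comparison, so the output of this sketch is only the raw composition of the protein itself.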

Subcellular Localization
No transmembrane sequences are detected in the C15orf52 protein, which is also predicted to be a non-cytoplasmic soluble protein.

Post-Translational Modifications
Phosphorylation of the protein has been experimentally observed at two serine residues, S201 and S392.

Interacting Proteins
Two proteins, THO complex subunit 1 (THOC1) and THO complex subunit 7 (THOC7), were found to interact with C15orf52 using anti-tag coimmunoprecipitation. THOC1 is a component of the THO subcomplex of the TREX complex, which is thought to couple mRNA transcription, processing, and nuclear export. It is also involved in an apoptotic pathway characterized by activation of caspase-6. THOC7 is part of the same subcomplex and is required for efficient export of polyadenylated RNA. Ring finger protein 2 (RNF2) and SUZ12 polycomb repressive complex 2 subunit (SUZ12) were also indicated as interacting proteins. RNF2 belongs to the polycomb group of proteins, which are important for transcriptional repression of various genes, and it also possesses ubiquitin ligase activity. SUZ12 is also a polycomb group protein and is part of a complex that methylates lysine residues of histones and is involved in gene repression.

Paralogs
There are no known paralogs for the C15orf52 gene.

Expression
The tissue origin of C15orf52 cDNAs shows that the gene is expressed in numerous locations, such as primary and secondary digestive organs (pancreas, stomach, liver, etc.), the nervous system (brain, retina, lens), skin, reproductive organs, bones, and many other tissues, suggesting a fairly nonspecialized function. However, C15orf52 protein is relatively overexpressed in the colon, peripheral blood mononuclear cells, testis, and rectum. Application of RNA-seq to plasma extracellular RNA profiles indicated C15orf52 as the most abundant mRNA present, possibly indicating some role outside of the cell. In mice, the expression pattern of C15orf52, as well as that of TCEA3 and FHOD3, two other genes studied, was found to be similar to that of well-characterized genes known to be associated with heart development, such as BVES and CXCL12. However, C15orf52 was not detected before embryological day 9.5 in the tail area, and its exact function is not yet known.