User:Rufus22181496/sandbox

C22orf23 (Chromosome 22 Open Reading Frame 23) is a protein which in humans is encoded by the C22orf23 gene.

Size and Locus
C22orf23 is a gene found in Homo sapiens. It is located on the minus strand of chromosome 22 at map position 22q13.1 and spans 10,620 base pairs. Its mRNA transcript is 1,988 base pairs long and has 7 exons.

Common Aliases
Its aliases are UPF0193 Protein EVG1, DJ1039K5.6, EVG1, FLJ32787, and LOC84645.

Primary sequence
The protein encoded by the mRNA sequence is 217 amino acids in length and has a predicted molecular mass of 25 kDa. The predicted isoelectric point is 9.8. It is located in the nucleus.
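As a rough sanity check on the predicted mass, a protein's mass can be approximated from its length alone. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the source; the function name is illustrative. The estimate ignores the actual amino acid composition, so it is only expected to land near the predicted 25 kDa, not to match it.

```python
# Rough mass estimate from sequence length alone.
# Assumption: ~110 Da average mass per amino acid residue (rule of thumb).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Return an approximate protein mass in kDa for a given residue count."""
    return n_residues * avg_residue_mass_da / 1000.0


# C22orf23 protein length as stated in the text: 217 amino acids.
approx_mass_kda = estimate_mass_kda(217)
print(f"Approximate mass: {approx_mass_kda:.1f} kDa")
```

The result (about 24 kDa) is in the same range as the predicted molecular mass of 25 kDa.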

Domains and motifs
It is predicted to be an intracellular protein and does not have any predicted transmembrane domains. Given its location and lack of a transmembrane domain, the protein is likely globular.

Post-Translational Modifications
C22orf23 has many predicted post-translational modifications, including phosphorylation sites, cell attachment sequences, N-myristoylation sites, O-linked glycosylation sites, glycation sites, Ac-ASQK cleaved-acetylated sites, and SUMOylation sites. Many of the predicted phosphorylation sites were also predicted to be O-linked glycosylation sites; glycosylation at such a site could block phosphorylation, altering that domain's structure or function.

Secondary Structure
The predicted secondary structure consists of alpha helices and disordered/coil regions. The predicted model covers 28% of the amino acid sequence with 42.9% confidence.
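To put the coverage figure in concrete terms, it can be converted into an approximate residue count. This is a minimal sketch that assumes the 28% coverage refers to the full 217-residue protein described above; because the percentage is rounded, the result is only approximate. The function name is illustrative.

```python
def residues_covered(seq_len, coverage_fraction):
    """Approximate number of residues covered by a structural model."""
    return round(seq_len * coverage_fraction)


# Values from the text: 217 amino acids, 28% model coverage.
covered = residues_covered(217, 0.28)
print(f"About {covered} of 217 residues are covered by the model")
```

Under these assumptions, the model accounts for roughly 61 residues, leaving most of the sequence without a structural prediction.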

Paralogs
There are currently no known paralogs to C22orf23.

Orthologs
Orthologs can be found in most major groups of species, ranging from most similar in primates to most distant in a member of phylum Chytridiomycota. These groups include mammals, reptiles, birds, amphibians, bony fish, cartilaginous fish, invertebrates, and fungi. Orthologs may have first appeared in plants or fungi; however, it is hard to claim with certainty that the gene is present in these species.

This table lists several orthologs of C22orf23 and includes their species name, common name, taxonomic order, accession number, sequence length, sequence similarity, and evolutionary date of divergence.

Promoter
The core promoter is GXP_7541220 (-). Its coordinates are 37953445-37954669, and it is 1225 base pairs long.
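The stated promoter length follows from its coordinates if both endpoints are counted. The sketch below assumes the coordinates are 1-based and inclusive, which is not stated in the source but is consistent with the reported length; the function name is illustrative.

```python
def interval_length(start, end):
    """Length in base pairs of a closed (inclusive) genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Promoter coordinates from the text.
promoter_start, promoter_end = 37953445, 37954669
print(interval_length(promoter_start, promoter_end))  # 1225
```

With a half-open convention (end excluded), the same coordinates would give 1224 base pairs instead.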

Human Expression
Protein expression is highest in the testes; however, it is also expressed at low levels in many other tissues, including brain, kidney, stomach, skin, thyroid, urinary bladder, placenta, endometrium, esophagus, appendix, bone marrow, adipose, lung, and ovary.

Ortholog Expression
In Rattus norvegicus, the ortholog is expressed primarily in the testes, with low levels of expression in the kidneys, lungs, heart, and uterus. In Mus musculus, the ortholog is expressed primarily in the adrenal gland and testes, and is also notably expressed in the bladder, abdomen, heart, lungs, ovaries, and mammary gland.

Protein Interactions
There are several predicted protein interactions: Cyclin-D1-binding protein 1, which may regulate cell cycle progression; Vacuolar protein sorting-associated protein 28 homolog, which is a regulator of vesicular trafficking; UPF0739 protein C1orf74; and estrogen related receptor gamma. These interacting proteins were identified as having either direct interactions or physical associations, through a variety of detection methods including affinity chromatography, two-hybrid prey pooling, and two-hybrid array. C22orf23 also has predicted protein interactions with SH3 domain containing 19, EvC ciliary complex subunit 1, RIMS binding protein 3B, RIMS binding protein 3C, TSSK6-activating co-chaperone protein, V-set and immunoglobulin domain containing 8, family with sequence similarity 124 member B, small nucleolar RNA host gene 28, and transmembrane protein 200B. Evidence suggesting a functional link for these interactions comes from co-mention in PubMed.

Disease Association
C22orf23 was identified as belonging to one of two groups of pooled serum samples in a study that analyzed the difference between serum glycoproteins of hepatocellular carcinoma and those of normal serum. Deletions of parts of C22orf23 (exons 3 and 4) and of several other genes, including SOX10, have been observed in patients with peripheral demyelinating neuropathy, central demyelinating leukodystrophy, Waardenburg syndrome, and Hirschsprung disease; C22orf23 is therefore suggested to be a potential factor in these ailments. C22orf23 was also mentioned in a study of mutation profiles from ER+ breast cancer samples taken from postmenopausal patients, in which mutations affecting C22orf23 were found among those in many other genes. In a study of epigenetic alterations involved in coronary artery disease, C22orf23 was found to have altered epigenetic modifications, which could place it among novel genes involved in the disease. In a study that attempted to predict imprinted genes that may be linked to human disorders, C22orf23 was identified as a homolog of imprinted gene candidates showing linkage to schizophrenia. In another study it was listed as being a potently regulated protein in uterine leiomyoma.

Mutations
There are a total of 3340 SNPs within the 5’ and 3’ UTRs, introns, and exons, as well as in some genes near the 5’ and 3’ UTRs. Of these, 225 SNPs are within the coding sequence. Some of the SNPs occur in conserved amino acids within the coding sequence, and some have one or more types of validation. Some of the SNPs have high heterozygosity scores and are therefore present in the population at appreciable frequency.
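The two SNP counts can be combined into a single proportion. This minimal sketch assumes the 225 coding-sequence SNPs are a subset of the 3340 total, which the text implies but does not state outright; the function name is illustrative.

```python
def percent_of_total(part, total):
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# SNP counts from the text.
total_snps = 3340
coding_snps = 225
coding_percent = percent_of_total(coding_snps, total_snps)
print(f"{coding_percent:.1f}% of SNPs fall within the coding sequence")
```

Under that assumption, roughly 6.7% of the reported SNPs lie in the coding sequence, with the remainder in UTRs, introns, and neighboring regions.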