User:Penal007/sandbox

Chromosome 12 open reading frame 42 (C12orf42) is a protein-encoding gene in Homo sapiens.

Locus
The gene starts at 103,237,591 bp and ends at 103,496,010 bp on chromosome 12. The cytogenetic location of C12orf42 is 12q23.2. It is located on the negative strand.
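The coordinates above can be turned into a gene span as a quick sanity check. This is a minimal sketch that assumes the coordinates are 1-based and inclusive (the text does not state the convention); the function and variable names are illustrative only.

```python
# Coordinates quoted in the text (chromosome 12, negative strand).
GENE_START = 103_237_591
GENE_END = 103_496_010


def span_length(start: int, end: int) -> int:
    """Length of an interval, ASSUMING 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


gene_span = span_length(GENE_START, GENE_END)
print(f"C12orf42 spans {gene_span:,} bp")
```

Under a 0-based, half-open convention the result would be one base pair shorter, so the convention should be confirmed against the source database before the number is reused.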

mRNA
Transcription of this gene produces fifteen different mRNAs: fourteen alternatively spliced variants and one unspliced form.

Protein
The protein encoded by this gene is known as uncharacterized protein C12orf42. Alternative splicing produces three isoforms of this protein. The first isoform is the canonical sequence. The second isoform differs from the canonical sequence by lacking residues 1-95. The third isoform differs from the canonical sequence in two ways:

87-107 aa: VFPERTQNSMACKRLLHTCQY → GSHHGQATQKLQGAMVLHLEE

108-360 aa: Missing
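The isoform differences listed above can be expressed as simple edits to the canonical sequence. The sketch below is illustrative only: the full canonical sequence is not given here, so a placeholder of 'X' residues is used, with only residues 87-107 filled in from the text. The length of 360 aa is an assumption inferred from the "108-360 aa: Missing" range, and all function names are hypothetical.

```python
CANONICAL_LENGTH = 360  # ASSUMPTION: inferred from the "108-360 aa" range
OLD_SEGMENT = "VFPERTQNSMACKRLLHTCQY"  # canonical residues 87-107 (from the text)
NEW_SEGMENT = "GSHHGQATQKLQGAMVLHLEE"  # isoform 3 residues 87-107 (from the text)


def placeholder_canonical() -> str:
    """Stand-in sequence: 'X' everywhere except the known residues 87-107."""
    residues = ["X"] * CANONICAL_LENGTH
    residues[86:107] = OLD_SEGMENT  # 1-based 87-107 is the 0-based slice 86:107
    return "".join(residues)


def isoform_2(canonical: str) -> str:
    """Isoform 2: residues 1-95 are missing."""
    return canonical[95:]


def isoform_3(canonical: str) -> str:
    """Isoform 3: residues 87-107 replaced, residues 108-360 missing."""
    if canonical[86:107] != OLD_SEGMENT:
        raise ValueError("canonical residues 87-107 do not match the expected segment")
    return canonical[:86] + NEW_SEGMENT


canonical = placeholder_canonical()
iso2 = isoform_2(canonical)
iso3 = isoform_3(canonical)
print(len(canonical), len(iso2), len(iso3))
```

With a real canonical sequence substituted for the placeholder, the same slicing yields the actual isoform sequences.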

Secondary Structure
The C12orf42 protein contains several types of secondary structure: alpha helices, beta sheets, and random coils. It is a soluble protein. In soluble proteins, the tertiary structure places hydrophilic amino acids on the outer surface and hydrophobic side chains in the interior. Hydrophilic proteins are able to float freely inside a cell because the cytosol is aqueous.

Subcellular Location
C12orf42 is an intracellular protein, as indicated by its lack of transmembrane domains and signal peptides. It is predicted to be a nuclear protein because it contains nuclear localization signals (NLS): PRDRRPQ at 292 aa and the bipartite signal KRLIKVCSSAPPRPTRR at 325 aa.
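The two motifs quoted above can be checked against a simple pattern. The sketch below uses one commonly cited rule of thumb for bipartite signals (two basic residues, a ten-residue spacer, then at least three basic residues among the next five). That rule is an assumption chosen for illustration and is not necessarily the prediction method behind the result above. The code also assumes that the quoted positions mark the first residue of each motif; all names are hypothetical.

```python
BASIC = set("KR")  # lysine and arginine

MONOPARTITE = "PRDRRPQ"            # reported at 292 aa
BIPARTITE = "KRLIKVCSSAPPRPTRR"    # reported at 325 aa


def looks_bipartite(motif: str) -> bool:
    """Rule of thumb (ASSUMED): 2 basic residues, a 10-residue spacer,
    then at least 3 basic residues among the following 5."""
    if len(motif) < 17:
        return False
    head, tail = motif[:2], motif[12:17]
    return all(r in BASIC for r in head) and sum(r in BASIC for r in tail) >= 3


def end_position(start: int, motif: str) -> int:
    """Last residue covered, ASSUMING `start` is the motif's first residue (1-based)."""
    return start + len(motif) - 1


print(looks_bipartite(BIPARTITE), looks_bipartite(MONOPARTITE))
print(end_position(292, MONOPARTITE), end_position(325, BIPARTITE))
```

Under these assumptions both motifs fall within the 108-360 aa region that is missing from the third isoform.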

Post-translational Modification
Predicted post-translational modification sites are shown in the table below. Nuclear proteins are known to undergo phosphorylation, acetylation, sumoylation, and O-GlcNAcylation. Phosphorylation affects protein-protein interactions and protein stability. Acetylation promotes protein folding and improves stability. Sumoylation is involved in nuclear-cytosolic transport and DNA repair. Glycosylation in the nucleus takes the form of O-GlcNAc, which functions in protein folding and stability.

Tissue Profiles
Microarray data show expression of the C12orf42 gene in different tissues throughout the human body. Expression is high in the lymph node, spleen, and thymus. The gene is also expressed in the brain, bladder, epididymis, and helper T cells. The C12orf42 gene therefore shows statistically significant expression in the nervous system, immune system, and male reproductive system.

In Situ Hybridization
The table below shows the areas of the mouse brain where the gene is expressed. The mouse gene is named 1700113H08Rik; it is the mouse homolog of human C12orf42.

Paralog
The C12orf42 gene has only one other member in its gene family: neuroligin 4, Y-linked (NLGN4Y).

Orthologs
C12orf42 orthologs are found mostly in mammals. One exception is Pelodiscus sinensis, commonly known as the Chinese softshell turtle.

Conserved Domain Structure
The most important domain is DUF4607, which is conserved in the clade Eutheria of the class Mammalia. It is conserved in the following orders: Artiodactyla, Carnivora, Chiroptera, Lagomorpha, Perissodactyla, Primates, Proboscidea, and Rodentia.

Clinical Significance
In one experiment, fine-tiling comparative genomic hybridization (FT-CGH) and ligation-mediated PCR (LM-PCR) were combined, which revealed a chromosomal translocation t(12;14)(q23;q11.2) in T-lymphoblastic lymphoma (T-LBL). The translocation occurs during T-cell receptor delta gene-deleting rearrangement, which is important in T-cell differentiation. It disrupts C12orf42 and brings the gene ASCL1 closer to the T-cell receptor alpha (TRA) enhancer. As a result, the cross-fused gene encodes vital transcription factors that are found in medullary thyroid cancer and small-cell lung cancer.