Chromosome 4 open reading frame 50

Chromosome 4 open reading frame 50 is a protein that in humans is encoded by the C4orf50 gene. The protein localizes to the nucleus. C4orf50 has orthologs in vertebrates but not in invertebrates.

Gene
The C4orf50 gene is located on the minus strand of chromosome 4 at position 4p16.2. The gene's longest isoform consists of 11 exons, a coding sequence of 6370 nucleotides, and an upstream in-frame stop codon. Neighboring genes include CRMP1 and JAKMIP1.

Protein
C4orf50 is 1508 amino acids long and has a molecular weight of roughly 166 kDa, as expected for a protein of this length at an average of about 110 Da per residue. The isoelectric point is approximately pH 5.6. Compared with an average protein, C4orf50 is enriched in glutamic acid and arginine and depleted in phenylalanine and tyrosine.
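Statements about enriched or depleted residues come from comparing a protein's amino acid composition with a background distribution. The sketch below illustrates that comparison only. The background fractions are rough assumed values, the toy sequence is hypothetical and is not the C4orf50 sequence, and the function names are made up for this example.

```python
from collections import Counter

# Rough, ASSUMED background fractions for four residues (illustrative only;
# these values are not taken from the article).
BACKGROUND = {"E": 0.067, "R": 0.055, "F": 0.039, "Y": 0.029}

def residue_fractions(sequence):
    """Return the fraction of each residue in a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}

def compare_to_background(sequence, background=BACKGROUND):
    """Return the observed/expected ratio for each residue in the background table."""
    observed = residue_fractions(sequence)
    return {aa: observed.get(aa, 0.0) / expected
            for aa, expected in background.items()}

# Hypothetical toy sequence, NOT the C4orf50 sequence.
toy = "MEERRAEELRKQEERLRAEAEF"
ratios = compare_to_background(toy)
for aa, ratio in sorted(ratios.items()):
    print(f"{aa}: {ratio:.2f}x background")
```

A ratio above 1 means the residue is more common than the assumed background, and a ratio below 1 means it is less common. Running the same comparison on the real sequence would require a properly sourced background table.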

Tertiary structure
I-TASSER and Phyre2 predict that C4orf50 has a tertiary structure rich in alpha helices, concentrated near the N-terminus and C-terminus.

Expression
C4orf50 RNA is expressed ubiquitously but at low levels in most tissue types. Expression is much higher in the brain, testis, adrenal gland, and prostate. Within the brain, C4orf50 is expressed in specific regions including the hippocampus and striatum, with moderate expression in the frontal lobe, parietal lobe, and amygdala. All available RNA-sequencing data show C4orf50 expression in the brain.

Modification
It is predicted that C4orf50 has 21 phosphorylation sites, one sulfonation site, one N-glycosylation site, and several O-glycosylation sites.
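The article does not say which prediction tools were used. As an illustration of how one of these site types can be located, the sketch below scans a sequence for the canonical N-linked glycosylation sequon, Asn-X-Ser/Thr where X is any residue except proline. The toy sequence is hypothetical and is not the C4orf50 sequence, and a motif match is only a candidate site, not evidence of glycosylation.

```python
import re

# Canonical N-linked glycosylation sequon: Asn, any residue except Pro,
# then Ser or Thr. The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(sequence):
    """Return (1-based position, motif) for each N-X-[S/T] sequon with X != Pro."""
    return [(m.start() + 1, m.group(1))
            for m in SEQUON.finditer(sequence.upper())]

# Hypothetical toy sequence, NOT the C4orf50 sequence.
toy = "MKNGSAANPTQNNTS"
print(find_sequons(toy))  # [(3, 'NGS'), (12, 'NNT'), (13, 'NTS')]
```

The `NPT` stretch at position 8 is skipped because proline in the middle position is excluded from the sequon.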

Subcellular localization
The primary subcellular location is the nucleus. Immunofluorescent staining with antibodies against C4orf50 shows that the protein is present in the nucleus, but the reason for this localization remains unknown. C4orf50 is less abundant than most human proteins.

Evolution
Orthologs
Human C4orf50 is poorly conserved among its orthologs. It is found in vertebrates but not in invertebrates, with orthologs in mammals, reptiles, birds, amphibians, and fish (Table 1). C4orf50 is evolving quickly relative to the reference proteins cytochrome c and fibrinogen alpha, as shown by comparing the divergence rates of the three proteins.

*MYA = Million Years Ago
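The article does not state how the divergence rates were calculated. One common approach, sketched here as an assumption rather than as the method actually used, is to convert the percent identity between two orthologs into a Poisson-corrected divergence and divide by the time since the lineages split. The identities and dates below are hypothetical and are not taken from Table 1.

```python
import math

def corrected_divergence(percent_identity):
    """Poisson-corrected divergence (substitutions per 100 sites) from percent identity."""
    if not 0 < percent_identity <= 100:
        raise ValueError("percent identity must be in (0, 100]")
    return -math.log(percent_identity / 100.0) * 100.0

def divergence_rate(percent_identity, mya):
    """Corrected divergence per million years since the lineages split."""
    if mya <= 0:
        raise ValueError("divergence date must be positive")
    return corrected_divergence(percent_identity) / mya

# HYPOTHETICAL (percent identity, divergence date in MYA) pairs, not from Table 1.
examples = {
    "hypothetical mammal": (80.0, 90.0),
    "hypothetical fish": (35.0, 430.0),
}
for name, (identity, mya) in examples.items():
    print(name,
          round(corrected_divergence(identity), 1),
          round(divergence_rate(identity, mya), 3))
```

Plotting corrected divergence against divergence date for several proteins gives one line per protein, and a steeper slope indicates faster evolution.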