C5orf49

Chromosome 5 open reading frame 49 (C5orf49) is a protein that in humans is encoded by the C5orf49 gene. Aliases for C5orf49 include Uncharacterized Protein C5orf49 and LOC134121. C5orf49 is predicted to localize to cilia and to have ciliary functions.

Gene
C5orf49 is located on the minus strand of chromosome 5, in cytoband p15, between base pairs 7,830,378 and 7,851,151, giving it a length of 20,774 base pairs. The gene has two splice forms, one encoding a protein of 147 amino acids and the other a protein of 145 amino acids. Neighboring genes of C5orf49 include FASTKD3, MTRR, and ADCY2.
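The stated gene length follows from treating the two coordinates as a 1-based, inclusive range, so both endpoints are counted. A minimal sketch of that arithmetic, assuming the inclusive convention (the helper name `span_length` is hypothetical and used only for illustration):

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints are part of the range, hence the +1.
    return end - start + 1


# Gene coordinates on chromosome 5, as given in the text.
gene_length = span_length(7_830_378, 7_851_151)
print(gene_length)  # 20774
```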

Promoter
C5orf49 has one upstream promoter, GXP_1271072, that regulates both of the primary transcripts. GXP_1271072 is 1,396 base pairs in length, spanning from base pair 7,851,094 to base pair 7,852,489 on chromosome 5. The transcription start region for the longest transcript, which encodes the 147-amino-acid isoform, spans from base pair 7,851,148 to base pair 7,851,164 on chromosome 5.
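The coordinates above can be checked against the gene coordinates from the Gene section. The sketch below assumes 1-based, inclusive ranges; the helper names `span_length` and `overlap_length` are hypothetical. Because the gene lies on the minus strand, its 5' end is the higher coordinate, so a promoter extending to higher coordinates lies upstream of the gene.

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive coordinate range."""
    return end - start + 1


def overlap_length(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of base pairs shared by two inclusive ranges (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# Coordinates on chromosome 5, all taken from the text.
gene = (7_830_378, 7_851_151)
promoter = (7_851_094, 7_852_489)
start_region = (7_851_148, 7_851_164)

promoter_length = span_length(*promoter)
start_region_length = span_length(*start_region)
promoter_gene_overlap = overlap_length(promoter, gene)

print(promoter_length)        # 1396
print(start_region_length)    # 17
print(promoter_gene_overlap)  # 58
```

Under these assumptions the promoter overlaps the 5' end of the gene by 58 base pairs, and the transcription start region falls entirely within the promoter.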

Structure
C5orf49 is characterized by the presence of the protein domain DUF4541. Within this domain there is a conserved KLHRDDR sequence motif and a single completely conserved tyrosine (Y) residue that may be functionally important. The domain is shown on the annotated conceptual translation.

Predicted properties
The following properties of C5orf49 were predicted using bioinformatic analysis:


 * Molecular weight: 17 kDa
 * Isoelectric point: 7.0
 * Post-translational modifications: fourteen are predicted:
   * Seven phosphorylation sites at positions 8, 9, 11, 80, 100, 135, and 147
   * Five ubiquitination sites at positions 16, 39, 69, 104, and 137
   * Two acetylation sites at positions 39 and 104
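As a quick consistency check, the positions listed above can be tallied by modification type. This is only an illustrative sketch: the dictionary layout and variable names are assumptions, and the positions are copied from the list rather than from any prediction tool's output.

```python
# Predicted modification sites, keyed by modification type.
predicted_sites = {
    "phosphorylation": [8, 9, 11, 80, 100, 135, 147],
    "ubiquitination": [16, 39, 69, 104, 137],
    "acetylation": [39, 104],
}

# Total number of predicted modifications across all types.
total_predictions = sum(len(positions) for positions in predicted_sites.values())

# Positions predicted to carry both ubiquitination and acetylation.
shared_positions = sorted(
    set(predicted_sites["ubiquitination"]) & set(predicted_sites["acetylation"])
)

# Distinct residues with at least one predicted modification.
distinct_positions = sorted(
    {pos for positions in predicted_sites.values() for pos in positions}
)

print(total_predictions)        # 14
print(shared_positions)         # [39, 104]
print(len(distinct_positions))  # 12
```

The tally gives fourteen predictions on twelve distinct residues, because positions 39 and 104 appear in both the ubiquitination and acetylation lists.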

Tissue distribution
Expression data indicate that C5orf49 is most highly expressed in lung, brain, and spinal cord tissue.

Binding partners
CDKN2D, HSF2BP, KRT31, and KRT34 were identified as binding partners of C5orf49 using two-hybrid prey pooling and two-hybrid array approaches.

Species distribution
C5orf49 is conserved throughout mammals, and orthologs can be found in organisms as distant as flatworms and sea anemones. The table to the right shows a selection of orthologs found using BLAST. C5orf49 is not found in sponges, which diverged from the human lineage at a median date of 777 million years ago (MYA), whereas the lineage of its most distant ortholog diverged 736 MYA. C5orf49 therefore likely arose as a gene between 777 MYA and 736 MYA.

Evolution
Compared with cytochrome c and fibrinogen alpha, C5orf49 has evolved at neither a notably fast nor a notably slow rate. This is shown by the protein divergence graph on the right.