UPF0488

UPF0488 is a protein that in humans is encoded by the C8orf33 (chromosome 8 open reading frame 33) gene. The function of this protein-coding gene is currently unknown.

Tissue and subcellular distribution
The UPF0488 protein is expressed at low to moderate levels in most tissues, with some exceptions. It is predicted to localize to the nucleus and mitochondrion, though several orthologs are also predicted to localize to the cytosol; additionally, there is experimental evidence that human C8orf33 may localize to the peroxisomes. Expression of the gene is up-regulated after lithium exposure, and C8orf33 is significantly up-regulated during breast cancer drug treatment.

Post-translational modification
Several post-translational modifications, including phosphorylation, methylation, and acetylation, are predicted. The corresponding modified residues include N-acetylalanine, omega-N-methylarginine, and phosphoserine.

Gene
This gene has 5 transcripts (splice variants) and 62 orthologues, and is a member of 1 Ensembl protein family. It is a member of the Human CCDS set (CCDS34974.1). The C8orf33 expression profile revealed that the gene was over-expressed after lithium exposure.

C8orf33 (UPF0488) has 31 alternatively spliced exons, which combine into 13 different transcript variants; the X1 variant is the longest and seems to have the greatest identity. Human tissue RNA sequencing of UPF0488.

Transcript
UPF0488 has 5 transcripts (splice variants). In terms of common gene haplotype alleles, the haplotype frequency is 96.3% for one variant site. The primary transcript is 3,593 bp, while a similar variant is 1,666 bp. The predicted mRNA secondary structures of the 5’ and 3’ UTRs have different fold energies. The 5’ UTR consists of 54 bases and has a fold energy of -21.20, which corresponds to -0.393 per base. The 3’ UTR consists of 1873 bases and has a fold energy of -646.10, which corresponds to -0.345 per base.
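The per-base energies quoted above follow from dividing each fold energy by the number of bases in the region. A minimal sketch of that arithmetic is shown here; the function and variable names are illustrative and do not come from any folding tool.

```python
def energy_per_base(fold_energy, length):
    """Return the fold energy divided by the number of bases in the region."""
    return fold_energy / length


# (fold energy, number of bases) as quoted in the text for each UTR.
utrs = {
    "5' UTR": (-21.20, 54),
    "3' UTR": (-646.10, 1873),
}

# Round to three decimals, matching the precision used in the text.
per_base = {
    name: round(energy_per_base(energy, length), 3)
    for name, (energy, length) in utrs.items()
}

for name, value in per_base.items():
    print(f"{name}: {value} per base")
```

Normalizing by length makes the two regions comparable: although the 3’ UTR has a far lower total fold energy, its energy per base is slightly less negative than that of the 5’ UTR.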

Expression
According to microarray-assessed tissue expression analysis by NCBI GEO, the gene C8orf33 has average expression levels in most tissues save including thyroid gland and parathyroid gland. Expression seems to be low in the pancreas, small intestine and other digestive organs, while it seems relatively higher in the kidney.

Approximate expression patterns have also been inferred from EST sources. In the Norway rat, the putative protein-coding gene is represented by 30 ESTs from 20 cDNA libraries, with EST representation biased toward the fetus. Gene expression seems to increase in the obesity-resistant categories.

Promoter
The promoter region for C8orf33 covers 1191 base pairs of DNA and contains over 700 potential transcription factor binding sites. Fifteen transcription factors with binding sites that are highly conserved across the C8orf33 promoter regions of multiple species were selected and shown (see Annotated Promoter Section). CDF1 (Cycling DOF Factor 1) physically interacts with FKF1, and the CDF1 protein is more stable in FKF1 mutants. Another transcription factor, transcription factor II B (TFIIB), is a general transcription factor that is involved in the formation of the RNA polymerase II preinitiation complex (PIC).

Protein
The isoelectric point (pI) of the UPF0488 protein is 9.16, based on a detailed analysis of isoelectric point according to different scales for individual proteins. This is greater than the average pI of 6.81 for the human proteome. The net charge was determined using the values available from Lehninger's biochemistry textbook. The precursor protein has a molecular weight of approximately 24.9925 kDa. It contains repeats at positions 149 to 166 and 167 to 186; however, the repeats show a high degree of degeneracy.
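As an illustration of how a net charge and an isoelectric point can be estimated from a sequence, the sketch below sums Henderson-Hasselbalch charge fractions over the ionizable groups and bisects for the pH at which the net charge is zero. It is not the analysis used for the figures above: the pKa values are approximate textbook-style numbers assumed for illustration, the terminal pKa values are simplified to single constants, and the sequences are hypothetical toy peptides, not UPF0488.

```python
# Approximate pKa values, assumed for illustration only.
PKA_POSITIVE = {"K": 10.53, "R": 12.48, "H": 6.00}              # protonated form carries +1
PKA_NEGATIVE = {"D": 3.65, "E": 4.25, "C": 8.18, "Y": 10.07}    # deprotonated form carries -1
PKA_N_TERM = 9.69   # simplified constant for the N-terminal amino group
PKA_C_TERM = 2.34   # simplified constant for the C-terminal carboxyl group


def net_charge(sequence, ph):
    """Estimate the net charge of a peptide at a given pH."""
    # Terminal groups: protonated amine (+) and deprotonated carboxyl (-).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in sequence:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence, tol=1e-4):
    """Bisect on pH in [0, 14] for the point where the net charge crosses zero."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid    # still positively charged: pI lies at higher pH
        else:
            high = mid   # negatively charged or neutral: pI lies at lower pH
    return (low + high) / 2


basic_toy = "MAARKRLLPA"    # hypothetical peptide rich in arginine and lysine
acidic_toy = "MADDEELLPA"   # hypothetical peptide rich in aspartate and glutamate
print(isoelectric_point(basic_toy), isoelectric_point(acidic_toy))
```

Because the net charge decreases monotonically with pH, bisection is sufficient. A sequence with more basic than acidic residues, as in the first toy peptide, yields a pI above 7.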

UPF0488 is an alanine-rich protein relative to other proteins and is low in all other amino acids besides arginine, leucine, and proline.

Homology and evolution
The evolutionary lineage of UPF0488 can be traced back as far as invertebrates, with a rate of evolution greater than that of fibrinogen.

Graph shows divergence of UPF0488 in a given time scale compared to fibrinogen and cytochrome c. An alignment using the SDSC Biology Workbench gives a 27.7% match with Danio rerio. The ALIGN program calculates a global alignment of two sequences, giving a global alignment score of 215.
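For readers unfamiliar with global alignment, the idea can be sketched with the classic Needleman-Wunsch dynamic program, which scores the best end-to-end alignment of two sequences. The version below is a simplified illustration with an assumed scoring scheme (match +1, mismatch -1, linear gap -2); it is not the scoring used by ALIGN and would not reproduce the score of 215 reported above.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Uses the Needleman-Wunsch recurrence with a linear gap penalty and
    keeps only two rows of the score matrix in memory.
    """
    # Row 0: aligning an empty prefix of a against each prefix of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a aligned against nothing
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


# Toy nucleotide strings, used only to exercise the function.
print(global_alignment_score("GATTACA", "GATTACA"))  # 7: seven matches
print(global_alignment_score("GATTACA", "GATACA"))   # 4: six matches, one gap
```

Production tools typically use substitution matrices and affine gap penalties for protein sequences, so absolute scores depend heavily on the chosen scheme.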

The mRNA of UPF0488 has a very high level of degeneracy across organisms. Sequences with even very low identity to the human mRNA could only be identified in closely related organisms. However, the protein has far more distant relatives, including several invertebrates. Protein alignments for Homo sapiens UPF0488 were performed using the SDSC Biology Workbench against several different taxa, including vertebrates such as Mammalia, Reptilia, and Aves, and invertebrates such as Insecta. The protein sequences for UPF0488 are very highly conserved among close relatives of Homo sapiens such as Gorilla gorilla gorilla (gorilla). The similarity in protein sequence is inversely related to divergence time (MYA) (table of homologs).

Function
C8orf33 activity was found to be associated with the G protein-coupled receptor signaling pathway, neuroactive ligand-receptor interaction, the calcium signaling pathway, and the regulation of the actin cytoskeleton. The following substances interact with UPF0488: 7,8-dihydro-7,8-dihydroxybenzo(a)pyrene 9,10-oxide, benzo(a)pyrene, methotrexate, and vitamin E.

Pathology
The expression of the UPF0488 gene increases after treatment with cephaloridine, a semisynthetic derivative of cephalosporin C that inhibits gluconeogenesis in both target (kidney) and non-target (liver) organs.