C1orf94

Chromosome 1 open reading frame 94, or C1orf94, is a protein that in humans is encoded by the C1orf94 gene. The function of this protein is still poorly understood.

Gene
The C1orf94 gene is also known by the identifiers Q6P1W5, B3KVT1, D3DPR3, E9PJ76, Q96IC8, and MGC15882.

FLJ20508 is another alias of C1orf94.

Locus
C1orf94 is located on the short arm of chromosome 1 at 1p34.3 (chr1:34,166,883-34,219,131), near the HSPD1P14 gene. It is encoded on the sense strand.
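
The length of the locus follows directly from the coordinates given above. The following is a minimal sketch of that arithmetic, assuming the interval is 1-based and inclusive (the source does not state the convention); `parse_locus` and `locus_length` are hypothetical helper names used only for illustration.

```python
def parse_locus(locus: str) -> tuple[str, int, int]:
    """Split a string like 'chr1:34,166,883-34,219,131' into (chromosome, start, end)."""
    chrom, interval = locus.split(":")
    start_text, end_text = interval.split("-")
    # Remove thousands separators before converting to integers.
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    return chrom, start, end


def locus_length(start: int, end: int) -> int:
    """Length in base pairs, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


chrom, start, end = parse_locus("chr1:34,166,883-34,219,131")
span = locus_length(start, end)
print(f"{chrom}: {span:,} bp")
```

Under that assumption the gene spans about 52 kb of genomic sequence.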

The gene has 7 exons, of which only 6 are coding.

mRNA
The protein has two isoforms, a and b; isoform a is the longer of the two (598 aa).

Transcription
Two promoters are predicted for C1orf94, only one of which is predicted for the transcript used in the analysis. Transcription factors predicted to have binding sites in the promoter include:

ZF02 (C2H2 zinc finger transcription factors 2)

Cart1 Sequence-specific DNA-binding transcription factor

HTLV-I U5 repressive element-binding protein 1

NKX homeodomain factors

AARE binding factors

PREB core-binding element

Protein
DUF4688, a domain of unknown function, covers a large region of the C1orf94 protein sequence and is present in both isoforms a and b. This domain is conserved in eukaryotes.

C1orf94 is a protein tissue co-expression partner of RBBP8NL. Its isoelectric point is 8.56 and its molecular weight is about 65.4 kDa (65,353 Da). Proline is the most abundant amino acid in the protein sequence (11.7%), followed closely by leucine (10.4%).
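
Composition figures like those above can be reproduced by counting residues in a sequence. The following is a minimal sketch: the toy sequence is made up for illustration and is not the C1orf94 sequence, `composition` and `approx_mass_kda` are hypothetical helper names, and the mass estimate uses the common rule of thumb of roughly 110 Da per residue rather than exact residue masses.

```python
from collections import Counter


def composition(seq: str) -> dict[str, float]:
    """Percentage of each amino acid, ordered from most to least abundant."""
    seq = seq.upper()
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in counts.most_common()}


def approx_mass_kda(length: int, avg_residue_da: float = 110.0) -> float:
    """Rough protein mass in kDa, ASSUMING an average residue mass of ~110 Da."""
    return length * avg_residue_da / 1000.0


# Made-up toy sequence, NOT the real C1orf94 sequence.
toy = "MPPLLSPEPTLKAPW"
for aa, pct in composition(toy).items():
    print(f"{aa}: {pct:.1f}%")

# Order-of-magnitude check for a 598-residue protein.
print(f"{approx_mass_kda(598):.1f} kDa")
```

With the rule of thumb, a 598-residue protein comes out near 66 kDa, which agrees with a reported mass of 65,353 Da.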

Seven candidate PEST motifs, regions rich in proline (P), glutamic acid (E), serine (S), and threonine (T), were identified between positions 1 and 598. Only one of them was predicted to be a potential PEST motif: a stretch of 21 amino acids between positions 133 and 155. PEST sequences are associated with proteins that have a short intracellular half-life.
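
The first step of a PEST search, finding candidate regions, can be sketched in a few lines. This is a simplified illustration and not the scoring method of any specific tool. It assumes the commonly described criteria: a stretch of at least 12 residues with no positively charged residues (K, R, H), flanked on both sides by one, and containing at least one P, one D or E, and one S or T. The function name and the toy sequence are hypothetical.

```python
import re


def pest_candidates(seq: str, min_len: int = 12) -> list[tuple[int, int, str]]:
    """Return (left_flank_pos, right_flank_pos, stretch) with 1-based flank positions.

    Simplified candidate filter only; real PEST tools also score each
    candidate, which this sketch does not attempt.
    """
    seq = seq.upper()
    hits = []
    # Each match is a maximal run of residues that are not K, R, or H.
    for m in re.finditer(r"[^KRH]+", seq):
        stretch = m.group()
        if len(stretch) < min_len:
            continue
        # Require a positive residue on both sides (simplifying assumption:
        # stretches touching either end of the sequence are skipped).
        if m.start() == 0 or m.end() == len(seq):
            continue
        has_p = "P" in stretch
        has_acid = bool(set("DE") & set(stretch))
        has_st = bool(set("ST") & set(stretch))
        if has_p and has_acid and has_st:
            hits.append((m.start(), m.end() + 1, stretch))
    return hits


# Made-up toy sequence, NOT the real C1orf94 sequence.
toy = "MAK" + "SPEEDTPSSPEDT" + "R" + "G" + "L" * 13 + "KG"
for left, right, stretch in pest_candidates(toy):
    print(left, right, len(stretch), stretch)
```

The number of residues strictly between the two flanking positions is `right - left - 1`, so flanks at positions 133 and 155 enclose 21 residues.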

Post-translational modifications
C1orf94 is predicted to undergo palmitoylation, phosphorylation, and glycation, mainly near its N-terminus. A mitochondrial processing peptidase cleavage site is also predicted at the first methionine.

Structure
According to CFSSP, the predicted secondary structure of C1orf94 includes alpha helices, extended strands, beta turns, and random coils.

The tertiary structures predicted by both Phyre2 and SWISS-MODEL show C1orf94 as a monomer.

According to I-TASSER, the closest structural analogs of C1orf94 are 3IXZ (pig gastric H+/K+-ATPase complexed with aluminum fluoride) and 3B8E (crystal structure of the sodium-potassium pump).

Protein-protein interactions
Mentha proposes a strong physical interaction with ATXN1, a chromatin-binding factor that represses Notch signaling in the absence of the Notch intracellular domain.

According to PSICQUIC, a physical interaction between C1orf94 and MMADHC has been demonstrated by affinity chromatography. MMADHC encodes a mitochondrial protein involved in the early steps of vitamin B12 metabolism.

STRING lists RFX2 as a possible functional partner, placing it among the query proteins and first shell of interactors. RFX2 is a transcription factor that acts as a key regulator of spermatogenesis.

Expression
According to AceView, this gene is well expressed, at 0.5 times the level of the average gene in that release.

PSORT II predicts a nuclear localization for C1orf94 (69.6%).

Data from NCBI show that C1orf94 is primarily expressed in testis tissue.

According to the Human Protein Atlas, C1orf94 is slightly expressed in brain tissue.

According to GEO profiles, increased C1orf94 expression is highly correlated with morbid obesity. C1orf94 expression also increased after related coactivator depletion.

Function
The function of C1orf94 is not yet understood, and no experiments have established it so far. However, HPA RNA sequencing data show higher C1orf94 expression in normal tissues than in tissues during fetal development.

Association with diseases
According to GWAS data, C1orf94 was identified as an OncoORF (oncogenic open reading frame). According to the Colorectal Cancer Atlas, C1orf94 takes part in a protein-protein interaction network of 50 nodes associated with colorectal cancer. The most notable of these interactions is with the A-kinase anchor protein AKAP9, which promotes colorectal cancer development by regulating Cdc42-interacting protein.

Sequence homology
C1orf94 has evolved faster than cytochrome c but more slowly than fibrinopeptides.

C1orf94 has no paralogs. Orthologs were identified using NCBI BLASTp. Mammals showed the most conservation, and the most distant orthologs were found in fish.

When SAPS was run on a group of orthologs (gorilla, rat, dog, and bat), the protein composition showed only minor variations compared to the human sequence: proline was still the most abundant amino acid, followed by leucine, and tryptophan remained the least abundant.