Glutamate-rich protein 3

Glutamate-rich protein 3, also known as Uncharacterized Protein C1orf173, is a protein encoded by the ERICH3 gene. ERICH3 was originally named “chromosome 1 open reading frame 173 (C1orf173)” based on its map location in the human genome. It was subsequently renamed “E-rich 3” because of the high content of glutamate (E) in its encoded amino acid sequence. Single-nucleotide polymorphisms (SNPs) in the ERICH3 gene were identified as one of the "top" signals in a genome-wide association study (GWAS) for plasma serotonin concentrations, which were themselves associated with selective serotonin reuptake inhibitor (SSRI) response in major depressive disorder (MDD) patients. The same ERICH3 SNP signal was later shown to be significantly associated with SSRI treatment outcomes in three independent MDD trials: STAR*D, ISPC and PReDICT. Based on GTEx RNA-seq data, ERICH3 is most highly expressed in several regions of the human brain, including the nucleus accumbens (basal ganglia) and frontal cortex. Single-cell RNA-seq data for human brain samples revealed that ERICH3 is predominantly expressed in neurons rather than in other CNS cell types. ERICH3 was found to interact with proteins that function in vesicle biogenesis and may play a significant role in vesicular function in serotonergic and other neuronal cell types, which might help explain its association with antidepressant treatment response. Proteomic studies also found ERICH3 protein to be abundant in blood platelets and cilia. Its function in platelets is thought to be related to serotonin storage, because more than 99% of blood serotonin is stored in platelets and ERICH3 SNPs have been associated with plasma serotonin concentrations in MDD patients. In primary cilia, ERICH3 might regulate cilium formation and the localization of ciliary transport proteins.

Gene
The human ERICH3 gene spans 105,628 bases and is encoded on the minus strand at position 31.1 on the short arm of chromosome 1 (1p31.1), from base pair 75,033,795 to base pair 75,139,422 from pter. Based on GTEx RNA-seq data, ERICH3 RNA is predominantly expressed in human brain and testis. The Ensembl human genome assembly annotates five ERICH3 RNA transcripts. The reference transcript consists of fifteen exons, with exon 14 encoding half of the open reading frame. The reference transcript is expressed in brain, predominantly in neurons, but not in testis. A "shorter" ERICH3 transcript, which consists of seven exons and whose first exon maps to intron 6 of the reference transcript, is predominantly expressed in testis. In addition to the roles of ERICH3 in vesicular function, antidepressant treatment response and cilium formation, expression of this gene has been linked to several forms of cancer, such as breast cancer and skin sarcomas. ERICH3 (C1orf173) is expressed in the brain, eye, lung, mammary gland, muscle, pituitary gland, testis, trachea, and uterus.
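
The stated gene length can be checked against the stated coordinates. The following is a minimal sketch, assuming the two base-pair positions are 1-based and inclusive; the variable names are illustrative only.

```python
# Coordinates quoted in the text (assumed to be 1-based and inclusive).
start_bp = 75_033_795
end_bp = 75_139_422

# With inclusive endpoints, the span is end - start + 1.
span_bp = end_bp - start_bp + 1

print(f"ERICH3 genomic span: {span_bp:,} bases")
```

Under that assumption the computed span matches the 105,628 bases given above.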

Protein
The human ERICH3 (C1orf173) protein is 1,530 amino acids (aa) in length and contains two domains of unknown function, DUF4590 and DUF4543. Both DUF regions are currently uncharacterized, although they are found in eukaryotes, including humans. Three isoforms of the protein are currently known in humans: Q5RHP9-1 (canonical), Q5RHP9-2 and Q5RHP9-3. Other animals tend to have many variant forms of this gene. Western blot assays have shown that the canonical ERICH3 protein, which is encoded by the reference RNA transcript, is the predominant ERICH3 isoform in neurons.

C1orf173 is predicted to be a nuclear protein based on PSORT II analysis and on suggested interactions between C1orf173 and other proteins such as TAF5L. Analysis with the Compute pI/Mw tool in Expasy indicates that C1orf173 is acidic, with a predicted isoelectric point (pI) between 4.6 and 5 for most orthologs. Further analysis with the NetPhos tool on Expasy predicted a large number of phosphorylated serines, an intermediate number of phosphorylated threonines and a few phosphorylated tyrosines.
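
To illustrate why a glutamate-rich sequence is expected to have a low isoelectric point, the sketch below estimates the pI of a peptide by finding the pH at which its net charge is zero. It is not the algorithm or pKa table used by the Expasy tool: the pKa values are approximate textbook-style assumptions, the function names are hypothetical, and the example peptides are made up rather than taken from ERICH3.

```python
# Approximate pKa values (assumed for illustration; NOT the table used by
# Expasy's Compute pI/Mw tool).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.25, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 3.1


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    seq = sequence.upper()
    # Terminal groups: the N-terminus is positive, the C-terminus negative.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    # Basic side chains contribute positive charge when protonated.
    for residue, pka in POSITIVE_PKA.items():
        charge += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    # Acidic side chains contribute negative charge when deprotonated.
    for residue, pka in NEGATIVE_PKA.items():
        charge -= seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


def estimate_pi(sequence: str, tolerance: float = 1e-4) -> float:
    """Bisection search for the pH at which the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # Net charge falls as pH rises, so a positive charge means pI > mid.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Made-up peptides: one rich in acidic residues, one rich in basic residues.
acidic_pi = estimate_pi("EEEEDDEEKA")
basic_pi = estimate_pi("KKKKRRKKEA")
print(f"acidic peptide pI ~ {acidic_pi:.2f}, basic peptide pI ~ {basic_pi:.2f}")
```

The acidic peptide comes out with a pI well below neutral and the basic peptide well above it, which is the qualitative behavior that the low predicted pI of the glutamate-rich protein reflects.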

However, experimental data showed that ERICH3 proteins, including all three known isoforms, are localized in the cytoplasm but not in the nucleus. The “canonical” ERICH3 protein is predicted to have a molecular weight (MW) of 168.5 kDa, but a band at ~250 kDa was observed by Western blot. This striking difference (>80 kDa) between the predicted and observed MWs is unlikely to result from post-translational modifications such as glycosylation or phosphorylation; it is more likely explained by the high content of glutamate (E) in the amino acid sequence. Previous studies have reported that proteins with a high content of glutamate (E) and/or aspartate (D), amino acids with acidic side chains, can display higher apparent MW values during Western blot analysis than would be predicted.
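
The size of the discrepancy, and the acidic-residue content that is thought to explain it, can be expressed with a short calculation. This is a sketch only: the helper `acidic_fraction` is a hypothetical name, and the example sequence is a made-up peptide, not the ERICH3 sequence.

```python
def acidic_fraction(sequence: str) -> float:
    """Fraction of residues that are aspartate (D) or glutamate (E)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("D") + seq.count("E")) / len(seq)


# Values quoted in the text; the observed value is approximate (~250).
PREDICTED_KDA = 168.5
OBSERVED_KDA = 250.0

shift_kda = OBSERVED_KDA - PREDICTED_KDA       # apparent excess mass
ratio = OBSERVED_KDA / PREDICTED_KDA           # apparent / predicted MW

print(f"apparent shift: {shift_kda:.1f} kDa, ratio: {ratio:.2f}")

# Made-up peptide used only to show how the helper is called.
print(f"acidic fraction of 'EEDA': {acidic_fraction('EEDA'):.2f}")
```

With the quoted values, the band runs roughly 1.5 times heavier than predicted. Applying `acidic_fraction` to the canonical sequence would quantify how glutamate-rich the protein is, but that sequence is not reproduced here.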

Protein Structure
Bioinformatic analysis predicts that the secondary structure of the C1orf173 protein consists primarily of alpha helices and random coils. In humans, the tertiary structure of C1orf173 has two components that resemble ubiquitin-like 2 activating enzyme E1B and alginase.

Protein Interactions
The C1orf173 protein has been predicted or experimentally observed to interact with the following proteins:

• CRISPLD2

• GIMAP4

• MDM2

• SLC45A4

• TAF5L

• TLE3

• TTC23