Glutamate-rich protein 4

Glutamate-rich protein 4 is a protein encoded by the ERICH4 gene, which is also known as chromosome 19 open reading frame 69 (C19orf69). ERICH4 is highly conserved in mammals and is most strongly expressed in the kidneys, terminal ileum, and duodenum. The function of ERICH4 is not yet well understood, but it has been suggested to contribute to immune inflammatory responses.

Gene
ERICH4 is located on the sense strand at 19q13.2 in humans, spans 2,340 base pairs, and contains 2 exons. It lies within DMAC2 and next to PCAT19 and B3NT8, all of which are on the antisense strand.



Promoter & Predicted Transcription Factors (TF)
The promoter is predicted to begin 1,806 bp upstream of the 5' UTR and to span 1,819 bp, overlapping the coding sequence by 13 bp.
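The 13 bp overlap follows directly from the two figures given for the promoter. A minimal sketch of that arithmetic, assuming both values are measured from the same reference point (the function name is hypothetical):

```python
# Figures taken from the text; measuring both from one reference point is an assumption.
PROMOTER_START_UPSTREAM = 1806  # bp upstream of the reference point
PROMOTER_LENGTH = 1819          # total promoter length in bp


def promoter_overlap(start_upstream: int, length: int) -> int:
    """Return how many bp of the promoter extend past the reference point."""
    return max(0, length - start_upstream)


overlap = promoter_overlap(PROMOTER_START_UPSTREAM, PROMOTER_LENGTH)
print(f"Promoter overlap: {overlap} bp")
```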



mRNA
The ERICH4 mRNA sequence is 955 nucleotides long, with a predicted fold energy of -139.80 kcal/mol, reported as -0.258 energy per base.

Alternative Splicing
ERICH4 has one alternative protein-coding transcript variant, or isoform.

General Properties
The primary encoded protein consists of 130 amino acids and has a predicted molecular mass of 14.5 kDa and an isoelectric point (pI) of 4. As the name glutamate-rich protein 4 suggests, glutamic acid is the most abundant residue at 17.7% of the protein's composition, followed by leucine at 14.6% and proline at 9.2%. ERICH4 has no positive or negative charge clusters. The human protein has one identifiable mixed charge cluster spanning amino acids 91 to 116, with 3 positively charged, 15 negatively charged, and 8 neutral amino acids. In ERICH4's orthologs, the corresponding region is frequently a negative charge cluster. The protein contains no significant hydrophobic or transmembrane segments, a finding supported by comparison with five of ERICH4's orthologs (gray mouse lemur, sheep, house mouse, African elephant, and opossum).
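The composition figures above can be checked with a short calculation. The sketch below is illustrative only: `TOY_SEQUENCE` is a made-up peptide, not the ERICH4 sequence, and the function names are hypothetical. It shows how per-residue percentages are computed and how the stated percentages convert back to residue counts in a 130-residue protein.

```python
from collections import Counter

PROTEIN_LENGTH = 130  # length stated in the text

# Hypothetical toy peptide used only to demonstrate the calculation.
TOY_SEQUENCE = "MEELLPEEAL"


def composition_percent(seq: str) -> dict[str, float]:
    """Return each residue's share of the sequence as a percentage."""
    counts = Counter(seq.upper())
    total = len(seq)
    return {aa: round(100 * n / total, 1) for aa, n in counts.most_common()}


def residue_count(percent: float, length: int) -> int:
    """Convert a composition percentage back to a whole number of residues."""
    return round(percent / 100 * length)


print(composition_percent(TOY_SEQUENCE))

# Percentages from the text, converted to counts for a 130-residue protein.
for name, pct in [("Glu", 17.7), ("Leu", 14.6), ("Pro", 9.2)]:
    print(name, residue_count(pct, PROTEIN_LENGTH))

# The mixed cluster spans residues 91 to 116 inclusive.
cluster_length = 116 - 91 + 1
print(cluster_length == 3 + 15 + 8)
```

Under these assumptions the stated percentages correspond to 23 glutamic acid, 19 leucine, and 12 proline residues, and the 26-residue cluster length matches the sum of its charged and neutral residues.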

Domains
ERICH4 has one identified domain of unknown function, DUF4530, which is found in eukaryotes. Proteins in this family are typically 140 amino acids in length and ERICH4 is a known human member of this family.



Secondary Structure
A cross-program analysis predicts that the ERICH4 protein is composed of five separate alpha helices interspersed with five coils. The alpha helices span amino acids 2-9, 21-24, 47-58, 61-94, and 104-111 of the protein sequence. ERICH4 is not predicted to contain beta sheets.
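As a rough illustration, the helix coordinates listed above can be turned into segment lengths and an overall helical fraction. This is a sketch that assumes the ranges are inclusive, 1-based residue positions in the 130-residue protein; the function names are hypothetical.

```python
# Helix ranges as listed in the text (assumed inclusive, 1-based positions).
HELICES = [(2, 9), (21, 24), (47, 58), (61, 94), (104, 111)]
PROTEIN_LENGTH = 130


def segment_length(start: int, end: int) -> int:
    """Length of an inclusive residue range."""
    return end - start + 1


def helix_summary(segments, protein_length):
    """Return per-helix lengths, total helical residues, and percent coverage."""
    lengths = [segment_length(s, e) for s, e in segments]
    total = sum(lengths)
    return lengths, total, round(100 * total / protein_length, 1)


lengths, total, percent = helix_summary(HELICES, PROTEIN_LENGTH)
print(lengths)  # [8, 4, 12, 34, 8]
print(total)    # 66
print(percent)  # 50.8
```

On this reading, about half of the protein's residues fall within predicted helices, with the 61-94 helix accounting for roughly half of that total.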

Tertiary Structure
SWISS-MODEL analysis proposes a tertiary structure for ERICH4 by matching the protein against an NLRP6 template, with a sequence identity of 25.79%, a sequence similarity of 0.30, and a coverage of 0.43, for amino acids 43-92 of ERICH4.

Post-translational Regulation
ERICH4 is predicted to be phosphorylated at serines 28 and 96 by casein kinase II and at threonine 36 by protein kinase C. ERICH4 is not predicted to undergo methionine cleavage or acetylation.
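Kinase site predictions of this kind are often based on short consensus motifs. The sketch below scans a sequence for the PROSITE-style patterns for casein kinase II ([ST]-x(2)-[DE]) and protein kinase C ([ST]-x-[RK]). It is an illustration only: the text does not say which prediction tool was used, `TOY_SEQUENCE` is a made-up peptide rather than the ERICH4 sequence, and simple motif matching is known to give many false positives.

```python
import re

# Consensus motifs; lookaheads allow overlapping matches to be reported.
MOTIFS = {
    "CK2": re.compile(r"(?=[ST]..[DE])"),  # casein kinase II: [ST]-x(2)-[DE]
    "PKC": re.compile(r"(?=[ST].[RK])"),   # protein kinase C: [ST]-x-[RK]
}

# Hypothetical toy peptide used only to demonstrate the scan.
TOY_SEQUENCE = "MASPEEAKTLRGG"


def find_sites(seq: str) -> dict[str, list[tuple[int, str]]]:
    """Return 1-based positions and residues of candidate phosphoacceptor sites."""
    seq = seq.upper()
    return {
        name: [(m.start() + 1, seq[m.start()]) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


sites = find_sites(TOY_SEQUENCE)
print(sites)
```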

Localization
This protein is predicted to be intracellular, without any transmembrane regions. It is predicted to localize mainly to the cytoplasm, with a reliability score of 70.6 by Reinhardt's method. No significant O-GlcNAc or N-myristoylation sites are predicted.

Tissue Expression
In humans, ERICH4 is most highly expressed in the duodenum and small intestine, followed by the kidneys. Notably, expression in the small intestine is highest in the twentieth week of human fetal development. In a representative set of mouse (Mus musculus) tissues, Erich4 is most highly expressed in the kidneys, followed, in decreasing order, by the large intestine, adult duodenum, and adult small intestine. Staining with the rabbit-derived Sigma-Aldrich antibody HPA042632 shows strong granular cytoplasmic positivity in glandular cells (goblet cells) of the rectum.





Abnormal Tissue Expression
ERICH4 has high expression in normal tissue and low-to-medium expression in renal cell carcinoma tissue.

One analysis examined ERICH4 expression in ileum and colon tissue that was either normal or affected by Crohn's disease or ulcerative colitis. ERICH4 had high (~90%) expression in the ileum in all states (normal/control, Crohn's disease, and ulcerative colitis). ERICH4 expression was also higher in Crohn's disease than in either normal tissue or ulcerative colitis.



Function
The function of ERICH4 is not yet well understood and requires further research.

Interactions
According to STRING analysis, ERICH4 has multiple protein interactions predicted from text mining, including interactions with proteins that have immune-associated functions and are expressed in the gastrointestinal tract or testes. No protein interactions have been experimentally confirmed.

Paralogs
No human paralogs were found for the gene.

Orthologs
Orthologs have been identified in most mammals for which complete genome data are available. ERICH4 orthologs are present only in placental and marsupial mammals and are absent in monotremes. The most distant ortholog was identified in the gray short-tailed opossum, a marsupial.

No significant similarities were found in the vertebrate groups Aves, Reptilia, Amphibia, Chondrichthyes, Osteichthyes, or Agnatha. BLAST and BLAT searches that excluded vertebrates found no significant orthologs in invertebrates, fungi, or bacteria.

Molecular Evolution
The m value, or corrected number of amino acid changes per 100 residues, was plotted for ERICH4 against species divergence time in millions of years. Compared with hemoglobin, fibrinogen alpha chain, and cytochrome c, ERICH4 most closely tracks fibrinogen alpha chain, suggesting a relatively rapid pace of evolution. The m values for ERICH4 were derived from the percent identity of each species' protein sequence to the human sequence, using the formula derived from the molecular clock hypothesis.
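The text does not state the formula itself. A commonly used correction for multiple substitutions at the same site is m/100 = -ln(1 - n/100), where n is the observed number of amino acid differences per 100 residues. The sketch below assumes that form and takes n as 100 minus the percent identity; the function name is hypothetical, and the input values are arbitrary illustrations rather than ERICH4 data.

```python
import math


def corrected_changes(percent_identity: float) -> float:
    """Return m, the corrected amino acid changes per 100 residues.

    Assumes m = -100 * ln(1 - n/100), with n = 100 - percent identity.
    """
    if not 0 < percent_identity <= 100:
        raise ValueError("percent identity must be in the range (0, 100]")
    n = 100.0 - percent_identity  # observed differences per 100 residues
    return -100.0 * math.log(1.0 - n / 100.0)


# Arbitrary example identities, not measured ERICH4 values.
for identity in (90.0, 70.0, 50.0):
    print(identity, round(corrected_changes(identity), 1))
```

Because the correction grows faster than the observed difference, m exceeds n for any identity below 100%, and the gap widens as sequences diverge. Plotting m against divergence time then gives an approximately linear trend whose slope can be compared between proteins.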