Albert Erives

Albert Erives (born March 4, 1972) is a developmental geneticist who studies transcriptional enhancers underlying animal development and diseases of development (cancers). Erives also proposed the pacRNA model for the dual origin of the genetic code and universal homochirality. He is known for work at the intersection of genetics, evolution, developmental biology, and gene regulation. He has worked at the California Institute of Technology, University of California, Berkeley, and Dartmouth College, and is an associate professor at the University of Iowa.

Erives has shown how genes of the nucleocytoplasmic large DNA viruses shed light on intermediate steps in the evolution of the linear, chromatinized eukaryotic chromosome and its mechanisms of gene regulation.

Regulatory grammars of enhancers
Erives' major work is on “regulatory grammars” for transcriptional enhancers underlying animal development and cancer. Exploiting assemblies of animal genomes, Erives discovered that complex gene regulatory codes underlie non-homologous subsets of mechanistically equivalent enhancers. These codes are composed of a combinatorial “lexicon” of transcription factor (TF) binding sites, functional inflections of those binding sites (so-called “specialized sites” constrained for binding affinity and competition by multiple TFs), and complex site ordering (orientation and positional spacing of those sites). Understanding how these complex regulatory codes relate to a nucleosomal "regulatory reading frame" is a key goal. His lab's work also elucidated how a mutational mechanism (microsatellite repeat slippage) plays a significant evolutionary role in functionally adjusting complex binding site arrangements that recruit poly-glutamine (polyQ) rich factors. Correspondingly, the Erives lab has pioneered the identification of novel enhancers that recruit polyQ complexes and integrate developmental signals, while also identifying polyQ allelic series for key developmental factors targeting those enhancers.
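The ingredients of such a grammar (which sites occur, in which orientation, and at what spacing) can be illustrated with a small Python sketch. The motifs SITE_A and SITE_B, the function names, and the example sequence are hypothetical placeholders chosen for illustration. They are not motifs or methods taken from Erives' publications, and real binding sites are usually modelled with position weight matrices rather than exact strings.

```python
# Toy illustration only: motifs, names, and sequence are invented for this sketch.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, motif):
    """Return (start, strand) for each exact match of motif on either strand.

    A palindromic motif is reported once, on the '+' strand.
    """
    hits = []
    rc = reverse_complement(motif)
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def site_pairs(seq, motif_a, motif_b, max_gap=30):
    """List A-then-B site pairs with the gap (bp) from the end of A to the start of B."""
    pairs = []
    for a_start, a_strand in find_sites(seq, motif_a):
        a_end = a_start + len(motif_a)
        for b_start, b_strand in find_sites(seq, motif_b):
            gap = b_start - a_end
            if 0 <= gap <= max_gap:
                pairs.append({
                    "a_start": a_start, "a_strand": a_strand,
                    "b_start": b_start, "b_strand": b_strand,
                    "gap": gap,
                })
    return pairs


SITE_A = "GGGAAA"   # hypothetical motif
SITE_B = "CAGGTA"   # hypothetical motif
EXAMPLE = "TT" + "GGGAAA" + "ACGTAC" + "TACCTG" + "TT"  # SITE_B on the minus strand

for pair in site_pairs(EXAMPLE, SITE_A, SITE_B):
    print(pair)
```

In this toy sequence the second motif occurs in reverse orientation, six base pairs downstream of the first. Orientation and spacing are the two features that a grammar-based description records in addition to the mere presence of the sites.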

A significant implication of this work is that gene regulatory networks largely evolve by indels in both cis and trans (in enhancer DNAs and polyQ-encoding genes, respectively). Because indels are largely produced by unstable microsatellite repeats, which are fast-evolving and difficult to genotype accurately, a large compartment of functional genetic variation is not captured by genome-wide association studies, which focus on single nucleotide polymorphisms and at most a subset of non-repeat-associated indels.
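As a rough illustration of what a microsatellite repeat tract looks like in sequence data, the following Python sketch locates perfect tandem repeats of short units with a regular expression. The function name, the thresholds, and the example sequence are assumptions made for this sketch; it is not the genotyping approach used by the Erives lab, and dedicated tools handle imperfect repeats and sequencing error far more carefully.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=4):
    """Return (start, unit, n_repeats) for perfect tandem repeats of short units.

    The lazy quantifier makes the regex prefer the shortest repeating unit,
    so 'CAGCAGCAGCAG' is reported as four copies of 'CAG'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    tracts = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        n_repeats = len(match.group(0)) // len(unit)
        tracts.append((match.start(), unit, n_repeats))
    return tracts


# Invented example: five copies of CAG (a glutamine codon) flanked by unique sequence.
EXAMPLE = "TTGA" + "CAG" * 5 + "TCGA"
print(find_microsatellites(EXAMPLE))
```

A tract such as this one can gain or lose whole repeat units through replication slippage, which is why alleles at these loci tend to differ by indels rather than by single nucleotide changes.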

Molecular determinants of morphogenic responses
Erives and colleagues determined how different morphogen gradient responses are encoded in DNA sequence. They used diverse Drosophila species with differently sized eggs to study how a set of structured enhancers co-evolved with, or co-adapted to, changes in the concentration gradients. Morphogen gradient systems are a fundamental subject of developmental biology. Models of how morphogen gradient responses are encoded had previously been proposed, but they had not been tested across a set of unrelated enhancers constructed from a shared regulatory grammar and located throughout a genome.

Three major unexpected findings resulted from this work. The first is that gradient responses in general do not evolve by changes in transcription factor (TF) binding site quality or quantity (site density), as expected, but rather by changes in the precise spacing between binding sites for morphogenic TFs and their partner TFs. The second is that homotypic site clustering at such enhancers was largely the result of a complex evolutionary history of selection for different threshold responses in the evolving insect egg. A third, related finding is that frequent selection for different responses also enriches for microsatellite repeat tracts, which are inherently unstable and most responsible for the production of novel indel alleles.
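The link between the first and third findings, namely that a repeat tract lying between two binding sites can change their spacing in steps of one repeat unit, can be shown with a toy Python sketch. The sequences, motif strings, and function names are invented for illustration; the sketch models only the arithmetic of spacing and says nothing about how any real enhancer responds.

```python
def slip_repeat(seq, tract_start, unit, n_repeats, delta):
    """Return seq with a repeat tract expanded (delta > 0) or contracted (delta < 0)."""
    tract_end = tract_start + n_repeats * len(unit)
    if seq[tract_start:tract_end] != unit * n_repeats:
        raise ValueError("sequence does not contain the stated repeat tract")
    new_n = n_repeats + delta
    if new_n < 0:
        raise ValueError("cannot remove more repeat units than are present")
    return seq[:tract_start] + unit * new_n + seq[tract_end:]


def spacing(seq, site_a, site_b):
    """Gap in bp from the end of the first site_a to the next site_b."""
    a_start = seq.find(site_a)
    if a_start < 0:
        raise ValueError("site_a not found")
    a_end = a_start + len(site_a)
    b_start = seq.find(site_b, a_end)
    if b_start < 0:
        raise ValueError("site_b not found downstream of site_a")
    return b_start - a_end


SITE_A = "GGGAAA"  # hypothetical motif
SITE_B = "CAGGTA"  # hypothetical motif
# Hypothetical enhancer fragment: a (CA)4 tract sits between the two sites.
ANCESTRAL = SITE_A + "AT" + "CA" * 4 + "GT" + SITE_B

contracted = slip_repeat(ANCESTRAL, tract_start=8, unit="CA", n_repeats=4, delta=-1)
expanded = slip_repeat(ANCESTRAL, tract_start=8, unit="CA", n_repeats=4, delta=2)

for name, allele in [("ancestral", ANCESTRAL), ("contracted", contracted), ("expanded", expanded)]:
    print(name, spacing(allele, SITE_A, SITE_B))
```

Each slippage event leaves both sites intact and shifts their spacing by a multiple of the repeat unit length, which is the kind of indel variation described above.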

Erives' work also showed that morphogenic responses involve an inherent spatiotemporal conflict, and how this conflict is resolved in nature through complementary morphogen gradients.

Molecular determinants of the genetic code
Using insights gleaned from archaeal genomes, Erives elaborated and described a stereochemical model of "proto-anti-codon RNAs" (pacRNAs). The pacRNA model ascribes a predetermined, combined origin to the universal genetic code (i.e., the codon table), the biogenic amino acids, and their exclusive homochirality in life. The model implies that the early RNA world was an aminoacylated RNA world and that proteinogenic amino acids arose because of compatible interactions with nucleotide-based polymers. The pacRNA model explicitly lists possible interactions between various anti-codon di-nucleotide and tri-nucleotide sequences and different amino acids. When the nucleotides are D-ribose based, L-amino acids are preferred.

In the pacRNA world, codons originate as cis-elements for recruiting self-aminoacylated pacRNAs/proto-tRNAs. Thus, a curious aspect of this model is that the (anti-)codon table was determined in evolutionary history before the origin of ribosome-based protein translation. The pacRNA model may explain why extant tRNAs are heavily modified in all three domains of life.

Erives first presented the pacRNA model at NASA's 2012 Astrobiology Science Conference and later at the 2013 Iowa City Darwin Day festival, which focused on the origins of life on Earth.

Like Erives' enhancer studies, which focus on how protein complexes interact with enhancer DNAs, his pacRNA work focuses on how biogenic amino acids would have beneficially interacted with the nucleotide-based molecules of early life. Both areas of study demonstrate how complex patterns in linear molecules emerge from interactions in three dimensions.

Developmental genetics in chordates
With his doctoral advisor Michael Levine, Erives authored several papers on ascidian developmental genetics, with key insights into the evolution of the proto-vertebrate body plan. This work used the Ciona system to generate large numbers of embryos that were then electroporated with enhancer DNAs.

In collaboration with Nori Satoh's lab at Kyoto University in Japan, where Erives spent a winter doing research, Erives and colleagues also identified the largest collection of notochord-specific genes by using genetically altered Ciona over-expressing the Brachyury transcription factor. The notochord is a defining evolutionary innovation of the chordate body plan, and this work was designed to advance understanding of the morphogenetic signals emanating from this important developmental and structural tissue.

CodeGrok, Inc.
In 2001, Erives co-founded the Caltech-associated company CodeGrok (code "grok") with Paul Mineiro, currently a Principal Research Software Developer for Microsoft. It was started in Pasadena, California, but later moved to Berkeley, California, after its second round of financing. In its first three years, CodeGrok developed and used machine learning methods to identify, classify, and clone transcriptional enhancers from the human genome and to construct pathway-specific cell-based reporters for drug screening and other applications. The company took its name from the Robert Heinlein novel Stranger in a Strange Land and its concept of grok, which is to understand something deeply and intuitively, in reference to the goal of "grokking" the regulatory code of the human genome. The company no longer exists. It is often cited as a humorous example of how not to name a start-up, since many people were unable to pronounce the name.