Reprogramming

In biology, reprogramming refers to erasure and remodeling of epigenetic marks, such as DNA methylation, during mammalian development or in cell culture. Such control is also often associated with alternative covalent modifications of histones.

Reprogrammings that are both large scale (10% to 100% of epigenetic marks) and rapid (hours to a few days) occur at three life stages of mammals. Almost 100% of epigenetic marks are reprogrammed in two short periods early in development after fertilization of an ovum by a sperm. In addition, almost 10% of DNA methylations in neurons of the hippocampus can be rapidly altered during formation of a strong fear memory.

After fertilization in mammals, DNA methylation patterns are largely erased and then re-established during early embryonic development. Almost all of the methylations from the parents are erased, first during early embryogenesis, and again in gametogenesis, with demethylation and remethylation occurring each time. Demethylation during early embryogenesis occurs in the preimplantation period. After a sperm fertilizes an ovum to form a zygote, rapid DNA demethylation of the paternal DNA and slower demethylation of the maternal DNA occur until formation of a morula, which has almost no methylation. After the blastocyst is formed, methylation can begin, and with formation of the epiblast a wave of methylation then takes place until the implantation stage of the embryo. Another period of rapid and almost complete demethylation occurs during gametogenesis within the primordial germ cells (PGCs). Other than the PGCs, in the post-implantation stage, methylation patterns in somatic cells are stage- and tissue-specific with changes that presumably define each individual cell type and last stably over a long time.

Embryonic development


The mouse sperm genome is 80–90% methylated at its CpG sites in DNA, amounting to about 20 million methylated sites. After fertilization, the paternal chromosome is almost completely demethylated in six hours by an active process, before DNA replication (blue line in Figure). In the mature oocyte, about 40% of its CpG sites are methylated. Demethylation of the maternal chromosome largely takes place by blockage of the methylating enzymes from acting on maternal-origin DNA and by dilution of the methylated maternal DNA during replication (red line in Figure). The morula (at the 16-cell stage) has only a small amount of DNA methylation (black line in Figure). Methylation begins to increase at 3.5 days after fertilization in the blastocyst, and a large wave of methylation then occurs on days 4.5 to 5.5 in the epiblast, going from 12% to 62% methylation, and reaching maximum level after implantation in the uterus. By day seven after fertilization, the newly formed primordial germ cells (PGCs) in the implanted embryo segregate from the remaining somatic cells. At this point the PGCs have about the same level of methylation as the somatic cells.
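The replication-dependent loss of maternal methylation described above can be illustrated with a minimal sketch. It assumes, as an idealization that the text does not state, that maintenance methylation is completely blocked, so each round of DNA replication halves the methylated fraction. The function name `passive_dilution` and the choice of four divisions (one cell to 16 cells) are illustrative only.

```python
def passive_dilution(initial_fraction, divisions):
    """Methylated fraction after each replication round.

    ASSUMPTION (not from the source): no maintenance methylation at all,
    so every round of replication halves the methylated fraction.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return [initial_fraction / (2 ** n) for n in range(divisions + 1)]


# About 40% of CpG sites are methylated in the mature oocyte (from the text).
# Four divisions take one cell to the 16-cell morula.
trajectory = passive_dilution(0.40, 4)
for n, fraction in enumerate(trajectory):
    print(f"after {n} division(s): {fraction:.1%} methylated")
```

Under this simplified model the maternal genome would fall from 40% to 2.5% methylation by the 16-cell stage, which is consistent with the morula having only a small amount of DNA methylation. Real embryos also retain methylation at protected sequences, which the sketch ignores.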

The newly formed primordial germ cells (PGCs) in the implanted embryo derive from the somatic cells. At this point the PGCs have high levels of methylation. These cells migrate from the epiblast toward the gonadal ridge. Now the cells are rapidly proliferating and beginning demethylation in two waves. In the first wave, demethylation is by replicative dilution, but in the second wave demethylation is by an active process. The second wave leads to demethylation of specific loci. At this point the PGC genomes display the lowest levels of DNA methylation of any cells in the entire life cycle [at embryonic day 13.5 (E13.5), see the second figure in this section].

After fertilization some cells of the newly formed embryo migrate to the germinal ridge and will eventually become the germ cells (sperm and oocytes) of the next generation. Due to the phenomenon of genomic imprinting, maternal and paternal genomes are differentially marked and must be properly reprogrammed every time they pass through the germline. Therefore, during the process of gametogenesis the primordial germ cells must have their original biparental DNA methylation patterns erased and re-established based on the sex of the transmitting parent.

After fertilization, the paternal and maternal genomes are demethylated in order to erase their epigenetic signatures and acquire totipotency. There is asymmetry at this point: the male pronucleus undergoes a quick and active demethylation, while the female pronucleus is demethylated passively during consecutive cell divisions. The process of DNA demethylation involves base excision repair and likely other DNA-repair-based mechanisms. Despite the global nature of this process, there are certain sequences that avoid it, such as differentially methylated regions (DMRs) associated with imprinted genes, retrotransposons and centromeric heterochromatin. Remethylation is needed again to differentiate the embryo into a complete organism.

In vitro manipulation of pre-implantation embryos has been shown to disrupt methylation patterns at imprinted loci and plays a crucial role in cloned animals.

Learning and memory


Learning and memory have levels of permanence, differing from other mental processes such as thought, language, and consciousness, which are temporary in nature. Learning and memory can be either accumulated slowly (multiplication tables) or rapidly (touching a hot stove), but once attained, can be recalled into conscious use for a long time. Rats subjected to one instance of contextual fear conditioning create an especially strong long-term memory. At 24 h after training, 9.17% of the genes in the rat genomes of hippocampus neurons were found to be differentially methylated. This included more than 2,000 differentially methylated genes at 24 hours after training, with over 500 genes being demethylated. The hippocampus region of the brain is where contextual fear memories are first stored (see figure of the brain, this section), but this storage is transient and does not remain in the hippocampus. In rats contextual fear conditioning is abolished when the hippocampus is subjected to hippocampectomy just 1 day after conditioning, but rats retain a considerable amount of contextual fear when a long delay (28 days) is imposed between the time of conditioning and the time of hippocampectomy.

Molecular stages
Three molecular stages are required for reprogramming the DNA methylome. Stage 1: Recruitment. The enzymes needed for reprogramming are recruited to genome sites that require demethylation or methylation. Stage 2: Implementation. The initial enzymatic reactions take place. In the case of methylation, this is a short step that results in the methylation of cytosine to 5-methylcytosine. Stage 3: Base excision DNA repair. The intermediate products of demethylation are processed by specific enzymes of the base excision DNA repair pathway that finally restore cytosine in the DNA sequence.



The Figure in this section indicates the central roles of ten-eleven translocation methylcytosine dioxygenases (TETs) in the demethylation of 5-methylcytosine (5mC) to form cytosine. As reviewed in 2018, 5mC is very often initially oxidized by TET dioxygenases to generate 5-hydroxymethylcytosine (5hmC). In successive steps (see Figure) TET enzymes further oxidize 5hmC to generate 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Thymine-DNA glycosylase (TDG) recognizes the intermediate bases 5fC and 5caC and cleaves the glycosidic bond, resulting in an apyrimidinic site (AP site). In an alternative oxidative deamination pathway, 5hmC can be oxidatively deaminated by AID/APOBEC deaminases to form 5-hydroxymethyluracil (5hmU), or 5mC can be converted to thymine (Thy). 5hmU can be cleaved by TDG, SMUG1, NEIL1, or MBD4. AP sites and T:G mismatches are then repaired by base excision repair (BER) enzymes to yield cytosine (Cyt).
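The pathway described above can be summarized as a small directed graph. The following is an illustrative sketch, not a biochemical model: the dictionary `DEMETHYLATION_STEPS` and the helper `routes` are hypothetical names, and the edge from 5hmU to an AP site is an inference from the statement that glycosylases cleave 5hmU.

```python
# Each substrate maps to (product, enzyme) pairs taken from the text.
DEMETHYLATION_STEPS = {
    "5mC": [("5hmC", "TET"), ("Thy", "AID/APOBEC")],
    "5hmC": [("5fC", "TET"), ("5hmU", "AID/APOBEC")],
    "5fC": [("5caC", "TET"), ("AP site", "TDG")],
    "5caC": [("AP site", "TDG")],
    # ASSUMPTION: glycosylase cleavage of 5hmU leaves an AP site.
    "5hmU": [("AP site", "TDG/SMUG1/NEIL1/MBD4")],
    "Thy": [("Cyt", "BER")],      # T:G mismatch repaired by BER
    "AP site": [("Cyt", "BER")],
}


def routes(graph, start, goal):
    """Enumerate every path from start to goal by depth-first search.

    The graph is acyclic, so no visited-set is needed.
    """
    if start == goal:
        return [[goal]]
    found = []
    for product, _enzyme in graph.get(start, []):
        for tail in routes(graph, product, goal):
            found.append([start] + tail)
    return found


all_routes = routes(DEMETHYLATION_STEPS, "5mC", "Cyt")
for path in all_routes:
    print(" -> ".join(path))
```

With the edges as encoded here, four routes lead from 5mC to cytosine: two through TET oxidation followed by TDG excision (from 5fC or from 5caC), one through 5hmU, and one through thymine.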

TET family
The TET enzymes include at least two isoforms of TET1, one of TET2 and three isoforms of TET3. The full-length canonical TET1 isoform appears virtually restricted to early embryos, embryonic stem cells and primordial germ cells (PGCs). The dominant TET1 isoform in most somatic tissues, at least in the mouse, arises from alternative promoter usage which gives rise to a short transcript and a truncated protein designated TET1s. The isoforms of TET3 are the full-length form TET3FL, a short-form splice variant TET3s, and a form that occurs in oocytes and neurons designated TET3o. TET3o is created by alternative promoter use and contains an additional first N-terminal exon coding for 11 amino acids. TET3o only occurs in oocytes and neurons and was not detected in embryonic stem cells or in any other cell type or adult mouse tissue tested. Whereas TET1 expression can barely be detected in oocytes and zygotes, and TET2 is only moderately expressed, the TET3 variant TET3o shows extremely high levels of expression in oocytes and zygotes, but is nearly absent at the 2-cell stage. It is possible that TET3o, high in neurons, oocytes and zygotes at the one-cell stage, is the major TET enzyme utilized when very large-scale rapid demethylations occur in these cells.

Recruitment of TET to DNA
The TET enzymes do not specifically bind to 5-methylcytosine except when recruited. Without recruitment or targeting, TET1 predominantly binds to high-CG promoters and CpG islands (CGIs) genome-wide through its CXXC domain, which can recognize un-methylated CGIs. TET2 does not have an affinity for 5-methylcytosine in DNA. The CXXC domain of the full-length TET3, which is the predominant form expressed in neurons, binds most strongly to CpGs where the C was converted to 5-carboxylcytosine (5caC). However, it also binds to un-methylated CpGs.



For a TET enzyme to initiate demethylation it must first be recruited to a methylated CpG site in DNA. Two of the proteins shown to recruit a TET enzyme to a methylated cytosine in DNA are OGG1 (see figure Initiation of DNA demethylation) and EGR1.

OGG1
Oxoguanine glycosylase (OGG1) catalyses the first step in base excision repair of the oxidatively damaged base 8-OHdG. OGG1 finds 8-OHdG very rapidly by sliding along the linear DNA, covering 1,000 base pairs in 0.1 seconds. OGG1 proteins bind to oxidatively damaged DNA with a half maximum time of about 6 seconds. When OGG1 finds 8-OHdG it changes conformation and complexes with 8-OHdG in its binding pocket. OGG1 does not immediately act to remove the 8-OHdG. Half maximum removal of 8-OHdG takes about 30 minutes in HeLa cells in vitro, or about 11 minutes in the livers of irradiated mice. DNA oxidation by reactive oxygen species preferentially occurs at a guanine in a methylated CpG site, because of a lowered ionization potential of guanine bases adjacent to 5-methylcytosine. TET1 binds (is recruited to) the OGG1 bound to 8-OHdG (see figure). This likely allows TET1 to demethylate an adjacent methylated cytosine. When human mammary epithelial cells (MCF-10A) were treated with H2O2, 8-OHdG increased in DNA by 3.5-fold and this caused large-scale demethylation of 5-methylcytosine to about 20% of its initial level in DNA.
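The time scales quoted above can be put side by side with a short calculation. This is a rough sketch under a stated assumption: removal of 8-OHdG is treated as a first-order decay whose half-life equals the reported half maximum removal time, which the source does not claim. The function names are illustrative.

```python
# Figures quoted in the text above.
SLIDE_DISTANCE_BP = 1_000   # base pairs scanned
SLIDE_TIME_S = 0.1          # seconds taken to scan them
HALF_REMOVAL_MIN = {"HeLa cells": 30.0, "irradiated mouse liver": 11.0}


def sliding_rate_bp_per_s(distance_bp, time_s):
    """Average scanning speed of OGG1 along DNA, in base pairs per second."""
    return distance_bp / time_s


def fraction_remaining(minutes, half_time_min):
    """Fraction of 8-OHdG still present after `minutes`.

    ASSUMPTION (not from the source): simple first-order decay with the
    reported half maximum removal time used as the half-life.
    """
    return 0.5 ** (minutes / half_time_min)


rate = sliding_rate_bp_per_s(SLIDE_DISTANCE_BP, SLIDE_TIME_S)
print(f"sliding rate: about {rate:,.0f} bp/s")

for system, half_time in HALF_REMOVAL_MIN.items():
    left = fraction_remaining(60, half_time)
    print(f"{system}: {left:.1%} of 8-OHdG left after 60 min")
```

The comparison makes the point in the text concrete: locating the lesion takes seconds, whereas removing it takes tens of minutes, leaving a window in which OGG1 bound to 8-OHdG can recruit TET1.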

EGR1
The early growth response protein 1 gene (EGR1) is an immediate early gene (IEG). The defining characteristic of IEGs is the rapid and transient up-regulation of their mRNA levels, within minutes, independent of protein synthesis. EGR1 can rapidly be induced by neuronal activity. In adulthood, EGR1 is expressed widely throughout the brain, maintaining baseline expression levels in several key areas of the brain including the medial prefrontal cortex, striatum, hippocampus and amygdala. This expression is linked to control of cognition, emotional response, social behavior and sensitivity to reward. EGR1 binds to DNA at sites with the motifs 5′-GCGTGGGCG-3′ and 5′-GCGGGGGCGG-3′, and these motifs occur primarily in promoter regions of genes. The short isoform TET1s is expressed in the brain. EGR1 and TET1s form a complex mediated by the C-terminal regions of both proteins, independently of association with DNA. EGR1 recruits TET1s to genomic regions flanking EGR1 binding sites. In the presence of EGR1, TET1s is capable of locus-specific demethylation and activation of the expression of downstream genes regulated by EGR1.
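As an illustration of how candidate EGR1 binding sites could be located, the following sketch scans a DNA sequence for the two motifs given above on both strands. The function names and the example sequence are hypothetical, and exact string matching is a simplification: real binding-site prediction usually relies on position weight matrices rather than fixed strings.

```python
import re

# The two EGR1 motifs quoted in the text (5' to 3').
EGR1_MOTIFS = ("GCGTGGGCG", "GCGGGGGCGG")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_egr1_sites(seq):
    """Return sorted (start, strand, motif) tuples for exact motif matches.

    `start` is the 0-based position on the given (forward) strand.
    A lookahead pattern is used so that overlapping matches are reported.
    """
    seq = seq.upper()
    hits = []
    for motif in EGR1_MOTIFS:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            for match in re.finditer(f"(?={pattern})", seq):
                hits.append((match.start(), strand, motif))
    return sorted(hits)


# Hypothetical test sequence: one forward-strand site and one reverse-strand site.
example = "TTA" + "GCGTGGGCG" + "AA" + reverse_complement("GCGGGGGCGG") + "TT"
for start, strand, motif in find_egr1_sites(example):
    print(f"{motif} at position {start} on the {strand} strand")
```

Scanning both strands matters because a motif written 5′ to 3′ can occur on either strand of a promoter.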

History
The first person to successfully demonstrate reprogramming was John Gurdon, who in 1962 demonstrated that differentiated somatic cells could be reprogrammed back into an embryonic state when he obtained swimming tadpoles following the transfer of nuclei from differentiated intestinal epithelial cells into enucleated frog eggs. For this achievement he received the 2012 Nobel Prize in Physiology or Medicine alongside Shinya Yamanaka. Yamanaka was the first to demonstrate (in 2006) that the somatic cell nuclear transfer or oocyte-based reprogramming process (see below) that Gurdon discovered could be recapitulated (in mice) by defined factors (Oct4, Sox2, Klf4, and c-Myc) to generate induced pluripotent stem cells (iPSCs). Other combinations of genes have also been used, including LIN28 and Homeobox protein NANOG.

Phases of reprogramming
With the discovery that cell fate could be altered, the question arose of what progression of events signifies that a cell is undergoing reprogramming. As the final product of iPSC reprogramming was similar in morphology, proliferation, gene expression, pluripotency, and telomerase activity, genetic and morphological markers were used as a way to determine what phase of reprogramming was occurring. Reprogramming is divided into three phases: initiation, maturation, and stabilization.

Initiation
The initiation phase is associated with the downregulation of cell type specific genes and the upregulation of pluripotent genes. As the cells move towards pluripotency, the telomerase activity is reactivated to extend telomeres. The cell morphology can directly affect the reprogramming process as the cell is modifying itself to prepare for the gene expression of pluripotency. The main indicator that the initiation phase has completed is that the first genes associated with pluripotency are expressed. This includes the expression of Oct-4 or Homeobox protein NANOG, while undergoing a mesenchymal–epithelial transition (MET), and the loss of apoptosis and senescence.

If the cell is directly reprogrammed from one somatic cell to another, the genes associated with each cell type begin to be upregulated and downregulated accordingly. This can occur either through direct cell reprogramming or by creating an intermediate, such as an iPSC, and differentiating it into the desired cell type.

The initiation phase is completed through one of three pathways: nuclear transfer, cell fusion, or defined factors (microRNAs, transcription factors, epigenetic markers, and other small molecules).

Somatic cell nuclear transfer
An oocyte can reprogram an adult nucleus into an embryonic state after somatic cell nuclear transfer, so that a new organism can be developed from such a cell.

Reprogramming is distinct from development of a somatic epitype, as somatic epitypes can potentially be altered after an organism has left the developmental stage of life. During somatic cell nuclear transfer, the oocyte turns off tissue-specific genes in the somatic cell nucleus and turns embryo-specific genes back on. This process has been demonstrated through cloning, as in John Gurdon's tadpoles and Dolly the sheep. Notably, these events have shown that cell fate is reversible.

Cell fusion
Cell fusion is used to create a multinucleated cell called a heterokaryon. Fusion allows otherwise silenced genes to become reactivated and expressed. As the genes are reactivated, the cells can re-differentiate. There are instances where transcription factors, such as the Yamanaka factors, are still needed to aid in heterokaryon cell reprogramming.

Defined factors
Unlike nuclear transfer and cell fusion, defined factors do not require a full genome, only reprogramming factors. These reprogramming factors include microRNAs, transcription factors, epigenetic markers, and other small molecules. The original transcription factors that lead to iPSC development, discovered by Yamanaka, are Oct4, Sox2, Klf4, and c-Myc (OSKM factors). Although the OSKM factors have been shown to induce and aid in pluripotency, other factors such as Homeobox protein NANOG, LIN28, TRA-1-60, and C/EBPα aid in the efficiency of reprogramming. MicroRNAs and other small-molecule-driven processes have been used as a means of increasing the efficiency of reprogramming somatic cells to pluripotency.

Maturation
The maturation phase begins at the end of the initiation phase, when the first pluripotent genes are expressed. The cell is preparing itself to be independent from the defined factors that started the reprogramming process. The first genes to be detected in iPSCs are Oct4, Homeobox protein NANOG, and Esrrb, followed later by Sox2. In the later stages of maturation, transgene silencing marks the start of the cell becoming independent from the induced transcription factors. Once the cell is independent, the maturation phase ends and the stabilization phase begins.

As reprogramming has proven to be a variable and low-efficiency process, not all cells complete the maturation phase and achieve pluripotency. Some cells that undergo reprogramming still undergo apoptosis at the beginning of the maturation stage because of oxidative stress brought on by the changes in gene expression. The use of microRNA, proteins, and different combinations of the OSKM factors has started to lead towards a higher efficiency rate of reprogramming.

Stabilization
The stabilization phase refers to the processes in the cell that occur after the cell reaches pluripotency. Genetic markers include the expression of Sox2 and X chromosome reactivation, while epigenetic changes include the telomerase extending the telomeres and loss of the cell’s epigenetic memory. The epigenetic memory of a cell is reset by the changes in DNA methylation, using activation-induced cytidine deaminase (AID), TET enzymes (TET), and DNA methyltransferases (DNMTs), starting in the maturation phase and continuing into the stabilization stage. Once the epigenetic memory of the cell is lost, the possibility of differentiation into the three germ layers is achieved. This is considered a fully reprogrammed cell as it can be passaged without reverting to its original somatic cell type.

In cell culture systems
Reprogramming can also be induced artificially through the introduction of exogenous factors, usually transcription factors. In this context, it often refers to the creation of induced pluripotent stem cells from mature cells such as adult fibroblasts. This allows the production of stem cells for biomedical research, such as research into stem cell therapies, without the use of embryos. It is carried out by the transfection of stem-cell associated genes into mature cells using viral vectors such as retroviruses.

Transcription factors
One of the first transacting factors discovered to change a cell was found in a myoblast when the complementary DNA (cDNA) coding for MyoD was expressed and converted a fibroblast to a myoblast. Another transacting factor that directly transformed a lymphoid cell into a myeloid cell was C/EBPα. MyoD and C/EBPα are examples of a small number of single factors that can transform cells. More often, a combination of transcription factors work in conjunction to reprogram a cell.

OSKM
The OSKM factors (Oct4, Sox2, Klf4, and c-Myc) were initially discovered by Yamanaka in 2006, by the induction of a mouse fibroblast into an induced pluripotent stem cell (iPSC). Within the following year, these factors were used to induce human fibroblasts into iPSCs.

Oct4 is part of the core regulatory genes needed for pluripotency, as it is seen in both embryonic stem cells and tumors. Even a small increase in Oct4 allows the transition to pluripotency to begin. Oct4 works in conjunction with Sox2 for the expression of FGF4, which could aid in differentiation.

Sox2 is a gene used in maintaining pluripotency in stem cells. Oct4 and Sox2 work together to regulate hundreds of genes utilized in pluripotency. However, Sox2 is not the only possible Sox family member to participate in gene regulation with Oct4 – Sox4, Sox11, and Sox15 also participate, as the Sox protein is redundant throughout the stem cell genome.

Klf4 is a transcription factor used in proliferation, differentiation, apoptosis, and somatic cell reprogramming. When being utilized in cellular reprogramming, Klf4 prevents cell division of damaged cells using its apoptotic ability, and aids in histone acetyltransferase activity.

c-Myc is also known as an oncogene, and in certain conditions can become cancer causing. In cellular reprogramming, c-Myc is used for cell cycle progression, apoptosis, and cellular transformation for further differentiation.

NANOG
Homeobox protein NANOG (NANOG) is a transcription factor used to aid in the efficiency of generating iPSCs by maintaining pluripotency and suppressing cell determination factors. NANOG works by promoting chromatin accessibility through repression of histone markers, such as H3K27me3. NANOG aids recruitment of Oct4, Sox2, and Esrrb used in transcription, while also recruiting Brahma-related gene-1 (BRG1) for chromatin accessibility.

C/EBPα
C/EBPα is a commonly used factor when reprogramming cells into not only iPSCs, but also other cells. C/EBPα has been shown to act as a single transacting factor during direct reprogramming of a lymphoid cell into a myeloid cell. C/EBPα is considered a 'path breaker' that aids in preparing the cell for intake of the OSKM factors and specific transcription events. C/EBPα has also been shown to increase the efficiency of the reprogramming events.

Variability
The properties of cells obtained after reprogramming can vary significantly, in particular among iPSCs. Factors leading to variation in the performance of reprogramming and functional features of end products include genetic background, tissue source, reprogramming factor stoichiometry and stressors related to cell culture.