Human somatic variation

Human somatic variations are mutations that arise in somatic cells, both at early stages of development and in adult cells. These variations may or may not lead to pathogenic phenotypes, and their function in healthy conditions is not yet completely clear.

The term mosaic (from medieval Latin musaicum, meaning "work of the Muses") has been used since antiquity to refer to an artistic patchwork of ornamental stones, glass, gems, or other precious material. At a distance, the collective image appears as it would in a painting; only on close inspection do the individual components become recognizable. In biological systems, mosaicism implies the presence of more than one genetically distinct cell line in a single organism. This phenomenon can not only result in major phenotypic changes but also reveal the expression of otherwise lethal genetic mutations.

Genetic mutations involved in mosaicism may be due to endogenous factors, such as transposons and ploidy changes, or exogenous factors, such as UV radiation and nicotine.

Somatic mosaicism in healthy human tissues
Somatic mosaicism arises as a result of somatic mutations: genomic (or even mitochondrial) alterations ranging in size from a single nucleotide to the gain or loss of whole chromosomes within somatic cells. These alterations begin at an early stage (conception or pre-implantation) and continue during aging, giving rise to phenotypic heterogeneity among cells, which may lead to the development of diseases such as cancer. Array-based techniques for genome-wide screening of copy number variants and loss of heterozygosity in single cells have shown that chromosome aneuploidies, uniparental disomies, segmental deletions, duplications, and amplifications occur frequently during embryogenesis. Yet not all somatic mutations are propagated to the adult individual, owing to the phenomenon of cell competition.

Genetic alterations involving the gain or loss of entire chromosomes occur predominantly during the anaphase stage of cell division. However, they are uncommon in somatic cells because they are usually selected against owing to their deleterious consequences. Somatic variation arising during embryonic development is illustrated by monozygotic twins, who carry different copy number profiles and epigenetic marks, and these differences keep increasing with age.

Early research on somatic mutations in aging showed that deletions, inversions, and translocations of genetic material are common in aging mice, and that aging genomes tend to contain visible chromosomal changes, mitotic recombination, whole-gene deletions, intragenic deletions, and point mutations. Other age-related changes include loss of methylation, increasing heterogeneity in gene expression that correlates with genomic abnormalities, and telomere shortening. It is uncertain whether transcription-based DNA repair plays a part in maintaining somatic mutations in aging tissues.

In some cells, somatically acquired alterations can revert to wild-type alleles, a phenomenon known as reversion mosaicism. This can be due to endogenous mechanisms such as homologous recombination, codon substitution, second-site suppressor mutations, DNA slippage, and mobile elements.

Somatic cancer-associated mutations in normal tissues
The advent of next-generation sequencing (NGS) technologies has increased the resolution of mutation detection and has revealed that older individuals accumulate not only chromosomal alterations but also abundant mutations in cancer driver genes.

Age-associated accumulation of chromosomal alterations has been documented with a variety of cytogenetic approaches, from chromosome painting to single nucleotide polymorphism (SNP) arrays.

Numerous studies have demonstrated that clonal populations might lead to loss of organismal health through the functional decline of tissue and/or the promotion of disease processes such as cancer. Aberrant clonal expansions (ACEs) resulting from cancer-associated mutations are common in noncancerous tissue and accumulate with age. This is seen in most organisms and affects multiple tissues.

In the hematopoietic compartment, mutations include both large structural chromosomal alterations and point mutations affecting cancer-associated genes. Some translocations appear to occur very early in life. The frequency of these events is low in people younger than 50 years (<0.5%) but rises rapidly to 2–3% of individuals in their 70s and 80s. This phenomenon has been termed clonal hematopoiesis. A number of environmental factors, such as smoking, viral infections, and pesticide exposure, may contribute not only through mutation induction but also by modulating clonal expansion.

By contrast, the detection of somatic variants in normal solid tissues has historically proved difficult. The main reasons are the generally slower replicative index, clonally restrictive tissue architecture, difficulty of tissue access, and low frequency of mutation occurrence. Recently, the analysis of somatic mutations in benign tissues adjacent to tumors revealed that 80% of samples harbor clonal mutations, with higher frequency associated with older age, smoking, and concurrent mutations in DNA repair genes. With the advent of NGS, it has become increasingly clear that somatic mutations accumulate with aging in normal tissue, even in individuals who are cancer-free.

These findings suggest that clonal expansions driven by cancer genes are a near-universal feature of aging: NGS technologies have revealed that clonal expansions of cancer-associated mutations are very common in somatic tissues.

Human somatic variations in brain
Several recent studies have highlighted the prevalence of somatic variations in both pathological and healthy nervous systems.

Somatic aneuploidy, as well as SNVs (single-nucleotide variations) and CNVs (copy number variations), has been observed and linked to brain dysfunction when arising during prenatal brain development. Somatic aneuploidy has been reported at rates of 1.3–40%, potentially increasing with age, and has therefore been proposed as a mechanism for generating normal genetic diversity among neurons.

This hypothesis has been examined through single-cell sequencing studies, which allow a direct assessment of single neuronal genomes and therefore a systematic characterization of somatic aneuploidies and subchromosomal CNVs in these cells. Postmortem brains of both healthy and diseased humans have made it possible to study how CNVs differ between the two groups. These studies found that somatic aneuploidies are quite rare in healthy brains, whereas somatic CNVs are not.

These studies also showed that clonal CNVs exist in both pathological and healthy brains. This means that some CNVs can arise in early development without causing disease, even though, compared with the CNVs arising in other cell types such as lymphoblasts, those in the brain are more often private. This could be explained by the fact that lymphoblasts continue to proliferate and can therefore generate clonal CNVs over a long period, whereas adult neurons no longer replicate, so the clonal CNVs they carry must have been generated at an early stage of development.

The data highlighted a tendency in neurons toward the loss, rather than the gain, of copies when compared with lymphoblasts. These differences suggest that the molecular mechanisms by which CNVs arise in these two cell types may be different.

L1-associated mosaicism in brain cells
The retrotransposon LINE-1 (long interspersed element 1, L1) is a transposable element that has colonized the mammalian germline. L1 retrotransposition can also occur in somatic cells, causing mosaicism (somatic L1-associated variations, SLAVs), and in cancer. Retrotransposition is a copy-and-paste process in which an RNA template is reverse-transcribed into DNA and integrated at random in the genome. The human genome contains around 500,000 copies of L1, which occupy about 17% of it. L1 mRNA encodes two proteins, one of which has reverse transcriptase and endonuclease activities that allow retrotransposition in cis. However, most of these copies are rendered immobile by mutations or 5′ truncation, leaving only about 80–100 mobile L1s per human genome, of which only about 10 are considered "hot" L1s, able to mobilize efficiently.

L1 transposes through a mechanism called target-primed reverse transcription (TPRT). Insertions generated in this way carry characteristic hallmarks: an L1 endonuclease motif at the insertion site, target site duplications (TSDs), and a poly-A tail. The process shows a cis preference.

L1 mobilization has been observed in neural progenitors during fetal and adult neurogenesis, suggesting that the brain may be a hotspot of L1 mosaicism. Moreover, some studies suggested that non-dividing neurons can also support L1 mobilization. This has been confirmed by single-cell genomic studies.

Single-cell paired-end sequencing experiments found that SLAVs are present in both neurons and glia of the hippocampus and frontal cortex. Every neural cell has a similar probability of containing a SLAV, suggesting that somatic variation is a random phenomenon, not restricted to a specific group of cells. The occurrence of SLAVs in the brain is estimated at 0.58–1 SLAVs per cell, involving 44–63% of brain cells.
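
One way to see how the two estimates relate is a simple Poisson model. The sketch below assumes that insertions are distributed independently and at random across cells, an assumption made here for illustration and not a statement of how the original estimates were derived. The function name fraction_with_insertion is hypothetical. Under this model, the fraction of cells carrying at least one SLAV is 1 − exp(−mean).

```python
import math


def fraction_with_insertion(mean_per_cell: float) -> float:
    """Fraction of cells with at least one insertion under a Poisson model.

    Assumption (illustrative only): insertions per cell follow a Poisson
    distribution with the given mean, so P(no insertion) = exp(-mean).
    """
    if mean_per_cell < 0:
        raise ValueError("mean_per_cell must be non-negative")
    return 1.0 - math.exp(-mean_per_cell)


# The two ends of the per-cell range quoted in the text.
for mean in (0.58, 1.0):
    share = fraction_with_insertion(mean)
    print(f"{mean:.2f} SLAVs per cell -> {share:.1%} of cells with >= 1 SLAV")
# 0.58 SLAVs per cell -> 44.0% of cells with >= 1 SLAV
# 1.00 SLAVs per cell -> 63.2% of cells with >= 1 SLAV
```

Under this assumption, means of 0.58 and 1 SLAV per cell correspond to roughly 44% and 63% of cells, in line with the range quoted above.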

Since experiments showed that about half of the analyzed SLAVs lack target site duplications (TSDs), another kind of L1-associated variant might occur. These sequences lack endonuclease activity of their own but still carry endonuclease motifs, so they can be retrotransposed in trans.

One application of the study of somatic mosaicism in the brain could be the tracing of specific brain cells. If a somatic L1 insertion occurs in a progenitor cell, the unique variant could be used to trace the development, localization, and spread of that progenitor's descendants through the brain. Conversely, if the somatic L1 insertion occurs late in development, it will be present in only a single cell or a small group of cells. Tracing somatic variations could therefore help to establish at which point in development they occurred. Further experiments are necessary to understand the role of somatic mosaicism in brain function, since small groups of cells or even single cells can affect network activity.
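
The tracing logic described above can be sketched in a few lines: an insertion found in several cells is a candidate early, progenitor-derived event, while one found in a single cell is a candidate late event. This is a minimal illustration on made-up data. The cell and insertion identifiers, the function name classify_insertions, and the threshold of two cells are all hypothetical, and real single-cell data would also require handling of false positives and allelic dropout.

```python
from collections import Counter


def classify_insertions(cells, shared_min_cells=2):
    """Label each insertion as 'shared' or 'private'.

    cells: mapping of cell id -> set of insertion ids detected in that cell.
    An insertion seen in at least shared_min_cells cells is labelled
    'shared' (candidate early event); otherwise 'private' (candidate late
    event). Returns insertion id -> (label, number of cells carrying it).
    """
    counts = Counter(
        insertion for insertions in cells.values() for insertion in set(insertions)
    )
    return {
        insertion: ("shared" if n >= shared_min_cells else "private", n)
        for insertion, n in counts.items()
    }


# Hypothetical calls for four cells; not real data.
example_cells = {
    "cell_1": {"ins_A", "ins_B"},
    "cell_2": {"ins_A"},
    "cell_3": {"ins_A", "ins_C"},
    "cell_4": set(),
}

for insertion, (label, n) in sorted(classify_insertions(example_cells).items()):
    print(f"{insertion}: {label} (found in {n} cell(s))")
```

In this toy example, ins_A is shared by three cells and would mark a common lineage, whereas ins_B and ins_C are private to one cell each.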

Human somatic variations and the immune system
Human somatic mutations (HSMs) are exploited intensively by the immune system for the production of antibodies. HSMs, and recombination in particular, are the reason why antibodies can recognize an epitope with such high specificity and sensitivity.

Antibodies are produced by B cells. Each antibody is composed of two heavy chains (IgH, encoded by the IGH gene) and two light chains (IgL, encoded by either the IGL or the IGK gene). Each chain is in turn composed of a constant region (C) and a variable region (V). The constant region of the heavy chain is important in B-cell receptor (BCR) signaling and determines the type of immunoglobulin (IgA, IgD, IgE, IgG, or IgM). The variable region is responsible for recognition of the target epitope and is the product of recombination processes in the related loci.

During B cell development, the Ig genes in the B cell genome undergo recombination, and after exposure to an antigen the antibody genes are further refined until recognition of the epitope is optimized. Recombination involves the IGH locus first and then the IGL and IGK loci. The functional IGH, IGK, and IGL genes are all products of the V(D)J recombination process, which involves the variable (V), diversity (D), and joining (J) segments. All three segments (V, D, J) are involved in the formation of the heavy chain, while only V and J recombination products encode the light chain.

Recombination between these regions allows the formation of 10^12–10^18 potential different sequences. However, this number is an overestimate, since many factors limit the diversity of the B cell repertoire, first of all the actual number of B cells in the organism.
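
The limit imposed by cell number can be made concrete with a small calculation. The sketch below assumes that each B cell expresses a single antibody sequence, so the number of distinct sequences present at one time cannot exceed the number of B cells. The B cell count used here is a placeholder chosen for illustration, not a measured value, and the function name realized_repertoire_bound is hypothetical.

```python
def realized_repertoire_bound(potential_sequences: float, b_cell_count: float) -> float:
    """Upper bound on distinct antibody sequences present at one time.

    Assumption (illustrative only): each B cell expresses one sequence, so
    the realized repertoire cannot exceed the number of B cells.
    """
    if potential_sequences < 0 or b_cell_count < 0:
        raise ValueError("inputs must be non-negative")
    return min(potential_sequences, b_cell_count)


# Placeholder value for illustration only; not taken from the text.
ASSUMED_B_CELL_COUNT = 1e11

# The two ends of the potential-diversity range quoted in the text.
for potential in (1e12, 1e18):
    bound = realized_repertoire_bound(potential, ASSUMED_B_CELL_COUNT)
    print(
        f"potential {potential:.0e} sequences: at most {bound:.0e} realized "
        f"({bound / potential:.0e} of the potential)"
    )
```

With any B cell count below the potential diversity, the bound is set by the number of cells, which is the sense in which the theoretical figure overestimates the actual repertoire.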

Cardiac mosaicism
Somatic mosaicism has been noted in the heart. Sequencing suggested that mosaic variation in the gap junction protein connexin, found in three of 15 patients, might contribute to atrial fibrillation, although subsequent reports in larger numbers of patients found no examples among a large panel of genes. At Stanford, a team led by Euan Ashley demonstrated somatic mosaicism in the heart of a newborn presenting with life-threatening arrhythmia. Family-based genome sequencing, tissue RNA sequencing, and single-cell genomics techniques were used to verify the finding. A model combining partial and ordinary differential equations with inputs from heterologous single-channel electrophysiology experiments of the genetic variant recapitulated certain aspects of the clinical presentation.