Complex traits

Complex traits are phenotypes that are controlled by two or more genes and do not follow Mendel's Law of Dominance. Their expression typically varies over a continuous range, and both genetic and environmental factors often contribute to that variation. Human height, for example, is a continuous trait: there is a wide range of heights, an estimated 50 genes affect it, and environmental factors such as nutrition also play a role. Other examples of complex traits include crop yield, plant color, and many diseases, including diabetes and Parkinson's disease. One major goal of genetic research today is to better understand the molecular mechanisms through which genetic variants act to influence complex traits. Complex traits are also known as polygenic traits and multigenic traits.

The existence of complex traits, which are far more common than Mendelian traits, represented a significant challenge to the acceptance of Mendel's work. Modern understanding recognizes three categories of complex traits: quantitative, meristic, and threshold. These traits have been studied on a small scale with observational techniques such as twin studies, and on a large scale with statistical techniques such as quantitative trait loci (QTL) mapping and genome-wide association studies (GWAS). The overall picture of how genes interact with each other and with the environment, and of how those interactions lead to variation in a trait, is called the trait's genetic architecture.

History
When Mendel's work on inheritance was rediscovered in 1900, scientists debated whether Mendel's laws could account for the continuous variation observed for many traits. One group, known as the biometricians, argued that continuous traits such as height were largely heritable but could not be explained by the inheritance of single Mendelian genetic factors. Work published by Ronald Fisher in 1918 mostly resolved the debate by demonstrating that the variation in continuous traits could be accounted for if multiple such factors contributed additively to each trait. However, the number of genes involved in such traits remained undetermined; until recently, genetic loci were expected to have moderate effect sizes and each explain several percent of heritability. After the conclusion of the Human Genome Project in 2003, it seemed that the sequencing and mapping of many individuals would soon allow for a complete understanding of traits' genetic architectures. However, variants discovered through genome-wide association studies (GWASs) accounted for only a small percentage of predicted heritability; for example, while height is estimated to be 80-90% heritable, early studies only identified variants accounting for 5% of this heritability. Later research showed that most of this missing heritability could be accounted for by common variants that GWASs missed because their effect sizes fell below significance thresholds; a smaller percentage is accounted for by rare variants with larger effect sizes, although in certain traits such as autism, rare variants play a more dominant role. While many genetic factors involved in complex traits have been identified, determining their specific contributions to phenotypes, and in particular the molecular mechanisms through which they act, remains a major challenge.

Quantitative traits
Quantitative traits have phenotypes that are expressed on continuous ranges. Many different genes, with differing effect sizes, influence the phenotype. Many of these traits are somewhat heritable. For example, height is estimated to be 60-80% heritable; however, other quantitative traits have varying heritability.
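The way many genes can produce a near-continuous range can be illustrated with a small simulation. The following Python sketch is only an illustration under simplifying assumptions that are not from the text above: every locus has the same effect size, alleles are independent with frequency 0.5, and the function name `simulate_trait` and all parameter values are hypothetical.

```python
import random


def simulate_trait(n_individuals, n_loci, effect=1.0, env_sd=1.0,
                   allele_freq=0.5, seed=0):
    """Simulate phenotypes for a purely additive polygenic trait.

    Assumption: every locus has the same effect, which is a simplification;
    real loci differ in effect size.
    """
    rng = random.Random(seed)
    phenotypes = []
    for _ in range(n_individuals):
        # Each locus carries 0, 1, or 2 copies of the trait-increasing allele.
        allele_count = sum(
            (rng.random() < allele_freq) + (rng.random() < allele_freq)
            for _ in range(n_loci)
        )
        genetic_value = effect * allele_count
        environment = rng.gauss(0.0, env_sd)  # environmental deviation
        phenotypes.append(genetic_value + environment)
    return phenotypes


# With the environment switched off, one locus gives only a few phenotype
# classes, while many loci give a much finer, nearly continuous spread.
one_locus = simulate_trait(2000, n_loci=1, env_sd=0.0)
many_loci = simulate_trait(2000, n_loci=50, env_sd=0.0)
print(len(set(one_locus)), "distinct phenotype values with 1 locus")
print(len(set(many_loci)), "distinct phenotype values with 50 loci")
```

Setting `env_sd` above zero adds environmental variation on top of the genetic value, which smooths the distribution further.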

Meristic traits
Meristic traits have phenotypes that are described by whole numbers. An example is the number of eggs a chicken lays: a chicken can lay one, two, or five eggs a week, but never half an egg. The environment can also affect expression, as chickens lay fewer eggs at certain times of year.

Threshold traits
Threshold traits have phenotypes with a limited number of expressions (usually two). They are complex traits because multiple genetic and environmental factors affect the phenotype. The phenotype before the threshold is referred to as normal or absent, and after the threshold as lethal or present. These traits are often examined in a medical context, because many diseases exhibit this or a similar pattern. An example is type 2 diabetes, in which the phenotype is either normal/healthy or lethal/diseased.
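One common way to picture a threshold trait is the liability-threshold model, in which an unobserved continuous "liability" sums genetic and environmental contributions and the phenotype switches once liability crosses a cut-off. The Python sketch below assumes that model; the normal distributions, the threshold value, and the name `simulate_threshold_trait` are illustrative assumptions, not values from any study.

```python
import random


def simulate_threshold_trait(n, threshold=1.5, genetic_sd=1.0,
                             env_sd=1.0, seed=0):
    """Return (liability, is_affected) pairs under a liability-threshold model.

    Assumption: genetic and environmental contributions are independent
    and normally distributed; the threshold is an arbitrary example value.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        liability = rng.gauss(0.0, genetic_sd) + rng.gauss(0.0, env_sd)
        # The observed phenotype is binary even though liability is continuous.
        results.append((liability, liability > threshold))
    return results


individuals = simulate_threshold_trait(10000)
affected = sum(1 for _, is_affected in individuals if is_affected)
print(f"{affected} of {len(individuals)} individuals exceed the threshold")
```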

Twin studies
Twin studies are observational studies that compare monozygotic and dizygotic twins, preferably of the same sex. They are used to estimate the environmental influence on complex traits. Monozygotic twins in particular are estimated to share 100% of their DNA with each other, so any phenotypic differences between them should be caused by environmental influences.
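A classical way to turn twin correlations into numbers is Falconer's formula, which estimates heritability as twice the difference between the monozygotic and dizygotic twin correlations. The Python sketch below assumes that simple decomposition (additive genetic effects, shared environment, unique environment); the function name and the example correlations are hypothetical and chosen only to show the arithmetic.

```python
def falconer_estimates(r_mz, r_dz):
    """Decompose trait variance from twin correlations (Falconer's formula).

    r_mz: phenotypic correlation between monozygotic twins
    r_dz: phenotypic correlation between dizygotic twins
    Assumption: dizygotic twins share half of their segregating genetic
    variation on average, and both twin types share environments equally.
    """
    heritability = 2 * (r_mz - r_dz)   # additive genetic share
    shared_env = 2 * r_dz - r_mz       # environment shared by co-twins
    unique_env = 1 - r_mz              # environment unique to each twin
    return {
        "heritability": heritability,
        "shared_env": shared_env,
        "unique_env": unique_env,
    }


# Hypothetical correlations, for illustration only.
estimates = falconer_estimates(r_mz=0.8, r_dz=0.5)
for component, value in estimates.items():
    print(f"{component}: {value:.2f}")
```

The three components sum to one by construction, so a higher monozygotic correlation leaves less variance attributed to the unique environment.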

QTL mapping
Many complex traits are genetically determined by quantitative trait loci (QTL). QTL analysis can be used to find regions of the genome that are associated with a complex trait. To find these regions, researchers select a trait of interest and take a group of individuals of a species with varying expressions of this trait. They designate these individuals as founding parents and measure the trait in them. This can be difficult, as most traits do not have a clear cut-off point. Researchers then genotype the parents using molecular markers such as SNPs or RFLPs. These markers act as signposts pointing to the area where the genes associated with a trait are located. From there, the parents are crossed to produce offspring. These offspring are then bred to produce a further generation, and the breeding scheme can vary: they can be crossed with their siblings, self-fertilized (which is different from asexual reproduction), or backcrossed. Because of recombination, the resulting generation is more genetically diverse. The genotypes and phenotypes of this new generation are measured, and the phenotypes are compared with the marker genotypes to identify which alleles are associated with the trait. This does not establish a direct causal relationship between these regions and the trait, but it does indicate that genes in these regions have some relationship with the trait and reveals where to look in future research.
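The final comparison step can be sketched as a single-marker scan: for each marker, measure how strongly genotype predicts phenotype across the mapping generation. The Python example below is a minimal sketch on simulated data, not a full QTL method; it assumes independent markers (no linkage), genotypes coded as 0, 1, or 2 allele copies, and one hypothetical causal marker, and the names `pearson_r` and `scan_markers` are illustrative.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a marker or trait with no variation carries no signal
    return cov / (var_x * var_y) ** 0.5


def scan_markers(genotypes, phenotypes):
    """Return r^2 between each marker's genotype and the phenotype."""
    n_markers = len(genotypes[0])
    return [
        pearson_r([g[m] for g in genotypes], phenotypes) ** 2
        for m in range(n_markers)
    ]


# Simulated mapping population (all values are hypothetical).
rng = random.Random(1)
N_OFFSPRING, N_MARKERS, CAUSAL_MARKER, EFFECT = 300, 10, 3, 2.0
genotypes = [[rng.choice([0, 1, 2]) for _ in range(N_MARKERS)]
             for _ in range(N_OFFSPRING)]
phenotypes = [EFFECT * g[CAUSAL_MARKER] + rng.gauss(0.0, 1.0)
              for g in genotypes]

r_squared = scan_markers(genotypes, phenotypes)
best = max(range(N_MARKERS), key=lambda m: r_squared[m])
print(f"strongest association at marker {best} (r^2 = {r_squared[best]:.2f})")
```

In real crosses neighboring markers are linked, so the signal appears as a peak spanning a region rather than at a single marker.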

GWAS
A genome-wide association study (GWAS) is a technique used to find gene variants linked to complex traits. A GWAS is done with populations that mate randomly because all the genetic variants are tested at once. Researchers can then compare the different alleles at a locus. It is similar to QTL mapping. The most common set-up for a GWAS is a case-control study, which compares two populations: one with the trait of interest and one without it. Researchers genotype every subject and look for SNPs whose allele frequencies differ between the two populations. Both populations should have similar environmental backgrounds, because a GWAS looks only at DNA and does not account for differences caused by environmental factors. A statistical test, such as a chi-squared test, is used to determine whether each tested SNP is associated with the trait. The test produces a p-value, which the researcher uses to conclude whether the SNP is significant. The p-value cut-off can be set higher or lower at the researcher's discretion. The data can then be visualized in a Manhattan plot, which plots -log10(p-value) for each SNP so that the most significant SNPs appear at the top of the graph.
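The per-SNP test can be sketched as a chi-squared test on a 2x2 table of allele counts in cases and controls, followed by the -log10 transform used for a Manhattan plot. The Python example below is a minimal sketch: the allele counts and SNP names are made up, and the 5e-8 cut-off is a commonly used genome-wide threshold included here as an assumption rather than a fixed rule.

```python
import math


def allelic_chi_squared(case_counts, control_counts):
    """Chi-squared test (1 degree of freedom) on a 2x2 allele-count table.

    case_counts, control_counts: (reference_alleles, alternate_alleles)
    Returns (chi2, p_value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        return 0.0, 1.0  # no variation in a row or column: nothing to test
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: (cases, controls), each as (ref, alt).
snps = {
    "snp_a": ((700, 300), (500, 500)),
    "snp_b": ((505, 495), (500, 500)),
}
THRESHOLD = 5e-8  # assumed cut-off; researchers may choose another value

results = {}
for name, (cases, controls) in snps.items():
    chi2, p = allelic_chi_squared(cases, controls)
    results[name] = (chi2, p)
    flag = "significant" if p < THRESHOLD else "not significant"
    print(f"{name}: chi2={chi2:.2f}, -log10(p)={-math.log10(p):.2f}, {flag}")
```

Because so many SNPs are tested at once, a stricter cut-off than the usual 0.05 is generally chosen to limit false positives.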

Genetic architecture
Genetic architecture is an overall description of all the genetic factors that play a role in a complex trait, and it is a core foundation of quantitative genetics. Using mathematical models and statistical analyses such as GWAS, researchers can estimate the number of genes affecting a trait as well as the level of influence each gene has on it. This is not always easy, as the architecture of one trait can differ between two separate populations of the same species. One reason is that the populations may live in different environments: differing environments can lead to different interactions between genes and the environment, changing the architecture in each population.

Recently, with rapid increases in available genetic data, researchers have begun to characterize the genetic architecture of complex traits better. One surprise has been the observation that most loci identified in GWASs are found in noncoding regions of the genome; therefore, instead of directly altering protein sequences, such variants likely affect gene regulation. To understand the precise effects of these variants, QTL mapping has been employed to examine data from each step of gene regulation; for example, mapping RNA-sequencing data can help determine the effects of variants on mRNA expression levels, which then presumably affect the numbers of proteins translated. A comprehensive analysis of QTLs involved in various regulatory steps (promoter activity, transcription rates, mRNA expression levels, translation levels, and protein expression levels) showed that high proportions of QTLs are shared, indicating that regulation behaves as a "sequential ordered cascade" with variants affecting all levels of regulation. Many of these variants act by affecting transcription factor binding and other processes that alter chromatin function, steps which occur before and during RNA transcription.

To determine the functional consequences of these variants, researchers have largely focused on identifying key genes, pathways, and processes that drive complex trait behavior; an inherent assumption has been that the most statistically significant variants have the greatest impact on traits because they act by affecting these key drivers. For example, one study hypothesizes that there exist rate-limiting genes pivotal to the function of gene regulatory networks. Other studies have identified the functional impacts of key genes and mutations on disorders, including autism and schizophrenia. However, a 2017 analysis by Boyle et al. argues that while genes which directly impact complex traits do exist, regulatory networks are so interconnected that any expressed gene affects the functions of these "core" genes; this idea is called the "omnigenic" hypothesis. While these "peripheral" genes each have small effects, their combined impact far exceeds the contributions of core genes themselves. To support the hypothesis that core genes play a smaller than expected role, the authors describe three main observations: the heritability for complex traits is spread broadly, often uniformly, across the genome; genetic effects do not appear to be mediated by cell-type specific function; and genes in the relevant functional categories contribute only modestly more to heritability than other genes. One alternative to the omnigenic hypothesis is the idea that peripheral genes act not by altering core genes but by altering cellular states, such as the speed of cell division or hormone response.