User:Taarguss/sandbox

Epigenome-wide association studies (EWAS) are statistical studies that aim to discover the role of non-genetic variation in disease. Using next-generation sequencing technologies and known epigenetic variants as biomarkers, these studies attempt to uncover areas of the genome where epigenetic marks may play a molecular role in a disease. Certain epigenetic variants have been reported to be associated with human diseases as well as other complex phenotypes, prompting further study. In recent years many EWAS have shown epigenetic variants to be involved in diseases such as diabetes, cancer, arthritis and obesity, as well as in human ageing.

Background
Genome-wide association studies (GWAS) have been carried out for the past decade but focus on genetic variants. GWAS have proved efficient because genotyping is straightforward and the available technologies allow full genome coverage. In the past few years, with further understanding of the complexity of the regulation of gene expression, attention has turned to the role of non-genetic variants in gene regulation and how they influence disease.

Epigenetics is variation in gene expression not caused by changes in DNA sequence. Epigenetic mechanisms include DNA methylation, histone tail modification and non-coding RNAs. The full range of epigenetic modifications within the genome is potentially enormous. Multiple studies have shown that epigenetic mechanisms play a key role in genome function during development as well as ageing.

Epigenetic variation can be categorised into two main groups: DNA methylation and histone modification. DNA methylation is the addition of a methyl group to the C5 position of a cytosine. Cytosine methylation commonly occurs at CpG sites (cytosine-phosphate-guanine), which are clustered into CpG islands, with the surrounding domains being referred to as CpG shores. CpG islands are commonly found in the promoter regions of genes, and methylation of CpG islands is correlated with gene silencing. Histone modification is the post-translational chemical modification of the histone tail. Histones are subject to over 100 modifications, including combinations of acetylation, methylation, phosphorylation and ubiquitination.

In the context of an EWAS, the most common epigenetic mark used as a biomarker is cytosine methylation, for a number of technical and biological reasons. Cytosine methylation is relatively stable over time, as it is dependent upon the underlying DNA sequence. This allows for the study of cytosine methylation changes over time.

Cytosine methylation also provides potential therapeutic targets and clinical applications. The areas of the genome where cytosine methylation often occurs are easily accessible with modern biotechnologies. CpG islands are also associated with promoter regions and housekeeping genes; changes in the methylation of CpG cytosines are therefore associated with a variety of tumours and with deregulation of gene transcription.

The common types of cytosine methylation changes measured in an EWAS are:
 * Methylation variable position (MVP): a CpG site that shows differential methylation.
 * Differentially methylated region (DMR): a region of the genome that shows different methylation to surrounding CpG sites.
 * Variable methylated region (VMR): a specific genomic region that shows increased variability in methylation.
 * Allele-specific methylation: positions that vary in methylation based on the parent of origin, a polymorphism or a stochastic event.

With the increasing number of EWAS being performed, more resources are becoming available to researchers. The Human Epigenome Project is a collaboration aiming to provide a publicly available catalogue of methylation variable positions in the human genome.

An EWAS shares many similarities with a GWAS but also differs from it in core respects. The main differences come from the types of biomarkers used and, as a result, the design of the study. A GWAS uses single-nucleotide polymorphisms (SNPs) as biomarkers to find associated regions and relies upon linkage disequilibrium to detect association across the entire genome. An EWAS also detects association, but it uses differences in cytosine methylation as biomarkers. As the epigenome is highly dynamic, studies are also performed across a timeframe, monitoring the changes in cytosine methylation in the genome.

Methods
The goal of an EWAS is to identify loci with epigenetic marks that are meaningfully associated with the phenotype under study. Several different study designs are employed depending on the researcher's hypothesis and whether the study is exploratory (finding novel associations) or aiming to find causal epigenetic marks:
 * Case-control study: in this design there is a case group that consists of individuals with the phenotype of interest and a control group that is homogeneous with the case group in all respects aside from the phenotype. All samples have their epigenomes sequenced, and differences between the case and control groups can be used to ascertain epigenetic marks that are associated with disease. This type of study may yield spurious results produced by confounding.
 * Longitudinal study: in this design individuals are monitored over time and their changes in phenotype are recorded. This type of study may be better able to identify causal epigenetic marks than case-control designs, as biases introduced during sample acquisition can be reduced.
 * Monozygotic twin study: in this design twins who are discordant for the phenotype of interest are recruited. An advantage of this type of study is that confounding caused by differences in genetic variation is removed.

There are several microarray and next-generation sequencing technologies available to measure cytosine methylation. The most common methods to measure DNA methylation are bisulfite sequencing and methylated DNA immunoprecipitation (MeDIP). Briefly, the former treats the DNA sample with bisulfite ions, converting unmethylated cytosine to uracil. The latter uses antibodies against methylated cytosine to enrich for methylated DNA fragments. Methylation at any given CpG site is binary at the level of a single DNA molecule; at the tissue level, however, it is the average over many copies of DNA. The absolute level of methylation at a CpG site measured by these technologies is therefore a percentage.
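The averaging over many DNA copies can be illustrated with a minimal sketch. It assumes that bisulfite sequencing reads covering one CpG site have already been called as methylated or unmethylated; the function name `methylation_level` and the read counts are hypothetical and chosen only for illustration.

```python
def methylation_level(methylated_reads, unmethylated_reads):
    """Return the fraction of reads at a CpG site that are methylated.

    Each read comes from one DNA molecule and is either methylated or not,
    so the tissue-level value is an average between 0 and 1.
    """
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("no reads cover this CpG site")
    return methylated_reads / total


# Hypothetical counts: 30 methylated and 10 unmethylated reads.
level = methylation_level(30, 10)
print(f"{level:.0%} methylated")  # prints "75% methylated"
```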



Statistical analyses of methylation can be conducted in several ways. The first is to examine the difference in methylation levels at individual CpG sites between groups defined by the phenotype. Because there are many CpG sites, the tests for significant differences must account for multiple comparisons, for example by controlling the false-discovery rate. The effect size and biological relevance of highly statistically significant marks should also be assessed before declaring any novel associations.
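One widely used way to control the false-discovery rate is the Benjamini-Hochberg step-up procedure. The sketch below is an illustration only, not the method of any particular EWAS; it assumes one p-value per CpG site has already been computed, and the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking which tests are declared significant
    while controlling the false-discovery rate at level `fdr`."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical p-values for five CpG sites.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6]))
# prints [True, True, False, False, False]
```

In practice an EWAS tests hundreds of thousands of sites, so an established statistics library would normally be used rather than a hand-written loop.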

Unlike in genome-wide association studies, testing for differences at individual CpG sites may be underpowered due to the variance in methylation levels and the non-uniform distribution of sites across the human epigenome. Hence another approach to analysing methylation data is to identify differentially methylated regions between groups.
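A region-based analysis can be sketched very simply as grouping neighbouring CpG sites that each show a difference between groups. The heuristic below is a simplified illustration, not a published region-finding method; the function name `candidate_regions`, the 500 bp gap and the minimum of three sites are all assumptions made for the example.

```python
def candidate_regions(positions, significant, max_gap=500, min_sites=3):
    """Group runs of significant CpG sites into candidate regions.

    positions   -- CpG coordinates on one chromosome, sorted ascending
    significant -- booleans, True where the site differs between groups
    Returns a list of (start, end, n_sites) tuples.
    """
    runs = []
    current = []  # positions in the run currently being built
    for pos, is_sig in zip(positions, significant):
        # A non-significant site or a large gap ends the current run.
        if not is_sig or (current and pos - current[-1] > max_gap):
            if current:
                runs.append(current)
            current = []
        if is_sig:
            current.append(pos)
    if current:
        runs.append(current)
    # Keep only runs with enough sites to be called a region.
    return [(run[0], run[-1], len(run)) for run in runs if len(run) >= min_sites]


# Hypothetical coordinates and per-site results.
positions = [100, 150, 220, 400, 5000, 5050, 5100, 5120]
significant = [True, True, True, False, True, True, True, True]
print(candidate_regions(positions, significant))
# prints [(100, 220, 3), (5000, 5120, 4)]
```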

Limitations
The choice of an appropriate tissue in an EWAS is important to ensure the study is biologically sound, as only certain tissue types will exhibit epigenetic variation associated with the disease.

Similarly, different cell types exhibit different levels of DNA methylation. This can cause systematic bias in an EWAS because a difference in methylation levels between cases and controls could be caused by differences in cell type mixtures rather than the underlying trait of interest. The effect of cell population heterogeneity can be accounted for in an analysis using deconvolution methods, which require access to reference epigenomes of each cell type.
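The idea behind reference-based deconvolution can be shown with a deliberately reduced sketch. It assumes a sample is a mixture of only two cell types and that the bulk methylation level at each CpG site is the proportion-weighted average of the two reference profiles; real methods handle many cell types with constrained regression. The function name `estimate_proportion` and all values are hypothetical.

```python
def estimate_proportion(bulk, ref_a, ref_b):
    """Least-squares estimate of the fraction of cell type A in a mixture
    of two cell types, given reference methylation levels at the same sites.

    Assumed model: bulk[i] = p * ref_a[i] + (1 - p) * ref_b[i] + noise
    """
    numerator = sum((m - b) * (a - b) for m, a, b in zip(bulk, ref_a, ref_b))
    denominator = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if denominator == 0:
        raise ValueError("reference profiles are identical")
    p = numerator / denominator
    return min(1.0, max(0.0, p))  # a proportion must lie in [0, 1]


# Hypothetical reference profiles at three CpG sites.
ref_a = [0.9, 0.1, 0.5]
ref_b = [0.1, 0.8, 0.5]
# A bulk sample simulated as 25% cell type A and 75% cell type B.
bulk = [0.25 * a + 0.75 * b for a, b in zip(ref_a, ref_b)]
print(round(estimate_proportion(bulk, ref_a, ref_b), 3))  # prints 0.25
```

The estimated proportions can then be included as covariates when testing for association, so that differences in cell mixture are not mistaken for an effect of the phenotype.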

Batch effects can be introduced during sequencing of the epigenome, adding non-biological sources of variation to the study. An example of a batch effect would be sequencing all cases on one day and all controls on another. Consequently, the case and control groups would have different levels of experimental error, possibly confounding the final results.
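One common precaution against the confounded design in this example is to spread cases and controls evenly across batches before sequencing. The sketch below is one possible way to do this, under the assumption that batches are simply numbered; the function name `assign_batches` and the sample identifiers are hypothetical.

```python
import random
from collections import defaultdict


def assign_batches(labels, n_batches, seed=0):
    """Assign samples to batches so that each phenotype group is spread
    as evenly as possible across batches.

    labels -- dict mapping sample ID to group, e.g. "case" or "control"
    Returns a dict mapping sample ID to a batch index (0 .. n_batches - 1).
    """
    rng = random.Random(seed)  # fixed seed makes the allocation reproducible
    by_group = defaultdict(list)
    for sample_id, group in labels.items():
        by_group[group].append(sample_id)

    assignment = {}
    for group in sorted(by_group):
        samples = sorted(by_group[group])
        rng.shuffle(samples)  # random order within each group
        for i, sample_id in enumerate(samples):
            assignment[sample_id] = i % n_batches  # deal out round-robin
    return assignment


# Hypothetical study with six cases and six controls in three batches.
labels = {f"case{i}": "case" for i in range(6)}
labels.update({f"ctrl{i}": "control" for i in range(6)})
batches = assign_batches(labels, n_batches=3)
```

With this allocation every batch holds two cases and two controls, so batch-to-batch variation affects both groups alike.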

Clinical Applications
EWAS can be used to uncover potential biomarkers that aid in diagnosis and in monitoring the progress of treatment for a disease. The biomarkers can also be used to assess the risk of recurrence of the disease. In cancer, particular genes may be differentially methylated at different stages of tumour development. Identification of loci that become methylated at high frequency or that are consistently altered may provide information on the progression of the cancer.

Unlike DNA sequence variants, cytosine methylation is reversible, giving it potential as a drug target. Currently there are two main classes of DNA methylation inhibitors, and a range of these inhibitors are used as targeted therapies, specifically in cancer treatment. Promoter hypermethylation and aberrant gene silencing are characteristic features of cancer, so the use of demethylating agents and inhibitors to reactivate genes is a viable treatment method for specific types of cancer. The demethylation of specific genes can activate repair genes and tumour antigens, increasing immunogenicity. Therapies targeting histone modification are also available, with histone deacetylase inhibitors used for some cancers.