Kataegis

In molecular biology, kataegis describes a pattern of localized hypermutation identified in some cancer genomes, in which a large number of highly patterned base-pair mutations occur in a small region of DNA. The mutational clusters are usually several hundred base pairs long and alternate between long runs of C→T substitutions and long runs of G→A substitutions. This suggests that kataegis affects only one of the two template strands of DNA during replication. Kataegis is seen more commonly than some other cancer-related mutational phenomena, such as chromothripsis; it is not a cumulative process but likely happens during a single cycle of replication.
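The alternation between C→T and G→A runs can be illustrated with a small sketch. A G→A change on the reference strand corresponds to a C→T change on the complementary strand, so each run can be read as cytosine changes on one strand. The input format (position, reference base, mutated base) and the function name `strand_runs` are assumptions made for illustration, not part of any published pipeline.

```python
from itertools import groupby


def strand_runs(mutations):
    """Group consecutive mutations into runs of C>T, G>A, or other changes.

    `mutations` is a hypothetical list of (position, ref_base, alt_base)
    tuples for one chromosome, already sorted by position.
    Returns (kind, first_position, last_position, count) for each run.
    """
    def label(mutation):
        _, ref, alt = mutation
        if (ref, alt) == ("C", "T"):
            return "C>T"
        if (ref, alt) == ("G", "A"):
            return "G>A"
        return "other"

    runs = []
    # groupby only merges neighbours with the same label, which is
    # exactly what a "run" of one substitution type means here.
    for kind, group in groupby(mutations, key=label):
        members = list(group)
        runs.append((kind, members[0][0], members[-1][0], len(members)))
    return runs


example = [
    (100, "C", "T"), (180, "C", "T"), (250, "C", "T"),
    (900, "G", "A"), (960, "G", "A"),
]
print(strand_runs(example))
```

In this toy input the first three mutations form one C→T run and the last two form one G→A run, which is the kind of strand-coordinated pattern described above.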

The term kataegis (καταιγίς) is derived from the ancient Greek word for "thunderstorm". It was first used by scientists at the Wellcome Trust Sanger Institute to describe their observations in breast cancer cells. While mapping mutation clusters across the genome, they used a visualization called a "rainfall plot", as shown in the picture on the right, in which kataegis appears as a clustering pattern.
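A rainfall plot places each mutation on the horizontal axis and the distance to the previous mutation, usually on a logarithmic scale, on the vertical axis, so clustered mutations appear as points that drop sharply. The following is a minimal sketch of that calculation. The function names, and the thresholds `max_gap=1000` and `min_mutations=6`, are illustrative assumptions and not the original authors' method.

```python
import math


def intermutation_distances(positions):
    """Return (position, distance to previous mutation) for one chromosome."""
    ordered = sorted(positions)
    return [(cur, cur - prev) for prev, cur in zip(ordered, ordered[1:])]


def rainfall_points(positions):
    """Return (mutation index, log10 distance) pairs, the plot coordinates."""
    distances = intermutation_distances(positions)
    return [
        (index, math.log10(dist))
        for index, (_, dist) in enumerate(distances, start=1)
        if dist > 0  # skip duplicate positions, whose log distance is undefined
    ]


def candidate_clusters(positions, max_gap=1000, min_mutations=6):
    """Call runs of closely spaced mutations (both thresholds are assumed)."""
    ordered = sorted(positions)
    clusters, current = [], ordered[:1]
    for pos in ordered[1:]:
        if pos - current[-1] <= max_gap:
            current.append(pos)
        else:
            if len(current) >= min_mutations:
                clusters.append((current[0], current[-1], len(current)))
            current = [pos]
    if len(current) >= min_mutations:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


positions = [5000, 100000, 100200, 100350, 100500, 100900, 101100, 250000]
print(candidate_clusters(positions))
```

With the toy positions above, the six mutations between 100000 and 101100 are reported as one candidate cluster, while the isolated mutations at 5000 and 250000 are not.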

Mechanism
Regions of kataegis have been shown to colocalise with regions of somatic genome rearrangement. In these regions, known as breakpoints, base pairs are more prone to being deleted, substituted, or translocated. Most hypotheses about kataegis involve errors during the frequent DNA repair at the breakpoints. A collection of enzymes from the DNA repair system comes in to excise the mismatched base pair. When these enzymes try to mend the damage, they unwind DNA into single strands and create lesion regions that lack a purine or pyrimidine base. Across the lesion region, the bases in the unpaired, single-stranded DNA (ssDNA) are more accessible to modifying enzymes that can cause further damage to the sequence, thus forming the mutational clusters seen in kataegis.

Two enzyme families are thought to be involved in kataegis. The APOBEC ("apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like") enzyme family causes predominantly C→T mutations, and the translesional DNA synthesis (TLS) DNA polymerases cause C→G or C→T mutations.

APOBEC enzyme family (C→T mutations)
The APOBEC family is a group of cytidine deaminase enzymes that plays an important role in the immune system. Its major function is to induce genetic mutations in antibodies, which need a huge variety of genes in order to bind to different antigens. The APOBEC family can also protect against infection by retroviruses and against retrotransposons. In single-stranded DNA (ssDNA), APOBEC can remove an amine group from a cytosine (C) and turn it into a uracil (U); such deamination can mutate the viral genome and terminate the reverse transcription process that copies RNA back into DNA.

As shown in Figure 1, the base mutations in kataegis regions were found to be almost exclusively cytosine to thymine in the context of a TpC dinucleotide (p denotes the phosphoribose backbone). At DNA lesion sites, APOBEC enzymes can gain access to long stretches of ssDNA and induce C→U mutations. The APOBEC family is processive and can continue to induce multiple mutations in a small region. If this part of the DNA is replicated before the mutation is repaired, the mutation is passed on to the subclones. Because uracil pairs with adenine during replication, the original C:G pair ends up as a T:A pair, hence the predominance of C→T mutations in kataegis.
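The path from deamination to a fixed C→T change can be sketched as a toy string model. It assumes every cytosine in a TpC context on the exposed strand is deaminated, ignores repair, and writes complementary strands in the same left-to-right order as the template rather than reversing them. All function names are hypothetical and chosen for illustration only.

```python
# U pairs like T during replication, so its complement is A.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "U": "A"}


def deaminate_tpc(ssdna):
    """Convert every C that directly follows a T (TpC context) into U."""
    bases = list(ssdna)
    for i in range(1, len(bases)):
        # Contexts are read from the original sequence, not the edited one.
        if ssdna[i] == "C" and ssdna[i - 1] == "T":
            bases[i] = "U"
    return "".join(bases)


def copy_strand(template):
    """Return the complementary strand, written in the template's order."""
    return "".join(COMPLEMENT[base] for base in template)


def fixed_sequence(ssdna):
    """Sequence at the original strand's position after unrepaired copying."""
    damaged = deaminate_tpc(ssdna)
    daughter = copy_strand(damaged)  # A is placed opposite each U
    return copy_strand(daughter)     # T is placed opposite each A


original = "ATCGTCCA"
print(deaminate_tpc(original))   # ATUGTUCA
print(fixed_sequence(original))  # ATTGTTCA
```

Only the two cytosines that follow a thymine change; the cytosine that follows another cytosine is left alone. The sketch copies the strand twice so that the result can be read in the original orientation.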

Within the APOBEC family, the APOBEC3 subfamily is responsible for protection against retroviruses such as HIV (which is known to be modified by APOBEC3F and APOBEC3G). Since their original functions include editing ssDNA, these enzymes are more likely to be responsible for causing large numbers of mutations in human ssDNA. A direct link between the APOBEC deaminases and kataegistic clusters of mutations was obtained by expressing a hyperactive deaminase in yeast cells. Evidence has also linked overexpression of the family member APOBEC3B with multiple human cancers, highlighting its possible contribution to genomic instability and kataegis.

Meanwhile, activation-induced cytidine deaminase (AID) has been shown to facilitate kataegis formation in human lymphomas. AID's major function is to diversify genes among immune cells. Research indicates that AID is involved in site-specific mutations in B-cell tumors, while the APOBEC3 subfamily causes non-specific, genome-wide mutations in non-B-cell tumors.

TLS DNA polymerase (C→G and C→T mutations)
The translesional DNA synthesis (TLS) DNA polymerase family brings in nucleotides to bridge across abasic sites in a DNA lesion. Due to the nature of this function, TLS DNA polymerases have high error rates. They can slip on the sequence or insert A or C bases into a distorted region of the DNA strand; as shown in Figure 3, TLS DNA polymerases may cause mutations in many different ways.

Among the TLS DNA polymerases, Rev1 has a mechanism for inserting cytosine at a lesion site that does not contain a template base. Since Rev1 does not read the template according to Watson-Crick base pairing, it can introduce a nucleotide regardless of the original sequence. In most experimental cases, Rev1 is responsible for the C→G mutations that arise during DNA repair. The effect of Rev1 can be combined with that of the APOBEC family. If the C→U mutation is detected by its specific glycosylase, the glycosylase will excise the uracil base and leave an abasic site. A TLS DNA polymerase can then come in and induce a C→G mutation. In yeast research data, Rev1 and Rev3 can account for up to 98% of base-pair substitutions and 95% of UV-induced mutations.
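The combined APOBEC and TLS pathway can be summarised as a small decision model for a single original cytosine. This is a deliberately simplified sketch: it assumes one deamination event, an all-or-nothing glycosylase step, and a TLS polymerase that inserts one chosen base opposite the abasic site. The function name and its parameters are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def outcome_at_cytosine(deaminated, uracil_excised, tls_insert="C"):
    """Return the base that ends up at an original C position.

    deaminated     -- True if the cytosine was converted to uracil
    uracil_excised -- True if a glycosylase removed the uracil,
                      leaving an abasic site
    tls_insert     -- base the TLS polymerase places opposite the
                      abasic site (assumed "C" for Rev1)
    """
    if not deaminated:
        return "C"  # nothing happened, the cytosine is kept
    if not uracil_excised:
        # U is copied like T: A goes opposite, then T opposite that A.
        return "T"
    # Abasic site: the inserted base sits on the opposite strand, so the
    # original position later receives its complement.
    return COMPLEMENT[tls_insert]


print(outcome_at_cytosine(True, False))       # T  (C→T)
print(outcome_at_cytosine(True, True))        # G  (C→G, cytosine inserted)
print(outcome_at_cytosine(True, True, "A"))   # T  (C→T, adenine inserted)
```

Under these assumptions, an unrepaired uracil gives C→T, while an excised uracil bypassed by cytosine insertion gives C→G, matching the two mutation types attributed to this pathway in the text.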

Pol ζ is another TLS DNA polymerase that collaborates with Rev1 (mostly Rev1p) in forming hypermutations in eukaryotes. Pol ζ is hypothesized to contribute to homologous allele exchanges. It can extend from DNA regions that are distorted or bulged due to mismatches and can bypass certain lesion sites in DNA. According to research in yeast, Pol ζ can bypass different mutations with ~10% efficiency, much more often than other polymerases. When Pol ζ reads past the mutation sites, the mutations remain and are passed on to the next round of replication.

Clinical Significance
Kataegis is prevalent among breast cancer patients, and it also occurs in lung, cervical, head and neck, and bladder cancers, as shown by results from tracing APOBEC mutation signatures. Understanding the mechanism of kataegis can be useful for future research into how cancer develops. Because the mutations in kataegis are highly patterned, researchers can build statistical models to trace the loci that are prone to mutation.

Research has found that kataegis could be a good prognostic indicator for breast cancer patients: there is a difference in life expectancy between patients with kataegis and those without. The specific reason is not clear. Because kataegis causes up-regulation and down-regulation of different factors, it is hypothesized that kataegis might down-regulate migration-related genes, thus making the tumor less invasive.