Transcription activator-like effector

TAL (transcription activator-like) effectors (often referred to as TALEs, but not to be confused with the three amino acid loop extension homeobox class of proteins) are proteins secreted by some β- and γ-proteobacteria, most of them xanthomonads. Plant-pathogenic Xanthomonas bacteria are especially known for TALEs, which they deliver through their type III secretion system. These proteins can bind promoter sequences in the host plant and activate the expression of plant genes that aid bacterial infection. They recognize plant DNA sequences through a central repeat domain consisting of a variable number (1.5 to 33.5) of tandem repeats, each about 34 amino acids long. Each repeat is specific for a single base pair of the target DNA, and that specificity is set by two critical amino acids in the repeat, known as the repeat variable diresidue (RVD). There thus appears to be a one-to-one correspondence between the RVD of each repeat and each DNA base in the target sequence. These proteins are interesting to researchers both for their role in disease of important crop species and for the relative ease of retargeting them to bind new DNA sequences. Similar proteins can be found in the pathogenic bacterium Ralstonia solanacearum and in Burkholderia rhizoxinica, as well as in as-yet-unidentified marine microorganisms. The term TALE-likes is used to refer to the putative protein family encompassing the TALEs and these related proteins.

Xanthomonas
Xanthomonas are Gram-negative bacteria that can infect a wide variety of plant species, including pepper/capsicum, rice, citrus, cotton, tomato, and soybean. Some types of Xanthomonas cause localized leaf spot or leaf streak, while others spread systemically and cause black rot or leaf blight disease. They inject a number of effector proteins, including TAL effectors, into the plant via their type III secretion system. TAL effectors have several motifs normally associated with eukaryotes, including multiple nuclear localization signals and an acidic activation domain. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection. Plants have developed a defense mechanism against type III effectors that includes R (resistance) genes triggered by these effectors. Some of these R genes appear to have evolved to contain TAL effector binding sites similar to the site in the intended target gene. This competition between pathogenic bacteria and the host plant has been hypothesized to account for the apparently malleable nature of the TAL effector DNA-binding domain.

Non-Xanthomonas
Outside Xanthomonas, TALE-like proteins have been found in R. solanacearum, B. rhizoxinica, and the banana blood disease bacterium (not yet definitively identified, but placed in the R. solanacearum species group).

DNA recognition
The most distinctive characteristic of TAL effectors is a central repeat domain containing between 1.5 and 33.5 repeats that are usually 34 residues in length (the C-terminal repeat is generally shorter and referred to as a “half repeat”). A typical repeat sequence is LTPEQVVAIAS HD GGKQALETVQRLLPVLCQAHG, but the residues at the 12th and 13th positions are hypervariable (these two amino acids are also known as the repeat variable diresidue or RVD). There is a simple relationship between the identity of these two residues in sequential repeats and sequential DNA bases in the TAL effector's target site. The crystal structure of a TAL effector bound to DNA indicates that each repeat comprises two alpha helices and a short RVD-containing loop where the second residue of the RVD makes sequence-specific DNA contacts while the first residue of the RVD stabilizes the RVD-containing loop. Target sites of TAL effectors also tend to include a thymine flanking the 5’ base targeted by the first repeat; this appears to be due to a contact between this T and a conserved tryptophan in the region N-terminal of the central repeat domain. However, this "zero" position does not always contain a thymine, as some scaffolds are more permissive.
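As an illustration of where the RVD sits, the following minimal Python sketch pulls residues 12 and 13 out of the typical repeat quoted above. The names `extract_rvd` and `swap_rvd` are hypothetical helpers, not part of any published tool. The sketch assumes a repeat is supplied as a plain one-letter amino acid string that is long enough to include position 13.

```python
# Typical 34-residue repeat quoted in the text, written without spaces.
TYPICAL_REPEAT = "LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG"

RVD_START = 11  # 0-based index of residue 12
RVD_END = 13    # 0-based index one past residue 13


def extract_rvd(repeat):
    """Return the two hypervariable residues (positions 12 and 13, 1-based)."""
    if len(repeat) < RVD_END:
        raise ValueError("repeat is too short to contain positions 12 and 13")
    return repeat[RVD_START:RVD_END]


def swap_rvd(repeat, new_rvd):
    """Return a copy of the repeat with its RVD replaced (hypothetical helper)."""
    if len(new_rvd) != 2:
        raise ValueError("an RVD is exactly two residues")
    if len(repeat) < RVD_END:
        raise ValueError("repeat is too short to contain positions 12 and 13")
    return repeat[:RVD_START] + new_rvd + repeat[RVD_END:]


print(len(TYPICAL_REPEAT))          # 34
print(extract_rvd(TYPICAL_REPEAT))  # HD
```

Only the two RVD positions change in `swap_rvd`; the rest of the repeat is left untouched, mirroring the observation that the other residues are largely conserved between repeats.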

The TAL-DNA code was broken by two separate groups in 2009. The first group, headed by Adam Bogdanove, broke this code computationally by searching for patterns in protein sequence alignments and DNA sequences of target promoters derived from a database of genes upregulated by TALEs. The second group (Boch) deduced the code through molecular analysis of the TAL effector AvrBs3 and its target DNA sequence in the promoter of a pepper gene activated by AvrBs3. The experimentally validated code between RVD sequence and target DNA base can be summarized by its most common pairings: NI recognizes A, HD recognizes C, NG recognizes T, and NN recognizes G or A.
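The pairings most commonly reported for this code can be written as a small lookup table. The sketch below is illustrative only: it covers just four RVDs (NI, HD, NG, NN), marks any other RVD as unknown with `N`, and writes the ambiguous NN position with the IUPAC symbol `R` (A or G). The function name `predict_target` and the `leading_t` flag are assumptions introduced here, not an established API.

```python
# Commonly reported RVD-to-base pairings (a partial table; other RVDs exist).
RVD_TO_BASES = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "GA",  # NN is reported to recognize G or A
}

# IUPAC nucleotide symbols for the base sets used above.
IUPAC = {"A": "A", "C": "C", "T": "T", "GA": "R"}


def predict_target(rvds, leading_t=True):
    """Predict a target site from an ordered list of RVDs.

    One base is emitted per repeat, in order. If leading_t is True, the
    site is prefixed with the thymine usually found at position zero.
    RVDs missing from the table are written as 'N' (unknown).
    """
    site = ["T"] if leading_t else []
    for rvd in rvds:
        bases = RVD_TO_BASES.get(rvd.upper())
        site.append(IUPAC[bases] if bases else "N")
    return "".join(site)


print(predict_target(["HD", "NI", "NG", "NN"]))  # TCATR
```

Because each repeat contributes exactly one base, the predicted site is one base longer than the RVD list when the position-zero thymine is included.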

Target genes
TAL effectors can induce susceptibility genes that are members of the NODULIN3 (N3) gene family. These genes are essential for the development of the disease. In rice, two such genes, Os-8N3 and Os-11N3, are induced by TAL effectors: Os-8N3 by PthXo1, and Os-11N3 by PthXo3 and AvrXa7. Two hypotheses exist about possible functions for N3 proteins:
 * They are involved in copper transport, resulting in detoxification of the environment for bacteria. The reduction in copper level facilitates bacterial growth.
 * They are involved in glucose transport, facilitating glucose flow. This mechanism provides nutrients to bacteria and stimulates pathogen growth and virulence.

Engineering TAL effectors
This simple correspondence between amino acids in TAL effectors and DNA bases in their target sites makes them useful for protein engineering applications. Numerous groups have designed artificial TAL effectors capable of recognizing new DNA sequences in a variety of experimental systems. Such engineered TAL effectors have been used to create artificial transcription factors that can be used to target and activate or repress endogenous genes in tomato, Arabidopsis thaliana, and human cells.
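To make the retargeting idea concrete, the sketch below inverts the RVD code: given a desired DNA site, it lists one RVD per base. It is a minimal illustration under stated assumptions, not a design tool. The pairings used (NI for A, HD for C, NG for T, NN for G) are the commonly reported ones, and NN is also reported to bind A, so a G position is less specific than the others. The function name `design_rvds` and the `require_t0` flag are hypothetical.

```python
# One commonly used RVD per base (assumption: NN for G, although NN also binds A).
BASE_TO_RVD = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def design_rvds(site, require_t0=True):
    """Return the ordered RVDs for a target site given 5' to 3'.

    If require_t0 is True, the first base must be the position-zero
    thymine; it is checked and then dropped, because it is contacted by
    the N-terminal region rather than by a repeat.
    """
    site = site.upper()
    if require_t0:
        if not site.startswith("T"):
            raise ValueError("target site should begin with T at position zero")
        site = site[1:]
    if not site:
        raise ValueError("no bases left to assign repeats to")
    unknown = sorted(set(site) - set(BASE_TO_RVD))
    if unknown:
        raise ValueError("unsupported bases: " + ", ".join(unknown))
    return [BASE_TO_RVD[base] for base in site]


print(design_rvds("TCATG"))  # ['HD', 'NI', 'NG', 'NN']
```

Setting `require_t0=False` models the more permissive scaffolds, where the site need not begin with thymine and every base is assigned a repeat.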

Genetic constructs to encode TAL effector-based proteins can be made using either conventional gene synthesis or modular assembly. A plasmid kit for assembling custom TALEN and other TAL effector constructs is available through the public, not-for-profit repository Addgene. Webpages providing access to public software, protocols, and other resources for TAL effector-DNA targeting applications include the TAL Effector-Nucleotide Targeter and taleffectors.com.

Applications
Engineered TAL effectors can also be fused to the cleavage domain of FokI to create TAL effector nucleases (TALENs), or to meganucleases (nucleases with longer recognition sites) to create "megaTALs." Such fusions share some properties with zinc finger nucleases and may be useful for genetic engineering and gene therapy applications.

TALEN-based approaches are used in the emerging fields of gene editing and genome engineering. TALEN fusions show activity in a yeast-based assay, at endogenous yeast genes, in a plant reporter assay, at an endogenous plant gene, at endogenous zebrafish genes, at an endogenous rat gene, and at endogenous human genes. The human HPRT1 gene has been targeted at detectable but unquantified levels. In addition, TALEN constructs in which the FokI cleavage domain is fused to a smaller portion of the TAL effector that still contains the DNA-binding domain have been used to target the endogenous NTF3 and CCR5 genes in human cells with efficiencies of up to 25%. TAL effector nucleases have also been used to engineer human embryonic stem cells and induced pluripotent stem cells (iPSCs), and to knock out the endogenous ben-1 gene in C. elegans.

TALEN-induced non-homologous end joining modification has been used to produce novel disease resistance in rice.