TATA-binding protein

The TATA-binding protein (TBP) is a general transcription factor that binds to a DNA sequence called the TATA box. This DNA sequence is found about 30 base pairs upstream of the transcription start site in some eukaryotic gene promoters.
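As an illustration of the positional relationship described above, the following minimal Python sketch scans a promoter sequence for TATA-like motifs whose start lies roughly 30 base pairs upstream of the transcription start site. The consensus pattern `TATA[AT]A[AT]`, the window of −40 to −20, and the function name `find_tata_boxes` are assumptions made for this example. It is not a validated promoter-prediction method.

```python
import re

# Assumed consensus for illustration only: TATA(A/T)A(A/T).
# The lookahead lets overlapping matches be reported.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(promoter, tss_index, window=(-40, -20)):
    """Return (offset, motif) pairs for TATA-like motifs near the TSS.

    promoter  -- DNA sequence (coding strand, 5'->3') as a string
    tss_index -- 0-based index of the transcription start site
    window    -- inclusive range of allowed motif start offsets
                 relative to the TSS (negative = upstream); assumed values
    """
    seq = promoter.upper()
    hits = []
    for match in TATA_PATTERN.finditer(seq):
        offset = match.start() - tss_index
        if window[0] <= offset <= window[1]:
            hits.append((offset, match.group(1)))
    return hits


# Toy sequence (hypothetical, not a real promoter).
upstream = "GC" * 10 + "TATAAAA" + "GC" * 12
toy_promoter = upstream + "ATGGCC"
tss = len(upstream)

hits = find_tata_boxes(toy_promoter, tss)
tata_less_hits = find_tata_boxes("GC" * 30, 50)
```

In the toy sequence the motif starts 31 positions upstream of the chosen start site, so it falls inside the window. The all-GC sequence yields no hits, loosely mimicking a TATA-less promoter.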

TBP gene family
TBP is a member of a small gene family of TBP-related factors. The first TBP-related factor (TRF/TRF1) was identified in the fruit fly Drosophila, but appears to be fly- or insect-specific. Subsequently, TBPL1/TRF2 was found in the genomes of many metazoans, whereas vertebrate genomes encode a third family member, TBPL2/TRF3. In specific cell types or on specific promoters, TBP can be replaced by one of these TBP-related factors, some of which interact with the TATA box similarly to TBP.

Role as transcription factor
TBP is a subunit of the eukaryotic general transcription factor TFIID. TFIID is the first protein to bind to DNA during the formation of the transcription preinitiation complex of RNA polymerase II (RNA Pol II). As one of the few proteins in the preinitiation complex that binds DNA in a sequence-specific manner, TBP helps position RNA polymerase II over the transcription start site of the gene. However, it is estimated that only 10–20% of human promoters have TATA boxes; the majority of human promoters are TATA-less housekeeping gene promoters, so TBP is probably not the only protein involved in positioning RNA polymerase II. The binding of TBP to these TATA-less promoters is facilitated by housekeeping gene regulators. On TATA-containing promoters, transcription initiates within a narrow region about 30 bp downstream of the TATA box, whereas the transcription start sites of TATA-less promoters are dispersed within a 200 bp region.

Binding of TFIID to the TATA box in the promoter region of the gene initiates the recruitment of other factors required for RNA Pol II to begin transcription. Some of the other recruited transcription factors include TFIIA, TFIIB, and TFIIF. Each of these transcription factors contains several protein subunits.

TBP is also important for transcription by RNA polymerase I and RNA polymerase III, and is therefore involved in transcription initiation by all three RNA polymerases.

TBP is involved in DNA melting (double-strand separation) by bending the DNA by 80° (the AT-rich sequence to which it binds facilitates easy melting). TBP is unusual in that it binds the minor groove using a β sheet.

Another distinctive feature of TBP is a long string of glutamines in the N-terminus of the protein. This region modulates the DNA binding activity of the C-terminus, and modulation of DNA-binding affects the rate of transcription complex formation and initiation of transcription. Mutations that expand the number of CAG repeats encoding this polyglutamine tract, and thus increase the length of the polyglutamine string, are associated with spinocerebellar ataxia 17, a neurodegenerative disorder classified as a polyglutamine disease.
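The relationship between repeat expansion and polyglutamine length can be sketched in a few lines of Python. The function below reads a coding sequence codon by codon and reports the longest uninterrupted run of glutamine codons. Counting CAA alongside CAG (both encode glutamine) and the name `longest_glutamine_run` are choices made for this example. The input is a made-up fragment, not the TBP gene, and no clinical threshold is implied.

```python
GLUTAMINE_CODONS = ("CAG", "CAA")  # both codons encode glutamine


def longest_glutamine_run(coding_seq):
    """Return the longest in-frame run of glutamine codons.

    coding_seq -- DNA coding sequence read from its first base in frame;
                  trailing bases that do not fill a codon are ignored.
    """
    seq = coding_seq.upper()
    best = current = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in GLUTAMINE_CODONS:
            current += 1
            best = max(best, current)
        else:
            current = 0  # run interrupted by a non-glutamine codon
    return best


# Hypothetical fragments: a shorter tract and an expanded one.
short_tract = "ATG" + "CAG" * 3 + "CAA" + "CAG" * 2 + "GGC"
expanded_tract = "ATG" + "CAG" * 3 + "CAA" + "CAG" * 8 + "GGC"

short_len = longest_glutamine_run(short_tract)
expanded_len = longest_glutamine_run(expanded_tract)
```

Each additional in-frame CAG (or CAA) codon lengthens the encoded glutamine string by exactly one residue, which is why repeat expansion at the DNA level translates directly into a longer polyglutamine tract.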

DNA-protein interactions
When TBP binds to a TATA box within the DNA, it distorts the DNA by inserting amino acid side-chains between base pairs, partially unwinding the helix, and doubly kinking it. The distortion is accomplished through extensive surface contact between the protein and DNA. TBP binds to the negatively charged phosphates in the DNA backbone through positively charged lysine and arginine residues. The sharp bend in the DNA is produced by the projection of four bulky phenylalanine residues into the minor groove. As the DNA bends, its contact with TBP increases, thus enhancing the DNA-protein interaction.

The strain imposed on the DNA through this interaction initiates melting, or separation, of the strands. Because this region of DNA is rich in adenine and thymine residues, which base-pair through only two hydrogen bonds, the DNA strands are more easily separated. Separation of the two strands exposes the bases and allows RNA polymerase II to begin transcription of the gene.
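The hydrogen-bond argument can be made concrete with a small calculation. The sketch below counts Watson-Crick hydrogen bonds for a duplex given one strand, using two bonds per A·T pair and three per G·C pair. The function name and the two example sequences are hypothetical. Hydrogen-bond count is only a rough proxy for duplex stability, which also depends on base stacking and sequence context.

```python
# Watson-Crick hydrogen bonds per base pair, keyed by the base on one strand.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand):
    """Return the total Watson-Crick hydrogen bonds in the duplex
    formed by `strand` and its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


# Two hypothetical 8 bp stretches: one AT-rich, one GC-only.
at_rich = "TATAAAAG"
gc_rich = "GCGCGCGC"

at_rich_bonds = count_hydrogen_bonds(at_rich)
gc_rich_bonds = count_hydrogen_bonds(gc_rich)
```

For these two stretches of equal length, the AT-rich one is held together by fewer hydrogen bonds, consistent with the easier strand separation described above.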

TBP's C-terminus has a helicoidal shape that (incompletely) complements the T-A-T-A region of DNA. This incomplete fit allows the DNA to be passively bent on binding.

For information on the use of TBP in cells see: RNA polymerase I, RNA polymerase II, and RNA polymerase III.

Protein–protein interactions
TATA-binding protein has been shown to interact with:


 * BRF1,
 * BTAF1,
 * C-Fos,
 * C-jun,
 * EDF1,
 * GTF2B (TFIIB),
 * GTF2A1 (TFIIA subunit 1),
 * GTF2F1 (TFIIF subunit 1),
 * GTF2H4 (TFIIH subunit 4),
 * Mdm2,
 * MSX1,
 * NFYB,
 * P53,
 * PAX6,
 * POLR2A,
 * POU2F1,
 * RELA,
 * NR2B1,
 * TAF1,
 * TAF4,
 * TAF5,
 * TAF6,
 * TAF7,
 * TAF9,
 * TAF10,
 * TAF11,
 * TAF13, and
 * TAF15.

Complex assembly
The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. On a TATA-less promoter, TBP binds with the help of TBP-associated factors (TAFs). TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form large multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription by different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins.

Structure
The C-terminal core of TBP (~180 residues) is highly conserved and contains two 88-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box and interacts with transcription factors and regulatory proteins. By contrast, the N-terminal region varies in both length and sequence.