RNA polymerase II holoenzyme

RNA polymerase II holoenzyme is a form of eukaryotic RNA polymerase II that is recruited to the promoters of protein-coding genes in living cells. It consists of RNA polymerase II, a subset of general transcription factors, and regulatory proteins.

RNA polymerase II
RNA polymerase II (also called RNAP II and Pol II) is an enzyme found in eukaryotic cells. It catalyzes the transcription of DNA to synthesize precursors of mRNA and most snRNA and microRNA. In humans, RNAP II consists of seventeen protein molecules (gene products encoded by POLR2A-L, where the proteins synthesized from POLR2C, POLR2E, and POLR2F form homodimers).

General transcription factors
General transcription factors (GTFs), or basal transcription factors, are protein transcription factors that have been shown to be important in the transcription of class II genes to mRNA templates. Many of them are involved in the formation of a preinitiation complex, which, together with RNA polymerase II, binds to and reads the single-stranded DNA gene template. The cluster of RNA polymerase II and various transcription factors is known as a basal transcriptional complex (BTC).

Preinitiation complex
The preinitiation complex (PIC) is a large complex of proteins that is necessary for the transcription of protein-coding genes in eukaryotes and archaea. The PIC helps position RNA polymerase II over gene transcription start sites, denatures the DNA, and positions the DNA in the RNA polymerase II active site for transcription.

The typical PIC is made up of the following general transcription factors: TFIIA (GTF2A1, GTF2A2), TFIIB (GTF2B), B-TFIID (BTAF1, TBP), TFIID (BTAF1, BTF3, BTF3L4, EDF1, TAF1-15, 16 total), TFIIE, TFIIF, TFIIH, and TFIIJ.

The construction of the polymerase complex takes place on the gene promoter. The TATA box is one well-studied example of a promoter element that occurs in approximately 10% of genes. It is conserved in many (though not all) model eukaryotes and is found in a fraction of the promoters in these organisms. The sequence TATA (or variations) is located at approximately 25 nucleotides upstream of the Transcription Start Point (TSP). In addition, there are also some weakly conserved features including the TFIIB-Recognition Element (BRE), approximately 5 nucleotides upstream (BREu) and 5 nucleotides downstream (BREd) of the TATA box.
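As an illustration of the spacing described above, the following minimal sketch scans a promoter sequence for TATA-like motifs lying roughly 25 nucleotides upstream of a known transcription start point. The function name, the TATAWAW pattern (W = A or T), and the 20 to 35 nucleotide tolerance window are assumptions chosen for the example, not values taken from this article.

```python
import re

def find_tata_boxes(promoter, tsp_index, window=(20, 35)):
    """Return upstream distances (in nt) of TATA-like motifs relative to the TSP.

    promoter  : DNA sequence of the coding strand, written 5'->3'
    tsp_index : 0-based index of the transcription start point in `promoter`
    window    : inclusive range of accepted upstream distances (assumed tolerance)
    """
    seq = promoter.upper()
    hits = []
    # Illustrative pattern TATAWAW; the lookahead lets overlapping matches be found.
    for match in re.finditer(r"(?=TATA[AT]A[AT])", seq):
        distance = tsp_index - match.start()  # motif start to TSP
        if window[0] <= distance <= window[1]:
            hits.append(distance)
    return hits

# Toy promoter: 10 nt of filler, a TATA box, 18 nt of filler, then the TSP at index 35.
toy_promoter = "GCGCGCGCGC" + "TATAAAA" + "GCGCGCGCGCGCGCGCGC" + "ACGTACGT"
tata_hits = find_tata_boxes(toy_promoter, tsp_index=35)
```

Because most promoters lack a TATA box, an empty result from such a scan is the common case rather than an error.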

Assembly of the PIC
Although the sequence of steps involved in the assembly of the PIC can vary, in general it begins with step 1, binding to the promoter, and proceeds as follows.


 * 1) The TATA-binding protein (TBP, a subunit of TFIID), TBPL1, or TBPL2 can bind the promoter or TATA box. Most genes lack a TATA box and use an initiator element (Inr) or downstream core promoter instead. Nevertheless, TBP is always involved and is forced to bind without sequence specificity. TAFs from TFIID can also be involved when the TATA box is absent: a TFIID TAF binds sequence-specifically and forces TBP to bind non-sequence-specifically, bringing the remaining portions of TFIID to the promoter.
 * 2) TFIIA interacts with the TBP subunit of TFIID and aids in the binding of TBP to TATA-box-containing promoter DNA. Although TFIIA does not recognize DNA itself, its interactions with TBP allow it to stabilize and facilitate formation of the PIC.
 * 3) The N-terminal domain of TFIIB brings the DNA into proper position for entry into the active site of RNA polymerase II. TFIIB binds with partial sequence specificity, with some preference for BRE. The TFIID-TFIIA-TFIIB (DAB)-promoter complex subsequently recruits RNA polymerase II and TFIIF.
 * 4) TFIIF (two subunits, RAP30 and RAP74, showing some similarity to bacterial sigma factors) and Pol II enter the complex together. TFIIF helps to speed up the polymerization process.
 * 5) TFIIE joins the growing complex and recruits TFIIH. TFIIE may be involved in DNA melting at the promoter: it contains a zinc ribbon motif that can bind single-stranded DNA. TFIIE helps to open and close Pol II's jaw-like structure, which enables movement down the DNA strand.
 * 6) DNA may be wrapped one complete turn around the preinitiation complex, and TFIIF helps maintain this tight wrapping. In the process, the torsional strain on the DNA may aid in DNA melting at the promoter, forming the transcription bubble.
 * 7) TFIIH enters the complex. TFIIH is a large protein complex that contains, among others, the CDK7/cyclin H kinase complex and a DNA helicase. TFIIH has three functions. First, it binds specifically to the template strand to ensure that the correct strand of DNA is transcribed, and it melts or unwinds the DNA (ATP-dependent) to separate the two strands using its helicase activity. Second, its kinase activity phosphorylates the C-terminal domain (CTD) of Pol II at serine residues, which switches the RNA polymerase to start producing RNA. Third, it is essential for nucleotide excision repair (NER) of damaged DNA. TFIIH and TFIIE strongly interact with one another, and TFIIE affects TFIIH's catalytic activity: without TFIIE, TFIIH will not unwind the promoter.
 * 8) TFIIH helps create the transcription bubble and may be required for transcription if the DNA template is not already denatured or if it is supercoiled.
 * 9) Mediator then encases all the transcription factors and Pol II. It interacts with enhancers, areas very far away (upstream or downstream) that help regulate transcription.
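
The recruitment order in the steps above can be summarised as a small dependency table. The sketch below is a toy model only: the factor labels, the prerequisite sets, and the `assemble` function are simplifications introduced here for illustration, and real PIC assembly can follow other orders, as noted above.

```python
# Each entry: (factor, set of factors assumed to be bound before it can join).
# The prerequisites paraphrase the steps above and are a deliberate simplification.
ASSEMBLY_STEPS = [
    ("TFIID/TBP", set()),
    ("TFIIA", {"TFIID/TBP"}),
    ("TFIIB", {"TFIID/TBP"}),
    ("TFIIF + Pol II", {"TFIID/TBP", "TFIIA", "TFIIB"}),  # recruited by the DAB complex
    ("TFIIE", {"TFIIF + Pol II"}),
    ("TFIIH", {"TFIIE"}),                                  # TFIIE recruits TFIIH
    ("Mediator", {"TFIIH"}),
]

def assemble(arrival_order):
    """Return the factors that join the complex, in order.

    A factor that arrives before its prerequisites are bound is skipped.
    """
    requires = dict(ASSEMBLY_STEPS)
    bound = []
    for factor in arrival_order:
        if requires[factor] <= set(bound):
            bound.append(factor)
    return bound

canonical_order = [name for name, _ in ASSEMBLY_STEPS]
full_complex = assemble(canonical_order)
```

In this model, `assemble(["TFIIH", "TFIID/TBP"])` binds only TFIID/TBP, reflecting the statement that TFIIH is recruited by TFIIE.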

The formation of the preinitiation complex (PIC) is analogous to the mechanism seen in bacterial initiation. In bacteria, the sigma factor recognizes and binds to the promoter sequence. In eukaryotes, the transcription factors perform this role.

Mediator complex
Mediator is a multiprotein complex that functions as a transcriptional coactivator. The Mediator complex is required for the successful transcription of nearly all class II gene promoters in yeast. It works in the same manner in mammals.

The mediator functions as a coactivator and binds to the C-terminal domain (CTD) of RNA polymerase II holoenzyme, acting as a bridge between this enzyme and transcription factors.

C-terminal domain (CTD)
The carboxy-terminal domain (CTD) of RNA polymerase II is the portion of the polymerase that is involved in the initiation of DNA transcription, the capping of the RNA transcript, and attachment to the spliceosome for RNA splicing. The CTD typically consists of up to 52 repeats (in humans) of the sequence Tyr-Ser-Pro-Thr-Ser-Pro-Ser. The CTD is essential for life: cells whose only RNAPII carries no repeats, or no more than one-third of its repeats, are inviable.

The CTD is an extension appended to the C terminus of RPB1, the largest subunit of RNA polymerase II. It serves as a flexible binding scaffold for numerous nuclear factors, determined by the phosphorylation patterns on the CTD repeats. Each repeat contains an evolutionarily conserved heptapeptide, Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7, which is subjected to reversible phosphorylations during each transcription cycle. This domain is inherently unstructured yet evolutionarily conserved, and in eukaryotes it comprises from 25 to 52 tandem copies of the consensus repeat heptad. As the CTD is frequently not required for general transcription factor (GTF)-mediated initiation and RNA synthesis, it does not form a part of the catalytic essence of RNAPII, but performs other functions.
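
To make the repeat structure concrete, the sketch below counts exact copies of the consensus heptad in a one-letter amino-acid sequence. The function name and the toy sequence are assumptions made for the example; the toy sequence is not a real RPB1 sequence, and repeats that deviate from the consensus are simply not counted.

```python
import re

CONSENSUS_HEPTAD = "YSPTSPS"  # Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7 in one-letter code

def count_consensus_heptads(ctd_sequence):
    """Count non-overlapping exact matches of the consensus heptad."""
    return len(re.findall(CONSENSUS_HEPTAD, ctd_sequence.upper()))

# Toy CTD: three consensus repeats followed by one variant repeat (Lys at position 7).
toy_ctd = "YSPTSPS" * 3 + "YSPTSPK"
n_consensus = count_consensus_heptads(toy_ctd)
```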

CTD phosphorylation
RNAPII can exist in two forms: RNAPII0, with a highly phosphorylated CTD, and RNAPIIA, with a nonphosphorylated CTD. Phosphorylation occurs principally on Ser2 and Ser5 of the repeats, although these positions are not equivalent. The phosphorylation state changes as RNAPII progresses through the transcription cycle: The initiating RNAPII is form IIA, and the elongating enzyme is form II0. While RNAPII0 does consist of RNAPs with hyperphosphorylated CTDs, the pattern of phosphorylation on individual CTDs can vary due to differential phosphorylation of Ser2 versus Ser5 residues and/or to differential phosphorylation of repeats along the length of the CTD. The PCTD (phosphoCTD of an RNAPII0) physically links pre-mRNA processing to transcription by tethering processing factors to elongating RNAPII, e.g., 5′-end capping, 3′-end cleavage, and polyadenylation.

Ser5 phosphorylation (Ser5PO4) near the 5′ ends of genes depends principally on the kinase activity of TFIIH (Kin28 in yeast; CDK7 in metazoans). TFIIH hyperphosphorylates the CTD of RNAP and, in doing so, causes the RNAP complex to move away from the initiation site. Subsequent to the action of the TFIIH kinase, Ser2 residues are phosphorylated by CTDK-I in yeast (CDK9 kinase in metazoans). Ctk1 (CDK9) acts in complement to phosphorylation of serine 5 and is thus seen in middle to late elongation.

CDK8 and cyclin C (CCNC) are components of the RNA polymerase II holoenzyme that phosphorylate the carboxy-terminal domain (CTD). CDK8 regulates transcription by targeting the CDK7/cyclin H subunits of the general transcription initiation factor IIH (TFIIH), thereby providing a link between the mediator and the basal transcription machinery.

The gene CTDP1 encodes a phosphatase that interacts with the carboxy-terminus of transcription initiation factor TFIIF, a transcription factor that regulates elongation as well as initiation by RNA polymerase II.

Cyclin T1 (CCNT1) is also involved in the phosphorylation and regulation of the RPB1 CTD. It associates tightly with CDK9 kinase, forming a complex that phosphorylates the CTD.


 * ATP + [DNA-directed RNA polymerase II] <=> ADP + [DNA-directed RNA polymerase II] phosphate : catalyzed by CDK9 EC 2.7.11.23.

TFIIF and FCP1 cooperate for RNAPII recycling. FCP1, the CTD phosphatase, interacts with RNA polymerase II. Transcription is regulated by the state of phosphorylation of a heptapeptide repeat. The nonphosphorylated form, RNAPIIA, is recruited to the initiation complex, whereas the elongating polymerase is RNAPII0. RNAPII cycles between these forms during transcription. CTD phosphatase activity is regulated by two GTFs (TFIIF and TFIIB). The large subunit of TFIIF (RAP74) stimulates the CTD phosphatase activity, whereas TFIIB inhibits TFIIF-mediated stimulation. Dephosphorylation of the CTD alters the migration of the largest subunit of RNAPII (RPB1).
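
The phosphorylation cycle described in this section can be sketched as a simple state model. Everything in the code is a deliberately coarse assumption for illustration: the class and method names are hypothetical, each kinase is treated as modifying every repeat at once, and any phosphorylated repeat is enough to label the polymerase RNAPII0, whereas real CTDs carry varied, partial patterns.

```python
# Kinase -> serine position it targets, following the text (metazoan names).
KINASE_TARGET = {"CDK7": "ser5", "CDK9": "ser2"}

class Heptad:
    """One YSPTSPS repeat; True means the serine carries a phosphate."""
    def __init__(self):
        self.ser2 = False
        self.ser5 = False

class CTD:
    def __init__(self, n_repeats=52):
        self.repeats = [Heptad() for _ in range(n_repeats)]

    def phosphorylate(self, kinase):
        # Simplification: the kinase modifies its target serine on every repeat.
        site = KINASE_TARGET[kinase]
        for heptad in self.repeats:
            setattr(heptad, site, True)

    def dephosphorylate(self):
        # Stands in for the FCP1 phosphatase step that recycles the polymerase.
        for heptad in self.repeats:
            heptad.ser2 = False
            heptad.ser5 = False

    def form(self):
        phosphorylated = any(h.ser2 or h.ser5 for h in self.repeats)
        return "RNAPII0" if phosphorylated else "RNAPIIA"

ctd = CTD()
states = [ctd.form()]          # recruited, unphosphorylated
ctd.phosphorylate("CDK7")      # Ser5, near the 5' end of the gene
states.append(ctd.form())
ctd.phosphorylate("CDK9")      # Ser2, middle to late elongation
states.append(ctd.form())
ctd.dephosphorylate()          # recycling
states.append(ctd.form())
```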

5' capping
The carboxy-terminal domain is also the binding site of the cap-synthesizing and cap-binding complex. In eukaryotes, after transcription of the 5' end of an RNA transcript, the cap-synthesizing complex on the CTD will remove the gamma-phosphate from the 5'-phosphate and attach a GMP, forming a 5',5'-triphosphate linkage. The synthesizing complex falls off and the cap then binds to the cap-binding complex (CBC), which is bound to the CTD.

The 5' cap of eukaryotic RNA transcripts is important for binding of the mRNA transcript to the ribosome during translation and to the CTD of RNAP, and it protects the RNA from degradation.

Spliceosome
The carboxy-terminal domain is also the binding site for spliceosome factors that are part of RNA splicing. These allow for the splicing and removal of introns (in the form of a lariat structure) during RNA transcription.

Mutation in the CTD
Major studies have been carried out in which particular amino acids in the CTD were knocked out. The results indicate that RNA polymerase II CTD truncation mutations affect the ability to induce transcription of a subset of genes in vivo, and the lack of response to induction maps to the upstream activating sequences of these genes.

Genome surveillance complex
Several protein members of the BRCA1-associated genome surveillance complex (BASC) associate with RNA polymerase II and play a role in transcription.

The transcription factor TFIIH is involved in transcription initiation and DNA repair. One of its essential components is CAK, a multisubunit protein that includes CDK7, cyclin H (CCNH), and MAT1. MAT1 (for 'ménage à trois-1') is involved in the assembly of the CAK complex.

The nucleotide excision repair (NER) pathway is a mechanism to repair damage to DNA. ERCC2 is involved in transcription-coupled NER and is an integral member of the basal transcription factor BTF2/TFIIH complex. ERCC3 is an ATP-dependent DNA helicase that functions in NER. It also is a subunit of basal transcription factor 2 (TFIIH) and, thus, functions in class II transcription. XPG (ERCC5) forms a stable complex with TFIIH, which is active in transcription and NER. ERCC6 encodes a DNA-binding protein that is important in transcription-coupled excision repair. ERCC8 interacts with Cockayne syndrome type B (CSB) protein, with p44 (GTF2H2), a subunit of the RNA polymerase II transcription factor IIH, and ERCC6. It is involved in transcription-coupled excision repair.

Higher error ratios in transcription by RNA polymerase II are observed in the presence of Mn2+ compared to Mg2+.

Transcription coactivators
The EDF1 gene encodes a protein that acts as a transcriptional coactivator by interconnecting the general transcription factor TATA element-binding protein (TBP) and gene-specific activators.

TFIID and human mediator coactivator (THRAP3) complexes (mediator complex, plus THRAP3 protein) assemble cooperatively on promoter DNA, from which they become part of the RNAPII holoenzyme.

Transcription initiation
The completed assembly of the holoenzyme with transcription factors and RNA polymerase II bound to the promoter forms the eukaryotic transcription initiation complex. Transcription in the archaea domain is similar to transcription in eukaryotes.

Transcription begins with the matching of NTPs to the first and second positions in the DNA sequence. This, like most of the remainder of transcription, is an energy-dependent process, consuming adenosine triphosphate (ATP) or other NTPs.

Promoter clearance
After the first bond is synthesized, the RNA polymerase must clear the promoter. During this time, there is a tendency to release the RNA transcript and produce truncated transcripts. This is called abortive initiation and is common to both eukaryotes and prokaryotes. In bacteria, abortive initiation continues to occur until the σ factor rearranges, resulting in the transcription elongation complex (which gives a 35 bp moving footprint), and the σ factor is released before 80 nucleotides of mRNA are synthesized. Once the transcript reaches approximately 23 nucleotides, it no longer slips and elongation can occur.

Initiation regulation
Due to the range of genes that Pol II transcribes, this is the polymerase that experiences the most regulation by a range of factors at each stage of transcription. It is also one of the most complex in terms of polymerase cofactors involved.

Initiation is regulated by many mechanisms. These can be separated into two main categories:
 * 1) Protein interference.
 * 2) Regulation by phosphorylation.

Regulation by protein interference
Protein interference is the process wherein a signaling protein interacts, either with the promoter or with some stage of the partially constructed complex, to prevent further construction of the polymerase complex and so prevent initiation. In general, this is a very rapid response and is used for fine-level control of individual genes and for 'cascade' processes affecting a group of genes useful under specific conditions (for example, DNA repair genes or heat shock genes).

Chromatin structure inhibition is the process wherein the promoter is hidden by chromatin structure. Chromatin structure is controlled by post-translational modification of the histones involved and leads to grossly high or low levels of transcription. See: chromatin, histone, and nucleosome.

These methods of control can be combined in a modular method, allowing very high specificity in transcription initiation control.

Regulation by phosphorylation
The largest subunit of Pol II (Rpb1) has a domain at its C-terminus called the CTD (C-terminal domain). This is the target of kinases and phosphatases. The phosphorylation of the CTD is an important regulation mechanism, as this allows attraction and rejection of factors that have a function in the transcription process. The CTD can be considered as a platform for transcription factors.

The CTD consists of repetitions of an amino acid motif, YSPTSPS, of which serines and threonines can be phosphorylated. The number of these repeats varies; the mammalian protein contains 52, while the yeast protein contains 26. Site-directed mutagenesis of the yeast protein has found that at least 10 repeats are needed for viability. There are many different combinations of phosphorylations possible on these repeats, and these can change rapidly during transcription. The regulation of these phosphorylations and the consequences for the association of transcription factors play a major role in the regulation of transcription.

During the transcription cycle, the CTD of the large subunit of RNAP II is reversibly phosphorylated. RNAP II containing unphosphorylated CTD is recruited to the promoter, whereas the hyperphosphorylated CTD form is involved in active transcription. Phosphorylation occurs at two sites within the heptapeptide repeat, at Serine 5 and Serine 2. Serine 5 phosphorylation is confined to promoter regions and is necessary for the initiation of transcription, whereas Serine 2 phosphorylation is important for mRNA elongation and 3'-end processing.

Elongation
The process of elongation is the synthesis of a copy of the DNA into messenger RNA. RNA Pol II matches complementary RNA nucleotides to the template DNA by Watson-Crick base pairing. These RNA nucleotides are ligated, resulting in a strand of messenger RNA.
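
The base-pairing rule can be shown with a minimal sketch. The function name and the convention of writing the template 3'->5' (so that the RNA comes out 5'->3' without reversing) are choices made for this example.

```python
# Watson-Crick partner of each DNA template base in the RNA product.
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the RNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(RNA_PARTNER[base] for base in template_3_to_5.upper())

rna = transcribe("TACGGT")  # gives "AUGCCA"
```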

Unlike DNA replication, mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of particular mRNA), so many mRNA molecules can be rapidly produced from a single copy of a gene.

Elongation also involves a proofreading mechanism that can replace incorrectly incorporated bases. In eukaryotes, this may correspond with short pauses during transcription that allow appropriate RNA editing factors to bind. These pauses may be intrinsic to the RNA polymerase or due to chromatin structure.

Elongation regulation
Factors that promote RNA Pol II elongation can be summarised in three classes:


 * 1) Drug/sequence-dependent arrest affected factors, e.g., SII (TFIIS) and P-TEFb protein families.
 * 2) Chromatin structure oriented factors. Based on histone post-translational modifications: phosphorylation, acetylation, methylation, and ubiquitination.
 * See: chromatin, histone, and nucleosome
 * 3) RNA Pol II catalysis improving factors. Improve the Vmax or Km of RNA Pol II, so improving the catalytic quality of the polymerase enzyme. E.g., TFIIF, Elongin, and ELL families.
 * See: Enzyme kinetics, Henri–Michaelis–Menten kinetics, Michaelis constant, and Lineweaver–Burk plot
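
To see what improving Vmax or Km means for the third class, the sketch below evaluates the Michaelis-Menten rate law, v = Vmax[S] / (Km + [S]). Treating Pol II elongation as a simple Michaelis-Menten enzyme with NTP as the substrate is a simplification, and the numbers are arbitrary illustrative units, not measured values.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Reaction rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)

ntp = 5.0  # substrate concentration, arbitrary units (assumed)

baseline = michaelis_menten_rate(ntp, vmax=10.0, km=5.0)      # [S] == Km, so v = Vmax / 2
higher_vmax = michaelis_menten_rate(ntp, vmax=20.0, km=5.0)   # faster maximal catalysis
lower_km = michaelis_menten_rate(ntp, vmax=10.0, km=2.5)      # tighter substrate binding
```

Either raising Vmax or lowering Km increases the rate at a fixed substrate concentration, which is the sense in which such factors improve catalysis.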

As for initiation, protein interference, seen here as the "drug/sequence-dependent arrest affected factors" and "RNA Pol II catalysis improving factors", provides a very rapid response and is used for fine-level control of individual genes. Elongation downregulation is also possible, in this case usually by blocking polymerase progress or by deactivating the polymerase.

Chromatin structure-oriented factors are more complex than for initiation control. Often the chromatin-altering factor becomes bound to the polymerase complex, altering the histones as they are encountered and providing a semi-permanent 'memory' of previous promotion and transcription.

Termination
Termination is the process of breaking up the polymerase complex and ending the RNA strand. In eukaryotes using RNA Pol II, this termination is very variable (up to 2000 bases), relying on post-transcriptional modification.

Little regulation occurs at termination, although it has been proposed that newly transcribed RNA is held in place if proper termination is inhibited, allowing very fast expression of genes given a stimulus. This has not yet been demonstrated in eukaryotes.

Transcription factory
Active RNA Pol II transcription holoenzymes can be clustered in the nucleus, in discrete sites called transcription factories. There are ~8,000 such factories in the nucleoplasm of a HeLa cell, but only 100–300 RNAP II foci per nucleus in erythroid cells, as in many other tissue types. The number of transcription factories in tissues is far more restricted than indicated by previous estimates from cultured cells. As an active transcription unit is usually associated with only one Pol II holoenzyme, a polymerase II factory may contain on average ~8 holoenzymes. Colocalization of transcribed genes has not been observed when using cultured fibroblast-like cells. Differentiated or committed tissue types have a limited number of available transcription sites. Estimates show that erythroid cells express at least 4,000 genes, so many genes are obliged to seek out and share the same factory.
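
The sharing argument follows from simple arithmetic on the figures quoted above. The sketch below computes how many expressed genes there are per factory in an erythroid cell; the function name is hypothetical, and the calculation assumes the genes are spread evenly over the factories, which is an idealisation.

```python
def genes_per_factory(expressed_genes, factories):
    """Average number of expressed genes per transcription factory."""
    return expressed_genes / factories

expressed = 4000  # lower bound on genes expressed in erythroid cells (from the text)

# 100 to 300 RNAP II foci per nucleus (from the text).
fewest_shared = genes_per_factory(expressed, 300)  # about 13 genes per factory
most_shared = genes_per_factory(expressed, 100)    # 40 genes per factory
```

With roughly 13 to 40 expressed genes for every factory, genes cannot each have a factory of their own.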

The intranuclear position of many genes is correlated with their activity state. During transcription in vivo, distal active genes are dynamically organized into shared nuclear subcompartments and colocalize to the same transcription factory at high frequencies. Transcription is activated (On) or abated (Off) by movement of a gene into or out of these factories, rather than by recruitment and assembly of a transcription complex at the gene. Usually, genes migrate to preassembled factories for transcription.

An expressed gene is preferentially located outside of its chromosome territory, but a closely linked, inactive gene is located inside.

Holoenzyme stability
RNA polymerase II holoenzyme stability determines the number of base pairs that can be transcribed before the holoenzyme loses its ability to transcribe. The length of the CTD is essential for RNA polymerase II stability. RNA polymerase II stability has been shown to be regulated by post-translational proline hydroxylation. The von Hippel–Lindau tumor suppressor protein (pVHL, human GeneID: 7428) complex binds the hyperphosphorylated large subunit of the RNA polymerase II complex in a proline hydroxylation- and CTD phosphorylation-dependent manner, targeting it for ubiquitination.