CTCF

Transcriptional repressor CTCF, also known as 11-zinc finger protein or CCCTC-binding factor, is a transcription factor that in humans is encoded by the CTCF gene. CTCF is involved in many cellular processes, including transcriptional regulation, insulator activity, V(D)J recombination, and regulation of chromatin architecture.

Discovery
CCCTC-binding factor (CTCF) was initially discovered as a negative regulator of the chicken c-myc gene. The protein was found to bind three regularly spaced repeats of the core sequence CCCTC and was therefore named CCCTC-binding factor.

Function
The primary role of CTCF is thought to be in regulating the 3D structure of chromatin. CTCF binds together strands of DNA, thus forming chromatin loops, and anchors DNA to cellular structures like the nuclear lamina. It also defines the boundaries between active and heterochromatic DNA.

Since the 3D structure of DNA influences the regulation of genes, CTCF's activity influences gene expression. CTCF is thought to be a primary component of the activity of insulators, sequences that block the interaction between enhancers and promoters. CTCF binding has also been shown both to promote and to repress gene expression. It is unknown whether CTCF affects gene expression solely through its looping activity or whether it has some other, as yet unidentified, activity.

A recent study showed that, in addition to demarcating topologically associating domains (TADs), CTCF mediates promoter–enhancer loops, often located in promoter-proximal regions, to facilitate promoter–enhancer interactions within a single TAD. This is in line with the concept that a subpopulation of CTCF associates with the RNA polymerase II (Pol II) protein complex to activate transcription. It is likely that CTCF helps to bridge transcription factor-bound enhancers to transcription start site-proximal regulatory elements and to initiate transcription by interacting with Pol II, supporting a role for CTCF in facilitating contacts between transcriptional regulatory sequences. This model was demonstrated in earlier work on the beta-globin locus.

Observed activity
The binding of CTCF has been shown to have many effects, which are enumerated below. In each case, it is unknown if CTCF directly evokes the outcome or if it does so indirectly (in particular through its looping role).

Transcriptional regulation
CTCF plays a major role in repressing the insulin-like growth factor 2 gene by binding to the H19 imprinting control region (ICR), along with differentially methylated region 1 (DMR1) and MAR3.

Insulation
Binding of CTCF to its target sequence elements can block the interaction between enhancers and promoters, thereby limiting the activity of enhancers to certain functional domains. Besides acting as an enhancer blocker, CTCF can also act as a chromatin barrier by preventing the spread of heterochromatin structures.
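The enhancer-blocking idea can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional chromosome on which every CTCF-bound insulator is a hard boundary, so an enhancer can only act on promoters in the same domain. The function names and coordinates are hypothetical, and real insulation is neither absolute nor purely linear.

```python
from bisect import bisect_right

def domain_index(position, boundaries):
    """Return the index of the domain containing `position`.

    `boundaries` must be a sorted list of insulator coordinates;
    domain 0 lies to the left of the first insulator.
    """
    return bisect_right(boundaries, position)

def can_interact(enhancer, promoter, insulators):
    """Toy rule (assumption): an enhancer reaches a promoter only if
    no insulator coordinate separates the two positions."""
    boundaries = sorted(insulators)
    return domain_index(enhancer, boundaries) == domain_index(promoter, boundaries)

# Hypothetical coordinates: one insulator at position 250.
print(can_interact(100, 200, [250]))  # same domain
print(can_interact(100, 400, [250]))  # separated by the insulator
```

In this toy model, removing the insulator at 250 merges the two domains, which mirrors the loss of enhancer blocking described above.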

Regulation of chromatin architecture
CTCF physically binds to itself to form homodimers, which causes the bound DNA to form loops. CTCF also occurs frequently at the boundaries of sections of DNA bound to the nuclear lamina. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) showed that CTCF co-localizes with cohesin genome-wide and affects gene regulatory mechanisms and higher-order chromatin structure. DNA loops are currently believed to form by a "loop extrusion" mechanism, in which the cohesin ring is actively translocated along the DNA until it meets CTCF. CTCF must be in the proper orientation to stop cohesin.
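The orientation dependence of loop extrusion can be sketched as a toy model. It assumes the commonly described convergent rule: the leg of cohesin moving left is stopped only by a site whose motif points right ("+"), and the leg moving right only by a site whose motif points left ("-"). The class, function, and coordinates are hypothetical, and the model ignores cohesin unloading, partial blocking, and collisions between cohesins.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CtcfSite:
    position: int  # coordinate on a hypothetical 1D chromosome
    strand: str    # "+" motif points right, "-" motif points left

def extrude_loop(load_position, sites, chrom_length):
    """Return the (left, right) anchors of the loop made by one cohesin.

    Each leg stops at the nearest site that faces it (toy assumption);
    a leg that meets no such site runs to the chromosome end.
    """
    left_blockers = [s.position for s in sites
                     if s.strand == "+" and s.position <= load_position]
    right_blockers = [s.position for s in sites
                      if s.strand == "-" and s.position >= load_position]
    left = max(left_blockers, default=0)
    right = min(right_blockers, default=chrom_length)
    return left, right

sites = [CtcfSite(100, "+"), CtcfSite(300, "-"),
         CtcfSite(400, "+"), CtcfSite(700, "-")]
print(extrude_loop(200, sites, 1000))  # loop between the convergent pair 100/300
print(extrude_loop(350, sites, 1000))  # passes the wrongly oriented sites at 300 and 400
```

A cohesin loaded at 350 extrudes past the sites at 300 and 400 because neither faces the approaching leg, so the resulting loop is anchored at 100 and 700 instead.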

Regulation of RNA splicing
CTCF binding has been shown to influence mRNA splicing.

DNA binding
CTCF binds to the consensus sequence CCGCGNGGNGGCAG (in IUPAC notation). This sequence specificity is defined by the 11 zinc finger motifs in the protein's structure. CTCF's binding is disrupted by CpG methylation of the DNA it binds to. Conversely, CTCF binding may set boundaries for the spreading of DNA methylation. Recent studies report that loss of CTCF binding increases localized CpG methylation, which reflects another epigenetic remodeling role of CTCF in the human genome.
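As an illustration of how an IUPAC consensus is read, the following minimal sketch converts the consensus quoted above into a regular expression and scans both strands of a DNA string for exact matches. The function names and the example sequence are hypothetical. Real binding-site prediction normally uses position weight matrices rather than exact consensus matching, so this is only a rough approximation.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

CONSENSUS = "CCGCGNGGNGGCAG"  # consensus as quoted in the text

def iupac_to_regex(motif):
    """Translate an IUPAC motif into a regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())

def reverse_complement(seq):
    """Reverse-complement a DNA string (characters other than ACGT are kept)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq, motif=CONSENSUS):
    """Return sorted (start, strand) pairs, 0-based on the forward strand."""
    seq = seq.upper()
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    n, k = len(seq), len(motif)
    # Matches on the reverse strand are mapped back to forward coordinates.
    hits += [(n - m.start() - k, "-")
             for m in pattern.finditer(reverse_complement(seq))]
    return sorted(hits)

# Hypothetical sequence containing one forward-strand match at offset 2.
print(find_sites("TTCCGCGAGGTGGCAGAA"))
```

The strand reported for each match is the kind of orientation information that matters for the loop extrusion mechanism described earlier.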

Across 19 diverse cell types (12 normal and 7 immortal), CTCF binds an average of about 55,000 DNA sites per cell type, with 77,811 distinct binding sites in total across all 19. CTCF's ability to bind multiple sequences by using various combinations of its zinc fingers earned it the status of a “multivalent protein”. More than 30,000 CTCF binding sites have been characterized. The human genome contains anywhere between 15,000 and 40,000 CTCF binding sites depending on cell type, suggesting a widespread role for CTCF in gene regulation. In addition, CTCF binding sites act as nucleosome positioning anchors, so that, when they are used to align various genomic signals, multiple flanking nucleosomes can be readily identified. On the other hand, high-resolution nucleosome mapping studies have demonstrated that differences in CTCF binding between cell types may be attributed to differences in nucleosome locations. Methylation loss at the CTCF-binding sites of some genes has been found to be related to human diseases, including male infertility.

Protein-protein interactions
CTCF binds to itself to form homodimers. CTCF has also been shown to interact with Y box binding protein 1. CTCF also co-localizes with cohesin, which extrudes chromatin loops by actively translocating one or two DNA strands through its ring-shaped structure, until it meets CTCF in a proper orientation. CTCF is also known to interact with chromatin remodellers such as Chd4 and Snf2h (SMARCA5).