USF1

Upstream stimulatory factor 1 is a protein that in humans is encoded by the USF1 gene.

Gene
The upstream stimulatory factor gene encodes a transcription factor, USF, that belongs to the proto-oncogene MYC family and is characterized by a basic helix-loop-helix leucine zipper (bHLH-LZ) motif in the protein structure. USF was originally identified as a regulator of the adenovirus major late promoter, and recent research has further revealed its role in tissue protection. The bHLH-LZ motif enables the transactivation capacity of the USF protein through interaction with the Initiator element (Inr) and E-box motif on the bound DNA. In the context of insulin- and glucose-induced USF activity, these E-box motifs can act as a glucose-responsive element (GRE) and as part of the carbohydrate response element (ChoRE) to interact with transcription factors.

Isoforms
USF comprises two major isoforms: USF1 and USF2. The USF1 gene is located on chromosome region 1q22-q23 in both humans and mice; the USF2 gene is located on chromosome 19q13 in humans and chromosome 19q7 in mice. Both USF1 and USF2 transcripts comprise 10 exons and can undergo exon 4 excision during alternative splicing. From an auto-regulation perspective, these exon 4-excised products act as dominant negative regulators and are found to suppress USF-dependent gene expression.

Protein
Although USF1 and USF2 share 70% of the amino acid sequence in their bHLH-LZ region, only 40% similarity is found in their full-length proteins. In addition, USF1 and USF2 exhibit different protein abundances in a cell type-specific manner. USF1 and USF2 expression increases during the differentiation of erythroid cells. Despite the ubiquitous expression of both isoforms, USF1 and USF2 mediate different biological processes and functions in cells. While USF1 modulates metabolism, immune response, and tissue protection, USF2 primarily controls embryonic development, brain function, iron metabolism, and fertility. Structurally, the highly conserved bHLH-LZ structure at the C-terminus of USF yields high binding specificity and promotes the formation of USF1 homodimers or USF1-USF2 heterodimers for DNA binding. The USF-specific region (USR) in the N-terminal region, on the other hand, facilitates the nuclear translocation and activation of USF1.

Function
This gene encodes a member of the basic helix-loop-helix leucine zipper family and can function as a cellular transcription factor. The encoded protein can activate transcription through pyrimidine-rich initiator (Inr) elements and E-box motifs. This gene has been linked to familial combined hyperlipidemia (FCHL). Two transcript variants encoding distinct isoforms have been identified for this gene.

A study in mice suggested that reduced USF1 levels increase metabolism in brown fat.

Modulation of DNA binding affinity
The symmetrical E-box motif is the main target of bHLH-LZ transcription factors, and USF1 has a high binding affinity for the core sequence CACGTG in the motif. USF1-DNA binding activity can be modulated by cell type-specific DNA methylation and acetylation on the E-box motif or by post-translational modifications of the USF1 protein. For example, CpG methylation on the central E-box motif inhibits the complex formation of USF1 with its co-transcription factors and therefore decreases the corresponding gene expression in mouse lymphosarcoma cells. In contrast, phosphorylation of USF1 by p38 mitogen-activated protein kinases, protein kinase A or protein kinase C increases its binding to the E-box motif and activates gene transcription.
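
The sequence features described above (a symmetrical CACGTG core containing a central CpG) can be illustrated with a short Python sketch. This is only a minimal illustration of exact-match motif scanning; the function names and the example sequence are hypothetical and not drawn from any USF1 study, and real binding-site prediction would typically use position weight matrices rather than exact string matches.

```python
import re

EBOX_CORE = "CACGTG"  # core E-box sequence named in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_ebox_sites(seq: str, motif: str = EBOX_CORE) -> list:
    """Return 0-based start positions of every exact motif match."""
    # A zero-width lookahead reports every start position, including overlaps.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() for m in pattern.finditer(seq.upper())]


def central_cpg_index(site_start: int) -> int:
    """0-based index of the cytosine in the central CpG of a CACGTG site."""
    return site_start + 2


# Hypothetical promoter fragment, invented for illustration only.
fragment = "ttgaCACGTGacctCACGTGca"
sites = find_ebox_sites(fragment)
print(sites)
print(reverse_complement(EBOX_CORE) == EBOX_CORE)  # symmetry check
```

Because CACGTG equals its own reverse complement, scanning one strand is enough to find sites on both strands; `central_cpg_index` marks the cytosine whose methylation state would be of interest.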

Phosphorylation
Mitogen-activated protein kinases (MAPKs) phosphorylate serine and threonine residues of substrate proteins and convert extracellular signals induced by growth factors, mitogens or cytokines into intracellular phosphorylation cascades, which regulate cell proliferation, differentiation, stress responses and apoptosis (programmed cell death). Phosphorylation by MAPKs induces a conformational change in the USF protein and exposes its DNA-binding domain for interaction. This increased structural exposure enhances DNA binding and therefore the transcriptional activity of USF.


 * ERK1 (also known as MAPK3) and ERK2 (also known as MAPK1) phosphorylate USF1 in response to TGF-β signaling in vascular smooth muscle cells. SMAD2 and SMAD3 signaling following TGF-β receptor activation can also cooperate with EGFR / ERK pathways to activate USF1, which in turn regulates the gene expression of plasminogen activator inhibitor-1 (PAI-1), a significant biomarker and predictor of cardiovascular disease-related death and a marker of poor prognosis in breast cancer.
 * Casein kinase 2 or CK-II (CK2) is a tetrameric enzyme composed of two catalytic and two regulatory subunits. In pancreatic cells, CK2 phosphorylates USF1, PDX1 and MST1 to suppress insulin expression.

Gene transcription

 * Transforming growth factor β1 (TGF-β1) is encoded by the TGFB1 gene, which contains an E-box within the promoter region and has been implicated in excessive extracellular matrix accumulation under high-glucose conditions. Overexpression of either USF1 or USF2 is found to elevate TGFB1 promoter activity in human embryonic kidney cells. However, only USF1 overexpression leads to increased TGF-β1 secretion.
 * Thrombospondin 1 (TSP1) is involved in the development of diabetic nephropathy. USF1/2 binds to the E-box motif (CAGATG) on the human THBS1 promoter and regulates high-glucose-induced TSP1 expression in mesangial cells. USF2 overexpression has been found to augment THBS1 promoter activity and TSP1 expression. The resulting increase in TSP1 expression further promotes the formation of active TGF-β.
 * AP-1 transcription factor (AP-1) refers to a complex of dimeric transcription factors composed of c-Jun, c-Fos or activating transcription factor (ATF) that bind to the AP-1 binding site on DNA. cJun-cJun / cJun-cFos dimers preferentially bind to the 12-O-tetradecanoylphorbol-13-acetate (TPA)-responsive element (TRE, TGACTCA), whereas cJun-ATF dimers and ATF homodimers preferentially bind to the cAMP-responsive element (CRE, TGACGTCA). The AP-1 complex becomes activated in response to high glucose, oxidative stress, low-density lipoprotein (LDL) and oxidised LDL. It has been reported that a high glucose level upregulates USF and AP-1 binding activities, as well as the protein level of cFos.
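
The distinction between the two AP-1 site types named in the last item can be sketched in Python as a simple exact-match lookup. The helper name, the dimer labels and the example sequence are hypothetical and serve only to illustrate the TRE/CRE consensus sequences quoted above; only the strand given as input is scanned, and degenerate site variants are not considered.

```python
TRE = "TGACTCA"   # TPA-responsive element, as quoted in the text
CRE = "TGACGTCA"  # cAMP-responsive element, as quoted in the text

# Preferred dimers for each element, as stated in the text.
PREFERRED_DIMERS = {
    "TRE": "cJun-cJun / cJun-cFos",
    "CRE": "cJun-ATF / ATF-ATF",
}


def find_ap1_sites(seq: str) -> list:
    """Return sorted (start, element) tuples for exact TRE and CRE matches.

    Only the given strand is scanned; positions are 0-based.
    """
    seq = seq.upper()
    hits = []
    for name, motif in (("TRE", TRE), ("CRE", CRE)):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, name))
            start = seq.find(motif, start + 1)
    return sorted(hits)


# Hypothetical sequence, invented for illustration only.
example = "ggTGACTCAttTGACGTCAcc"
for start, element in find_ap1_sites(example):
    print(start, element, PREFERRED_DIMERS[element])
```

The two consensus sequences differ by a single inserted G, so an exact match is enough to tell them apart in this toy setting.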

Interaction between USF1 and other transcription factors, including SP1, PEA3 (also known as ETV4) and MTF1, also leads to cooperative transcriptional regulation. For instance, the leucine zipper motif of USF1 recruits PEA3 to form a ternary complex and co-regulates the transcription of BAX, an apoptosis regulator. Another USF1-regulated target is topoisomerase III (hTOP3α), which catalyzes topological changes of DNA, modifies DNA supercoil structures, and increases chromatin accessibility for gene expression. Similar interactions exist between USF1 and JMJD1C, an H3K9 demethylase, in which the molecular interactions change chromatin accessibility and elevate the transcription of a series of lipogenic genes, including FASN, ACC, ACLY, and SREBP1.

Chromosome boundary by USF
Chromatin is generally classified into euchromatin and heterochromatin, which differ in histone modifications, compaction levels, and the resulting gene expression patterns. Heterochromatin is a tightly condensed and transcriptionally repressed chromatin domain that is characterized by distinct combinations of histone post-translational modifications. Heterochromatin is required for genome stability and gene expression regulation. However, it can spread into neighboring DNA regions and inactivate gene expression. Chromosome boundary elements are thus necessary to block such stochastic spreading of heterochromatin and maintain stable gene expression. USF1 and USF2 have been found to recruit various histone-modifying complexes, including the histone H3 methyltransferase Set1 complex and the H4 arginine 3 methyltransferase PRMT1, with the latter known to establish active chromatin domains. USF1/USF2 binding deposits a high level of activating histone modifications on adjacent nucleosomes and thus prevents the propagation of silencing modifications, such as H3K9 and H3K27 methylation, from the heterochromatin. Other USF1/USF2-related chromatin modifications include the recruitment of the E3 ubiquitin ligase RNF20 to monoubiquitinate histone H2B. The loss of RNF20 is found to cause an extension of the silencing modifications from the 16 kb heterochromatic domain into the β-globin locus. Moreover, USF1 and USF2 can bind to the 5' DNase I hypersensitive site HS4 and recruit an H3 acetyltransferase, PCAF, which blocks the heterochromatin spread into the β-globin locus.

FASN transactivation for lipogenesis
USF is known to bind the L-type pyruvate kinase promoter at high glucose and insulin levels. Excessive insulin activates kinases and phosphatases that post-translationally modify USF, sterol regulatory element-binding protein 1C (SREBP1C), carbohydrate-responsive element-binding protein (ChREBP), and liver X receptors (LXRs). With insulin stimulation, USF1 and USF2 bind to the E-boxes at -332 and -65 in the promoter region of FASN, which encodes fatty acid synthase (FAS) for lipogenesis.

Various post-translational modifications of USF1 determine its activity and signaling pathways and can affect the lipogenesis process. An abnormal increase in the USF-mediated de novo fatty acid synthesis is found to cause intracellular fatty acid accumulation and deregulate gene expression and cellular processes like tumor cell survival.

Lipogenic pathways

 * In response to insulin elevation, DNA-dependent protein kinase (DNA-PK), which is involved in DNA damage repair, becomes dephosphorylated and activated. The active form of DNA-PK indirectly phosphorylates USF1 at S262 through AMP-activated protein kinase (AMPK). The S262 phosphorylation increases USF1 interaction with SREBP1C near the sterol regulatory element (SRE) and facilitates the synergistic activation of SREBP1C and transcription of the downstream lipogenic genes.
 * USF1 S262 phosphorylation also recruits PCAF to acetylate USF1 at the site K237. Both S262 phosphorylation and K237 acetylation enhance USF1 activities and the subsequent transcriptional activation of the fatty acid synthase gene (FASN). Fatty acid synthase (FAS), together with Acetyl-CoA carboxylase (ACC), produces malonyl-CoA, converts it to long-chain fatty acids, and promotes the de novo fatty-acid synthesis for energy provision and protein lipidation.
 * USF1 modified with S262 phosphorylation and K237 acetylation also recruits BRG1 (also known as SMARCA4)-associated factor 60c (BAF60c). BAF60c is then phosphorylated by atypical protein kinase C (aPKC) at S257, allowing it to form a LipoBAF complex at promoters of lipogenic genes to regulate chromatin structure and gene transcription.
 * In contrast, HDAC9 deacetylates USF1 during cell fasting, prevents the recruitment of USF1-interacting factors, and suppresses the transcriptional activation of lipogenic genes.

In early embryonic development
USF1 transcription is highly dynamic during early embryonic development: USF1 mRNA first increases significantly between the 2-cell and 8-cell stages and then decreases to an undetectable level at the blastocyst stage, indicating a role in embryonic genome activation. siRNA knockdown of USF1 has been shown to compromise the blastocyst rate and deregulate the transcripts of twist-related protein 2 (increased) and of growth differentiation factor-9 and follistatin (decreased) by affecting the E-box regions of their promoters during oocyte maturation.

Diabetic kidney disease
Diabetic kidney disease (DKD), or diabetic nephropathy, is a progressive disease marked by microalbuminuria, a slight loss of albumin in the urine (30–300 mg per day); DKD has been viewed as a renal manifestation of diabetes-related microvascular disorder. In kidney biopsy, DKD is characterized by glomerular and tubular basement membrane thickening, mesangial expansion, glomerulosclerosis, podocyte effacement and nephron loss. DKD occurs in 30%-50% of the diabetic patient population and leads to kidney failure in up to 20% of type 1 diabetic patients. However, a substantial portion of DKD patients do not manifest albuminuria. DKD pathogenesis is attributed to dysregulated glucose transport at a high glucose level and the excessive influx of glucose into endothelial cells. The elevated glucose level is sustained along with multiple metabolic phenotypes such as excess fatty acids and oxidative stress, as well as shear stresses induced by hypertension and hyperperfusion, and can lead to microvascular rarefaction, hypoxia and maladaptation in glomerular neoangiogenesis.

As an insulin-sensitive transcription factor that becomes active in response to a high glucose level, USF1 promotes the transactivation of genes involved in lipid metabolism, including hepatic lipase (LIPC), hepatocyte nuclear factor 4 alpha (HNF4A), apolipoprotein AI (APOA1), apolipoprotein L1 (APOL1) and haptoglobin-related protein (HPR). In particular, APOL1 is known to complex with APOA-I and HDL to facilitate cell autophagy in response to injuries and prevent glomerular diseases; however, an APOL1 risk variant specific to podocytes inhibits cell autophagy and can trigger kidney disease.

Increased FASN-mediated de novo lipid synthesis
Cancer cells exhibit a set of phenotypes, including a marked increase in aerobic glycolysis and lactic acid production (known as the Warburg effect), elevated protein and DNA synthesis, and increased de novo or endogenous fatty acid synthesis by fatty acid synthase (FAS). FAS synthesizes primarily palmitate from malonyl-CoA, which is further esterified to triglycerides for energy storage. Normally, FASN is active during embryogenesis and in fetal lungs for lubricant production; however, its expression is physiologically low in non-cancerous adult cells. In contrast, abnormal FASN overexpression is detected in multiple cancer types, including breast cancer, colorectal cancer, prostate cancer, pancreatic cancer and ovarian cancer. FASN-mediated de novo lipid synthesis accounts for more than 93% of triglycerides in tumor cells. Specifically, tumor cells prefer glycolysis over oxidation for energy production and redirect the glycolytic products towards de novo fatty acid synthesis to supply lipids for membrane production and protein lipidation during fast cell proliferation. For example, PI3K-AKT pathway activity is found to increase in LNCaP prostate cancer cells to stimulate FASN overexpression. Concurrently, fatty acid synthase overexpression is also post-translationally sustained by USP2a-mediated deubiquitination, which stabilizes FAS for constitutive signal transduction. In addition to de novo lipogenesis, FAS promotes the localization of VEGFR-2 to the lipid raft of the endothelial cell membrane and thus enhances angiogenesis in tumor development. Meanwhile, mutual activation between FAS and ERBB2 (HER2) signaling also potentiates tumorigenesis: ERBB2 amplification is associated with elevated survival and proliferation of cancer cells and poor prognosis in breast and gastric cancers, and an ERBB2 increase in particular contributes to 18-25% of breast cancers.

In prostate cancer cells and promyelocytic leukemia cells, USF1 activation also sustains a high level of PAI-1 expression and inhibits spontaneous or camptothecin-induced apoptosis.

Decreased USF1-p53 interaction and increased p53 instability
The poor prognosis of gastric cancers is associated with low expression of USF1 and p53. Among gastric cancer patients, 88% are diagnosed with H. pylori infection, and half show lower USF1 expression in tumor tissues. Mechanistically, H. pylori induces DNA hypermethylation in the promoter regions of USF1 and USF2 and inhibits their expression. Decreased expression reduces the interaction between USF1 and p53 when DNA damage occurs, causing p53 to associate more frequently with the E3 ubiquitin ligase HDM2 (also known as MDM2) and increasing p53 instability in cancer cells.

Familial combined hyperlipidemia
The term familial combined hyperlipidemia (FCHL) was first used to describe lipid abnormalities in 47 Seattle pedigrees containing members with hypercholesterolemia and hypertriglyceridemia. The core FCHL lipid profile features high serum cholesterol/triglyceride, apolipoprotein B (APOB) and LDL levels. Genetic evidence has suggested an FCHL-related locus on human chromosome 1q21-q23, which is linked to metabolic syndromes. Fine-mapping of the linked region identified USF1 as the first positionally cloned gene for FCHL and a target for FCHL treatment. In addition, hepatocyte nuclear factor 4 alpha (HNF4A) is also implicated in high lipid levels and metabolic syndromes. Cooperative effects of USF1 and HNF4A have been shown to regulate the expression of apolipoprotein A-II (APOA2) and apolipoprotein C-III (APOC3). Mutations in USF1, HNF4A and apolipoproteins also increase patients' susceptibility to FCHL. Additional genes subject to USF1 regulation and involved in glucose/lipid metabolism include apolipoprotein A5 (APOA5), apolipoprotein E (APOE), hormone-sensitive lipase (LIPE), hepatic lipase (LIPC), glucokinase (GCK), islet-specific glucose-6-phosphatase catalytic-subunit-related protein (IGRP), insulin, glucagon receptor (GCGR) and ATP-binding cassette transporter A1 (ABCA1).

Interactions
USF1 has been shown to interact with USF2, FOSL1 and GTF2I.