Mothers against decapentaplegic homolog 4

SMAD4, also called SMAD family member 4, Mothers against decapentaplegic homolog 4, or DPC4 (Deleted in Pancreatic Cancer-4), is a highly conserved protein present in all metazoans. It belongs to the SMAD family of transcription factor proteins, which act as mediators of TGF-β signal transduction. The TGF-β family of cytokines regulates critical processes during the life cycle of metazoans, with important roles in embryonic development, tissue homeostasis, regeneration, and immune regulation.

SMAD4 belongs to the co-SMAD (common-mediator SMAD) group, the second class of the SMAD family, and is the only known co-SMAD in most metazoans. It also belongs to the Dwarfin family of proteins, which modulate members of the TGF-β protein superfamily, a family of proteins that all play a role in regulating cellular responses. Mammalian SMAD4 is a homolog of the Drosophila "Mothers against decapentaplegic" protein named Medea.

SMAD4 interacts with R-SMADs, such as SMAD2, SMAD3, SMAD1, SMAD5 and SMAD8 (also called SMAD9), to form heterotrimeric complexes. Transcriptional coregulators, such as WWTR1 (TAZ), interact with SMADs to promote their function. Once in the nucleus, the complex of SMAD4 and two R-SMADs binds to DNA and regulates the expression of different genes depending on the cellular context. Intracellular reactions involving SMAD4 are triggered when growth factors of the TGF-β family bind to receptors on the cell surface. The sequence of intracellular reactions involving SMADs is called the SMAD pathway or the transforming growth factor beta (TGF-β) pathway, since it starts with the recognition of TGF-β by cells.

Gene
In mammals, SMAD4 is encoded by a gene located on chromosome 18. In humans, the SMAD4 gene spans 54,829 base pairs, from position 51,030,212 to position 51,085,041, in region 21.1 of the long arm of chromosome 18 (18q21.1).
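The quoted length can be checked against the two coordinates with simple arithmetic. The following is a minimal sketch; the variable names are illustrative, and the text does not state whether the length counts both end positions, so both conventions are shown.

```python
# Coordinates quoted in the text for the human SMAD4 gene on chromosome 18.
start = 51_030_212
end = 51_085_041

# Plain difference between the two positions (half-open interval).
span_exclusive = end - start

# Number of positions when both endpoints are counted (closed interval).
span_inclusive = end - start + 1

print(span_exclusive)  # 54829
print(span_inclusive)  # 54830
```

The plain difference reproduces the figure of 54,829 base pairs given in the text.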



Protein
SMAD4 is a 552-amino-acid polypeptide with a molecular weight of 60,439 Da (about 60.4 kDa). It has two functional domains, known as MH1 and MH2.
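As a plausibility check on these figures, the mass can be divided by the chain length to give an average mass per residue, which for typical proteins is on the order of 110 Da. This is a rough sketch, assuming a length of 552 residues and a mass of 60,439 Da (about 60.4 kDa); the variable names are illustrative.

```python
# Figures for SMAD4: chain length and molecular mass in daltons.
n_residues = 552
mass_da = 60_439

# Average mass contributed by each amino-acid residue.
avg_residue_mass = mass_da / n_residues

# The same mass expressed in kilodaltons.
mass_kda = mass_da / 1000

print(round(avg_residue_mass, 1))  # 109.5
print(mass_kda)                    # 60.439
```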

The complex of two SMAD3 (or two SMAD2) proteins and one SMAD4 binds directly to DNA through interactions of their MH1 domains. These complexes are recruited to sites throughout the genome by cell lineage-defining transcription factors (LDTFs) that determine the context-dependent nature of TGF-β action. Early insights into the DNA-binding specificity of SMAD proteins came from oligonucleotide binding screens, which identified the palindromic duplex 5'–GTCTAGAC–3' as a high-affinity binding sequence for the SMAD3 and SMAD4 MH1 domains. Other motifs have also been identified in promoters and enhancers. These additional sites contain the CAGCC motif and the GGC(GC|CG) consensus sequences (GGCGC or GGCCG), the latter also known as 5GC sites. The 5GC motifs are highly represented as clusters of sites in SMAD-bound regions genome-wide. These clusters can also contain CAG(AC|CC) sites (CAGAC or CAGCC). The SMAD3/SMAD4 complex also binds to TPA-responsive gene promoter elements, which have the sequence motif TGAGTCAG.
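The motifs listed above can be written as regular expressions and searched for in a DNA sequence. The sketch below is only an illustration: the motif labels, the function names and the test sequence are hypothetical, and simple pattern matching says nothing about actual binding affinity or chromatin context.

```python
import re

# Motif patterns taken from the text; the dictionary keys are illustrative labels.
MOTIFS = {
    "palindromic_SBE": "GTCTAGAC",
    "5GC": "GGC(?:GC|CG)",             # GGCGC or GGCCG
    "CAGAC_or_CAGCC": "CAG(?:AC|CC)",  # CAGAC or CAGCC
    "TRE": "TGAGTCAG",                 # TPA-responsive element
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(seq):
    """Map each motif label to the 0-based start positions found in seq."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead reports overlapping occurrences as well.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits


# Hypothetical sequence built for this example, not taken from a real promoter.
example = "AAGTCTAGACTTGGCGCAACAGACTTTGAGTCAGGGCCGAA"

# A palindromic site reads the same on both strands.
print(reverse_complement("GTCTAGAC") == "GTCTAGAC")  # True
print(find_motifs(example))
```

To cover both strands, the same function can be applied to `reverse_complement(example)`; positions are then relative to the reverse strand.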

MH1 domain complexes with DNA motifs
The first structure of SMAD4 bound to DNA was the complex with the palindromic GTCTAGAC motif. Recently, the structures of the SMAD4 MH1 domain bound to several 5GC motifs have also been determined. In all complexes, the interaction with the DNA involves a conserved β-hairpin present in the MH1 domain. The hairpin is partially flexible in solution, and its high degree of conformational flexibility allows recognition of the different 5-bp sequences. Efficient interactions with GC sites occur only if a G nucleotide is located deep in the major groove and establishes hydrogen bonds with the guanidinium group of Arg81. This interaction facilitates a complementary surface contact between the SMAD DNA-binding hairpin and the major groove of the DNA. Other direct interactions involve Lys88 and Gln83. The X-ray crystal structure of the Trichoplax adhaerens SMAD4 MH1 domain bound to the GGCGC motif indicates a high conservation of this interaction in metazoans.



MH2 domain complexes
The MH2 domain, corresponding to the C-terminus, is responsible for receptor recognition and association with other SMADs. It interacts with the MH2 domains of R-SMADs and forms heterodimers and heterotrimers. Some tumor mutations detected in SMAD4 enhance interactions between the MH1 and MH2 domains.

Nomenclature and origin of name
SMADs are highly conserved across species, especially in the N-terminal MH1 domain and the C-terminal MH2 domain. The SMAD proteins are homologs of both the Drosophila protein MAD and the C. elegans protein SMA, and the name is a combination of the two. During Drosophila research, it was found that a mutation in the gene MAD in the mother repressed the gene decapentaplegic in the embryo. The phrase "Mothers against" was added in reference to the organizations that mothers often form to oppose various issues, e.g. Mothers Against Drunk Driving (MADD), reflecting "the maternal-effect enhancement of dpp" and following a tradition of unusual naming within the research community. SMAD4 is also known as DPC4, JIP or MADH4.

Function and action mechanism
SMAD4 is an essential effector in the SMAD pathway. It serves as a mediator between extracellular growth factors of the TGF-β family and genes inside the cell nucleus, and is therefore also described as a signal transducer. The "co" in co-SMAD stands for common mediator.

In the TGF-β pathway, TGF-β dimers are recognized by a transmembrane receptor known as the type II receptor. Once the type II receptor is activated by the binding of TGF-β, it phosphorylates a type I receptor, which is also a cell-surface receptor. The type I receptor then phosphorylates intracellular receptor-regulated SMADs (R-SMADs) such as SMAD2 or SMAD3. The phosphorylated R-SMADs then bind to SMAD4, forming a heteromeric complex. This complex moves from the cytoplasm to the nucleus, a step called translocation. SMAD4 may form heterotrimeric, heterohexameric or heterodimeric complexes with R-SMADs.
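The order of events described above can be sketched as a simple linear list of steps. This is a toy model for illustration only: the step labels and the function name are hypothetical, and it ignores complex stoichiometry, feedback and cross-talk with other pathways.

```python
# Ordered steps of the pathway as described in the text.
# The step labels are illustrative names, not standard nomenclature.
PATHWAY = [
    ("ligand_binding", "TGF-β dimer binds the type II receptor"),
    ("type_I_activation", "type II receptor phosphorylates the type I receptor"),
    ("r_smad_phosphorylation", "type I receptor phosphorylates SMAD2 or SMAD3"),
    ("complex_formation", "phosphorylated R-SMADs bind SMAD4"),
    ("translocation", "heteromeric complex moves from cytoplasm to nucleus"),
]


def steps_completed(blocked_step=None):
    """Return the labels of steps that occur before the first blocked step."""
    done = []
    for label, _description in PATHWAY:
        if label == blocked_step:
            break
        done.append(label)
    return done


# With nothing blocked, every step up to translocation is reached.
print(steps_completed())

# If SMAD4 cannot bind R-SMADs, the toy model stops before complex formation.
print(steps_completed("complex_formation"))
```

In this simplified picture, anything that prevents SMAD4 from joining the complex leaves the upstream receptor events intact but stops the signal from reaching the nucleus.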

SMAD4 is a substrate of the Erk/MAPK kinase and GSK3. The FGF (Fibroblast Growth Factor) pathway stimulation leads to Smad4 phosphorylation by Erk of the canonical MAPK site located at Threonine 277. This phosphorylation event has a dual effect on Smad4 activity. First, it allows Smad4 to reach its peak of transcriptional activity by activating a growth factor-regulated transcription activation domain located in the Smad4 linker region, SAD (Smad-Activation Domain). Second, MAPK primes Smad4 for GSK3-mediated phosphorylations that cause transcriptional inhibition and also generate a phosphodegron used as a docking site by the ubiquitin E3 ligase Beta-transducin Repeat Containing (beta-TrCP) that polyubiquitinates Smad4 and targets it for degradation in the proteasome. Smad4 GSK3 phosphorylations have been proposed to regulate the protein stability during pancreatic and colon cancer progression.

In the nucleus, the heteromeric complex binds promoters and interacts with transcriptional activators. SMAD3/SMAD4 complexes can directly bind the Smad-binding element (SBE). These associations are weak and require additional transcription factors, such as members of the AP-1 family, TFE3 and FoxG1, to regulate gene expression.

Many TGF-β ligands use this pathway; consequently, SMAD4 is involved in many cell functions, such as differentiation, apoptosis, gastrulation, embryonic development and the cell cycle.

Clinical significance
Genetic experiments such as gene knockout (KO), which consist of modifying or inactivating a gene, can be carried out to observe the effects of a dysfunctional SMAD4 on the study organism. Such experiments are often conducted in the house mouse (Mus musculus).

It has been shown that, in SMAD4 knockout mice, the granulosa cells, which secrete hormones and growth factors during oocyte development, undergo premature luteinization and express lower levels of follicle-stimulating hormone receptors (FSHR) and higher levels of luteinizing hormone receptors (LHR). This may be due in part to impairment of the effects of bone morphogenetic protein-7 (BMP-7), as BMP-7 uses the SMAD4 signaling pathway.

Deletions in the genes coding for SMAD1 and SMAD5 have also been linked to metastatic granulosa cell tumors in mice.

SMAD4 is found mutated in many cancers. The mutation can be inherited or acquired during an individual's lifetime. If inherited, the mutation affects both somatic cells and cells of the reproductive organs. If the SMAD4 mutation is acquired, it will exist only in certain somatic cells. Indeed, SMAD4 is not synthesized by all cells. The protein is present in skin, pancreatic, colon, uterus and epithelial cells. It is also produced by fibroblasts. Functional SMAD4 participates in the regulation of the TGF-β signal transduction pathway, which negatively regulates the growth of epithelial cells and the extracellular matrix (ECM). When the structure of SMAD4 is altered, expression of the genes involved in cell growth is no longer regulated, and cell proliferation can proceed without inhibition. The large number of cell divisions leads to the formation of tumors and then to multiploid colorectal cancer and pancreatic carcinoma. SMAD4 is found inactivated in at least 50% of pancreatic cancers.

Somatic mutations in the MH1 domain of SMAD4 found in human cancers have been shown to inhibit the DNA-binding function of this domain.

SMAD4 is also found mutated in the autosomal dominant disease juvenile polyposis syndrome (JPS). JPS is characterized by hamartomatous polyps in the gastrointestinal (GI) tract. These polyps are usually benign; however, affected individuals are at greater risk of developing gastrointestinal cancers, in particular colon cancer. Around 60 mutations causing JPS have been identified. They have been linked to the production of a smaller SMAD4 protein, with missing domains that prevent it from binding to R-SMADs and forming heteromeric complexes.

Mutations in SMAD4 (mostly substitutions) can cause Myhre syndrome, a rare inherited disorder characterized by mental disabilities, short stature, unusual facial features, and various bone abnormalities.