SALL4

Sal-like protein 4 (SALL4) is a transcription factor encoded by SALL4, a member of the Spalt-like (SALL) gene family. The SALL genes were identified based on their sequence homology to Spalt, a homeotic gene originally cloned in Drosophila melanogaster that is important for terminal trunk structure formation in embryogenesis and for imaginal disc development in the larval stages. There are four human SALL proteins (SALL1, 2, 3, and 4), which share structural homology and play diverse roles in embryonic development, kidney function, and cancer. The SALL4 gene encodes at least three isoforms, termed A, B, and C, through alternative splicing, with the A and B forms being the most studied. SALL4 can alter gene expression through its interaction with many co-factors and epigenetic complexes. It is also known as a key embryonic stem cell (ESC) factor.

Structure, interaction partners, and DNA binding activity
SALL4 contains one zinc finger in its amino (N-) terminus and three clusters of zinc fingers, each of which coordinates zinc with two cysteines and two histidines (Cys2His2-type) and potentially confers nucleic acid binding activity. SALL4B lacks two of the zinc finger clusters found in the A isoform. It remains unclear which zinc finger cluster is responsible for SALL4’s DNA binding property.

Different SALL family members can form hetero- or homodimers via their conserved glutamine (Q)-rich region. SALL4 has at least one canonical nuclear localization signal (NLS) with the K-K/R-X-K/R motif in the N-terminal portion of the protein, shared by both the A and B isoforms (residues 64–67). One report has suggested that SALL4 with a mutated NLS sequence cannot localize to the nucleus. Through a 12-amino acid sequence in its N-terminus (N-12a.a.), SALL4 binds to retinoblastoma binding protein 4 (RBBP4), a subunit of the nucleosome remodeling and histone deacetylation (NuRD) complex, which also contains chromodomain-helicase-DNA binding proteins (CHD3/4 or Mi-2α/β), metastasis-associated proteins (MTA), methyl-CpG-binding domain proteins (MBD2 or MBD3), and histone deacetylases (HDAC1 and HDAC2). This association allows SALL4 to act as a transcriptional repressor. Accordingly, SALL4 has been shown to localize to heterochromatin regions in cells, for which its last zinc finger cluster (shared between SALL4A and B) is necessary. Besides the NuRD complex, SALL4 is reportedly able to bind other epigenetic modifiers such as histone lysine-specific demethylase 1 (LSD1), which is frequently associated with the NuRD complex and, consequently, with gene repression. In addition, SALL4 can activate gene expression by recruiting the mixed lineage leukemia (MLL) protein, which is a homolog of the Drosophila Trithorax and yeast Set1 proteins and has histone H3 lysine 4 (H3K4) trimethylation activity. This interaction is best characterized in the co-regulation of the HOXA9 gene by SALL4 and MLL in leukemic cells.

In mouse ESCs, Sall4 was found to bind the essential stem cell factor octamer-binding transcription factor 4 (Oct4) in two separate unbiased mass spectrometry (mass-spec) screens. Sall4 can also bind other important pluripotency proteins such as Nanog and sex determining region Y (SRY)-box 2 protein (Sox2). Together these proteins can affect each other’s expression patterns as well as their own, thus forming a mouse ESC-specific transcriptional regulatory circuit. SALL4 has also been reported to bind T-box 5 protein (Tbx5) in cardiac tissues and to interact genetically with Tbx5 in mouse limb development. Other binding partners of SALL4 include promyelocytic leukemia zinc finger protein (PLZF) in sperm precursor cells, Rad50 during DNA damage repair, and β-catenin downstream of the Wnt signaling pathway. Since most of these interactions were identified by mass-spec or co-immunoprecipitation, whether they are direct is unknown. Some SALL4 targets have been identified through chromatin immunoprecipitation (ChIP) followed by next-generation sequencing or microarray. A key verified target gene encodes the enzyme phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase (PTEN). PTEN is a tumor suppressor that keeps cell growth in check by inducing programmed cell death, or apoptosis. SALL4 binds the PTEN promoter and recruits the NuRD complex to mediate its repression, thereby promoting cell proliferation.

Expression and role in stem cells and development
In mouse embryos, SALL4 expression is detectable as early as the two-cell stage. Its expression persists through the 8- and 16-cell stages to the blastocyst, where it is found in some cells of the trophectoderm and inner cell mass (ICM), from which mouse ESCs are derived. SALL4 is an important factor for maintaining the “stemness” of ESCs of both mouse and human origin, since loss of Sall4 leads to differentiation of these pluripotent cells down the trophectoderm lineage. This is possibly due to down-regulation of Pou5f1 (encoding Oct4) expression and up-regulation of caudal-type homeobox 2 (Cdx2) gene expression. Sall4 is part of the transcriptional regulatory network that includes other pluripotency factors such as Oct4, Nanog, and Sox2. Because of its important role in early development, genetically mutated mice without functioning SALL4 die early, at the peri-implantation stage, while heterozygous mice have neural, kidney, and heart defects as well as limb abnormalities.

Clinical significance
The various SALL4-null mouse models mimic human mutations in the SALL4 gene, which have been shown to cause developmental problems in patients with Okihiro/Duane-Radial-ray syndrome. These individuals frequently have a family history of hand malformation and eye movement disorders.

SALL4 expression is low to undetectable in most adult tissues, with the exception of germ cells and human blood progenitor cells. However, SALL4 is re-activated and mis-regulated in various cancers such as acute myeloid leukemia (AML), B-cell acute lymphocytic leukemia (B-ALL), germ cell tumors, gastric cancer, breast cancer, hepatocellular carcinoma (HCC), lung cancer, and glioma. In many of these cancers, SALL4 expression in tumor cells was compared with that in the normal tissue counterpart; for example, it is expressed in nearly half of primary human endometrial cancer samples, but not in normal or hyperplastic endometrial tissue samples. SALL4 expression is often correlated with worse survival and poor prognosis, such as in HCC, or with metastasis, such as in endometrial cancer, colorectal carcinoma, and esophageal squamous cell carcinoma. It is unclear how SALL4 expression is de-regulated in malignant cells, but DNA hypomethylation in its intron 1 region has been observed in B-ALL.

In breast cancer, signal transducer and activator of transcription 3 (STAT3) has been reported to directly activate SALL4 expression. Furthermore, canonical Wnt signaling has been proposed to activate SALL4 gene expression in both development and cancer. In leukemia, the mechanism of SALL4 function is better characterized; mice over-expressing human SALL4 develop myelodysplastic syndrome (MDS)-like symptoms and eventually AML. This is consistent with the correlation between high levels of SALL4 expression and high-risk MDS in patients. Further supporting its role in tumorigenesis, knocking down SALL4 expression with short hairpin RNA in leukemic cells, or treating these cells with a peptide that mimics the N-12a.a. of SALL4 to inhibit its interaction with the NuRD complex, results in cell death. These findings suggest that the primary cancer-maintaining property of SALL4 is mediated through its transcriptional repressing function. These observations have led to growing interest in SALL4 as both a diagnostic tool and a target in cancer therapy. For example, in solid tumors such as germ cell tumors, SALL4 protein expression has become a standard diagnostic biomarker.