Zinc finger protein 226

Zinc finger protein 226 is a protein that in humans is encoded by the ZNF226 gene.

Gene
Zinc finger protein 226 is also known as the Kruppel-associated box protein. In humans, the ZNF226 gene is found on the plus strand of chromosome 19q13, spanning 13,311 nucleotides from 44,165,070 to 44,178,381.
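The span quoted above can be checked directly from the two coordinates. The following is a minimal sketch of that arithmetic; the function name and the `inclusive` flag are illustrative assumptions, and the quoted figure corresponds to a simple difference of the coordinates (an inclusive count would be one nucleotide larger).

```python
# Coordinates of ZNF226 on chromosome 19, as quoted in the text.
GENE_START = 44_165_070
GENE_END = 44_178_381


def span_length(start: int, end: int, inclusive: bool = False) -> int:
    """Return the length of a genomic interval.

    With inclusive=False this is the plain difference end - start;
    with inclusive=True both end coordinates are counted.
    """
    length = end - start
    return length + 1 if inclusive else length


print(span_length(GENE_START, GENE_END))  # 13311
```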

Transcript
Currently, there are 20 known transcript variants encoding ZNF226. Each has six or seven identified exons. The longest identified transcript, ZNF226 transcript variant X4, spans 2,797 base pairs (bp).

Protein
ZNF226 is currently known to have three isoforms in humans: ZNF226 isoform X1, ZNF226 isoform X2, and ZNF226 isoform X3. Isoform X1 is the longest known variant, with 803 amino acids. This protein contains the Kruppel-associated box A (KRAB-A) domain, which functions as a transcriptional repressor. However, the exact function of the ZNF226 protein is currently unknown. Isoform X1 contains 18 C2H2 zinc finger structural motif (zf-C2H2) domains, which are known to bind either zinc ions (Zn2+) or nucleic acid (Figure 7–8). Within those regions, cysteine and histidine are the primary amino acids that bind Zn2+ or nucleic acid, although other amino acids have also been identified as binding residues (Figure 7–8). In addition to the KRAB-A domain and zf-C2H2 domains, there are zinc finger double domains, which also contain binding sites for ions or nucleic acids.
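The zf-C2H2 domains described above can be located in a protein sequence by pattern matching. The sketch below uses a commonly cited PROSITE-style consensus for classical C2H2 fingers, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, translated into a regular expression. This is an assumption for illustration, not the method used to annotate ZNF226, and the toy sequence is a hypothetical consensus-like finger rather than actual ZNF226 sequence.

```python
import re

# PROSITE-style consensus for a classical C2H2 zinc finger (assumed here):
# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2_fingers(protein: str) -> list[tuple[int, int]]:
    """Return 0-based (start, end) spans of non-overlapping C2H2 matches."""
    return [m.span() for m in C2H2_PATTERN.finditer(protein.upper())]


# Hypothetical toy protein: two consensus-like fingers joined by a
# TGEKP-style linker. It is NOT taken from ZNF226.
TOY_FINGER = "YKCPECGKSFSQKSNLQKHQRTH"
TOY_PROTEIN = TOY_FINGER + "TGEKP" + TOY_FINGER

for start, end in find_c2h2_fingers(TOY_PROTEIN):
    print(start, end, TOY_PROTEIN[start:end])
```

Each reported span runs from the first zinc-coordinating cysteine to the second histidine. Applied to the real isoform X1 sequence, such a scan would be expected to find a number of matches close to the 18 annotated domains, though domain databases use profile models that can differ from a simple regular expression.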

Human ZNF226 and its ortholog protein sequences have molecular weights between 89 and 92 kDa and isoelectric points (pI) ranging from 8.60 to 9.00. In humans, the zf-C2H2 and zinc finger double domain region of ZNF226 isoform X1 is 59.3 kDa with a theoretical pI of 9.11. In this region, cysteine (C) residues recur with a spacing of three amino acids, and at least one of the amino acids between the cysteines is either aspartic acid or glutamic acid. Despite this pattern, the region is still considered low in aspartic acid, at 1.9%. There are also repeated chemical patterns in the human protein that are characteristic of ZNF226. These repetitions are most common within the zf-C2H2 and zinc finger double domains of the protein, notably at the cysteine and histidine binding sites. Predicted secondary structures of ZNF226 show a variable number of alpha helices, beta-stranded bridges, and random coils throughout the protein. Predictions from different programs, such as GOR4 and the Chou and Fasman program, are broadly similar in their placement of coiled, stranded, and helical regions.
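Composition figures such as the 1.9% aspartic acid content quoted above are simple residue counts expressed as a percentage of the region length. A minimal sketch follows; the function name and the toy sequence are hypothetical and are not ZNF226 data.

```python
from collections import Counter


def percent_composition(protein: str) -> dict[str, float]:
    """Return the percentage of each one-letter residue in a sequence."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


# Hypothetical toy region, used only to show the calculation.
TOY_REGION = "CDECHKCEECGKAFHQ"

composition = percent_composition(TOY_REGION)
# Residues that do not occur are absent from the dict, so default to 0.
print(round(composition.get("D", 0.0), 1))
print(round(composition.get("C", 0.0), 1))
```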

Promoter
Using the Genomatix software, GXP_7536741 (1142 bp) was identified as the best promoter of ZNF226 (Figure 1). Within the last 500 bp of the promoter, binding sites for the signal transducer and activator of transcription (V$STAT.01), selenocysteine tRNA activating factor (V$THAP11.01), and cell cycle regulators: cell cycle homology region (V$CHR.01) were conserved among Homo sapiens, Macaca mulatta, Pan troglodytes, and Canis lupus familiaris. In addition, the SPI-1 proto-oncogene; hematopoietic TF PU.1 (V$SPI1.02) is known to bind a promoter region within the c-fes proto-oncogene, which encodes a tyrosine kinase. Its binding site is also found in two regions within the ZNF226 promoter sequence. The signal transducer and activator of transcription binding site was also conserved in two regions, and is known to have a higher binding specificity. The selenocysteine tRNA activating factor plays a role in embryonic stem cell regeneration. The cell cycle homology region binding site plays an important role in cell survival, and mutations in the transcription factor can lead to apoptosis.

Tissue distribution
ZNF226 is expressed in most tissues. Microarray data show higher expression of ZNF226 within the ovaries. This is further supported by data showing decreased ZNF226 expression in granulosa cells of individuals with polycystic ovary syndrome. Higher expression of ZNF226 was also observed in the thyroid compared to other tissues, and decreased ZNF226 expression is observed in individuals with papillary thyroid cancer.

In fetuses, some level of ZNF226 expression is present in all tissues throughout the gestational period of 10 to 20 weeks. However, expression is higher in the heart at 10 weeks of gestation and lower in the kidneys at 20 weeks of gestation.

ZNF226 expression has been observed within epithelial progenitor cells (EPCs) in the peripheral blood (PB) and umbilical cord blood (CB). Expression of the gene is lower in PB-EPCs than in CB-EPCs. PB-EPCs have more tumor suppressor (TP53) expression than CB-EPCs, whereas CB-EPCs have more expression of angiogenic genes, which are involved in the growth and splitting of vasculature.

Transcript
Using RNAfold, minimum free energy structures were created based on the extended 5’ and 3’ untranslated regions (UTRs) of the human sequence. Unconserved nucleotides, miRNA target sites, stem-loop formations, and binding sites for RNA binding proteins (RBPs) are shown on the diagrams (Figure 2–3).
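RNAfold reports a predicted structure in dot-bracket notation, where matching parentheses mark base pairs and dots mark unpaired bases. As a rough illustration of how stem-loops can be read from such output, the sketch below finds hairpin loops, defined here as an opening bracket followed only by dots and then a closing bracket. The structure string is a hypothetical example and not the predicted ZNF226 UTR structure.

```python
import re

# A hairpin loop is the run of unpaired bases enclosed by an innermost pair.
HAIRPIN = re.compile(r"\((\.+)\)")


def hairpin_loops(dot_bracket: str) -> list[tuple[int, int]]:
    """Return (loop_start, loop_length) for each hairpin loop, 0-based."""
    return [(m.start(1), len(m.group(1))) for m in HAIRPIN.finditer(dot_bracket)]


# Hypothetical dot-bracket string with two stem-loops.
TOY_STRUCTURE = "((((...))))..((....))"

for start, length in hairpin_loops(TOY_STRUCTURE):
    print(f"hairpin loop at position {start}, {length} unpaired bases")
```

Positions returned this way can then be compared with the coordinates of predicted miRNA or RBP binding sites to see which of them fall on a stem-loop.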

miRNA targeting
Within the 5’ UTR, both miR-4700-5p and miR-4667-5p were referenced in an experiment that identified miRNAs expressed consistently in ERBB2+ breast cancer. In addition, miR-8089 was referenced in a study of novel miRNAs found in sepsis patients. miR-4271 was shown to affect coronary heart disease by binding to the 3' UTR of the APOC3 gene. Literature on miR-7113-5p shows that this miRNA is a mirtron.

Within the 3’ UTR, miR-3143 is referenced in a study of miRNAs expressed consistently in ERBB2+ breast cancer. miR-152-5p plays a role in inhibiting DNA methylation of genes involved in metabolic and inflammatory pathways. miR-31-3p is overexpressed in esophageal squamous cell carcinoma (ESCC). One miRNA, miR-150-5p, has a target site more than 3000 bp downstream within the 3’ UTR that is conserved across multiple homologs. miR-150-5p plays a role in colorectal cancer (CRC), where a lower expression of the miRNA was associated with a suppression of CRC metastasis.

RNA binding proteins
Among the RBPs found, PABPC1 had five binding sites, two of which are highlighted on the 5’ UTR. This protein is known to bind the poly(A) tails of transcripts that have entered the cytoplasm, preventing their re-entry into the nucleus. The FUS protein also had a binding site, located on a predicted stem-loop. The FUS gene encodes a protein that facilitates transport into the cytoplasm. Within the 3’ UTR, the RBMY1A1, RBMX, and ACO1 proteins were among the top-scoring RBPs. RBMY1A1 is known to take part in splicing and is required for sperm development. RBMX is a homolog of the RBMY protein involved in sperm production. It is also known to promote transcription of a tumor suppressor gene, TXNIP. ACO1 is another RBP, known to bind mRNA to regulate iron levels. By binding to iron-responsive elements, it can repress translation of ferritin and inhibit degradation of transferrin receptor mRNA when iron levels become low.

Protein
Analyses were conducted to predict post-translational modifications of the protein. Based on the results of Expasy's Myristoylator for Homo sapiens, Mirounga leonina, and Fukomys damarensis, ZNF226 is predicted not to be myristoylated at the N-terminus. Numerous predicted sites for post-translational modifications were also identified among the three species. The phosphorylation region at the C-terminus of the protein was identified as a match for the protein kinase C phosphorylation site. S-nitrosylation was another identified modification, at C354 (Figure 2). This modification is found in SRG1, a zinc finger protein that plays a role in preventing nitric oxide (NO) synthesis. When NO is sustained, S-nitrosylation occurs within the protein, disrupting its transcriptional repression abilities. Acetylation was another modification identified. In promyelocytic leukemia, a condition resulting in an abundance of blood-forming cells in the bone marrow, promyelocytic leukemia zinc finger proteins are known to be activated by histone acetyltransferases, or by acetylation of a C-terminal lysine. Acetylation of other zinc finger proteins, such as GATA1, is known to enhance their ability to interact with other proteins. Arginine dimethylation is another identified modification within ZNF226. Arginine methylation of cellular nucleic acid binding protein (CNBP), a zinc finger protein, has been shown to impede its ability to bind nucleic acids.

It is predicted that ZNF226 localizes within the nucleus, which aligns with its known functions as a transcription factor. It has also been predicted to localize within the mitochondria.

Homology/evolution
Although there is little information available on the ZNF226 gene, homologs of the gene have been found across eukaryotes and bacterial species. Strict orthologs were only found within mammals (Figure 4). The ZNF226 gene is also closely related to the paralog ZNF234 in humans and to the Zfp111 gene in mice. Across the various species in which ZNF226 orthologs and homologs were identified, conservation of the C2H2 binding sites is apparent (Figure 4–5). In human ZNF226 paralogs, there is also conservation of the C2H2 binding sites, as well as of nucleic acid binding sites.

The ZNF226 protein evolves at a slow rate, in a manner similar to the cytochrome c protein rather than the fibrinogen alpha chain protein (Figure 6).

Interacting proteins
Two interactions, with SSBP3 and ATF4, were detected via the two-hybrid method. Both proteins are transcription factors.

ATF4/CREB-2 is a transcription factor that binds to the long terminal repeat of human T-cell leukemia virus type 1 (HTLV-1). It can act as an activator of HTLV-1.

SSBP3/CSDP has been linked to the development of mouse embryonic stem cells into trophoblasts, which provide nutrients to the embryo. ZNF226 is expressed at greater levels within human stem cells.

Function
ZNF226 is a transcription factor thought to play a role in transcriptional repression. Its 18 zf-C2H2 binding domains are predicted to bind to the DNA sequence shown in the sequence logo (Figure 7–9).

Associated diseases and conditions
A mutation within the ZNF226 gene has been positively correlated with the presence of hepatocellular carcinoma (HCC). A particular SNP (rs2927438) was also correlated with increased expression of ZNF226 in brain frontal cortical tissue and in peripheral mononuclear cells, such as T cells and B cells. The promoter region of ZNF226 was found to be hypomethylated in people who were exposed to the Chinese famine. Methylation of this hypomethylated region was shown to correlate between the blood and the prefrontal cortex, although the exact role of the protein in relation to the famine is not understood. The ZNF226 gene was also listed among many other genes with a copy number variation (CNV) associated with common variable immunodeficiency (CVID).

SNPs
Numerous SNPs were identified throughout the ZNF226 gene. Within the GXP_7536741 promoter, two SNPs of interest were found. Listed below are the associated transcription factors for both SNPs.