User:Prewmi/sandbox



In molecular genetics, the three prime untranslated region (3’-UTR) is the section of messenger RNA (mRNA) that immediately follows the translation termination codon. An mRNA molecule is transcribed from the DNA sequence and is later translated into protein. Several regions of the mRNA molecule are not translated into protein, including the 5’ cap, the 5’ untranslated region, the 3’ untranslated region, and the poly(A) tail. The 3’-UTR often contains regulatory regions that influence gene expression post-transcriptionally.

As a major regulatory feature of an mRNA, the 3’-untranslated region affects the polyadenylation, translation efficiency, localization, and stability of the transcript. The 3’-UTR contains binding sites for both regulatory proteins and microRNAs (miRNAs). By binding to specific sites within the 3’-UTR, miRNAs can decrease the expression of various mRNAs, either by inhibiting translation or by directly causing degradation of the transcript. Many 3’-UTRs also contain AU-rich elements (AREs), to which proteins bind in order to affect the stability or decay rate of transcripts in a localized manner or to affect translation initiation. Furthermore, the 3’-UTR contains the signal AAUAAA, which directs the addition of several hundred adenine residues, called the poly(A) tail, to the end of the mRNA transcript. Poly(A) binding proteins (PABPs) bind to this tail and contribute to the regulation of mRNA translation, stability, and export. For example, poly(A)-bound PABP interacts with proteins associated with the 5’ end of the transcript, causing a circularization of the mRNA that promotes translation. The 3’-UTR can also contain sequences that attract proteins that associate the mRNA with the cytoskeleton, transport it to or from the nucleus, or perform other types of localization. In addition to sequences within the 3’-UTR, the physical characteristics of the region, including its length and secondary structure, contribute to translation regulation. These diverse mechanisms of gene regulation ensure that the correct genes are expressed in the correct cells at the correct times.

Physical Characteristics
In the mammalian genome there is considerable variation in the length of 3’-UTRs. This region of the mRNA transcript can range from as few as 60 to 80 nucleotides to about 4,000. The average length of the 3’-UTR in humans is approximately 800 nucleotides, whereas the average length of the 5’-UTR is only about 200 nucleotides. In general, the longer the 3’-UTR, the more likely it is that lower levels of expression will be observed, because longer 3’-UTRs can contain more miRNA binding sites capable of inhibiting translation. The nucleotide composition of the 5’- and 3’-UTRs also differs significantly. The mean G+C percentage in warm-blooded vertebrates is about 60% for 5’-UTRs and only 45% for 3’-UTRs. A significant inverse correlation has been observed between the G+C percentage of 5’- and 3’-UTRs and their corresponding lengths. Accordingly, UTRs that are GC-poor tend to be longer than those located in GC-rich genome regions.
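
The length and G+C comparison described above can be reproduced for any pair of sequences with a few lines of Python. This is a minimal sketch: the function name `gc_percent` and the two toy sequences are illustrative assumptions, not real transcripts, and the figures of about 60% and 45% quoted above are averages that individual sequences will not necessarily match.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in an RNA or DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


# Hypothetical toy sequences, chosen only to illustrate the calculation.
five_prime_utr = "GCGGCCGCAGGCUCCGGAGC"
three_prime_utr = "AUUUAUUUAGCAUAAUGCUUAAUAAAGUUCA"

for name, seq in (("5' UTR", five_prime_utr), ("3' UTR", three_prime_utr)):
    print(f"{name}: length={len(seq)} nt, G+C={gc_percent(seq):.1f}%")
```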

Sequences within the 3’-UTR can also promote degradation or stabilization of the mRNA transcript. Modifying transcript stability allows expression to be controlled rapidly without altering translation rates. One group of stability elements located in the 3’-UTR is the AU-rich elements (AREs). These elements range in size from 50 to 150 base pairs and generally contain multiple copies of the pentanucleotide AUUUA. Early studies indicated that AREs are variable in sequence and defined three main classes that differ in the number and arrangement of motifs. Depending on the protein that binds to the region, the mRNA can be either stabilized or degraded. Another set of elements, present in both the 5’- and 3’-UTRs, is the iron response elements (IREs). The IRE is a stem-loop structure within the untranslated regions of mRNAs that encode proteins involved in cellular iron metabolism. Whether the mRNA is degraded or stabilized depends on the binding of specific proteins and on intracellular iron concentrations.
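
As a rough illustration of how the AUUUA pentanucleotide can be located in a sequence, the sketch below lists every occurrence, including overlapping ones such as the two copies in AUUUAUUUA. The function name `find_motif` is a hypothetical helper, and counting pentamers alone is an assumption made for simplicity: it does not classify an ARE or predict whether a transcript is stabilized or degraded.

```python
import re


def find_motif(seq: str, motif: str = "AUUUA") -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif copy."""
    # Accept DNA-style input by converting T to U.
    seq = seq.upper().replace("T", "U")
    # A zero-width lookahead lets matches overlap, e.g. AUUUAUUUA holds two copies.
    pattern = f"(?={re.escape(motif)})"
    return [match.start() for match in re.finditer(pattern, seq)]


# Hypothetical 3'-UTR fragment used only as an example.
fragment = "GCAUUUAUUUAUUUAGC"
positions = find_motif(fragment)
print(f"{len(positions)} AUUUA copies at positions {positions}")
```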

The 3’-UTR also contains sequences that signal additions to be made either to the transcript itself or to the product of translation. Two different polyadenylation signals within the 3’-UTR direct the addition of the poly(A) tail, which is synthesized at a defined length of about 250 nucleotides. The primary signal is the nuclear polyadenylation signal (PAS), with the sequence AAUAAA, located toward the end of the 3’-UTR. In addition, cytoplasmic polyadenylation can occur, which regulates the translational activation of maternal mRNAs in early development. The element that controls this process is the cytoplasmic polyadenylation element (CPE), which is AU-rich and is also located in the 3’-UTR. The CPE generally has the sequence UUUUUUAU and usually lies within 100 nucleotides of the nuclear PAS. Another specific addition signaled by the 3’-UTR is the incorporation of selenocysteine at UGA codons of mRNAs encoding selenoproteins. Normally the UGA codon signals a stop in translation, but a conserved stem-loop structure called the selenocysteine insertion sequence (SECIS) causes selenocysteine to be inserted instead.
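
The spatial relationship between the two polyadenylation signals can be checked with a simple scan. The following is a sketch under stated assumptions: it searches only for the exact sequences AAUAAA and UUUUUUAU, measures distance between motif start positions, and uses hypothetical helper names (`find_all`, `cpe_near_pas`) and a made-up test sequence. Real signals tolerate sequence variants that this code ignores.

```python
def find_all(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every occurrence of motif in seq."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def cpe_near_pas(utr: str, max_distance: int = 100) -> list[tuple[int, int]]:
    """Return (cpe_start, pas_start) pairs lying within max_distance of each other."""
    utr = utr.upper().replace("T", "U")
    pas_sites = find_all(utr, "AAUAAA")    # nuclear polyadenylation signal
    cpe_sites = find_all(utr, "UUUUUUAU")  # cytoplasmic polyadenylation element
    return [
        (cpe, pas)
        for cpe in cpe_sites
        for pas in pas_sites
        if abs(pas - cpe) <= max_distance
    ]


# Hypothetical sequence: a CPE followed 28 nt later by a PAS.
example_utr = "GC" * 5 + "UUUUUUAU" + "GC" * 10 + "AAUAAA" + "GCGC"
print(cpe_near_pas(example_utr))
```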

Role in Gene Expression
By influencing the localization, stability, export, and translation efficiency of an mRNA, the 3’-untranslated region plays a crucial role in gene expression. The 3’-UTR often contains microRNA response elements (MREs), which are sequences to which miRNAs bind. MicroRNAs are short, non-coding RNA molecules capable of binding to mRNA transcripts in order to regulate their expression. One miRNA mechanism involves partial base pairing of the 5’ seed sequence of a miRNA to an MRE within the 3’-UTR of an mRNA; this binding then causes translational repression. Another mechanism involves perfect base pairing of a miRNA to an MRE, which subsequently triggers degradation of the mRNA transcript. MREs are prevalent, making up about half of the motifs found within the 3’-UTR. The interaction between miRNAs and MREs allows for differential gene expression in various tissues and developmental stages. In addition to MREs, the 3’-UTR often contains AU-rich elements (AREs), which are 50 to 150 bp in length and usually include many copies of the sequence AUUUA. ARE binding proteins (ARE-BPs) bind to AU-rich elements in a manner that depends on tissue type, cell type, timing, cellular localization, and environment. In response to different intracellular and extracellular signals, ARE-BPs can promote mRNA decay, affect mRNA stability, or activate translation. This mechanism of gene regulation is involved in cell growth, cell differentiation, and adaptation to external stimuli. It therefore acts on transcripts encoding cytokines, growth factors, tumor suppressors, proto-oncogenes, cyclins, enzymes, transcription factors, receptors, and membrane proteins.
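
Seed pairing between a miRNA and an MRE can be sketched as a reverse-complement search. The example below makes several assumptions that the text above does not specify: the seed is taken as nucleotides 2 to 8 of the miRNA, only Watson-Crick pairs are allowed (no G:U wobble), and the function names and the short target sequence are hypothetical. The miRNA shown is the commonly cited let-7a sequence, used purely as an example. Real target prediction also weighs site context and conservation, which this sketch ignores.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_match_sites(mirna: str, utr: str, seed_start: int = 2, seed_end: int = 8) -> list[int]:
    """Return 0-based UTR positions that pair perfectly with the miRNA seed.

    seed_start and seed_end are 1-based, inclusive miRNA positions (an assumption).
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[seed_start - 1:seed_end]
    target = reverse_complement(seed)  # sequence the seed would pair with on the mRNA
    positions = []
    start = utr.find(target)
    while start != -1:
        positions.append(start)
        start = utr.find(target, start + 1)
    return positions


let7a = "UGAGGUAGUAGGUUGUAUAGUU"   # example miRNA
toy_utr = "AAACUACCUCAAAGGGCUACCUCUU"  # hypothetical 3'-UTR fragment
print(seed_match_sites(let7a, toy_utr))
```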

As another feature of the 3’-UTR that interacts with RNA binding proteins, the poly(A) tail contains binding sites for poly(A) binding proteins (PABPs). These proteins cooperate with other factors to affect the export, stability, decay, and translation of an mRNA. PABPs bound to the poly(A) tail interact with proteins, such as eIF4F, that are bound to the 5’ cap of the mRNA. This interaction causes circularization of the transcript, which subsequently promotes translation initiation. Furthermore, it allows for efficient translation by causing recycling of ribosomes. While the presence of a poly(A) tail usually aids in triggering translation, the absence or removal of one often leads to exonuclease-mediated degradation of the mRNA. Polyadenylation itself is regulated by sequences within the 3’-UTR of the transcript. These sequences include cytoplasmic polyadenylation elements (CPEs), which are uridine-rich sequences that contribute to both polyadenylation activation and repression. CPE-binding protein (CPEB) binds to CPEs in conjunction with a variety of other proteins in order to elicit different responses.

While the sequence of the 3’-UTR contributes greatly to gene expression, its structural characteristics also play a large role. In general, longer 3’-UTRs correspond to lower expression rates, since they often contain more miRNA and protein binding sites that are involved in inhibiting translation. Human transcripts possess 3’-UTRs that are on average twice as long as other mammalian 3’-UTRs. This trend reflects the high level of regulation involved in the human genome. The effect of length on gene expression is important during oogenesis, the creation of an egg cell. Various mRNAs are stored with shorter 3’-UTRs in a translationally inactive state and are then activated through a process that lengthens their 3’-UTRs. In addition to length, the secondary structure of the 3’-untranslated region has regulatory functions. Protein factors can either aid or disrupt folding of the region into various secondary structures. The most common structure is a stem-loop, which provides a scaffold for RNA binding proteins and non-coding RNAs that influence expression of the transcript.

Another mechanism involving the structure of the 3’-UTR is called alternative polyadenylation (APA), which results in mRNA isoforms that differ only in their 3’-UTRs. This mechanism is especially useful for complex organisms as it provides a means of expressing the same protein but in varying amounts and locations. It is utilized by about half of human genes. APA can result from the presence of multiple polyadenylation sites or mutually exclusive terminal exons. Since it can affect the presence of protein and miRNA binding sites, APA can cause differential expression of mRNA transcripts by influencing their stability, export to the cytoplasm, and translation efficiency.
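
The case of multiple polyadenylation sites can be illustrated by truncating one 3’-UTR after each AAUAAA signal it contains. This is a deliberately simplified sketch: the function name `apa_isoforms`, the fixed `cleavage_offset` of 20 nucleotides downstream of the signal, and the toy sequence are all assumptions made for illustration, and the code does not model mutually exclusive terminal exons or how a cell chooses among sites.

```python
def apa_isoforms(utr: str, pas: str = "AAUAAA", cleavage_offset: int = 20) -> list[str]:
    """Return one truncated 3'-UTR per polyadenylation signal found in utr.

    Each isoform ends cleavage_offset nucleotides after its signal
    (an assumed, fixed distance), or at the end of the sequence if that is sooner.
    """
    utr = utr.upper().replace("T", "U")
    isoforms = []
    start = utr.find(pas)
    while start != -1:
        end = min(len(utr), start + len(pas) + cleavage_offset)
        isoforms.append(utr[:end])
        start = utr.find(pas, start + 1)
    return isoforms


# Hypothetical 3'-UTR with a proximal and a distal polyadenylation signal.
toy_utr_apa = "GC" * 5 + "AAUAAA" + "GC" * 20 + "AAUAAA" + "GC" * 15
for isoform in apa_isoforms(toy_utr_apa):
    print(len(isoform), "nt")
```

In this toy case the shorter isoform lacks everything downstream of the proximal site, so any miRNA or protein binding sites in that stretch would be absent from it.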

Methods of Study
Studying 3’-UTRs is laborious: even if a given 3’-UTR in an mRNA is shown to be present in a tissue, the effects of localization, functional half-life, translational efficiency, and trans-acting elements must all be determined to understand its full functionality. Computational approaches, primarily sequence analysis, have shown the existence of AREs in approximately 5 to 8% of human 3’-UTRs and of one or more miRNA targets in 60% or more of human 3’-UTRs. Experimental approaches have been used to define sequences that associate with specific RNA-binding proteins. Recent improvements in sequencing and cross-linking techniques have enabled fine mapping of protein binding sites within the transcript. Induced site-specific mutations, for example those that affect the termination codon, polyadenylation signal, or secondary structure of the 3’-UTR, can show how mutated regions cause translation deregulation and disease. These types of transcript-wide methods should improve our understanding of known cis elements and trans-regulatory factors within 3’-UTRs.

Disease
Mutations in the 3’-UTR can be very consequential, because one alteration can be responsible for the altered expression of many genes. Transcriptionally, a mutation may affect only the allele and genes that are physically linked. However, since 3’-UTR binding proteins also function in the processing and nuclear export of mRNA, a mutation can also affect other, unrelated genes. Dysregulation of ARE-binding proteins (AUBPs) due to mutations in AU-rich regions can lead to diseases including tumorigenesis (cancer), hematopoietic malignancies, and leukemogenesis. An expanded number of trinucleotide (CTG) repeats in the 3’-UTR of the dystrophia myotonica protein kinase (DMPK) gene causes myotonic dystrophy. A retrotransposal 3-kilobase insertion of tandemly repeated sequences within the 3’-UTR of the gene encoding fukutin is linked to Fukuyama-type congenital muscular dystrophy. Elements in the 3’-UTR have also been linked to human acute myelogenous leukemia, α-thalassemia, neuroblastoma, keratinopathy, aniridia, IPEX, and congenital heart disease. The few UTR-mediated diseases identified so far suggest that many more links remain to be discovered.

Future Development
Despite our current understanding of 3’-UTRs, they remain relatively poorly understood. Since mRNAs usually contain several overlapping control elements, it is often difficult to specify the identity and function of each 3’-UTR element. Additionally, each 3’-UTR contains many alternative AU-rich elements and polyadenylation signals. These cis- and trans-acting elements, along with miRNAs, offer a very wide range of control possibilities within a single mRNA. Future research making greater use of deep-sequencing-based ribosome profiling is expected to reveal more regulatory subtleties as well as new control elements and binding proteins. Furthermore, the ultimate fate of a transcript depends on the transduction pathway in which it is involved, so future research in this area appears promising.