Transfer RNA



Transfer RNA (abbreviated tRNA and formerly referred to as sRNA, for soluble RNA) is an adaptor molecule composed of RNA, typically 76 to 90 nucleotides in length (in eukaryotes), that serves as the physical link between the mRNA and the amino acid sequence of proteins. tRNA does this by carrying an amino acid to the ribosome, the protein-synthesizing machinery of the cell. Complementation of a 3-nucleotide codon in a messenger RNA (mRNA) by a 3-nucleotide anticodon of the tRNA results in protein synthesis based on the mRNA code. As such, tRNAs are a necessary component of translation, the biological synthesis of new proteins in accordance with the genetic code.

Overview
While the specific nucleotide sequence of an mRNA specifies which amino acids are incorporated into the protein product of the gene from which the mRNA is transcribed, the role of tRNA is to specify which sequence from the genetic code corresponds to which amino acid. The mRNA encodes a protein as a series of contiguous codons, each of which is recognized by a particular tRNA. One end of the tRNA matches the genetic code in a three-nucleotide sequence called the anticodon. The anticodon forms three complementary base pairs with a codon in mRNA during protein biosynthesis.
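
The codon-anticodon relationship can be illustrated with a minimal Python sketch. It assumes strict Watson-Crick pairing and the usual convention that both sequences are written 5′ to 3′, so that the two strands pair antiparallel; the function name `anticodon_for` is hypothetical and chosen only for this example.

```python
# Watson-Crick partners for the four standard RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the perfectly complementary anticodon (5'->3') for an mRNA codon (5'->3').

    Because the strands pair antiparallel, the anticodon is the reverse
    complement of the codon. Wobble pairing is deliberately ignored here.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


print(anticodon_for("AAC"))  # asparagine codon AAC pairs with anticodon GUU
```

Real tRNAs frequently carry modified bases in the anticodon, so this strict complement is only a first approximation.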

On the other end of the tRNA is a covalent attachment to the amino acid that corresponds to the anticodon sequence. Each type of tRNA molecule can be attached to only one type of amino acid, so each organism has many types of tRNA. Because the genetic code contains multiple codons that specify the same amino acid, there are several tRNA molecules bearing different anticodons which carry the same amino acid.

The covalent attachment to the tRNA 3' end is catalysed by enzymes called aminoacyl tRNA synthetases. During protein synthesis, tRNAs with attached amino acids are delivered to the ribosome by proteins called elongation factors, which aid in association of the tRNA with the ribosome, synthesis of the new polypeptide, and translocation (movement) of the ribosome along the mRNA. If the tRNA's anticodon matches the mRNA, another tRNA already bound to the ribosome transfers the growing polypeptide chain from its 3' end to the amino acid attached to the 3' end of the newly delivered tRNA, a reaction catalysed by the ribosome. A large number of the individual nucleotides in a tRNA molecule may be chemically modified, often by methylation or deamidation. These unusual bases sometimes affect the tRNA's interaction with ribosomes and sometimes occur in the anticodon to alter base-pairing properties.

Structure
The structure of tRNA can be decomposed into its primary structure, its secondary structure (usually visualized as the cloverleaf structure), and its tertiary structure (all tRNAs have a similar L-shaped 3D structure that allows them to fit into the P and A sites of the ribosome). The cloverleaf structure becomes the 3D L-shaped structure through coaxial stacking of the helices, which is a common RNA tertiary structure motif. The lengths of each arm, as well as the loop 'diameter', in a tRNA molecule vary from species to species. The tRNA structure consists of the following:
 * The acceptor stem is a 7- to 9-base pair (bp) stem made by the base pairing of the 5′-terminal nucleotide with the 3′-terminal nucleotide (which contains the CCA tail used to attach the amino acid). The acceptor stem may contain non-Watson-Crick base pairs.
 * The CCA tail is a cytosine-cytosine-adenine sequence at the 3′ end of the tRNA molecule. The amino acid loaded onto the tRNA by aminoacyl tRNA synthetases, to form aminoacyl-tRNA, is covalently bonded to the 3′-hydroxyl group on the CCA tail. This sequence is important for the recognition of tRNA by enzymes and critical in translation. In prokaryotes, the CCA sequence is transcribed in some tRNA sequences. In most prokaryotic tRNAs and eukaryotic tRNAs, the CCA sequence is added during processing and therefore does not appear in the tRNA gene.
 * The D loop is a 4- to 6-bp stem ending in a loop that often contains dihydrouridine.
 * The anticodon loop is a 5-bp stem whose loop contains the anticodon.
 * The TΨC loop is named so because of the characteristic presence of the unusual base Ψ in the loop, where Ψ is pseudouridine, a modified uridine. The modified base is often found within the sequence 5'-TΨCGA-3', with the T (ribothymidine, m5U) and A forming a base pair.
 * The variable loop or V loop sits between the anticodon loop and the TΨC loop and, as its name implies, varies in size from 3 to 21 bases. In some tRNAs, the "loop" is long enough to form a rigid stem, the variable arm. tRNAs with a V loop more than 10 bases long are classified as "class II" and the rest are called "class I".

Anticodon
An anticodon is a unit of three nucleotides corresponding to the three bases of an mRNA codon. Each tRNA has a distinct anticodon triplet sequence that can form 3 complementary base pairs to one or more codons for an amino acid. Some anticodons pair with more than one codon due to wobble base pairing. Frequently, the first nucleotide of the anticodon is one not found on mRNA: inosine, which can hydrogen bond to more than one base in the corresponding codon position. In the genetic code, it is common for a single amino acid to be specified by all four third-position possibilities, or at least by both pyrimidines and purines; for example, the amino acid glycine is coded for by the codon sequences GGU, GGC, GGA, and GGG. Other modified nucleotides may also appear at the first anticodon position (sometimes known as the "wobble position"), resulting in subtle changes to the genetic code, as for example in mitochondria. The possibility of wobble bases reduces the number of tRNA types required: instead of 61 types (one for each sense codon of the standard genetic code), only 31 tRNAs are required to translate, unambiguously, all 61 sense codons.
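
As a rough illustration of wobble pairing, the sketch below lists the codons a given anticodon could decode. It assumes Crick's original wobble rules for the first anticodon position (G pairs with C or U, U pairs with A or G, inosine pairs with A, C or U) and strict Watson-Crick pairing at the other two positions. The names `WOBBLE` and `codons_decoded` are hypothetical, and other modified bases and organism-specific exceptions are not modelled.

```python
# Strict Watson-Crick partners, used for anticodon positions 2 and 3.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon third-position bases accepted by each base at anticodon position 1
# (the wobble position), following Crick's simplified wobble rules.
# "I" stands for inosine.
WOBBLE = {"A": "U", "C": "G", "G": "CU", "U": "AG", "I": "ACU"}


def codons_decoded(anticodon: str) -> list[str]:
    """Return the codons (5'->3') that an anticodon (5'->3') can read."""
    first, second, third = anticodon.upper()
    # Codon positions 1 and 2 pair strictly with anticodon positions 3 and 2.
    prefix = COMPLEMENT[third] + COMPLEMENT[second]
    return sorted(prefix + base for base in WOBBLE[first])


print(codons_decoded("GUU"))  # ['AAC', 'AAU'], both asparagine codons
print(codons_decoded("IGC"))  # ['GCA', 'GCC', 'GCU'], three alanine codons
```

The second call shows why an inosine at the wobble position lets a single tRNA cover three of the four codons in a codon box.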

Nomenclature
A tRNA is commonly named by its intended amino acid (e.g. tRNA-Asn), by its anticodon sequence (e.g. tRNA(GUU)), or by both (e.g. tRNA-Asn(GUU)). These two features describe the main function of the tRNA, but do not actually cover the whole diversity of tRNA variation; as a result, numerical suffixes are added to differentiate. tRNAs intended for the same amino acid are called "isotypes"; those with the same anticodon sequence are called "isoacceptors"; and those with both being the same but differing in other places are called "isodecoders".

Aminoacylation
Aminoacylation is the process of adding an aminoacyl group to a compound. It covalently links an amino acid to the CCA 3′ end of a tRNA molecule. Each tRNA is aminoacylated (or charged) with a specific amino acid by an aminoacyl tRNA synthetase. There is normally a single aminoacyl tRNA synthetase for each amino acid, despite the fact that there can be more than one tRNA, and more than one anticodon, for an amino acid. Recognition of the appropriate tRNA by the synthetases is not mediated solely by the anticodon, and the acceptor stem often plays a prominent role. Reaction:
 * 1) amino acid + ATP → aminoacyl-AMP + PPi
 * 2) aminoacyl-AMP + tRNA → aminoacyl-tRNA + AMP

Certain organisms can have one or more aminoacyl tRNA synthetases missing. This leads to charging of the tRNA by a chemically related amino acid, and by use of an enzyme or enzymes, the tRNA is modified to be correctly charged. For example, Helicobacter pylori has glutaminyl tRNA synthetase missing. Thus, glutamate tRNA synthetase charges tRNA-glutamine (tRNA-Gln) with glutamate. An amidotransferase then converts the acid side chain of the glutamate to the amide, forming the correctly charged Gln-tRNA-Gln.
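
The net stoichiometry of the two-step charging reaction in this section can be tallied with a small bookkeeping sketch. This is only an illustration under simplifying assumptions: molecules are counted as whole units in a `Counter`, the aminoacyl-AMP intermediate is consumed as soon as it forms, and the function name `charge_trna` and the pool labels are hypothetical.

```python
from collections import Counter


def charge_trna(pool: Counter, amino_acid: str, trna: str) -> Counter:
    """Apply both aminoacylation steps once and return the updated pool."""
    if any(pool[species] < 1 for species in (amino_acid, "ATP", trna)):
        raise ValueError("pool lacks the amino acid, ATP, or the tRNA")
    result = Counter(pool)
    # Step 1: amino acid + ATP -> aminoacyl-AMP + PPi
    result[amino_acid] -= 1
    result["ATP"] -= 1
    result["PPi"] += 1
    # Step 2: aminoacyl-AMP + tRNA -> aminoacyl-tRNA + AMP
    result[trna] -= 1
    result["AMP"] += 1
    result[f"{amino_acid}-{trna}"] += 1
    return +result  # unary plus drops species whose count fell to zero


pool = Counter({"Asn": 1, "ATP": 2, "tRNA-Asn": 1})
print(charge_trna(pool, "Asn", "tRNA-Asn"))
```

Each charging event therefore consumes one ATP and releases one AMP and one pyrophosphate (PPi).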

Binding to ribosome


The ribosome has three binding sites for tRNA molecules that span the space between the two ribosomal subunits: the A (aminoacyl), P (peptidyl), and E (exit) sites. In addition, the ribosome has two other sites for tRNA binding that are used during mRNA decoding or during the initiation of protein synthesis. These are the T site (named for elongation factor Tu) and the I site (initiation). By convention, the tRNA binding sites are denoted with the site on the small ribosomal subunit listed first and the site on the large ribosomal subunit listed second. For example, the A site is often written A/A, the P site, P/P, and the E site, E/E. The binding proteins at the A and P sites, such as L27, L2, L14, L15, and L16, were determined by affinity labeling by A. P. Czernilofsky et al. (Proc. Natl. Acad. Sci. USA, pp. 230–234, 1974).

Once translation initiation is complete, the first aminoacyl tRNA is located in the P/P site, ready for the elongation cycle described below. During translation elongation, tRNA first binds to the ribosome as part of a complex with elongation factor Tu (EF-Tu) or its eukaryotic (eEF-1) or archaeal counterpart. This initial tRNA binding site is called the A/T site. In the A/T site, the A-site half resides in the small ribosomal subunit where the mRNA decoding site is located. The mRNA decoding site is where the mRNA codon is read out during translation. The T-site half resides mainly on the large ribosomal subunit where EF-Tu or eEF-1 interacts with the ribosome. Once mRNA decoding is complete, the aminoacyl-tRNA is bound in the A/A site and is ready for the next peptide bond to be formed to its attached amino acid. The peptidyl-tRNA, which transfers the growing polypeptide to the aminoacyl-tRNA bound in the A/A site, is bound in the P/P site. Once the peptide bond is formed, the tRNA in the P/P site is deacylated (it has a free 3' end), and the tRNA in the A/A site carries the growing polypeptide chain. To allow for the next elongation cycle, the tRNAs then move through hybrid A/P and P/E binding sites, before completing the cycle and residing in the P/P and E/E sites. Once the A/A and P/P tRNAs have moved to the P/P and E/E sites, the mRNA has also moved over by one codon and the A/T site is vacant, ready for the next round of mRNA decoding. The tRNA bound in the E/E site then leaves the ribosome.
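
The path of a single elongator tRNA through these sites can be summarised as a simple lookup table. The sketch below is only a restatement of the sequence described in this section, using the small-subunit/large-subunit notation; the names `NEXT_SITE` and `elongation_path` are hypothetical, and no kinetics or factor binding are modelled.

```python
# Each binding state maps to the state that follows it; None means the tRNA
# has left the ribosome. Notation: small-subunit site / large-subunit site.
NEXT_SITE = {
    "A/T": "A/A",  # decoding complete, EF-Tu released
    "A/A": "A/P",  # peptide bond formed, hybrid state
    "A/P": "P/P",  # translocation completed
    "P/P": "P/E",  # chain passed on to the next tRNA, hybrid state
    "P/E": "E/E",  # translocation completed
    "E/E": None,   # tRNA dissociates
}


def elongation_path(start: str = "A/T") -> list[str]:
    """Return the ordered list of sites visited, beginning at `start`."""
    path = []
    site = start
    while site is not None:
        path.append(site)
        site = NEXT_SITE[site]
    return path


print(" -> ".join(elongation_path()))
```

An initiator tRNA, which begins in the P site, would follow the same table from `"P/P"` onward.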

The P/I site is actually the first to bind to aminoacyl tRNA, which is delivered by an initiation factor called IF2 in bacteria. However, the existence of the P/I site in eukaryotic or archaeal ribosomes has not yet been confirmed. The P-site protein L27 has been determined by affinity labeling by E. Collatz and A. P. Czernilofsky (FEBS Lett., Vol. 63, pp. 283–286, 1976).

tRNA genes
Organisms vary in the number of tRNA genes in their genome. For example, the nematode worm C. elegans, a commonly used model organism in genetics studies, has 29,647 genes in its nuclear genome, of which 620 code for tRNA. The budding yeast Saccharomyces cerevisiae has 275 tRNA genes in its genome. The number of tRNA genes per genome can vary widely, with bacterial species from groups such as Fusobacteria and Tenericutes having around 30 genes per genome while complex eukaryotic genomes such as the zebrafish (Danio rerio) can bear more than 10 thousand tRNA genes.

In the human genome, which, according to January 2013 estimates, has about 20,848 protein coding genes in total, there are 497 nuclear genes encoding cytoplasmic tRNA molecules, and 324 tRNA-derived pseudogenes—tRNA genes thought to be no longer functional (although pseudo tRNAs have been shown to be involved in antibiotic resistance in bacteria). As with all eukaryotes, there are 22 mitochondrial tRNA genes in humans. Mutations in some of these genes have been associated with severe diseases like the MELAS syndrome. Regions in nuclear chromosomes, very similar in sequence to mitochondrial tRNA genes, have also been identified (tRNA-lookalikes). These tRNA-lookalikes are also considered part of the nuclear mitochondrial DNA (genes transferred from the mitochondria to the nucleus). The phenomenon of multiple nuclear copies of mitochondrial tRNA (tRNA-lookalikes) has been observed in many higher organisms from human to the opossum suggesting the possibility that the lookalikes are functional.

Cytoplasmic tRNA genes can be grouped into 49 families according to their anticodon features. These genes are found on all chromosomes except chromosome 22 and the Y chromosome. High clustering is observed on 6p (140 tRNA genes), as well as on chromosome 1.

The HGNC, in collaboration with the Genomic tRNA Database (GtRNAdb) and experts in the field, has approved unique names for human genes that encode tRNAs.

Typically, tRNA genes from Bacteria are shorter (mean = 77.6 bp) than those from Archaea (mean = 83.1 bp) and eukaryotes (mean = 84.7 bp). The mature tRNA follows the opposite pattern, with tRNAs from Bacteria usually being longer (median = 77.6 nt) than tRNAs from Archaea (median = 76.8 nt), and eukaryotes exhibiting the shortest mature tRNAs (median = 74.5 nt).

Evolution
Genomic tRNA content is a differentiating feature of genomes among biological domains of life: Archaea present the simplest situation in terms of genomic tRNA content with a uniform number of gene copies, Bacteria have an intermediate situation and Eukarya present the most complex situation. Eukarya present not only more tRNA gene content than the other two domains but also a high variation in gene copy number among different isoacceptors, and this complexity seems to be due to duplications of tRNA genes and changes in anticodon specificity.

Evolution of the tRNA gene copy number across different species has been linked to the appearance of specific tRNA modification enzymes (uridine methyltransferases in Bacteria, and adenosine deaminases in Eukarya), which increase the decoding capacity of a given tRNA. As an example, tRNA-Ala has four different isoacceptors (AGC, UGC, GGC and CGC). In Eukarya, AGC isoacceptors are extremely enriched in gene copy number in comparison to the other isoacceptors, and this has been correlated with the A-to-I modification of their wobble base. This same trend has been shown for most amino acids of eukaryal species. Indeed, the effect of these two tRNA modifications is also seen in codon usage bias. Highly expressed genes seem to be enriched in codons that are decoded exclusively by these modified tRNAs, which suggests a possible role of these codons, and consequently of these tRNA modifications, in translation efficiency.

Many species have lost specific tRNAs during evolution. For instance, both mammals and birds lack the same 14 out of the possible 64 tRNA genes, but other life forms contain these tRNAs. For translating codons for which an exactly pairing tRNA is missing, organisms resort to a strategy called wobbling, in which imperfectly matched tRNA/mRNA pairs still give rise to translation, although this strategy also increases the propensity for translation errors. The reasons why tRNA genes have been lost during evolution remain under debate but may relate to improving resistance to viral infection. Because nucleotide triplets can present more combinations than there are amino acids and associated tRNAs, there is redundancy in the genetic code, and several different 3-nucleotide codons can express the same amino acid. This codon bias is what necessitates codon optimization.
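
The redundancy of the genetic code can be made concrete by counting synonymous codons. The sketch below builds the standard genetic code from the conventional 64-letter amino acid string (bases ordered U, C, A, G, with `*` marking stop codons) and groups the sense codons by amino acid. The variable names are hypothetical, and variant codes such as the mitochondrial ones are not covered.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# One-letter amino acids for the 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (e.g. "GGU") to its amino acid in the standard genetic code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the sense codons by the amino acid they specify.
synonymous = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        synonymous[aa].append(codon)

sense_total = sum(len(codons) for codons in synonymous.values())
print(sense_total, "sense codons for", len(synonymous), "amino acids")
print("Glycine codons:", sorted(synonymous["G"]))
```

Amino acids range from a single codon (methionine, tryptophan) to six (leucine, serine, arginine), which is the redundancy that wobble pairing and codon usage bias both exploit.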

Hypothetical origin
The top half of tRNA (consisting of the T arm and the acceptor stem with 5′-terminal phosphate group and 3′-terminal CCA group) and the bottom half (consisting of the D arm and the anticodon arm) are independent units in structure as well as in function. The top half may have evolved first, including the 3′-terminal genomic tag which originally may have marked tRNA-like molecules for replication in the early RNA world. The bottom half may have evolved later as an expansion, e.g. as protein synthesis started in the RNA world and turned it into a ribonucleoprotein world (RNP world). This proposed scenario is called the genomic tag hypothesis. In fact, tRNA and tRNA-like aggregates still have an important catalytic influence (i.e., as ribozymes) on replication today. These roles may be regarded as 'molecular (or chemical) fossils' of the RNA world. In March 2021, researchers reported evidence suggesting that an early form of transfer RNA could have been a replicator ribozyme molecule in the very early development of life, or abiogenesis.

tRNA-derived fragments
tRNA-derived fragments (or tRFs) are short molecules that emerge after cleavage of the mature tRNAs or the precursor transcript. Both cytoplasmic and mitochondrial tRNAs can produce fragments. There are at least four structural types of tRFs believed to originate from mature tRNAs, including the relatively long tRNA halves and short 5'-tRFs, 3'-tRFs and i-tRFs. The precursor tRNA can be cleaved to produce molecules from the 5' leader or 3' trail sequences. Cleavage enzymes include Angiogenin, Dicer, RNase Z and RNase P. Especially in the case of Angiogenin, the tRFs have a characteristically unusual cyclic phosphate at their 3' end and a hydroxyl group at the 5' end. tRFs appear to play a role in RNA interference, specifically in the suppression of retroviruses and retrotransposons that use tRNA as a primer for replication. Half-tRNAs cleaved by angiogenin are also known as tiRNAs. The biogenesis of smaller fragments, including those that function as piRNAs, is less understood.

tRFs have multiple dependencies and roles: their abundance changes significantly between sexes, among races, and with disease status. Functionally, they can be loaded on Ago and act through RNAi pathways, participate in the formation of stress granules, displace mRNAs from RNA-binding proteins or inhibit translation. At the system or the organismal level, the four types of tRFs have a diverse spectrum of activities. tRFs are also associated with viral infection, cancer, cell proliferation and epigenetic transgenerational regulation of metabolism.

tRFs are not restricted to humans and have been shown to exist in multiple organisms.

Two online tools are available for those wishing to learn more about tRFs: the framework for the interactive exploration of mitochondrial and nuclear tRNA fragments (MINTbase) and the relational database of Transfer RNA related Fragments (tRFdb). MINTbase also provides a genome-independent naming scheme for tRFs called tRF-license plates (or MINTcodes); the scheme compresses an RNA sequence into a shorter string.

Engineered tRNAs
tRNAs with modified anticodons and/or acceptor stems can be used to modify the genetic code. Scientists have successfully repurposed codons (sense and stop) to accept amino acids (natural and novel), for both initiation (see: start codon) and elongation.

In 1990, an initiator tRNA with a CUA anticodon (modified from the initiator tRNA gene metY) was inserted into E. coli, causing it to initiate protein synthesis at the UAG stop codon, as long as the codon is preceded by a strong Shine-Dalgarno sequence. At initiation it inserts not only the traditional formylmethionine but also formylglutamine, as glutaminyl-tRNA synthetase also recognizes the new tRNA. The experiment was repeated in 1993, now with an elongator tRNA modified to be recognized by the methionyl-tRNA formyltransferase. A similar result was obtained in Mycobacterium. Later experiments showed that the new tRNA was orthogonal to the regular AUG start codon, showing no detectable off-target translation initiation events in a genomically recoded E. coli strain.

tRNA biogenesis
In eukaryotic cells, tRNAs are transcribed by RNA polymerase III as pre-tRNAs in the nucleus. RNA polymerase III recognizes two highly conserved downstream promoter sequences: the 5′ intragenic control region (5′-ICR, D-control region, or A box), and the 3′-ICR (T-control region or B box) inside tRNA genes. The first promoter begins at +8 of mature tRNAs and the second promoter is located 30–60 nucleotides downstream of the first promoter. The transcription terminates after a stretch of four or more thymidines.

Pre-tRNAs undergo extensive modifications inside the nucleus. Some pre-tRNAs contain introns that are spliced, or cut, to form the functional tRNA molecule; in bacteria these self-splice, whereas in eukaryotes and archaea they are removed by tRNA-splicing endonucleases. Eukaryotic pre-tRNA contains a bulge-helix-bulge (BHB) structural motif that is important for recognition and precise splicing of the tRNA intron by endonucleases. The position and structure of this motif are evolutionarily conserved. However, some organisms, such as unicellular algae, have a non-canonical position of the BHB motif as well as of the 5′ and 3′ ends of the spliced intron sequence. The 5′ sequence is removed by RNase P, whereas the 3′ end is removed by the tRNase Z enzyme. A notable exception is the archaeon Nanoarchaeum equitans, which does not possess an RNase P enzyme and has a promoter placed such that transcription starts at the 5′ end of the mature tRNA. The non-templated 3′ CCA tail is added by a nucleotidyl transferase. Before tRNAs are exported into the cytoplasm by Los1/Xpo-t, tRNAs are aminoacylated. The order of the processing events is not conserved. For example, in yeast, the splicing is not carried out in the nucleus but at the cytoplasmic side of mitochondrial membranes.

History
The existence of tRNA was first hypothesized by Francis Crick as the "adaptor hypothesis", based on the assumption that there must exist an adaptor molecule capable of mediating the translation of the RNA alphabet into the protein alphabet. Paul C Zamecnik, Mahlon Hoagland, and Mary Louise Stephenson discovered tRNA. Significant research on structure was conducted in the early 1960s by Alex Rich and Donald Caspar, two researchers in Boston, the Jacques Fresco group at Princeton University and a United Kingdom group at King's College London. In 1965, Robert W. Holley of Cornell University reported the primary structure and suggested three secondary structures. tRNA was first crystallized in Madison, Wisconsin, by Robert M. Bock. The cloverleaf structure was ascertained by several other studies in the following years and was finally confirmed using X-ray crystallography studies in 1974. Two independent groups, Kim Sung-Hou working under Alexander Rich and a British group headed by Aaron Klug, published the same crystallography findings within a year.

Clinical relevance
Interference with aminoacylation may be useful as an approach to treating some diseases: cancerous cells may be relatively vulnerable to disturbed aminoacylation compared to healthy cells. The protein synthesis associated with cancer and viral biology is often very dependent on specific tRNA molecules. For instance, in liver cancer, charging tRNA-Lys-CUU with lysine sustains cancer cell growth and metastasis, whereas healthy cells have a much lower dependence on this tRNA to support cellular physiology. Similarly, hepatitis E virus requires a tRNA landscape that substantially differs from that associated with uninfected cells. Hence, inhibition of aminoacylation of specific tRNA species is considered a promising novel avenue for the rational treatment of a range of diseases.