Transfer RNA (abbreviated tRNA and formerly referred to as sRNA, for soluble RNA) is an adaptor molecule composed of RNA, typically 76 to 90 nucleotides in length (in eukaryotes), that serves as the physical link between the mRNA and the amino acid sequence of proteins. It does this by carrying an amino acid to the ribosome, the protein-synthesizing machinery of the cell. Pairing of a 3-nucleotide codon in a messenger RNA (mRNA) with the complementary 3-nucleotide anticodon of the tRNA results in protein synthesis based on the mRNA code. As such, tRNAs are a necessary component of translation, the biological synthesis of new proteins in accordance with the genetic code.
Typically, tRNA genes from Bacteria are shorter (mean = 77.6 bp) than tRNA genes from Archaea (mean = 83.1 bp) and eukaryotes (mean = 84.7 bp). Mature tRNAs follow the opposite pattern: those from Bacteria are usually the longest (median = 77.6 nt), followed by those from Archaea (median = 76.8 nt), with eukaryotes exhibiting the shortest mature tRNAs (median = 74.5 nt).
Overview
While the specific nucleotide sequence of an mRNA specifies which amino acids are incorporated into the protein product of the gene from which the mRNA is transcribed, the role of tRNA is to specify which sequence from the genetic code corresponds to which amino acid. The mRNA encodes a protein as a series of contiguous codons, each of which is recognized by a particular tRNA. One end of the tRNA matches the genetic code in a three-nucleotide sequence called the anticodon. The anticodon forms three complementary base pairs with a codon in mRNA during protein biosynthesis.
On the other end of the tRNA is a covalent attachment to the amino acid that corresponds to the anticodon sequence. Each type of tRNA molecule can be attached to only one type of amino acid, so each organism has many types of tRNA. Because the genetic code contains multiple codons that specify the same amino acid, there are several tRNA molecules bearing different anticodons which carry the same amino acid.
The covalent attachment to the tRNA 3’ end is catalysed by enzymes called aminoacyl tRNA synthetases. During protein synthesis, tRNAs with attached amino acids are delivered to the ribosome by proteins called elongation factors, which aid in association of the tRNA with the ribosome, synthesis of the new polypeptide, and translocation (movement) of the ribosome along the mRNA. If the tRNA's anticodon matches the mRNA, another tRNA already bound to the ribosome transfers the growing polypeptide chain from its 3’ end to the amino acid attached to the 3’ end of the newly delivered tRNA, a reaction catalysed by the ribosome. A large number of the individual nucleotides in a tRNA molecule may be chemically modified, often by methylation or deamination. These modified bases sometimes affect the tRNA's interaction with ribosomes and sometimes occur in the anticodon to alter base-pairing properties.
Structure
The structure of tRNA can be decomposed into its primary structure, its secondary structure (usually visualized as the cloverleaf structure), and its tertiary structure (all tRNAs have a similar L-shaped 3D structure that allows them to fit into the P and A sites of the ribosome). The cloverleaf structure becomes the 3D L-shaped structure through coaxial stacking of the helices, which is a common RNA tertiary structure motif. The lengths of each arm, as well as the loop 'diameter', in a tRNA molecule vary from species to species. The tRNA structure consists of the following:
- The acceptor stem is a 7- to 9-base pair (bp) stem made by the base pairing of the 5′-terminal nucleotide with the 3′-terminal nucleotide (which contains the CCA 3′-terminal group used to attach the amino acid). In general, such 3′-terminal tRNA-like structures are referred to as 'genomic tags'. The acceptor stem may contain non-Watson-Crick base pairs.
- The CCA tail is a cytosine-cytosine-adenine sequence at the 3′ end of the tRNA molecule. The amino acid loaded onto the tRNA by aminoacyl tRNA synthetases, to form aminoacyl-tRNA, is covalently bonded to the 3′-hydroxyl group on the CCA tail. This sequence is important for the recognition of tRNA by enzymes and critical in translation. In prokaryotes, the CCA sequence is transcribed in some tRNA sequences. In most prokaryotic tRNAs and eukaryotic tRNAs, the CCA sequence is added during processing and therefore does not appear in the tRNA gene.
- The D loop is a 4- to 6-bp stem ending in a loop that often contains dihydrouridine.
- The anticodon loop is a 5-bp stem whose loop contains the anticodon. The tRNA 5′-to-3′ primary structure contains the anticodon but in reverse order, since 3′-to-5′ directionality is required to read the mRNA from 5′-to-3′.
- The ΨU loop is so named because of the characteristic presence of the unusual base ΨU in the loop, where Ψ is pseudouridine, a modified uridine. The modified base is often found within the sequence 5′-TΨUCG-3′.
- The variable loop sits between the anticodon loop and the ΨU loop and, as its name implies, varies in size from 3 to 21 bases.
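The antiparallel pairing described for the anticodon loop can be illustrated with a short Python sketch. It assumes strict Watson-Crick pairing only (no wobble and no modified bases), and the function name `anticodon_for` is a hypothetical helper introduced here for illustration.

```python
# Watson-Crick partners for the four standard RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the strictly complementary anticodon, written 5'->3',
    for an mRNA codon that is also written 5'->3'.

    The strands pair antiparallel, so the codon is reversed before
    each base is complemented.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


print(anticodon_for("AUG"))  # CAU
print(anticodon_for("GGC"))  # GCC
```

Reading the result 3′-to-5′ (that is, backwards) gives the base-by-base complement of the codon, which is why the anticodon appears "in reverse order" in the 5′-to-3′ primary structure.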
Anticodon
An anticodon is a unit of three nucleotides corresponding to the three bases of an mRNA codon. Each tRNA has a distinct anticodon triplet sequence that can form 3 complementary base pairs to one or more codons for an amino acid. Some anticodons pair with more than one codon due to wobble base pairing. Frequently, the first nucleotide of the anticodon is one not found on mRNA: inosine, which can hydrogen bond to more than one base in the corresponding codon position. In the genetic code, it is common for a single amino acid to be specified by all four third-position possibilities, or at least by both pyrimidines or both purines; for example, the amino acid glycine is coded for by the codon sequences GGU, GGC, GGA, and GGG. Other modified nucleotides may also appear at the first anticodon position (sometimes known as the "wobble position"), resulting in subtle changes to the genetic code, as for example in mitochondria. Per cell, 61 tRNA types would be required to provide one-to-one correspondence between tRNA molecules and codons that specify amino acids, as there are 61 sense codons in the standard genetic code. However, many cells have fewer than 61 types of tRNAs because the wobble base is capable of binding to several, though not necessarily all, of the codons that specify a particular amino acid. At least 31 tRNAs are required to translate, unambiguously, all 61 sense codons.
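Wobble pairing can be sketched in Python using the classic pairing rules usually attributed to Crick. This is a simplified illustration and not a complete model: the `WOBBLE` table covers only unmodified A, C, G, U and inosine (I) at the first anticodon position, and the function name `codons_read` is a hypothetical helper.

```python
# Anticodon 5' (wobble) base -> codon 3' bases it can pair with,
# following the simplified classic wobble rules (an assumption of this sketch).
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon: str) -> list[str]:
    """List the codons (5'->3') that an anticodon (5'->3') can decode."""
    wobble, middle, last = anticodon.upper()
    # Anticodon position 3 pairs with codon position 1, and position 2 with 2;
    # only the wobble position may pair with more than one base.
    prefix = COMPLEMENT[last] + COMPLEMENT[middle]
    return sorted(prefix + third for third in WOBBLE[wobble])


print(codons_read("GCC"))  # ['GGC', 'GGU']
print(codons_read("ICC"))  # ['GGA', 'GGC', 'GGU']
print(codons_read("CCC"))  # ['GGG']
```

Under these rules, two anticodons (ICC and CCC) are enough to read all four glycine codons, which illustrates why fewer than 61 tRNA types can suffice.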
Aminoacylation
Aminoacylation is the process of adding an aminoacyl group to a compound. It covalently links an amino acid to the CCA 3′ end of a tRNA molecule. Each tRNA is aminoacylated (or charged) with a specific amino acid by an aminoacyl tRNA synthetase. There is normally a single aminoacyl tRNA synthetase for each amino acid, despite the fact that there can be more than one tRNA, and more than one anticodon, for an amino acid. Recognition of the appropriate tRNA by the synthetases is not mediated solely by the anticodon, and the acceptor stem often plays a prominent role. Reaction:
amino acid + ATP → aminoacyl-AMP + PPi
aminoacyl-AMP + tRNA → aminoacyl-tRNA + AMP
Certain organisms lack one or more aminoacyl tRNA synthetases. In such cases the tRNA is first charged with a chemically related amino acid, and one or more enzymes then modify the attached amino acid so that the tRNA is correctly charged. For example, Helicobacter pylori lacks glutaminyl tRNA synthetase. Instead, glutamate tRNA synthetase charges tRNA-glutamine (tRNA-Gln) with glutamate. An amidotransferase then converts the acid side chain of the glutamate to the amide, forming the correctly charged Gln-tRNA-Gln.
Interference with aminoacylation may be useful as an approach to treating some diseases: cancerous cells may be relatively vulnerable to disturbed aminoacylation compared to healthy cells. The protein synthesis associated with cancer and viral biology is often very dependent on specific tRNA molecules. For instance, in liver cancer, charging tRNA-Lys-CUU with lysine sustains cancer cell growth and metastasis, whereas healthy cells have a much lower dependence on this tRNA to support cellular physiology. Similarly, hepatitis E virus requires a tRNA landscape that substantially differs from that associated with uninfected cells. Hence, inhibition of aminoacylation of specific tRNA species is considered a promising avenue for the rational treatment of a range of diseases.
Binding to ribosome
The ribosome has three binding sites for tRNA molecules that span the space between the two ribosomal subunits: the A (aminoacyl), P (peptidyl), and E (exit) sites. In addition, the ribosome has two other sites for tRNA binding that are used during mRNA decoding or during the initiation of protein synthesis. These are the T site (named for elongation factor Tu) and the I site (initiation). By convention, the tRNA binding sites are denoted with the site on the small ribosomal subunit listed first and the site on the large ribosomal subunit listed second. For example, the A site is often written A/A, the P site, P/P, and the E site, E/E. Binding proteins at the A and P sites, such as L27, L2, L14, L15 and L16, were identified by affinity labeling by A. P. Czernilofsky et al. (Proc. Natl. Acad. Sci. USA, pp. 230–234, 1974).
Once translation initiation is complete, the first aminoacyl tRNA is located in the P/P site, ready for the elongation cycle described below. During translation elongation, tRNA first binds to the ribosome as part of a complex with elongation factor Tu (EF-Tu) or its eukaryotic (eEF-1) or archaeal counterpart. This initial tRNA binding site is called the A/T site. In the A/T site, the A-site half resides in the small ribosomal subunit where the mRNA decoding site is located. The mRNA decoding site is where the mRNA codon is read out during translation. The T-site half resides mainly on the large ribosomal subunit where EF-Tu or eEF-1 interacts with the ribosome. Once mRNA decoding is complete, the aminoacyl-tRNA is bound in the A/A site and is ready for the next peptide bond to be formed to its attached amino acid. The peptidyl-tRNA, which transfers the growing polypeptide to the aminoacyl-tRNA bound in the A/A site, is bound in the P/P site. Once the peptide bond is formed, the tRNA in the P/P site is deacylated (it has a free 3’ end), and the tRNA in the A/A site carries the growing polypeptide chain. To allow for the next elongation cycle, the tRNAs then move through hybrid A/P and P/E binding sites, before completing the cycle and residing in the P/P and E/E sites. Once the A/A and P/P tRNAs have moved to the P/P and E/E sites, the mRNA has also moved over by one codon and the A/T site is vacant, ready for the next round of mRNA decoding. The tRNA bound in the E/E site then leaves the ribosome.
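The route a single elongator tRNA takes through these binding sites can be summarised as a small state sequence. The Python sketch below only encodes the order of sites given in the text, using the small-subunit/large-subunit notation; the names `SITE_PATH`, `next_site` and `trace` are hypothetical and it is not a kinetic model of the ribosome.

```python
# Order of binding states for one elongator tRNA, written as
# "small-subunit site / large-subunit site".
SITE_PATH = ["A/T", "A/A", "A/P", "P/P", "P/E", "E/E"]


def next_site(site: str):
    """Return the binding state that follows `site`, or None once the
    tRNA leaves the ribosome from the E/E site."""
    index = SITE_PATH.index(site)
    return SITE_PATH[index + 1] if index + 1 < len(SITE_PATH) else None


def trace(start: str = "A/T") -> list[str]:
    """List every state visited from `start` until the tRNA exits."""
    path, site = [], start
    while site is not None:
        path.append(site)
        site = next_site(site)
    return path


print(" -> ".join(trace()))  # A/T -> A/A -> A/P -> P/P -> P/E -> E/E
```

An initiator tRNA would instead begin at the P/P site, which corresponds to calling `trace("P/P")` in this sketch.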
The P/I site is actually the first to bind aminoacyl tRNA, which is delivered by an initiation factor called IF2 in bacteria. However, the existence of the P/I site in eukaryotic or archaeal ribosomes has not yet been confirmed. The P-site protein L27 was identified by affinity labeling by E. Collatz and A. P. Czernilofsky (FEBS Lett., Vol. 63, pp. 283–286, 1976).
tRNA genes
Organisms vary in the number of tRNA genes in their genome. For example, the nematode worm C. elegans, a commonly used model organism in genetics studies, has 29,647 genes in its nuclear genome, of which 620 code for tRNA. The budding yeast Saccharomyces cerevisiae has 275 tRNA genes in its genome. The number of tRNA genes per genome can vary widely, with bacterial species from groups such as Fusobacteria and Tenericutes having around 30 genes per genome while complex eukaryotic genomes such as the zebrafish (Danio rerio) can bear more than 10 thousand tRNA genes.
In the human genome, which, according to January 2013 estimates, has about 20,848 protein coding genes in total, there are 497 nuclear genes encoding cytoplasmic tRNA molecules, and 324 tRNA-derived pseudogenes—tRNA genes thought to be no longer functional (although pseudo tRNAs have been shown to be involved in antibiotic resistance in bacteria). As with all eukaryotes, there are 22 mitochondrial tRNA genes in humans. Mutations in some of these genes have been associated with severe diseases like the MELAS syndrome. Regions in nuclear chromosomes, very similar in sequence to mitochondrial tRNA genes, have also been identified (tRNA-lookalikes). These tRNA-lookalikes are also considered part of the nuclear mitochondrial DNA (genes transferred from the mitochondria to the nucleus). The phenomenon of multiple nuclear copies of mitochondrial tRNA (tRNA-lookalikes) has been observed in many higher organisms from human to the opossum suggesting the possibility that the lookalikes are functional.
Cytoplasmic tRNA genes can be grouped into 49 families according to their anticodon features. These genes are found on all chromosomes except chromosomes 22 and Y. High clustering is observed on 6p (140 tRNA genes), as well as on chromosome 1.
The HGNC, in collaboration with the Genomic tRNA Database (GtRNAdb) and experts in the field, has approved unique names for human genes that encode tRNAs.
Evolution
The top half of tRNA (consisting of the T arm and the acceptor stem with 5′-terminal phosphate group and 3′-terminal CCA group) and the bottom half (consisting of the D arm and the anticodon arm) are independent units in structure as well as in function. The top half may have evolved first, including the 3′-terminal genomic tag, which originally may have marked tRNA-like molecules for replication in the early RNA world. The bottom half may have evolved later as an expansion, e.g. as protein synthesis started in the RNA world and turned it into a ribonucleoprotein world (RNP world). This proposed scenario is called the genomic tag hypothesis. In fact, tRNA and tRNA-like aggregates still have an important catalytic influence (i.e., as ribozymes) on replication today. These roles may be regarded as 'molecular (or chemical) fossils' of the RNA world.
Genomic tRNA content is a differentiating feature of genomes among the biological domains of life: Archaea present the simplest situation in terms of genomic tRNA content, with a uniform number of gene copies; Bacteria have an intermediate situation; and Eukarya present the most complex situation. Eukarya present not only more tRNA gene content than the other two domains but also a high variation in gene copy number among different isoacceptors, and this complexity seems to be due to duplications of tRNA genes and changes in anticodon specificity.
Evolution of the tRNA gene copy number across different species has been linked to the appearance of specific tRNA modification enzymes (uridine methyltransferases in Bacteria, and adenosine deaminases in Eukarya), which increase the decoding capacity of a given tRNA. As an example, tRNAAla comprises four different tRNA isoacceptors (AGC, UGC, GGC and CGC). In Eukarya, AGC isoacceptors are extremely enriched in gene copy number in comparison to the other isoacceptors, and this has been correlated with the A-to-I modification of their wobble base. The same trend has been shown for most amino acids in eukaryal species. Indeed, the effect of these two tRNA modifications is also seen in codon usage bias. Highly expressed genes seem to be enriched in codons that are decoded exclusively by these modified tRNAs, which suggests a possible role of these codons, and consequently of these tRNA modifications, in translation efficiency.
Many species have lost specific tRNAs during evolution. For instance, both mammals and birds lack the same 14 out of the possible 64 tRNA genes, but other life forms contain these tRNAs. For translating codons for which an exactly pairing tRNA is missing, organisms resort to a strategy called wobbling, in which imperfectly matched tRNA/mRNA pairs still give rise to translation, although this strategy also increases the propensity for translation errors. The reasons why tRNA genes have been lost during evolution remain under debate but may relate to improving resistance to viral infection. Because nucleotide triplets can present more combinations than there are amino acids and associated tRNAs, there is redundancy in the genetic code, and several different 3-nucleotide codons can specify the same amino acid. Unequal use of these synonymous codons, known as codon bias, is what necessitates codon optimization.
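The redundancy of the genetic code can be checked by counting codons per amino acid. The Python sketch below builds the standard genetic code from the conventional compact string (bases in the order U, C, A, G, one-letter amino acid codes, `*` for stop); the variable names are illustrative choices.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons for each amino acid (stop codons excluded).
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE))             # 64 possible triplets
print(sum(codons_per_aa.values()))  # 61 sense codons
print(len(codons_per_aa))           # 20 amino acids
print(codons_per_aa["L"], codons_per_aa["G"], codons_per_aa["M"])  # 6 4 1
```

Leucine, for example, is specified by six codons while methionine has only one, so the choice among synonymous codons is available for most but not all residues.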
tRNA-derived fragments
tRNA-derived fragments (or tRFs) are short molecules that emerge after cleavage of the mature tRNAs or the precursor transcript. Both cytoplasmic and mitochondrial tRNAs can produce fragments. There are at least four structural types of tRFs believed to originate from mature tRNAs, including the relatively long tRNA halves and the short 5’-tRFs, 3’-tRFs and i-tRFs. The precursor tRNA can be cleaved to produce molecules from the 5’ leader or 3’ trailer sequences. Cleavage enzymes include Angiogenin, Dicer, RNase Z and RNase P. Especially in the case of Angiogenin, the tRFs have a characteristically unusual cyclic phosphate at their 3’ end and a hydroxyl group at the 5’ end. tRFs appear to play a role in RNA interference, specifically in the suppression of retroviruses and retrotransposons that use tRNA as a primer for replication. Half-tRNAs cleaved by angiogenin are also known as tiRNAs. The biogenesis of smaller fragments, including those that function as piRNAs, is less well understood.
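The structural types of fragments derived from a mature tRNA can be told apart by where a fragment starts and ends. The Python sketch below is one possible way to express that idea and is not the classification procedure of any particular database. It assumes 1-based inclusive coordinates on the mature tRNA, treats a fragment as a "half" when its cut site falls inside the anticodon loop, and uses hypothetical example values (a 76-nt tRNA with the anticodon loop at positions 32 to 38).

```python
def classify_fragment(start, end, trna_length, loop_start, loop_end):
    """Classify a fragment of a mature tRNA by its 1-based, inclusive
    coordinates. All parameters are assumptions supplied by the caller."""
    if start == 1 and end == trna_length:
        return "full-length tRNA"
    if start == 1:
        # 5' fragments: a cut inside the anticodon loop gives a 5' half.
        return "5'-half" if loop_start <= end <= loop_end else "5'-tRF"
    if end == trna_length:
        # 3' fragments: the cut lies just before the first retained base.
        return "3'-half" if loop_start <= start - 1 <= loop_end else "3'-tRF"
    # Neither terminus of the mature tRNA is retained.
    return "i-tRF"


# Hypothetical 76-nt tRNA with its anticodon loop at positions 32-38.
for span in [(1, 34), (1, 18), (35, 76), (55, 76), (20, 40)]:
    print(span, classify_fragment(*span, trna_length=76, loop_start=32, loop_end=38))
```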
tRFs have multiple dependencies and roles; for example, their abundance changes significantly between sexes, among races and with disease status. Functionally, they can be loaded on Ago and act through RNAi pathways, participate in the formation of stress granules, displace mRNAs from RNA-binding proteins or inhibit translation. At the system or organismal level, the four types of tRFs have a diverse spectrum of activities. tRFs are associated with viral infection, cancer, cell proliferation and also with epigenetic transgenerational regulation of metabolism.
tRFs are not restricted to humans and have been shown to exist in multiple organisms.
Two online tools are available for those wishing to learn more about tRFs: the framework for the interactive exploration of mitochondrial and nuclear tRNA fragments (MINTbase) and the relational database of Transfer RNA related Fragments (tRFdb). MINTbase also provides a genome-independent naming scheme for tRFs called tRF-license plates (or MINTcodes); the scheme compresses an RNA sequence into a shorter string.
Engineered tRNAs
Artificial suppressor elongator tRNAs are used to incorporate unnatural amino acids at nonsense codons placed in the coding sequence of a gene. Engineered initiator tRNAs (tRNAfMet2 with CUA anticodon encoded by metY gene) have been used to initiate translation at the amber stop codon UAG. This type of engineered tRNA is called a nonsense suppressor tRNA because it suppresses the translation stop signal that normally occurs at UAG codons. The amber initiator tRNA inserts methionine and glutamine at UAG codons preceded by a strong Shine-Dalgarno sequence. An investigation of the amber initiator tRNA showed that it was orthogonal to the regular AUG start codon showing no detectable off-target translation initiation events in a genomically recoded E. coli strain.
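The effect of a nonsense suppressor tRNA on elongation can be illustrated with a toy translation function. The Python sketch below assumes the standard genetic code and models suppression as an override of individual codons; the function name `translate`, the example mRNA and the letter `X` (standing in for an unnatural amino acid) are all hypothetical. It ignores the competition with release factors that occurs in real cells.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, suppressor=None):
    """Translate an in-frame mRNA until the first unsuppressed stop codon.

    `suppressor` maps codons to the residue inserted by an engineered tRNA,
    e.g. {"UAG": "X"} for an amber suppressor.
    """
    suppressor = suppressor or {}
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = suppressor.get(codon, CODON_TABLE[codon])
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


mrna = "AUGGCUUAGGGAUAA"  # hypothetical: AUG GCU UAG GGA UAA
print(translate(mrna))                # MA   (stops at the amber codon)
print(translate(mrna, {"UAG": "X"}))  # MAXG (amber codon read through)
```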
tRNA biogenesis
In eukaryotic cells, tRNAs are transcribed by RNA polymerase III as pre-tRNAs in the nucleus. RNA polymerase III recognizes two highly conserved downstream promoter sequences: the 5′ intragenic control region (5′-ICR, D-control region, or A box), and the 3′-ICR (T-control region or B box) inside tRNA genes. The first promoter begins at +8 of mature tRNAs and the second promoter is located 30–60 nucleotides downstream of the first promoter. The transcription terminates after a stretch of four or more thymidines.
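The termination rule described above, a stretch of four or more thymidines, is easy to express as a pattern search. The Python sketch below scans the non-template (coding) DNA strand for the first such run; the function name `find_terminator` and the example sequence are hypothetical, and real terminator usage also depends on sequence context that is not modelled here.

```python
import re

# Four or more consecutive T residues on the coding strand.
TERMINATOR = re.compile(r"T{4,}")


def find_terminator(dna, search_from=0):
    """Return the (start, end) span of the first run of >= 4 T's at or after
    `search_from` (0-based, end exclusive), or None if there is none."""
    match = TERMINATOR.search(dna.upper(), search_from)
    return match.span() if match else None


body = "GGGCCTATAGCTCAGG"      # hypothetical stretch of a tRNA gene
gene = body + "TTTTT" + "ACG"  # followed by a T-run and flanking sequence
print(find_terminator(gene))
print(find_terminator("ACGTTTACG"))  # only three T's, so no terminator
```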
Pre-tRNAs undergo extensive modifications inside the nucleus. Some pre-tRNAs contain introns that are spliced, or cut, to form the functional tRNA molecule; in bacteria these self-splice, whereas in eukaryotes and archaea they are removed by tRNA-splicing endonucleases. Eukaryotic pre-tRNA contains a bulge-helix-bulge (BHB) structural motif that is important for recognition and precise splicing of the tRNA intron by endonucleases. The position and structure of this motif are evolutionarily conserved. However, some organisms, such as unicellular algae, have a non-canonical position of the BHB motif as well as of the 5′ and 3′ ends of the spliced intron sequence. The 5′ leader sequence is removed by RNase P, whereas the 3′ trailer is removed by the tRNase Z enzyme. A notable exception is the archaeon Nanoarchaeum equitans, which does not possess an RNase P enzyme and has a promoter placed such that transcription starts at the 5′ end of the mature tRNA. The non-templated 3′ CCA tail is added by a nucleotidyl transferase. Before tRNAs are exported into the cytoplasm by Los1/Xpo-t, tRNAs are aminoacylated. The order of the processing events is not conserved. For example, in yeast, the splicing is not carried out in the nucleus but at the cytoplasmic side of mitochondrial membranes.
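The sequence-level outcome of these processing steps can be sketched as simple string operations. The Python example below is only an illustration: the function name `process_pre_trna`, its parameters and the toy sequences are hypothetical, the steps are applied in one fixed order even though the real order is not conserved, and base modifications are ignored.

```python
def process_pre_trna(pre_trna, leader_len, trailer_len, intron=None):
    """Return the mature tRNA sequence for a pre-tRNA.

    leader_len  -- length of the 5' leader (removed by RNase P)
    trailer_len -- length of the 3' trailer (removed by tRNase Z)
    intron      -- optional (start, end) span, 0-based and end-exclusive,
                   measured on the sequence after end trimming
    """
    # Remove the 5' leader and the 3' trailer.
    body = pre_trna[leader_len:len(pre_trna) - trailer_len]
    # Excise the intron, if any, and join the two exons.
    if intron is not None:
        start, end = intron
        body = body[:start] + body[end:]
    # Append the non-templated CCA tail.
    return body + "CCA"


# Toy sequences, far shorter than a real tRNA.
leader, exon1, intron_seq, exon2, trailer = "AUGC", "GGGAAA", "UUUUU", "CCCGGG", "UUUU"
pre = leader + exon1 + intron_seq + exon2 + trailer
span = (len(exon1), len(exon1) + len(intron_seq))
print(process_pre_trna(pre, len(leader), len(trailer), span))  # GGGAAACCCGGGCCA
```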
In March 2021, researchers reported evidence suggesting that a preliminary form of transfer RNA could have been a replicator molecule in the very early development of life, or abiogenesis.