User:SllT22/sandbox/RNA secondary structures in Hepatitis C virus

Hepatitis C virus (HCV) is a positive-sense RNA virus belonging to the species Hepacivirus C, genus Hepacivirus, family Flaviviridae. The HCV genome is about 9.6 kb long and contains 5' and 3' untranslated regions (UTRs) and a single open reading frame (ORF) encoding a polyprotein that is cleaved co- and post-translationally into three structural (C, E1, E2) and seven non-structural proteins (p7, NS2, NS3, NS4A, NS4B, NS5A, NS5B). As in many other viruses, the HCV genome forms several RNA secondary structures that play a crucial role in the viral life cycle. Besides the structures in the non-coding regions of the genome (IRES, X-tail), conserved RNA secondary structures occur throughout the coding sequence (CDS). The known structures are described below.

5' UTR
The first 341 nt of the genome are designated the 5' UTR in HCV, which is known to be highly conserved at the structural level.


 * The internal ribosome entry site (IRES) is an RNA secondary structure containing three stem-loops (SLII to SLIV) located in the 5' UTR of the HCV genome. The IRES is crucial for the initiation of cap-independent translation. Moreover, long-range interactions with the 3' UTR play an important role in genome cyclization. Upstream of the IRES lies another stem-loop that is conserved in HCV, but no function has been assigned to this structure so far.

Core (C) gene
The core gene is the first gene in the HCV genome, with a length of about 600 nt. It encodes the capsid protein. An additional gene product, the F protein, has been described: its translation is initiated at the start codon of the core gene, but a -2/+1 ribosomal frameshift into an alternative reading frame occurs.
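
The notation -2/+1 reflects the fact that a shift of two nucleotides backwards and a shift of one nucleotide forwards lead into the same alternative reading frame, because reading frames repeat every three nucleotides. The following Python sketch illustrates this arithmetic. The function names and the toy sequence are assumptions made for illustration only and are not taken from the HCV genome.

```python
def reading_frame(position: int, shift: int = 0) -> int:
    """Return the reading frame (0, 1 or 2) of a position after a shift."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so a shift of -2 maps to frame 1, just like a shift of +1.
    return (position + shift) % 3


def codons(rna: str, frame: int) -> list[str]:
    """Split an RNA string into complete codons, starting at the given frame."""
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]


# Hypothetical toy sequence, NOT the HCV core sequence.
toy_rna = "AUGGCACGAAUC"

original_frame = codons(toy_rna, reading_frame(0))      # frame 0
shifted_minus2 = reading_frame(0, -2)                   # frame after a -2 shift
shifted_plus1 = reading_frame(0, +1)                    # frame after a +1 shift
alternative_frame = codons(toy_rna, shifted_plus1)

print(original_frame)
print(shifted_minus2 == shifted_plus1)
print(alternative_frame)
```

Both shifts end in frame 1, so the ribosome decodes the same alternative set of codons downstream of the shift site in either case.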


 * The alternative reading frame secondary structure consists of two conserved stem-loops (SLV, SLVI). It is proposed to have a regulatory function in viral translation and repression.
 * SL588 is a stem-loop located at the beginning of the core gene of HCV, spanning about 80 nt. It contains several internal loops and bulges. This secondary structure is assumed to interact with surrounding RNA secondary structures, influencing viral replication.
 * SL669 is a stem-loop of 75 nt located next to SL588 in the core gene of HCV. It also includes internal loops and bulges. Although the structure of SL669 has been validated experimentally by in vivo dimethyl sulfate (DMS) reactivity, its function is currently unknown.
 * J750 is a multi-loop of about 85 nt consisting of two stem-loops, SL761 and SL783, located downstream of SL669 in the core gene of HCV. The function of J750 is unknown so far, but this multi-loop was validated by in vivo DMS reactivity experiments.

Non-structural 5B (NS5B) gene
The NS5B gene is the last gene in the HCV genome and encodes the RNA-dependent RNA polymerase (RdRp). NS5B is about 1,700 nt long. Several stem-loops that play an important role in viral replication and translation are located at the end of the NS5B gene, directly upstream of the 3' UTR.


 * 5BSL1 (~35 nt) and 5BSL2 (~55 nt) are two neighboring stem-loops. These secondary structures are known to interact with structures further downstream in the genome (3'SLIV, 3'SLII). Notably, the interaction between 5BSL2 and 3'SLII is essential for viral replication. Moreover, there are possible interactions between 3'SLIV and 5BSL1 or 5BSL2. These interactions would compete with each other and thus might change during translation.


 * The cis-acting replication element (CRE) is structurally conserved among flaviviruses and consists of three stem-loops: 5BSL3.1, 5BSL3.2, and 5BSL3.3. The CRE is known to be involved in replication as well as translation of the HCV genome. A kissing-loop interaction between 5BSL3.2 and 3'SLII of the X-tail seems to be essential for replication.
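
A kissing-loop interaction forms when the unpaired nucleotides of two hairpin loops base-pair with each other in antiparallel orientation. As a minimal sketch of this idea, the following Python function counts Watson-Crick pairs between two loop sequences. The function name and both loop sequences are hypothetical and do not represent the actual loops of 5BSL3.2 or 3'SLII.

```python
# Canonical Watson-Crick base pairs in RNA (G-U wobble pairs are ignored here).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_antiparallel_pairs(loop_a: str, loop_b: str) -> int:
    """Count Watson-Crick pairs between loop_a (read 5'->3') and loop_b (read 3'->5').

    Both sequences are written 5'->3'; loop_b is reversed so that the two
    strands are compared in antiparallel orientation, position by position.
    """
    return sum((a, b) in WATSON_CRICK for a, b in zip(loop_a, reversed(loop_b)))


# Hypothetical toy loops, NOT the real HCV loop sequences.
toy_loop_a = "GCACGU"
toy_loop_b = "ACGUGC"

print(count_antiparallel_pairs(toy_loop_a, toy_loop_b))
```

The sketch only tests sequence complementarity. Whether two loops actually interact also depends on loop geometry and thermodynamic stability, which are not modelled here.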

3' UTR
The 3' UTR of HCV, located at the end of the genome, spans about 200 to 235 nt and typically consists of three regions: a highly variable region, a poly(U/UC) tract, and a highly conserved X region of 98 nt that folds into secondary structures.
 * The 3' X-tail is a highly conserved RNA structure located in the 3' UTR of the genome. The three conserved stem-loops (3'SLIII, 3'SLII, 3'SLI) are essential for genome replication and genome cyclization. Furthermore, this region can fold into an alternative conformation of two stem-loops. 3'SLI remains the same, whereas the other two structures merge into one, which exposes a palindromic nucleotide sequence (dimer linkage sequence, DLS) in the apical loop. The two conformations of the 3' X-tail may be related to different affinities for interacting proteins and RNAs.
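
In nucleic acids, a palindromic sequence is one that equals its own reverse complement, so two identical copies can base-pair with each other. The following Python sketch shows how such a check could be written. The helper names and the toy sequences are assumptions for illustration, and the example strings are not the HCV DLS.

```python
# Translation table mapping each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def is_palindromic(rna: str) -> bool:
    """True if the sequence is identical to its own reverse complement."""
    return rna == reverse_complement(rna)


# Hypothetical toy sequences, NOT the HCV dimer linkage sequence.
print(is_palindromic("ACGCGU"))  # self-complementary toy example
print(is_palindromic("ACGCGA"))  # last base changed, no longer self-complementary
```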