Cis-natural antisense transcript

Natural antisense transcripts (NATs) are a group of RNAs encoded within a cell that are complementary to other RNA transcripts. They have been identified in multiple eukaryotes, including humans, mice, yeast and Arabidopsis thaliana. This class of RNAs includes both protein-coding and non-coding RNAs. Current evidence suggests a variety of regulatory roles for NATs, such as RNA interference (RNAi), alternative splicing, genomic imprinting, and X-chromosome inactivation. NATs are broadly grouped into two categories based on whether they act in cis or in trans. Trans-NATs are transcribed from a different location than their targets and usually have complementarity to multiple transcripts with some mismatches. MicroRNAs (miRNAs) are an example of trans-NATs that can target multiple transcripts with a few mismatches. Cis-natural antisense transcripts (cis-NATs), on the other hand, are transcribed from the same genomic locus as their target but from the opposite DNA strand, and form perfectly complementary pairs.

Orientation


Cis-NATs have a variety of orientations and differing lengths of overlap between pairs. Five orientations have been identified to date. The most commonly reported orientation is head-to-head, in which the 5' ends of both transcripts overlap. This orientation would result in the greatest knockdown of gene expression if transcriptional collision is the reason for transcript inhibition. By this account, the other orientations (tail-to-tail, completely overlapping, nearby head-to-head, and nearby tail-to-tail) are less frequently encountered, although some studies have suggested that tail-to-tail pairs are the most common. In completely overlapping NATs, one gene lies entirely within the span of the other on the opposite strand. Nearby head-to-head and nearby tail-to-tail pairs do not physically overlap but are located very close to each other. Current evidence suggests that NAT pairs are overrepresented among genes that have catalytic activity. There may be something about these genes in particular that makes them more prone to this type of regulation.
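The five orientations can be told apart from genomic coordinates alone. The following Python snippet is a minimal sketch, not a published method: the `Transcript` record, the `classify_pair` function, and the 1000-base `nearby` threshold are all hypothetical choices made for illustration, and coordinates are assumed to be 1-based and inclusive with `start` always the leftmost position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record: leftmost/rightmost genomic position and strand."""
    name: str
    start: int   # leftmost coordinate, 1-based inclusive
    end: int     # rightmost coordinate, 1-based inclusive
    strand: str  # "+" or "-"


def classify_pair(a, b, nearby=1000):
    """Classify two transcripts at one locus into a cis-NAT orientation.

    `nearby` (in bases) is an assumed cut-off for the two non-overlapping
    orientations; it is an illustrative value, not one taken from the text.
    """
    if a.strand == b.strand:
        return "same strand"
    plus, minus = (a, b) if a.strand == "+" else (b, a)

    # Shared bases (positive) or, when zero or negative, the size of the gap.
    overlap = min(plus.end, minus.end) - max(plus.start, minus.start) + 1
    if overlap > 0:
        plus_contains = plus.start <= minus.start and minus.end <= plus.end
        minus_contains = minus.start <= plus.start and plus.end <= minus.end
        if plus_contains or minus_contains:
            return "completely overlapping"
        # The 5' end of a "+" transcript is its start; for "-" it is its end.
        # If the "-" transcript sits to the left, the two 5' ends overlap.
        if minus.start < plus.start:
            return "head-to-head"
        return "tail-to-tail"

    gap = -overlap
    if gap > nearby:
        return "distant"
    if minus.end < plus.start:
        return "nearby head-to-head"
    return "nearby tail-to-tail"


sense = Transcript("geneA", 100, 500, "+")
antisense = Transcript("geneA-AS", 50, 200, "-")
print(classify_pair(sense, antisense))
```

Containment is tested before the head-to-head and tail-to-tail cases so that pairs sharing an exact start or end coordinate are reported as completely overlapping rather than being assigned arbitrarily.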

Identification approach
Identification of NATs in whole genomes is possible due to the large collection of sequence data available from multiple organisms. In silico methods for detecting NATs suffer from several shortcomings depending on the source of sequence information. Studies that use mRNA have sequences whose orientations are known, but the amount of mRNA sequence information available is small. Gene models predicted by algorithms trained to look for genes give increased coverage of the genome at the cost of confidence in the identified genes. Another resource is the extensive expressed sequence tag (EST) libraries, but these short sequences must first be assigned an orientation before useful information can be extracted from them. Some studies have used specific sequence features in the ESTs, such as the poly(A) signal, poly(A) tail, and splicing sites, both to filter the ESTs and to give them the correct transcriptional orientation. Combining the different sequence sources is an attempt to maximize coverage while maintaining the integrity of the data.
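As an illustration of how a poly(A) tail can be used to orient an EST, the Python sketch below applies a deliberately simple heuristic: a read ending in a run of A's is taken to be in the sense orientation, while a read starting with a run of T's is taken to be the reverse complement of the transcript. The function names and the `min_tail` threshold of 10 bases are assumptions made for this example; the studies referred to above combine several signals and are not reproduced here.

```python
import re

# Translation table for complementing DNA bases (N is left as N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_est(seq, min_tail=10):
    """Guess EST orientation from a terminal poly(A) or leading poly(T) run.

    Returns (strand, sense_sequence). strand is "+" if the read already
    matches the transcript, "-" if it had to be reverse-complemented, and
    None if the evidence is missing or contradictory. `min_tail` is an
    assumed threshold, not a value from the literature.
    """
    seq = seq.upper()
    has_poly_a_tail = re.search(r"A{%d,}$" % min_tail, seq) is not None
    has_poly_t_lead = re.match(r"T{%d,}" % min_tail, seq) is not None

    if has_poly_a_tail and not has_poly_t_lead:
        return "+", seq
    if has_poly_t_lead and not has_poly_a_tail:
        return "-", reverse_complement(seq)
    return None, seq  # ambiguous: leave the read unoriented


strand, sense = orient_est("TTTTTTTTTTTTCCTGTAATC")
print(strand, sense)
```

Reads for which neither or both signals are present are returned unoriented, which mirrors the filtering role these features play: ESTs without usable orientation evidence would be set aside rather than guessed at.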

Pairs of NATs are identified when they form overlapping clusters. The cut-off values used vary between studies, but generally ~20 nucleotides of sequence overlap is considered the minimum for transcripts to be considered an overlapping cluster. In addition, each transcript must map to only one other mRNA molecule in order for the two to be considered a NAT pair. A variety of web and software resources can be used to look for antisense pairs. The NATsdb, or Natural Antisense Transcript database, is a rich tool for searching for antisense pairs from multiple organisms.
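The two criteria above (a minimum overlap on opposite strands, and exactly one partner per transcript) can be sketched in Python as follows. This is an illustrative outline under stated assumptions rather than the procedure of any particular study or database: the `Transcript` tuple and function names are hypothetical, coordinates are 1-based and inclusive, and the all-against-all comparison is only suitable for small inputs.

```python
from collections import defaultdict
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical mapped transcript (1-based inclusive coordinates)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


MIN_OVERLAP = 20  # approximate cut-off quoted in the text; studies vary


def overlap_length(a, b):
    """Number of genomic positions shared by two transcripts."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def find_cis_nat_pairs(transcripts, min_overlap=MIN_OVERLAP):
    """Return sorted (name, name) pairs meeting both criteria."""
    partners = defaultdict(set)
    # Simple all-against-all comparison; an interval index would scale better.
    for i, a in enumerate(transcripts):
        for b in transcripts[i + 1:]:
            if (a.chrom == b.chrom and a.strand != b.strand
                    and overlap_length(a, b) >= min_overlap):
                partners[a.name].add(b.name)
                partners[b.name].add(a.name)

    pairs = []
    for name, mates in partners.items():
        if len(mates) != 1:
            continue  # overlaps several antisense transcripts: not a clean pair
        (mate,) = mates
        if len(partners[mate]) == 1 and name < mate:
            pairs.append((name, mate))
    return sorted(pairs)


example = [
    Transcript("A", "chr1", 100, 500, "+"),
    Transcript("B", "chr1", 450, 900, "-"),    # 51 shared bases with A
    Transcript("C", "chr1", 2000, 3000, "+"),
    Transcript("D", "chr1", 2990, 3500, "-"),  # only 11 shared bases with C
]
print(find_cis_nat_pairs(example))
```

Raising or lowering `min_overlap` shows how sensitive the resulting set of pairs is to the chosen cut-off, which is one reason counts of NAT pairs differ between studies.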

Mechanisms
The molecular mechanisms behind the regulatory role of cis-NATs are not currently well understood. Three models have been proposed to explain the regulatory effects that cis-NATs have on gene expression. The first model proposes that base pairing between the cis-NAT and its complementary transcript results in a knockdown of mRNA expression. The assumption of this model is that there will be a precise alignment of at least 6 base pairs between the cis-NAT pair to make double-stranded RNA. Epigenetic modifications such as DNA methylation and post-translational modification of core histones form the basis of the second model. Although it is not yet clearly understood, it is thought that the antisense transcript guides methylation complexes and/or histone-modifying complexes to the promoter regions of the sense transcript and so inhibits expression of the gene. It is not currently known which attributes of cis-NATs are crucial for the epigenetic model of regulation. The final proposed model, which has gained favour due to recent experimental evidence, is the transcriptional collision model. During transcription of cis-NATs, the transcriptional complexes assemble in the promoter regions of the two genes. RNA polymerases then begin transcribing at the transcription initiation sites, laying down nucleotides in a 5' to 3' direction. In the areas of overlap between the cis-NATs, the RNA polymerases travelling in opposite directions collide and stop at the crash site. Transcription is inhibited because the RNA polymerases stop prematurely and their incomplete transcripts are degraded.

Importance
Regulation of biological processes such as development and metabolism requires careful co-ordination between many different genes; such a set of interacting genes is usually referred to as a gene regulatory network. A flurry of interest in gene regulatory networks has been sparked by the advent of sequenced genomes of multiple organisms. The next step is to use this information to work out how genes act together and not just in isolation. During mammalian development, the extra X-chromosome in females is inactivated. It has been shown that a NAT pair, Xist and Tsix, is involved in the hypermethylation of the chromosome. As many as 20–30% of mammalian genes have been shown to be the targets of miRNAs, which highlights the importance of these molecules as regulators across a wide range of genes. An evolutionary reason for using RNA to regulate genes may be that it is less costly and faster than synthesizing proteins not needed by the cell. This type of transcriptional regulation could have conferred a selective advantage on early eukaryotes.

Disease
Antisense transcription might contribute to disease through chromosomal changes that result in the production of aberrant antisense transcripts. A documented case of cis-NATs being involved in human disease comes from an inherited form of α-thalassemia in which the hemoglobin α-2 gene is silenced through the action of a cis-NAT. It is thought that activated transposable elements in malignant cancer cells create a large amount of transcriptional noise. Aberrant antisense RNA transcripts resulting from this transcriptional noise may cause stochastic methylation of CpG islands associated with oncogenes and tumor suppressor genes. This inhibition would further advance the malignancy of the cells, since they lose key regulator genes. By looking at upregulated antisense transcripts in tumor cells, researchers are able to look for more candidate tumor suppressor genes. Aberrant cis-NATs have also been implicated in neurological diseases such as Parkinson's disease.