User:47.55.242.32

SNP annotation
Single nucleotide polymorphism (SNP) annotation is the process of predicting the effect or function of an individual SNP using SNP annotation tools. In SNP annotation, the biological information is extracted, collected and displayed in a clear form amenable to query. SNP functional annotation is typically performed based on the available information on nucleic acid and protein sequences.

Introduction
Single nucleotide polymorphisms play an important role in genome-wide association studies because they act as primary biomarkers. SNPs are currently the marker of choice due to their large numbers in virtually all populations of individuals (24). The location of these biomarkers can be tremendously important for predicting functional significance, genetic mapping and population genetics. Each SNP represents a single-base difference in a nucleotide, the building block of DNA, between two individuals at a defined location. SNPs are the most common genetic variant found in all individuals, with one SNP every 100–300 bp in some species. Because of the tremendous number of SNPs in the genome, there is a clear need to prioritize SNPs according to their potential function in order to expedite genotyping and analysis.

Annotating large numbers of SNPs is a difficult and complex process that needs computational methods to handle such large datasets. Many tools have been developed to annotate SNP variation in different organisms. Some of them are optimized for use with organisms densely sampled for SNPs (such as humans), but there are currently few tools available that are species non-specific or support non-model organism data. The majority of SNP annotation tools provide computationally predicted putative deleterious effects of SNPs. These tools examine whether a SNP resides in functional genomic regions such as exons, splice sites, or transcription regulatory sites, and predict the potential corresponding functional effects that the SNP may have using a variety of machine-learning approaches. However, the tools and systems that prioritize functionally significant SNPs suffer from a few limitations. First, they examine the putative deleterious effects of SNPs with respect to a single biological function, which provides only partial information about the functional significance of SNPs. Second, current systems classify SNPs only into deleterious or neutral groups.

Classification of SNP annotation
Many kinds of genetic and genomic information are used for SNP annotation. Based on the features used by each annotation tool, SNP annotation methods can be roughly classified into the following categories.

Gene based annotation
Genomic information from surrounding genomic elements is among the most useful information for interpreting the biological function of an observed variant. Information from known genes is used as a reference to indicate whether the observed variant resides in or near a gene and whether it has the potential to disrupt the protein sequence and its function. Gene based annotation relies on the fact that non-synonymous mutations can alter the protein sequence and that splice site mutations may disrupt the transcript splicing pattern.
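The idea of locating a variant relative to a known gene can be illustrated with a minimal Python sketch. The Gene record, the category labels, the 2 bp splice window and the 1000 bp flanking distance are all assumptions chosen for illustration; they are not taken from any particular annotation tool, and strand and multiple transcripts are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    """Hypothetical gene model with 1-based, inclusive coordinates."""
    name: str
    start: int
    end: int
    exons: List[Tuple[int, int]]


def classify_position(pos: int, gene: Gene,
                      splice_window: int = 2, flank: int = 1000) -> str:
    """Return a coarse location label for a variant position.

    splice_window and flank are illustrative defaults, not standard values.
    """
    if pos < gene.start - flank or pos > gene.end + flank:
        return "intergenic"
    if pos < gene.start or pos > gene.end:
        return "near_gene"
    for ex_start, ex_end in gene.exons:
        if ex_start <= pos <= ex_end:
            return "exonic"
    # The position is inside the gene but outside every exon, so it is
    # intronic; report it as a splice site if it lies next to an exon.
    for ex_start, ex_end in gene.exons:
        if 0 < ex_start - pos <= splice_window or 0 < pos - ex_end <= splice_window:
            return "splice_site"
    return "intronic"


gene = Gene("G", 100, 500, [(100, 200), (300, 500)])
for p in (150, 201, 250, 50, 2000):
    print(p, classify_position(p, gene))
```

A real annotator would resolve overlapping transcripts and report one label per transcript; this sketch only shows the interval logic.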

Knowledge based annotation
Knowledge based annotation is done based on information about gene attributes, protein function and metabolism. In this type of annotation, more emphasis is given to genetic variation that disrupts protein functional domains, protein-protein interactions and biological pathways. The non-coding region of the genome contains many important regulatory elements, including promoters, enhancers and insulators; any kind of change in these regulatory regions can affect the functionality of the corresponding protein. A mutation in DNA can change the RNA sequence and thereby influence RNA secondary structure, RNA binding protein recognition and miRNA binding activity.

Functional annotation
This method mainly identifies variant function based on whether the variant loci lie in known functional regions that harbor genomic or epigenomic signals. The functions of non-coding variants are extensive in terms of the affected genomic regions, and they are involved in almost all processes of gene regulation, from the transcriptional to the post-translational level.

Transcriptional gene regulation
The transcriptional gene regulation process depends on many spatial and temporal factors in the nucleus, such as global or local chromatin states, nucleosome positioning, transcription factor (TF) binding, and enhancer/promoter activities. Variants that alter the function of any of these biological processes may alter gene regulation and cause phenotypic abnormality. Genetic variants located in distal regulatory regions can affect the binding motifs of TFs, chromatin regulators and other distal transcriptional factors, which disturbs the interaction between an enhancer/silencer and its target gene.
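One common way to estimate whether a variant affects a TF binding motif is to score the reference and alternate sequences against a position frequency matrix and compare the results. The sketch below assumes a made-up 4 bp motif with consensus ACGT and a uniform background; the matrix values and function names are hypothetical and serve only to show the calculation.

```python
import math

# Hypothetical position frequency matrix (one dict per motif position).
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base composition


def log_odds_score(seq, pfm=PFM):
    """Sum of per-position log2(frequency / background) for seq."""
    if len(seq) != len(pfm):
        raise ValueError("sequence length must equal motif length")
    return sum(math.log2(col[base] / BACKGROUND)
               for base, col in zip(seq.upper(), pfm))


def motif_score_change(ref_seq, alt_seq, pfm=PFM):
    """Alternate minus reference score; negative values suggest weaker binding."""
    return log_odds_score(alt_seq, pfm) - log_odds_score(ref_seq, pfm)


# A C>T change at the second motif position lowers the score.
print(round(motif_score_change("ACGT", "ATGT"), 3))
```

In practice the matrix would come from a curated motif collection and the variant would be scored at every offset where it overlaps the motif.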

Alternative splicing
Alternative splicing is one of the most important components of the functional complexity of the genome. Modified splicing has a significant effect on phenotypes relevant to disease or drug metabolism. A change in splicing can be caused by modifying any of the components of the splicing machinery, such as splice sites, splice enhancers or silencers. Modification of an alternative splicing site can lead to a different protein form with a different function. Humans use an estimated 100,000 different proteins or more, so some genes must be capable of coding for far more than one protein. Alternative splicing occurs more frequently than was previously thought and can be hard to control; genes may produce tens of thousands of different transcripts, necessitating a new gene model for each alternative splice.

RNA processing and post transcriptional regulation
Mutations in the untranslated region (UTR) affect many post-transcriptional regulatory processes. Distinctive structural features are required for many RNA molecules and cis-acting regulatory elements to function effectively during gene regulation. Single nucleotide variants (SNVs) can alter the secondary structure of RNA molecules and thereby disrupt the proper folding of RNAs, such as tRNA/mRNA/lncRNA folding, and miRNA binding recognition regions.
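As a simplified illustration of how an SNV might disrupt a miRNA binding recognition region, the sketch below checks whether a single-base change in a UTR creates or destroys a perfect match to a miRNA seed. It assumes the seed is nucleotides 2 to 8 of the miRNA and that a target site is the exact reverse complement of that seed; real predictors use more elaborate rules. The function names and the toy UTR are hypothetical, and the miRNA string is used only as example input.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """Target-site sequence (5'->3' on the mRNA) pairing with the miRNA seed."""
    seed = mirna.upper().replace("T", "U")[1:8]  # nucleotides 2-8
    return seed.translate(COMPLEMENT)[::-1]      # reverse complement


def seed_match_effect(utr, index, alt_base, mirna):
    """Compare seed-site presence before and after a substitution.

    index is a 0-based offset into utr; returns 'site_lost',
    'site_gained' or 'no_change'.
    """
    ref = utr.upper().replace("T", "U")
    alt = ref[:index] + alt_base.upper().replace("T", "U") + ref[index + 1:]
    site = seed_site(mirna)
    had, has = site in ref, site in alt
    if had and not has:
        return "site_lost"
    if has and not had:
        return "site_gained"
    return "no_change"


MIRNA = "UGAGGUAGUAGGUUGUAUAGUU"  # example input only
print(seed_site(MIRNA))
print(seed_match_effect("AAACUACCUCAAA", 5, "G", MIRNA))
```

The check is deliberately binary; it ignores site context, pairing outside the seed and RNA secondary structure.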

Translation and post translational modifications
Single nucleotide variants can also affect the cis-acting regulatory elements in mRNAs to inhibit or promote translation initiation. Changes between synonymous codons due to mutation may affect translation efficiency because of codon usage biases. Translation elongation can also be retarded by mutations along the ramp of ribosomal movement. At the post-translational level, genetic variants can contribute to proteostasis and amino acid modifications. However, the mechanisms of variant effect in this field are complicated, and there are only a few tools available to predict a variant’s effect on translation-related modifications (1).

Protein function
Non-synonymous variants are variants in exons that change the amino acid sequence encoded by the gene, including single base changes and non-frameshift indels. The effect of non-synonymous variants on proteins has been extensively investigated, and many algorithms have been developed to predict the deleteriousness and pathogenicity of single nucleotide variants (SNVs) [65]. Classical bioinformatics tools, such as SIFT [66], PolyPhen [67] and MutationTaster [68], have been used successfully to predict the functional consequences of non-synonymous substitutions.
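Before any deleteriousness prediction, a coding SNV first has to be recognized as synonymous or non-synonymous. The sketch below does this for a single codon using the standard genetic code; the function name and the returned labels are illustrative assumptions, and it does not reproduce the scoring performed by SIFT, PolyPhen or MutationTaster.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_coding_snv(codon, offset, alt_base):
    """Classify a single-base substitution inside one codon.

    codon is the reference DNA codon, offset is 0, 1 or 2, and
    alt_base is the substituted base.
    """
    codon, alt_base = codon.upper(), alt_base.upper()
    if len(codon) != 3 or codon not in CODON_TABLE:
        raise ValueError("codon must be three bases from ACGT")
    if offset not in (0, 1, 2) or alt_base not in BASES:
        raise ValueError("invalid offset or alternate base")
    if codon[offset] == alt_base:
        raise ValueError("alternate base equals reference base")
    alt_codon = codon[:offset] + alt_base + codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop_gained"
    if ref_aa == "*":
        return "stop_lost"
    return "missense"


print(classify_coding_snv("GAA", 2, "G"))  # Glu -> Glu
print(classify_coding_snv("GAG", 1, "T"))  # Glu -> Val
print(classify_coding_snv("TGG", 2, "A"))  # Trp -> stop
```

Only the missense and stop-altering classes would then be passed on to a tool that predicts the functional impact of the amino acid change.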

Evolutionary conservation and natural selection
Comparative genomics approaches are used to predict function-relevant variants under the assumption that a functional genetic locus should be conserved across different species over an extensive phylogenetic distance. On the other hand, some adaptive traits and population differences are driven by positive selection of advantageous variants, and these genetic mutations are functionally relevant to population-specific phenotypes. Functional prediction of variants’ effects in different biological processes is pivotal for pinpointing the molecular mechanisms of diseases and traits and for directing experimental validation (1).
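The conservation assumption can be made concrete with a simple per-column score over a multiple sequence alignment. The sketch below uses one minus the normalized Shannon entropy of the bases in each column, which is only one of several possible measures and ignores phylogeny entirely; the function names and the toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter


def column_conservation(column):
    """Return 1 - H/2 for a DNA alignment column (1 = invariant, 0 = uniform).

    Gaps and ambiguous characters are ignored; H is Shannon entropy in bits.
    """
    counts = Counter(b for b in column.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / 2.0  # 2 bits is the maximum for four bases


def conservation_profile(aligned):
    """Score every column of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_conservation("".join(col)) for col in zip(*aligned)]


# Toy alignment of one site across four hypothetical species.
alignment = ["ACGT", "ACGA", "ACGC", "ACGG"]
print(conservation_profile(alignment))
```

A variant falling in a column with a score near 1 would be prioritized under the conservation assumption, while tests for positive selection require population-level data that this sketch does not model.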