Talk:Protein Analysis Subcellular Localization Prediction

The PA-Subcellular Prediction Server
PA-SUB, the subcellular localization server at the University of Alberta, predicts the subcellular localization of proteins for five categories of organisms: animals, plants, fungi, Gram-negative bacteria and Gram-positive bacteria. The classifiers are 81% accurate for fungi and 92–94% accurate for the other four categories. The group that created the software claims that it is the most accurate subcellular localization predictor published to date. A comparison of the accuracy and representative coverage of current subcellular localization predictors is available on the server's website.

The Prediction Process
The PA Subcellular Localization Prediction Server uses a Naïve Bayes classifier to weigh up to 15 pieces of evidence and predict where a protein should be located, using only the protein sequence as its input. The server performs a BLAST search against the Swiss-Prot database, identifies sequence homologues, and extracts data "tokens" from their Swiss-Prot entries to assemble the information it needs. PA bases its predictions on well-understood concepts of conditional probability. Its explanations are presented as stacked bar graphs that clearly display the evidence for each prediction.
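To make the conditional-probability idea concrete, here is a minimal sketch of a Naïve Bayes classifier over annotation tokens. It is not the server's actual code: the token strings, the toy training set, the function names and the Laplace smoothing constant are all assumptions made for illustration, and the BLAST and Swiss-Prot lookup steps are left out entirely.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (tokens taken from homologue annotations, known location).
TRAINING = [
    (["signal peptide", "secreted"], "extracellular"),
    (["signal peptide", "glycoprotein"], "extracellular"),
    (["dna-binding", "transcription"], "nucleus"),
    (["nuclear protein", "dna-binding"], "nucleus"),
    (["transit peptide", "electron transport"], "mitochondrion"),
]


def train(examples, alpha=1.0):
    """Count how often each token occurs in proteins of each location."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        for tok in set(tokens):  # presence/absence only (Bernoulli model)
            token_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "tokens": token_counts,
            "vocab": vocab, "alpha": alpha}


def predict(model, tokens):
    """Return P(location | tokens) for every location seen in training."""
    total = sum(model["labels"].values())
    alpha = model["alpha"]
    present = set(tokens) & model["vocab"]  # tokens never seen in training are ignored
    log_scores = {}
    for label, n in model["labels"].items():
        score = math.log(n / total)  # prior P(location)
        for tok in model["vocab"]:
            # Smoothed estimate of P(token present | location).
            p = (model["tokens"][label][tok] + alpha) / (n + 2 * alpha)
            score += math.log(p if tok in present else 1.0 - p)
        log_scores[label] = score
    # Convert log scores to normalised posterior probabilities.
    top = max(log_scores.values())
    unnorm = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}


model = train(TRAINING)
posterior = predict(model, ["dna-binding", "zinc-finger"])
best = max(posterior, key=posterior.get)
```

Each token contributes one multiplicative factor to the score of each location, so the per-token contributions can be shown side by side. That is presumably what makes an evidence display such as the server's stacked bar graphs possible.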

Personal Review
Some months back, I attended a scientific presentation on genomics, and one of the questions put to the presenter stuck with me. The question was so simple that it was difficult to answer: where is the "metabolite" located? The questioner meant the specific place within the cell to which the metabolite is transported. The answer can come either from experiment or from prediction with available software. The former takes months; the latter, minutes. But of course, a prediction is always a PREDICTION!
After several months of working in bioinformatics, I came across this open-access online tool, the Protein Analysis Subcellular Localization Prediction server. I guess it answers one of the questions raised during that presentation. So I ran several sample analyses in silico on the server, using entries such as oxalate oxidase and glutamate dehydrogenase. Processing is very quick: results for a single entry come back in less than 10 seconds, and the server can also accept multiple sequences, which is an advantage over other servers.
Compared with PSORT, PA-SUB's interface is better organized and its results page is easier to understand. PSORT takes longer to return results, at least 40 seconds, and its results page is not easy to interpret. Another subcellular prediction server is available at SubLoc. Like PSORT, it accepts only a single sequence at a time. Another server, TMHMM, available at the Center for Biological Sequence Analysis, predicts transmembrane helices, which bears on membrane localization. Compared with these other online localization prediction tools, the one hosted on the University of Alberta website is my preferred server because of its accuracy, reduced computational cost, simple and organized interface, and easy-to-interpret results. Even beginners can easily find their way around the server.
Additionally, the PA Subcellular Localization website provides extensive statistics about its performance and comparisons with other web servers. Its novel use of machine learning techniques and its blended approach, which combines local sequence information, homology and global property information, have allowed it to attain a level of predictive accuracy that rivals even experimentally derived data (Wishart). Given these features, it is perhaps not surprising to suggest that the PA Subcellular Localization server should be the only tool needed to find out where proteins are found within a cell.