User talk:Wuiastate

Note: ACT Comparison File
(ACT: the Artemis comparison tool, http://bioinformatics.oxfordjournals.org/content/21/16/3422.full)

S. Saengamnatdej, April 28, 2010

My note on how to make an ACT comparison file.

1. Get a copy of the stand-alone version of BLAST (Windows XP or Linux) from the NCBI website.
2. Both operating systems use the same command lines. On Windows XP, run them in a DOS window; on Linux, in an X terminal window.
3. Download the sequences in EMBL format from the EBI genomes web page and save them with the 'embl' extension. (With the SRS server, however, the sequence can be requested directly in FASTA format.)
4. Use Artemis to convert the EMBL file to FASTA format (load the file, then File > Write > All Bases > FASTA format, and save it with the 'dna' extension).
5. In the directory containing the sequence files, designate one sequence as the database with the following command, then press Enter:
    formatdb -i seqA.dna -p F
   (formatdb is the programme, -i gives the input file, and -p designates the sequence type: F for DNA, T for protein.)
6. Run BLAST. Use blastall to run blastn, tblastx, blastp, or blastx with an appropriate input data file (DNA or protein). Of these, blastn and tblastx are suitable for generating ACT comparison files. If very large sequences (several Mb) are being compared, megablast should be used instead of blastall. To run blastn through blastall, use the following command, then press Enter:
    blastall -p blastn -m 8 -d seqA.dna -i seqB.dna -o seqA_vs_seqB
   (tblastx can be used in place of blastn. blastall is the BLAST programme, -p specifies the type of BLAST, -m 8 selects the ACT-readable output, -d gives the database sequence, -i gives the query sequence, and -o names the output file.)
7. The output file can now be loaded into ACT along with the two dna and embl files.
8. For larger files with high similarity, megablast is preferred over blastall. It can perform only DNA-DNA alignments (blastn), not translated DNA-DNA alignments (tblastx). Megablast is suited to comparing highly similar sequences, such as those from the same or very closely related species. Because the default output format from megablast is one line per entry, which ACT can read, there is no need to add '-m 8'. The command therefore looks like this (again, press Enter at the end of the command):
    megablast -d seqBigA.dna -i seqBigB.dna -o SeqBigA_vs_SeqBigB

Reference: The Wellcome Trust Manual of the Workshop on Working with Pathogenic Genomes.
/*http://icb.med.cornell.edu/wiki/index.php/Elementolab/BWA_tutorial*/
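
The formatdb, blastall and megablast commands in the note could also be assembled by a small script when many sequence pairs need comparing. The following is only a rough sketch in Python, which is not part of the original note. The function names `act_comparison_commands` and `run_commands` and the `large` switch are made up for illustration, and it assumes the legacy NCBI BLAST executables are on the PATH. The flags are exactly the ones given in the note.

```python
import shutil
import subprocess


def act_comparison_commands(database, query, output, program="blastn", large=False):
    """Return the command lines (as argument lists) for one ACT comparison file.

    Hypothetical helper: 'large' selects megablast instead of blastall.
    """
    if program not in ("blastn", "tblastx"):
        raise ValueError("ACT comparison files are made with blastn or tblastx")
    if large and program != "blastn":
        raise ValueError("megablast only performs DNA-DNA (blastn) alignments")

    # Designate one sequence as the database; -p F marks it as DNA.
    commands = [["formatdb", "-i", database, "-p", "F"]]

    if large:
        # megablast's default output is already readable by ACT, so no -m 8.
        commands.append(["megablast", "-d", database, "-i", query, "-o", output])
    else:
        # -m 8 requests the output format that ACT can read.
        commands.append(["blastall", "-p", program, "-m", "8",
                         "-d", database, "-i", query, "-o", output])
    return commands


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} not found on PATH")
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the commands only; call run_commands(...) to execute them.
    for argv in act_comparison_commands("seqA.dna", "seqB.dna", "seqA_vs_seqB"):
        print(" ".join(argv))
```

Run as is, the script only prints the two command lines for seqA.dna and seqB.dna, so they can be checked before anything is executed. Passing `large=True` switches the second command to megablast and drops '-m 8'.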