How to Blast sequences against a genome

1. Open a DOS window (e.g. with the Run command).

2. Type the following command to run Blast:

        blastall -pProgram -dDatabaseName -iContigFile -oFilename -eE-value

    For example, to blast a set of protein sequences against a database:

        blastall -pblastp -dK12-Prot -iO157prot.fa -oK12VsED.txt -e.00001
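    If you would rather start the search from a script than type the command by hand, the same call can be put together in Python. This is only a sketch: the function names build_blastall_command and run_blast are made up for illustration, and it assumes that blastall is on your PATH and that the database has already been set up. The flags are the ones shown above, written with a space between each flag and its value.

```python
import subprocess


def build_blastall_command(program, database, query_file, output_file, evalue):
    """Return the blastall argument list for the flags used in step 2."""
    return [
        "blastall",
        "-p", program,        # which Blast program to run, e.g. blastp
        "-d", database,       # database name
        "-i", query_file,     # file of query sequences
        "-o", output_file,    # where the results are written
        "-e", str(evalue),    # E-value cutoff
    ]


def run_blast(command):
    """Run the command and raise an error if blastall reports failure."""
    subprocess.run(command, check=True)


# Same search as the example above. Printing the command is harmless;
# call run_blast(example) yourself when you are ready to wait for it.
example = build_blastall_command(
    "blastp", "K12-Prot", "O157prot.fa", "K12VsED.txt", "0.00001"
)
print(" ".join(example))
```

    Passing the E-value as text ("0.00001") keeps it exactly as typed instead of letting Python reformat the number.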

3. Be prepared to wait a while. With thousands of proteins, the output may take hours to appear. The program gives no indication of its progress; it simply returns you to the DOS prompt (>) when it's done.
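
    Since blastall itself stays silent, one rough way to gauge progress is to compare the number of sequences in the query file with the number of "Query=" lines written to the output so far. The sketch below can be run from a second DOS window. It assumes the query file is in FASTA format (one ">" line per sequence), that each finished query contributes one line starting with "Query=", and that the output file is filled in as the search goes along (the last few results may still be sitting in a buffer). The function names are hypothetical.

```python
def count_lines_starting_with(path, prefix):
    """Count the lines of a text file that begin with prefix, one line at a time."""
    count = 0
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.startswith(prefix):
                count += 1
    return count


def blast_progress(query_fasta, output_file):
    """Return (queries reported so far, total queries in the FASTA file)."""
    total = count_lines_starting_with(query_fasta, ">")
    done = count_lines_starting_with(output_file, "Query=")
    return done, total


# Usage, with the file names from the example in step 2:
#   done, total = blast_progress("O157prot.fa", "K12VsED.txt")
#   print(done, "of", total, "queries reported so far")
```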

4. With thousands of proteins blasted against thousands of proteins, you should expect several tens of megabytes of output.

5. How do you know whether the program worked? If you have a large output file (i.e. dozens of megabytes), don't try to read it into something like Word (you risk choking it). I don't think that Microsoft has any solution for us, but there is an ancient freeware program from the pre-Windows era that will do the job: DR (standing for DiRectory). Download it and put it in the Blast directory.

6. To run DR, type DR at a DOS prompt to get a list of files in \Blast, then press the F10 key to sort the files by date of creation, then press the End key to go to the end of the list. You should see the file you just made. Press the Enter key to see the contents of the file (you can scroll through the file using the usual keys).
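
    If DR is not at hand, another way to peek at a huge file without choking anything is to read only its first few lines. A minimal sketch, assuming Python is installed; first_lines is a hypothetical helper, not part of Blast or DR.

```python
from itertools import islice


def first_lines(path, n=20):
    """Return the first n lines of a text file without reading the rest of it."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


# Usage, with the output file from the example in step 2:
#   for line in first_lines("K12VsED.txt", 15):
#       print(line)
```

    Because the file is read one line at a time and reading stops after n lines, this takes the same small amount of memory whether the file is a few kilobytes or dozens of megabytes.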

7. However you look at the output file, you should see something like:

        BLASTP 2.1.3 [Apr-1-2001]


        Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A.
        Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
        Lipman (1997),
        "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

        Query= ...

If so, you win!
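
The same check can be done by a script instead of by eye. The sketch below assumes that a good output file begins with a banner line naming the program (BLASTP in the sample above) and contains at least one line starting with "Query=". The function name looks_like_blast_output is made up for illustration, and passing this check says only that the run started properly, not that every query was finished.

```python
def looks_like_blast_output(path, program="BLASTP"):
    """True if the first non-blank line starts with program and a 'Query=' line follows."""
    saw_banner = False
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if not saw_banner:
                if not line.strip():
                    continue              # skip blank lines before the banner
                if not line.startswith(program):
                    return False          # first real line is not the banner
                saw_banner = True
            elif line.startswith("Query="):
                return True               # stop early; no need to read the rest
    return False


# Usage, with the output file from the example in step 2:
#   if looks_like_blast_output("K12VsED.txt"):
#       print("You win!")
```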