What's The Easiest Way To BLAST 5000 Sequences Against An Exon Fragment?
13.1 years ago
John ▴ 790

I have 5000 EST sequences that I downloaded from Genbank. I have a 500 bp exon fragment that I'd like to BLAST against this lot of sequences to see if my exon is present or not. What's the easiest way to do this?

I thought that there might be a way to do this through NCBI directly without even having to download the sequences (e.g. an option to 'BLAST these results'), but I can't find one. My next option was to download BLAST, set up a local database, import the 5000 sequences and then BLAST my exon against that.

Is there an easier option?

Thanks, John

blast est • 3.9k views

yes, they are the same species.


How did you select these 5000 ESTs in the first place? Are they from the same (or a closely related) species as your exon fragment?


You can use http://sequenceserver.com to make it easier to set up the custom blast database and run searches...

13.1 years ago

In this case, I wouldn't use BLAST. Exonerate can be far better suited to this kind of task and doesn't need a database setup. As suggested by Pierre, it can do an exhaustive search efficiently, and it offers many alignment models (e.g. protein to DNA, EST to genome). It's fast and flexible, with a plethora of options. The only inconvenience is its output. If you want to try it, learn how to use the --ryo option.
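As a rough illustration of the --ryo option mentioned above, here is a minimal Python sketch that only builds an exonerate command line. The file names (exon.fasta, ests.fasta), the choice of the affine:local model, and the particular --ryo fields are my assumptions, not something from this thread, so check them against the exonerate man page for your installed version.

```python
import shlex

def exonerate_command(query_fasta, target_fasta, model="affine:local"):
    """Build an exonerate command line as an argument list (nothing is executed)."""
    # --ryo ("roll your own") defines one custom output line per alignment:
    # query id, target id, query begin/end, target begin/end, percent identity, raw score.
    ryo = r"%qi\t%ti\t%qab\t%qae\t%tab\t%tae\t%pi\t%s\n"
    return [
        "exonerate",
        "--model", model,
        "--query", query_fasta,    # hypothetical file holding the 500 bp exon
        "--target", target_fasta,  # hypothetical file holding the 5000 ESTs
        "--showalignment", "no",   # suppress the default human-readable alignment
        "--showvulgar", "no",      # suppress the default vulgar line
        "--ryo", ryo,
    ]

cmd = exonerate_command("exon.fasta", "ests.fasta")
print(shlex.join(cmd))
# To actually run it (requires exonerate on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Turning off the default alignment and vulgar output leaves just the tab-separated --ryo lines, which are much easier to filter or load into a spreadsheet.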


+1, I once used exonerate for a similar problem (mapping ESTs to mRNAs).

13.1 years ago

(I would download the executable version of BLAST.) But are you sure you want to use BLAST? 5000 sequences is not that many, so why not run a Smith-Waterman alignment?
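To show what the Smith-Waterman idea amounts to, here is a small score-only sketch in pure Python. The scoring values (match 2, mismatch -1, linear gap -2) and the toy sequences are assumptions for illustration only, and it searches the forward strand only. For 5000 real ESTs you would probably want an optimized implementation rather than this loop.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best = curr[j]
        prev = curr
    return best

def rank_ests(exon, ests):
    """Score the exon against every EST; return (name, score) pairs, best first."""
    scores = [(name, smith_waterman_score(exon, seq)) for name, seq in ests.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy data (made up): est1 contains the exon exactly, est2 does not.
exon = "ACGTTGCA"
ests = {"est1": "TTTACGTTGCATTT", "est2": "GGGGGGGG"}
ranking = rank_ests(exon, ests)
print(ranking)
```

An EST that contains the exon scores close to match × exon length, so a simple score threshold is enough to separate likely hits from background. ESTs can be sequenced from either strand, so a real run should also score the reverse complement of each EST.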

13.1 years ago

In general, it is far more efficient to run the one exon sequence against the 5000 ESTs than to run 5000 searches against a single sequence. As a rule, and especially for BLAST, the smaller set of sequences should be your query and the larger set your database.

Both answers by Jarretinha and Pierre are good. This does seem like an important set of ESTs to you and so you're likely to use them again in other searches. You wouldn't want to always run the 5000 against the newest idea or exon you have on hand.
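If you do go the local BLAST route, the "ESTs as database, exon as query" setup could look roughly like the sketch below. It assumes the BLAST+ command-line tools (makeblastdb, blastn); the file names, the database name, and the E-value cutoff are placeholders I made up, and the sample result line is invented just to show the parsing.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

def makeblastdb_command(est_fasta, db_name):
    """Command to build a reusable nucleotide database from the EST FASTA file."""
    return ["makeblastdb", "-in", est_fasta, "-dbtype", "nucl", "-out", db_name]

def blastn_command(query_fasta, db_name, out_tsv, evalue="1e-10"):
    """Command to search the exon (query) against the EST database."""
    return [
        "blastn", "-query", query_fasta, "-db", db_name,
        "-evalue", evalue, "-outfmt", "6", "-out", out_tsv,
    ]

def parse_outfmt6(text):
    """Parse tabular BLAST output into a list of dicts keyed by column name."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [dict(zip(OUTFMT6_COLUMNS, row)) for row in reader if row]

db_cmd = makeblastdb_command("ests.fasta", "est_db")         # hypothetical names
search_cmd = blastn_command("exon.fasta", "est_db", "hits.tsv")
print(" ".join(db_cmd))
print(" ".join(search_cmd))

# Invented example line, only to demonstrate the parser.
sample = "exon1\tEST0042\t99.20\t500\t4\t0\t1\t500\t120\t619\t0.0\t902\n"
hits = parse_outfmt6(sample)
print(hits[0]["sseqid"], hits[0]["pident"])
```

The database only has to be built once. After that, each new exon or idea is just another blastn run against the same est_db, which is the reuse argument made above.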
