Process 50k+ sequences with Blastx for Blast2Go
9.8 years ago
satshil.r ▴ 50

Hey all,

I'm trying to process a large transcriptome FASTA file with just over 53k sequences. I set up a local BLAST+ instance with the latest nr database (volumes up to nr.25) and started blastx with the following command: blastx -query fasta.fa -out blastx.xml -outfmt 5 -evalue 1e-3 -num_threads 32

So far it has processed only 3500 sequences in 2 days. It's a fairly decent workstation: 2x Xeon E2560 V2's with 128GB of RAM. From our previous experience this shouldn't take more than a few days, but at this rate it looks like it will take much longer. The output is also quite large for only 3500 sequences; it's already at 1.5GB.

How can I optimize blastx for importing into blast2go? I'm currently reading up on how to parallelize blastx, but I'm not sure if there are better options out there.

Thanks!

blastall blast2go blastx • 3.9k views

I started blastx with the following command: blastx -query fasta.fa -out blastx.xml -outfmt 5 -evalue 1e-3 -num_threads 32

How is that possible? You didn't even define a db. Also, it would make sense to opt for refseq_protein over nr, since the non-RefSeq sequences in nr probably can't be linked to GO terms anyway (I could be wrong).


Sorry, I left that option out when writing this post. The command I actually ran used the nr database.

9.8 years ago
rtliu ★ 2.2k

My suggestions, based on page 10 of the Blast2GO manual:

  1. use -max_target_seqs 10
  2. use -word_size 5 (the default of 3 is more sensitive but slower)

It is a big task, so be patient or find a cluster such as TACC (https://wikis.utexas.edu/display/bioiteam/split_blast).
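For anyone who wants to try the split-and-run approach on a single machine or a cluster, here is a minimal sketch of the idea. It is not the TACC split_blast script: the function names, the chunk file naming, and the default thread count are my own assumptions. It only uses blastx options already mentioned in this thread, plus -db, and it builds the command lines without running them.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir, n_chunks):
    """Split a FASTA file into at most n_chunks files, round-robin by record.

    Returns the paths of the non-empty chunk files (hypothetical naming:
    chunk_000.fa, chunk_001.fa, ...).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"chunk_{i:03d}.fa" for i in range(n_chunks)]
    handles = [p.open("w") for p in paths]
    try:
        record = -1  # index of the current record; -1 until the first header
        with open(fasta_path) as fasta:
            for line in fasta:
                if line.startswith(">"):
                    record += 1
                if record >= 0:
                    # Header and sequence lines of a record go to the same chunk.
                    handles[record % n_chunks].write(line)
    finally:
        for handle in handles:
            handle.close()

    # Remove chunks that received no records (fewer records than chunks).
    kept = []
    for path in paths:
        if path.stat().st_size > 0:
            kept.append(path)
        else:
            path.unlink()
    return kept


def blastx_command(chunk, db="nr", threads=4):
    """Build (but do not run) a blastx argument list for one chunk."""
    chunk = Path(chunk)
    return [
        "blastx",
        "-query", str(chunk),
        "-db", db,
        "-out", str(chunk.with_suffix(".xml")),
        "-outfmt", "5",            # XML output, as in the original command
        "-evalue", "1e-3",
        "-max_target_seqs", "10",  # suggestion 1 above
        "-num_threads", str(threads),
    ]
```

Something like `for chunk in split_fasta("fasta.fa", "chunks", 8): print(" ".join(blastx_command(chunk)))` prints one command per chunk, which can then go to a job scheduler or be run a few at a time. Each chunk produces its own XML file, and a chunk that fails can be rerun without repeating the whole search.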


Thanks for the help!

How long do you expect it should take to finish a job like this?


Hard to say, maybe 5-10 weeks if the workstation is dedicated to the blastx search.
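As a rough sanity check, the numbers from the question (3500 sequences done in 2 days, about 53k in total) can be extrapolated linearly. This is only a sketch: it assumes a constant rate, which real searches will not have, since run time depends on sequence length and the number of hits, and the total of 53000 is a round-number assumption.

```python
def estimate_total_days(done, elapsed_days, total):
    """Linear extrapolation: total run time in days at the observed rate."""
    rate_per_day = done / elapsed_days
    return total / rate_per_day


days = estimate_total_days(done=3500, elapsed_days=2, total=53000)
print(f"{days:.1f} days (~{days / 7:.1f} weeks)")
# prints "30.3 days (~4.3 weeks)"
```

Treat the result as an order-of-magnitude figure rather than a prediction.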
