Biostar Beta. Not for public use.
Question: standalone blastp: increasing word size extremely slows down the search

Hello,

I need to run blastp for one genome (15,000 sequences) against another genome (12,000 sequences) using Biopython. I decided to use local BLAST and query the genome 1 FASTA file against a genome 2 database (made with the makeblastdb command from the second genome's FASTA file). The search works with the default parameters of standalone blastp. However, when I change the word size to a BIGGER value (the default is 3 and I set it to 6), BLAST runs extremely slowly. I am confused about why this happens, because increasing the word size is supposed to make things go faster. Here is how I pass arguments to the NcbiblastpCommandline function:

NcbiblastpCommandline( word_size=6, query=queryInputPath, db=subjectInputPath, out=outputPath, outfmt=5 )()

Things are much faster when the call does not include the 'word_size=6' keyword argument. Without word_size=6 it takes about 1.5 h to run the search. My Mac has 4 GB of RAM and a 1.6 GHz Intel Core i5 processor. What may be the cause?

2.5 years ago, aleksanderczeszyk • updated 2.5 years ago by Istvan Albert

Check that you're not running out of memory.
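
One rough way to check this from Python is to run the search as a child process and read its peak memory afterwards. This is only a sketch: the helper name peak_child_memory_mb is made up for illustration, and it relies on the Unix-only resource module, which should be available on a Mac.

```python
import resource
import subprocess
import sys


def peak_child_memory_mb(cmd):
    """Run cmd to completion; return (exit code, peak child memory in MB).

    The figure is the largest resident set size among child processes
    that this Python process has waited for so far.
    """
    completed = subprocess.run(cmd, check=False)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return completed.returncode, peak / divisor
```

For example, calling it with ["blastp", "-query", "genome1.fasta", "-db", "genome2_db", "-word_size", "6", "-outfmt", "5", "-out", "result.xml"] (placeholder file names) and comparing the result with the same command without -word_size would show whether the peak gets close to the 4 GB of physical RAM.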

Jean-Karim Heriche, 2.5 years ago

With 4 GB of RAM, that is very likely.

genomax, 2.5 years ago

You may be able to save some overhead if you run BLAST directly from the command line, although probably not a meaningful amount. You could also try splitting the database into multiple parts; just make sure you manually set the statistical options (e.g. dbsize). You'll have to do some post-BLAST work to find the best hits, but this should get you around the memory issues.
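
The split-and-merge idea could look roughly like the sketch below. The assumptions here are mine, not part of the original suggestion: the function names are hypothetical, the output is tabular (-outfmt 6) rather than the XML used in the question, -dbsize is set to the total residue count of the unsplit database, and "best hit" means the highest bit score per query.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_fasta(path, n_parts, out_prefix):
    """Write records round-robin into n_parts files.

    Returns (list of part paths, total number of residues in the input).
    """
    paths = [f"{out_prefix}.part{i}.fasta" for i in range(n_parts)]
    handles = [open(p, "w") for p in paths]
    total = 0
    try:
        for i, (header, seq) in enumerate(read_fasta(path)):
            handles[i % n_parts].write(f">{header}\n{seq}\n")
            total += len(seq)
    finally:
        for handle in handles:
            handle.close()
    return paths, total


def blastp_command(query, db, out, dbsize, word_size=None):
    """Build a blastp argument list with tabular output and a fixed -dbsize."""
    cmd = ["blastp", "-query", query, "-db", db, "-out", out,
           "-outfmt", "6", "-dbsize", str(dbsize)]
    if word_size is not None:
        cmd += ["-word_size", str(word_size)]
    return cmd


def best_hits(tabular_paths):
    """Keep the highest-bit-score hit per query across several outfmt 6 files."""
    best = {}
    for path in tabular_paths:
        with open(path) as handle:
            for line in handle:
                fields = line.rstrip("\n").split("\t")
                if len(fields) < 12:
                    continue  # skip blank or malformed lines
                # Default outfmt 6 columns: query is 1st, subject 2nd, bit score 12th.
                query, subject, bitscore = fields[0], fields[1], float(fields[11])
                if query not in best or bitscore > best[query][1]:
                    best[query] = (subject, bitscore)
    return best
```

Each part file would still need its own makeblastdb run (with -dbtype prot) before searching. Each command from blastp_command can then be passed to subprocess.run, and the per-part outputs combined with best_hits.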

pld, 2.5 years ago

I would recommend using BLAST replacements such as DIAMOND or PAUDA:

https://ab.inf.uni-tuebingen.de/software/diamond

https://ab.inf.uni-tuebingen.de/software/pauda

Istvan Albert, 2.5 years ago

Just to be clear: there is no point in trying to use these tools on the machine described in the original post.

genomax, 2.5 years ago
