Hi there,
Geneious has the wonderful ability to sort BLAST hits into two bins, hit or no-hit. This is quite useful for, e.g., separating metagenomes (host reads vs. endosymbiont reads). While Geneious works very well with smaller data sets on a local computer, large data sets require a lot of patience. Is there any way to use command-line BLAST in the same way, so that one ends up with two files of sequence IDs or, even better, two files with hit vs. no-hit sequences? There is no need at all to produce all those alignments.
I thought about a similar approach with BBMap, but because in our case there is no reference genome from the same species, blasting the hits against a somewhat related genome might be the better approach.
Thank you for your help, JD
You can post-process your BLAST results if you use -outfmt 7. That format includes queries with no hits in the output (they are reported with a "# 0 hits found" comment line). If you use -outfmt 6, the output includes only those query IDs that have hits; you could compare that list against the full set of query IDs to find the ones that don't.
With BBMap you can collect the reads that don't align by using the outu= option.
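As a rough illustration of the -outfmt 7 post-processing idea, here is a minimal Python sketch that sorts query IDs into hit and no-hit lists. It assumes the usual comment layout of tabular-with-comments output ("# Query: ..." followed by "# N hits found") and takes the first word of the query line as the sequence ID. The function name, the sample text, and the file name in the comment are hypothetical and not part of BLAST itself.

```python
import re

# Assumed input: BLAST run with something like
#   blastn -query reads.fasta -db <your_db> -outfmt 7 -out results.tsv
# (file and database names are placeholders).

def split_queries_by_hit(lines):
    """Return (hit_ids, no_hit_ids) from the lines of a BLAST -outfmt 7 report."""
    hit_ids, no_hit_ids = [], []
    current = None  # ID of the query whose comment block is being read
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("# Query:"):
            # Keep only the first word, which is normally the sequence ID.
            fields = line[len("# Query:"):].split()
            current = fields[0] if fields else None
            continue
        match = re.match(r"# (\d+) hits found", line)
        if match and current is not None:
            target = hit_ids if int(match.group(1)) > 0 else no_hit_ids
            target.append(current)
            current = None
    return hit_ids, no_hit_ids

# Small made-up example in the shape of -outfmt 7 output.
sample = """\
# BLASTN 2.12.0+
# Query: read1 host-like read
# Database: related_genome
# Fields: query acc.ver, subject acc.ver, % identity
# 1 hits found
read1\tchr1\t98.5
# BLASTN 2.12.0+
# Query: read2 unknown read
# Database: related_genome
# 0 hits found
# BLAST processed 2 queries
"""

hits, no_hits = split_queries_by_hit(sample.splitlines())
print("hit:", hits)
print("no-hit:", no_hits)
```

For a real run you would pass an open file handle instead of the sample text and write each list to its own file, one ID per line. Those two ID lists can then be used to pull the matching sequences out of the original FASTA/FASTQ file with whichever sequence-filtering tool you prefer.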