BLAST database name in output
15 months ago
gb • 780

I wonder if anyone else has had this "problem" before or has any tips for me. If I BLAST against multiple databases, I want to know which database each hit comes from. For example:

blastn -query fasta.fa -db "database1 database2" -outfmt 6

I can probably determine the database from the header of the hit, but is there a better way? Can I also assume that the sequences from GenBank always start with ">gi"?

Thanks!

blast • 390 views

Can I also assume that the sequences from GenBank always start with ">gi"?

No. In fact, gi numbers have been deprecated for external use (they are still used internally at NCBI).

Oh, of course, I knew that. I did not realize it because I only had ">gi" headers in my FASTA file. I need to look up what the non-gi headers look like. Thanks.

What happens if you use -outfmt 7? With one database it shows the database name in the header; I am not sure what it does with two or more.

I solved it for now by putting an "identifier" after the ">" in the headers. But I will check it out.
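
For what it's worth, once such an identifier is in the headers, the database can be pulled back out of the subject ID, which is the second column of -outfmt 6. A minimal sketch, assuming the identifier was added as a prefix joined with a double underscore (e.g. "db1__seqA"); that separator and the helper name add_db_column are my own assumptions, not something stated in this thread:

```shell
# Append the database tag as an extra last column of BLAST tabular output.
# Assumption: subject IDs (column 2) look like "TAG__original_id".
# Lines whose subject ID has no "__" separator get "unknown".
add_db_column() {
  awk 'BEGIN { FS = OFS = "\t" }
       {
         n = index($2, "__")                        # position of the separator, 0 if absent
         db = (n > 0) ? substr($2, 1, n - 1) : "unknown"
         print $0, db                               # original line plus the tag column
       }' "$@"
}

# Usage (needs BLAST+ installed; file and database names are placeholders):
# blastn -query fasta.fa -db "database1 database2" -outfmt 6 | add_db_column
```

The function reads from a file argument or from standard input, so it can sit at the end of the blastn pipe.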

You could include some sort of identifier in the fasta headers of the two databases (and recreate the indexes) if you want to keep them in two sets.
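
A rough sketch of how that could look, in case it helps. The tag values, the "__" separator, the file names, and the helper name tag_fasta are all placeholders I made up; only makeblastdb and its -in, -dbtype, and -out options are standard BLAST+:

```shell
# Prefix every FASTA header with a database tag,
# e.g. ">seqA some description" becomes ">db1__seqA some description".
tag_fasta() {
  # $1 = tag to add (hypothetical, e.g. db1), $2 = input FASTA file
  awk -v tag="$1" '/^>/ { sub(/^>/, ">" tag "__") } { print }' "$2"
}

# Usage (needs BLAST+ installed; names are placeholders):
# tag_fasta db1 database1.fa > database1.tagged.fa
# tag_fasta db2 database2.fa > database2.tagged.fa
# makeblastdb -in database1.tagged.fa -dbtype nucl -out database1
# makeblastdb -in database2.tagged.fa -dbtype nucl -out database2
```

I avoided "|" as the separator here because it already has a meaning in NCBI-style sequence identifiers; any string that cannot occur in your own IDs should do.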
