BLAST database name in output
5.9 years ago
gb ★ 2.2k

Has anyone else run into this "problem" before, or does anyone have tips for me? When I BLAST against multiple databases, I want to know which database each hit comes from. For example:

blastn -query fasta.fa -db "database1 database2" -outfmt 6

I can probably determine the database from the header of the hit, but is there a better way? Also, can I assume that sequences from GenBank always start with ">gi"?

Thanks!

blast • 1.5k views

Can I also assume that the sequences from genbank always start with ">gi"?

No. In fact, gi numbers have been deprecated for external use (they are still used internally at NCBI).
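
For reference, older NCBI deflines look like ">gi|number|db|accession.version|", while current ones start directly with the accession, ">accession.version". Below is a minimal sketch that pulls the accession out of either form. `defline_accession` is a hypothetical helper name, not part of BLAST, and it assumes the header follows one of these two layouts.

```shell
# Hypothetical helper: print the accession.version for each FASTA defline on stdin.
# Handles old-style ">gi|N|db|ACC.V| desc" and new-style ">ACC.V desc".
defline_accession() {
  awk '/^>/ {
    id = substr($1, 2)                        # first word, without the leading ">"
    n = split(id, f, "|")                     # "|" separates the old-style fields
    if (n >= 4 && f[1] == "gi") print f[4]    # old style: accession is the 4th field
    else print id                             # otherwise: print the ID as it is
  }'
}
```

Usage would be something like `defline_accession < database1.fa | head`, where `database1.fa` stands in for whatever FASTA file the database was built from.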

Oh, of course, I knew that. I did not notice because my FASTA file only had ">gi" headers. I need to look up what the non-gi headers look like. Thanks.

What happens if you use -outfmt 7? With one database, it shows the database name in the header; I am not sure what it does with two or more.
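
One quick way to check is to pull the "# Database:" comment lines out of the -outfmt 7 output. This is only a sketch: `outfmt7_databases` is a hypothetical helper name, and whether that line helps attribute individual hits to one database is something to verify on your own output.

```shell
# Hypothetical helper: list the unique "# Database:" values found in
# BLAST -outfmt 7 output read from stdin.
outfmt7_databases() {
  sed -n 's/^# Database: //p' | sort -u
}
```

For example: `blastn -query fasta.fa -db "database1 database2" -outfmt 7 | outfmt7_databases`.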

I solved it for now by putting an "identifier" after the ">". But I will check it out.
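
With an identifier in front of each subject ID, the tabular output can be summarised per database. A rough sketch, assuming the identifier is a label followed by a double underscore (for example "db1__seq1"). That convention and the name `hits_per_label` are made up for illustration. A pipe is avoided as the separator because it already delimits fields in NCBI-style identifiers.

```shell
# Hypothetical helper: count hits per database label in -outfmt 6 output on stdin.
# Assumes subject IDs (column 2) look like "<label>__<original id>".
hits_per_label() {
  awk -F'\t' '{
    pos = index($2, "__")                                   # first "__" ends the label
    label = (pos > 0) ? substr($2, 1, pos - 1) : "untagged" # no "__": count as untagged
    count[label]++
  }
  END { for (l in count) print l "\t" count[l] }' | sort
}
```

For example: `blastn -query fasta.fa -db "database1 database2" -outfmt 6 | hits_per_label`.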

You could include some sort of identifier in the FASTA headers of the two databases (and recreate the indexes) if you want to keep them as two separate sets.
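
A minimal sketch of that tagging step, assuming a plain FASTA file and a purely alphanumeric label. `tag_fasta` and the double-underscore separator are hypothetical choices, not a BLAST convention.

```shell
# Hypothetical helper: prefix every FASTA header on stdin with "<label>__".
# The label should be alphanumeric, since it is pasted into a sed expression.
tag_fasta() {
  label="$1"
  sed "s/^>/>${label}__/"
}
```

Something like `tag_fasta db1 < database1.fa > database1.tagged.fa` would write the tagged copy, and the database could then be rebuilt with `makeblastdb -in database1.tagged.fa -dbtype nucl -out database1_tagged`. The file and database names here are placeholders.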
