Blast Xml For Multiple Databases
9.6 years ago
Yann • 70
@Yann1390

Hello, I run BLAST 2.2.25+ against multiple databases. I would like to view the results for each database separately in my XML result file. Is that possible?

blast xml • 1.9k views

What do you mean by "view my result for each database"?

You could use tabular output instead of XML and grep it... Arrgh, no, pleeease don't kill me... :-)

You could also use [?] on multiple xml files instead of one. Care to explain the reason why you want to have it that way?

Yann should explain this him/herself, but I guess this question is about running a BLAST search against multiple databases in one go and then teasing apart the results for the individual databases later. I must admit that I don't really see a problem, as the database identifiers should be part of the hit entry name. In a table, this is easy to parse, and I am sure that the XML gurus can come up with some XSLT magic to do this with XML, too. By the way, I see nothing bad in running BLAST with multiple databases: it is very efficient if there are multiple small databases.
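
To make the "easy to parse in a table" point concrete, here is a minimal Python sketch. It assumes tab-separated output with the subject ID in the second column, and it assumes (hypothetically) that each subject ID starts with a database tag followed by "|", as in "dbA|seq1". The tags, IDs and numbers in the sample are made up, so adapt the splitting rule to whatever your sequence names actually look like.

```python
from collections import defaultdict


def group_hits_by_db(tabular_text, sep="|"):
    """Group tabular BLAST lines by the prefix of the subject ID (column 2)."""
    groups = defaultdict(list)
    for line in tabular_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject_id = fields[1]
        # Assumption: the database tag is the text before the first separator.
        db_tag = subject_id.split(sep, 1)[0] if sep in subject_id else "unknown"
        groups[db_tag].append(fields)
    return dict(groups)


# Hand-written sample lines, not real BLAST results.
sample = (
    "query1\tdbA|seq1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-100\t360\n"
    "query1\tdbB|seq9\t91.0\t180\t16\t0\t1\t180\t10\t189\t2e-70\t250\n"
    "query2\tdbA|seq4\t88.0\t150\t18\t0\t1\t150\t1\t150\t3e-50\t190\n"
)

grouped = group_hits_by_db(sample)
for db_tag, hits in sorted(grouped.items()):
    print(db_tag, len(hits))
```

If the sequence names carry no database tag at all, this approach cannot work and you would need a separate lookup from sequence ID to source database.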

9.6 years ago
Nabellaleen • 10
@Nabellaleen2111

If I remember correctly, BLAST can search only one database at a time, no?

To run BLAST on multiple databases, you have to merge them with blastdb_aliastool, which creates a "new" database. That prevents you from telling which of the original databases each result came from.

If you give multiple databases directly to BLAST, I don't know what happens; this is not documented (I didn't find an answer, anyway). So I think that, if it doesn't crash, it probably calls blastdb_aliastool.

In any case, at NCBI, the BLAST XML output is based on this DTD: http://www.ncbi.nlm.nih.gov/data_specs/dtd/NCBI_BlastOutput.mod.dtd

It says we have a "BlastOutput" element (<!ELEMENT BlastOutput ( ... )>) that contains a "BLAST Database name" child element (<!ELEMENT BlastOutput_db (#PCDATA)>) and is the parent of the iteration elements (in BLAST+ output, one per query sequence). Each iteration holds a list of hits (<!ELEMENT Iteration_hits (Hit*)>), and each hit (<!ELEMENT Hit ( ... )>) is the parent of its HSPs (<!ELEMENT Hsp ( ... )>).

So, if you have multiple databases in an XML result file, there should be a "BlastOutput" element for each database used, which contains the DB name in a "<BlastOutput_db>Your DB Name</BlastOutput_db>" element and the results for that database in a "<BlastOutput_iterations>" element.
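
Whichever way the databases end up being reported, the element names from the DTD are enough to pull out the database string and the hits. Below is a minimal Python sketch using the standard library. The XML is a hand-written, heavily abbreviated sample, not real BLAST output, and it assumes a single "BlastOutput" whose "BlastOutput_db" lists both database names; check your own file to see what BLAST actually wrote there.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated sample; real output has many more elements.
SAMPLE_XML = """
<BlastOutput>
  <BlastOutput_program>blastn</BlastOutput_program>
  <BlastOutput_db>data1 data2</BlastOutput_db>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>query1</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_id>dbA|seq1</Hit_id>
          <Hit_hsps>
            <Hsp><Hsp_evalue>1e-100</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_id>dbB|seq9</Hit_id>
          <Hit_hsps>
            <Hsp><Hsp_evalue>2e-70</Hsp_evalue></Hsp>
            <Hsp><Hsp_evalue>5e-10</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def summarize(xml_text):
    """Return the database string and (query, hit ID, best e-value) rows."""
    root = ET.fromstring(xml_text)
    db_name = root.findtext("BlastOutput_db")
    rows = []
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        for hit in iteration.iter("Hit"):
            # Smallest e-value among the HSPs of this hit (None if no HSP).
            best = min(
                (float(e.text) for e in hit.iter("Hsp_evalue")),
                default=None,
            )
            rows.append((query, hit.findtext("Hit_id"), best))
    return db_name, rows


db_name, rows = summarize(SAMPLE_XML)
print("databases:", db_name)
for row in rows:
    print(row)
```

For very large result files, ET.iterparse would avoid loading the whole document into memory; the sketch keeps to ET.fromstring for brevity.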

@Nabella: which BLAST are you referring to? I routinely use BLAST with multiple databases (although I tend to avoid XML output). The syntax is -db "data1 data2" for the new BLAST+ or -d "data1 data2" for traditional blastall.
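
For anyone scripting this, the one thing to get right is that the database names travel as a single space-separated argument. A minimal Python sketch of building such a BLAST+ command follows. The file names query.fasta and results.xml, the helper name blast_command and the choice of blastn are placeholders, not something from this thread; -outfmt 5 is the BLAST+ option for XML output.

```python
import shlex


def blast_command(query, databases, out_path, program="blastn"):
    """Build a BLAST+ argument list that searches several databases at once."""
    return [
        program,
        "-query", query,
        # All database names go into ONE argument, separated by spaces.
        "-db", " ".join(databases),
        "-outfmt", "5",  # XML output
        "-out", out_path,
    ]


cmd = blast_command("query.fasta", ["data1", "data2"], "results.xml")
# Print the shell-quoted form; to actually run it, pass cmd to subprocess.run.
print(shlex.join(cmd))
```

Passing the list to subprocess.run without a shell keeps "data1 data2" intact as one argument, which is the equivalent of the quotes on the command line.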

It is definitely possible to do this without aliastool by passing multiple arguments for db, as mentioned above. It has the annoying consequence of not telling you which database a particular sequence came from, though.
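
If knowing the source database matters more than doing everything in one run, one workaround is to search each database separately and keep one XML file per database. The Python sketch below only builds the commands. The helper name per_database_commands, the "results" directory and the file names are hypothetical. Keep in mind that e-values depend on the size of the database searched, so numbers from separate runs are not directly comparable with those from a combined run.

```python
from pathlib import Path


def per_database_commands(query, databases, out_dir, program="blastn"):
    """Build one BLAST+ command per database, each with its own XML file."""
    commands = {}
    for db in databases:
        # Name the output file after the database, e.g. results/data1.xml.
        out_path = Path(out_dir) / f"{Path(db).name}.xml"
        commands[db] = [
            program,
            "-query", query,
            "-db", db,
            "-outfmt", "5",  # XML output
            "-out", str(out_path),
        ]
    return commands


commands = per_database_commands("query.fasta", ["data1", "data2"], "results")
for db, cmd in commands.items():
    # To run for real: subprocess.run(cmd, check=True)
    print(db, "->", cmd[-1])
```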

@Nabella, as far as I can see from Yann's previous questions ( http://biostar.stackexchange.com/users/1414/yann ), he already knows how BLAST XML is structured. So I still don't understand what he means by "view".

My comments are based on my (failing?) memory of BLAST command-line use, so... If you say it's possible, I believe you. But my other comments stand: how does BLAST treat these multiple databases? What are the results? ...

However, as Pierre said, maybe that's another subject ;)
