Advice on Blast output
5.2 years ago
stacy734 ▴ 40

Hi everyone,

I am running blastn on the command line and trying different output formatting options. This command:

blastn -query blastme.fasta -out remote.blastn -db nr -evalue 1e-30 -outfmt 18 -max_target_seqs=1 &

Gives output that looks like this:



Accession                                 Description                                Score E-value 

                           Rhodoferax saidenbachensis  [b-proteobacteria]                           
 CP019239  Rhodoferax saidenbachensis strain DSM 22694, complete genome                176   2e-40   

                                          Tax BLAST report                                          

Query= SRR8559322.121301.1 121301 length=221

Length=221
                                           Organism Report                                           

Accession                                 Description                                Score E-value 

                     Janthinobacterium sp. 1_2014MBL_MicDiv  [b-proteobacteria]                     
 CP011319  Janthinobacterium sp. 1_2014MBL_MicDiv, complete genome                     200   2e-47   

                                          Tax BLAST report                                          
Query= SRR8559322.122717.1 122717 length=178

Length=178
                                           Organism Report                                           

Accession...  Description... Score... E-value... 
                                          Tax BLAST report                                          

Query= SRR8559322.126209.1 126209 length=1952

Length=1952
                                           Organism Report                                           

Accession                                 Description                                Score E-value 

                               Massilia sp. NR 4-1  [b-proteobacteria]                               
 CP012201  Massilia sp. NR 4-1, complete genome                                        1857  0.0     
                                          Tax BLAST report                                          
Query= SRR8559022.132866.1 132866 length=94

Length=94
                                           Organism Report                                           

Accession...  Description... Score... E-value... 
                                          Tax BLAST report  

I'd like to get output that looks like this:

SRR8559322.119579.1 [b-proteobacteria] 
SRR8559322.121301.1 [b-proteobacteria] 
SRR8559322.122717.1
SRR8559322.126209.1 [b-proteobacteria]
SRR8559022.132866.1 

Note that the second column should be blank where there were no hits found.

There doesn't seem to be a Blast option for anything similar to this. Can anyone suggest a grep/sed type command that I could use on the results to put them into tabular form like this?

Thanks for any advice.

blast unix format • 1.4k views

In the past, I have requested XML output (-outfmt 5) and converted the results using this Python script. This gives you a good amount of information per hit.
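The script mentioned above is not shown here. The following is only a minimal sketch of the same idea, not that script. It assumes the standard BLAST XML element names (Iteration, Iteration_query-def, Hit_accession, Hsp_evalue); the function name and the sample XML are made up for illustration, reusing IDs from the question.

```python
import xml.etree.ElementTree as ET


def best_hits_from_xml(xml_text):
    """Return (query_id, first_hit_accession, first_hit_evalue) per query.

    Queries without any hit are kept, with empty strings in the hit columns.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for iteration in root.iter("Iteration"):
        query_def = iteration.findtext("Iteration_query-def", default="")
        words = query_def.split()
        query_id = words[0] if words else ""
        hit = iteration.find("Iteration_hits/Hit")  # first hit, if any
        if hit is None:
            rows.append((query_id, "", ""))
        else:
            accession = hit.findtext("Hit_accession", default="")
            evalue = hit.findtext("Hit_hsps/Hsp/Hsp_evalue", default="")
            rows.append((query_id, accession, evalue))
    return rows


# Illustrative input only, heavily trimmed compared to real -outfmt 5 output.
SAMPLE_XML = """<BlastOutput><BlastOutput_iterations>
<Iteration>
  <Iteration_query-def>SRR8559322.126209.1 126209 length=1952</Iteration_query-def>
  <Iteration_hits><Hit><Hit_accession>CP012201</Hit_accession>
    <Hit_hsps><Hsp><Hsp_evalue>0</Hsp_evalue></Hsp></Hit_hsps></Hit></Iteration_hits>
</Iteration>
<Iteration>
  <Iteration_query-def>SRR8559022.132866.1 132866 length=94</Iteration_query-def>
  <Iteration_hits></Iteration_hits>
  <Iteration_message>No hits found</Iteration_message>
</Iteration>
</BlastOutput_iterations></BlastOutput>"""

if __name__ == "__main__":
    for row in best_hits_from_xml(SAMPLE_XML):
        print("\t".join(row))
```

Since the XML output has one Iteration per query, queries with no hits still show up, which is handy if you need a blank column for them.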

5.2 years ago
JC 13k

BLAST tabular output (a text file with tab-separated columns) is -outfmt 6; -outfmt 7 is the same format with comment lines added. You can choose which fields to show by passing the field names, for example -outfmt "7 qacc sacc evalue qstart qend sstart send". The full list of fields is in the program's help, e.g. blastn -help

Just be aware that by default, queries without any hit are not printed in the output.
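One possible workaround is to take the query IDs from the FASTA file and fill in a blank for every ID that is missing from the tabular output. This is a rough sketch, assuming the output was produced with something like -outfmt "6 qacc sblastname" and that the first word of each FASTA header matches the first output column. The function names and sample lines are hypothetical.

```python
def read_query_ids(fasta_lines):
    """Return the first word of every '>' header line, in file order."""
    ids = []
    for line in fasta_lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.append(words[0])
    return ids


def tabulate_hits(query_ids, blast_lines):
    """Pair each query ID with column 2 of its first tabular line.

    Queries absent from the BLAST output get an empty second column.
    Comment lines (as written by -outfmt 7) and blank lines are skipped.
    """
    first_hit = {}
    for line in blast_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        # Keep only the first line seen for each query.
        first_hit.setdefault(fields[0], fields[1])
    return [(query, first_hit.get(query, "")) for query in query_ids]


if __name__ == "__main__":
    # Illustrative data only; in practice read both from files.
    fasta = [
        ">SRR8559322.121301.1 121301 length=221\n",
        "ACGT\n",
        ">SRR8559322.122717.1 122717 length=178\n",
        "ACGT\n",
    ]
    blast = ["SRR8559322.121301.1\tb-proteobacteria\n"]
    for query, group in tabulate_hits(read_query_ids(fasta), blast):
        print(f"{query}\t{group}")
```

With real files you would pass open("blastme.fasta") and open("remote.blastn") instead of the lists.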


In addition to JC's answer: blastn does not report sequences without a hit.

5.2 years ago
stacy734 ▴ 40

Thanks!

For others who may read this, the output field in question can be either sskingdom or sblastname.
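To make that concrete, here is a small sketch that assembles such a command from Python. The flags are the ones already used in this thread; the helper name and its defaults are hypothetical, and the file and database names are simply copied from the original post. As far as I know, the taxonomy fields also need the NCBI taxonomy database files to be available to BLAST, so treat that as something to check.

```python
import shlex


def build_blastn_command(query, db, out,
                         fields=("qacc", "sblastname"),
                         evalue="1e-30", max_target_seqs=1):
    """Return a blastn argument list for tabular output with chosen fields."""
    outfmt = "6 " + " ".join(fields)  # e.g. "6 qacc sblastname"
    return [
        "blastn",
        "-query", query,
        "-db", db,
        "-out", out,
        "-evalue", str(evalue),
        "-outfmt", outfmt,
        "-max_target_seqs", str(max_target_seqs),
    ]


if __name__ == "__main__":
    cmd = build_blastn_command("blastme.fasta", "nr", "remote.blastn")
    # Print the command as it would be typed in a shell.
    print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) if you want to launch the search from the same script. Swap "sblastname" for "sskingdom" in fields to get the super kingdom instead.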

Stacy
