Question: Blast - Formatting Output
18

Hi,

I've been using the blastn (version 2.2.28+) standalone tool against a custom formatted genome via:

blastn -db BLASTDB -word_size 7 -query input.fa -out filename -perc_identity 100 -outfmt 6 -max_target_seqs 2

This discards non-perfect hits and shows only the top 2 hits.

The output format is great, but is there a way to add an extra column containing the actual target sequence (the sequence of the matched hit), so that the fields are:

query id, subject id, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score, sequence

Thanks!

  • TJC
ADD COMMENTlink 6.2 years ago timjoncooper • 190 • updated 2.2 years ago sridhar.rg • 0
0

Is there a way to see the query sequence in the alignment result?

ADD REPLYlink 4.3 years ago
midox
• 220
0

All the valid fields are listed in the help.

ADD REPLYlink 4.3 years ago
Istvan Albert
80k
1

I know, but among these:

           qseqid means Query Seq-id
              qgi means Query GI
             qacc means Query accesion
          qaccver means Query accesion.version
             qlen means Query sequence length
           sseqid means Subject Seq-id

is there one that shows the query sequence itself?

ADD REPLYlink 4.3 years ago
midox
• 220
0

Obviously none of these - after all none of those descriptions indicates that it would. Keep looking.

ADD REPLYlink 4.3 years ago
Istvan Albert
80k
0

Hi!! Do you know how to see the sequence (query) in your blast result?

ADD REPLYlink 3.8 years ago
figuerm
• 0
42

Run blastn -help, then look for the option called -outfmt:

*** Formatting options
 -outfmt <String>
   alignment view options:
     0 = pairwise,
     1 = query-anchored showing identities,
     2 = query-anchored no identities,
     3 = flat query-anchored, show identities,
     4 = flat query-anchored, no identities,
     5 = XML Blast output,
     6 = tabular,
     7 = tabular with comment lines,
     8 = Text ASN.1,
     9 = Binary ASN.1,
    10 = Comma-separated values,
    11 = BLAST archive format (ASN.1) 

   Options 6, 7, and 10 can be additionally configured to produce
   a custom format specified by space delimited format specifiers.
   The supported format specifiers are:
           qseqid means Query Seq-id
              qgi means Query GI
             qacc means Query accesion
          qaccver means Query accesion.version
             qlen means Query sequence length
           sseqid means Subject Seq-id
        sallseqid means All subject Seq-id(s), separated by a ';'
              sgi means Subject GI
           sallgi means All subject GIs
             sacc means Subject accession
          saccver means Subject accession.version
          sallacc means All subject accessions
             slen means Subject sequence length
           qstart means Start of alignment in query
             qend means End of alignment in query
           sstart means Start of alignment in subject
             send means End of alignment in subject
             qseq means Aligned part of query sequence
             sseq means Aligned part of subject sequence
           evalue means Expect value
         bitscore means Bit score
            score means Raw score
           length means Alignment length
           pident means Percentage of identical matches
           nident means Number of identical matches
         mismatch means Number of mismatches
         positive means Number of positive-scoring matches
          gapopen means Number of gap openings
             gaps means Total number of gaps
             ppos means Percentage of positive-scoring matches
           frames means Query and subject frames separated by a '/'
           qframe means Query frame
           sframe means Subject frame
             btop means Blast traceback operations (BTOP)
          staxids means Subject Taxonomy ID(s), separated by a ';'
        sscinames means Subject Scientific Name(s), separated by a ';'
        scomnames means Subject Common Name(s), separated by a ';'
       sblastnames means Subject Blast Name(s), separated by a ';'
                (in alphabetical order)
       sskingdoms means Subject Super Kingdom(s), separated by a ';'
                (in alphabetical order) 
           stitle means Subject Title
       salltitles means All Subject Title(s), separated by a '<>'
          sstrand means Subject Strand
            qcovs means Query Coverage Per Subject
          qcovhsp means Query Coverage Per HSP
ADD COMMENTlink 6.2 years ago Istvan Albert 80k • updated 14 months ago RamRS 21k
18

To clarify: "space delimited format specifiers" means you list them inside the quoted -outfmt string, after the format number, e.g. -outfmt "6 qacc sacc qseq sseq ..."
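
A minimal sketch of that, spelling out the default columns from the question with sseq appended. BLASTDB, input.fa and filename are the placeholder names from the original command. The FIELDS and HEADER variables and the filename.with_header.tsv output are my own additions for illustration (tabular format 6 writes no header row), and the blastn call is guarded so the snippet still runs on a machine without BLAST+.

```shell
# Field list: the 12 default tabular columns plus the aligned subject sequence.
FIELDS="qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore sseq"

# Format 6 has no header row, so build a tab-separated one from the same list.
HEADER=$(printf '%s' "$FIELDS" | tr ' ' '\t')

# BLASTDB, input.fa and filename are the placeholder names used in the question.
if command -v blastn >/dev/null 2>&1 && [ -f input.fa ]; then
    blastn -db BLASTDB -word_size 7 -query input.fa -out filename \
        -perc_identity 100 -outfmt "6 $FIELDS" -max_target_seqs 2
    # Hypothetical extra step: write a copy of the results with the header prepended.
    { printf '%s\n' "$HEADER"; cat filename; } > filename.with_header.tsv
else
    # Without BLAST+ or the input file, just show the header that would be used.
    printf '%s\n' "$HEADER"
fi
```

Note that the whole "6 ..." string has to reach blastn as a single argument, which is why it is quoted.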

ADD REPLYlink 5.2 years ago
ostrokach
• 280
• updated 14 months ago
RamRS
21k
8

To add: one of the format specifiers is std, which adds the default set of columns. So -outfmt "6 std qlen" prints the standard columns plus the query length.
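
As an illustration of why the extra qlen column can be handy: with -outfmt "6 std qlen" the query length arrives as column 13, after the 12 standard columns, so hits that span the whole query can be picked out with awk. This is only a sketch. The function name is hypothetical and the two input rows are made-up sample data, not real blastn output.

```shell
# Keep rows of '-outfmt "6 std qlen"' output whose alignment spans the whole query.
# Columns: 7 = qstart, 8 = qend, 13 = qlen (appended after the 12 std columns).
full_length_hits() {
    awk -F'\t' '$7 == 1 && $8 == $13'
}

# Made-up sample rows: the first covers its 20-base query fully, the second does not.
{
    printf 'q1\tchr1\t100.000\t20\t0\t0\t1\t20\t501\t520\t1e-05\t40.1\t20\n'
    printf 'q2\tchr2\t100.000\t15\t0\t0\t3\t17\t90\t76\t0.002\t30.2\t30\n'
} | full_length_hits
```

Only the q1 row passes the filter here. Combined with -perc_identity 100, this would leave perfect matches over the full query length.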

ADD REPLYlink 4.1 years ago
kamiljaron
• 120
• updated 14 months ago
RamRS
21k
2

-outfmt 7 or 10 works perfectly.

ADD REPLYlink 6.2 years ago
H@rry
• 30
• updated 14 months ago
RamRS
21k
0

Thank you! Sorted it out now.

ADD REPLYlink 6.2 years ago
timjoncooper
• 190
• updated 14 months ago
RamRS
21k
0

Hi!!
I have the same question, and I don't know how you sorted it out. Did you use -outfmt 7, or -outfmt "6 qlen"?

ADD REPLYlink 3.8 years ago
figuerm
• 0
• updated 14 months ago
RamRS
21k
0

How can I get the description (the first column in the figure) when I run command-line blastp?

[Figure: blastp]

ADD REPLYlink 14 months ago
sbdk82
• 40
0

How do I set a mismatch parameter in blastn? I want to perform an alignment allowing 1 mismatch. I've gone through a lot of the parameters but can't find this one.

ADD REPLYlink 13 months ago
glady
• 250
0

Hi, is there a way to find query strand information as well? Thanks

ADD REPLYlink 11 months ago
Neu
• 10
0

I believe that strand is the orientation of the subject relative to the query; hence, if sstrand is minus (reverse), the query is the reverse complement of the reference sequence.
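
One way to check this without the sstrand column, as far as I know: in blastn tabular output the query coordinates stay ascending, and a minus-strand hit is reported with sstart greater than send. A small sketch that labels each row of default -outfmt 6 output on that assumption. The function name and the two sample rows are made up for illustration.

```shell
# Label each row of default '-outfmt 6' output with the hit orientation.
# Columns: 9 = sstart, 10 = send; sstart > send is taken to mean a minus-strand hit.
label_strand() {
    awk 'BEGIN { FS = OFS = "\t" }
         { print $0, ($9 > $10 ? "minus" : "plus") }'
}

# Made-up sample rows standing in for real blastn output.
{
    printf 'q1\tchr1\t100.000\t20\t0\t0\t1\t20\t501\t520\t1e-05\t40.1\n'
    printf 'q2\tchr2\t100.000\t15\t0\t0\t3\t17\t90\t76\t0.002\t30.2\n'
} | label_strand
```

With these rows, q1 is labelled plus and q2 is labelled minus in an added 13th column.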

ADD REPLYlink 10 months ago
kamiljaron
• 120
0

Just so you know, I was looking for this as well. The following did the job for me:

blastn -db <db_source> -query <query_source> -out <outfile> -outfmt "6 qseqid sseqid slen qstart qend length mismatch gapopen gaps sseq" -word_size 5 -perc_identity 80

The "sseq" specifier gives the aligned part of the subject sequence, i.e. the sequence the query was aligned with. The "qseq" specifier gives the aligned part of the query sequence.
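
If the matched sequences are needed as FASTA afterwards, something along these lines might work on the 10-column output of the command above. It is a sketch only: the function name and the sample row are made up, and it assumes that sseq marks alignment gaps with "-" characters, which are stripped.

```shell
# Turn rows of the 10-column format above into FASTA records of the aligned subject.
# Columns: 1 = qseqid, 2 = sseqid, 10 = sseq. Gap characters ("-") are stripped.
hits_to_fasta() {
    awk -F'\t' '{ seq = $10; gsub(/-/, "", seq); print ">" $2 " query=" $1; print seq }'
}

# Made-up sample row standing in for real blastn output.
printf 'q1\tchr1\t5000\t1\t13\t13\t0\t1\t1\tACGTAC-GTACGT\n' | hits_to_fasta
```

For the sample row this prints a header line ">chr1 query=q1" followed by the ungapped sequence ACGTACGTACGT.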

ADD COMMENTlink 2.2 years ago sridhar.rg • 0
