How To Concatenate Blast Results (M8) Via Setting Threshold Of Distance Between Two Query Hits
11.0 years ago
xiongtl2013 ▴ 40

Hi everyone,

I ran blastn (-m 8) with a query file containing many sequences. For each query sequence, the output contains many significant but fragmentary hits.

However, these hits do not overlap, and interestingly most of the gaps between them are < 300 bp (much shorter than the full length of the query sequence).

So, how can I concatenate closely spaced hits into one by setting a gap threshold (e.g. 300 bp) when the hits match different regions of the same subject? This would also reduce the number of output hits per query.

For example:

http://www.freeimagehosting.net/bohcp

Are there any scripts or tools for this purpose?

All replies are welcome!

blast • 3.3k views
11.0 years ago
jgibbons1 ▴ 50

You can use Biopython to parse the BLAST output and then concatenate the hits that match your criteria. I would not recommend parsing the tabular output, though. Instead, re-run BLAST and get the results in XML format, since that is easier to parse with a script.

In the Biopython tutorial, the chapters you would be interested in are 3, 4, 5, and 7.
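Whichever format you parse, the merging step itself is small. Here is a minimal sketch in plain Python, not a tested tool. It assumes the standard 12-column tabular layout (query ID, subject ID, ..., query start and end in columns 7 and 8), groups hits by query/subject pair, and joins neighbouring hits when the gap on the query is at most `max_gap` bases. The function names `parse_m8` and `merge_hits` are made up for illustration, and the sketch ignores subject coordinates and strand, which you may well want to check too.

```python
from collections import defaultdict


def parse_m8(lines):
    """Yield (query, subject, qstart, qend) from BLAST tabular (-m 8) lines.

    Assumes the standard 12 tab-separated columns; query start/end are
    columns 7 and 8 (1-based).
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        # Normalise so that start <= end.
        qstart, qend = sorted((int(fields[6]), int(fields[7])))
        yield fields[0], fields[1], qstart, qend


def merge_hits(hits, max_gap=300):
    """Merge hits of the same (query, subject) pair along the query.

    Two neighbouring hits are joined when the number of uncovered query
    bases between them is <= max_gap (an assumed definition of "gap").
    Returns {(query, subject): [(start, end), ...]}.
    """
    grouped = defaultdict(list)
    for query, subject, start, end in hits:
        grouped[(query, subject)].append((start, end))

    merged = {}
    for key, intervals in grouped.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            gap = start - out[-1][1] - 1
            if gap <= max_gap:
                # Close enough (or overlapping): extend the current block.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[key] = [tuple(iv) for iv in out]
    return merged


if __name__ == "__main__":
    import sys

    # Usage: python merge_hits.py < blast_output.m8
    for (query, subject), blocks in merge_hits(parse_m8(sys.stdin)).items():
        for start, end in blocks:
            print(query, subject, start, end, sep="\t")
```

If you go the XML route instead, the same `merge_hits` function should work as long as you feed it `(query, subject, qstart, qend)` tuples built from the parsed records.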

7.3 years ago
Lhl ▴ 760

Have you tried genBlastA/G? See She et al. (2011), "genBlastG: using BLAST searches to build homologous gene models", Bioinformatics.
