Question

MEGABLAST result analysis issue

7.0 years ago

sharmatina189059 ▴ 110

Dear all, I have two files: one with 3787 gene sequences from one bacterial genome, and another with 3762 gene sequences from a second bacterial genome. The genomes are highly similar (they differ only by sequencing errors), so I have decided to use MEGABLAST. I want hits that match the query sequence exactly from start to end, and I want to parse them out of the output. Please give suggestions. What would be a significant e-value in this case, and how can I extract the exactly matched sequences based on MEGABLAST parameters? Is there a way to do this, or do I have to inspect every alignment manually?

alignment • 1.2k views

updated 7.0 years ago by cschu181 ★ 2.8k • written 7.0 years ago by sharmatina189059 ▴ 110

score 1 · Answer 1 · 2017-04-19

Use -outfmt "6 std qlen" (the format number and the column names go together in one quoted string), then you will have everything you need. You are looking for lines with pident = 100, qstart = 1 and qend = qlen. I think that should do it. The e-value should be low, but it doesn't really matter here, as you're looking for full-length matches of gene sequences and not trying to determine whether a pattern that you found is likely to be random or not.
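
To avoid checking alignments by hand, the filter can be scripted. Below is a minimal Python sketch, assuming the table was produced with something like blastn -task megablast -query genes_a.fasta -subject genes_b.fasta -outfmt "6 std qlen" -out hits.tsv (the file names are placeholders, and the function names are made up for illustration). Instead of comparing pident to 100 it checks mismatch = 0 and gapopen = 0, which means the same thing but does not depend on how the percentage is rounded in the output.

```python
import csv

# Columns written by BLAST+ for -outfmt "6 std qlen":
# the 12 standard tabular fields followed by the query length.
STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
COLUMNS = STD_COLUMNS + ["qlen"]


def is_exact_full_length(hit):
    """True if the hit covers the whole query with no mismatches or gaps."""
    return (
        int(hit["mismatch"]) == 0
        and int(hit["gapopen"]) == 0
        and int(hit["qstart"]) == 1
        and int(hit["qend"]) == int(hit["qlen"])
    )


def exact_full_length_hits(lines):
    """Return the hits (as dicts) that pass is_exact_full_length.

    `lines` is any iterable of tab-separated text lines, e.g. an open file.
    """
    reader = csv.DictReader(lines, fieldnames=COLUMNS, delimiter="\t")
    return [row for row in reader if is_exact_full_length(row)]
```

For a real run you would pass an open file handle, for example with open("hits.tsv") as handle: hits = exact_full_length_hits(handle), and then collect the qseqid values. Note that this only requires the whole query to be covered; if the subject gene must also be the same length, you could add slen to the -outfmt string and compare it as well.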