MEGABLAST result analysis issue
7.0 years ago

Dear all, I have two files: one with 3787 gene sequences from one bacterial genome, and another with 3762 gene sequences from a second bacterial genome. The genomes are highly similar (they differ only by sequencing errors), so I have decided to use MEGABLAST. I want to find hits that match the query sequence exactly from start to end, and to parse them out of the output. Please give suggestions. What would be a significant e-value in this case, and how can I extract the exactly matching sequences based on MEGABLAST parameters? Is there a way to do this, or do I have to inspect every alignment manually?

alignment • 1.2k views
7.0 years ago
cschu181 ★ 2.8k

Use -outfmt '6 std qlen' (the format number and the field names go in one quoted argument), then you will have everything you need. You are looking for lines with pident = 100, qstart = 1 and qend = qlen. I think that should do it. The e-value should be low, but it doesn't really matter here, as you're looking for full-length matches of gene sequences and not trying to determine whether a hit you found is likely to have occurred by chance.
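For what it's worth, the filtering could be scripted along these lines. This is only a rough sketch, assuming the table was written with -outfmt '6 std qlen', so each line has the 12 standard columns followed by qlen. The function name and the sample rows are made up for illustration. It checks mismatch = 0 and gapopen = 0 rather than pident, on the assumption that pident is printed as a rounded value.

```python
import csv
import io

# Column order assumed for -outfmt '6 std qlen':
# the 12 standard tabular fields, then qlen as the 13th.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore", "qlen"]


def full_length_exact_hits(handle):
    """Yield hits (as dicts) that match the whole query with no differences.

    Hypothetical helper, not part of BLAST or any library.
    """
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        if (int(row["mismatch"]) == 0
                and int(row["gapopen"]) == 0
                and int(row["qstart"]) == 1
                and int(row["qend"]) == int(row["qlen"])):
            yield row


if __name__ == "__main__":
    # Made-up example rows: geneA is a full-length exact hit,
    # geneB has one mismatch, geneC covers only part of the query.
    sample = (
        "geneA\tgeneA2\t100.000\t300\t0\t0\t1\t300\t1\t300\t1e-160\t555\t300\n"
        "geneB\tgeneB2\t99.667\t300\t1\t0\t1\t300\t1\t300\t1e-155\t549\t300\n"
        "geneC\tgeneC2\t100.000\t250\t0\t0\t1\t250\t1\t250\t1e-130\t462\t300\n"
    )
    for hit in full_length_exact_hits(io.StringIO(sample)):
        print(hit["qseqid"], hit["sseqid"])
```

To run it on a real table, open the output file and pass the file handle instead of the io.StringIO sample. If you also need the subject gene to be covered end to end, you could add slen to the -outfmt fields and compare it with sstart and send in the same way.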
