A question about the BLAST algorithm
7.0 years ago
Denis ▴ 310

I have some doubts about my understanding of the BLAST algorithm that I'd like to clear up. It's well known that BLAST looks for identical words in the query and the database and then uses them as starting points for alignment extension. Suppose we have the word AGTcCAT in the query sequence and the word AGTgCAT in the database, so there is a single mismatch between the two words. My question: can BLAST use this pair as a starting point for alignment extension? I think not, since BLAST requires a perfect match between the two words. Am I right? Thanks!

alignment blast • 1.2k views
7.0 years ago

AFAIK, in a nucleotide search the seed region can't contain mismatches. The seed is created from the query string and matched against the database to find potential hits, which are then extended in both directions to produce the final alignment. In your case, if the word size of the seed is 4 or more, you won't find any hits, because the longest exact match between the two words is 3 bases. If the word size is 3 or less, there are exact matches (AGT = AGT, CAT = CAT, etc.), which can then be extended, with the mismatch scored according to the mismatch penalty.

Note: For nucleotide searches, the default word size is 11, and for amino acids (AA) it is 3.
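
To make the word-size argument above concrete, here is a minimal Python sketch of exact-word seeding applied to the two words from the question. It is a toy illustration, not BLAST's actual implementation: the function name `exact_seeds` is made up for this example, and it covers only the seeding step, not the extension or scoring.

```python
def exact_seeds(query, subject, word_size):
    """Return (query_pos, subject_pos, word) for every exactly shared word.

    Toy model of the seeding step only; real BLAST adds extension,
    scoring, and many optimizations on top of this.
    """
    # Index every word of length word_size found in the query.
    index = {}
    for i in range(len(query) - word_size + 1):
        index.setdefault(query[i:i + word_size], []).append(i)

    # Scan the subject and report words that also occur in the query.
    seeds = []
    for j in range(len(subject) - word_size + 1):
        word = subject[j:j + word_size]
        for i in index.get(word, []):
            seeds.append((i, j, word))
    return seeds


query = "AGTCCAT"    # AGTcCAT from the question
subject = "AGTGCAT"  # AGTgCAT from the question

for w in (3, 4, 7):
    print(w, exact_seeds(query, subject, w))
# 3 [(0, 0, 'AGT'), (4, 4, 'CAT')]
# 4 []
# 7 []
```

With a word size of 3, the flanking AGT and CAT words both seed; from 4 upward every word overlaps the mismatch, so nothing seeds.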

7.0 years ago

You're right if the word size is set correctly. See this explanation of the word size parameter.
