Approximate matching to find similar DNA sequences
5.1 years ago
zhangdengwei ▴ 210

Hi,

I am trying to fuzzy-match a DNA sequence, for example finding "ATCATTA" in "agATCGTTAgtatt", while tolerating a few errors (mismatches, insertions, or deletions) relative to the seed sequence "ATCATTA". Is there a way to do this in Perl or Python?

genome • 1.5k views

The best solution depends on how many query sequences you have and how many target sequences you need to search.


5.1 years ago
Assa Yeroslaviz ★ 1.8k

Take a look at fuzzysearch.
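To make the idea concrete, here is a minimal stdlib-only sketch of what an approximate substring search does. It is not fuzzysearch's own code (that package exposes `find_near_matches` for this job); the function name `find_near_matches_dp` and the case-insensitive comparison are my own assumptions for illustration. It uses the classic dynamic-programming approach (Sellers' algorithm), where a match may start anywhere in the text for free.

```python
def find_near_matches_dp(pattern, text, max_dist):
    """Return (start, end, dist) for substrings of text within max_dist edits of pattern.

    Edits are substitutions, insertions, and deletions (Levenshtein distance).
    One hit is reported per end position, so text[start:end] is the matched span.
    Hypothetical helper for illustration; comparison ignores case.
    """
    pattern = pattern.upper()
    text = text.upper()
    m = len(pattern)
    # prev[i] = (distance, start) for pattern[:i] against the best substring
    # of text ending at the previous text position.
    prev = [(i, 0) for i in range(m + 1)]
    hits = []
    for j, base in enumerate(text, start=1):
        # An empty pattern prefix matches anywhere at no cost, starting at j.
        cur = [(0, j)]
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == base else 1
            cur.append(min(
                (prev[i - 1][0] + cost, prev[i - 1][1]),  # match or mismatch
                (prev[i][0] + 1, prev[i][1]),              # extra base in the text
                (cur[i - 1][0] + 1, cur[i - 1][1]),        # base missing from the text
            ))
        dist, start = cur[m]
        if dist <= max_dist:
            hits.append((start, j, dist))
        prev = cur
    return hits


sequence = "agATCGTTAgtatt"
for start, end, dist in find_near_matches_dp("ATCATTA", sequence, max_dist=1):
    # Slice the original string to pull out just the matched subsequence.
    print(start, end, dist, sequence[start:end])
```

The run time is proportional to len(pattern) * len(text), which is fine for a handful of short seeds but slow for genome-scale searches, where a dedicated tool is the better choice.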


Wonderful! It works well. Many thanks!


Hi, I have a follow-up question. I want to match the pattern "aggacctgct.+aggcgctcaacgg" against "aggacctgctGGCCAAGACCGCTGAGAACAaggcgctcaacgg" using fuzzysearch, but it does not work. Also, how can I extract the matched subsequence when the pattern contains some errors? Thanks.
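One possible workaround, sketched below under stated assumptions: as far as I can tell, fuzzysearch treats the search term as a literal sequence, so the ".+" is not interpreted as a regular-expression wildcard. The sketch instead searches for the two flanks separately and returns whatever lies between them. The helper names `hamming_hits` and `extract_between` are hypothetical, and this simplified version tolerates substitutions only (no insertions or deletions) in each flank.

```python
def hamming_hits(pattern, text, max_mismatches):
    """Yield (start, end, mismatches) for each window of text that differs
    from pattern at no more than max_mismatches positions (case-insensitive)."""
    pattern, text = pattern.upper(), text.upper()
    m = len(pattern)
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        mismatches = sum(a != b for a, b in zip(pattern, window))
        if mismatches <= max_mismatches:
            yield start, start + m, mismatches


def extract_between(left, right, text, max_mismatches=1):
    """Return the part of text between the first left-flank hit and the last
    right-flank hit that starts after it, or None if either flank is missing."""
    left_hits = list(hamming_hits(left, text, max_mismatches))
    if not left_hits:
        return None
    left_end = left_hits[0][1]
    # Require at least one base between the flanks, like ".+" does.
    right_hits = [h for h in hamming_hits(right, text, max_mismatches)
                  if h[0] > left_end]
    if not right_hits:
        return None
    right_start = right_hits[-1][0]  # last hit mimics the greedy ".+"
    return text[left_end:right_start]


read = "aggacctgctGGCCAAGACCGCTGAGAACAaggcgctcaacgg"
print(extract_between("aggacctgct", "aggcgctcaacgg", read))
```

If indels in the flanks also matter, the window comparison would need to be replaced with an edit-distance search. The third-party `regex` package (not the stdlib `re`) also supports fuzzy constraints such as `{e<=1}` inside a pattern, which may be a closer fit for a pattern with a wildcard gap.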

5.1 years ago
Assa Yeroslaviz ★ 1.8k

Or try agrep.


I can approximately match a subsequence with agrep, but how do I extract only the matched subsequence rather than the whole line? I have not found an option that does this. Thanks!
