Clustering by sequence alignments
7.0 years ago
bbb ▴ 70

Which clustering method is best suited to clustering DNA from different species based on alignment information (matches, insertions, deletions)? For example, with a reference sequence 4000 bp long, the feature set has 4000 × |{read bases that matched exactly, insertions, deletions}| = 4000 × 3 = 12000 features.

dna alignment clustering • 1.5k views

What about CD-HIT?


CD-HIT is very good at removing redundancy but is not adequate for clustering. I didn't understand the question, though. For clustering you need a similarity or distance metric.
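As an illustration of that last point, a pairwise distance matrix can be built from per-sample feature vectors in a few lines. This is only a sketch: Euclidean distance is an assumption chosen for simplicity, not a recommendation for alignment data, and `distance_matrix` is a hypothetical helper name.

```python
import math

def distance_matrix(vectors):
    """Return a symmetric matrix of pairwise Euclidean distances.

    vectors: list of equal-length numeric sequences, one per sample.
    Uses math.dist, which requires Python 3.8 or newer.
    """
    n = len(vectors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(vectors[i], vectors[j])
            dist[i][j] = dist[j][i] = d
    return dist

# Three toy samples; the first and third are identical.
example = distance_matrix([[0, 0], [3, 4], [0, 0]])
```

Any other measure (Manhattan, correlation-based, an edit distance computed directly on the sequences) can be swapped in as long as it yields one number per pair of samples.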


Starting with distance matrices, affinity propagation clustering has worked quite nicely for me.
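For readers unfamiliar with the method, below is a small pure-Python sketch of affinity propagation run on a distance matrix. It follows the standard responsibility/availability message updates with damping; the function name, the use of negative distance as similarity, and the `preference` value in the toy example are assumptions for illustration. For real data a tested library implementation is the safer choice.

```python
from statistics import median

def affinity_propagation(dist, preference=None, damping=0.5, iterations=200):
    """Cluster n >= 2 items from a distance matrix (list of lists).

    Similarity is taken as negative distance (an assumption).
    Returns (exemplars, labels); labels[i] is the exemplar index of item i,
    or -1 if no exemplar emerged.
    """
    n = len(dist)
    S = [[-float(dist[i][k]) for k in range(n)] for i in range(n)]
    if preference is None:
        # Common default: median of the off-diagonal similarities.
        preference = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well k suits i compared with i's best alternative.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # Availability: how much support k has gathered as an exemplar.
        for k in range(n):
            positive = [max(0.0, R[i][k]) for i in range(n)]
            positive[k] = R[k][k]
            total = sum(positive)
            for i in range(n):
                if i == k:
                    new = total - R[k][k]
                else:
                    new = min(0.0, total - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels

# Toy data: two well-separated groups of points on a line.
positions = [0, 1, 3, 100, 101, 103]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
exemplars, labels = affinity_propagation(toy_dist, preference=-10)
```

The `preference` value controls how many clusters emerge: values closer to zero yield more exemplars, more negative values yield fewer. The number of clusters is not specified in advance.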
