Question

Which Method Is The Best Dissimilarity Measurement For Hierarchical Clustering Of DNA Methylation Data

0

12.4 years ago

Gangcai ▴ 230

Dear all, I have performed hierarchical clustering on bisulfite DNA methylation data with the R package pvclust, using two different dissimilarity measures: Euclidean distance and Pearson correlation. However, the tree structures produced by the two measures differ. Which one should I use for hierarchical clustering of bimodally distributed DNA methylation data? Are there any published papers that have already compared different dissimilarity measures?

Thanks in advance.

dna methylation clustering • 4.6k views

updated 3.9 years ago by Biostar 20 • written 12.4 years ago by Gangcai ▴ 230

3

As has often been said (here and elsewhere), trying to recommend a single best method for data mining is futile, given the lack of a gold standard to compare your results with. Clustering is exploratory and used for hypothesis generation, so the way to go is to apply many different methods (including other clustering methods such as kmeans and Mclust) and to evaluate the results in the light of your biological knowledge. Also use, e.g., GO analysis, pathway analysis and GSEA.
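To make concrete why the two dissimilarities in the question can give different trees, here is a minimal sketch in plain Python. The sample names and beta values are made up for illustration, and "correlation dissimilarity" is assumed here to mean one minus the Pearson correlation, which is a common convention but not the only one.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_dissimilarity(x, y):
    """One minus the Pearson correlation (assumed convention)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)

# Hypothetical beta values for three samples over four CpG sites.
samples = {
    "A": [0.1, 0.2, 0.3, 0.4],
    "B": [0.5, 0.6, 0.7, 0.8],  # same trend as A, shifted upwards
    "C": [0.4, 0.3, 0.2, 0.1],  # opposite trend to A, similar level
}

def nearest(name, dissimilarity):
    """Name of the sample closest to `name` under the given measure."""
    others = [k for k in samples if k != name]
    return min(others, key=lambda k: dissimilarity(samples[name], samples[k]))

print("Euclidean nearest to A:  ", nearest("A", euclidean))
print("Correlation nearest to A:", nearest("A", pearson_dissimilarity))
```

In this toy data, Euclidean distance pairs A with C (similar overall level), while the correlation-based measure pairs A with B (same shape across sites). Neither is wrong; they answer different questions, which is why comparing several measures is worthwhile.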

12.4 years ago by Michael 54k

score 1 · Accepted Answer · 2011-12-12

1

12.4 years ago

Sean Davis 26k

I'd suggest converting these values to M-values, computing the distance metric on the M-values, and then displaying the beta values. Keep in mind that there is no "right" answer when clustering, so experimentation is necessary.
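As a rough sketch of that workflow, the snippet below uses the usual logit-style definition of the M-value, M = log2(beta / (1 - beta)), and computes pairwise Euclidean distances on the transformed values. The helper names, the toy beta values, and the small `eps` clamp (used to avoid infinite values at beta = 0 or 1) are assumptions for illustration, not part of any particular package.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Convert a beta value to an M-value, M = log2(beta / (1 - beta)).

    `eps` is an assumed clamp that keeps beta away from exactly 0 or 1.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical beta values: three samples, three CpG sites.
beta = {
    "s1": [0.02, 0.50, 0.98],
    "s2": [0.10, 0.50, 0.90],
    "s3": [0.50, 0.50, 0.50],
}

# Step 1: transform to M-values for the distance computation only.
m_values = {name: [beta_to_m(b) for b in row] for name, row in beta.items()}

# Step 2: pairwise distances on the M-value scale.
names = sorted(beta)
dist = {(p, q): euclidean(m_values[p], m_values[q]) for p in names for q in names}

for p in names:
    print(p, ["%.2f" % dist[(p, q)] for q in names])

# Step 3 (not shown): cluster on `dist`, but plot the original `beta`
# values, since they are easier to interpret as methylation fractions.
```

The transform stretches values near 0 and 1, so differences between, say, 0.02 and 0.10 count for more on the M-value scale than they do on the beta scale. Whether that is desirable depends on the data, so it is worth comparing against clustering on the raw beta values.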