Hello everyone,

I am seeking some tips on how to calculate the Pearson/Spearman correlation between two ChIP-seq peak profiles. I have the coordinates of the peaks in BED format.

I think I have to bin the peaks and correlate the read counts in aligned bins, but I'm wondering how to make sure the correct bins are correlated with each other.

If two similar peaks (from the two replicate IPs) have fairly different start coordinates, shifted by several kb, then it's possible they will fall into different bins. How do I account for this in the correlation, to make sure the bins are properly aligned?

Thanks a lot for your suggestions.

I'm not sure how two peaks could be considered "similar" if their locations are several kb apart? We usually think of peaks being the same in replicates if their coordinates at least overlap.
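
To make "at least overlap" concrete, here is a minimal Python sketch of an overlap test on BED-style intervals. The function names `peaks_overlap` and `shared_peaks` are made up for illustration, and it assumes peaks are `(chrom, start, end)` tuples with half-open BED coordinates; for real data a dedicated interval tool would be the better choice.

```python
def peaks_overlap(a, b):
    """True if two (chrom, start, end) intervals share at least one base.

    Assumes BED-style half-open coordinates, so an interval ending at 200
    does not overlap one starting at 200.
    """
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def shared_peaks(peaks_a, peaks_b):
    """Peaks from peaks_a that overlap at least one peak in peaks_b.

    Quadratic in the number of peaks; fine for a quick check only.
    """
    return [a for a in peaks_a if any(peaks_overlap(a, b) for b in peaks_b)]
```

With this definition, two peaks whose coordinates are several kb apart and do not touch would simply not be counted as the same peak.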

That said, a very easy way to get Pearson or Spearman correlation values is to read all the peaksets into the `DiffBind` package in `Bioconductor`. Assigning the value returned by `dba.plotHeatmap()` (or just `plot()`) to a variable will give you a matrix of correlation values.
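
If you would rather do the binning from the question by hand, here is a rough Python sketch, not what `DiffBind` does internally. It places both samples on the same fixed genome-wide grid, so bins are aligned by coordinate, not by peak. All function names are hypothetical. Since only BED coordinates are available, it uses covered bases per bin as a stand-in for read counts, assumes peaks within one file do not overlap each other, and ignores bins that are empty in both samples.

```python
from collections import defaultdict
from math import sqrt


def binned_coverage(peaks, bin_size):
    """Map (chrom, bin_index) -> bases covered by peaks in that bin."""
    cov = defaultdict(int)
    for chrom, start, end in peaks:
        first, last = start // bin_size, (end - 1) // bin_size
        for b in range(first, last + 1):
            lo = max(start, b * bin_size)
            hi = min(end, (b + 1) * bin_size)
            cov[(chrom, b)] += hi - lo
    return cov


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation, computed as Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


def profile_correlation(peaks_a, peaks_b, bin_size=1000):
    """Return (pearson, spearman) over bins covered in either sample."""
    cov_a = binned_coverage(peaks_a, bin_size)
    cov_b = binned_coverage(peaks_b, bin_size)
    keys = sorted(set(cov_a) | set(cov_b))
    x = [cov_a.get(k, 0) for k in keys]
    y = [cov_b.get(k, 0) for k in keys]
    return pearson(x, y), spearman(x, y)
```

Because the grid is shared, a peak that is shifted in one replicate lowers the correlation instead of being silently matched to the wrong bin. A larger `bin_size` makes the comparison more tolerant of small shifts.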

I just did that for some ChIP-seq samples that I'm analysing and I'm wondering: is that `plot(results)` from DiffBind the same as the cross-correlation plot that people in ChIP-seq use to see the correlation of peaks between samples? (Meaning: are they interchangeable, or is there some difference?)