Question

How to correlate ChIP-seq peaks?

1

5.8 years ago

Zee_S ▴ 60

Hello everyone,

I am seeking some tips on how to calculate the Pearson/Spearman correlation between two ChIP-seq peak profiles. I have the coordinates of the peaks in BED format.

I think I have to bin the peaks and correlate the read counts in aligned bins, but I'm wondering how to make sure the correct bins are correlated with each other.

If two similar peaks (from the two replicate IPs) have fairly different start coordinates that are shifted by several kb, then it's possible they will fall into different bins. How can I account for this in the correlation to make sure that the bins are properly aligned?
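The binning idea described above can be sketched roughly as follows. This is only a minimal illustration in plain Python, under several assumptions that are not in the post: peaks are given as hypothetical `(chrom, start, end, score)` tuples, the peak score stands in for a read count (a BED file of peaks carries no reads), the bin width `BIN_SIZE` is arbitrary, and only bins touched by a peak in at least one sample are compared. Because both samples are binned on the same fixed genomic grid, bin `(chrom, i)` in one sample always corresponds to bin `(chrom, i)` in the other.

```python
from collections import defaultdict
from math import sqrt

BIN_SIZE = 5000  # hypothetical bin width in bp; not a recommended value


def bin_signal(peaks, bin_size=BIN_SIZE):
    """Sum peak scores into fixed genomic bins keyed by (chrom, bin_index)."""
    bins = defaultdict(float)
    for chrom, start, end, score in peaks:
        # BED intervals are 0-based, half-open; a peak adds its score
        # to every bin it touches.
        for idx in range(start // bin_size, (end - 1) // bin_size + 1):
            bins[(chrom, idx)] += score
    return bins


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    denom = sqrt(vx * vy)
    return cov / denom if denom > 0 else float("nan")


def rank(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def correlate_bins(peaks_a, peaks_b, bin_size=BIN_SIZE):
    """Return (pearson, spearman) over the union of occupied bins."""
    a = bin_signal(peaks_a, bin_size)
    b = bin_signal(peaks_b, bin_size)
    keys = sorted(set(a) | set(b))  # same bin keys for both samples
    x = [a.get(k, 0.0) for k in keys]
    y = [b.get(k, 0.0) for k in keys]
    return pearson(x, y), spearman(x, y)
```

Peaks that straddle a bin boundary, or that sit several kb apart, can still land in different bins with this scheme; a larger `bin_size` reduces that effect at the cost of resolution.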

Thanks a lot for your suggestions.

chip seq correlation pearson spearman peaks • 2.7k views

score 1 · Answer 1 · 2018-07-16

1

5.8 years ago

Rory Stark ★ 2.0k

I'm not sure how two peaks could be considered "similar" if their locations are several kb apart? We usually think of peaks being the same in replicates if their coordinates at least overlap.

That said, a very easy way to get Pearson or Spearman correlation values is to read all the peaksets into the DiffBind package in Bioconductor. Assigning the value returned by dba.plotHeatmap() (or just plot()) to a variable will give you a matrix of correlation values.
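To illustrate the overlap-based idea outside of R, here is a rough sketch in plain Python. It is not DiffBind's implementation and makes several assumptions of its own: peaks are hypothetical `(chrom, start, end, score)` tuples, overlapping peaks from all samples are merged into consensus intervals, each sample is scored per interval by the maximum score of its overlapping peaks (0 if it has none), and only Pearson correlation is computed.

```python
from math import sqrt


def merge_overlapping(peaksets):
    """Merge peaks from all samples into consensus intervals."""
    all_peaks = sorted((c, s, e) for peaks in peaksets for c, s, e, _ in peaks)
    merged = []
    for chrom, start, end in all_peaks:
        # Extend the previous interval if this peak overlaps it
        # (half-open coordinates, so touching ends do not count).
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


def score_vector(peaks, consensus):
    """Per consensus interval: max score of overlapping peaks, else 0."""
    vec = []
    for chrom, start, end in consensus:
        hits = [sc for c, s, e, sc in peaks
                if c == chrom and s < end and e > start]
        vec.append(max(hits, default=0.0))
    return vec


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    denom = sqrt(vx * vy)
    return cov / denom if denom > 0 else float("nan")


def correlation_matrix(peaksets):
    """Sample-by-sample Pearson correlations over the consensus intervals."""
    consensus = merge_overlapping(peaksets)
    vectors = [score_vector(p, consensus) for p in peaksets]
    return [[pearson(u, v) for v in vectors] for u in vectors]
```

Because every sample is scored on the same consensus intervals, the vectors line up by construction, which sidesteps the bin-alignment concern for peaks that overlap.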

0

I just did that for some ChIP-seq samples that I'm analysing, and I'm wondering: is the plot(results) output from DiffBind the same as the cross-correlation plot that people use in ChIP-seq to see the correlation of peaks between samples? (That is, are they interchangeable, or is there some difference?)

5.1 years ago by msimmer92 ▴ 300