ChIP-seq differential binding with multiple replicates using MACS2 bdgdiff.
6.1 years ago
shawn.w.foley ★ 1.3k

Hello,

I'm running a differential binding ChIP-seq experiment and am struggling to find the best way to incorporate my replicates using MACS2. I have two sets of samples: control replicates 1-3 with a control input, and treated replicates 1-3 with a treated input. Following the recommended macs2 callpeak usage, I run:

macs2 callpeak -B -t ctrl.rep1.bam ctrl.rep2.bam ctrl.rep3.bam -c ctrl.input.bam -f BAM -g hs -n ctrl.repAll.peaks --nomodel --extsize 146 --buffer-size 1000000

macs2 callpeak -B -t treat.rep1.bam treat.rep2.bam treat.rep3.bam -c treat.input.bam -f BAM -g hs -n treat.repAll.peaks --nomodel --extsize 146 --buffer-size 1000000

This generates output files in bedGraph format that I can then pass to macs2 bdgdiff:

macs2 bdgdiff --t1 treat.repAll.peaks_treat_pileup.bdg --t2 ctrl.repAll.peaks_treat_pileup.bdg --c1 treat.repAll.peaks_control_lambda.bdg --c2 ctrl.repAll.peaks_control_lambda.bdg -l 146 --d1 spikeIn_treat --d2 spikeIn_ctrl --o-prefix treat_vs_ctrl

However, these bedGraph files come from the POOLED replicates. I want to use my individual replicates to gain some statistical power.

1) Does anyone have experience running macs2 bdgdiff across multiple replicates for treated/untreated samples?

2) How would I combine the log odds ratio in the macs2 bdgdiff output files to get a meaningful measurement of the difference in binding pre- and post-treatment?

I've been searching online and can't find anything. Any help would be appreciated.

ChIP-Seq MACS2 bdgdiff • 5.0k views

In my opinion, bdgdiff is not very useful. I'd recommend calling peaks on each replicate individually and using something like DiffBind, which gives more robust, believable results that are actually based on the signal at your peaks. The statistics are a lot more useful too, and it integrates nicely with their QC package (ChIPQC).
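
For the "call peaks on each replicate individually" part, a minimal sketch could look like the loop below. It reuses the file names and the --nomodel --extsize 146 settings from the question; the DRY_RUN switch and the run helper are hypothetical additions for illustration, so that the script only prints the commands unless told otherwise.

```shell
#!/usr/bin/env bash
# Sketch: one macs2 callpeak run per replicate instead of one pooled run.
# File names follow the question; DRY_RUN and run() are illustrative helpers.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run macs2

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Loop over both conditions and the three replicates of each.
call_replicate_peaks() {
    local cond rep
    for cond in ctrl treat; do
        for rep in 1 2 3; do
            run macs2 callpeak \
                -t "${cond}.rep${rep}.bam" \
                -c "${cond}.input.bam" \
                -f BAM -g hs \
                -n "${cond}.rep${rep}" \
                --nomodel --extsize 146
        done
    done
}

call_replicate_peaks
```

Running it with DRY_RUN=0 should leave one NAME_peaks.narrowPeak file per replicate (for example ctrl.rep1_peaks.narrowPeak), which, together with the matching BAM files, is the kind of per-replicate input a DiffBind sample sheet is built from.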

4.9 years ago
ATpoint 81k

As this question got a lot of views but no answer, I suggest reading the csaw vignette to get a good idea of how to perform a differential ChIP-seq analysis. As you have replicates (which is good and desirable), you can use the edgeR framework, which csaw uses internally, to perform a statistically sound analysis. csaw itself suggests using sliding windows across the genome to test for differential binding across replicates. See the vignette for details.

Alternatively, one could pool all BAM files at equal contribution (that is, pool the same number of reads from each file) and call peaks on that pooled dataset to get a reference peak set. That peak set is then used to build a count matrix, followed by the edgeR (or a tool of choice) pipeline: dispersion estimation, GLM fitting, testing of contrasts, FDR correction, and so on.
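
The "pool at equal contribution" step could be sketched roughly as follows. This is only one way to do it, assuming samtools is installed: every BAM file is downsampled to the read count of the smallest one with samtools view -s, and the results are merged. The function names, the seed value 42 and the output name are hypothetical; note also that samtools view -c counts every record in the file (including unmapped or secondary alignments), so filtering beforehand may be needed.

```shell
#!/usr/bin/env bash
# Sketch: pool BAM files so that each contributes about the same number of reads.
# Assumes samtools is on PATH; all function names here are illustrative.
set -euo pipefail

# Print the smallest of the integer arguments.
min_count() {
    printf '%s\n' "$@" | sort -n | sed -n '1p'
}

# Build the value for `samtools view -s`: integer part = seed, decimals = fraction kept.
# The fraction is truncated (not rounded) so it can never reach 1.0 and spill into the seed.
# Usage: subsample_arg SEED TARGET TOTAL
subsample_arg() {
    awk -v seed="$1" -v target="$2" -v total="$3" 'BEGIN {
        frac = int(1000000 * target / total) / 1000000
        printf "%.6f\n", seed + frac
    }'
}

# Usage: pool_equal_depth OUT.bam IN1.bam IN2.bam ...
pool_equal_depth() {
    local out="$1"; shift
    local seed=42 i=0 tmpdir bam n target
    local counts=() parts=()
    tmpdir="$(mktemp -d)"

    # Count the records in every input file.
    for bam in "$@"; do
        counts+=("$(samtools view -c "$bam")")
    done
    target="$(min_count "${counts[@]}")"

    for bam in "$@"; do
        n="${counts[$i]}"
        if [ "$n" -eq "$target" ]; then
            parts+=("$bam")   # smallest file: keep all of its reads
        else
            samtools view -b -s "$(subsample_arg "$seed" "$target" "$n")" \
                -o "$tmpdir/sub.$i.bam" "$bam"
            parts+=("$tmpdir/sub.$i.bam")
        fi
        i=$((i + 1))
    done

    samtools merge -f "$out" "${parts[@]}"
    rm -r "$tmpdir"
}

# Example call (not executed here):
#   pool_equal_depth pooled.bam ctrl.rep1.bam ctrl.rep2.bam ctrl.rep3.bam \
#       treat.rep1.bam treat.rep2.bam treat.rep3.bam
```

Because the subsampling is random, each file ends up with approximately, not exactly, the target number of reads. The pooled file can then go through macs2 callpeak to define the reference peak set, and the per-replicate BAM files are counted against those peaks to build the matrix for edgeR.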
