I have 3 questions about running the ChromHMM tool to find the combinatorial states of multiple ChIP-seq data sets.
I have replicate ChIP-seq data (4 replicates each) for H3K4me3 and H3K27me3 in control vs treatment. For ChromHMM, do I need to merge all the replicates before running it, or can I run the replicates individually?
As the tutorial suggests, one should start with the BED file derived from the original BAM alignment file. But has anyone tried using MACS2 peak files instead? I mean, first using MACS2 to call reliable peaks with a cutoff, and then using ChromHMM to call the combinatorial chromatin states.
Also, I have two groups, control vs treatment, and in each group I have both H3K4me3 and H3K27me3 ChIP-seq data (with 4 replicates). To define bivalent chromatin states, do I give ChromHMM all the control and treatment H3K4me3 and H3K27me3 data to learn the model, or only the control data?
One of the first steps of ChromHMM is its own peak calling (binarization). Therefore, you need to use the BED files converted from the BAM files, not peak files. Note, though, that the default setting (i.e. ) of the Poisson threshold (argument -p, or poissonthreshold, in the BinarizeBed command) gives less stringent calls. You may change it to 1e-5 or lower. If you want to use MACS peak calls instead, you can pass the -peaks argument to the BinarizeBed command. However, the latter is not recommended for broad marks such as H3K27me3 and H3K9me3.
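To see why a smaller Poisson threshold gives more stringent calls, here is a minimal Python sketch of the idea. It is not ChromHMM's actual code: it assumes a bin is called "present" when its read count reaches the smallest count whose Poisson tail probability falls below the threshold, and the background rate `lam` (mean reads per bin) and both function names are made up for illustration.

```python
import math

def poisson_tail(n, lam):
    """Return P(X >= n) for X ~ Poisson(lam)."""
    cdf = 0.0
    term = math.exp(-lam)          # P(X = 0)
    for k in range(n):
        cdf += term                # add P(X = k)
        term *= lam / (k + 1)      # move on to P(X = k + 1)
    return max(0.0, 1.0 - cdf)

def count_threshold(lam, p):
    """Smallest read count n with P(X >= n) < p (illustrative rule only)."""
    n = 1
    while poisson_tail(n, lam) >= p:
        n += 1
    return n

if __name__ == "__main__":
    lam = 2.0  # hypothetical average number of reads per bin
    for p in (1e-4, 1e-5, 1e-6):
        print(f"p = {p:g}: a bin needs at least {count_threshold(lam, p)} reads")
```

Lowering the threshold can only raise the required read count, so fewer bins pass. The sketch ignores how ChromHMM adjusts the expected count when control data are supplied.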
Yes, you have to provide control files for all the treatment files. They are again required for calling the peaks.
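As a rough illustration of how the inputs could be laid out, the Python sketch below writes a tab-delimited cell-mark file table and assembles a BinarizeBed command line. It assumes that "control files" means input/background BED files listed in the optional fourth column of the table. All file names, directory names, and the helper functions are hypothetical, and the `-mx4000M` memory setting is just an example value.

```python
import csv
import io
import shlex

def build_rows(conditions=("control", "treatment"),
               marks=("H3K4me3", "H3K27me3"), n_reps=4):
    """One row per replicate: condition, mark, ChIP BED, input-control BED."""
    rows = []
    for cond in conditions:
        for mark in marks:
            for rep in range(1, n_reps + 1):
                # Hypothetical file naming scheme.
                rows.append((cond, mark,
                             f"{cond}_{mark}_rep{rep}.bed",
                             f"{cond}_input_rep{rep}.bed"))
    return rows

def cellmark_table(rows):
    """Render the rows as tab-delimited text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()

def binarize_command(chrom_sizes, input_dir, table_path, output_dir,
                     poisson_threshold=1e-5, jar="ChromHMM.jar"):
    """Build the command string; options go before the positional arguments."""
    args = ["java", "-mx4000M", "-jar", jar, "BinarizeBed",
            "-p", str(poisson_threshold),
            chrom_sizes, input_dir, table_path, output_dir]
    return shlex.join(args)

if __name__ == "__main__":
    print(cellmark_table(build_rows()), end="")
    print(binarize_command("chromsizes.txt", "beds", "cellmarkfiletable.txt",
                           "binarized"))
```

The script only prints the table and the command, so you can check them against the ChromHMM manual before running anything.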