Question

ChIP-Seq input normalization with deeptools

0

4.1 years ago

srhic ▴ 60

Hello,

I want to visualize the difference in ChIP signal at specific locations by making average profiles with deeptools. For this I have generated RPKM-normalized bigwig files for my treatment and control conditions, and plotting them gives me good results. However, these files are not input-normalized, and I am concerned that this may therefore not be the best way to do the analysis.
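For reference, RPKM scaling only corrects for sequencing depth and bin size. The sketch below is a minimal illustration of that arithmetic with made-up read counts; the function name `rpkm` and all numbers are hypothetical, and this is not the deeptools implementation itself.

```python
def rpkm(reads_in_bin, total_mapped_reads, bin_length_bp):
    """Reads per kilobase of bin per million mapped reads."""
    millions = total_mapped_reads / 1_000_000
    kilobases = bin_length_bp / 1_000
    return reads_in_bin / (millions * kilobases)


# Hypothetical numbers: the same 50 bp bin in two libraries of different depth.
treatment = rpkm(reads_in_bin=40, total_mapped_reads=20_000_000, bin_length_bp=50)
control = rpkm(reads_in_bin=20, total_mapped_reads=10_000_000, bin_length_bp=50)

# Twice the reads in a library that is twice as deep gives the same RPKM.
print(treatment, control)
```

Nothing in this calculation involves the input sample, so background signal that differs between regions is left untouched.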

I am now using deeptools bigwigCompare to normalize each file against its input. This gives a normalized bigwig file that shows the log fold change over input. However, when I look at this file in IGV, I see a lot of regions with negative values, which implies that these regions had more signal in the input than in the ChIP sample. I am not sure what to do with these regions, or whether they mean that my experiment was not reliable. Should I just remove all negative values from the bigwig files (is there a tool to do this?) and compare the positive values between treatment and control using plotProfile?
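To make the sign of these values concrete, here is a minimal sketch of a per-bin log2 ratio, assuming a pseudocount of 1 is added to both tracks. The function name `log2_ratio` and the signal values are hypothetical. It also shows what "removing negative values" would amount to if done by clipping at zero.

```python
import math


def log2_ratio(chip, inp, pseudocount=1.0):
    """log2 of (ChIP + pseudocount) over (input + pseudocount) for one bin."""
    return math.log2((chip + pseudocount) / (inp + pseudocount))


# Hypothetical per-bin signal for one ChIP track and its input.
chip_signal = [12.0, 3.0, 0.0, 7.0]
input_signal = [3.0, 3.0, 1.0, 15.0]

ratios = [log2_ratio(c, i) for c, i in zip(chip_signal, input_signal)]

# Clipping at zero keeps enrichment and flattens depletion to "no signal".
clipped = [max(r, 0.0) for r in ratios]

print(ratios)
print(clipped)
```

Any bin where the input is higher than the ChIP signal comes out negative, including low-coverage background bins where the difference is only a read or two.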

Also, since I am comparing conditions, is it OK if I don’t input-normalize and stick to the first approach? I would appreciate any feedback.

Thanks

ChIP-Seq deeptools • 3.1k views

updated 4.1 years ago by jared.andrews07 ★ 16k • written 4.1 years ago by srhic ▴ 60

0

I have a personal dislike of FPKM (i.e. normalization based only on total read depth). Here are some details on why, along with an alternative way to scale your bigwigs. It is for ATAC-seq, but the same holds true for ChIP-seq: A: ATAC-seq sample normalization (quantil normalization)

4.1 years ago by ATpoint 82k

0

Thanks, I will check it out.

4.1 years ago by srhic ▴ 60

score 3 · Answer 1 · 2020-03-16

3

4.1 years ago

jared.andrews07 ★ 16k

In general, using a method that actually performs valid statistical comparisons of the sample groups at specified positions (like csaw or DiffBind) is the proper way to go about this. These tools take input into account, and your average profiles then become just a way to show that statistically significant difference visually, rather than an attempt to claim that those regions are different based solely on signal profiles. In that case, either method should be appropriate (and the results will likely look fairly similar, assuming equivalent IP efficiency between groups).

As for why input is higher in certain areas, have you checked to ensure they don't overlap the ENCODE blacklisted regions? These are regions with very high artificial signal in ChIP experiments, typically near centromeres/telomeres. I usually ignore peaks in these regions (or remove the reads from these regions), as they will just introduce noise.
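As a rough illustration of that filtering step, the sketch below drops any region that overlaps a blacklisted interval. It assumes BED-style half-open coordinates; the function names and all coordinates are hypothetical and are not taken from an actual blacklist file.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def remove_blacklisted(regions, blacklist):
    """Keep only regions that overlap no blacklisted interval on the same chromosome."""
    kept = []
    for chrom, start, end in regions:
        hit = any(
            chrom == b_chrom and overlaps(start, end, b_start, b_end)
            for b_chrom, b_start, b_end in blacklist
        )
        if not hit:
            kept.append((chrom, start, end))
    return kept


# Hypothetical regions of interest and blacklist entries.
regions = [("chr1", 1000, 1500), ("chr1", 5000, 5600), ("chr2", 1000, 1500)]
blacklist = [("chr1", 1400, 2000), ("chr3", 0, 10000)]

clean = remove_blacklisted(regions, blacklist)
print(clean)
```

For genome-scale data a dedicated interval tool would be a better fit than this quadratic loop; the point here is only the overlap rule.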

4.1 years ago by jared.andrews07 ★ 16k

0

You are correct. I input-normalized my samples and filtered out negative values, and the resulting plots looked pretty much identical to what I had without input normalization.

However, I am still concerned about the input being higher than the IP. I am sure some of these would be blacklisted areas, but when I visualize my input-normalized bigwig in IGV, it seems this issue is very widespread and not limited to specific regions (maybe >30% of all regions are showing negative enrichment). I don't know what to make of this.
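One way to put a number on "how widespread" is to compute the fraction of negative bins separately for bins inside and outside called peaks. The sketch below assumes each bin carries a log2 ratio and a flag for whether it falls in a peak; the function name and all values are hypothetical.

```python
def fraction_negative(values):
    """Fraction of values below zero; 0.0 for an empty list."""
    if not values:
        return 0.0
    return sum(1 for v in values if v < 0) / len(values)


# Hypothetical bins as (log2 ratio over input, bin lies inside a called peak).
bins = [
    (1.8, True), (2.3, True), (-0.2, True), (0.4, True),
    (-0.6, False), (-0.1, False), (0.3, False),
    (-0.9, False), (0.1, False), (-0.4, False),
]

in_peaks = [value for value, in_peak in bins if in_peak]
background = [value for value, in_peak in bins if not in_peak]

print(fraction_negative(in_peaks))
print(fraction_negative(background))
print(fraction_negative([value for value, _ in bins]))
```

If most of the negative bins sit in the background, where a log2 ratio would be expected to fluctuate around zero anyway, a large genome-wide fraction may say little about the quality of the peaks themselves.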

4.1 years ago by srhic ▴ 60

0

What did you ChIP? If it's a TF, it wouldn't be surprising that input is occasionally higher than IP. Do you have good peaks?

4.1 years ago by jared.andrews07 ★ 16k