Question

Peak calling with MACS for DNase-seq data and Peak annotation by Homer

8.4 years ago

Bioinformatist Newbie ▴ 270

I used MACS 1.4 and MACS2 for peak calling, then Homer (annotatePeaks.pl) to annotate the peaks against hg19. Peak scores range from 50 to 3205.08 with MACS 1.4 and from 2 to 342.34 with MACS2. The numbers of peaks identified are 65229 and 74979, respectively. I used the default parameters in both cases. Why is there so much variation in peak scores? What is the role of the peak score? It is not clear from the Homer website.

Does a high peak score mean that a peak is more likely to be a true positive? If so, what would be a sensible threshold for filtering out peaks?

In the supplementary material of a paper, I found that the authors used Homer to identify ChIP-seq peaks and filtered them by peak score > 10, but they do not give a reason for this threshold.
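For illustration, a score filter like the one described in that paper could be sketched as follows. This is only a sketch under assumptions: the function name, the toy rows, and the position of the score column are made up for the example, so check the header of your own annotation file before relying on a column index.

```python
import csv
import io


def filter_peaks_by_score(lines, score_column, threshold):
    """Keep rows of a tab-delimited peak table whose score exceeds threshold.

    `score_column` is a 0-based index chosen by the caller; where the score
    sits in a real file is an assumption to verify against its header.
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)  # first line is treated as the header
    kept = [row for row in reader if float(row[score_column]) > threshold]
    return header, kept


# Toy data, invented for this example (not real peaks).
example = io.StringIO(
    "PeakID\tChr\tStart\tEnd\tStrand\tPeak Score\n"
    "peak_1\tchr1\t100\t250\t+\t342.34\n"
    "peak_2\tchr1\t900\t1050\t+\t2.00\n"
    "peak_3\tchr2\t400\t560\t+\t15.70\n"
)

header, kept = filter_peaks_by_score(example, score_column=5, threshold=10)
for row in kept:
    print(row[0], row[5])
```

The threshold is passed in as a parameter because a cutoff of 10 means different things depending on how the peak caller scales its scores.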

DNase-seq MACS Homer Peak-Calling • 3.7k views

ADD COMMENT • link updated 20 months ago by Ram 43k • written 8.4 years ago by Bioinformatist Newbie ▴ 270

I guess MACS is responsible for the difference, not HOMER.

ADD REPLY • link 8.4 years ago by Zaag ▴ 860

score 0 · Answer 1 · 2015-12-10

8.4 years ago

Ian 6.0k

I hope I understood you correctly, but MACS 1.4 reports the p-value as -10*log10(P), whereas MACS2 reports -log10(P). This explains the 10-fold difference you are seeing.
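To make the relationship concrete, here is a minimal Python sketch, assuming the scores really are -10*log10(P) for MACS 1.4 and -log10(P) for MACS2 as described above. The function names are hypothetical and not part of MACS or Homer.

```python
import math


def macs14_score_to_pvalue(score):
    """Invert score = -10 * log10(P) (assumed MACS 1.4 scaling)."""
    return 10 ** (-score / 10.0)


def macs2_score_to_pvalue(score):
    """Invert score = -log10(P) (assumed MACS2 scaling)."""
    return 10 ** (-score)


def macs14_to_macs2_scale(score):
    """Rescale a -10*log10(P) score to the -log10(P) scale."""
    return score / 10.0


# The lowest MACS 1.4 score in the question is 50.
print(macs14_to_macs2_scale(50))  # 5.0
# Both scalings give the same p-value once the factor of 10 is removed.
print(math.isclose(macs14_score_to_pvalue(50), macs2_score_to_pvalue(5.0)))
```

On this reading, the top MACS 1.4 score of 3205.08 corresponds to about 320.5 on the -log10(P) scale, which is the same order of magnitude as the top MACS2 score of 342.34.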

ADD COMMENT • link 8.4 years ago by Ian 6.0k

1- That explains the difference in scores, but how should I interpret the difference in the number of peaks?

2- Some peaks have a high score while others have a low score. How is this score calculated, and what does it mean? Are high-scoring peaks more likely to be true positives, and low-scoring peaks more likely to be false positives?

ADD REPLY • link 8.4 years ago by Bioinformatist Newbie ▴ 270