distanceToTSS different in HOMER and ChIPseeker annotation
5.7 years ago

I am comparing the annotations produced by HOMER and ChIPseeker. I have noticed that, for a gene annotated by both tools, the distance to the TSS is always larger in magnitude in HOMER than in ChIPseeker. For example, I have a set of genes for which I am annotating H3K4me3 peaks. With HOMER, the distances for those genes fall slightly beyond -2000 bp, while ChIPseeker reports them as just inside -2000 bp (-1983 bp, -1970 bp, etc.). What is the reason for this discrepancy?

chipseeker ChIP-Seq • 3.8k views
5.7 years ago

HOMER uses RefSeq annotations by default, while ChIPseeker uses UCSC annotations if you follow the default command from its tutorial. In ChIPseeker the gene models come from whatever object you pass as TxDb:

peakAnno <- annotatePeak(files[[4]], tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Hs.eg.db")

See page 5 of the org.Hs.eg.db manual

ChIPseeker also lets you set the window around each gene's TSS that counts as the TSS region (-3 kb to +3 kb in the command above). HOMER, by default, uses -1 kb to +100 bp. So make sure you are using the same annotation for both tools (it is easy to change in either HOMER or ChIPseeker), and make sure the TSS region is set to the same window in each.
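To illustrate why the annotation source matters, here is a minimal base-R sketch. The function name `signed_distance_to_tss` and all coordinates are made up for illustration; they are not taken from HOMER, ChIPseeker, or any real gene. It only shows that if two annotation sets place the TSS of the same gene a few bases apart, the same peak gets two different distances.

```r
# Signed distance from a peak position to a TSS.
# Negative values mean the peak is upstream of the TSS on the gene's strand.
signed_distance_to_tss <- function(peak_pos, tss_pos, strand = "+") {
  d <- peak_pos - tss_pos
  if (strand == "-") -d else d
}

peak_pos <- 10000          # hypothetical peak position
tss_annotation_a <- 12010  # hypothetical TSS of the gene in annotation set A
tss_annotation_b <- 11983  # hypothetical TSS of the same gene in annotation set B

dist_a <- signed_distance_to_tss(peak_pos, tss_annotation_a)
dist_b <- signed_distance_to_tss(peak_pos, tss_annotation_b)

# The same peak lands on either side of a -2000 bp cutoff.
print(c(A = dist_a, B = dist_b))
```

With these made-up numbers the peak is 2010 bp upstream under one annotation and 1983 bp upstream under the other, which is the kind of small shift described in the question.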


Makes sense. Thanks a lot!

I also have one small related question. After annotating with ChIPseeker, I want to do functional enrichment on the genes, but only on those genes that have a ChIP-seq peak (H3K4me3 in my case) within +/-2000 bp of the TSS. The tutorial says to use the seq2gene function to get the genes and then pass them to enrichPathway. The code is as follows:

gene <- seq2gene(peak, tssRegion = c(-1000, 1000), flankDistance = 3000, TxDb=txdb)

How can I configure this to get only the genes with H3K4me3 peaks at their promoter (+/-2000 bp) and pass them to the pathway analysis? I would really appreciate your help!

Thank you very much!


This would be better posed as a separate question, so that you can post the necessary details along with what you've tried. I'm not super familiar with ChIPseeker, but it seems to me that an easy way to do this is to make sure the peak argument of seq2gene contains only the H3K4me3 peaks that are near a TSS. You can work that out by intersecting your peak list with a list of TSSs, either in R or in Python/bash, and then reading the filtered peaks file in as a GRanges object. I'd also change tssRegion to c(-2000, 2000) and flankDistance to 2000, I'd think.
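One possible way to get such a gene list is sketched below. Note that it swaps in a different route from the one suggested above: instead of pre-filtering peaks for seq2gene, it filters the annotatePeak result on its distanceToTSS column. The helper names `promoter_gene_ids` and `genes_with_promoter_peak` are hypothetical, the toy table stands in for real output, and the wrapper assumes ChIPseeker and a TxDb object are available (it is defined here but not run).

```r
# Keep the IDs of genes that have a peak within `window` bp of their TSS.
# `anno` is a data.frame with `distanceToTSS` and `geneId` columns, as
# produced by as.data.frame() on an annotatePeak() result.
promoter_gene_ids <- function(anno, window = 2000) {
  near_tss <- !is.na(anno$distanceToTSS) & abs(anno$distanceToTSS) <= window
  unique(as.character(anno$geneId[near_tss]))
}

# Hypothetical wrapper: annotate the peaks, then filter by distance.
# Requires ChIPseeker and a TxDb object; defined only, not executed here.
genes_with_promoter_peak <- function(peak_file, txdb, window = 2000) {
  peak_anno <- ChIPseeker::annotatePeak(
    peak_file,
    tssRegion = c(-window, window),
    TxDb = txdb
  )
  promoter_gene_ids(as.data.frame(peak_anno), window)
}

# Toy annotation table standing in for real annotatePeak() output.
toy <- data.frame(
  geneId = c("1001", "1002", "1002", "1003"),
  distanceToTSS = c(-1983, 150, -2500, 4000)
)
promoter_genes <- promoter_gene_ids(toy)
print(promoter_genes)
```

On real data, the resulting vector could then be passed to enrichPathway in place of the seq2gene output, assuming the gene IDs are Entrez IDs, which is what enrichPathway expects.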
