Error while annotating with Chipseeker
1
0
Entering edit mode
5.2 years ago
hamaor ▴ 10

I followd the Diffbind pipeline for a ChipSeq data:

peaksets2 = dba(sampleSheet="SampleSheet.csv",config = data.frame(RunParallel=TRUE, AnalysisMethod=DBA_DESEQ2, bCorPlot=FALSE, bUsePval=TRUE, minQCth=10),minOverlap = 2)

peaksets2 <- dba.count(peaksets2)
peaksets2 <- dba.contrast(peaksets2, categories=DBA_FACTOR)
peaksets2 <- dba.analyze(peaksets2,bCorPlot=FALSE)

than I tried to annotate the peaks with Chipseeker:

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
peakAnno_inAll = annotatePeak(peaksets2.DB, TxDb=txdb, annoDb = "org.Hs.eg.db", level = "gene")

but I keep get the same error:

> peakAnno_inAll = annotatePeak(peaksets2.DB, TxDb=txdb, annoDb = "org.Hs.eg.db", level = "gene")
>> preparing features information...         2019-02-25 14:26:15 
>> identifying nearest features...       2019-02-25 14:26:15 
>> calculating distance from peak to TSS...  2019-02-25 14:26:15 
>> assigning genomic annotation...       2019-02-25 14:26:15 
Error in `[<-.data.frame`(`*tmp*`, tssIndex, "Promoter", value = TRUE) : 
  replacement has 1 row, data has 0
In addition: Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
2: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
3: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
4: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
5: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

what do I do wrong?

ChIP-Seq diffbind chipseeker annotation • 4.1k views
ADD COMMENT
1
Entering edit mode

I have not tried to annotate peaks with Chipseeker; I use Homer's annotatePeaks.pl to annotate peaks.

annotatePeaks.pl peakfile hg19  > output.txt
ADD REPLY
0
Entering edit mode

could you send me the link for the .pl script? thanks

ADD REPLY
0
Entering edit mode

http://homer.ucsd.edu/homer/download.html - You will have to download homer software which includes this annotation script http://homer.ucsd.edu/homer/ngs/annotation.html - here's a link with detailed explanation of how Homer performs annotation of peaks.

ADD REPLY
0
Entering edit mode

thanks. I download it and tried using it, please see my comment about it

ADD REPLY
0
Entering edit mode

I generated my peaks using MACS2 and a peak file is looks like that:

 chr     start   end     length  abs_summit      pileup  -log10(pvalue)  fold_enrichment -log10(qvalue)  name
1       1024833 1025212 380     1024925 26.00   12.93083        5.17873 7.28933 P53_1_peak_1
1       10474802        10475143        342     10474959        21.00   7.06303 3.39586 2.57685 P53_1_peak_2

As it not correlated with the format supported by Homer, I modified it to look like that:

PeakID  chr     start   end     strand  length  abs_summit      pileup  #NAME?  fold_enrichment #NAME?  name
1       1       1179379 1179721 *       343     1179575 27      12.79556        4.97813 9.00724 PAX8_2_peak_1
2       1       1241319 1241739 *       421     1241490 36      20.38177        6.53943 16.17405        PAX8_2_peak_2
3       1       1283799 1284120 *       322     1283978 32      21.3853 7.75134 17.13212        PAX8_2_peak_3

By Adding PeakID column as first column, and "strand" column at the 5th column.

I'm running the Homer command: annotatePeaks.pl Homer_test_file.txt hg18 > outputfile.txt

and the final results ending with NA's for every peak:

PeakID (cmd=annotatePeaks.pl Homer_test_file.txt hg18)  Chr     Start   End     Strand  Peak Score      Focus Ratio/Region Size Annotation   Detailed Annotation      Distance to TSS Nearest PromoterID      Entrez ID       Nearest Unigene Nearest Refseq  Nearest Ensembl Gene Name    Gene Alias       Gene Description        Gene Type
101101915       2216    15      101101541       +       0       NA      NA      NA      NA      NA
74178602        3035    3       74178208        +       0       NA      NA      NA      NA      NA
35884393        1001    11      35884090        +       0       NA      NA      NA      NA      NA

I would love to get insight about what have I done wrong. Thanks a lot, Maor

ADD REPLY
0
Entering edit mode

Your peakfile should have these five tab-delimited columns -

  • Column1: Unique Peak ID
  • Column2: chromosome
  • Column3: starting position
  • Column4: ending position
  • Column5: Strand (+/- or 0/1, where 0="+", 1="-")

Please ensure your chromosomes have "chr" before the number.

ADD REPLY
0
Entering edit mode
4.2 years ago

changing the chromosome names to "chr1" helped...

ADD COMMENT

Login before adding your answer.

Traffic: 1496 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6