Biostar Beta. Not for public use.
Multiple ways to make an IntersectBed on bed file after peak calling : which one is the good one ??
Entering edit mode
19 months ago
anais1396 • 20

Hello Biostars !!

I have .bed files resulting from Macs peak calling that are biological replicates. I have 5 replicates for healthy people and 6 for patients. I merged the 5 samples to a set for healthy represented in one file and I made the same for the patients so I have one bed file for all the merged peaks from patients and one bed file for healthy.

Now I'm looking for common genomic regions between these two files. I performed an IntersectBed with a UCSC table for the known genes on the genome but I see multiple ways to do the intersectBed :

  1. intersect condition1 with ucsc table, then intersect condition2 with ucsc and then make a 'super' intersect between the two files obtained ?
  2. As intersect now allows the use of multiple files, make an intersect between condition1+condition2+ucsc table ?
  3. Use Venn diagram but here we have the same problem : make a venn diagram 3-points with cond1+cond2+ucsc or make a venn diagram with cond1+ucsc then with cond2+ucsc and compare them ?

Which strategy is the good one ??

Thank you in advance !!


Entering edit mode
12 weeks ago
ATpoint 17k

IMHO you should go for a differential read count analysis in the peak regions, rahter than doing intersections. Reason is that peak presence or absence is a rather binary statement, but not really informative. Example: Say, very simplyfied, a peak requires 100 reads to be significant, and your healthies have 999-556-777-599-1000 read counts for a given peak, respektively. The patients have 100-123-222-321-132. By intersection, you would see no difference at all, suggesting no difference between the conditions. Still, the counts are much higher in the patients, suggesting an effect for this genomic regions in the disease situation that may be of interest. This you'll only pick up by scanning for differential counts. Popular packages for this, assuming you have ChIP-seq or ATAC-seq data are DiffBind (which uses the DESeq2 framework internally), or csaw. Both have extensive documentation, so that you can dive into it.


Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.1