Understanding the number of intersections in bedtools jaccard
2.8 years ago
QLFblaireau ▴ 30

Hello, I am using bedtools jaccard to compare two VCF files, as follows:

bedtools jaccard -a ancestors.calls.norm.snp.vcf.gz -b GC078310.calls.norm.snp.vcf.gz
intersection    union-intersection    jaccard     n_intersections
1606899         1806667               0.889427    1536700

What I do not understand is why n_intersections is equal to 1536700. In particular, the difference between intersection and n_intersections is not clear to me.

Any help would be greatly appreciated.

Thanks a lot!

bedtools comparison vcf snps • 1.0k views
2.8 years ago

Suppose you have two 100 bp regions that overlap by 50%:

region 1 ----------
region 2      ----------
              |---|
               50bp

Here there are 50 bp of overlap, so "intersection" is 50, and those bases fall within a single overlapping segment, so n_intersections = 1.

So "intersection" is the total number of base pairs that overlap, while "n_intersections" is the number of distinct overlapping segments between the two files that those base pairs come from.
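
To make the distinction concrete, here is a minimal Python sketch that computes the same four columns for two small lists of intervals. It is not the bedtools implementation. The function names (merge, jaccard) are made up for illustration. It assumes half-open (start, end) coordinates as in BED, and it assumes that overlapping or directly adjacent intervals within each file are merged before comparison.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals.

    Assumption for this sketch: adjacent records are collapsed into one interval.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def jaccard(a, b):
    """Return (intersection, union-intersection, jaccard, n_intersections)."""
    a, b = merge(a), merge(b)
    overlaps = []
    i = j = 0
    # Sweep both sorted lists, recording each overlapping segment once.
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            overlaps.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    intersection = sum(end - start for start, end in overlaps)
    total = sum(e - s for s, e in a) + sum(e - s for s, e in b)
    union_bp = total - intersection  # the "union-intersection" column
    score = intersection / union_bp if union_bp else 0.0
    return intersection, union_bp, score, len(overlaps)


# The two 100 bp regions from the diagram: intersection = 50, n_intersections = 1.
print(jaccard([(0, 100)], [(50, 150)]))

# Two adjacent 1 bp SNPs shared by both files: intersection = 2, n_intersections = 1.
print(jaccard([(10, 11), (11, 12)], [(10, 11), (11, 12)]))
```

The second call shows one possible reason, under the merging assumption above, why intersection can be larger than n_intersections even for SNP-only files: neighbouring SNPs contribute one base pair each but only one overlapping segment between them.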
