Entering edit mode
6.3 years ago
finswimmer
16k
Hello,
I have a bam file and I have a bed file with my region of interests. Now I'm interested in the mean coverage for each of these region. That's no problem with picard's CollectHsMetrics and the PER_TARGET_COVERAGE flag.
But now I would like to exclude those reads in coverage calculation that doesn't overlap my target region with a given fraction or at least given bases.
I feel that bedtools might help me here. But I didn't figure out until now how.
Can someone help me?
fin swimmer
Hi , I think you are looking for bedtools intersect , which go a clear documentation here http://bedtools.readthedocs.io/en/latest/content/tools/intersect.html :) You will just need 2 bed, your region of interest and your bed you got in you first step calculation :) Best
Hello, thanks for your answer. This is what I do now:
Remove all reads from the bam file that doesn't overlap with at least the given fraction
bedtools intersect -a full.bam -b regions.bed -f 0.75 -wa > restricted.bam
Use the resulted bam file for coverage calculation with my tool if choice.
That works like it should, but I need to iterate twice over the alignment file. So I'm also interessed in faster ways.
fin swimmer
Why don't you calculate your coverage in your bam with you final bed (i mean region of interest ) using bedtools ?
bedtools needs to much memory for calculate the coverage and I'm very low at it. Also I miss the possibilty to ignore reads/bases below a given quality treshold, like in the picard tools. And I'm more satisfied with the output of picard tools.
How does bedtools handle overlapping paired reads in coverage calculation? Are they counted as 1 or 2?
fin swimmer
There is an option for bedtools genomecov (http://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html) named :
-pc Calculates coverage for paired-end reads, coverage is calculated as the number of fragments covering each base pair .
To got better performances you may be can split your bed in chromosome ...