Biostar Beta. Not for public use.
Excluding reads for Coverage Calculation in given regions
0
Entering edit mode
7 months ago
Germany

Hello,

I have a bam file and I have a bed file with my region of interests. Now I'm interested in the mean coverage for each of these region. That's no problem with picard's CollectHsMetrics and the PER_TARGET_COVERAGE flag.

But now I would like to exclude those reads in coverage calculation that doesn't overlap my target region with a given fraction or at least given bases.

I feel that bedtools might help me here. But I didn't figure out until now how.

Can someone help me?

fin swimmer

ADD COMMENTlink
1
Entering edit mode

Hi , I think you are looking for bedtools intersect , which go a clear documentation here http://bedtools.readthedocs.io/en/latest/content/tools/intersect.html :) You will just need 2 bed, your region of interest and your bed you got in you first step calculation :) Best

ADD REPLYlink
0
Entering edit mode

Hello, thanks for your answer. This is what I do now:

  1. Remove all reads from the bam file that doesn't overlap with at least the given fraction

    bedtools intersect -a full.bam -b regions.bed -f 0.75 -wa > restricted.bam

  2. Use the resulted bam file for coverage calculation with my tool if choice.

That works like it should, but I need to iterate twice over the alignment file. So I'm also interessed in faster ways.

fin swimmer

ADD REPLYlink
0
Entering edit mode

Why don't you calculate your coverage in your bam with you final bed (i mean region of interest ) using bedtools ?

ADD REPLYlink
0
Entering edit mode

bedtools needs to much memory for calculate the coverage and I'm very low at it. Also I miss the possibilty to ignore reads/bases below a given quality treshold, like in the picard tools. And I'm more satisfied with the output of picard tools.

How does bedtools handle overlapping paired reads in coverage calculation? Are they counted as 1 or 2?

fin swimmer

ADD REPLYlink
0
Entering edit mode

There is an option for bedtools genomecov (http://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html) named :

-pc Calculates coverage for paired-end reads, coverage is calculated as the number of fragments covering each base pair .

To got better performances you may be can split your bed in chromosome ...

ADD REPLYlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.3.1