Calculate Coverage From Bam File
2
3
Entering edit mode
11.0 years ago
meredith19 ▴ 50

Is there a way to calculate 5x, 10x, 20x, 30x coverage from a BAM file ?

EDIT: I have a BAM file and I want to calculate the percentage of region of interest with 5x, 10x, 20x coverage. I used Samtools and it gives me a list of coverage values. How is the % region of interest with 10x calculated from these values ? Also what should be the minimum and maximum depth value when using Samtools.

Thanks in advance

coverage bam • 25k views
ADD COMMENT
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Thanks Pierre. This page discusses the average coverage calculation. How can I calculate coverage for different depths ?

ADD REPLY
1
Entering edit mode

Hi Meredith. Coverage and depth of sequencing are often used interchangeably. I think it would be a very good idea if you refined your question to make it clearer. Try to be very specific about what EXACTLY you are trying to achieve, since for now there seems to be some confusion.

ADD REPLY
0
Entering edit mode

Hi Eric. Thanks for your response. I have a BAM file and I want to calculate the percentage of region of interest with 5x, 10x, 20x coverage. I used Samtools and it gives me a list of coverage values. How is the % region of interest with 10x calculated from these values ? Also what should be the minimum and maximum depth value when using Samtools. This is the first time that I have been exposed to sequencing. Please forgive me for my ignorance of this subject.

Thanks.

ADD REPLY
3
Entering edit mode
11.0 years ago

Hi Meredith,

First of all you should pull out your region of interest from the big bam file. That will make your analysis faster and you would be able to calculate coverage specific for that region. It should be sth like:

samtools view -bh your.sorted.bam chr1:100,000-20,000 (region of interest) > regionofinterest.bam

Now there are many posts on Biostars that tell you how to run pileup command in old samtools to get the number of reads that cover your region of interest. Basically, after running pileup command you will get a file that will contain all the positions in your region of interest and the coverage (or number of reads covering) that position. The positions that will have zero reads or zero coverage will not be present in the pileup file. Now you can write a script that can go through your pileup file and can get you the percentage of your region of interest that has 5 or more reads , 10 or more reads etc. Please read samtools commands and samtools pileup command from old samtools. Enjoy the reading.

ADD COMMENT
1
Entering edit mode

Thanks a lot, Ashutosh. Your reply was very helpful.

ADD REPLY
3
Entering edit mode
10.7 years ago
William ★ 5.3k

There is a nice plot in the qualimap output that show the percentage of the genome (or region) that is covered x (1 up to 50 with increases of 1 ) times.

Save yourself a lot of scripting and just run qualimap.

enter image description here

http://qualimap.bioinfo.cipf.es/

ADD COMMENT

Login before adding your answer.

Traffic: 2263 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6