Count sites with 0x, 5x, and 20x coverage in a BAM file
5.6 years ago
DVA ▴ 630

Hello,

I would like to know how to use a BAM file to count how many sites are not covered at all, how many are covered at least 5 times, and how many are covered at least 20 times. We received these stats from our sequencing company, but we want to see whether we can compute them ourselves as well.

Thank you very much.

wgs bam • 1.2k views

Are you looking to get the stats at the individual base level or over an interval window?


Thank you for the comment @genomax, I am looking for stats at the individual base level.

5.6 years ago
samtools depth -a in.bam | awk '{D=int($3);if(D<5) {D=0;} else if(D<20) {D=5;} else D=20; a[D]++;} END {for(x in a) printf("%s\t%d\n",x,a[x]);}'

5   14484
20  51
0   3771
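
Note that the one-liner above reports disjoint bins keyed by their lower bound: the 0 row is depth below 5, the 5 row is depth 5 to 19, and the 20 row is depth 20 or more. If you want the thresholds exactly as asked in the question (depth equal to 0, at least 5, at least 20), something like the sketch below might work. The function name count_thresholds and the file name in.bam are placeholders I made up, and it assumes the usual three-column output of samtools depth (chromosome, position, depth).

```shell
# Hypothetical helper: reads `samtools depth -a` output on stdin
# (columns: chromosome, position, depth) and prints cumulative counts.
count_thresholds() {
  awk '
    { d = int($3) }        # depth is the third column
    d == 0  { zero++ }     # sites with no coverage at all
    d >= 5  { ge5++ }      # sites covered at least 5 times
    d >= 20 { ge20++ }     # sites covered at least 20 times
    END {
      printf("depth==0\t%d\n", zero)
      printf("depth>=5\t%d\n", ge5)
      printf("depth>=20\t%d\n", ge20)
      printf("total\t%d\n", NR)
    }'
}

# Usage (in.bam is a placeholder):
# samtools depth -a in.bam | count_thresholds
```

The -a flag matters here because without it samtools depth skips zero-depth positions, so the depth==0 count would always come out as 0. The total row gives the denominator if you want percentages.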

Thanks for the help! Appreciate it.

5.6 years ago

You can get the relevant percentage of the genome with plotCoverage (part of deepTools), though Picard has some similar tools.


Thank you very much!
