Python or shell scripting code ro count individual mapped reads of amplicons
3
0
Entering edit mode
8.1 years ago
sp ▴ 20

I made 100 multiplex amplicon library and sequenced. Aligned it with bwa and converted to sorted-indexed bam file following the instruction of biostars.org/p/41951/ Using "flagstat" option in samtools, I can count the total reads and total mapped reads. To count individual reads, I use "samtools view input.bam chr1:1234-2345 | wc -l" and got the result.

However, my library is composed of 100 amplicons, which means that I need to do this thing 100 times if I do one by one manually. Can anyone kindly help me to share python or shell script to do this job automatically? I have basic knowledge on both tools. Thanks in advance!

sequencing next-gen alignment • 1.9k views
ADD COMMENT
2
Entering edit mode
8.1 years ago
Zaag ▴ 860

From the command line:

for i in $(ls *.bam) ; 
do
samtools view $i chr1:1234-2345 | wc -l > $i.individualcount.txt ;
samtools flagstat $i > $i.flagstat.txt
echo "Done with $i"
done ;

should give you for each bam:

  • name.bam.individual count.txt
  • name.bam.flagstat.txt
ADD COMMENT
1
Entering edit mode

FYI, the top line can just be for i in *bam. Less buttons, works the same.

ADD REPLY
0
Entering edit mode

Thanks Zaag! It does not work for my purpose though. I need iterate coordinates on same bam file. I was digging around and found "bedtools multicov" did the job! I also want to try NGSUtils, but failed to install on my ubuntu.

ADD REPLY
0
Entering edit mode
8.1 years ago
mastal511 ★ 2.1k

Try samtools idxstats.

ADD COMMENT
0
Entering edit mode

Can you show me a working example? I am stupid biologist.

ADD REPLY
0
Entering edit mode
8.1 years ago

You can take a look at the python module pysam pileup to get counts.

ADD COMMENT

Login before adding your answer.

Traffic: 2946 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6