Biostar Beta. Not for public use.
pysam: filtering reads based on flag
0
Entering edit mode
3.5 years ago
@QVINTVS_FABIVS_MAXIMVS10070

I would like to return reads with a flag of 0x02 similar to this

$ samtools view -f 0x02 in.bam 1:100-200

without using system calls in pysam (which are slower to my knowledge)

#!/usr/bin/python

import pysam
pos = "1:100-200"

# open bam
bam = pysam.AlignmentFile(bam_fh,"rb")

# I would like to do something like this, syntax is not correct i imagine
reads =  bam.fetch(region=pos,until_eof=True)
good_reads = reads.flag("0x02")
print len(good_reads)

or is it impossible to subset reads using a flag in pysam without iterating. In that case I'll default to the csamtools commands.

pysam samtools python hts • 2.9k views
ADD COMMENTlink
2
Entering edit mode

Could you use named pipes instead?:

https://www.biostars.org/p/15298/#139287

Basically what I'm thinking is that you could create the named pipe programmatically (using Python's subprocess library, for example) but outside the context of pysam. Then you can open that named pipe using the pysam API, just like any other BAM file.

ADD REPLYlink
0
Entering edit mode

I ended up using the pipes

ADD REPLYlink
2
Entering edit mode

Even samtools will do the subsetting by iterating over the alignments, there's no other way to do it.

ADD REPLYlink
2
Entering edit mode
5.5 years ago
@Matt Shirley1681

I haven't benchmarked it formally, but you might try using my simplesam module. The BAM decompression is offloaded to samtools in a subprocess, and iteration is simply about as fast as Python's readline will support. If you're checking only for flag 0x02, you can use the Sam.pairedattribute. This reminds me that I need to actually write documentation for this module :)

ADD COMMENTlink
0
Entering edit mode

whats the advantage to pysam. I would love to take your module out for a spin

ADD REPLYlink
0
Entering edit mode

I'll have to give an explanation on the project page, but in addition to not requiring a C compiler (and the added underlying complexity of calling a foreign function in Python) there is not much advantage. In fact, pysam has much more functionality. The perceived advantage for me is that "simplesam" is pure Python, and so easier to test, inherit from, and extend.

ADD REPLYlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.3