pysam fetch with partial reference name
7.5 years ago
yarmda ▴ 40

I'm new to pysam and trying to parse a bamfile by a RefSeq Accession number. However, that accession number is only part of the reference name (column 3 in the bamfile header) and pysam fetch seems to need the whole reference name in order to search.

Is there a way I can search on a substring of the reference name?

For example, a reference name may look like: gi|158333233|custom|NC_009925.1| where NC_009925 is the accession number I want to search on.

Thanks!

Edit: Also, how would I go about parsing the output? It looks like it is the detail of the bamfile, just without the third column (that I searched on). I want to get the first column and store it as a new variable. How could I do that?

for read in samfile.fetch("etc."):
    print(read[0])

doesn't let me subset like that. So, I'm guessing it's not indexed.

Trying the above gives the error: 'pysam.calignedsegment.AlignedSegment' object has no attribute '__getitem__'

bam python pysam • 4.3k views
7.5 years ago

Regarding using a partial contig name, there's no built-in way to do that. You'll need to write a function that iterates over the contigs and picks out the one you want.
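As a rough sketch of that idea: the helper below (match_references is a made-up name, not part of pysam) splits each reference name on "|" and keeps the names where one field equals the accession, with or without the ".1" version suffix. It works on any sequence of strings, so you can pass it the samfile.references tuple. The file name in the usage comment is a placeholder, and fetch needs an indexed BAM.

```python
def match_references(references, accession):
    """Return the reference names that carry `accession` as a '|'-separated field.

    A field matches if it equals the accession exactly, or equals it once the
    version suffix (the part after the first '.') is dropped.
    """
    hits = []
    for name in references:
        fields = [field for field in name.split("|") if field]
        if any(f == accession or f.split(".")[0] == accession for f in fields):
            hits.append(name)
    return hits


# Usage sketch with pysam (assumes an indexed BAM; "example.bam" is a placeholder):
# import pysam
# with pysam.AlignmentFile("example.bam", "rb") as samfile:
#     for contig in match_references(samfile.references, "NC_009925"):
#         for read in samfile.fetch(contig):
#             ...
```

Matching on whole fields rather than a plain substring avoids picking up, say, NC_0099251 when you asked for NC_009925. If you really want substring matching, replace the condition with `accession in name`.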

Regarding getting read names, it's read.query_name. Please see the documentation for more information.
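To make that concrete: each item fetch yields is an AlignedSegment object, not a list, so you read fields by attribute name instead of by index. A minimal sketch for storing the first column of every read (collect_query_names is a hypothetical helper, and the file name in the comment is a placeholder):

```python
def collect_query_names(reads):
    """Return the query name (first SAM column) of each read in `reads`."""
    return [read.query_name for read in reads]


# Usage sketch with pysam (assumes an indexed BAM; "example.bam" is a placeholder):
# import pysam
# with pysam.AlignmentFile("example.bam", "rb") as samfile:
#     names = collect_query_names(samfile.fetch("gi|158333233|custom|NC_009925.1|"))
```

Other columns are exposed the same way, for example read.reference_start for the 0-based alignment start.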
