Printing features of a specific type only from a Genbank file with Biopython
6.9 years ago
Jacob ▴ 10

I'm able to fetch the GenBank file from the website and print its features, but I want to print only the features where type='exon'. Further, I want to print just the ExactPosition fields for those exon features. Here's what I am doing so far:

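>>> from Bio import Entrez, SeqIO
>>> Entrez.email = 'yourmail@mail.com'  # placeholder contact address for NCBI
>>> handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text", id="NM_001135")
>>> seq_record = SeqIO.read(handle, "gb")
>>> seq_record.features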
[SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(8543), strand=1), type='source'), SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(8543), strand=1), type='gene'), SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(367), strand=1), type='exon'), SeqFeature(FeatureLocation(ExactPosition(367), ExactPosition(444), strand=1), type='exon'),
...
...

And I would like the output to be

SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(367), strand=1), type='exon'), SeqFeature(FeatureLocation(ExactPosition(367), ExactPosition(444), strand=1), type='exon'),

Or more specifically, as a final result

0-367
367-444
SeqIO Seq genbank biopython
6.9 years ago

Code:

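from Bio import Entrez
from Bio import SeqIO

# NCBI asks for a contact address; replace the placeholder with your own.
Entrez.email = 'yourmail@mail.com'
qid = "NM_001135"

# Fetch the record as GenBank flat-file text and parse it into a SeqRecord.
handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text", id=qid)
seq_record = SeqIO.read(handle, "gb")
handle.close()

# Keep only exon features and print their start and end positions.
for feat in seq_record.features:
    if feat.type == 'exon':
        print('{}-{}'.format(feat.location.start, feat.location.end))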

Output:

0-367
367-444
444-828
828-1003
1003-1131
1131-1425
1425-1803
1803-1978
1978-2106
2106-2400
2400-2640
2640-7206
7206-7365
7365-7448
7448-7593
7593-8543
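
If you need the same filter for other feature types, or want the numbers as they appear in the GenBank flat file, the loop can be wrapped in a small helper. This is only a sketch: feature_ranges and _fake_feature are names made up for illustration, not part of Biopython, and the stand-in objects below merely mimic the .type and .location.start/.end attributes of a SeqFeature so the example runs without a network call. Biopython positions are 0-based with an exclusive end, while the flat file is 1-based and inclusive, so the one_based option adds 1 to the start only. It assumes exact positions that convert cleanly with int().

```python
from types import SimpleNamespace


def feature_ranges(features, feature_type="exon", one_based=False):
    """Return 'start-end' strings for features whose .type matches.

    Hypothetical helper: works on any objects exposing .type and
    .location.start / .location.end, such as Biopython SeqFeature objects.
    """
    ranges = []
    for feat in features:
        if feat.type != feature_type:
            continue
        start = int(feat.location.start)
        end = int(feat.location.end)
        if one_based:
            # 0-based half-open -> 1-based inclusive: only the start shifts.
            start += 1
        ranges.append("{}-{}".format(start, end))
    return ranges


def _fake_feature(start, end, type_):
    # Stand-in for a SeqFeature, used only to keep this demo offline.
    return SimpleNamespace(
        type=type_, location=SimpleNamespace(start=start, end=end)
    )


# First four features from the question's record, rebuilt by hand.
demo_features = [
    _fake_feature(0, 8543, "source"),
    _fake_feature(0, 8543, "gene"),
    _fake_feature(0, 367, "exon"),
    _fake_feature(367, 444, "exon"),
]

print("\n".join(feature_ranges(demo_features)))
```

With a real record you would pass seq_record.features in place of demo_features, for example feature_ranges(seq_record.features, "exon", one_based=True).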