Bio.GenBank.LocationParserError
13.0 years ago
User 6115 ▴ 20

Hi all,

I'm scanning through all of GenBank's bacterial genomes using biopython.

I've been getting an occasional error recently parsing location data. Specifically:


  File "/usr/lib/pymodules/python2.7/Bio/SeqIO/__init__.py", line 525, in parse
    for r in i:
  File "/usr/lib/pymodules/python2.7/Bio/GenBank/Scanner.py", line 437, in parse_records
    record = self.parse(handle, do_features)
  File "/usr/lib/pymodules/python2.7/Bio/GenBank/Scanner.py", line 420, in parse
    if self.feed(handle, consumer, do_features):
  File "/usr/lib/pymodules/python2.7/Bio/GenBank/Scanner.py", line 392, in feed
    self._feed_feature_table(consumer, self.parse_features(skip=False))
  File "/usr/lib/pymodules/python2.7/Bio/GenBank/Scanner.py", line 344, in _feed_feature_table
    consumer.location(location_string)
  File "/usr/lib/pymodules/python2.7/Bio/GenBank/__init__.py", line 975, in location
    raise LocationParserError(location_line)
  Bio.GenBank.LocationParserError: order(join(649703..649712,649751..649752),650047..650049)

My code is a simple loop through all filenames I feed in at the command line:


        [...]

        try:
            contig = SeqIO.parse(open(gb_file,"r"), "genbank")
        except:
            sys.stderr.write("ERROR: Parsing gbk file "+gb_file+"!\n")
            sys.exit(1)
        sys.stderr.write("Loading genome " + str(counter) + " of "+str(len(sys.argv)-1)+" ("+gb_file+")\n")

        for gb_record in contig:

           [...]

This is in the Aeropyrum pernix K1 genome, NC_000854.gbk. I don't see anything wrong with the location data. Can anyone help?

Thanks, -Morgan

13.0 years ago

This looks like a case of the issue discussed here:

https://redmine.open-bio.org/issues/3197

The problem is that order and join are combined in a single location:

order(join(649703..649712,649751..649752),650047..650049)

According to the GenBank feature table specification, combining join and order in the same location is not allowed.
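
To find other records with the same problem, a location string can be checked for join nested inside order (or the reverse) before it reaches the parser. This is a minimal sketch using only the standard library; `mixes_join_and_order` is a hypothetical helper name, not part of Biopython, and it only looks at how the operators are nested, not at whether the location is otherwise valid.

```python
import re

# Matches either an opening parenthesis with the operator name in front of it
# (e.g. "join(", "order(", "complement(") or a closing parenthesis.
_TOKEN = re.compile(r"([A-Za-z]*)\(|\)")

_COMPOUND = ("join", "order")


def mixes_join_and_order(location):
    """Return True if join appears inside order, or order inside join."""
    open_ops = []  # names of the operators currently open, outermost first
    for match in _TOKEN.finditer(location):
        if match.group(0) == ")":
            if open_ops:
                open_ops.pop()
            continue
        name = match.group(1)
        if name in _COMPOUND:
            # Flag the location if a *different* compound operator encloses this one.
            if any(op in _COMPOUND and op != name for op in open_ops):
                return True
        open_ops.append(name)
    return False
```

Called on the location from the traceback, `mixes_join_and_order("order(join(649703..649712,649751..649752),650047..650049)")` flags it, while a plain `join(...)` or `complement(join(...))` passes.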

Peter posted a fix in that discussion but decided not to check it in, because NCBI had identified the files in question as problematic. You can try Peter's fix if you just need to get through these files. As a more permanent solution, you should e-mail NCBI and ask whether this is allowed or will be fixed. If NCBI reports these locations as valid, please do reopen that bug discussion and lobby for a more permanent change.
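
If the goal is only to keep a long scan running, a separate option (this is not Peter's patch) is to catch the error per file and move on. The traceback in the question shows the exception raised inside the SeqIO.parse generator, that is, while records are being iterated, so a try around the SeqIO.parse(...) call alone would not catch it. Below is a minimal sketch, assuming a hypothetical helper `iter_records` that wraps any record iterator; the demo uses a stand-in generator rather than Biopython so that it runs on its own.

```python
import io
import sys


def iter_records(records, label, log=sys.stderr):
    """Yield items from `records`; log and stop on the first error.

    `iter_records` is a hypothetical helper, not part of Biopython.
    A generator that has raised cannot be resumed, so any records
    after the failing one in the same source are not yielded.
    """
    iterator = iter(records)
    while True:
        try:
            record = next(iterator)
        except StopIteration:
            return
        except Exception as exc:  # broad on purpose: report and give up on this source
            log.write("ERROR: parsing %s: %s: %s\n" % (label, type(exc).__name__, exc))
            return
        yield record


def _failing_records():
    """Stand-in for a parser that fails partway through a file."""
    yield "record 1"
    yield "record 2"
    raise ValueError("order(join(649703..649712,649751..649752),650047..650049)")


# Demo: the two good records come through, the failure is logged, the loop ends cleanly.
demo_log = io.StringIO()
demo_result = list(iter_records(_failing_records(), "NC_000854.gbk", log=demo_log))
```

With Biopython installed, the loop in the question would become `for gb_record in iter_records(SeqIO.parse(handle, "genbank"), gb_file):`. The trade-off is that everything after the failing record in that file is skipped, so the logged filenames are also the list of files to report to NCBI.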


I noted this on the Biopython bug report, and emailed the NCBI.

