Question

Error Using Entrez.Esummary From Biopython

12.7 years ago

Martin ▴ 30

Can someone please explain this error?

I have a small script that tries to fetch information from an NCBI BioAssay using the Entrez module from Biopython. I get an error I do not understand. I try to run:

from Bio import Entrez
Entrez.email="yourname@mail.se"

handle_esummary=Entrez.esummary(db='pcassay',id='1337')
record_esummary=Entrez.read(handle_esummary)

I get the error:

File "smaltest.py", line 5, in <module>
    record_esummary=Entrez.read(handle_esummary)
  File "/usr/common/schrodinger/mmshare-v20109/lib/Linux-x86_64/lib/python2.7/site-packages/Bio/Entrez/__init__.py", line 297, in read
    record = handler.run(handle)
  File "/usr/common/schrodinger/mmshare-v20109/lib/Linux-x86_64/lib/python2.7/site-packages/Bio/Entrez/Parser.py", line 90, in run
    self.parser.ParseFile(handle)
  File "/usr/common/schrodinger/mmshare-v20109/lib/Linux-x86_64/lib/python2.7/site-packages/Bio/Entrez/Parser.py", line 105, in startElement
    itemtype = str(attrs["Type"]) # convert from Unicode
KeyError: 'Type'

biopython entrez eutils ncbi • 4.9k views

I tried handle_esummary=Entrez.esummary(db="journals",id="30367"); record = Entrez.read(handle_esummary) and it worked. But I got the same error message when I used your example, so I think Biopython has a problem parsing the results from some databases.

ADD REPLY • link 12.7 years ago by Ning-Yi Shao ▴ 390

I guess this is about programmatic access to esummary and efetch for the databases available at NCBI (pcassay, pubmed, nucleotide, journals), and has nothing to do with parsing.

ADD REPLY • link 12.7 years ago by Thaman ★ 3.3k

http://biopython.org/DIST/docs/api/Bio.Entrez.Parser-pysrc.html

I think the error is raised at line 263, when Biopython tries to parse the XML result but fails to find an attribute named "Type". I didn't read the code carefully, so this is just a guess.

ADD REPLY • link updated 4.6 years ago by Ram 43k • written 12.7 years ago by Ning-Yi Shao ▴ 390

score 1 · Answer 1 · 2011-08-22

Hi,

I can reproduce this and the cause is the NCBI using lowercase in one tag's attribute:

<Item Name="SourceNameList" type="List">

All the other <Item> tags in the XML from eSummary use Type rather than type, and this is what the NCBI's XML DTD file says should be used. As a result, the XML file fails XML validation.

In other words, this is a bug in the NCBI eSummary service; please report it to them.
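To illustrate why a lowercase attribute name ends in a KeyError, here is a minimal sketch using Python's standard expat parser. It is not Biopython's actual parser code; the function name and the two XML fragments are made up for illustration, and only the `attrs["Type"]` lookup mirrors the line shown in the traceback.

```python
import xml.parsers.expat

# Two hand-written, eSummary-like fragments (not real NCBI output):
# one uses "Type" as the DTD requires, the other uses lowercase "type".
GOOD_XML = '<DocSum><Item Name="SourceNameList" Type="List"></Item></DocSum>'
BAD_XML = '<DocSum><Item Name="SourceNameList" type="List"></Item></DocSum>'


def collect_item_types(xml_text):
    """Return the Type attribute of every <Item> element, in document order."""
    item_types = []

    def start_element(name, attrs):
        # attrs is a plain dict and attribute names are case-sensitive,
        # so a tag that only has "type" raises KeyError on this lookup.
        if name == "Item":
            item_types.append(str(attrs["Type"]))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(xml_text, True)
    return item_types


print(collect_item_types(GOOD_XML))  # ['List']

try:
    collect_item_types(BAD_XML)
except KeyError as error:
    print("KeyError:", error)  # KeyError: 'Type'
```

The exception raised inside the handler propagates out of the parse call, which matches the shape of the traceback in the question.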

You can demonstrate this using http://validator.w3.org/ (or another XML validator) and either save the XML via Python, or enter this URL directly: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?tool=biopython&db=pcassay&id=1337
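If you have saved the XML to a string or file, a quick check along the following lines could also point at the offending tag. This is only a sketch: it does not validate against the DTD, it just lists `<Item>` elements that lack a `Type` attribute. The function name and the sample XML are hypothetical (the sample is hand-written, not a real eSummary response).

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration only; "ExampleField" is a made-up name.
SAMPLE_XML = """<eSummaryResult>
<DocSum>
<Id>1337</Id>
<Item Name="ExampleField" Type="String">example</Item>
<Item Name="SourceNameList" type="List"></Item>
</DocSum>
</eSummaryResult>"""


def items_missing_type(xml_text):
    """Return the Name of each <Item> element that has no 'Type' attribute."""
    root = ET.fromstring(xml_text)
    return [
        item.get("Name")
        for item in root.iter("Item")
        if "Type" not in item.attrib  # attribute names are case-sensitive
    ]


print(items_missing_type(SAMPLE_XML))  # ['SourceNameList']
```

To run it on the real response, pass in the text you saved from the eSummary URL instead of `SAMPLE_XML`.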

Peter

score 1 · Answer 2 · 2011-11-21

12.4 years ago

Martin ▴ 30

The bug has been fixed. NCBI finally took the time to correct the error.

Many thanks for your help Peter.

score 0 · Answer 3 · 2011-09-08

12.6 years ago

Martin ▴ 30

NCBI has been made aware of the bug.

Waiting to see how long it will take them to fix it.