URL for NCBI Efetch + XML with full annotation table
6.8 years ago

Following up on "How to dump genes from GenBank in GFF3 format?", I tried to download the full record of CM000760.3 as XML using EFetch, but the annotations do not seem to be included (which is a new "feature" to me):

wget -O - "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=CM000760.3&rettype=gb&retmode=xml"

while https://www.ncbi.nlm.nih.gov/nuccore/CM000760?report=gbwithparts shows the full annotation table.

What would be the correct EFetch URL to get the full table?
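
For trying out variants of the request, the EFetch URL can be assembled from its four query parameters. The following is only a sketch: `efetch_url` is a hypothetical helper name, and using the web page's report name `gbwithparts` as the `rettype` together with `retmode=xml` is an assumption that nobody in this thread has confirmed.

```shell
# Hypothetical helper: build an EFetch URL from its query parameters.
# $1=db  $2=id  $3=rettype  $4=retmode
efetch_url() {
    printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=%s&id=%s&rettype=%s&retmode=%s\n' "$1" "$2" "$3" "$4"
}

# The URL used in the question (rettype=gb):
efetch_url nuccore CM000760.3 gb xml

# Untested variant (assumption): reuse the web report name as rettype.
efetch_url nuccore CM000760.3 gbwithparts xml
```

Either URL can then be passed to `wget -O -` as in the command above.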

xml ncbi efetch annotation eutils • 2.6k views

@genomax: closer, but it's not GBXML ( https://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.mod.dtd ); it's Bioseq-set ( https://www.ncbi.nlm.nih.gov/dtd/NCBI_Seqset.mod.dtd ).

It matters because I can generate a Java parser for the GenBank DTD:

$ xjc -d tmp -dtd "https://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.dtd"
parsing a schema...
compiling a schema...
generated/GBAltSeqData.java
generated/GBAltSeqDataItems.java
generated/GBAltSeqItem.java
generated/GBAltSeqItemInterval.java
generated/GBAltSeqItemIsgap.java
generated/GBAuthor.java
generated/GBComment.java
generated/GBCommentParagraph.java
generated/GBCommentParagraphs.java
generated/GBFeature.java
generated/GBFeatureIntervals.java
generated/GBFeaturePartial3.java
generated/GBFeaturePartial5.java
generated/GBFeatureQuals.java
generated/GBFeatureSet.java
generated/GBFeatureSetFeatures.java
generated/GBFeatureXrefs.java
generated/GBInterval.java
generated/GBIntervalInterbp.java
generated/GBIntervalIscomp.java
generated/GBKeyword.java
generated/GBQualifier.java
generated/GBReference.java
generated/GBReferenceAuthors.java
generated/GBReferenceXref.java
generated/GBSecondaryAccn.java
generated/GBSeq.java
generated/GBSeqAltSeq.java
generated/GBSeqCommentSet.java
generated/GBSeqFeatureSet.java
generated/GBSeqFeatureTable.java
generated/GBSeqKeywords.java
generated/GBSeqOtherSeqids.java
generated/GBSeqReferences.java
generated/GBSeqSecondaryAccessions.java
generated/GBSeqStrucComments.java
generated/GBSeqXrefs.java
generated/GBSeqid.java
generated/GBSet.java
generated/GBStrucComment.java
generated/GBStrucCommentItem.java
generated/GBStrucCommentItems.java
generated/GBXref.java
generated/ObjectFactory.java

but not for the Bioseq-set one:

$ xjc -d tmp -dtd "http://www.ncbi.nlm.nih.gov/dtd/NCBI_Seqset.dtd"
parsing a schema...
compiling a schema...
[ERROR] A class/interface with the same name "generated.SeqFeatSupport" is already in use. Use a class customization to resolve this conflict.
  line 169 of http://www.ncbi.nlm.nih.gov/dtd/NCBI_Seqfeat.mod.dtd

[ERROR] (Relevant to above error) another "SeqFeatSupport" is generated from here.
  line 331 of http://www.ncbi.nlm.nih.gov/dtd/NCBI_Seqfeat.mod.dtd
6.8 years ago
osulliva • 0

I haven't found an efetch retmode for this, but sviewer.cgi can output GFF:

curl 'https://www.ncbi.nlm.nih.gov/sviewer/viewer.cgi?tool=portal&save=file&log$=seqview&db=nuccore&report=gff3&id=1174565284&extrafeat=976&maxplex=1'

The id parameter also accepts an accession.


It doesn't answer my needs, but you should post this to: How to dump genes from GenBank in GFF3 format?


i haven't found efetch retmode, but sviewer.cgi can output gff

If you do this you will undoubtedly be blocked, and receive the following from NCBI:

The site was blocked because of an excessive rate of access to the NCBI sequence servers. The NCBI web pages are a public service and we need to make it available to a large number of different users. Single sites with very high rates of access can impact and cause degradation of our performance. This site made over 50,000 NCBI requests, at a rate at or exceeding one per second.

Additionally these request were using the source code for a web page pull down menu. This is an unauthorized, insecure and inefficient method of obtaining data.

NCBI does not permit scripting against the main web servers. We have servers, APIs and other tools for bulk access of the data: https://www.ncbi.nlm.nih.gov/home/develop/
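
For bulk retrieval through the E-utilities API instead, the usual advice is to throttle requests (NCBI's E-utilities guidelines ask for no more than three requests per second without an API key). Below is a minimal sketch of a paced loop. `fetch_all` and `FETCH_CMD` are hypothetical names introduced here for illustration, and the one-second delay is simply a conservative choice, not an NCBI-mandated value.

```shell
# Hypothetical helper: fetch each accession through EFetch, pausing between calls.
# $1 = delay in seconds, remaining arguments = accessions.
# FETCH_CMD (illustrative, not an NCBI setting) overrides the download command.
fetch_all() {
    delay="$1"; shift
    for acc in "$@"; do
        # Left unquoted on purpose so the default command splits into words.
        ${FETCH_CMD:-wget -q -O -} "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=${acc}&rettype=gb&retmode=xml"
        sleep "$delay"
    done
}
```

For example, `fetch_all 1 CM000760.3 > records.xml` would wait one second after each request; setting `FETCH_CMD=echo` prints the URLs without contacting NCBI.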
