Getting GI number from NCBI database through code
2.3 years ago
erans995 • 0

Hello. As part of a college project, I have to write a program that finds FASTA sequences similar to one the user chooses. For example, if the user enters "cat", the program has to display all the relevant entries in the database and let the user choose one. I already have a script that outputs the FASTA data of an entry in the NCBI database, given its accession number.

I have found the following Perl script, which converts GI numbers to accession numbers:

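use strict;
use warnings;
use LWP::Simple;

# comma-separated GI numbers to convert
my $gi_list = '24475906,224465210,50978625,9507198';

# assemble the URL
my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/';
my $url = $base . "efetch.fcgi?db=nucleotide&id=$gi_list&rettype=acc";

# fetch the URL (HTTP GET) and print the accession numbers
my $output = get($url);
die "Request failed: $url\n" unless defined $output;
print $output;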

However, I haven't found a way to retrieve the GI from the database through code. Thank you for taking the time to read this; I hope you will be able to help me!


I normally don't post replies to homework or project-based questions, but I'll simply point to this post. You should also point it out to your course instructor, since it has been two years since the original announcement:

https://www.ncbi.nlm.nih.gov/books/NBK431010/#news_03-02-2016-phase-out-of-GI-numbers


Okay, thanks for the update. Let me rephrase my question: how can I retrieve the accession number of a certain entry through code?


See my answer below.

28 days ago
genomax 68k
United States

NCBI deprecated the use of GI numbers in 2016. You should switch your code to using accession numbers.

NCBI's Entrez Direct (EDirect) Unix utilities let you query by GI and retrieve the corresponding accession numbers.

$ esearch -db nuccore -query "24475906" | efetch -format acc
NM_009417.2

But that's the point, how can I retrieve the GI through code? I don't know it...


I thought you already had GI numbers. Using your "cat" example, you can get accession numbers like this:

$ esearch -db nuccore -query "cat" | efetch -format acc

AFHV02000288.1
AFHV02000289.1
AFHV02000291.1
AFHV02000292.1
AFHV02000293.1
AFHV02000294.1

I will leave it to you to figure out how to change the query and how to use this method to do URL based searches.
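As a starting point for the URL-based route, a minimal sketch might look like the following. It assumes the E-utilities ESearch endpoint with its `term`, `retmax`, and `idtype=acc` parameters (with `idtype=acc`, ESearch returns accession.version identifiers rather than GIs for sequence databases such as nuccore). The function names `build_esearch_url`, `extract_ids`, and `search_accessions` are hypothetical helpers, not part of any NCBI tool, and `curl` and `grep -o` are assumed to be available.

```shell
#!/bin/sh
# Sketch only: the helper names below are made up for illustration.
EUTILS_BASE='https://eutils.ncbi.nlm.nih.gov/entrez/eutils'

# Build an ESearch URL for a database and search term.
# Only spaces are encoded (as '+'); other special characters are not handled.
build_esearch_url() {
    db=$1
    term=$2
    retmax=${3:-20}
    encoded=$(printf '%s' "$term" | sed 's/ /+/g')
    printf '%s/esearch.fcgi?db=%s&term=%s&retmax=%s&idtype=acc\n' \
        "$EUTILS_BASE" "$db" "$encoded" "$retmax"
}

# Read ESearch XML on stdin and print the text of each <Id> element,
# one per line.
extract_ids() {
    grep -o '<Id>[^<]*</Id>' | sed 's/<[^>]*>//g'
}

# Network call: print the accessions that match a search term.
search_accessions() {
    curl -s "$(build_esearch_url "$1" "$2" "${3:-20}")" | extract_ids
}
```

For example, `search_accessions nuccore cat 5` should print up to five accession numbers, one per line, which can then be passed to a script that fetches FASTA by accession. Without an API key, NCBI asks clients to stay at or below roughly three requests per second.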


Okay, thank you very much. I'll try to figure out the rest by myself.
