Remove/Identify deprecated IDs from a list of Entrez IDs programmatically
8.8 years ago
salamandra ▴ 550

If I have a list of Entrez IDs, how do I programmatically identify the deprecated ones so that I can remove them from the list?

entrez • 2.7k views

Could you give an example of a deprecated ID, please?


So it's a Gene (Entrez) ID.

8.8 years ago

ftp://ftp.ncbi.nih.gov/gene/DATA/gene_history.gz contains "comprehensive information about GeneIDs that are no longer current".

# Extract and sort the discontinued GeneIDs (column 3), skipping the header line
curl "ftp://ftp.ncbi.nih.gov/gene/DATA/gene_history.gz" | gunzip -c | tail -n +2 | cut -f 3 | LC_ALL=C sort

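Sort your own list on the ID column using the same collation (LC_ALL=C sort), then use the Linux join command with the -v 1 option to print only the IDs from your list that have no match in the deprecated list: http://linux.die.net/man/1/join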
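
The sort-and-join step could look roughly like the sketch below. It assumes both inputs are plain files with one ID per line; the helper name filter_current_ids, the file names, and the toy IDs are all made up for illustration, and in practice deprecated_ids.txt would hold the output of the curl pipeline above.

```shell
# Hypothetical helper: print the IDs from file $1 that do not appear in file $2.
# Both files are assumed to contain one ID per line.
filter_current_ids() {
    ids_file=$1
    deprecated_file=$2
    sorted_ids=$(mktemp)
    sorted_deprecated=$(mktemp)
    # join needs both inputs sorted with the same collation order
    LC_ALL=C sort -u "$ids_file" > "$sorted_ids"
    LC_ALL=C sort -u "$deprecated_file" > "$sorted_deprecated"
    # -v 1 prints only the lines of the first file that have no match in the second
    LC_ALL=C join -v 1 "$sorted_ids" "$sorted_deprecated"
    rm -f "$sorted_ids" "$sorted_deprecated"
}

# Toy data (made-up IDs, not real GeneIDs)
printf '%s\n' 100 7 42 > my_ids.txt
printf '%s\n' 42 9 > deprecated_ids.txt

filter_current_ids my_ids.txt deprecated_ids.txt > current_ids.txt
```

Dropping -v 1 from the join call would instead print the IDs that appear in both files, which identifies the deprecated entries in your list rather than removing them. Note that the output is in byte-wise sort order, not numeric order.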
