Resources To Batch Map Long Gene Names To Entrez Ids?
12.5 years ago

I frequently have a list of long gene names (not symbols) that I need to map to Entrez IDs. For example, instead of the gene symbol PTEN, I have the long gene name "phosphatase and tensin homolog". As far as I can tell, BioMart does not support mapping from long gene names (using database: Ensembl genes 64, Sanger UK).

I've tried using MatchMiner. However, often the list of long gene names I have uses something other than "official" gene names and MatchMiner has trouble mapping. It's also quite slow.

What other resources are people using to batch map large lists of long gene names? I'd appreciate any tips.

gene mapping list identifiers • 3.3k views
12.5 years ago

Did you try NCBI ESearch? I got exactly one hit for your example (don't forget the quotes around the name):

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=gene&term=9606[TID]+"phosphatase+and+tensin+homolog"[GFN]

<eSearchResult>
  <Count>1</Count>
  <RetMax>1</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>5728</Id>
  </IdList>
  <TranslationSet/>
  <TranslationStack>
    <TermSet>
      <Term>9606[TID]</Term>
      <Field>TID</Field>
      <Count>191183</Count>
      <Explode>Y</Explode>
    </TermSet>
    <TermSet>
      <Term>"phosphatase+and+tensin+homolog"[GFN]</Term>
      <Field>GFN</Field>
      <Count>17</Count>
      <Explode>Y</Explode>
    </TermSet>
    <OP>AND</OP>
  </TranslationStack>
  <QueryTranslation>9606[TID] AND "phosphatase+and+tensin+homolog"[GFN]</QueryTranslation>
</eSearchResult>
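When scripting against a response like this one, note that it contains several `<Count>` elements: the first is the overall number of hits, while the later ones belong to the individual terms inside `<TranslationStack>`. A minimal sketch for pulling out the total and the IDs with standard text tools is shown here. The function names `hit_count` and `extract_ids` are hypothetical helpers, not part of E-utilities, and the sketch assumes the response has the same layout as the example above (a real XML parser would be more robust).

```shell
# Print the overall number of hits from an ESearch XML response on stdin.
# Only the first <Count> is the total; later ones are per-term counts.
hit_count() {
    grep -o '<Count>[0-9]*</Count>' | head -n 1 | sed 's/<[^>]*>//g'
}

# Print every Gene ID found in the response on stdin, one per line.
extract_ids() {
    grep -o '<Id>[0-9]*</Id>' | sed 's/<[^>]*>//g'
}
```

Piping a saved response into `hit_count` gives a quick way to flag names that match zero genes or more than one gene before trusting the ID.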

Edit:

Walter, you can run a loop with a shell script and call this query for each long name.

$ cat list.txt 
phosphatase and tensin homolog
notch 2

The script:

while IFS= read -r G
do
    # Query ESearch for human (9606) genes whose full name matches $G; keep only the numeric IDs
    for I in $(curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=gene&term=9606%5BTID%5D+%22${G// /+}%22%5BGFN%5D" | grep "<Id>" | sed 's/<[^>]*>//g')
    do
        echo "$G $I"
    done
done < list.txt

The output:

phosphatase and tensin homolog 5728
notch 2 4853
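
For lists of several hundred names, a slightly more defensive variant of the loop above may be worth having. The sketch below lets curl do the URL encoding (so names containing commas, slashes or parentheses are not a problem), pauses between requests, and prints `NA` for names with no hit so they are not silently dropped. The names `fetch_xml`, `map_names` and `DELAY`, the 0.4 second pause, and the `NA` placeholder are all assumptions made for illustration; NCBI asks clients without an API key to stay at or below about three requests per second.

```shell
# Sketch: map long gene names (one per line on stdin) to Entrez Gene IDs.

DELAY="${DELAY:-0.4}"   # pause between requests in seconds (assumed value)

# Query ESearch for one long name in human (taxon 9606); print the raw XML.
# curl -G with --data-urlencode builds and encodes the query string.
fetch_xml() {
    curl -s -G "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi" \
        --data-urlencode "db=gene" \
        --data-urlencode "term=9606[TID] AND \"$1\"[GFN]" < /dev/null
}

# Print one "name<TAB>id" line per hit, or "name<TAB>NA" when nothing matches.
map_names() {
    local name ids id
    while IFS= read -r name; do
        [ -z "$name" ] && continue          # skip blank lines
        ids=$(fetch_xml "$name" | grep -o '<Id>[0-9]*</Id>' | sed 's/<[^>]*>//g')
        if [ -z "$ids" ]; then
            printf '%s\tNA\n' "$name"
        else
            for id in $ids; do
                printf '%s\t%s\n' "$name" "$id"
            done
        fi
        sleep "$DELAY"
    done
}
```

After sourcing these functions, `map_names < list.txt > mapping.tsv` would write a tab-separated table; names that appear on more than one line matched several genes and need a manual look.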

Thanks Pierre. However, I don't see how this supports batch submissions -- I typically have several hundred names to map.


Ah, I see. Excellent! Thanks Pierre!
