Question

How To Convert List Of Entrez Ids Into Gene Name

4

11.0 years ago

grosy ▴ 90

Hi Friends,

I have a list of 10,000 Entrez IDs and I want to convert them into the corresponding gene names. Could someone suggest a way to do this?

With the Bioconductor packages "annotate" and "org.Hs.eg.db", we can do this for an individual gene, like this:

library(org.Hs.eg.db)
library(annotate)
lookUp('3815', 'org.Hs.eg', 'SYMBOL') 
   $`3815` 
   [1] "KIT"
lookUp('3815', 'org.Hs.eg', 'REFSEQ') 
   $`3815`
   [1] "NM_000222" "NM_001093772" "NP_000213" "NP_001087241"

I got this answer from SEQanswers, but is there any way to do this for multiple Entrez IDs?

Thanks in advance.

r genomics entrez • 57k views

ADD COMMENT • link updated 10 months ago by cwang3444 • 0 • written 11.0 years ago by grosy ▴ 90

2

I think this may be one of the easiest ways to do this task. You can convert Entrez IDs into gene names using a website called "MatchMiner" (http://discover.nci.nih.gov/matchminer/MatchMinerLookup.jsp). All you need to do is upload a file that contains all your Entrez IDs, and the website will convert them into HUGO gene names.

ADD REPLY • link 11.0 years ago by hojoon.compbio ▴ 20

0

Thanks @hojoon.compbio it worked... :o)

ADD REPLY • link 11.0 years ago by grosy ▴ 90

0

What is the library "annotate" and how can I install it, please?

Thanks.

ADD REPLY • link 7.6 years ago by moxu ▴ 510

2

It's a Bioconductor package; details and installation instructions are here:

http://bioconductor.org/packages/release/bioc/html/annotate.html

ADD REPLY • link 7.6 years ago by Neilfws 49k

0

Great! Which function converts gene symbols to entrez gene ids, please?

Thanks.

ADD REPLY • link 7.6 years ago by moxu ▴ 510

0

Time for you to read some documentation I think :)

ADD REPLY • link 7.6 years ago by Neilfws 49k

0

Thanks for your question, this is what I need.

ADD REPLY • link 5.5 years ago by LimMo ▴ 30

score 16 · Answer 1 · 2013-04-22

16

11.0 years ago

David W 4.9k

This is an easy one - just pass a character vector that has more than one value:

getSYMBOL(c('3815', '3816', '2341'), data='org.Hs.eg')
    3815     3816     2341 
   "KIT"   "KLK1" "FNTAP2"

ADD COMMENT • link 11.0 years ago by David W 4.9k
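For reference, the same vectorised lookup can also be written with mapIds() from AnnotationDbi, a different function from the getSYMBOL() call above. This is only a sketch: it assumes both AnnotationDbi and org.Hs.eg.db are installed, and it is guarded so that it does nothing if they are not.

```r
# Sketch only: assumes the Bioconductor packages AnnotationDbi and
# org.Hs.eg.db are installed; the block is skipped otherwise.
if (requireNamespace("AnnotationDbi", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  # Keys must be a character vector, not integers.
  ids <- c("3815", "3816", "2341")

  symbols <- AnnotationDbi::mapIds(
    org.Hs.eg.db::org.Hs.eg.db,
    keys      = ids,
    column    = "SYMBOL",     # what to return
    keytype   = "ENTREZID",   # what the keys are
    multiVals = "first"       # keep one value per key if several exist
  )
  print(symbols)              # named character vector, names are the Entrez IDs
}
```

The result has one element per input ID, so it can be attached to the original table as a new column.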

0

Yeah, thanks a lot :) but it doesn't work for more than about 100 gene IDs... so this is what I tried:

a <- read.csv("entrez ids.csv", header = TRUE)
library(org.Hs.eg.db)
library(annotate)
d= getSYMBOL(a, data='org.Hs.eg')
Error in .checkKeysAreWellFormed(keys) : keys must be supplied in a character vector with no NAs

This is the error I get.

ADD REPLY • link 11.0 years ago by grosy ▴ 90

1

When you read data into an R session with read.csv you get a data frame containing rows and columns, not a vector. In this case you probably have all your IDs in one column, which you can select with $. Something like a$EntrezIDs. If you are new to R you should probably read some intro tutorials.

ADD REPLY • link 11.0 years ago by David W 4.9k
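To make the point about data frames concrete, here is a minimal base-R sketch. The inline CSV text and the column name EntrezIDs are assumptions standing in for the real file.

```r
# Toy CSV standing in for the real file; the column name EntrezIDs is an assumption.
csv_text <- "EntrezIDs\n3815\n3816\nNA\n2341\n"
a <- read.csv(text = csv_text, header = TRUE)

# read.csv returns a data frame, and an all-numeric column is parsed as integer.
is.data.frame(a)      # TRUE
class(a$EntrezIDs)    # "integer"

# Pull out the column, drop missing values, then convert to character,
# which is the form the annotation lookups expect for their keys.
ids <- a$EntrezIDs
ids <- as.character(ids[!is.na(ids)])
ids                   # "3815" "3816" "2341"
```

The resulting ids vector, rather than the data frame a, is what would be passed to getSYMBOL().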

1

I don't think the issue is number of IDs. I've retrieved tens of thousands of attributes (slowly) in one go using biomaRt.

ADD REPLY • link 11.0 years ago by Neilfws 49k

0

In a loop, can you pass in a vector of 100 elements at a time? (Or perhaps you need to filter out bad/NA entries?)

ADD REPLY • link 11.0 years ago by Alex Reynolds 35k
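If batching does turn out to be necessary, the loop suggested above could look roughly like this in base R. The lookup_chunk function is a hypothetical stand-in so the sketch runs without Bioconductor; in a real session its body would be something like getSYMBOL(chunk, data='org.Hs.eg').

```r
# 250 made-up IDs, already cleaned to a character vector with no NAs.
ids <- as.character(1:250)

# Split into groups of at most 100 elements.
chunks <- split(ids, ceiling(seq_along(ids) / 100))
lengths(chunks)   # three chunks of 100, 100 and 50 elements

# Hypothetical stand-in for the real lookup.
lookup_chunk <- function(chunk) paste0("GENE", chunk)

# Look up each chunk and stitch the results back together in order.
results <- unlist(lapply(chunks, lookup_chunk), use.names = FALSE)
length(results)   # 250
```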

0

Actually, I think the problem could be solved if I take the CSV file and list it in a variable, as shown in the Bioconductor package documentation:

"http://stuff.mit.edu/afs/athena/software/r_v2.14.1/lib/R/library/org.Hs.eg.db/html/org.Hs.egSYMBOL.html"

But the only problem I am facing now is how to list each value from the CSV file.

ADD REPLY • link 11.0 years ago by grosy ▴ 90

0

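# a is a data frame; assuming the IDs are in its first column, pass that column as a character vector
d <- getSYMBOL(as.character(na.omit(a[[1]])), data='org.Hs.eg')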

ADD REPLY • link 7.1 years ago by Adam ▴ 40

0

Hello there! I had the same issue earlier, but solved it using as.character(). Since Entrez Gene IDs are made of digits, they were initially loaded as 'integer'. Hope it helps.

ADD REPLY • link 10 months ago by cwang3444 • 0

0

Your answer helped me a lot, thanks +1

ADD REPLY • link 5.5 years ago by LimMo ▴ 30

score 5 · Answer 2 · 2013-04-22

5

11.0 years ago

David ▴ 740

You have geneIDs that are NA.

Use mget with ifnotfound=NA:

a <- read.csv("entrez ids.csv", header = TRUE)
a.symbol <- as.vector(unlist(mget(a, envir=org.Hs.egSYMBOL, ifnotfound=NA)))

ADD COMMENT • link 11.0 years ago by David ▴ 740

0

I am sorry, I tried that, but it still shows the same problem:

a <- read.csv("C:/Users/Desktop/entrez ids _row.csv", header = TRUE)
a.symbol <- as.vector(unlist(mget(a, envir=org.Hs.egSYMBOL, ifnotfound=NA)))

Error in .checkKeysAreWellFormed(keys) : keys must be supplied in a character vector with no NAs

ADD REPLY • link 11.0 years ago by grosy ▴ 90

0

R tells you what is wrong: "keys must be supplied in a character vector with no NAs"

Do that after read.csv:

a <- as.character(a[!is.na(a)])

ADD REPLY • link 11.0 years ago by David ▴ 740
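A note on filtering NAs by index: the logical mask from is.na() needs to be negated with `!`, not with a unary minus. A small base-R sketch with made-up values shows the difference.

```r
x <- c(3815L, NA, 2341L)

# Unary minus turns the logical mask into numeric indices 0, -1, 0.
# Zeros are ignored, so this just drops the FIRST element and keeps the NA.
x[-is.na(x)]                 # NA 2341

# Logical negation keeps exactly the non-missing elements.
x[!is.na(x)]                 # 3815 2341

# The lookup functions also want character keys, so convert afterwards.
ids <- as.character(x[!is.na(x)])
ids                          # "3815" "2341"
```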

score 1 · Answer 3 · 2013-04-22

1

11.0 years ago

Jordan ★ 1.3k

Another way to do this without coding is to use ID Mapping in UniProt. You can just upload a list of Entrez IDs and then map them.

ADD COMMENT • link 11.0 years ago by Jordan ★ 1.3k

0

Yes, I tried this... but what I want is to go from Entrez ID to gene name. Could you suggest which options to choose to convert from Entrez ID to gene name?

ADD REPLY • link 11.0 years ago by grosy ▴ 90

0

One silly way of doing it is to map the Entrez IDs to UniProt IDs first and then map those to the gene names you need. But I think you already got the answer. I usually download the ID mapping file from UniProt and write my own mapping code in Perl.

ADD REPLY • link 11.0 years ago by Jordan ★ 1.3k
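The download-a-mapping-file approach comes down to a table lookup. Here is a minimal sketch in R (used here instead of Perl to match the rest of the thread). The small mapping data frame is a toy stand-in for a real downloaded file, with its three rows taken from the getSYMBOL() output earlier in the thread; the column names and the unmatched ID are made up.

```r
# Toy mapping table standing in for a downloaded ID mapping file.
mapping <- data.frame(
  entrez = c("3815", "3816", "2341"),
  symbol = c("KIT", "KLK1", "FNTAP2"),
  stringsAsFactors = FALSE
)

# IDs to convert; "9999999" is a made-up ID with no entry in the table.
ids <- c("3816", "9999999", "3815")

# match() gives the row of each ID in the table, or NA if it is absent,
# so unmatched IDs come back as NA instead of raising an error.
symbols <- mapping$symbol[match(ids, mapping$entrez)]
names(symbols) <- ids
symbols   # 3816 -> "KLK1", 9999999 -> NA, 3815 -> "KIT"
```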