Ensembl id to GeneSymbol with biomart
6.7 years ago
Vasu ▴ 770

Hello,

I have 3224 Ensembl ids as row names in a data frame "G". To convert the Ensembl ids into gene symbols, I used biomaRt as follows.

library(biomaRt)
mart <- useDataset("hsapiens_gene_ensembl", useMart("ensembl"))
genes <- rownames(G)  # Ensembl gene ids to look up
G <- G[, -6]          # drop column 6
G_list <- getBM(filters = "ensembl_gene_id",
                attributes = c("ensembl_gene_id", "hgnc_symbol"),
                values = genes, mart = mart)

Now G_list contains only 3200 Ensembl ids, some with gene symbols and some without. Why are the other 24 Ensembl ids missing from G_list? If those 24 ids have no gene symbol, they should at least show "-".

What is the problem here?

6.7 years ago

There are often many-to-many relationships between Ensembl ids and HGNC symbols, which makes it tedious to obtain exact gene symbols. It is better to use the mapIds function (from AnnotationDbi) with the org.Hs.eg.db annotation package to resolve those relations. I wrote a nifty function that keeps one mapping per id (the first match, via multiVals = "first"). It returns a list with 2 elements: the 1st element is a data frame of the mapped ids, and the 2nd element is a vector of the unmapped ids, which you can remove from your dataset if required.

Hope it helps!

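mapIds2 <- function(IDs, IDFrom, IDTo) {
  library(org.Hs.eg.db)  # also attaches AnnotationDbi, which provides mapIds()
  # Map each id to a single value; multiVals = "first" keeps only the first match
  idmap <- mapIds(x = org.Hs.eg.db, keys = IDs, column = IDTo, keytype = IDFrom, multiVals = "first")
  na_vec <- names(idmap)[is.na(idmap)]  # ids with no mapping
  idmap <- idmap[!is.na(idmap)]         # ids with a mapping
  idmap_df <- data.frame(From = names(idmap), To = unlist(unname(idmap)), stringsAsFactors = FALSE)
  list(map = idmap_df, noMap = na_vec)
}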
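
As for the original question: as far as I know, getBM only returns rows for ids it finds in the Ensembl release being queried, so ids that are not found (for example retired ones) are dropped without a placeholder. Here is a minimal base-R sketch for listing the dropped ids and putting them back with "-". The ids and symbols are made up, and genes and G_list are toy stand-ins for rownames(G) and the getBM output, so treat this as an illustration under those assumptions.

```r
# Toy stand-ins (hypothetical ids and symbols) for rownames(G) and the getBM() result
genes <- c("ENSG00000000001", "ENSG00000000002",
           "ENSG00000000003", "ENSG00000000004")
G_list <- data.frame(
  ensembl_gene_id = c("ENSG00000000001", "ENSG00000000002",
                      "ENSG00000000002", "ENSG00000000003"),
  hgnc_symbol     = c("GENEA", "GENEB", "GENEC", ""),
  stringsAsFactors = FALSE
)

# Ids that were queried but are absent from the result
missing_ids <- setdiff(genes, G_list$ensembl_gene_id)

# Keep one row per id (the first symbol listed), then left-join onto the full id list
G_first <- G_list[!duplicated(G_list$ensembl_gene_id), ]
annot <- merge(data.frame(ensembl_gene_id = genes, stringsAsFactors = FALSE),
               G_first, by = "ensembl_gene_id", all.x = TRUE)

# Show "-" for ids with no row at all (NA) or with an empty symbol
annot$hgnc_symbol[is.na(annot$hgnc_symbol) | annot$hgnc_symbol == ""] <- "-"
```

Here missing_ids holds the ids that never came back, and annot has one row per queried id. Note that merge sorts the rows by id, so reorder with match(genes, annot$ensembl_gene_id) if you need the original row order of G.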
