AnnotationDbi returns a different list of symbols than the list derived directly from the database itself
4.9 years ago

Hi.

I'm trying to annotate probe IDs from the Affymetrix Mouse Gene 1.0-ST Array with gene symbols.

I used the R package mogene10sttranscriptcluster.db (v8.7.0) for the annotation.

But the two approaches below give different results.

1) Using mogene10sttranscriptcluster.db directly

library(mogene10sttranscriptcluster.db)

a <- contents(mogene10sttranscriptclusterSYMBOL)

# a$'10344741'
# [1] NA

2) Using AnnotationDbi to extract the info

library(mogene10sttranscriptcluster.db)
library(AnnotationDbi)

# All probe IDs known to the chip package
k <- keys(mogene10sttranscriptcluster.db, keytype = "PROBEID")
# Map each probe ID to a gene symbol
b <- mapIds(mogene10sttranscriptcluster.db, keys = k, column = "SYMBOL", keytype = "PROBEID")
b["10344741"]

# 10344741
# "Hnrnpa3"

Both results have the same length: length(a) and length(b) are both 35556.

However, some probes have a symbol in (2) but NA in (1).
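
To see exactly which probes differ, a small helper along these lines might be useful. This is only a sketch: the function name `diff_probes` is made up for illustration, and it assumes `a` is the named list from contents() and `b` is the named character vector from mapIds(), as in the two snippets above.

```r
# Hypothetical helper: probe IDs that are NA in the Bimap result (a)
# but have a symbol in the mapIds() result (b).
diff_probes <- function(a, b) {
  # Take the first entry of each list element so that `a` becomes
  # a named character vector comparable to `b`.
  a_first <- vapply(a, function(x) as.character(x[1]), character(1))
  # Align b to the probe order of a by name.
  b <- b[names(a_first)]
  names(a_first)[is.na(a_first) & !is.na(b)]
}

# Usage with the objects from this post (requires the Bioconductor packages):
# only_in_b <- diff_probes(a, b)
# head(only_in_b)
# # List every symbol mapped to those probes:
# select(mogene10sttranscriptcluster.db, keys = only_in_b,
#        columns = "SYMBOL", keytype = "PROBEID")
```

One thing worth checking in the select() output is whether these probes map to more than one symbol, since mapIds() by default keeps only the first match (multiVals = "first").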

Both approaches use the same database, mogene10sttranscriptcluster.db, so why do they give different results?

Does AnnotationDbi convert probe IDs to some other IDs first and then convert those to gene symbols?

The second approach seems to return more symbols, so is that the one I should use?

I'm very confused right now.

gene R AnnotationDbi mapIds • 1.4k views
4.9 years ago

I found the answer myself.

It seems that mogene10sttranscriptcluster.db relies on org.Mm.eg.db for its gene-level annotation.

The version of org.Mm.eg.db that ends up being used apparently differed between the two approaches.

I noticed this because, with a different version of org.Mm.eg.db loaded, the same version of mogene10sttranscriptcluster.db (v8.7.0) produced different results.

So, check the version of your org.Mm.eg.db.
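
A quick way to compare versions is sketched below. The helper name `pkg_versions` is hypothetical; it only wraps base R calls and returns NA for packages that are not installed.

```r
# Hypothetical helper: report the installed version of each package,
# or NA if the package is not installed.
pkg_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

# Usage (assumes the packages from this thread are installed):
# pkg_versions(c("AnnotationDbi", "mogene10sttranscriptcluster.db"))
# # Show which org.*.eg.db package the chip package depends on:
# packageDescription("mogene10sttranscriptcluster.db")$Depends
```

Running sessionInfo() after loading the packages also lists every attached package with its version, which makes it easier to compare two R sessions that give different results.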
