Using the Correct Ensembl Organism Database in topGO in R
5.2 years ago
dthorbur ★ 1.9k

I'm using Bioconductor with the packages topGO and biomaRt. I'm about to run my first GO enrichment analysis, but I'm a little lost on how to correctly assign the mapping database for my study organism (three-spined stickleback, Gasterosteus aculeatus). I've seen that the human database is "org.Hs.eg.db", and the workflow I found (below) uses the mouse database, "org.Mm.eg". My guess for stickleback would be "org.Ga.eg", but I'd like to confirm this if possible.

go_data <- new("topGOdata",
               ontology = "BP",
               allGenes = gene_universe,
               nodeSize = 5,
               annotationFun = annFUN.org,
               mapping = "org.Mm.eg",
               ID = "ensembl")

I've had a look on the Ensembl website, but I can't find the appropriate information. So my main question is: where on Ensembl would I find it?

Thanks.

R Ensembl topGO bioconductor • 3.2k views
5.2 years ago
h.mon 35k

I believe you are looking for org.Gg.eg.db. But why are you looking at the Ensembl site? This is a Bioconductor package.

edit:

Sorry, I had organisms of agronomic importance in mind; the package I linked before is for chicken (Gallus gallus). You can build an org.db package, however:

Creating select Interfaces for custom Annotation resources

I never built an org.db package before, so I don't know how much trouble it would be.
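
A possible route that avoids building an org.db package is topGO's annFUN.gene2GO, which takes a plain named list of GO IDs per gene. The sketch below pulls Ensembl gene ID / GO ID pairs with biomaRt and reshapes them. It is untested against a live mart: the dataset name "gaculeatus_gene_ensembl" is an assumption (check it with listDatasets()), and the helper names fetch_go_pairs, build_gene2go and make_go_data are made up for illustration. Only the toy example at the bottom actually runs when the block is sourced.

```r
# Hypothetical helper: download Ensembl gene ID / GO ID pairs with biomaRt.
# ASSUMPTION: the stickleback dataset is called "gaculeatus_gene_ensembl";
# confirm with biomaRt::listDatasets(biomaRt::useMart("ensembl")).
fetch_go_pairs <- function(dataset = "gaculeatus_gene_ensembl") {
  mart <- biomaRt::useMart("ensembl", dataset = dataset)
  biomaRt::getBM(attributes = c("ensembl_gene_id", "go_id"), mart = mart)
}

# Reshape the two-column table into the named list annFUN.gene2GO expects:
# names are gene IDs, values are character vectors of GO IDs.
build_gene2go <- function(go_pairs) {
  # Genes without annotation come back with an empty go_id; drop those rows.
  keep <- !is.na(go_pairs$go_id) & go_pairs$go_id != ""
  go_pairs <- go_pairs[keep, ]
  lapply(split(go_pairs$go_id, go_pairs$ensembl_gene_id), unique)
}

# Hypothetical helper: same call as in the question, but with a custom
# gene-to-GO mapping instead of annFUN.org and an org.db package.
make_go_data <- function(gene_universe, gene2go) {
  library(topGO)  # needs to be attached so the topGOdata class is found
  new("topGOdata",
      ontology = "BP",
      allGenes = gene_universe,
      nodeSize = 5,
      annotationFun = annFUN.gene2GO,
      gene2GO = gene2go)
}

# Toy input with made-up gene IDs, just to show the reshaping step.
toy_pairs <- data.frame(
  ensembl_gene_id = c("geneA", "geneA", "geneB", "geneC", "geneA"),
  go_id = c("GO:0008150", "GO:0009987", "GO:0008150", "", "GO:0008150"),
  stringsAsFactors = FALSE
)
toy_gene2go <- build_gene2go(toy_pairs)
str(toy_gene2go)
```

With real data this would be something like gene2go <- build_gene2go(fetch_go_pairs()) followed by make_go_data(gene_universe, gene2go), where gene_universe is the same named vector used in the question.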


Just in case anyone else comes across this post, I have just been shown a Bioconductor package that could be of use. It's called AnnotationDbi (https://www.bioconductor.org/packages/devel/bioc/manuals/AnnotationDbi/man/AnnotationDbi.pdf), and it can apparently retrieve GO annotations for three-spined sticklebacks (Gasterosteus aculeatus).


Ah, I assumed that since it said Ensembl ID, it was using an Ensembl database. I think the best way forward would be to use orthologous gene IDs from human, mouse, or zebrafish as my gene IDs instead of the stickleback ones.
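
For anyone wanting to try the ortholog route, here is a rough sketch of how the ID swap might look with biomaRt. It is an illustration under assumptions, not a tested pipeline: the dataset name "gaculeatus_gene_ensembl" should be checked with listDatasets(), the drerio_homolog_* attribute names should be checked with listAttributes(), and fetch_zebrafish_orthologs and translate_universe are made-up helper names. Keeping only one-to-one orthologs is a design choice here, so genes with ambiguous or missing orthologs drop out of the universe.

```r
# Hypothetical helper: stickleback -> zebrafish ortholog table via biomaRt.
# ASSUMPTION: dataset and attribute names as written; verify them with
# biomaRt::listDatasets() and biomaRt::listAttributes() before relying on this.
fetch_zebrafish_orthologs <- function(dataset = "gaculeatus_gene_ensembl") {
  mart <- biomaRt::useMart("ensembl", dataset = dataset)
  biomaRt::getBM(
    attributes = c("ensembl_gene_id",
                   "drerio_homolog_ensembl_gene",
                   "drerio_homolog_orthology_type"),
    mart = mart)
}

# Rename a named gene universe (names = stickleback IDs) to zebrafish IDs,
# keeping only genes that have a one-to-one ortholog.
translate_universe <- function(gene_universe, orthologs) {
  is_one2one <- orthologs$drerio_homolog_orthology_type %in% "ortholog_one2one"
  one2one <- orthologs[is_one2one, ]
  idx <- match(names(gene_universe), one2one$ensembl_gene_id)
  translated <- gene_universe[!is.na(idx)]
  names(translated) <- one2one$drerio_homolog_ensembl_gene[idx[!is.na(idx)]]
  translated
}

# Toy example with made-up IDs: gB has two zebrafish hits, so it is dropped.
toy_universe <- c(gA = 0.01, gB = 0.5, gC = 0.2)
toy_orthologs <- data.frame(
  ensembl_gene_id = c("gA", "gB", "gB", "gC"),
  drerio_homolog_ensembl_gene = c("zA", "zB1", "zB2", "zC"),
  drerio_homolog_orthology_type = c("ortholog_one2one", "ortholog_one2many",
                                    "ortholog_one2many", "ortholog_one2one"),
  stringsAsFactors = FALSE
)
toy_translated <- translate_universe(toy_universe, toy_orthologs)
print(toy_translated)
```

The translated universe could then go into the topGOdata call from the question with mapping = "org.Dr.eg.db" and ID = "ensembl", assuming the zebrafish org.db package is installed.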


org.db packages provide mappings between several features and databases, such as mappings between Entrez and Ensembl gene identifiers, between genes and GO categories, and so on. Each one is an amalgamation of information from several sources.


Thanks. I think I'll forgo making my own, as I reckon it would be trickier than it sounds. Also, I'm guessing a sizeable portion of the gene predictions in sticklebacks came from zebrafish, so it makes sense to use zebrafish for the gene ontology.
