Retrieve all genes under a mammalian phenotype ontology term
7.0 years ago
eric.kern13 ▴ 240

I want to retrieve all genes corresponding to a given mammalian phenotype ontology term (for example, MP:0005375), preferably within R. Are there tools to do this? Can I do it within BioMart? Or is the best bet to build something around the APIs here or here?

Related: similar question for GO terms

R GO Mammalian Phenotype Ontology Biomart • 1.9k views

You may like to add biomart to the tags.


I tried to; it didn't work. I'll try again.

7.0 years ago
Mike Smith ★ 2.0k

I'm not sure you can do this using Ensembl's BioMart. There you can filter using specific phenotype ontology terms, but only leaf terms, rather than something quite high level like MP:0005375, which is Adipose Tissue Phenotype and has many sub-terms. I don't think you can query using the phenotype ID itself. I don't know whether any other data store that holds this data provides a BioMart interface, but I can't see one for the two you linked to.

One suggestion is to use the httr package and query MouseMine directly. Here's a fairly crude example, where we query for your phenotype ID, and return the primary ID, gene symbol, and the NCBI Entrez Gene ID.

Load the libraries we'll need, and then create the search query as an XML string:

library(httr)
library(jsonlite)
library(tibble)

phenotypeID <- "MP:0005375"

query <- paste0('<query model="genomic" view="Gene.primaryIdentifier Gene.symbol Gene.ncbiGeneNumber" >
                  <constraint path="Gene.ontologyAnnotations.ontologyTerm.identifier" op="=" code="A" value="',
                  phenotypeID, '" />
                </query>')
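
If you want to run this for more than one term, the string above could be wrapped in a small helper. This is only a sketch: `build_mp_query` is a hypothetical name (not part of httr or MouseMine), and the ID check simply assumes the usual "MP:" plus seven digits format. The XML it produces is the same query as above, just written on one line.

```r
# Hypothetical helper: build the MouseMine query XML for one
# Mammalian Phenotype term ID.
build_mp_query <- function(phenotypeID) {
  stopifnot(is.character(phenotypeID), length(phenotypeID) == 1)
  # MP identifiers look like "MP:" followed by seven digits, e.g. MP:0005375
  if (!grepl("^MP:[0-9]{7}$", phenotypeID)) {
    stop("Not a valid MP term ID: ", phenotypeID)
  }
  paste0(
    '<query model="genomic" ',
    'view="Gene.primaryIdentifier Gene.symbol Gene.ncbiGeneNumber">',
    '<constraint path="Gene.ontologyAnnotations.ontologyTerm.identifier" ',
    'op="=" code="A" value="', phenotypeID, '" />',
    '</query>'
  )
}

query <- build_mp_query("MP:0005375")
```

Rejecting malformed IDs up front means a typo fails locally rather than coming back as an empty result from the server.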

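Then we can submit the query:

postRes <- POST('http://www.mousemine.org/mousemine/service/query/results',
                body = list(query = query, format = 'json'),
                encode = 'form')
# Stop with an informative error if the server returned an HTTP error status
stop_for_status(postRes)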

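Now process the result to give us a tibble with one row per gene:

# Parse the JSON response body into an R list
jsonToTxt <- fromJSON(content(postRes, as = "text", encoding = "UTF-8"))
# One row per gene; column names come from the response's column headers
genes <- as_tibble(jsonToTxt$results)
colnames(genes) <- jsonToTxt$columnHeaders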

Here's the output:

> genes
# A tibble: 69 × 3
   `Gene > Primary Identifier` `Gene > Symbol` `Gene > NCBI Gene Number`
                         <chr>           <chr>                     <chr>
1                   MGI:101884           Ppard                     19015
2                   MGI:101900           Mmp14                     17387
3                   MGI:102797           Acsl1                     14081
4                   MGI:102858           Fosl2                     14284
5                   MGI:103014            Il15                     16168
6                   MGI:104993            Lepr                     16847
7                   MGI:105304           Il6ra                     16194
8                   MGI:105374           Npy4r                     19065
9                   MGI:106387         Arfgef3                    215821
10                  MGI:107571            Cav2                     12390
# ... with 59 more rows
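
To try the processing step without hitting the server, here is a small offline sketch. The JSON string is a hand-written stand-in that only mimics the `columnHeaders` and `results` fields used above (with the first two genes from the output), so treat its exact shape as an assumption. It also names the columns before calling `as_tibble()`, since more recent tibble versions complain when converting a matrix that has no column names.

```r
library(jsonlite)
library(tibble)

# Hand-written stand-in for the server response (assumed shape, not a real reply)
mockJson <- '{
  "columnHeaders": ["Gene > Primary Identifier", "Gene > Symbol",
                    "Gene > NCBI Gene Number"],
  "results": [["MGI:101884", "Ppard", "19015"],
              ["MGI:101900", "Mmp14", "17387"]]
}'

parsed <- fromJSON(mockJson)

# fromJSON simplifies the array of equal-length arrays to a character matrix
resultMatrix <- parsed$results
# Name the columns first so as_tibble() gets unique column names
colnames(resultMatrix) <- parsed$columnHeaders

genes <- as_tibble(resultMatrix)
```

All three columns come back as character, as in the output above, so something like `as.integer()` would be needed on the NCBI column if you want numbers.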

Works like a charm. Thank you very much!
