OrthoDB - How to retrieve ortholog groups for only the selected species
18 months ago
al-ash • 100
Japan/Okinawa/OIST

I'm using the OrthoDB JSON API to retrieve the ids of ortholog clusters that contain genes from selected species (e.g. Blattella germanica and Zootermopsis nevadensis). However, the argument species=6973%2C136037 that I'm passing does not seem to restrict the retrieval; instead, the query returns all ortholog cluster ids predicted at the "Insecta" node. I would expect there to be a straightforward way to limit the retrieval to the selected species, but I cannot figure it out (I'm a novice user).

I expected that selecting only some species and then checking the "Present in all species" option would do the job, but this returns no hits (http://www.orthodb.org/?singlecopy=1&level=50557&species=6973%2C136037&universal=1). If I instead skip this restriction and filter the retrieved ortholog clusters manually with grep, I do obtain ortholog clusters for the selected species.

I'm using the following command to retrieve the ortholog cluster ids:

wget -O retrievedOGs.txt 'http://www.orthodb.org/search?singlecopy=1&limit=40000&level=50557&species=6973%2C136037'


In the subsequent step, I retrieve a table of gene annotations by looping through the cluster ids saved in the variable ORTHOLOGCLUSTERS:

wget -O - 'http://www.orthodb.org/tab?id='"$ORTHOLOGCLUSTERS"'&level=50557&species=6973%2C136037' > OGtable.txt
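
For clarity, the loop I describe could be sketched roughly as follows. The function names build_tab_url and fetch_tables are hypothetical helpers made up for this illustration, the ids are assumed to be separated by whitespace, and the one-second pause is only a guess at a polite request rate, not a documented OrthoDB limit.

```shell
# Hypothetical helper: builds the /tab URL for one cluster id, reusing the
# level and species values from the commands in this post.
build_tab_url() {
    # %%2C prints a literal %2C (the URL-encoded comma between taxon ids)
    printf 'http://www.orthodb.org/tab?id=%s&level=50557&species=6973%%2C136037\n' "$1"
}

# Hypothetical helper: fetches the table for every id passed as an argument
# and appends the results to a single output file.
fetch_tables() {
    out=$1
    shift
    : > "$out"                      # start from an empty output file
    for id in "$@"; do
        wget -q -O - "$(build_tab_url "$id")" >> "$out"
        sleep 1                     # assumed pause; adjust to the real rate limit
    done
}

# Usage (assumes ORTHOLOGCLUSTERS holds whitespace-separated cluster ids):
#   fetch_tables OGtable.txt $ORTHOLOGCLUSTERS
```

Appending with >> rather than redirecting with > keeps the results of earlier iterations in the output file.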


In the output file (OGtable.txt), I obtain lines without any annotation information for those ortholog clusters that have no genes in the selected species. I could filter the required information out of this table, but that is wasteful: I end up fetching many ortholog clusters without any annotation from OrthoDB, which also takes quite some time because of their limit on the number of hits downloaded per second.
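
The after-the-fact filtering I would like to avoid could look something like the sketch below. It assumes the table is tab-separated with the cluster id in the first column and that rows lacking annotation have only empty fields after it; I have not verified this against the exact layout returned by OrthoDB, and drop_empty_rows is a hypothetical name.

```shell
# Hypothetical helper: keeps only rows that carry data beyond the first column.
# Assumes tab-separated input with the cluster id in column 1.
drop_empty_rows() {
    awk -F '\t' '{
        # print the row as soon as any field after the first is non-empty
        for (i = 2; i <= NF; i++)
            if ($i != "") { print; next }
    }' "$@"
}

# Usage: drop_empty_rows OGtable.txt > OGtable.filtered.txt
```

This removes the empty rows but does not save any download time, since every cluster still has to be fetched first.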

Is there a more efficient way to limit the retrieval of ortholog cluster ids to only the species of interest? Thanks!