Literature Mining For Domains Involved In Regeneration
10.8 years ago
James Ashmore ▴ 100

Hello everyone,

I have a list of around 500 Pfam-A domains, and I would like to mine the current literature for any involvement of proteins containing these domains in regeneration. The best option I have come up with so far is to enter the protein sequence into STITCH and work from the results. However, I wonder whether this limits my search in some way, or whether there is a more refined search I could use for Pfam-A domains.

Thank you, James

domain literature • 2.2k views
10.8 years ago

STITCH uses pre-curated information. If you want to search the literature directly, I believe Google Scholar is your only option. You can use an unofficial Scholar API like this one (you may want to search around a little; I think there are Perl and Ruby versions, too), send your Pfam domains through Google Scholar, and extract the links to all papers along with the short result snippets.

Google will block you if you run more than around 1000 requests per day, so don't send all HTTP queries at once; keep a delay of several seconds between them. You can also spread your requests over two days. (Like me, you can listen to the XX's intro while waiting.)
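
The throttling could look roughly like the sketch below. It is only an illustration: `build_query`, `run_throttled`, and the `fetch` callable are hypothetical names, and `fetch` stands in for whichever unofficial Scholar client you end up using. The 5-second delay and the cap of 1000 are assumptions based on the rough figures above, not documented limits.

```python
import time
from typing import Callable, Dict, Iterable, List


def build_query(domain: str, topic: str = "regeneration") -> str:
    """Combine a quoted Pfam domain name with the topic of interest."""
    return f'"{domain}" {topic}'


def run_throttled(
    domains: Iterable[str],
    fetch: Callable[[str], List[str]],
    delay_seconds: float = 5.0,
    daily_limit: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, List[str]]:
    """Send one query per domain, pausing between consecutive requests.

    Stops once daily_limit requests have been sent, so the remaining
    domains can be run on another day. `fetch` is a placeholder for the
    real search call and must return a list of hits for a query string.
    """
    results: Dict[str, List[str]] = {}
    for sent, domain in enumerate(domains):
        if sent >= daily_limit:
            break
        if sent > 0:
            # Wait between requests instead of sending them all at once.
            sleep(delay_seconds)
        results[domain] = fetch(build_query(domain))
    return results
```

With 500 domains and a 5-second delay, one pass takes a little over 40 minutes and stays well under the daily cap.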

Otherwise, you can email me and we can do this together as a little research project. This Pfam question has come up a few times; I just never found the time to work on it, and I know nothing about proteins. I have a few million research papers here that I can search for a regular expression in around 15 minutes. We can prepare a track that includes references to all Pfam domains from the genome browser and provide it as a resource; this would solve the problem once and for all.
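
As a rough idea of what such a regular-expression search might look like, here is a minimal sketch. It assumes the papers are available as `(paper_id, text)` pairs and that a paper counts as relevant only if it also contains a word starting with "regenerat"; both choices, and the function names, are illustrative assumptions rather than a description of the actual pipeline.

```python
import re
from typing import Dict, Iterable, List, Sequence, Tuple

# Matches "regeneration", "regenerative", "regenerate", ...
REGEN = re.compile(r"\bregenerat\w*", re.IGNORECASE)


def compile_domain_pattern(domains: Sequence[str]) -> "re.Pattern[str]":
    """Build one alternation over all domain names or accessions."""
    # Longest names first, so a longer name wins over one that is its prefix.
    escaped = sorted((re.escape(d) for d in domains), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def find_mentions(
    papers: Iterable[Tuple[str, str]], domains: Sequence[str]
) -> Dict[str, List[str]]:
    """Map each domain to the IDs of papers that mention it and regeneration."""
    pattern = compile_domain_pattern(domains)
    canonical = {d.lower(): d for d in domains}
    hits: Dict[str, List[str]] = {}
    for paper_id, text in papers:
        if not REGEN.search(text):
            continue  # skip papers that never mention regeneration
        for match in pattern.finditer(text):
            name = canonical[match.group(0).lower()]
            ids = hits.setdefault(name, [])
            if paper_id not in ids:
                ids.append(paper_id)
    return hits
```

Matching on plain names will produce false positives for domains whose names are common words, so searching for accessions (for example `PF00046`) alongside the names is probably worth doing.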
