How to browse InterProScan, BLAST, HMMER results to retrieve relevant results?
7.2 years ago
keepclam ▴ 10

I just started working on two already annotated transcriptomes; they have been annotated with BLASTP, InterProScan, HMMER and GO terms. I keep running into problems when retrieving functional information (e.g., all the endonucleases) from the annotation results, and I always end up running keyword searches in the terminal with grep and similar tools, which gives me only partial results.

Is there any smarter and more biologically correct way to browse annotation results?

annotation interproscan RNA-Seq bash • 2.3k views

What format are the annotations in? Genbank/GFF or just text without format?


No, simple tab-separated text. One line of InterPro output looks like this:

Locus_22164 7c6c3b32b99a4166ca3d7b5a78aa251c 719 SMART SM00487 DEAD-like helicases superfamily 145 352 6.0E-54 T 12-08-2015 IPR014001 Helicase, superfamily 1/2, ATP-binding domain

while one line of HMMER output looks like this:

GATA PF00320.22 36 Locus_3921 - 694 1e-05 25.9 0.8 1 2 0.13 1.5e+03 -0.2 0.1 13 25 507 518 506 518 0.89 GATA zinc finger


When you say you want to "browse" what are you expecting out of that? Do you need a summary of all different types of domains identified? Are you interested in knowing how many loci have no identifiable function?

Potentially you could use awk to cut columns out of these files, followed by some sort of sorting to classify the results.
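A minimal sketch of that awk-plus-sort idea, assuming the file is standard InterProScan TSV output (column 1 = locus ID, column 12 = InterPro accession, column 13 = InterPro description). The function names and the file name `interpro.tsv` are made up for illustration:

```shell
#!/bin/sh
# Count distinct loci per InterPro entry, most frequent entry first.
# Assumed columns: 1 = locus ID, 12 = InterPro accession,
# 13 = InterPro description (standard InterProScan TSV layout).
summarize_interpro() {
    tab=$(printf '\t')
    awk -F'\t' '
        # Count each locus only once per InterPro entry.
        $12 ~ /^IPR/ && !seen[$1 FS $12]++ { n[$12 FS $13]++ }
        END { for (k in n) print n[k] FS k }
    ' "$1" | sort -t "$tab" -k1,1nr -k2,2
}

# List loci that never receive an InterPro accession on any line.
loci_without_interpro() {
    awk -F'\t' '
        { all[$1] }
        $12 ~ /^IPR/ { hit[$1] }
        END { for (l in all) if (!(l in hit)) print l }
    ' "$1" | sort
}
```

Something like `summarize_interpro interpro.tsv | head` would then give a quick overview of the most common domains, and `loci_without_interpro interpro.tsv` the loci without any integrated InterPro entry. Note that the second function only sees loci that appear in the file at all.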


I'd like to retrieve all proteins belonging to a given group of interest. Let's suppose I want to retrieve all nucleases. If I grep for "nuclease", I automatically exclude from my results all those nucleases that don't have "nuclease" in their annotation. Is there any means to circumvent this problem? Note that InterProScan and HMMER give database IDs for their results (Pfam for HMMER and various databases for InterProScan).
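Since both tools report database IDs, one possible route is to filter on accessions instead of description text. The sketch below assumes a hand-curated file with one accession per line (for example, families looked up in InterPro or Pfam for the group of interest), standard InterProScan TSV columns (5 = signature accession, 12 = InterPro accession), and HMMER output in the domain-table layout shown above (column 2 = Pfam accession with a version suffix, column 4 = query name). The function names and file names are hypothetical:

```shell
#!/bin/sh
# Print loci whose InterProScan hits match any accession in the list.
# $1 = file with one accession per line (e.g. IPR014001 or SM00487)
# $2 = InterProScan TSV output
loci_for_accessions() {
    awk -F'\t' '
        NR == FNR { want[$1]; next }             # first file: wanted IDs
        ($5 in want) || ($12 in want) { print $1 }
    ' "$1" "$2" | sort -u
}

# Print loci whose HMMER hits match any Pfam accession in the list.
# $1 = file with one unversioned accession per line (e.g. PF00320)
# $2 = HMMER domain table (whitespace-separated)
loci_for_pfam() {
    awk '
        NR == FNR { want[$1]; next }             # first file: wanted IDs
        /^#/ { next }                            # skip comment lines
        { acc = $2; sub(/\.[0-9]+$/, "", acc) }  # PF00320.22 -> PF00320
        (acc in want) { print $4 }
    ' "$1" "$2" | sort -u
}
```

With the example lines above, an ID list containing `IPR014001` and `PF00320` would return `Locus_22164` from the InterProScan file and `Locus_3921` from the HMMER file. The result is only as complete as the accession list, so this moves the curation effort from guessing keywords to collecting the relevant family IDs.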

7.2 years ago
cdsouthan ★ 1.9k

Wait for the ORFs to get into UniProt, then it should be easy
