How to parse HMMSCAN output to enable comparison of domain architecture for several proteins
7.5 years ago
Juan Cordero ▴ 140

Dear community,

I have a FASTA file with several (co-)orthologous proteins from 2 different species. I want to (1) determine the domain architecture of each protein, (2) extract, for each protein, the ordered sequence of domains that are likely true hits, and (3) compare those domain sequences across proteins in terms of presence/absence and order of appearance.

steps 1 & 2: I can do this manually for a small set of proteins in one FASTA file, but it becomes too tedious with 1000 FASTA files. Does anyone know of a parser/tool that retrieves the significant domains for every protein from hmmscan output?
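To make steps 1 & 2 concrete, here is a minimal sketch of the kind of parser I have in mind. It assumes hmmscan was run with the --domtblout option (for example: hmmscan --domtblout proteins.domtblout Pfam-A.hmm proteins.fasta) and reads the whitespace-separated per-domain table. The function name parse_domtblout and the i-Evalue cutoff of 1e-5 are my own assumptions, not an established standard, and overlapping hits are not resolved here.

```python
from collections import defaultdict


def parse_domtblout(lines, max_ievalue=1e-5):
    """Return {protein: [domain names ordered by position on the protein]}.

    `lines` is any iterable of text lines from an hmmscan --domtblout file.
    `max_ievalue` is an assumed significance cutoff; tune it for your data.
    """
    hits = defaultdict(list)
    for line in lines:
        # Skip blank lines and the '#' header/footer comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        # 22 fixed columns, then a free-text description that may contain spaces.
        fields = line.split(maxsplit=22)
        if len(fields) < 22:
            continue  # malformed row
        domain = fields[0]            # target name (the profile HMM, e.g. a Pfam family)
        protein = fields[3]           # query name (the protein sequence)
        ievalue = float(fields[12])   # independent E-value of this domain hit
        env_from = int(fields[19])    # envelope start on the protein
        env_to = int(fields[20])      # envelope end on the protein
        if ievalue <= max_ievalue:
            hits[protein].append((env_from, env_to, domain))
    # Sort each protein's hits by envelope coordinates and keep the names only.
    return {p: [name for _, _, name in sorted(h)] for p, h in hits.items()}
```

With a file on disk this would be called as: with open("proteins.domtblout") as fh: architectures = parse_domtblout(fh). Looping over the 1000 output files and merging the resulting dictionaries would then give one domain sequence per protein.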

step 3: I have found metrics such as WDAC (Weighted Domain Architecture Comparison), ADASS (alignment-free domain architecture similarity search) and DA-score (Domain Architecture similarity score), but I could not find any benchmark comparing these three or other methods. Does anyone know which of the three is the most accurate, or whether there are better alternatives?
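As a point of reference for step 3, the two things I care about (presence/absence and order) can be captured by a very simple baseline. The sketch below is NOT WDAC, ADASS or DA-score; it just combines a Jaccard index on the sets of domains with a plain edit distance on the ordered domain lists. The function names and the example domain names are illustrative assumptions.

```python
def jaccard(arch_a, arch_b):
    """Presence/absence similarity: shared domains / all domains (0.0 to 1.0)."""
    set_a, set_b = set(arch_a), set(arch_b)
    union = set_a | set_b
    # Two proteins with no domains at all are treated as identical.
    return len(set_a & set_b) / len(union) if union else 1.0


def edit_distance(arch_a, arch_b):
    """Order-aware distance: minimum number of domain insertions, deletions
    or substitutions needed to turn one architecture into the other."""
    prev = list(range(len(arch_b) + 1))
    for i, dom_a in enumerate(arch_a, 1):
        cur = [i]
        for j, dom_b in enumerate(arch_b, 1):
            cur.append(min(
                prev[j] + 1,                    # delete dom_a
                cur[j - 1] + 1,                 # insert dom_b
                prev[j - 1] + (dom_a != dom_b)  # match or substitute
            ))
        prev = cur
    return prev[-1]


# Hypothetical architectures for two orthologs.
a = ["SH3", "SH2", "Pkinase"]
b = ["SH2", "Pkinase"]
print(jaccard(a, b), edit_distance(a, b))
```

The two measures are kept separate on purpose: a swapped domain order leaves the Jaccard index at 1.0 but gives a non-zero edit distance, so the pair of numbers distinguishes "same domains, different order" from "different domains".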

I am quite new to this kind of analysis and feel a bit lost.

Thanks a lot in advance

hmmer pfam domain architecture homology • 2.4k views