Identify multidomain protein after hmmscan
14 months ago
dago ♦ 2.5k
Germany

I am having trouble choosing an appropriate criterion to identify the presence of multiple domains in proteins.

I ran an hmmscan search on a list of proteins with the --tblout flag.

The output reports several fields:

--- full sequence ---- --- best 1 domain ---- --- domain number estimation ----
# ..E-value  score  bias   E-value  score  bias   exp reg clu  ov env dom rep inc...
------------------- ---------- -------------------- ---------- --------- ------

From reading the manual, I think the first values to check are the E-values of both the full sequence and the best 1 domain. If the second is significantly lower than the E-value of the full sequence, the results for this protein should be considered carefully.

I also understand that the resulting domains are listed in order of statistical significance, so the first one is the most likely to be present. Now, I have trouble understanding which parameter to consider when deciding whether I am dealing with a multidomain protein or not. Should I consider just the "exp" value?
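
To make the fields above concrete, here is a minimal Python sketch that reads --tblout rows and pulls out the two E-values plus the "exp", "dom" and "inc" columns for each query. It assumes the standard whitespace-delimited layout (18 columns followed by a free-text description). The sample rows, the names famA/famB/prot1/prot2, the helper names and the 1e-5 cutoff are all made up for illustration. Note that --tblout carries no coordinates, so this only lists which families hit a protein; it cannot tell whether they occupy different regions.

```python
from typing import Dict, Iterable, List, NamedTuple


class TbloutRow(NamedTuple):
    target: str             # profile (family) name
    query: str              # protein name
    full_evalue: float      # "full sequence" E-value
    best_dom_evalue: float  # "best 1 domain" E-value
    exp: float              # expected number of domains
    dom: int                # number of domains defined
    inc: int                # domains passing the inclusion thresholds


def parse_tblout(lines: Iterable[str]) -> List[TbloutRow]:
    """Parse hmmscan --tblout lines, skipping comments and blank lines."""
    rows = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split(None, 18)  # field 18, if present, is the description
        if len(f) < 18:
            continue  # not a complete data row
        rows.append(TbloutRow(f[0], f[2], float(f[4]), float(f[7]),
                              float(f[10]), int(f[15]), int(f[17])))
    return rows


def families_per_query(rows: Iterable[TbloutRow],
                       max_evalue: float = 1e-5) -> Dict[str, List[str]]:
    """Group families by protein; max_evalue is an arbitrary example cutoff."""
    out: Dict[str, List[str]] = {}
    for r in rows:
        if r.full_evalue <= max_evalue and r.inc >= 1:
            out.setdefault(r.query, []).append(r.target)
    return out


# Made-up rows in the --tblout column order, for illustration only.
SAMPLE = """\
# target accession query accession E-value score bias E-value score bias exp reg clu ov env dom rep inc description
famA - prot1 - 1e-30 100.0 0.1 1e-20 70.0 0.1 2.1 2 0 0 2 2 2 2 made-up family A
famB - prot1 - 1e-10 40.0 0.0 1e-10 40.0 0.0 1.0 1 0 0 1 1 1 1 made-up family B
famA - prot2 - 1e-15 55.0 0.0 1e-15 55.0 0.0 1.0 1 0 0 1 1 1 1 made-up family A
"""

if __name__ == "__main__":
    hits = families_per_query(parse_tblout(SAMPLE.splitlines()))
    for query, families in hits.items():
        print(query, families)
```

Proteins that come back with more than one family are only candidates; "exp" and "inc" describe repeats of the same profile, not a second, different domain.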

16 months ago
venu 6.2k
Germany

Just run hmmscan with the individual family profiles (the most significant one and the next one are enough) without the --tblout flag (and if you previously used --noali, remove that as well). If you find a continuous gap in the alignment with the first profile and that gap is filled by the second profile, it is a multidomain protein. If the protein is multidomain, the two families containing those domains come up as the first and second results in almost every case.
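
The gap check described above can also be done on coordinates instead of by eye. The following is only a rough Python sketch of that idea, assuming you already have the aligned start/end positions of each profile on the protein (for example read off the alignment output, or taken from hmmscan's --domtblout table). The function names, the 1-based inclusive coordinates and the 0.8 coverage fraction are my own assumptions, not anything hmmscan defines.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # 1-based, inclusive (start, end) on the protein


def overlap(a: Region, b: Region) -> int:
    """Number of residues shared by two inclusive regions (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def uncovered_gaps(length: int, regions: List[Region]) -> List[Region]:
    """Stretches of the protein that none of the given regions cover."""
    gaps = []
    pos = 1
    for start, end in sorted(regions):
        if start > pos:
            gaps.append((pos, start - 1))
        pos = max(pos, end + 1)
    if pos <= length:
        gaps.append((pos, length))
    return gaps


def fills_gap(length: int, first: List[Region], second: List[Region],
              min_fraction: float = 0.8) -> bool:
    """True if some region of the second profile lies mostly inside a gap
    left by the first profile. min_fraction is an arbitrary example value."""
    gaps = uncovered_gaps(length, first)
    for region in second:
        size = region[1] - region[0] + 1
        if any(overlap(region, gap) / size >= min_fraction for gap in gaps):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical 300-residue protein: first profile aligns to 10-120.
    print(fills_gap(300, [(10, 120)], [(140, 280)]))  # second sits in the gap
    print(fills_gap(300, [(10, 120)], [(20, 110)]))   # second overlaps first
```

If the second profile only matches where the first one already does, the two are more likely competing annotations of the same domain than evidence for two domains.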
