Question

Hmmscan Bias Values Greater than One

0

9.4 years ago

pld 5.1k

My understanding is that the bias value in the hmmscan output is supposed to range between 0 and 1:

Bias - The bias composition correction (ranging between 0 and 1), is the bit score difference contributed by the null2 model. High bias scores may be a red flag for a false positive. It is difficult to correct for all possible ways in which nonrandom but nonhomologous biological sequences can appear to be similar, such as short-period tandem repeats, so there are cases where the bias correction is not strong enough (creating false positives).

http://hmmer.janelia.org/help/result

However, for many hits against the PfamA database, I see bias values above 1 in both the "full sequence" and "this domain" categories. Am I missing something?

Here's the command I used

hmmscan --domtblout <out fi> --cpu 22 PfamA.hmm <input data> > hmmscan.log

The version is 3.1b1

hmmscan hmmer pfam • 3.6k views

ADD COMMENT • link updated 2.1 years ago by Ram 43k • written 9.4 years ago by pld 5.1k

0

From the user guide:

The next number, the bias, is a correction term for biased sequence composition that has been applied to the sequence bit score. For instance, for the top hit MYG_PHYCA that scored 222.7 bits, the bias of 3.2 bits means that this sequence originally scored 225.9 bits, which was adjusted by the slight 3.2 bit biased-composition correction. The only time you really need to pay attention to the bias value is when it's large, on the same order of magnitude as the sequence bit score.

After reading this part, it makes more sense why the bias value can be above one, but now I'm not sure why the documentation on the web page says it ranges between 0 and 1.
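To make the arithmetic in the quoted passage concrete, here is a minimal Python sketch. It assumes, as the user guide text states, that the reported bit score already has the bias subtracted. The function names and the 0.1 cutoff are hypothetical choices for illustration; the guide only says to look closely when the bias is on the same order of magnitude as the score, and gives no numeric threshold.

```python
def uncorrected_score(score: float, bias: float) -> float:
    """Bit score before the biased-composition correction was subtracted."""
    return score + bias


def bias_is_large(score: float, bias: float, ratio: float = 0.1) -> bool:
    """Flag hits whose bias is a sizeable fraction of the reported score.

    The ratio of 0.1 is an arbitrary illustrative cutoff, not a HMMER default.
    """
    if score <= 0:
        # A non-positive score with any positive bias deserves a look.
        return bias > 0
    return bias / score >= ratio


# Values taken from the MYG_PHYCA example quoted above.
print(round(uncorrected_score(222.7, 3.2), 1))  # 225.9
print(bias_is_large(222.7, 3.2))                # False
```

With this reading, a bias of 3.2 on a 222.7-bit hit is negligible, while a bias of, say, 15 on a 20-bit hit would be flagged.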

ADD REPLY • link updated 2.1 years ago by Ram 43k • written 9.4 years ago by pld 5.1k

1

9.4 years ago

Siva ★ 1.9k

I checked the HMMER3 user guide and found the following footnote on page 18:

The method that HMMER3 uses to compensate for biased composition is unpublished, and different from HMMER2. We will write it up when there's a chance.

The example hmmsearch result on the same page has bias values greater than 1.

   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence              Description
    ------- ------ -----    ------- ------ -----   ---- --  --------              -----------
      6e-65  222.7   3.2    6.7e-65  222.6   2.2    1.0  1  sp|P02185|MYG_PHYCA   Myoglobin OS=Physeter catodon GN=MB PE
    3.1e-63  217.2   0.1    3.4e-63  217.0   0.0    1.0  1  sp|P02024|HBB_GORGO   Hemoglobin subunit beta OS=Gorilla gor

I have not used HMMER3 yet and it seems there are major changes compared to HMMER2.

ADD COMMENT • link updated 2.1 years ago by Ram 43k • written 9.4 years ago by Siva ★ 1.9k

0

That's not very comforting. I guess I'll email them.

ADD REPLY • link updated 2.1 years ago by Ram 43k • written 9.4 years ago by pld 5.1k

Ram · Accepted Answer · 2014-12-09

1

9.4 years ago

pld 5.1k

The user guide's example and definition are correct. The bias field in HMMER results is the difference in bit score introduced by the bias correction, so it is expressed in bits and can be greater than one.

I just heard back from them; they are fixing the website now.
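For anyone who wants to check bias values in their own --domtblout file, here is a rough Python sketch. It assumes the standard whitespace-delimited HMMER3 domain table layout, where columns 8 and 9 hold the full-sequence score and bias, and columns 14 and 15 hold the per-domain score and bias. The function name and dictionary keys are my own, not part of HMMER.

```python
from typing import Dict, Iterable, Iterator


def parse_domtblout(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield score and bias fields from HMMER3 --domtblout lines."""
    for line in lines:
        # Skip blank lines and the '#' header/footer comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        # The first 22 columns are space-delimited; the 23rd (description)
        # may itself contain spaces, so limit the number of splits.
        fields = line.split(None, 22)
        if len(fields) < 22:
            continue  # malformed or truncated row
        yield {
            "target": fields[0],
            "query": fields[3],
            "seq_score": float(fields[7]),
            "seq_bias": float(fields[8]),
            "dom_score": float(fields[13]),
            "dom_bias": float(fields[14]),
        }
```

A typical use would be to open the table, iterate over parse_domtblout(handle), and print the rows where seq_bias is comparable in size to seq_score.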

ADD COMMENT • link updated 2.1 years ago by Ram 43k • written 9.4 years ago by pld 5.1k