A question about the HMMER package
5.1 years ago
howenwy2 • 0

I am currently using the HMMER package (http://hmmer.org/) for sequence analysis. The problem is that I get no hits, even though the query data I used comes from the same genome. Here are the details:

I want to analyze binding sites in a bacterial genome. I aligned 4 sequences (13 nucleotides each) that contain known binding sites and generated a .hmm file with the 'hmmbuild' command. I then used the 'hmmsearch' command to look for binding sites in a bacterial genome (a circular chromosome of 3,805,573 base pairs).

Since the query sequences are from the same genome, the search should return at least 4 hits; however, no hits are detected.

I don't know how to fix this. It seems that I may need to change some default values.

I use OS X 10.10.5, and my HMMER version is the latest one (3.2.1). Is that version compatible with my system? If not, which version would you recommend?

hmmer sequence alignment • 1.1k views

The search doesn't have to find the original sequences. Does the resulting motif have any information in it? You can get an idea by plotting the MSA of the input. I suspect that four sequences are too few to generate a meaningful motif.


Thank you so much for the reply. I am new to bioinformatics, and I am not sure I fully understand your reply. Here is the MSA of my input:

CLUSTAL O(1.2.4) multiple sequence alignment

  1. gene1 TTTGAGTGTGTTA 13
  2. gene2 TTTGATCTGGTTA 13
  3. gene3 ATTGAGGTAGTTA 13
  4. gene4 TTTGAGGCTATTG 13

I need to find (T/A)TTGANNNNNTT(G/A) in my genome sequence. gene1 to gene4 are sequences from this same genome, and I need to find the same motif in the other genes of this genome. Could it be that the sequences are too short, or too few, for the search to find anything in the genome?
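One way to check whether the four aligned sites carry any information is to compute the per-column information content by hand. The following is only a minimal sketch in Python: it assumes a uniform background over A, C, G, T and applies no small-sample correction, so with four sequences the values are optimistic. The function names `column_information` and `consensus_pattern` are made up for this illustration and are not part of HMMER.

```python
import math
from collections import Counter

# The four aligned 13-nt sites quoted in the thread.
SITES = [
    "TTTGAGTGTGTTA",
    "TTTGATCTGGTTA",
    "ATTGAGGTAGTTA",
    "TTTGAGGCTATTG",
]

def column_information(sites):
    """Return per-column information in bits (2 minus the Shannon entropy).

    Assumes a uniform background over A, C, G, T and applies no
    small-sample correction.
    """
    n = len(sites)
    info = []
    for column in zip(*sites):
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        info.append(2.0 - entropy)
    return info

def consensus_pattern(sites):
    """Build a pattern listing the bases observed in each alignment column."""
    parts = []
    for column in zip(*sites):
        bases = "".join(sorted(set(column)))
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

if __name__ == "__main__":
    for pos, bits in enumerate(column_information(SITES), start=1):
        print(f"{pos:2d} {bits:.2f}")
    print(consensus_pattern(SITES))
```

Fully conserved columns score 2 bits, and the variable middle columns score much less, which matches the TTGA...TT shape of the motif described above.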


It seems that you're looking for a well-defined motif. In this case, you could just search using regular expressions, e.g. /[TA]TTGA.{5}TT[GA]/g, assuming the occurrences are non-overlapping. If they may overlap, you can probably use lookarounds like /(?=[TA]TTGA.{5}TT[GA])/g.
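As a rough sketch of the regular-expression approach in Python: the code below scans both strands with a lookahead so that overlapping occurrences are reported, and optionally treats the sequence as circular. It assumes the genome has already been read into a plain string of A/C/G/T, and it uses `[ACGT]{5}` instead of `.{5}` for the spacer, which is a choice made here and not part of the suggestion above. The names `find_sites` and `reverse_complement` are hypothetical helpers.

```python
import re

MOTIF_LEN = 13
# Lookahead with a capture group, so overlapping sites are all reported.
MOTIF = re.compile(r"(?=([TA]TTGA[ACGT]{5}TT[GA]))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def _scan(seq, circular):
    """Yield (start, match) for every motif occurrence in seq."""
    if circular:
        # Append the first bases so sites spanning the origin are found.
        seq = seq + seq[:MOTIF_LEN - 1]
    for m in MOTIF.finditer(seq):
        yield m.start(), m.group(1)

def find_sites(genome, circular=True):
    """Yield (strand, 0-based leftmost forward coordinate, matched site)."""
    genome = genome.upper()
    n = len(genome)
    for start, site in _scan(genome, circular):
        yield "+", start, site
    for start, site in _scan(reverse_complement(genome), circular):
        # Convert the reverse-strand position back to forward coordinates.
        yield "-", (n - start - MOTIF_LEN) % n, site
```

For example, `list(find_sites(genome_string))` returns the hits as a list; reading the FASTA file into `genome_string` is left out of this sketch.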


HMMs are probably a bit overkill for this. Since you already know the degenerate sequence, try a fuzzy pattern-matching tool like fuzznuc from the EMBOSS package.
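To illustrate what fuzzy matching of a degenerate pattern means, here is a small Python sketch. It is not fuzznuc and does not reproduce its interface; it only mimics the idea of matching an IUPAC pattern while tolerating a few mismatches. The pattern `WTTGANNNNNTTR` is the IUPAC spelling of (T/A)TTGANNNNNTT(G/A), and `fuzzy_find` is a hypothetical name. It scans the forward strand only and is a plain Python loop, so it will be slow on a multi-megabase genome.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def fuzzy_find(genome, pattern="WTTGANNNNNTTR", max_mismatches=1):
    """Return (start, window, mismatches) for each forward-strand hit.

    A window is a hit when at most max_mismatches positions fall
    outside the bases allowed by the IUPAC pattern.
    """
    allowed = [set(IUPAC[code]) for code in pattern.upper()]
    genome = genome.upper()
    width = len(allowed)
    hits = []
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        mismatches = 0
        for base, ok in zip(window, allowed):
            if base not in ok:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:
            # The inner loop finished without exceeding the mismatch limit.
            hits.append((start, window, mismatches))
    return hits
```

With `max_mismatches=0` this reduces to exact matching of the degenerate pattern.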
