How often should I expect to see the consensus sequence GGNGC, where N is any base, with fewer than 120 nucleotides separating it from the start of another occurrence of the same sequence? I really have no clue where to start. Should I take into account all four possible sequences obtained by replacing N with each nucleotide?
- Is this a homework question?
- What do you know? That is, do you know the actual 5-mer frequencies, or do you have to assume that all 5-mers are equally likely? This question alone should give you a hint on how to get started.
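
If the equal-frequency assumption from the comment above is acceptable, one way to get started is the rough sketch below. It is not a definitive answer: it assumes that bases are independent and each occurs with probability 0.25, that "fewer than 120 nucleotides" means the start-to-start distance between two occurrences, and that the slight dependence between overlapping matches can be ignored. All function names and the `window` and `max_gap` parameters are made up for illustration. Under these assumptions the N position contributes a factor of 1, which is the same as adding up the four specific sequences (4 × 1/1024 = 1/256).

```python
import re

# Assumption: every base is independent and equally likely.
P_BASE = 0.25


def motif_probability(motif="GGNGC"):
    """Chance that the motif starts at a given position; N matches any base."""
    p = 1.0
    for base in motif:
        if base != "N":  # N is a wildcard, so it contributes a factor of 1
            p *= P_BASE
    return p


def prob_another_within(window=119, motif="GGNGC"):
    """Approximate chance that at least one more motif starts at one of the
    next `window` positions (overlap effects between matches are ignored)."""
    p = motif_probability(motif)
    return 1.0 - (1.0 - p) ** window


def close_pairs(seq, max_gap=120, motif="GGNGC"):
    """Count consecutive motif occurrences in `seq` whose start positions
    are fewer than `max_gap` nucleotides apart."""
    # A lookahead is used so that overlapping occurrences are also found.
    pattern = re.compile("(?=" + motif.replace("N", "[ACGT]") + ")")
    starts = [m.start() for m in pattern.finditer(seq)]
    return sum(1 for a, b in zip(starts, starts[1:]) if b - a < max_gap)


if __name__ == "__main__":
    print(motif_probability())    # per-position probability of GGNGC
    print(prob_another_within())  # chance of a second hit within the window
    print(close_pairs("GGAGC" + "A" * 10 + "GGTGC"))
```

With these assumptions the per-position probability is 1/256, and the chance of a second occurrence starting within the next 119 positions comes out at roughly 0.37. If you know the real 5-mer frequencies for your genome, replace `motif_probability` with the observed frequency, or simply run `close_pairs` on the actual sequence and compare the count with the expectation.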