How Do I Search A Genome For A Known Motif, And Get An Interval File Of All Instances Of The Motif?
10.3 years ago
bede.portz ▴ 540

A paper recently identified a motif in Drosophila that is poorly conserved. What I would like to do is search the Drosophila genome for all instances of said motif in a way that allows for mismatches at particular positions, and generate an interval file with the start and end coordinates for all instances of the motif in the genome. In addition to knowing the start and end coordinates, I would like to know the DNA sequence associated with these coordinates, as the motif will often vary from the consensus.

To be clear, what I want to do is the opposite of motif discovery. I already know the motif, and would like to know all of its locations in the genome and the actual sequence at each location.

I suspect there are tools for this, but I have not yet done any motif analysis, so I would appreciate any help. I tried the search function, but most threads appear to pertain to motif discovery rather than my particular need.

Thanks

motif chip-seq • 9.4k views

Have you looked at the matchPWM() function from the R Biostrings package? It can likely do what you want.

10.3 years ago
vj ▴ 520

What you are looking for may be FIMO, from the MEME Suite. Its command-line documentation lists a number of options, although I do not know whether any of them lets you specify mismatches explicitly.

10.3 years ago
Chris Whelan ▴ 570

You could also use the fuzznuc tool from the EMBOSS suite for this:

http://emboss.sourceforge.net/apps/cvs/emboss/apps/fuzznuc.html
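
If you just want to see what a pattern-based search of this kind looks like, here is a minimal Python sketch (not fuzznuc itself). It assumes the motif is written as an IUPAC consensus, so positions where mismatches are tolerated are given as ambiguity codes such as N or Y. The names `find_motif` and `format_bed`, and the choice to put the matched sequence in the BED name column with a score of 0, are assumptions for illustration. Both strands are searched and overlapping sites are reported.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a DNA string, including IUPAC ambiguity codes."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile an IUPAC motif; the lookahead lets overlapping sites match."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(chrom, seq, motif):
    """Return BED-like rows (chrom, start, end, site, score, strand).

    Coordinates are 0-based, half-open. For minus-strand hits the site is
    reported as read on the minus strand. A palindromic motif is reported
    once per strand at the same coordinates.
    """
    seq = seq.upper()
    motif = motif.upper()
    width = len(motif)
    rows = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in motif_to_regex(pattern).finditer(seq):
            start = match.start()
            site = seq[start:start + width]
            if strand == "-":
                site = reverse_complement(site)
            rows.append((chrom, start, start + width, site, 0, strand))
    return sorted(rows, key=lambda row: (row[1], row[5]))


def format_bed(rows):
    """Render rows as tab-separated BED lines."""
    return "\n".join("\t".join(str(field) for field in row) for row in rows)


if __name__ == "__main__":
    # Toy sequence and motif; a real run would loop over chromosomes
    # read from a genome FASTA file.
    print(format_bed(find_motif("chr2L", "AATGACTTGGCATT", "TGNC")))
```

For a whole genome you would call `find_motif` once per chromosome and append the output to a single interval file; reading the FASTA is left out here to keep the sketch short.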

10.0 years ago
Ming Tommy Tang ★ 3.9k

Try MotifMap (http://motifmap.ics.uci.edu/), if your motif is in its database.
