Hi All,
I have one TF of interest whose motif is known (both the logo and the frequency matrix), and I want to identify all matching genomic regions. Is there a Perl, R or Python script someone could share? The Perl script works perfectly, so I suggest using it.
BLAT doesn't work because the sequence is too short.
FYI:
motif logo consensus (14 bp): TGGCACCATGCCAA
motif frequency matrix: `
A [ 0 0 0 0 14 2 2 7 4 0 0 0 16 14 ]
C [ 0 0 0 16 1 8 8 1 3 0 16 16 0 0 ]
G [ 0 16 16 0 0 5 4 5 2 16 0 0 0 1 ]
T [16 0 0 0 1 1 2 3 7 0 0 0 0 1 ]`
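
In case it helps, here is a minimal Python sketch of scanning a sequence with the matrix above. It is not a tested tool, just an illustration: the counts are converted to log-odds scores, and every 14 bp window on both strands is scored. The pseudocount (0.5), the uniform background (0.25 per base) and the cutoff (80% of the best possible score) are my own assumptions, and the function names are made up for this example.

```python
import math

# Count matrix copied from the post (rows A, C, G, T; 14 columns).
COUNTS = {
    "A": [0, 0, 0, 0, 14, 2, 2, 7, 4, 0, 0, 0, 16, 14],
    "C": [0, 0, 0, 16, 1, 8, 8, 1, 3, 0, 16, 16, 0, 0],
    "G": [0, 16, 16, 0, 0, 5, 4, 5, 2, 16, 0, 0, 0, 1],
    "T": [16, 0, 0, 0, 1, 1, 2, 3, 7, 0, 0, 0, 0, 1],
}

def log_odds(counts, pseudocount=0.5, background=0.25):
    """Turn counts into log2 odds per position (pseudocount/background are assumptions)."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        })
    return pwm

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def scan(seq, pwm, min_fraction=0.8):
    """Return (0-based start, strand, site, score) for windows scoring above the cutoff."""
    seq = seq.upper()
    width = len(pwm)
    max_score = sum(max(col.values()) for col in pwm)
    cutoff = min_fraction * max_score  # assumed threshold, tune for your data
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if set(window) - set("ACGT"):
            continue  # skip windows containing N or other ambiguity codes
        for strand, site in (("+", window), ("-", revcomp(window))):
            score = sum(col[base] for col, base in zip(pwm, site))
            if score >= cutoff:
                hits.append((start, strand, site, round(score, 2)))
    return hits

pwm = log_odds(COUNTS)
for hit in scan("AAAATGGCACCATGCCAATTTT", pwm):
    print(hit)
```

For a whole genome you would run `scan` on each chromosome read from the FASTA file; pure Python will be slow at that scale, so treat this as a way to check the logic rather than a production scanner.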
Options: i = ignore case, d = degenerate bases, p = pattern.
To search only the positive strand, use P. The motif has degenerate bases starting at the 5th position.
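
If you prefer to see what such a degenerate-pattern search does under the hood, here is a rough Python sketch, not the tool itself. It expands IUPAC codes into a regular expression and searches both strands. The pattern `TGGCNNNNNGCCAA` (positions 5 to 9 replaced by N) is only my assumption based on the matrix, and the function name and the `ignore_case` / `positive_only` parameters are hypothetical names chosen to mirror the options above.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

def find_sites(seq, pattern, ignore_case=True, positive_only=False):
    """Return sorted (1-based start, end, strand, matched text) tuples."""
    flags = re.IGNORECASE if ignore_case else 0
    pattern = pattern.upper()
    strands = [("+", pattern)]
    if not positive_only:
        # A site on the minus strand appears as the reverse-complemented
        # pattern when reading the plus strand.
        strands.append(("-", pattern.translate(COMPLEMENT)[::-1]))
    hits = []
    for strand, pat in strands:
        # The lookahead lets overlapping matches be reported as well.
        regex = re.compile("(?=(" + "".join(IUPAC[c] for c in pat) + "))", flags)
        for m in regex.finditer(seq):
            hits.append((m.start() + 1, m.start() + len(pat), strand, m.group(1)))
    return sorted(hits)

print(find_sites("ccTGGCACCATGCCAAgg", "TGGCNNNNNGCCAA"))
# prints [(3, 16, '+', 'TGGCACCATGCCAA')]
```

Note that a palindromic pattern would be reported once per strand at the same coordinates; pass `positive_only=True` if you only want plus-strand hits.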