TF motif binding region searching script in human genome
6.8 years ago
Shicheng Guo ★ 9.4k

Hi All,

I have a TF of interest whose motif is known (both the logo and the frequency matrix), and I want to identify all matching genomic regions. Is there a Perl, R, or Python script to share? You can find that the Perl script works perfectly; therefore, I suggest using a Perl script.

BLAT doesn't work because the sequence is too short.

FYI:

motif logo (14bp) : TGGCACCATGCCAA

motif frequency matrix:

A [ 0  0  0  0 14  2  2  7  4  0  0  0 16 14 ]
C [ 0  0  0 16  1  8  8  1  3  0 16 16  0  0 ]
G [ 0 16 16  0  0  5  4  5  2 16  0  0  0  1 ]
T [16  0  0  0  1  1  2  3  7  0  0  0  0  1 ]

Logo: [image of the motif logo]

motif logo • 1.5k views
seqkit locate -i -d -p TGGCACCATGCCAA <sequence.fa or sequence.fa.gz>

-i = ignore case, -d = pattern contains degenerate bases, -p = pattern

To search only the positive strand, add -P. From the 5th position onward, the motif has degenerate bases.
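To make the degenerate positions explicit, the count matrix from the question can be collapsed into an IUPAC pattern. The following is only a minimal Python sketch under stated assumptions: the 25% cutoff for keeping a base at a position is an arbitrary choice, and the function names (`degenerate_consensus`, `find_sites`) are hypothetical helpers, not part of seqkit.

```python
import re

# Count matrix copied from the question (14 positions, 16 sites).
COUNTS = {
    "A": [0, 0, 0, 0, 14, 2, 2, 7, 4, 0, 0, 0, 16, 14],
    "C": [0, 0, 0, 16, 1, 8, 8, 1, 3, 0, 16, 16, 0, 0],
    "G": [0, 16, 16, 0, 0, 5, 4, 5, 2, 16, 0, 0, 0, 1],
    "T": [16, 0, 0, 0, 1, 1, 2, 3, 7, 0, 0, 0, 0, 1],
}

# Standard IUPAC nucleotide codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G",
    frozenset("T"): "T", frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W", frozenset("GT"): "K",
    frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}
IUPAC_TO_BASES = {code: "".join(sorted(b)) for b, code in IUPAC.items()}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def degenerate_consensus(counts, min_fraction=0.25):
    """Keep every base whose share at a position is >= min_fraction (assumed cutoff)."""
    codes = []
    for i in range(len(counts["A"])):
        total = sum(counts[b][i] for b in "ACGT")
        kept = frozenset(b for b in "ACGT" if counts[b][i] / total >= min_fraction)
        codes.append(IUPAC[kept])
    return "".join(codes)


def to_regex(pattern):
    """Translate an IUPAC pattern into a regular expression."""
    parts = []
    for code in pattern:
        bases = IUPAC_TO_BASES[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_sites(seq, pattern):
    """Return (start, end, strand, match) tuples; 0-based, end-exclusive,
    coordinates always given on the forward strand."""
    seq = seq.upper()
    width = len(pattern)
    # The lookahead lets overlapping matches be reported.
    regex = re.compile("(?=(" + to_regex(pattern) + "))")
    hits = [(m.start(), m.start() + width, "+", m.group(1))
            for m in regex.finditer(seq)]
    revcomp = seq.translate(COMPLEMENT)[::-1]
    for m in regex.finditer(revcomp):
        start = len(seq) - (m.start() + width)
        hits.append((start, start + width, "-", m.group(1)))
    return hits


pattern = degenerate_consensus(COUNTS)
sites = find_sites("AAATGGCACCATGCCAATTT", pattern)
```

With the 25% cutoff this gives `TGGCASSRWGCCAA`. A pattern written with IUPAC codes like this is the kind of input the `-d` flag is meant for; a different cutoff would give a looser or stricter pattern.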

6.8 years ago
EagleEye 7.5k

Perl solution:

http://homer.ucsd.edu/homer/motif/index.html

Web/application solution:

http://meme-suite.org/tools/meme
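For readers who want to see what a matrix-based scan does before running these tools, here is a minimal Python sketch. It is not how HOMER or MEME are implemented. It assumes a uniform background (0.25 per base, which the human genome does not strictly follow), a pseudocount of 0.5, and a cutoff of 80% of the maximum possible score; all three values and the function names are illustrative choices.

```python
import math

# Count matrix copied from the question (14 positions, 16 sites).
COUNTS = {
    "A": [0, 0, 0, 0, 14, 2, 2, 7, 4, 0, 0, 0, 16, 14],
    "C": [0, 0, 0, 16, 1, 8, 8, 1, 3, 0, 16, 16, 0, 0],
    "G": [0, 16, 16, 0, 0, 5, 4, 5, 2, 16, 0, 0, 0, 1],
    "T": [16, 0, 0, 0, 1, 1, 2, 3, 7, 0, 0, 0, 0, 1],
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert counts to per-position log2 odds against a uniform background."""
    pwm = []
    for i in range(len(counts["A"])):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def score(pwm, site):
    """Sum the log-odds of each base of `site` at its position."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(seq, pwm, min_fraction=0.8):
    """Slide the matrix along both strands and keep windows scoring at least
    min_fraction of the best possible score (an assumed cutoff).
    Returns (start, end, strand, score); 0-based, end-exclusive."""
    seq = seq.upper()
    width = len(pwm)
    threshold = min_fraction * sum(max(col.values()) for col in pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if set(window) - set("ACGT"):
            continue  # skip windows containing N or other symbols
        revcomp = window.translate(COMPLEMENT)[::-1]
        for strand, site in (("+", window), ("-", revcomp)):
            s = score(pwm, site)
            if s >= threshold:
                hits.append((start, start + width, strand, round(s, 2)))
    return hits


pwm = log_odds_matrix(COUNTS)
hits = scan("AAATGGCACCATGCCAATTT", pwm)
```

To cover a genome, the same `scan` call would be applied to each chromosome sequence read from a FASTA file. A pure-Python loop like this is slow at that scale, which is one reason to prefer the dedicated tools.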


Yes. It is exactly what I need. Thanks.
