Question

Pseudo-peaks in upstream regions


7.2 years ago

boczniak767 ▴ 850

Hi, I'd like to create a BED file defining a random subsequence (100-400 bp long, different for each input sequence) within each of the 1 kb sequences in an existing BED file. I've of course looked at bedtools and searched the web, but haven't found anything useful.

bedtools random only generates fixed-length intervals, restricted by chromosome boundaries rather than by my input intervals.

Since I'd like to use the BED file to extract sequences from the genome, I've also considered extracting the 1 kb FASTA sequences (using my BED file) and trimming them. But I haven't found out how to do that pseudo-random trimming.
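A minimal Python sketch of the sub-interval idea described above, for illustration only. It assumes a tab-separated BED input with half-open coordinates; the function names `random_subinterval` and `subsample_bed` and the `seed` parameter are hypothetical and not part of bedtools. For each input interval it draws a length between 100 and 400 bp and a random offset, so the result always stays inside the original interval.

```python
import random


def random_subinterval(start, end, min_len=100, max_len=400, rng=random):
    """Return a random (sub_start, sub_end) inside the half-open interval [start, end)."""
    span = end - start
    if span < min_len:
        raise ValueError("interval is shorter than min_len")
    # randint is inclusive on both ends; cap the length at the interval size
    length = rng.randint(min_len, min(max_len, span))
    offset = rng.randint(0, span - length)
    sub_start = start + offset
    return sub_start, sub_start + length


def subsample_bed(lines, min_len=100, max_len=400, seed=None):
    """Map each BED line to a BED line describing a random sub-interval of it."""
    rng = random.Random(seed)  # a seed makes the pseudo-random draw reproducible
    out = []
    for line in lines:
        # skip blank lines and the usual BED header lines
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        s, e = random_subinterval(start, end, min_len, max_len, rng)
        # keep any extra columns (name, score, strand, ...) unchanged
        out.append("\t".join([chrom, str(s), str(e)] + fields[3:]))
    return out
```

The resulting lines could be written to a new BED file and used for sequence extraction in the same way as the original 1 kb intervals.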

BED sequence • 1.4k views



It seems that I've found a solution. Because I need these pseudo-peaks as a background for the analysis of peaks detected in real data, the easiest approach is to call peaks the same way on some random data. I'll try samtools merge and samtools view -s to create a random subsample of alignments from my BAM files, and then call peaks using my standard parameters.

Reply • 7.1 years ago by boczniak767 ▴ 850


I think this is the end of my monologue ;-) It turns out that using peaks called on random data is not efficient: I actually got fewer ranges than from the real data, even though I used a bigger file.

In the end I used bedtools shuffle with the -incl option, pointing it at the 2 kb upstream sequences of genes. As the -i input I used my peak file, multiplied, so that the peaks get randomized positions with regard to genes.
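To illustrate the idea behind shuffling within included regions, here is a rough Python sketch. It is not how bedtools implements `shuffle -incl`; it only shows the concept of moving each peak, with its length preserved, to a random position fully inside one of the allowed regions (for example the 2 kb upstream regions). The function name `shuffle_within` and the tuple layout are assumptions made for this example, and unlike bedtools it makes no attempt to keep peaks on their original chromosome or to avoid overlaps.

```python
import random


def shuffle_within(intervals, include, seed=None):
    """Relocate each (chrom, start, end) to a random spot inside an include region.

    intervals, include: lists of (chrom, start, end) tuples, half-open coordinates.
    """
    rng = random.Random(seed)
    placed = []
    for _chrom, start, end in intervals:
        length = end - start
        # only regions that can hold the whole interval are candidates
        candidates = [r for r in include if r[2] - r[1] >= length]
        if not candidates:
            raise ValueError("no include region can hold an interval of length %d" % length)
        # weight each region by its number of valid start positions,
        # so that every possible placement is equally likely
        weights = [r[2] - r[1] - length + 1 for r in candidates]
        r_chrom, r_start, r_end = rng.choices(candidates, weights=weights, k=1)[0]
        new_start = rng.randint(r_start, r_end - length)  # inclusive bounds
        placed.append((r_chrom, new_start, new_start + length))
    return placed
```

Passing the same peak list several times (the "multiplied" peak file) simply yields several independent random placements per original peak.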

Reply • 7.1 years ago by boczniak767 ▴ 850