bedtools closest (output file format)
8.1 years ago
biostart ▴ 370

Hello,

Is there a way to ask bedtools to return both regions in one line (not in two lines)? See below.

I just ran into a problem with bedtools closest. Here is the command:

bedtools closest -a RNA-seq.bed -b ChIP-seq-peaks.bed -d > output.bed

The RNA-seq file contains about 30 columns, starting with Chromosome, Start, End. The ChIP-seq-peaks file is in standard BED format. Both files are sorted.

The resulting file has two lines for each line of RNA-seq.bed: the closest peak is written on a separate line instead of being appended to the RNA-seq record. Is there a way to tell bedtools not to insert a line break?

Thank you!

8.1 years ago

Another option:

$ closest-features --nearest RNA-seq.bed ChIP-seq-peaks.bed > output.bed

Each reference element and its nearest feature are put onto one line.
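If you need plain tab-delimited columns downstream, here is a minimal sketch of flattening that output. It assumes the default behaviour described in the BEDOPS documentation, where the reference element and the nearest element are joined by a pipe character (`|`). The sample line is made up for illustration, and the approach only works if none of your own fields contain a pipe.

```shell
# Made-up line in the shape closest-features --nearest is expected to print:
# the reference element, a pipe, then the nearest element.
sample=$(printf 'chr1\t100\t200\tgeneA|chr1\t250\t300\tpeak1')

# Replace the pipe with a tab so every field becomes a tab-separated column.
flat=$(printf '%s\n' "$sample" | tr '|' '\t')

# Count the resulting tab-separated columns as a sanity check.
ncols=$(printf '%s\n' "$flat" | awk -F'\t' '{ print NF }')
echo "$ncols"
```

On a real run, the same `tr '|' '\t'` step could be appended to the closest-features pipeline before redirecting to output.bed.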

If you are not certain how the inputs were sorted, it may be worthwhile to re-sort them first, e.g.:

$ sort-bed < RNA-seq.unknown-sort.bed > RNA-seq.bed
$ sort-bed < ChIP-seq-peaks.unknown-sort.bed > ChIP-seq-peaks.bed
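If sort-bed is not installed, a rough stand-in is the generic `sort` invocation that the bedtools documentation suggests for BED input (chromosome lexicographically, then start position numerically). This is only a sketch: the file names and records below are hypothetical, and the ordering is not guaranteed to match sort-bed exactly in every case.

```shell
# Hypothetical unsorted three-column BED file for demonstration.
tmpdir=$(mktemp -d)
printf 'chr2\t50\t80\nchr1\t500\t600\nchr1\t100\t200\n' > "$tmpdir/peaks.unsorted.bed"

# Sort by chromosome name (byte order), then by start coordinate (numeric).
LC_ALL=C sort -k1,1 -k2,2n "$tmpdir/peaks.unsorted.bed" > "$tmpdir/peaks.sorted.bed"

# Inspect the start coordinate of the first record and the chromosome of the last.
first_start=$(head -n 1 "$tmpdir/peaks.sorted.bed" | cut -f 2)
last_chrom=$(tail -n 1 "$tmpdir/peaks.sorted.bed" | cut -f 1)
```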

Also make sure your inputs don't have unexpected line endings (for example, Windows-style carriage returns), e.g.:

$ cat -e RNA-seq.unknown-line-endings.bed | head
$ cat -e ChIP-seq-peaks.unknown-line-endings.bed | head

Each line should end with a single dollar sign ($). If you see anything else before it, such as ^M$, the file has non-Unix line endings; use tools like sed or dos2unix to preprocess or convert the files as needed.
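A stray carriage return at the end of each RNA-seq record is one plausible reason the appended peak appears to start on a new line, though I have not seen your files. Below is a minimal sketch of detecting and stripping carriage returns with `tr`, for cases where dos2unix is not available. The file names and contents are hypothetical.

```shell
# Hypothetical demo file with Windows (CRLF) line endings.
tmpdir=$(mktemp -d)
printf 'chr1\t100\t200\tgeneA\r\nchr1\t500\t600\tgeneB\r\n' > "$tmpdir/RNA-seq.crlf.bed"

# Count the lines that contain a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$tmpdir/RNA-seq.crlf.bed" || true)

# Delete every carriage return and write a cleaned copy.
tr -d '\r' < "$tmpdir/RNA-seq.crlf.bed" > "$tmpdir/RNA-seq.bed"

# Re-count on the cleaned copy; grep exits non-zero when nothing matches.
clean_lines=$(grep -c "$cr" "$tmpdir/RNA-seq.bed" || true)
echo "before: $crlf_lines, after: $clean_lines"
```

Run the cleaned file through sort-bed (or your usual sort step) afterwards, then repeat the closest-feature search.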
