bedtools closest (output file format)
20 months ago
biostart • 290


Is there a way to ask bedtools to return both regions in one line (not in two lines)? See below.

I just ran into a problem with bedtools closest. Here is the command:

bedtools closest -a RNA-seq.bed -b ChIP-seq-peaks.bed -d > output.bed

The RNA-seq file contains about 30 columns, starting with Chromosome, Start, End. The ChIP-seq-peaks file is in standard BED format. Both files are sorted.

The resulting file has two lines for each line of the input file RNA-seq.bed: the closest peak is added as a separate line. Is there a way to tell bedtools not to insert a line break?

Thank you!

18 months ago
Seattle, WA USA

Another option:

$ closest-features --nearest RNA-seq.bed ChIP-seq-peaks.bed > output.bed

Features are put onto one line.

Depending on the state of inputs, it may be worthwhile to sort, e.g.:

$ sort-bed < RNA-seq.unknown-sort.bed > RNA-seq.bed
$ sort-bed < ChIP-seq-peaks.unknown-sort.bed > ChIP-seq-peaks.bed
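
If it is unclear whether a file is already sorted, a quick check with standard tools may help before re-sorting. This is only a sketch: it assumes a plain coordinate sort (chromosome as text, then start and end as numbers), and the file names example.bed and example.sorted.bed and their contents are made up for illustration.

```shell
# Hypothetical toy file, for illustration only (tab-separated BED columns).
printf 'chr1\t500\t600\nchr1\t100\t200\nchr2\t50\t80\n' > example.bed

# sort -c only checks the order; it exits non-zero if the file is not sorted.
if LC_ALL=C sort -c -k1,1 -k2,2n -k3,3n example.bed 2>/dev/null; then
    echo "example.bed is sorted"
else
    echo "example.bed is NOT sorted"
    # Re-sort with standard tools (sort-bed, as used above, is an alternative).
    LC_ALL=C sort -k1,1 -k2,2n -k3,3n example.bed > example.sorted.bed
fi
```

Setting LC_ALL=C keeps the chromosome comparison byte-wise, so the result does not depend on the local language settings.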

Also make sure your inputs don't have any weird line endings, e.g.:

$ cat -e RNA-seq.unknown-line-endings.bed | head
$ cat -e ChIP-seq-peaks.unknown-line-endings.bed | head

If anything other than a bare dollar sign ($) appears at the end of each line, for example ^M$ (a Windows-style carriage return), use tools like sed or dos2unix to preprocess or convert the files as needed.
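
As a minimal sketch of that kind of cleanup, assuming the stray characters are Windows-style carriage returns (they show up as ^M in front of the $ in cat -e output), tr can delete them. The file names crlf-example.bed and clean-example.bed are placeholders, and the toy content is invented for illustration.

```shell
# Hypothetical toy file with Windows-style (CRLF) line endings.
printf 'chr1\t100\t200\r\nchr1\t500\t600\r\n' > crlf-example.bed

# Each line ends in ^M$ here, i.e. a carriage return before the newline.
cat -e crlf-example.bed

# Delete every carriage return and write a cleaned copy.
tr -d '\r' < crlf-example.bed > clean-example.bed

# Lines should now end in a bare $.
cat -e clean-example.bed
```

Writing to a new file rather than editing in place keeps the original around in case the cleanup needs to be redone.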


