ERROR: illegal character '.' when running bedtools closest command
3.3 years ago
Morty • 0

Hello everyone,

I ran into a problem while trying to find the closest TSS to each peak with this command:

bedtools closest -a file_peaks.narrowPeak -b path/genes.tss.bed  > file_closestTSS.txt

The error says: * ERROR: illegal character '.' found in integer conversion of string "3216969.". Exiting...

I generated the genes.tss.bed file from the genes.gtf file, which I found in the annotation of iGenomes mm10, using this command:

awk 'BEGIN {FS=OFS="\t"} { if($7=="+"){tss=$4-1} else { tss=$5} print $1,tss, tss+1 ".", ".", $7, $9}' path/genes.gtf > path/genes.tss.bed

Could anyone help me please? Thank you

ChIP-Seq gene • 1.9k views

It seems that a dot '.' is in the wrong place (where an integer is expected).

Show what your BED file looks like; that may make it clear where the stray dot is.
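
One quick way to locate such lines is to print every record whose start or end column is not a plain integer. This is only a minimal sketch, assuming a tab-delimited file; the name demo.bed and its two records are made up for illustration:

```shell
# Build a tiny tab-delimited demo file (hypothetical data); the second record has a stray '.'
printf 'chr1\t100\t101\t.\t-\n'  > demo.bed
printf 'chr1\t200\t201.\t.\t-\n' >> demo.bed

# Print records whose start ($2) or end ($3) is not made of digits only
bad=$(awk -F'\t' '$2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/' demo.bed)
printf '%s\n' "$bad"
```

Running the same awk filter on the real BED file and piping it to head should show the offending records, if there are any.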


I'm not sure, but don't you need a comma after `tss+1`?
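
To illustrate why the comma matters: awk treats two expressions written next to each other as string concatenation, so without the comma the number and the "." end up in the same field. A small sketch with a made-up value (tss=5 is just a placeholder):

```shell
# awk joins adjacent expressions that have no comma between them (string concatenation),
# so tss+1 "." becomes the single field "6." instead of two separate fields.
without_comma=$(awk 'BEGIN { OFS="\t"; tss=5; print tss, tss+1 ".", "." }')
with_comma=$(awk 'BEGIN { OFS="\t"; tss=5; print tss, tss+1, ".", "." }')
printf '%s\n%s\n' "$without_comma" "$with_comma"
```

The first line printed has three fields with "6." as the end coordinate; the second has four fields with a clean integer.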


What is the output of head path/genes.tss.bed?

head /home/s1469622/dstore/Reference_genomes/Mus_musculus/UCSC/mm10/Annotation/Archives/archive-2015-07-17-14-33-26/Genes/genes.tss.bed
chr1    3216968 3216969.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3216024 3216025.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3216968 3216969.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3421901 3421902.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3421901 3421902.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3671348 3671349.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3671498 3671499.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    3671348 3671349.        .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
chr1    4293012 4293013.        .       -       gene_id "Rp1"; gene_name "Rp1"; p_id "P17361"; transcript_id "NM_001195662"; tss_id "TSS6138";
chr1    4292983 4292984.        .       -       gene_id "Rp1"; gene_name "Rp1"; p_id "P17361"; transcript_id "NM_001195662"; tss_id "TSS6138";

Remove the '.' attached to the end coordinates.

Try the following:

awk 'BEGIN {FS=OFS="\t"} { if($7=="+"){tss=$4-1} else { tss=$5} print $1,tss, tss+1, ".", $7, $9}' path/genes.gtf > path/genes.tss.bed
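
If strand-aware options are needed later, note that BED expects the strand in column 6 (after name and score), whereas the command above writes it to column 5. Also, GTF coordinates are 1-based while BED intervals are 0-based and half-open, so for a '-' feature the last base $5 corresponds to the BED interval from $5-1 to $5. Below is a sketch of one possible variant that handles both, assuming the same tab-delimited GTF layout; demo.gtf and its two records are invented for illustration:

```shell
# Two made-up GTF records, one per strand (hypothetical data)
printf 'chr1\tsrc\texon\t1001\t2000\t.\t+\t.\tgene_id "A";\n'  > demo.gtf
printf 'chr1\tsrc\texon\t3001\t4000\t.\t-\t.\tgene_id "B";\n' >> demo.gtf

# TSS as a 1 bp interval: chrom, start, end, name, score, strand, then the attributes
tss_bed=$(awk 'BEGIN { FS = OFS = "\t" }
  /^#/ { next }                        # skip comment lines, if any
  { pos = ($7 == "+") ? $4 : $5        # 1-based position of the TSS
    print $1, pos - 1, pos, ".", ".", $7, $9 }' demo.gtf)
printf '%s\n' "$tss_bed"
```

For plain bedtools closest without strand options, the column-5 strand from the command above should not matter; the variant is only relevant if the strand column is used downstream.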


I get this error when I use bedtools afterwards:

Error: Sorted input specified, but the file /home/s1469622/dstore/Reference_genomes/Mus_musculus/UCSC/mm10/Annotation/Archives/archive-2015-07-17-14-33-26/Genes/genes.tss.bed has the following out of order record
chr1    3216024 3216025 .       -       gene_id "Xkr4"; gene_name "Xkr4"; p_id "P15391"; transcript_id "NM_001011874"; tss_id "TSS27105";
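
For reference, bedtools closest expects its inputs ordered by chromosome and then by start position. A quick way to test a file for that order is sort -c, which only checks the order and does not write sorted output. A sketch on made-up data (demo_unsorted.bed is a hypothetical file mirroring the record above):

```shell
# Made-up records: the second start is smaller than the first, as in the error above
printf 'chr1\t3216968\t3216969\t.\t-\n'  > demo_unsorted.bed
printf 'chr1\t3216024\t3216025\t.\t-\n' >> demo_unsorted.bed

# sort -c checks the order only; a non-zero exit status means the file is not sorted
if LC_ALL=C sort -c -k1,1 -k2,2n demo_unsorted.bed 2>/dev/null; then
  status="sorted"
else
  status="not sorted"
fi
echo "$status"
```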

You need to run bedtools sort on this.
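
As a rough sketch of what that could look like: bedtools sort -i orders records by chromosome and then by start by default, and the bedtools documentation gives sort -k1,1 -k2,2n as an equivalent for BED files. The runnable part below uses the plain sort form on made-up data so it works without bedtools installed; the demo file names and the "sorted" output names in the comments are assumptions, not names from the thread.

```shell
# Made-up, out-of-order records (hypothetical data)
printf 'chr1\t3216968\t3216969\t.\t-\n'  > demo.tss.bed
printf 'chr1\t3216024\t3216025\t.\t-\n' >> demo.tss.bed

# Sort by chromosome, then numerically by start
LC_ALL=C sort -k1,1 -k2,2n demo.tss.bed > demo.tss.sorted.bed
first_start=$(head -n 1 demo.tss.sorted.bed | cut -f2)
echo "$first_start"

# With bedtools and the input names from the question (the peaks file must be sorted too):
#   bedtools sort -i path/genes.tss.bed > path/genes.tss.sorted.bed
#   bedtools sort -i file_peaks.narrowPeak > file_peaks.sorted.narrowPeak
#   bedtools closest -a file_peaks.sorted.narrowPeak -b path/genes.tss.sorted.bed > file_closestTSS.txt
```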
