I am trying to convert a .txt file to a .bed file. The input looks like this:
CONTIG DNA_START DNA_END
NZ_AWYZ01001116.1 641 1875
AWZB01001762.1 1975 2386
NZ_AWZG01001811.1 646 18356
AWYT01002050.1 2311 17133
NZ_AWYV01001866.1 13969 14380
NZ_AWYN01001907.1 17959 18370
AWYR01002060.1 16722 17133
AWYX01002140.1 8145 8556
NZ_KK213231.1 7885 19297
I tried the following command:
awk -F"[:-]" 'BEGIN{ OFS="\t"; }{ print $1, $2, $3; }' xx.txt > xx.bed
But bedtools gives the following error:
Unexpected file format. Please use tab-delimited BED, GFF, or VCF. Perhaps you have non-integer starts or ends at line 1?
I checked the file with the following command:
cat -A AcrIF3.bed | head
CONTIG^IDNA_START^IDNA_END^I^I$
Maybe there are unnecessary characters in the file. How can I remove them? I also need to extract the sequences 5000 bp upstream of the start coordinate and 5000 bp downstream of the end coordinate, so I also tried the following code, but I get the same error as before.
awk '$2>$3 {print $1 "\t" $2 "\t" $3 "\t" $3-5001 "\t" $2+5000 "\tpep\t0\t-"}' xx.txt > xx.bed
Do you have any idea how I can convert it to a proper BED file? Cheers
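For reference, a minimal sketch of one way to get a valid three-column BED file, not necessarily what was suggested in this thread. The `-F"[:-]"` separator in the first command never matches anything in these lines, so awk treats each whole line as `$1` and prints two empty fields after it (the trailing `^I^I` seen in `cat -A`), and the header line is what bedtools trips over at line 1. The sketch assumes the input is whitespace- or tab-separated with a single header line; the sample rows written by `printf` and the output name `xx.flank5k.bed` are only illustrative.

```shell
# Illustrative sample input only; skip this line if you already have xx.txt
printf 'CONTIG\tDNA_START\tDNA_END\nNZ_AWYZ01001116.1\t641\t1875\nNZ_AWYV01001866.1\t13969\t14380\n' > xx.txt

# Drop the header (NR>1), split on any whitespace (awk default),
# and print exactly three tab-separated columns with no trailing tabs
awk 'BEGIN{OFS="\t"} NR>1 && NF>=3 {print $1, $2, $3}' xx.txt > xx.bed

# Extend each interval by 5000 bp on both sides;
# the start is clamped at 0 so it never goes negative
awk 'BEGIN{OFS="\t"} {s=$2-5000; if (s<0) s=0; print $1, s, $3+5000}' xx.bed > xx.flank5k.bed
```

The end coordinate is not clamped here because that needs the contig lengths. If a genome file with contig sizes is available, `bedtools slop -i xx.bed -g genome.txt -b 5000` handles both ends (`genome.txt` is a placeholder name). BED starts are 0-based, so if the coordinates in the .txt are 1-based, the start may need to be `$2-1`.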
Thanks. Your suggestion works.
If it solved your problem, please mark the answer as accepted.