How to convert my text file to a GTF, GFF3 or BED file
8.1 years ago
Franck8413 ▴ 20

Hi, I want to convert my text file to a GTF, GFF3 or BED file. My file is organized as shown below. It is a list of operons downloaded from RegulonDB, and I need to convert it into one of these formats in order to visualize the operons in IGV.

Column Names: 1) OPERON_ID 2) OPERON_NAME 3) FIRSTGENEPOSLEFT 4) LASTGENEPOSRIGHT 5) REGULATIONPOSLEFT 6) REGULATIONPOSRIGHT 7) OPERON_STRAND 8) KEY_ID_ORG

ECK120007527 ycaM 946452 947882 946159 947882 forward ECK12
ECK120014360 astCADBE 1823979 1830006 1823979 1830351 reverse ECK12
ECK120014361 pyrH 191855 192580 191718 192580 forward ECK12
ECK120014362 nrdHIEF 2798745 2802483 2798457 2802483 forward ECK12
ECK120014363 cpxP 4103843 4104343 4103749 4104402 forward ECK12

I do not know if there is a way to directly convert my file into one of these formats. If not, what command would extract the correct columns? Thank you.

RNA-Seq • 7.0k views

What is the delimiter used for the columns in your file (e.g. tab, space)? It is hard to decipher that from the post above.
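One way to answer this yourself is to count the fields per line when splitting on tabs only. The sketch below is only an illustration: `sample.txt` is a hypothetical file name, and the two rows from the question are written out with tabs purely as an assumption so the commands have something to run on.

```shell
# Two rows from the question, written here with tabs (an assumption;
# the real delimiter is exactly what we want to find out).
printf 'ECK120007527\tycaM\t946452\t947882\t946159\t947882\tforward\tECK12\n' > sample.txt
printf 'ECK120014360\tastCADBE\t1823979\t1830006\t1823979\t1830351\treverse\tECK12\n' >> sample.txt

# Count fields per line when splitting on tabs only.
# A single count of 8 on every line suggests a tab-delimited file;
# a count of 1 means the columns are separated by something else.
awk -F'\t' '{ print NF }' sample.txt | sort -n | uniq -c

# Show the raw characters of the first line: tabs appear as \t.
head -n 1 sample.txt | od -c | head -n 3
```

If the file turns out to be space-delimited, awk's default field splitting (no `-F` option) handles it as long as no column contains spaces.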

8.1 years ago
michael.ante ★ 3.8k

Hi bonardi.franck,

All the formats you mentioned are plain-text files with a formal definition; see the BED, GFF3, and GTF format descriptions.
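To make "formal definition" concrete, here is a rough sketch of how one operon row could be written as a GFF3 record. GFF3 has nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes) and uses 1-based, inclusive coordinates. The file names, the `RegulonDB` source label, the `operon` feature type and the tab-delimited input are assumptions made for illustration, as is the idea that the input positions are already 1-based.

```shell
# One sample row from the question, written with tabs (assumed delimiter).
printf 'ECK120007527\tycaM\t946452\t947882\t946159\t947882\tforward\tECK12\n' > operons.txt

# Build one GFF3 line per operon.
# Positions are copied unchanged because GFF3 is 1-based and inclusive
# (this assumes the input uses the same convention).
awk -F'\t' -v OFS='\t' '
  BEGIN { print "##gff-version 3" }
  {
    strand = ($7 == "forward") ? "+" : "-"
    # score and phase are not applicable here, so both are "."
    print $8, "RegulonDB", "operon", $3, $4, ".", strand, ".", "ID=" $1 ";Name=" $2
  }
' operons.txt > operons.gff3

cat operons.gff3
```

The first column has to match the sequence name of the genome loaded in IGV, so `$8` may need to be replaced by that name.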

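For your purpose, I think it would be best to parse your data into BED format. You can use FIRSTGENEPOSLEFT and LASTGENEPOSRIGHT to define your region (fields 2 and 3) and REGULATIONPOSLEFT and REGULATIONPOSRIGHT as thickStart and thickEnd, respectively. Note that BED expects the thick range to lie inside the region; in your sample rows the regulation range is the wider one, so you may need to swap the two pairs. For the name, you can choose the OPERON_ID, the OPERON_NAME or a mixture of both. The OPERON_STRAND needs to be converted into + or -. You still need the chromosome name (I assume KEY_ID_ORG serves as this here); it has to match the sequence name of the genome loaded in IGV.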

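You can use, for instance, awk or perl to reformat each line of your text file into the required structure. It looks something like this, where the 0 fills the score column that BED requires before the strand:

awk 'BEGIN{OFS="\t"} {s = ($7 == "forward") ? "+" : "-"; print $8, $3, $4, $1, 0, s, $5, $6}' mytext.txt > myoperons.bed

BED start positions are 0-based, so subtract 1 from the start columns if your positions are 1-based.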
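A slightly more defensive variant of the same idea is sketched below. It is not the only reasonable mapping: it takes the wider of the gene and regulation ranges as the BED region and the gene range as the thick part, so that thickStart and thickEnd always fall inside the region. The `chrom` variable is hypothetical, and the tab delimiter and 1-based input positions are assumptions.

```shell
# Two sample rows from the question, written with tabs (assumed delimiter).
printf 'ECK120007527\tycaM\t946452\t947882\t946159\t947882\tforward\tECK12\n' > mytext.txt
printf 'ECK120014360\tastCADBE\t1823979\t1830006\t1823979\t1830351\treverse\tECK12\n' >> mytext.txt

# chrom is a hypothetical variable: set it to the sequence name of the genome
# loaded in IGV. If it is left empty, column 8 (KEY_ID_ORG) is used instead.
awk -F'\t' -v OFS='\t' -v chrom="" '
  {
    # Region: the wider of the gene range ($3..$4) and regulation range ($5..$6).
    left   = ($5 < $3) ? $5 : $3
    right  = ($6 > $4) ? $6 : $4
    strand = ($7 == "forward") ? "+" : "-"
    seq    = (chrom != "") ? chrom : $8
    # BED starts are 0-based and ends are exclusive, so only the starts
    # are shifted by 1 (assumes 1-based, inclusive input positions).
    # Columns: chrom, start, end, name, score, strand, thickStart, thickEnd.
    print seq, left - 1, right, $2, 0, strand, $3 - 1, $4
  }
' mytext.txt > myoperons.bed

cat myoperons.bed
```

For the first sample row this gives a region of 946158 to 947882 with a thick part of 946451 to 947882 on the + strand. Using OPERON_NAME (`$2`) as the label is a readability choice; swap in `$1` or `$1 "_" $2` if the ID is needed.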

Cheers,

Michael
