Conversion from .bedGraph or .bw to a 7-column format
9.2 years ago
dec986 ▴ 370

Hello,

I have DNA methylation files that have a 4 column format and come in pairs for percentage (pct) and coverage (cvg), respectively:

con@e:~/Documents/Simmons/DNA_Methylation$ head -5 Merged_9966/bs_seeker-CG.pct.bedGraph
chr1    767    768    0.466666666666667
chr1    845    846    0.142857142857143
chr1    3235    3236    0
chr1    3303    3304    0.962962962962963
chr1    3929    3930    1
con@e:~/Documents/Simmons/DNA_Methylation$ head -5 Merged_9966/bs_seeker-CG.cvg.bedGraph
chr1    767    768    15
chr1    845    846    7
chr1    3235    3236    7
chr1    3303    3304    27
chr1    3929    3930    7

I wrote a perl script to convert this into a 7 column CpG methylation format output by programs like Bismark:

chrBase    chr    base    strand    coverage    freqC    freqT
chr1.17158    chr1    17158    F    15.0    50.0    50.0
chr1.17309    chr1    17309    F    15.0    81.8    18.2
chr1.17334    chr1    17334    F    15.0    78.3    21.7
chr1.178185    chr1    178185    R    15.0    69.4    30.6

However, my conversion contains an error that I have not been able to track down.

I have 2 questions:

  1. What is this format called?
  2. Is there a way I can convert the previous 2 files (.bedGraph) into this second format?

thanks,

-DEC

format-conversion DNA-methylation • 3.1k views
9.2 years ago
  1. That format has no name; it's a purely ad hoc format.
  2. The best solution would be a short Perl or Python script, though you could also use join and awk if you really wanted.

A better route would be to use a generic methylation extractor, like PileOMeth. It outputs in the modified bedGraph format that Bismark (among others) uses. That will give you the raw counts, so you don't have to deal with rounding back to the appropriate integer.
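For the script route in point 2, a minimal Python sketch might look like the following. It rests on several assumptions that are not confirmed anywhere in this thread: the two files are matched on (chromosome, start, end); `base` is taken to be the bedGraph end coordinate (i.e. the 1-based position); the pct column is a fraction between 0 and 1; and the strand is filled with a placeholder `F`, because a plain bedGraph carries no strand information. The function names are made up for illustration.

```python
import sys


def parse_bedgraph(lines):
    """Yield (chrom, start, end, value) tuples from 4-column bedGraph lines."""
    for line in lines:
        fields = line.split()
        # Skip blank lines and optional track/browser/comment header lines.
        if (len(fields) < 4 or fields[0] in ("track", "browser")
                or fields[0].startswith("#")):
            continue
        chrom, start, end, value = fields[:4]
        yield chrom, int(start), int(end), float(value)


def merge_bedgraphs(pct_lines, cvg_lines, strand="F"):
    """Join pct and cvg records on position and yield 7-column rows.

    Assumptions (not confirmed by the thread): `base` is the bedGraph end
    coordinate, pct is a fraction in [0, 1], and `strand` is a placeholder
    because bedGraph files do not record strand.
    """
    # Index coverage by position so the two files need not share an order.
    coverage = {(c, s, e): v for c, s, e, v in parse_bedgraph(cvg_lines)}
    for chrom, start, end, pct in parse_bedgraph(pct_lines):
        cvg = coverage.get((chrom, start, end))
        if cvg is None:
            continue  # position has no matching coverage record
        freq_c = round(pct * 100, 1)
        freq_t = round(100 - pct * 100, 1)
        yield (f"{chrom}.{end}", chrom, end, strand, cvg, freq_c, freq_t)


def write_table(rows, out):
    """Write the header and rows as tab-separated text."""
    out.write("chrBase\tchr\tbase\tstrand\tcoverage\tfreqC\tfreqT\n")
    for row in rows:
        out.write("\t".join(str(x) for x in row) + "\n")


if __name__ == "__main__":
    # Demo using the first two records shown in the question.
    pct = ["chr1\t767\t768\t0.466666666666667",
           "chr1\t845\t846\t0.142857142857143"]
    cvg = ["chr1\t767\t768\t15", "chr1\t845\t846\t7"]
    write_table(merge_bedgraphs(pct, cvg), sys.stdout)
```

With real data, open file handles can be passed in place of the in-memory lists, for example `merge_bedgraphs(open(pct_path), open(cvg_path))`. Holding the coverage file in a dictionary keeps the sketch simple but uses memory proportional to the number of sites; for two identically sorted files, reading them in lockstep would avoid that.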


Thank you very much, Devon!

After an exhaustive search, I think the rtracklayer package in R works best. It has a function called import.bedGraph which does exactly what I needed.

-DEC
