9.3 years ago
dec986
Hello,
I have DNA methylation files in a 4-column bedGraph format. They come in pairs: one file holds the methylation level (pct, stored as a fraction between 0 and 1) and the other holds the coverage (cvg):
con@e:~/Documents/Simmons/DNA_Methylation$ head -5 Merged_9966/bs_seeker-CG.pct.bedGraph
chr1 767 768 0.466666666666667
chr1 845 846 0.142857142857143
chr1 3235 3236 0
chr1 3303 3304 0.962962962962963
chr1 3929 3930 1
con@e:~/Documents/Simmons/DNA_Methylation$ head -5 Merged_9966/bs_seeker-CG.cvg.bedGraph
chr1 767 768 15
chr1 845 846 7
chr1 3235 3236 7
chr1 3303 3304 27
chr1 3929 3930 7
I wrote a Perl script to convert these into the 7-column CpG methylation format output by programs like Bismark:
chrBase chr base strand coverage freqC freqT
chr1.17158 chr1 17158 F 15.0 50.0 50.0
chr1.17309 chr1 17309 F 15.0 81.8 18.2
chr1.17334 chr1 17334 F 15.0 78.3 21.7
chr1.178185 chr1 178185 R 15.0 69.4 30.6
However, my conversion contains an error that I have not been able to track down.
I have 2 questions:
- What is this format called?
- Is there a way to convert the two .bedGraph files above into this second format?
Thanks,
-DEC
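For illustration, here is a minimal Python sketch of one way the two files could be merged into the 7-column layout. It is not the Perl script mentioned above, and it rests on several assumptions that are not confirmed anywhere in this thread: the pct values are fractions (as in the sample, so freqC = 100 × pct), the 1-based base position equals the bedGraph end coordinate, and the strand is unknown, so a placeholder "F" is written for every site. The function names (read_bedgraph, merge_pct_cvg, write_table) are hypothetical.

```python
import csv

HEADER = ["chrBase", "chr", "base", "strand", "coverage", "freqC", "freqT"]


def read_bedgraph(lines):
    """Return {(chrom, start, end): value} from 4-column bedGraph lines."""
    values = {}
    for line in lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) != 4 or line.startswith("#") or fields[0] in ("track", "browser"):
            continue
        chrom, start, end, value = fields
        values[(chrom, int(start), int(end))] = float(value)
    return values


def merge_pct_cvg(pct_lines, cvg_lines, strand="F"):
    """Join the pct and cvg records on (chrom, start, end).

    Sites present in only one of the two files are dropped.
    """
    pct = read_bedgraph(pct_lines)
    cvg = read_bedgraph(cvg_lines)
    rows = []
    for key, fraction in pct.items():  # dicts keep the input order
        if key not in cvg:
            continue
        chrom, _start, end = key
        base = end  # ASSUMPTION: bedGraph start is 0-based, so end is the 1-based position
        freq_c = 100.0 * fraction  # ASSUMPTION: pct is a fraction in [0, 1]
        rows.append((f"{chrom}.{base}", chrom, base, strand,  # strand is a placeholder
                     cvg[key], round(freq_c, 1), round(100.0 - freq_c, 1)))
    return rows


def write_table(rows, out_path):
    """Write the merged rows as a tab-separated file with a header line."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(HEADER)
        writer.writerows(rows)
```

With the files shown above, it could be called as write_table(merge_pct_cvg(open("bs_seeker-CG.pct.bedGraph"), open("bs_seeker-CG.cvg.bedGraph")), "merged.txt"), where merged.txt is an arbitrary output name. If the real strand of each CpG matters downstream, it would have to come from another source, since neither bedGraph file carries it.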
Thank you very much, Devon!
After an exhaustive search, I think the rtracklayer package in R works best. It has a function called import.bedGraph which does exactly what I needed.
-DEC