Mapping WGBS probes to CGs (TCGA bed files)
4.6 years ago
sands3 • 0

I am currently working with WGBS methylation data from TCGA. The bed files appear to have been generated with BisSNP and contain the chromosome name, start coordinate, end coordinate, methylation value in percent, coverage, strand, etc. According to the documentation of the RnBeads package, the coordinates are 0-based and span the first and the last coordinate of a site (i.e. end - start = 1 for a CpG). Sites on the negative strand are shifted by +1.

I tried to map the two coordinates per probe in the bed file to the genome, and they do not seem to match CGs. If I shift all coordinates by +1, however, most of them (but not all) do match CGs. The problem is that the strand column in my WGBS files contains no information, so I cannot shift coordinates depending on whether a site is on the positive or the negative strand.

Has anyone faced the same problem?

2.7 years ago
Tej Sowpati • 250
India

It depends on how you retrieve the corresponding sequence, i.e. whether the tool you use to query the genome expects 0-based or 1-based coordinates. In your case, though, it looks like shifting by +1 is the correct approach. Do all of the sites match a cytosine when you shift them by +1? I ask because bisulfite sequencing can also identify methylation in non-CpG contexts.
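One way to check which shift fits is to count, for each candidate offset, how many start coordinates land on a CG in the reference. The snippet below is only a minimal sketch: the function names, the toy sequence, and the toy start coordinates are made up for illustration, and it assumes the chromosome sequence is already loaded as a plain Python string indexed from 0.

```python
from collections import Counter


def dinucleotide_at(seq, start, offset=0):
    """Return the uppercase 2-mer at 0-based position start + offset ('' if out of range)."""
    pos = start + offset
    if pos < 0 or pos + 2 > len(seq):
        return ""
    return seq[pos:pos + 2].upper()


def tally_offsets(seq, starts, offsets=(-1, 0, 1)):
    """For each candidate offset, count how many sites land on a CG dinucleotide."""
    counts = Counter()
    for offset in offsets:
        counts[offset] = sum(dinucleotide_at(seq, s, offset) == "CG" for s in starts)
    return dict(counts)


# Toy chromosome and toy bed start coordinates (hypothetical, for illustration only).
toy_seq = "ATCGTTACGGACGT"
toy_starts = [1, 6, 10]

print(tally_offsets(toy_seq, toy_starts))
```

On real data, the offset with by far the highest count is the likely convention. Sites that still miss under that offset are worth inspecting separately, since they may be non-CpG cytosines or positions where the sample differs from the reference.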

4.6 years ago
sands3 • 0

Yes, it seems to me as well that shifting by +1 is the right approach. Only in a few isolated cases does the position match another nucleotide instead of a C (in these cases there are no Cs in the surrounding sequence, so shifting by +1 or not makes no difference).
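For the missing strand column, one possible workaround is to infer the strand from the reference base itself: a C followed by G suggests a plus-strand CpG cytosine, while a G preceded by C suggests the cytosine on the minus strand. This is a sketch under assumptions, not the BisSNP or RnBeads convention: it assumes each position is a single 0-based base coordinate, and the helper names and toy sequence are hypothetical.

```python
def infer_cpg_strand(seq, pos):
    """Guess the strand of a CpG cytosine from the reference base at 0-based pos.

    Returns '+' for a C followed by G, '-' for a G preceded by C
    (the cytosine then sits on the reverse strand), and None otherwise.
    """
    base = seq[pos].upper()
    if base == "C" and seq[pos + 1:pos + 2].upper() == "G":
        return "+"
    if base == "G" and pos > 0 and seq[pos - 1].upper() == "C":
        return "-"
    return None


def flank(seq, pos, width=3):
    """Return the reference bases within `width` of pos, for eyeballing mismatches."""
    return seq[max(0, pos - width):pos + width + 1].upper()


# Toy chromosome (hypothetical, for illustration only).
toy_seq = "ATCGTTACGGATTT"
for pos in (2, 8, 11):
    print(pos, infer_cpg_strand(toy_seq, pos), flank(toy_seq, pos))
```

Positions that return None are the isolated mismatches, and printing their flanking bases makes it easy to confirm that there is no C nearby.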
