A workflow for identification of differentially methylated regions, starting with a data frame of beta values?
6.7 years ago
c.ryder3 ▴ 40

Hello! I have a data frame in R that contains 450K methylation beta values for 6 samples. The probe IDs are the row names and the sample names are the column names. It looks like this:

> head(ICGC_2)
             naive.1   memoryCS.1   naive.2    memoryCS.2  naive.3    memoryCS.3
cg00000029  0.6199970   0.5703951  0.6383819   0.5831206  0.7012571  0.6000816
cg00000108  0.9083578   0.9105157  0.9030611   0.9103147  0.9115842  0.8947593
cg00000109  0.8694214   0.7525098  0.8478160   0.7725212  0.8645145  0.7636347
cg00000165  0.1911901   0.3050081  0.1810569   0.3750369  0.2250429  0.3094155
cg00000236  0.8666489   0.8382011  0.8586420   0.8369283  0.8860430  0.8439371
cg00000289  0.6653662   0.5512665  0.5815338   0.4773868  0.6254710  0.5408634

I would like to compare the naive samples to the memoryCS samples to identify genomic regions that are differentially methylated between the two groups. Can anyone suggest a workflow that will allow me to do this, with this data as the starting point? I'm aware of the DMRcate package, which includes the dmrcate function for identifying differentially methylated regions (DMRs), but this function requires an annotation object generated by cpg.annotate. cpg.annotate requires a matrix of M values, which I believe I can generate from my data frame, but it also requires a study design matrix, which I don't know how to generate. Can anyone offer me some guidance?

Thank you!

R bioconductor 450K methylation DMRcate • 2.2k views

minfi, ChAMP and RnBeads are some suggestions.

6.7 years ago
halo22 ▴ 300

You can try minfi (https://www.bioconductor.org/help/course-materials/2014/BioC2014/minfi_BioC2014.pdf); it is a good workflow. But even for DMRs, minfi would require you to define your design matrix. Honestly, I would advise spending some time studying the design matrix. It is essential because it encodes which samples belong to which group, and so guides the comparison (naive samples vs memoryCS) by fitting an appropriate model. Try the following, or spend time learning about limma: http://bioinf.wehi.edu.au/marray/ibc2004/lab3/lab3.html#EstrogenDesignMatrix
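To make the design matrix part concrete, here is a minimal sketch in base R. It is not a full DMR workflow. It uses only the first three probes from the question as toy data, reads the group labels off the column names, and treats the samples as unpaired, which is an assumption on my part (if naive.1 and memoryCS.1 come from the same donor, you would want a donor term in the model as well). The object names betas, mvals, group and design are just illustrative.

```r
# Toy beta values: the first three probes shown in the question.
betas <- data.frame(
  naive.1    = c(0.6199970, 0.9083578, 0.8694214),
  memoryCS.1 = c(0.5703951, 0.9105157, 0.7525098),
  naive.2    = c(0.6383819, 0.9030611, 0.8478160),
  memoryCS.2 = c(0.5831206, 0.9103147, 0.7725212),
  naive.3    = c(0.7012571, 0.9115842, 0.8645145),
  memoryCS.3 = c(0.6000816, 0.8947593, 0.7636347),
  row.names  = c("cg00000029", "cg00000108", "cg00000109")
)

# Beta -> M values: M = log2(beta / (1 - beta)).
# Betas of exactly 0 or 1 would give infinite M values, so real data
# may need to be nudged away from the boundaries first.
beta_mat <- as.matrix(betas)
mvals <- log2(beta_mat / (1 - beta_mat))

# Group labels taken from the column names by dropping the ".1", ".2", ...
# suffix. "naive" is listed first so that it becomes the reference level.
group <- factor(sub("\\.[0-9]+$", "", colnames(betas)),
                levels = c("naive", "memoryCS"))

# Design matrix for an unpaired two-group comparison (assumption).
# Column 1 is the intercept (naive baseline);
# column 2 is the memoryCS vs naive difference.
design <- model.matrix(~ group)
rownames(design) <- colnames(betas)

print(design)
```

With this layout the second column, groupmemoryCS, is the coefficient of interest. A matrix like mvals and a design like this are the kind of inputs that limma's lmFit and DMRcate's cpg.annotate expect, but check ?cpg.annotate in your installed version for the exact arguments.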
