Experimental Design in edgeR for DE
5.0 years ago
trettig • 0

Hello, I am struggling to figure out how to set up my comparisons for edgeR. We have 8 treatment groups, each run in 2 batches, for 16 group-by-batch combinations in total. Below is an example of my import data.

[Image: example of the import data]

The first letter represents AOS treatment (Y = received, N = not received), the second TT treatment, the third CpG treatment, and the final letter represents batch (A or B).

I set the treatment factors with the following code:

AOS <- factor(substring(colnames(edgereads), 1, 1))
TT <- factor(substring(colnames(edgereads), 2, 2))
CpG <- factor(substring(colnames(edgereads), 3, 3))
Batch <- factor(substring(colnames(edgereads), 4, 4))
Treat <- factor(substring(colnames(edgereads), 1, 4))

And set the design as such:

design <- model.matrix(~ Batch + AOS * TT * CpG)

and set fit using this:

fit <- glmQLFit(y, design, robust = TRUE)

While there are obviously a large number of comparisons that can be made here, we have a few that we are specifically interested in. The first is the collapsed comparison of No AOS vs AOS, ignoring the other factors, and the same again for the TT and CpG treatments. We also want to be able to compare specific groups, such as the NoAOS:NoTT:NoCpG group vs the AOS:TT:CpG treatment group. These comparisons would also need to take the batch effect into account.

My plan is to use glmTreat with a fold-change cutoff of 2.

Is there an easy way to go about making these comparisons? We have so many treatment groups that there is a lot to create, and thus plenty of room for a tiny error that leads to incorrect comparisons. Is the best method to use makeContrasts with the Treat factor I defined above? Is the batch properly factored in that way?

Thank you in advance!
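One commonly used way to set up comparisons like these is to collapse the three treatments into a single group factor, fit one coefficient per group plus an additive batch term, and then write each comparison as a contrast of group means. The following is only a minimal sketch under stated assumptions: the sample layout is a hypothetical one-sample-per-cell grid, `avg_contrast` is a made-up helper name, and the contrast vectors are built by hand in base R rather than with makeContrasts.

```r
# Hypothetical layout: every AOS/TT/CpG combination in both batches.
grid <- expand.grid(AOS = c("N", "Y"), TT = c("N", "Y"),
                    CpG = c("N", "Y"), Batch = c("A", "B"))

# One level per treatment combination, e.g. "NNN", "YNN", ..., "YYY".
Group <- factor(paste0(grid$AOS, grid$TT, grid$CpG))

# No intercept: eight group-mean columns, plus one additive batch column.
design <- model.matrix(~ 0 + Group + Batch, data = grid)
colnames(design) <- sub("^Group", "", colnames(design))

groups <- levels(Group)

# Hypothetical helper: average of one set of groups minus average of another.
avg_contrast <- function(plus, minus) {
  con <- setNames(numeric(ncol(design)), colnames(design))
  con[plus]  <-  1 / length(plus)
  con[minus] <- -1 / length(minus)
  con
}

# Collapsed AOS effect: mean of the four AOS = Y groups minus the four AOS = N groups.
aos_y <- groups[substring(groups, 1, 1) == "Y"]
aos_n <- groups[substring(groups, 1, 1) == "N"]
con_AOS <- avg_contrast(aos_y, aos_n)

# Specific comparison: all three treatments versus none.
con_YYY_vs_NNN <- avg_contrast("YYY", "NNN")

print(rbind(AOS = con_AOS, YYY_vs_NNN = con_YYY_vs_NNN))

# With a fitted edgeR model these vectors would be used along the lines of:
#   fit <- glmQLFit(y, design, robust = TRUE)
#   res <- glmTreat(fit, contrast = con_YYY_vs_NNN, lfc = log2(2))
```

The batch coefficient is zero in every contrast, yet batch is still adjusted for because it remains in the design. The TT and CpG collapsed contrasts follow the same pattern using the second and third letter of the group name.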

RNA-Seq R EdgeR • 1.1k views

I can only assume that the substring() calls are doing what you expect. Why are you adding multiplicative effects to your model? If at all possible, the model formula should be in its simplest form. It should be possible to combine some of your factors and use them that way.
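One way to confirm that the substring() parsing behaves as intended is to tabulate the resulting factors against each other. This is a sketch only: the `sample_names` vector is hypothetical and simply follows the four-letter naming scheme described in the question, and real column names may carry extra characters such as replicate suffixes.

```r
# Hypothetical sample names: AOS, TT, CpG (Y/N) followed by batch (A/B).
sample_names <- c("NNNA", "NNNB", "YNNA", "YNNB", "NYNA", "NYNB", "NNYA", "NNYB",
                  "YYNA", "YYNB", "YNYA", "YNYB", "NYYA", "NYYB", "YYYA", "YYYB")

# Explicit levels make "N" and "A" the reference levels regardless of sort order.
AOS   <- factor(substring(sample_names, 1, 1), levels = c("N", "Y"))
TT    <- factor(substring(sample_names, 2, 2), levels = c("N", "Y"))
CpG   <- factor(substring(sample_names, 3, 3), levels = c("N", "Y"))
Batch <- factor(substring(sample_names, 4, 4), levels = c("A", "B"))

# One row per sample, to compare by eye against the sample sheet.
targets <- data.frame(sample = sample_names, AOS, TT, CpG, Batch)
print(targets)

# Cross-tabulate to confirm every treatment combination occurs in each batch.
cell_counts <- table(Treatment = interaction(AOS, TT, CpG), Batch)
print(cell_counts)
```

An empty cell or an unexpected level in the table would point to a mislabelled column before any model is fitted.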


Yes, the substring is doing what I want. It breaks each factor into a Y (Yes) or N (No) regarding the treatment.

I am using the multiplicative effects because there are 3 different treatments and we follow a 2x2x2 design. Each animal either receives (Y) or doesn't receive (N) each treatment. My understanding is that if we want to be able to pull out the effect of each treatment individually or in combination, we need to use the multiplicative model. For example, we may want to compare the effects of AOS and TT, or the effects of CpG and TT. We also want to examine the effects of all three factors together (AOS+TT+CpG). I am thinking of it as a three-way ANOVA.
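It can help to see which coefficients the factorial formula actually produces and what each one measures. The sketch below assumes a hypothetical one-sample-per-cell layout and R's default treatment contrasts with N and A as reference levels. Subtracting two rows of the design matrix shows which coefficients separate the corresponding groups.

```r
# Hypothetical 2x2x2 layout with two batches, one sample per cell.
grid <- expand.grid(AOS = c("N", "Y"), TT = c("N", "Y"),
                    CpG = c("N", "Y"), Batch = c("A", "B"))

design <- model.matrix(~ Batch + AOS * TT * CpG, data = grid)
rownames(design) <- paste0(grid$AOS, grid$TT, grid$CpG, grid$Batch)

# Intercept, batch, 3 main effects, 3 two-way and 1 three-way interaction.
print(colnames(design))

# AOS alone versus no treatment: the rows differ in a single coefficient.
aos_only <- design["YNNA", ] - design["NNNA", ]
print(aos_only[aos_only != 0])

# All three treatments versus none: the rows differ in every treatment term.
all_vs_none <- design["YYYA", ] - design["NNNA", ]
print(all_vs_none[all_vs_none != 0])
```

In this parameterisation the AOSY coefficient is the AOS effect when TT and CpG are both absent, not the AOS effect collapsed over the other treatments. A row difference such as `all_vs_none` is itself a valid contrast vector for this design.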


Hey, indeed, the asterisk just results in an expansion of the model terms into the additive terms plus the interaction, i.e., A*B is the same as A + B + A:B. I have just not seen such a 'double' multiplicative model formula like the one that you are using (but I don't doubt that it is used). In my experience, model formulae should be as simple as possible. I note that, when I think and obsess more about my study, I naturally want to include more terms in my model and make it more complex, but such models rarely fit the data well. Take a look at a previous answer by Gordon Smyth (edgeR developer), where someone else had asked a question that included a formula like yours: https://stat.ethz.ch/pipermail/bioconductor/2013-May/052566.html
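The expansion of the asterisk can be checked directly in base R. In this small sketch the factor names A, B and C are placeholders rather than the factors from the question.

```r
# Placeholder two-level factors in a full factorial grid.
grid <- expand.grid(A = c("N", "Y"), B = c("N", "Y"), C = c("N", "Y"))

# A * B and A + B + A:B produce the same design matrix.
short <- model.matrix(~ A * B, data = grid)
long  <- model.matrix(~ A + B + A:B, data = grid)
print(identical(colnames(short), colnames(long)))
print(all.equal(short, long, check.attributes = FALSE))

# Terms generated by a three-way product: main effects plus all interactions.
three_way_terms <- attr(terms(~ A * B * C), "term.labels")
print(three_way_terms)
```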

Leaving batch in the design ensures that the batch effect is adjusted for in whatever tests you run, so it's fine to leave it there. The batch effect ought to be consistent across your samples, though. Batch effects can usually be spotted via a PCA bi-plot of the normalised, transformed counts. In edgeR, the transformation is typically log2-CPM with a small prior count added, e.g. cpm(y, log = TRUE).
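A rough illustration of the PCA check, using simulated data only. Everything here is an assumption made for the sketch: the counts are random draws with an artificial batch shift, and the log2-CPM line is a simplified base R stand-in for edgeR's cpm(y, log = TRUE), which also uses normalisation factors and scales the prior count by library size.

```r
set.seed(1)
n_genes <- 500
batch <- factor(rep(c("A", "B"), each = 8))

# Simulated raw counts with a 4-fold batch shift in the first 50 genes.
mu <- matrix(100, nrow = n_genes, ncol = 16)
mu[1:50, batch == "B"] <- 400
counts <- matrix(rnbinom(n_genes * 16, mu = mu, size = 10),
                 nrow = n_genes, ncol = 16)

# Simplified log2 counts-per-million with a prior count of 2.
lib_size <- colSums(counts)
prior <- 2
log_cpm <- log2(t((t(counts) + prior) / (lib_size + 2 * prior)) * 1e6)

# PCA on samples (rows must be samples, so transpose).
pca <- prcomp(t(log_cpm))

# Colour by batch: separation along a leading PC suggests a batch effect.
plot(pca$x[, 1], pca$x[, 2], col = as.integer(batch), pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(batch), col = 1:2, pch = 19)
```

With real data, the same plot would be coloured by treatment as well as by batch to see which factor dominates the leading components.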

Your input data does not look normalised - are you calculating normalisation factors via edgeR (e.g. with calcNormFactors)? Take a look at Section 1.4 of "edgeR: differential expression analysis of digital gene expression data" (makeContrasts is mentioned in section 3.2.3 of the same document).
