Question

DESeq2 model matrix with interaction term

0

Entering edit mode

8.0 years ago

george.bask ▴ 60

Hi all, I am trying to figure out the best way to make a model matrix for the following RNA-seq experiment using DESeq2. The experimental design is as follows:

28 samples from 14 patients with a certain disease.
2 timepoints baseline and follow up
Two groups, one group of 9 patients has been given a treatment, the other group of 5 patients has been left without treatment.

So I would like to see what is the differnece between the treated group and the untreated group in the follow up timepoint vs their baseline.

In other words I am looking for the difference of differences between the treated and the untreated group, how much difference cannot be explained by the general differences between follow up and beseline (which are not of biological importance). So my experiment is paired only within groups and we know that there is much variance between the different patients.

My colData is as follow and the reference levels are baseline and untreated for each respective factor.

colData_paired_all
   timepoint     group patient     
1   baseline   treated CTS_110  
2   baseline   treated CTS_118  
3   baseline   treated CTS_116  
4   baseline   treated CTS_115 
5   baseline   treated CTS_114  
6   baseline   treated CTS_122  
7   baseline   treated CTS_124  
8   baseline   treated CTS_123  
9   baseline   treated CTS_126  
10  baseline untreated CTS_103  
11  baseline untreated CTS_117  
12  baseline untreated  CTS_39  
14  baseline untreated  CTS_33  
15 follow_up   treated CTS_110 
16 follow_up   treated CTS_118 
17 follow_up   treated CTS_116 
18 follow_up   treated CTS_115  
19 follow_up   treated CTS_114  
20 follow_up   treated CTS_122  
21 follow_up   treated CTS_124  
22 follow_up   treated CTS_123  
23 follow_up   treated CTS_126 
24 follow_up untreated CTS_103  
25 follow_up untreated CTS_117 
26 follow_up untreated  CTS_39  
27 follow_up untreated  CTS_83  
28 follow_up untreated  CTS_33

I can use the design "~ timepoint + timepoint:group" which gives a full rank matrix with coefficients

resultsNames(dds)
[1] "Intercept"                       "timepoint_follow_up_vs_baseline"
[3] "timepointbaseline.grouptreated"  "timepointfollow_up.grouptreated"

So I can calculate, DE follow up vs baseline for group untreated:

res <- results(dds, name= "timepoint_follow_up_vs_baseline")

DE follow up vs baseline for group treated:

res_followup_treated <- results(dds, list(c("timepoint_follow_up_vs_baseline", "timepointfollow_up.grouptreated")))

only the differences for group treated which cannot be explained by the general follow up vs baseline differences (the interaction term)

res_treated_diff <- results(dds, name = "timepointfollow_up.grouptreated")

My problem is that if I include the patient variable in the design, like "~ patient + timepoint + timepoint:group" the matrix is not full rank. I can see that pairing is true only within groups, but I wonder is there any way to include the patient variable, in order to block patient's effect, in the formula? Do I have already included because groups are paired in the first design the gives the full rank matrix?

Thank you very much for your help!!!

RNA-Seq DESeq2 Model Design Interaction Term • 3.9k views

ADD COMMENT • link 8.0 years ago by george.bask ▴ 60