Cluster strategy for a time-series analysis in DESeq2
6.1 years ago
Cecelia ▴ 30

Hello,

I would like to do a time-series analysis in DESeq2. I have the following experimental design:

  • Treatment: Infected and Non-infected (Treat vs. Control)
  • Time: 1, 4, 12 hours
  • For each Treatment-Time point combination, we have imbalanced biological replicates (from 3 to 8).

I have already quantified the transcripts with Salmon, and now I would like some advice on the DESeq2 analysis. I followed the tutorial here: http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments to build the full design:

ddsTCpus14 <- DESeqDataSetFromTximport(txifilesevipus14, colData=colDatapus,
                              design = ~ Treatment + Time +  Treatment:Time)

and then ran the likelihood ratio test against a reduced design:

ddsTCpus14 <- DESeq(ddsTCpus14, test="LRT", reduced = ~ Time + Treatment)

The result names are:

resultsNames(ddsTCpus14)
[1] "Intercept"                  "Treatment_Treat_vs_Control"
[3] "Time_12h_vs_1h"             "Time_4h_vs_1h"             
[5] "TreatmentTreat.Time12h"     "TreatmentTreat.Time4h"

So my questions are:

Question 1

I am thinking of making lists of DE genes for my comparisons. First, comparing Treat vs. Control at each time point:

#1hour treat vs control
pusres1 <- results(ddsTCpus14, name="Treatment_Treat_vs_Control", test="Wald", alpha=0.05)

#4hour treat vs control
pusres4 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time4h" )), test="Wald", alpha=0.05)

#12hour treat vs control
pusres12 <- results(ddsTCpus14, contrast=list(c("Treatment_Treat_vs_Control","TreatmentTreat.Time12h" )), test="Wald", alpha=0.05)

Then comparing the treat:time interaction term between different time points:

#1h to 4h
pusresinter14 <- results(ddsTCpus14, name="TreatmentTreat.Time4h", test="Wald", alpha=0.05)

#1h to 12h
pusresinter112 <- results(ddsTCpus14, name="TreatmentTreat.Time12h", test="Wald", alpha=0.05)

My question is: Is there a way to test the interaction term between 4h and 12h? I read through this post but still could not figure it out.

https://support.bioconductor.org/p/65676/

Question 2

Assume I have the DEG lists for all the comparisons. Does it make sense to first filter each DEG list by adjusted p-value and fold change, and then combine all the lists into one big DEG list? I would then do a clustering (using hclust or another clustering approach) based on the big list. All the downstream GO enrichment tests would be based on the clusters.

Question 3

I am not sure whether the imbalanced numbers of replicates would influence the clustering. Are there any suggestions?

Some of these ideas probably make no sense to you specialists, but any suggestion would be very much appreciated!

Thanks in advance!

Cecelia

time-course RNA-Seq DEseq2 hclust • 3.1k views
6.1 years ago

For question 2: there is no problem in doing that - that is a standard procedure.
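
As a rough illustration of that procedure, here is a minimal base R sketch. Everything in it is a stand-in: the results tables and the expression matrix are simulated, and the names res_list, pick_deg, the cut-offs (padj < 0.05, |log2FC| >= 1) and k = 4 are assumptions for the example, not values taken from your data. With real data, the tables would come from as.data.frame(results(...)) and the matrix from a transformed count matrix.

```r
set.seed(42)
n_genes <- 1000
genes <- paste0("gene", seq_len(n_genes))

# Simulated stand-ins for the per-comparison results tables
make_res <- function() {
  data.frame(log2FoldChange = rnorm(n_genes, sd = 2),
             padj = runif(n_genes),
             row.names = genes)
}
res_list <- list(t1h = make_res(), t4h = make_res(), t12h = make_res())

# Keep genes that pass both the adjusted p-value and fold-change cut-offs
pick_deg <- function(res, padj_cut = 0.05, lfc_cut = 1) {
  keep <- !is.na(res$padj) & res$padj < padj_cut &
    abs(res$log2FoldChange) >= lfc_cut
  rownames(res)[keep]
}

# Union of the filtered lists: the "one big DEG list"
deg_union <- Reduce(union, lapply(res_list, pick_deg))

# Simulated expression matrix (genes x samples), standing in for
# variance-stabilised or rlog-transformed counts
expr <- matrix(rnorm(n_genes * 12), nrow = n_genes,
               dimnames = list(genes, paste0("s", 1:12)))

# Row-wise z-scores so that clustering reflects expression patterns, not levels
z <- t(scale(t(expr[deg_union, , drop = FALSE])))

hc <- hclust(dist(z), method = "average")
clusters <- cutree(hc, k = 4)
table(clusters)
```

The gene names in each cluster (names(clusters)[clusters == i]) can then be passed to the enrichment tool of your choice.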

For question 3: imbalanced replicate numbers will affect the statistical inferences from your sample data, which will in turn indirectly influence the clustering. For one, statistical power depends on the number of replicates, so the same p-value or adjusted p-value cut-off will not have the same 'meaning' in a balanced comparison as in an imbalanced one.

Now, question 1: I cannot be entirely sure
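
That said, one approach that may be worth trying, offered as a sketch and not as a confirmed answer: both interaction coefficients are expressed relative to 1h, so their difference should correspond to the change in the treatment effect between 4h and 12h. In DESeq2, results() accepts contrast as a list of two character vectors (numerator and denominator), which can express that difference. The example below runs on a simulated dataset from makeExampleDESeqDataSet with balanced replicates; the object names dds and res4v12 are placeholders, and with your data you would use ddsTCpus14 and the coefficient names from your own resultsNames() output.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 500 genes x 18 samples (3 replicates per group, an assumption)
dds <- makeExampleDESeqDataSet(n = 500, m = 18)
dds$Treatment <- factor(rep(c("Control", "Treat"), each = 9),
                        levels = c("Control", "Treat"))
dds$Time <- factor(rep(rep(c("1h", "4h", "12h"), each = 3), times = 2),
                   levels = c("1h", "4h", "12h"))

# Same full and reduced designs as in the question
design(dds) <- ~ Treatment + Time + Treatment:Time
dds <- DESeq(dds, test = "LRT", reduced = ~ Treatment + Time)

# Check that the interaction coefficients have the expected names
stopifnot(all(c("TreatmentTreat.Time12h", "TreatmentTreat.Time4h") %in%
                resultsNames(dds)))

# Wald test of (12h interaction) minus (4h interaction), i.e. whether the
# treatment effect differs between 12h and 4h
res4v12 <- results(dds,
                   contrast = list("TreatmentTreat.Time12h",
                                   "TreatmentTreat.Time4h"),
                   test = "Wald", alpha = 0.05)
summary(res4v12)
```

Passing test="Wald" matters here because the object was fitted with the LRT; without it, the reported p-values would be the LRT p-values for the interaction as a whole, not for this specific contrast.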


Thanks a lot for your reply. Really helpful!
