Question

basic statistics books to understand DESeq2

3

5.0 years ago

Bioinfonext ▴ 460

Hi,

I come from a biological background. Could you please recommend some basic biostatistics books that would help me understand the DESeq2 tutorial?

Kind Regards Bioinfonext

R • 2.6k views

ADD COMMENT • link updated 5.0 years ago by mmfansler ▴ 450 • written 5.0 years ago by Bioinfonext ▴ 460

0

Have you read the paper and the vignette?

ADD REPLY • link 5.0 years ago by WouterDeCoster 47k

1

Yes, but I need to understand the basic statistical terms used in DESeq2, such as:

1) I am always confused about how to write the design formula for one, two, or three factors.

2) When exactly should I use interaction terms, and when should I combine all factors into a single group?

3) What is a beta prior?

4) What are shrunken log fold changes? I only know the plain log fold change.

5) Dispersion is generally an estimate of variance, but what is shrinkage?

Kind Regards Bioinfonext

ADD REPLY • link 5.0 years ago by Bioinfonext ▴ 460
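
Questions 1 and 2 above concern design formulas. As a rough illustration that is not taken from the thread, the base R sketch below shows how R expands a one-factor, a two-factor, an interaction, and a combined-group formula into model matrix columns. The sample table, the factor names (genotype, treatment), and their levels are hypothetical. DESeq2 takes formulas of this kind in its design argument, but no DESeq2 function is called here.

```r
# Hypothetical sample table: 2 genotypes x 2 treatments, 2 replicates each.
samples <- data.frame(
  genotype  = factor(rep(c("WT", "KO"), each = 4), levels = c("WT", "KO")),
  treatment = factor(rep(rep(c("ctrl", "drug"), each = 2), times = 2),
                     levels = c("ctrl", "drug"))
)

# One factor: an intercept plus one coefficient (drug vs ctrl).
one_factor <- model.matrix(~ treatment, samples)

# Two factors, additive: the drug effect is assumed equal in both genotypes.
two_factor <- model.matrix(~ genotype + treatment, samples)

# Interaction: one extra coefficient lets the drug effect differ by genotype.
with_interaction <- model.matrix(~ genotype + treatment + genotype:treatment,
                                 samples)

# Alternative: merge both factors into a single grouping factor.
samples$group <- factor(paste(samples$genotype, samples$treatment, sep = "_"))
grouped <- model.matrix(~ group, samples)

# Inspect which coefficients each design would estimate.
print(colnames(one_factor))
print(colnames(two_factor))
print(colnames(with_interaction))
print(colnames(grouped))
```

The interaction design and the grouped design have the same number of columns and the same rank, so they describe the same fit with different parameterizations. Grouping tends to make pairwise comparisons between specific combinations easier to request, while the explicit interaction term directly answers whether the treatment effect differs between genotypes.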

0

Try this one.

Beginner’s guide to using the DESeq2 package

Michael Love, Simon Anders, Wolfgang Huber

https://bioc.ism.ac.jp/packages/2.14/bioc/vignettes/DESeq2/inst/doc/beginner.pdf

I hope this is not a duplicate answer.

ADD REPLY • link 5.0 years ago by natasha.sernova ★ 4.0k

0

Thank you very much.

ADD REPLY • link 5.0 years ago by Bioinfonext ▴ 460

0

If that is not enough, search for your questions one at a time on Google.

For example, I searched Google for your question 5.

Below is the top answer.

The larger a dispersion value, the larger the difference in expression has to be in order for a gene to be called DE. As the number of replicates for each condition increases, the amount of dispersion shrinkage per gene decreases as we are then able to estimate the dispersion parameter from the data without shrinkage.

DESeq2 Dispersion Shrinkage - more samples is better?

https://support.bioconductor.org/p/100764/

ADD REPLY • link 5.0 years ago by natasha.sernova ★ 4.0k
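
The quoted statement, that shrinkage weakens as replicates are added, can be illustrated with a toy calculation. This is a deliberately simplified sketch and not DESeq2's actual estimator: DESeq2 computes a maximum a posteriori dispersion under a log-normal prior centered on the fitted mean-dispersion trend, with the prior width estimated from the data. Here the shrunken value is just a weighted average on the log scale, and the function name, the prior_df weight, and all numbers are hypothetical.

```r
# Toy shrinkage: pull a gene-wise dispersion toward the trend value.
# ASSUMPTION: a fixed prior weight (prior_df) stands in for the prior
# that DESeq2 would estimate from the data.
shrink_dispersion <- function(genewise, trend, n_samples,
                              n_coefs = 2, prior_df = 10) {
  residual_df  <- n_samples - n_coefs             # information in the data
  weight_prior <- prior_df / (prior_df + residual_df)
  exp(weight_prior * log(trend) + (1 - weight_prior) * log(genewise))
}

genewise <- 0.40   # hypothetical noisy per-gene estimate
trend    <- 0.10   # hypothetical trend value for genes of similar mean count

n_samples <- c(4, 6, 12, 24, 48)
shrunk <- shrink_dispersion(genewise, trend, n_samples)
names(shrunk) <- paste0("n=", n_samples)

# With few samples the estimate sits near the trend; with many samples it
# moves back toward the gene's own estimate.
print(round(shrunk, 3))
```

The design choice behind shrinkage of this kind is to borrow information across genes when each gene alone has too few replicates to give a stable dispersion estimate.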

score 4 · Answer 1 · 2019-04-25

4

5.0 years ago

mmfansler ▴ 450

Modern Statistics for Modern Biology by Holmes and Huber covers DESeq2 in Chapter 8 and tries to provide most of the statistical background to get there with the earlier chapters. It does assume some basic facility with R.

There is currently an online reading group, which meets weekly to present and discuss the chapters and exercises. Chapter 8 is scheduled for discussion on May 29th, 2019.

ADD COMMENT • link 5.0 years ago by mmfansler ▴ 450

0

Thank you all very much. There are a few things I still do not understand:

1) Which normalization method does DESeq use by default at this step: vst or rlog?

dds <- DESeq(dds)

2) What are the next steps after getting the results table?

> res.root.soil= results(diagdds, contrast = c("Tissue", "Root", "Soil"), alpha = 0.1)

> summary(res.root.soil)



out of 30438 with nonzero total read count

adjusted p-value < 0.1

LFC > 0 (up)     : 10, 0.033% 

LFC < 0 (down)   : 29584, 97% 

outliers [1]     : 0, 0% 

low counts [2]   : 5664, 19% 

(mean count < 0)

[1] see 'cooksCutoff' argument of ?results

[2] see 'independentFiltering' argument of ?results

3) What is the purpose of this step, and when should I perform it? Is this coef the same as the contrast I used in the results step?

res <- lfcShrink(dds, coef="condition_trt_vs_untrt", type="apeglm")

Kind Regards

ADD REPLY • link 5.0 years ago by Bioinfonext ▴ 460
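
On question 1: DESeq() itself applies neither vst nor rlog. It models the raw counts and accounts for sequencing depth with per-sample size factors (the median-of-ratios method); vst and rlog are separate transformations intended for visualization and clustering. The base R sketch below reproduces the median-of-ratios idea on a tiny made-up count matrix. The function name and the counts are hypothetical, and DESeq2's own implementation should be used for real data.

```r
# Median-of-ratios size factors, written out in base R for illustration.
size_factors_median_ratio <- function(counts) {
  # Log geometric mean per gene; -Inf for any gene with a zero count.
  log_geo_means <- rowMeans(log(counts))
  usable <- is.finite(log_geo_means)   # drop genes containing a zero
  apply(counts, 2, function(sample_counts) {
    # Median ratio of this sample to the per-gene geometric mean.
    exp(median(log(sample_counts[usable]) - log_geo_means[usable]))
  })
}

# Made-up counts: s2 is sequenced twice as deep as s1, s3 half as deep.
counts <- cbind(
  s1 = c(100, 20, 300, 0),
  s2 = c(200, 40, 600, 5),
  s3 = c(50, 10, 150, 2)
)
rownames(counts) <- paste0("gene", 1:4)

sf <- size_factors_median_ratio(counts)
normalized <- sweep(counts, 2, sf, "/")   # divide each column by its factor

print(sf)
print(normalized)
```

Dividing by the size factors puts the first three genes on the same scale in every sample, which is the sense in which counts are "normalized" inside the model without any variance-stabilizing transformation.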