Question

What are median and quantile normalization?

1

6.0 years ago

pyKey ▴ 70

Hello everyone,

Normally I use TPM for within-sample analysis. Recently it was suggested that I use median and quantile normalization, which are between-sample methods. I noticed that the DESeq and limma packages offer these methods. But what are they actually doing? What is the intuition behind them?

Thank you all,

RNA-Seq Normalization • 9.0k views

ADD COMMENT • link 6.0 years ago by pyKey ▴ 70

0

Right! So more explanation:

I have a bunch of RNA-Seq experiments and I am performing some simple gene expression comparisons between two conditions (wildtype vs. mutants). Some conditions have at most two replicates. I already TPM normalized all the samples, but for comparisons, another between-sample normalization step seems like a good idea.

So far your explanations are of great help. Thank you all!

ADD REPLY • link 6.0 years ago by pyKey ▴ 70

0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

ADD REPLY • link 6.0 years ago by GenoMax 141k

0

"I already TPM normalized all the samples"

If you are performing differential expression analysis with DESeq2 or limma, don't transform the data beforehand. DESeq2 expects raw counts. For RNA-Seq with limma, the voom transformation must likewise be applied to the raw counts. To repeat: start with raw counts, not TPM, for both packages (and for edgeR, for that matter).

ADD REPLY • link 6.0 years ago by h.mon 35k

score 1 · Answer 1 · 2018-04-26

Your question is poorly explained: what downstream analyses do you intend to perform? Are you moving from within-sample comparisons to differential expression analysis?

I believe DESeq2 performs neither quantile nor median normalization; only limma does.

About limma's between-array normalization: quantile normalization makes the distribution of microarray intensity signals the same across all arrays being analysed. Median normalization (method="scale") gives all samples the same median.
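To make the intuition concrete, here is a minimal pure-Python sketch of both ideas. It is not limma's implementation: the function names `quantile_normalize` and `median_scale` are made up for illustration, ties are broken by position rather than averaged, and the median scaling assumes strictly positive values on the original (non-log) scale.

```python
from statistics import mean, median


def quantile_normalize(samples):
    """Give every sample the same distribution of values.

    `samples` is a list of samples; each sample is a list of values
    for the same features in the same order. Hypothetical helper.
    """
    n_features = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: the mean across samples at each rank.
    reference = [mean(s[r] for s in sorted_samples) for r in range(n_features)]

    normalized = []
    for s in samples:
        # Feature indices ordered from smallest to largest value.
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            # The feature keeps its rank but takes the reference value.
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def median_scale(samples):
    """Rescale each sample so that all samples share the same median.

    Assumes positive values (medians must be nonzero). Hypothetical helper.
    """
    medians = [median(s) for s in samples]
    target = mean(medians)  # common median after scaling
    return [[v * target / m for v in s] for s, m in zip(samples, medians)]


if __name__ == "__main__":
    arrays = [[5.0, 2.0, 3.0, 4.0], [40.0, 10.0, 30.0, 20.0], [3.0, 4.0, 6.0, 8.0]]
    print(quantile_normalize(arrays))
    print(median_scale(arrays))
```

After quantile normalization every sample contains exactly the same set of values, only assigned to different features according to their ranks. Median scaling is much gentler: it only rescales each sample and leaves the shape of its distribution untouched.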

DESeq2 and edgeR normalize for library size. Each package uses a different method (median-of-ratios size factors in DESeq2, TMM in edgeR), but the idea is to make the sequencing depth of all samples "the same". DESeq2 also offers transformations (rlog and vst) for exploratory analyses and visualization, but these are not used for differential expression analysis.
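For library-size normalization, DESeq2's default is the median-of-ratios method. The sketch below illustrates that idea in plain Python on raw counts. It is a simplified illustration, not DESeq2's code, and the names `size_factors` and `normalize_counts` are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    `counts` is a list of samples; each sample is a list of raw counts
    for the same genes in the same order. Hypothetical helper.
    """
    n_genes = len(counts[0])
    # Keep only genes with a nonzero count in every sample, so logs are defined.
    # If no gene qualifies, median() below raises StatisticsError.
    usable = [g for g in range(n_genes) if all(s[g] > 0 for s in counts)]
    # Per-gene log geometric mean across samples: a pseudo-reference sample.
    log_ref = {
        g: sum(math.log(s[g]) for s in counts) / len(counts) for g in usable
    }
    factors = []
    for s in counts:
        # Log ratio of this sample to the pseudo-reference, gene by gene.
        log_ratios = [math.log(s[g]) - log_ref[g] for g in usable]
        # The median is robust to a minority of strongly changed genes.
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize_counts(counts):
    """Divide each sample's counts by its size factor."""
    factors = size_factors(counts)
    return [[c / f for c in s] for s, f in zip(counts, factors)]


if __name__ == "__main__":
    # Toy data: the second sample was sequenced about twice as deeply.
    raw = [[10, 20, 30, 0, 50], [20, 40, 60, 5, 100]]
    print(size_factors(raw))
    print(normalize_counts(raw))
```

In practice you would not divide the counts yourself before running DESeq2: the package estimates size factors from the raw counts and uses them inside its model, which is why it wants raw counts as input.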

Some resources:

http://genomicsclass.github.io/book/pages/normalization.html

https://www.reddit.com/r/bioinformatics/comments/14eae2/can_someone_explain_median_normalization_to_me/

https://stats.stackexchange.com/questions/10744/how-does-quantile-normalization-work