Normalisation before edgeR for RNA-Seq
3.1 years ago
ZheFrench ▴ 570

I am mostly a DESeq2 user and switched to edgeR recently. I received scripts from another developer. With DESeq2 I used to feed in the raw counts directly. Here the author pre-normalises the counts first (see the snippet below). Is that OK?

Isn't this double normalisation? I think edgeR normalises the counts internally, right? So I was wondering whether I should remove this part of the code before calling edgeR. What do you think?

Roughly :

  ###### Useless section ? ######
  q <- apply(counts,2,function(x) quantile(x[x>0],prob=0.75))
  ncounts <- sweep(counts,2,q/median(q),"/")
  ################################# Should I just use counts ?

  dge <- DGEList(ncounts,genes=rownames(ncounts))
  design <- model.matrix(~0+dge$group) # no intercept: forces the model through the origin, one coefficient per group
  colnames(design) <- gsub("dge$group","",colnames(design),fixed=TRUE) # fixed=TRUE so "$" is matched literally, not as a regex anchor
  cm <- makeContrasts(contrasts=comp,levels=dge$group)

  y     <- estimateDisp(dge,design,robust=TRUE)
  fit.y <- glmFit(y,design)
  lrt   <- glmLRT(fit.y,contrast=cm)
rna-seq edgeR
3.1 years ago
ATpoint 82k

The edgeR user's guide instructs you to use raw counts: https://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf

Normalization only happens if you call calcNormFactors. Otherwise only the library size is accounted for, i.e. a plain per-million scaling, which does not correct for library composition. If in doubt, I would stick strictly to the manual. The custom code your colleague added on top should probably be dropped. The manual has a "quick start" section you can follow for a simple analysis. Be sure to use calcNormFactors, and do not feed in pre-normalized counts.
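
As a rough sketch of that quick-start workflow on raw counts: the simulated count matrix, the group labels A and B, and the contrast name BvsA are all made up for illustration and are not taken from the original scripts. calcNormFactors uses TMM by default. If upper-quartile scaling is what your colleague wanted, method="upperquartile" is an option inside edgeR, so there should be no need to rescale the counts by hand.

```r
library(edgeR)

set.seed(1)
# Hypothetical raw counts: 1000 genes x 6 samples (simulated, illustration only)
counts <- matrix(rnbinom(6000, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:6)))
group <- factor(rep(c("A", "B"), each = 3))   # assumed two-group design

# Build the DGEList from RAW counts, not pre-normalised ones
dge <- DGEList(counts = counts, group = group,
               genes = data.frame(gene = rownames(counts)))

# Drop lowly expressed genes, then recompute library sizes
keep <- filterByExpr(dge)
dge  <- dge[keep, , keep.lib.sizes = FALSE]

# Composition normalisation (TMM by default);
# calcNormFactors(dge, method = "upperquartile") is the built-in alternative
dge <- calcNormFactors(dge)

# No-intercept design: one coefficient per group
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
cm <- makeContrasts(BvsA = B - A, levels = design)

y   <- estimateDisp(dge, design, robust = TRUE)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, contrast = cm)
topTags(lrt)
```

The normalisation factors end up in dge$samples$norm.factors and are used as part of the model offsets. The counts themselves are never modified.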
