Deleted:Design and contrast matrices issues for including batch effect in RNA-seq analysis
0
0
Entering edit mode
3.1 years ago
kris275 ▴ 10

Hello everyone,

I am working on a RNA-seq project that includes samples from normal tissue, adenomas, adenocarcinomas, liver metastases and was done in five batches. As I have previously read, it is best if batch effect is included in the design matrix for linear model fitting- shown in limma guide and (http://bioconductor.org/packages/release/workflows/vignettes/RNAseq123/inst/doc/limmaWorkflow.html). However, in the latter article, when they write contrasts, the lanes are not included, not accounted for if i understand correctly. A similar story is shown in limma guide chapter 17.1.5, where the DEGs are for the final coefficient.

I performed the analysis in two different ways, the first as such:

 `group2<- factor(c("N","A","N","A","N","A","N","A","N","A",
                 "C","N","C","N","C","N","C","N","C","N","C","N","C","N","C","N","C","N","C","N",
                 "C","L","N","C","L","N","C","L","N","C","C","C",        "N","L","N","C","L","N","C","L","N","C","L","N","C","L","N","L","N","L","N","L","N","C","L","N","C","L","N","C","L","N","L","N","C","L","N","C","L","N","C","L","N","L","N","C","N","C","L", 
  "N","A","N","A","N","A","N","A","N","A","N","A","N","A","N","A"),levels = c("N","A","C","L"))

batch2 <- factor(c(1,1,1,1,1,1,1,1,1,1,
                 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
                 3,3,3,3,3,3,3,3,3,3,3,3,
                 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
                 5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5))

design <- model.matrix(~0+group2+batch2) 

cont.matrix <- makeContrasts(NvsA=N-A, 
                             NvsC=N-C,
                             NvsL=N-L,
                             AvsC=A-C,
                             AvsL=A-L,
                             CvsL=C-L,
                             levels=colnames(design))`

The contrast matrix is then

Contrasts
Levels NvsA NvsC NvsL AvsC AvsL CvsL
    N     1    1    1    0    0    0
    A    -1    0    0    1    1    0
    C     0   -1    0   -1    0    1
    L     0    0   -1    0   -1   -1
    L2    0    0    0    0    0    0
    L3    0    0    0    0    0    0
    L4    0    0    0    0    0    0
    L5    0    0    0    0    0    0

and the results include DEGs for NvsA NvsC NvsL AvsC AvsL CvsL. If I understand correctly, this does not account for batch effect and is similar to what the rnaseq-123 article from the link above did?

The other option is similar to the answer to this question https://support.bioconductor.org/p/69374/. I used an interaction type design and averaged the cases from which batch each sample came from and only included the interactions that had a sample inside (otherwise linear model did not work as coefficients were not calculable):

designy1 <- model.matrix(~0+group2:batch2)
colnames(designy1)
colnames(designy1) <- c("N1","A1","C1","L1","N2","A2","C2","L2","N3","A3","C3","L3","N4","A4","C4","L4","N5","A5","C5","L5")
designy2 <- designy1[,c(1:2,5,7,9,11:13,15:18)]

The contrast matrix then looked like this

contr.matrixy <- makeContrasts(NC= (N1+N2+N3+N4+N5)/5-(C2+C3+C4)/3, 
                               nVSa= (N1+N2+N3+N4+N5)/5-(A1+A5)/2, 
                               AVSC= (A1+A5)/2-(C2+C3+C4)/3,
                               NVL= (N1+N2+N3+N4+N5)/5-(L3+L4)/2,
                               CVL= (C2+C3+C4)/3-(L3+L4)/2,
                               AVL=(A1+A5)/2-(L3+L4)/2,
                               levels=colnames(designy2))

I then as standard used voom normalisation, linear model fitting, etc to get to DEGs.

vy <- voom(countsbindlist4I,design = designy2, plot=TRUE)
fity <- lmFit(vy, designy2)
fit2y <- contrasts.fit(fity, contr.matrixy)
efity <- eBayes(fit2y)
dt<- decideTests(efity)
summary(dt)

Is any of the solutions I used okay, or did I miss the mark with the second one? Thank you all for your help and insights.

Best regards,

RNA-Seq batch effect R design contrasts • 560 views
ADD COMMENT
This thread is not open. No new answers may be added
Traffic: 2026 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6