ComBat-Seq

Question

How do you choose which biological variables to preserve when adjusting for batch, if you want to keep as much biological signal as possible?

21 months ago

ev97 ▴ 20

For RNA-seq data, I have come across two tools for adjusting the data for batch or other unwanted variables:

ComBat-Seq (sva package)
limma (removeBatchEffect function)

Functions:

ComBat-Seq

ComBat_seq(counts, batch, group = NULL, covar_mod = NULL, full_mod = TRUE, shrink = FALSE, shrink.disp = FALSE, gene.subset.n = NULL)

Where:

group: Vector / factor for biological condition of interest
covar_mod: Model matrix for multiple covariates to include in linear model (signals from these variables are kept in data after adjustment)

limma

removeBatchEffect(x, batch=NULL, batch2=NULL, covariates=NULL, design=matrix(1,ncol(x),1), ...)

Where:

design: design matrix relating to treatment conditions to be preserved, usually the design matrix with all experimental factors other than the batch effects.

Both functions take an argument for the biological variables or conditions whose signal you want to preserve after the adjustment: group and covar_mod in ComBat_seq, and design in removeBatchEffect.
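To make the two calls concrete, here is a minimal sketch on simulated data. Everything about the data is made up for illustration (the counts matrix, the batch, group and sex factors, and the choice to preserve group and sex); only the function names and arguments come from the signatures quoted above. The log2(counts + 1) transform is just a simple stand-in for whatever log-expression values you would normally pass to removeBatchEffect.

```r
# Hypothetical toy data: 200 genes x 12 samples, 2 batches, 2 conditions, plus sex.
set.seed(1)
counts <- matrix(rnbinom(200 * 12, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:12)))
batch <- factor(rep(c("b1", "b2"), each = 6))
group <- factor(rep(c("ctrl", "trt"), times = 6))
sex   <- factor(rep(c("F", "F", "M", "M"), times = 3))

# Design matrix holding the biological variables we want to preserve.
design <- model.matrix(~ group + sex)

# ComBat-Seq: condition of interest goes in `group`, extra covariates in `covar_mod`.
if (requireNamespace("sva", quietly = TRUE)) {
  adjusted_counts <- sva::ComBat_seq(counts, batch = batch, group = group,
                                     covar_mod = model.matrix(~ sex))
}

# limma: everything to be preserved goes in `design`; input is log-expression.
if (requireNamespace("limma", quietly = TRUE)) {
  log_expr <- log2(counts + 1)  # assumption: simple stand-in for a proper logCPM
  adjusted_log <- limma::removeBatchEffect(log_expr, batch = batch, design = design)
}
```

In both calls the variables listed are the ones protected from the adjustment, so which variables go in is exactly the decision the question is about.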

If you have a lot of biological variables, most people's instinct would be to preserve as much biological signal as possible. However, that is not really possible: if you preserve everything, you end up not adjusting for anything.
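One narrow, checkable version of this concern is collinearity: a preserved variable can only be separated from batch if it is not a linear combination of the batch indicators. The sketch below uses base R only to test whether the preserved design plus the batch columns is full rank. The helper name batch_is_separable and the site variable are hypothetical, introduced only for illustration, and passing this check does not mean a variable is a good choice, only that it is not fully confounded with batch.

```r
# Hypothetical helper: TRUE if batch columns are linearly independent
# of the design matrix of variables we want to preserve.
batch_is_separable <- function(design, batch) {
  batch_cols <- model.matrix(~ batch)[, -1, drop = FALSE]  # drop intercept
  combined <- cbind(design, batch_cols)
  qr(combined)$rank == ncol(combined)
}

batch <- factor(rep(c("b1", "b2"), each = 4))
group <- factor(rep(c("ctrl", "trt"), times = 4))    # balanced across batches
site  <- factor(rep(c("siteA", "siteB"), each = 4))  # same split as batch

# Preserving only group: batch can still be estimated.
ok_balanced <- batch_is_separable(model.matrix(~ group), batch)

# Preserving group and site: site duplicates batch, so nothing is left to remove.
ok_confounded <- batch_is_separable(model.matrix(~ group + site), batch)

print(c(balanced = ok_balanced, confounded = ok_confounded))
```

Here the first check passes and the second fails, because site carries exactly the same information as batch in this toy layout.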

Question: In your experience, how do you usually decide which variables to include?

I would really appreciate any feedback.

Thanks very much in advance

rnaseq combat-seq batch removeBatchEffect limma • 463 views