I have merged two RNA-seq datasets from two different batches.
The problem is that sequencing batch is now perfectly collinear with cell type, so we cannot correct for batch effects when computing residuals. Not adjusting for any covariates risks inducing spurious DE signals, whereas adjusting the two datasets separately means that almost all genes are bound to appear DE, because the residuals from each adjustment have a different baseline.
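To illustrate what I mean by collinear, here is a minimal sketch in Python with NumPy. The six samples, the labels "A"/"B", and the variable names are made up for illustration and are not our real data. It only shows that a design matrix with both a cell-type and a batch column loses rank, so the two effects cannot be estimated separately.

```python
import numpy as np

# Hypothetical annotation: all cell type A samples come from batch 1,
# all cell type B samples come from batch 2 (perfect confounding).
cell_type = np.array(["A", "A", "A", "B", "B", "B"])
batch = np.array([1, 1, 1, 2, 2, 2])

# Treatment-coded design matrix: intercept, cell type B indicator, batch 2 indicator.
intercept = np.ones(len(cell_type))
cell_type_B = (cell_type == "B").astype(float)
batch_2 = (batch == 2).astype(float)
design = np.column_stack([intercept, cell_type_B, batch_2])

# The cell type and batch columns are identical, so the rank is
# lower than the number of columns.
rank = int(np.linalg.matrix_rank(design))
n_coefficients = design.shape[1]
is_estimable = rank == n_coefficients

print(f"coefficients: {n_coefficients}, rank: {rank}, estimable: {is_estimable}")
```

With this layout the rank is 2 for 3 coefficients, so any model that includes both terms has no unique solution for the batch and cell-type effects.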
We tried ComBat and SVA, two methods commonly used for more robust batch correction, but neither was developed to deal with this problem.
If you have any advice, please let me know.