Is it correct to normalize batch-corrected data?
8 months ago
star ▴ 350

I have gathered single-cell RNA-seq data from several studies and applied the scGen method to address batch effects among them. I chose this method because it gave improved clustering and provides a tabular output.

Before calculating expression levels (the average expression of each gene across cells), I have the following questions:

  1. Is it appropriate to normalize the data with NormalizeData() after batch correction? I believe it is needed, since batch correction removes unwanted technical variation, not library-size differences.
  2. The scGen method yields negative values (likely because it requires log-normalized data to train the model). Should I be concerned about these negative values, or are they inconsequential given that my goal is to obtain a normalized count table?

I really appreciate any help you can provide!

batch-correction single-cell RUV scGen nomalization • 843 views
8 months ago
bk11 ★ 2.4k

For your first question, see what Sean Davis wrote earlier in the thread linked below.

In Which Order Use Normalization And Batch Effects Removal?

nice find - but, feels bad man. miss sean davis being here.

8 months ago
ATpoint 82k

These integration methods return values that should not be used for anything other than clustering and dimensionality reduction. It is not a per-gene batch correction but a per-cell one, so these methods will happily change the magnitude and sign of expression values to embed the cell properly into the corrected space. That means the per-gene values have essentially no meaning: you cannot compare them individually between cells or datasets.
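To make the sign problem concrete, here is a minimal sketch in plain Python. It assumes NormalizeData() refers to Seurat's default LogNormalize, i.e. log1p(value / cell total * 10000). The helper name `log_normalize` and all numbers are made up for illustration; they are not scGen output.

```python
import math

def log_normalize(cell, scale_factor=10_000):
    """Library-size normalization followed by log1p for a single cell.

    Assumed to mirror Seurat's default LogNormalize; `cell` is a list
    with one value per gene.
    """
    total = sum(cell)
    return [math.log1p(v / total * scale_factor) for v in cell]

# Raw counts: non-negative, and their sum is the library size.
raw_counts = [3, 0, 7, 10]
# Hypothetical "corrected" values, including a negative entry.
corrected = [0.8, -1.5, 0.2, 0.1]

print(log_normalize(raw_counts))  # well defined for raw counts

try:
    log_normalize(corrected)
except ValueError as err:
    # The "library size" sums to a negative number here, so the scaled
    # values fall outside the domain of log1p.
    print("corrected values:", err)
```

The sum of corrected values is not a library size, so dividing by it does not remove any sequencing-depth effect, and with mixed signs the log transform is not even defined.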

One reference, which to my knowledge applies to all these integration methods, is http://bioconductor.org/books/release/OSCA.multisample/using-corrected-values.html

I do not know the particular method you used, so my answer might not apply in that particular case, but it probably does. So no, you should not additionally normalize these values, nor use them to compare expression.
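One alternative, in the spirit of the OSCA chapter linked above, is to take only the cluster labels from the corrected space and compute average expression from the original counts, keeping the batches separate. A rough sketch in plain Python follows; the function names, group labels, and toy counts are all hypothetical, and the normalization is assumed to be log1p(count / cell total * 10000).

```python
import math

def log_normalize(cell, scale_factor=10_000):
    """Library-size normalization followed by log1p for one cell of raw counts."""
    total = sum(cell)
    return [math.log1p(v / total * scale_factor) for v in cell]

def mean_expression_by_group(counts, groups):
    """Average log-normalized expression per gene within each group.

    counts: list of cells, each a list of raw counts (one per gene)
    groups: one label per cell, e.g. a (batch, cluster) tuple
    """
    sums, sizes = {}, {}
    for cell, group in zip(counts, groups):
        values = log_normalize(cell)
        previous = sums.get(group, [0.0] * len(values))
        sums[group] = [s + v for s, v in zip(previous, values)]
        sizes[group] = sizes.get(group, 0) + 1
    return {g: [s / sizes[g] for s in total] for g, total in sums.items()}

# Toy raw counts: 4 cells x 3 genes (made-up numbers).
counts = [[1, 0, 1], [2, 0, 2], [0, 5, 0], [0, 3, 1]]
# Cluster labels are assumed to come from clustering the corrected
# embedding; the batch label is the study of origin.
groups = [("study1", "c0"), ("study2", "c0"), ("study1", "c1"), ("study2", "c1")]

for group, means in mean_expression_by_group(counts, groups).items():
    print(group, [round(m, 3) for m in means])
```

Averaging within each batch and cluster keeps every value on an interpretable scale; any comparison across studies would then need to treat batch as a factor rather than rely on corrected values.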
