How to calculate average expression of each gene from single cell RNA-Seq data
4.9 years ago
amir_v • 0

Hi,

I have a 10X 3' scRNA-Seq dataset with two samples, and I want to calculate the average expression of each gene from it. There are some challenges in calculating the average expression, and I'm not sure I've done it correctly.

I started with some QC and removed outlier cells: cells with a high fraction of mitochondrial reads, and cells that do not express any genes.

Then, for each gene in the gene-barcode matrix, I calculated the average expression, but I'm not sure this is the right way to do it. I sum the counts for the gene across all cells and then divide by the number of cells with non-zero counts for that specific gene.
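To make the choice of denominator concrete, here is a minimal sketch in Python with NumPy. The count matrix and the function names are made up for illustration and are not taken from any particular package. It contrasts the mean over all cells with the mean over only the cells that express the gene, which is the calculation described above.

```python
import numpy as np

# Toy gene-by-cell count matrix (rows = genes, columns = cells); values are made up.
counts = np.array([
    [0, 0, 4, 0],   # gene A: detected in 1 of 4 cells
    [2, 2, 2, 2],   # gene B: detected in every cell
    [0, 0, 0, 0],   # gene C: never detected
])

def mean_over_all_cells(counts):
    """Sum of counts per gene divided by the total number of cells."""
    return counts.mean(axis=1)

def mean_over_expressing_cells(counts):
    """Sum of counts per gene divided by the number of cells with a non-zero count."""
    totals = counts.sum(axis=1).astype(float)
    n_expressing = (counts > 0).sum(axis=1)
    out = np.zeros_like(totals)
    # Genes detected in no cell keep a mean of 0 instead of dividing by zero.
    np.divide(totals, n_expressing, out=out, where=n_expressing > 0)
    return out

mean_all = mean_over_all_cells(counts)
mean_expr = mean_over_expressing_cells(counts)
```

For gene A the two definitions disagree (1 count per cell versus 4 counts per expressing cell), while for gene B they agree. Dividing by the number of expressing cells ignores how many cells detect the gene at all, so the two numbers answer different questions.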

I appreciate any comments in advance.

Thanks, Amir

RNA-Seq scRNA-Seq single cell RNA-Seq • 5.6k views

Thanks for your comment. I need to measure the average expression for two different conditions, which are not directly comparable at the scRNA-Seq scale. Therefore, I need to run a DEG analysis to identify the genes enriched between the two populations.

I'm not sure imputation will help for this purpose, since it rescales the expression, and values that were zero before imputation are replaced with very small numbers that will not affect my average expression.


Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. SUBMIT ANSWER is for new answers to original question.

This comment should go under @geek_y's answer below.


That's why I asked what the goal is. If you are comparing two conditions, you have to do differential expression and make a violin plot of the gene of interest.
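As a rough illustration of comparing two conditions, the sketch below normalises each cell by its library size and computes a per-gene log2 fold change of the mean normalised expression. This is only the effect-size part of a differential expression analysis; a real analysis would also need a statistical test and multiple-testing correction. The function name, the scale factor of 10,000 and the pseudocount of 1 are assumptions chosen for the example.

```python
import numpy as np

def log2_fold_change(counts, condition, group_a, group_b, scale=1e4, pseudocount=1.0):
    """Per-gene log2 fold change (group_b over group_a) of mean normalised expression.

    counts: genes x cells matrix of raw counts.
    condition: one label per cell.
    Assumes every cell has at least one count (empty cells removed during QC).
    """
    counts = np.asarray(counts, dtype=float)
    condition = np.asarray(condition)
    library_size = counts.sum(axis=0)            # total counts per cell
    normalised = counts / library_size * scale   # counts per `scale` per cell
    mean_a = normalised[:, condition == group_a].mean(axis=1)
    mean_b = normalised[:, condition == group_b].mean(axis=1)
    return np.log2((mean_b + pseudocount) / (mean_a + pseudocount))

# Toy data: 2 genes x 4 cells, two cells per condition; values are made up.
counts = np.array([
    [10, 10, 30, 30],
    [10, 10, 10, 10],
])
condition = ["A", "A", "B", "B"]
lfc = log2_fold_change(counts, condition, "A", "B")
```

In this toy data the second gene has identical raw counts in both conditions, yet its fold change is negative after normalisation because the cells in condition B are sequenced more deeply. That is one reason raw averages are not comparable between conditions.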

4.9 years ago

I guess it's not meaningful to take the average expression in scRNA-Seq data directly. You could pool the data from 'n' similar cells and then calculate the average expression. This way you overcome the sparsity issues.

Using a KNN-based approach, you can either pool the data from similar cells or impute the gene expression, and then calculate the mean expression. In any case, I am not sure what the end goal is here with mean counts.
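A minimal sketch of the pooling idea, assuming Euclidean distance on log-transformed counts as the similarity measure (other choices, such as distances in PCA space, are common). The function name `knn_pool` is hypothetical, and the all-pairs distance computation is only practical for small matrices.

```python
import numpy as np

def knn_pool(counts, k):
    """Sum each cell's raw counts with those of its k nearest neighbours.

    counts: genes x cells matrix of raw counts.
    Returns a genes x cells matrix of pooled counts.
    """
    counts = np.asarray(counts)
    cells = np.log1p(counts.T.astype(float))          # cells x genes
    diff = cells[:, None, :] - cells[None, :, :]      # all pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=2))           # cells x cells distances
    np.fill_diagonal(dist, -1.0)                      # make each cell its own first neighbour
    neighbours = np.argsort(dist, axis=1, kind="stable")[:, :k + 1]
    # counts[:, neighbours] has shape genes x cells x (k + 1); sum over the neighbours.
    return counts[:, neighbours].sum(axis=2)

# Toy data: 2 genes x 4 cells forming two obvious groups; values are made up.
counts = np.array([
    [5, 6, 0, 0],
    [0, 1, 7, 8],
])
pooled = knn_pool(counts, k=1)
```

With `k=1` each cell is pooled with its single closest cell, so the first two cells share one pooled profile and the last two share another. Per-gene means can then be computed on `pooled` instead of on the sparse raw matrix.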


I recently tried using the calculate.pseudocells function from scWGCNA to create average-expression pseudocells using the same KNN approach. However, I guess the function isn't working as expected: I know my data is heterogeneous, but the results from normal cells vs. pseudocells are the same. Is the AverageExpression function from Seurat the right one to use? (That's the one that calculate.pseudocells employs.)
