Question: Clustering cell populations from bulk RNA-seq?
Hi everyone. I was wondering whether it is possible to cluster cell populations after bulk RNA-seq. To be more precise, I am sequencing all epithelial cells from lungs, and I am wondering whether there is a clustering algorithm that would allow me to differentiate the epithelial cell populations afterwards.

Thanks, J

Are you trying to differentiate populations within an individual sample? This thread may be relevant: Deconvolution Methods on RNA-Seq Data (Mixed cell types)

It also sounds to me like a deconvolution problem (btw, thanks for the thread reference, @igor). You can look at this:

Not all of the methods are purely mathematical; some use expression profiles of "pure" cells or marker genes. However, I agree with the answer below by @Devon that a single-cell/single-nucleus RNA-seq experiment would be better (although more expensive).
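To give a rough idea of what the reference-based flavour looks like, here is a minimal sketch, not any particular published tool. It assumes you already have a genes x cell-types matrix of "pure" expression profiles (the `reference` array, simulated here) and that the bulk profile is a non-negative linear mixture of them. All names (`estimate_proportions`, `reference`, `bulk`) and the simulated numbers are made up for illustration.

```python
import numpy as np

def estimate_proportions(bulk, reference, n_iter=5000):
    """Estimate cell-type proportions p >= 0 minimising ||reference @ p - bulk||^2.

    Uses plain projected gradient descent (an illustrative choice).
    bulk:      vector of length n_genes
    reference: matrix of shape (n_genes, n_cell_types), "pure" profiles
    """
    n_types = reference.shape[1]
    # Step size 1/L, where L is the largest eigenvalue of reference.T @ reference.
    step = 1.0 / np.linalg.norm(reference, 2) ** 2
    p = np.full(n_types, 1.0 / n_types)
    for _ in range(n_iter):
        grad = reference.T @ (reference @ p - bulk)
        p = np.maximum(p - step * grad, 0.0)  # project onto p >= 0
    return p / p.sum()  # rescale so the proportions sum to 1

# Simulated example: 200 genes, 3 hypothetical cell types, no noise.
rng = np.random.default_rng(0)
reference = rng.gamma(shape=2.0, scale=50.0, size=(200, 3))
true_p = np.array([0.6, 0.3, 0.1])
bulk = reference @ true_p

est_p = estimate_proportions(bulk, reference)
print(np.round(est_p, 3))
```

On real data the linear-mixture assumption only holds approximately, and the result is only as good as the reference profiles, so treat the estimated proportions with caution.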

tldr: It's theoretically possible, but can be tricky and the results are questionable.

You're looking for "blind signal separation" techniques, such as ICA or NMF. Having said that, I should warn you that (A) these techniques tend not to work terribly well on RNA-seq data, since they generally require that signals sum linearly (RNA-seq signals are competitive: transcripts compete for a fixed number of reads), and (B) because of that, most papers benchmark on microarray data and then say that their methods work with RNA-seq too. Take that with a large grain of salt. You will need a decent number of samples regardless of the technique you use. Typically you will also need to predefine how many cell types constitute your samples, so hopefully you have a ballpark estimate. Note that the more sources you have to separate out, the less reliable the expression estimates for each source will be.
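For what it's worth, here is a minimal sketch of what NMF does in this setting, using the classic multiplicative update rules in plain NumPy rather than a dedicated deconvolution package. Everything in it is an assumption for illustration: the data are simulated so that signals sum linearly by construction (exactly the property that real RNA-seq may violate), the number of cell types `k` is fixed in advance, and the function names are made up.

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=1000, seed=0, eps=1e-9):
    """Factor non-negative X (genes x samples) as W (genes x k) @ H (k x samples).

    W holds one expression signature per source, H the per-sample weights.
    Uses multiplicative updates that minimise the squared (Frobenius) error.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def simulate_bulk(n_genes=200, n_samples=40, k=3, seed=1):
    """Simulate bulk profiles as exact linear mixtures of k hidden signatures."""
    rng = np.random.default_rng(seed)
    signatures = rng.gamma(shape=2.0, scale=50.0, size=(n_genes, k))
    # Each column of `proportions` sums to 1 (one column per sample).
    proportions = rng.dirichlet(np.ones(k), size=n_samples).T
    return signatures @ proportions, signatures, proportions

X, true_sig, true_prop = simulate_bulk()
W, H = nmf_multiplicative(X, k=3)

# Rescale the weights so each sample's estimated proportions sum to 1.
est_prop = H / H.sum(axis=0, keepdims=True)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Note that a low reconstruction error does not mean the recovered signatures match the true cell types: the factorisation is generally not unique, and the components come back in arbitrary order and scale.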

If possible, you'd be better off just doing single-cell sequencing. Even just sequencing RNA in nuclei from single cells would be better in my mind.

