I am trying to identify the cell type for each of the clusters in my tSNE plot using the top cluster-specific marker genes. Both the marker genes and the tSNE plot were generated with the Seurat R package. To tackle this problem I performed GO enrichment analysis on each of the clusters; however, the enriched biological processes often don't clearly point to a specific cell type. For example, "anion transmembrane transport" doesn't tell me whether it's a kidney cell or a neuron.
Is there a simple way to ID each cluster based on top marker genes in that cluster? I have a hunch that I am missing something obvious.
Please let me know if I can clarify my question in any way. Thank you.
What you want to do is generally referred to as cell deconvolution, i.e., 'mapping' your own expression data to pre-defined signatures of different cell types. Signatures are available for immune cell types; one of the best known is referred to simply as Abbas, and it is implemented in the cellmix R package. If you search for "Abbas cellmix signature" in a search engine, you'll find it. The issue is that it was designed using cDNA microarrays, so it's not amenable to RNA-seq (I've tried).
Now, work that I was doing with colleagues in London back in 2015 was on arthritis, and we wanted to perform cell deconvolution. We eventually came up with a way to deconvolute RNA-seq data using information from FANTOM5, which has CAGE-seq expression data on hundreds of human cell types.
Other methods for doing what you want may have appeared since then, but I am not aware of any right now. I can think of various ways to do it, though. Why not aim to devise your own method?
The Human Protein Atlas project will undoubtedly be a great help for the type of work that you want to do.
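To make the idea of 'mapping' marker genes to pre-defined signatures a bit more concrete, here is a minimal sketch of one possible home-made approach: score each cluster's top markers against each reference signature with a hypergeometric overlap test and take the best hit. This is not the Abbas/cellmix method and not what we did with FANTOM5. The function names (`hypergeom_tail`, `annotate_clusters`), the tiny toy signatures, and the background size of 20,000 genes are all assumptions made up for illustration; in practice you would export the top markers per cluster from Seurat and build the signatures from a resource of your choice.

```python
from math import comb


def hypergeom_tail(k, M, n, N):
    """P(X >= k) when N genes are drawn from a background of M genes,
    n of which belong to the reference signature."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def annotate_clusters(cluster_markers, signatures, n_background):
    """Return {cluster: (best_cell_type, overlap, p_value)}.

    cluster_markers: {cluster_id: iterable of top marker genes}
    signatures:      {cell_type: iterable of signature genes}
    n_background:    assumed number of genes in the background universe
    """
    results = {}
    for cluster, markers in cluster_markers.items():
        markers = set(markers)
        scored = []
        for cell_type, sig in signatures.items():
            sig = set(sig)
            overlap = len(markers & sig)
            p = hypergeom_tail(overlap, n_background, len(sig), len(markers))
            scored.append((p, cell_type, overlap))
        scored.sort()  # smallest p-value first
        best_p, best_type, best_overlap = scored[0]
        results[cluster] = (best_type, best_overlap, best_p)
    return results


# Toy signatures for illustration only; NOT a curated reference.
signatures = {
    "neuron": ["SNAP25", "SYT1", "RBFOX3", "STMN2"],
    "kidney_proximal_tubule": ["LRP2", "CUBN", "SLC34A1", "SLC22A6"],
    "t_cell": ["CD3D", "CD3E", "CD2", "IL7R"],
}

# Hypothetical top markers per cluster, e.g. exported from Seurat.
cluster_markers = {
    "0": ["SNAP25", "SYT1", "STMN2", "GAPDH", "ACTB"],
    "1": ["LRP2", "CUBN", "SLC34A1", "ALB", "ACTB"],
}

results = annotate_clusters(cluster_markers, signatures, n_background=20000)
for cluster, (cell_type, overlap, p) in results.items():
    print(cluster, cell_type, overlap, p)
```

A cluster with no overlap to any signature gets a p-value of 1 for every cell type, so it is worth checking the overlap count and p-value before trusting the label. You would also want to correct for multiple testing once you have many clusters and signatures.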