Single cell identification library with Ensembl IDs
5.1 years ago
chilifan ▴ 120

I am analyzing gene/cell expression data and trying to annotate the cell types using SingleR. My data is from 10x and has two columns that could serve as row names: one with the Ensembl Gene_ID and one with the gene symbol, for example CYTH3. The data look like this:

Gene_ID          Symbol    AAACCTGCACACTGCG.1  AAACCTGGTCAGAATA.1  AAACGGGAGATTACCC.1
ENSG00000000419  DPM1      0                   0                   0
ENSG00000000457  SCYL3     0                   0                   0
ENSG00000000460  C1orf112  0                   0                   0
ENSG00000000938  FGR       1                   0                   0
ENSG00000000971  CFH       0                   0                   0
ENSG00000001036  FUCA2     0                   0                   0
ENSG00000001084  GCLC      0                   0                   0
ENSG00000001167  NFYA      0                   0                   0
ENSG00000001460  STPG1     0                   0                   0

My problem is that only one column is allowed as row names, so I have to delete one of the two columns. However, some gene symbols are duplicated, appearing with different Gene_IDs. As I understand it, these are haplotypes, and it is better to use the Ensembl Gene_IDs when working with genes. The cell type annotation libraries I have found are all based on gene symbols, so they won't recognize my Gene_IDs. Is there a workaround (like giving distinct names to gene symbols from different haplotypes), or is there a library that uses Gene_IDs to recognize cell types?
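To make the renaming workaround concrete, here is a minimal sketch in plain Python. It keeps a symbol as the row name when it is unique and appends the Gene_ID when the symbol is duplicated. The function name `make_unique_row_names`, the `GENE_X` symbol and the `ENSG_HYPOTHETICAL_*` IDs are made up for illustration and do not come from my data. I assume a renamed gene would no longer match a symbol-based reference.

```python
from collections import Counter


def make_unique_row_names(gene_ids, symbols):
    """Build one row name per gene.

    Unique symbols are kept as-is, duplicated symbols get the Gene_ID
    appended, and empty symbols fall back to the Gene_ID alone.
    """
    if len(gene_ids) != len(symbols):
        raise ValueError("gene_ids and symbols must have the same length")

    # Count how often each symbol occurs so duplicates can be detected.
    counts = Counter(symbols)

    row_names = []
    for gene_id, symbol in zip(gene_ids, symbols):
        if not symbol:
            row_names.append(gene_id)
        elif counts[symbol] > 1:
            row_names.append(f"{symbol}_{gene_id}")
        else:
            row_names.append(symbol)
    return row_names


# The first two rows are taken from the table above; the last two are
# hypothetical placeholders standing in for a duplicated symbol.
gene_ids = [
    "ENSG00000000419",
    "ENSG00000000457",
    "ENSG_HYPOTHETICAL_1",
    "ENSG_HYPOTHETICAL_2",
]
symbols = ["DPM1", "SCYL3", "GENE_X", "GENE_X"]

row_names = make_unique_row_names(gene_ids, symbols)
print(row_names)
```

With this scheme, genes with unique symbols would still be recognized by a symbol-based reference, and only the duplicated ones would be left out of the matching.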

Ensembl library haplotypes Gene_ID annotation • 877 views