Question: HUGO and Ensembl ids

If several genes share the same HUGO symbol but have different Ensembl IDs, does it make sense to add up their raw counts (for RNA expression or single-cell analysis)? Does it make sense to treat them as isoforms?

16 months ago rsafavi • 40 • updated 15 months ago EagleEye 6.4k

Please provide some examples.

16 months ago

Maybe I can give an example. I have human RNA-seq samples, and the differential expression results for the gene NDUFA6 look like this:

ENS ID             Gene Name    logFC
ENSG00000272765    NDUFA6       -0.6
ENSG00000281013    NDUFA6        0.8
ENSG00000184983    NDUFA6       -0.6

As you can see, there are different ENS IDs for the same gene name (two alternative sequence alignments; the last one is the reference gene on the Ensembl website). Usually I do not get such different fold changes for alternative versions of the same gene, but here it gets tricky. Should I merge all alignments of the same gene name into a single gene expression value (for all such cases) and rerun the DE analysis? Or how should I interpret these results?
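To find every such case before deciding what to do, a minimal sketch like the one below could list the gene names that map to more than one Ensembl gene ID and flag those whose logFC values point in opposite directions. The rows are the three from the table above; the function names are hypothetical and not part of any package.

```python
from collections import defaultdict

# Rows copied from the table in this post: (Ensembl gene ID, gene name, logFC).
rows = [
    ("ENSG00000272765", "NDUFA6", -0.6),
    ("ENSG00000281013", "NDUFA6", 0.8),
    ("ENSG00000184983", "NDUFA6", -0.6),
]


def symbols_with_multiple_ids(rows):
    """Return {gene name: [(ensembl_id, logFC), ...]} for names with >1 distinct Ensembl ID."""
    by_symbol = defaultdict(list)
    for ensembl_id, symbol, logfc in rows:
        by_symbol[symbol].append((ensembl_id, logfc))
    return {
        symbol: hits
        for symbol, hits in by_symbol.items()
        if len({ensembl_id for ensembl_id, _ in hits}) > 1
    }


def has_conflicting_direction(hits):
    """True if the non-zero logFC values include both up- and down-regulation."""
    signs = {logfc > 0 for _, logfc in hits if logfc != 0}
    return len(signs) > 1


ambiguous = symbols_with_multiple_ids(rows)
for symbol, hits in ambiguous.items():
    status = "conflicting" if has_conflicting_direction(hits) else "consistent"
    print(symbol, len(hits), status)
```

For the three rows above, NDUFA6 is reported with three Ensembl IDs and a conflicting direction, which is exactly the situation being asked about.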

15 months ago • updated 15 months ago


Only the last one is from the chromosome 22 assembly. The first two are from the exception contigs (haplotype variant contigs), so there is a reason they are assigned different 'ENSG' IDs. I would say always use 'ENSG' IDs for reference/indexing purposes when you are working with Ensembl annotation. In my opinion, for gene-level analysis you should always summarize by 'ENSG' ID; for transcript-level/isoform-level analysis, always summarize by 'ENST' ID.


ENSG: genes; ENST: transcript variants/isoforms; ENSE: exons; ENSP: proteins
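As a rough illustration of the difference between summarizing by 'ENSG' ID and by gene name, here is a small sketch in plain Python. The raw counts are made up for illustration (they are not from the poster's data), the helper names are hypothetical, and the prefix lookup assumes human IDs; other species carry an extra species code in the prefix (for example ENSMUSG for mouse genes), which this sketch does not handle.

```python
from collections import defaultdict

# Prefix legend from the answer above (human Ensembl IDs only).
ID_TYPES = {"ENSG": "gene", "ENST": "transcript", "ENSE": "exon", "ENSP": "protein"}


def ensembl_id_type(ensembl_id):
    """Classify a human Ensembl ID by its four-letter prefix."""
    return ID_TYPES.get(ensembl_id[:4], "unknown")


def sum_counts(records, key_index):
    """Sum the raw count (last field of each record) per value of the chosen key column."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_index]] += record[-1]
    return dict(totals)


# (Ensembl gene ID, gene name, raw count): the counts are HYPOTHETICAL.
records = [
    ("ENSG00000272765", "NDUFA6", 12),
    ("ENSG00000281013", "NDUFA6", 30),
    ("ENSG00000184983", "NDUFA6", 410),
]

per_gene_id = sum_counts(records, 0)  # one entry per ENSG ID, as recommended above
per_symbol = sum_counts(records, 1)   # merges the haplotype-contig copies into one entry

print(per_gene_id)
print(per_symbol)
```

Keying on the 'ENSG' ID keeps the reference gene and the haplotype-contig copies as separate rows, while keying on the gene name folds them into one number and hides which locus the reads were assigned to.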

15 months ago EagleEye 6.4k
