Retired Ensembl genes - should I identify their successors?
7.9 years ago

Hello,

I am running a WGCNA analysis on public RNA-seq expression data. Some of the gene IDs are Entrez IDs, others are Ensembl IDs, and I want them all in Entrez. I tried both NCBI and BioMart to convert the Ensembl IDs, but whichever way I do it, around 3,000 of the 20,000 genes have no match. After a little investigating, I discovered that these IDs are "retired" (example).

Should I hunt harder to match these retired Ensembl IDs to their current Entrez equivalents? Or is it safe to assume that these are "fringe elements" whose status as a gene was revoked, and that it's okay to leave them out of my analysis?

Any insight is appreciated.

Thanks,

Maureen

wgcna entrez ID ensembl ID ID conversion • 2.5k views
7.9 years ago

I just checked your example. Clicking on 'Latest Version' takes me to http://feb2014.archive.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000006074;r=17:34391640-34399392, and the gene looks pretty real to me: http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000275385;r=17:36064280-36072032

So based on this alone, it seems worth investigating further. Some people from Ensembl are active here and will probably be able to give you a better answer.


Wouter is right. There are often legitimate reasons why we would change the ID of a gene: maybe we've split it into two genes, or merged two together, but the gene still exists. There are also cases where we have retired genes because they're dodgy. It's worth investigating them, though you'll probably gain some and lose some in the process.
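
If it helps, here is a rough Python sketch of how one might triage the retired IDs in bulk rather than clicking through them one by one. It assumes the Ensembl REST `archive/id/:id` endpoint and that its JSON response carries `id`, `is_current` and `possible_replacement` fields (each replacement having a `stable_id`); please check that against the current REST documentation before relying on it. The function names and the three labels are made up for illustration.

```python
import json
import urllib.request

# Assumption: public Ensembl REST server and its archive/id endpoint.
ENSEMBL_REST = "https://rest.ensembl.org"


def fetch_archive_record(stable_id, timeout=30):
    """Fetch the archive record for one stable ID (needs network access)."""
    url = f"{ENSEMBL_REST}/archive/id/{stable_id}?content-type=application/json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def classify_record(record):
    """Sort an archive record into one of three hypothetical buckets.

    Returns (label, ids):
      "current"  - the ID still exists; ids holds the ID itself
      "replaced" - the ID was retired but successors are suggested
                   (e.g. after a split or merge); ids holds the successors
      "retired"  - the ID was retired and no successor is suggested
    Field names are assumed from the REST response format, not guaranteed.
    """
    if record.get("is_current") in ("1", 1, True):
        return "current", [record.get("id")]
    replacements = [
        r["stable_id"]
        for r in (record.get("possible_replacement") or [])
        if r.get("stable_id")
    ]
    if replacements:
        return "replaced", replacements
    return "retired", []


def triage(stable_ids):
    """Classify many IDs; returns {stable_id: (label, ids)}."""
    return {sid: classify_record(fetch_archive_record(sid)) for sid in stable_ids}
```

Calling `triage(list_of_unmatched_ids)` would then separate the IDs worth re-mapping ("replaced") from those with no suggested successor ("retired"). A split gene can return several successors, so deciding which one to carry into the WGCNA input is still a manual judgment call.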
