Stability of Ensembl IDs
10.8 years ago
cdsouthan ★ 1.9k

What is the stability of Ensembl IDs at the gene, protein and transcript levels?

I know human is pretty solid (but still not completely 1:1:1:1 with HGNC, Swiss-Prot and Entrez Gene), and there is a push to close mouse and zebrafish via Havana.

However, I guess most of the other assemblies are provisional to some extent, so the gene IDs are difficult to lock down across rebuilds, which will also cause some churn in the transcripts and ORFs.

Is there any data on this? The reason for asking is in regard to citing them and/or including raw sequence as supplementary data.

ensembl • 1.7k views
10.8 years ago
Emily 23k

I can do some investigating on the stability for you and see if we have any statistics. Often for less-studied genomes the IDs will remain simply because we do not re-do the genebuild until we get a new assembly.

If you want to ensure the stability, note down the release number where you got the data as well as the IDs so that readers can look them up in the archive sites.
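
A minimal sketch of that record-keeping in Python, in case it helps. The names `CitedId`, `parse_cited_id` and `format_citation`, and the release number used below, are made up for illustration and are not part of any Ensembl tool; the only thing assumed about the data is the usual `ID.version` form of a versioned stable ID.

```python
from typing import NamedTuple, Optional


class CitedId(NamedTuple):
    """A stable ID together with the release it was taken from (hypothetical layout)."""
    stable_id: str          # ID without the version suffix
    version: Optional[int]  # None when the ID was recorded without a suffix
    release: int            # Ensembl release number the data came from


def parse_cited_id(raw: str, release: int) -> CitedId:
    """Split 'ID.version' into its parts and attach the release number."""
    stable_id, _, version = raw.strip().partition(".")
    return CitedId(stable_id, int(version) if version else None, release)


def format_citation(cited: CitedId) -> str:
    """Render the ID and release as a string suitable for a methods section."""
    if cited.version is None:
        versioned = cited.stable_id
    else:
        versioned = f"{cited.stable_id}.{cited.version}"
    return f"{versioned} (Ensembl release {cited.release})"


# Illustrative values only: the release number here is an example.
print(format_citation(parse_cited_id("ENSG00000139618.15", 75)))
```

Keeping the release next to every ID means a reader can go to the matching archive site even if the ID has since been retired or remapped.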


We may be able to dig up some stats from when we redid genebuilds, but these will be patchy. The genebuild pipeline produces a stats file but we don't, as a matter of course, keep them. It will depend on whether the individual who ran the genebuild decided to keep them. Which species are you interested in?


Thanks for the offer, but all things considered, for the publication we are working on (a couple of genes in ~20 species) I have decided to (ask the eds if we can) supply the FASTA strings of what we used as supplementary data. This simplifies everything for re-use and, besides, we extended many Ensembl ORFs using ESTs or TSAs. JFTR, has the team ever looked at "churn" rates, i.e. changes in the ORF sets between gene builds?
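
To make "churn" concrete, here is a rough Python sketch of one way it could be measured between two builds. The function `orf_churn` and the definition used (one minus the Jaccard similarity of the two ORF sequence sets) are assumptions for illustration, not a metric Ensembl is known to report, and the sequences are toy strings.

```python
def orf_churn(old_orfs, new_orfs):
    """Compare two collections of ORF sequences by exact string match.

    Churn is defined here (an assumption) as 1 - |shared| / |union|.
    """
    old_set, new_set = set(old_orfs), set(new_orfs)
    shared = old_set & new_set
    union = old_set | new_set
    return {
        "unchanged": len(shared),
        "lost": len(old_set - new_set),    # present only in the old build
        "gained": len(new_set - old_set),  # present only in the new build
        "churn": 1 - len(shared) / len(union) if union else 0.0,
    }


# Toy example: one ORF unchanged, one lost, one gained.
old_build = ["ATGAAATAA", "ATGCCCTAA"]
new_build = ["ATGAAATAA", "ATGGGGTAA"]
print(orf_churn(old_build, new_build))
```

Exact matching counts any edit to an ORF, including an extension, as one loss plus one gain, so this gives an upper bound on how much really changed.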
