Question

Stability Of Ensembl IDs

2

10.8 years ago

cdsouthan ★ 1.9k

What is the stability of Ensembl IDs at the gene, protein and transcript levels?

I know human is pretty solid (but still not completely 1:1:1:1 with HGNC, Swiss-Prot and Entrez Gene), and there is a push to close mouse and zebrafish via Havana.

However, I guess most of the other assemblies are provisional to some extent, so the gene IDs are difficult to lock down across rebuilds; that will also cause some churn in the transcripts and ORFs.

Is there any data on this? The reason for asking is in regard to citing them and/or including raw sequence as supplementary data.

ensembl • 1.7k views

updated 7.4 years ago by Biostar 20 • written 10.8 years ago by cdsouthan ★ 1.9k

score 2 · Answer 1 · 2013-07-02

2

10.8 years ago

Emily 23k

I can do some investigating on the stability for you and see if we have any statistics. Often for less-studied genomes the IDs will stay the same simply because we do not redo the genebuild until we get a new assembly.

To make sure your IDs can still be resolved later, note down the release number you got the data from as well as the IDs, so that readers can look them up in the archive sites.
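
As a rough illustration of recording the release alongside each ID, here is a minimal Python sketch. It assumes the usual stable ID layout (`ENS`, an optional species prefix, a feature-type code, 11 digits, and an optional `.version` suffix); the helper names `cite` and `format_citation` and the release number used below are hypothetical, chosen only for the example.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: "ENS" + optional species prefix + feature-type code
# + 11 digits + optional ".version". Adjust if your IDs differ.
_STABLE_ID = re.compile(r"^(ENS[A-Z]*?)(GT|FM|G|T|P|E|R)(\d{11})(?:\.(\d+))?$")

FEATURE_TYPES = {
    "G": "gene",
    "T": "transcript",
    "P": "protein",
    "E": "exon",
    "R": "regulatory feature",
    "FM": "protein family",
    "GT": "gene tree",
}


class CitedId(NamedTuple):
    stable_id: str          # ID without the version suffix
    feature_type: str       # e.g. "gene", "transcript", "protein"
    version: Optional[int]  # None when the ID was given unversioned
    release: int            # Ensembl release the data was taken from


def cite(identifier: str, release: int) -> CitedId:
    """Split a stable ID into its parts and attach the release number."""
    match = _STABLE_ID.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a recognised Ensembl stable ID: {identifier!r}")
    prefix, code, digits, version = match.groups()
    return CitedId(
        stable_id=prefix + code + digits,
        feature_type=FEATURE_TYPES[code],
        version=int(version) if version is not None else None,
        release=release,
    )


def format_citation(cited: CitedId) -> str:
    """Render one line suitable for a methods section or supplementary table."""
    shown = cited.stable_id
    if cited.version is not None:
        shown = f"{shown}.{cited.version}"
    return f"{shown} ({cited.feature_type}, Ensembl release {cited.release})"


if __name__ == "__main__":
    # Illustrative values only: the release number is a placeholder.
    for raw in ["ENSG00000139618", "ENSMUSG00000017167", "ENST00000380152.8"]:
        print(format_citation(cite(raw, release=72)))
```

Keeping the version suffix where you have one makes it clear which revision of the feature was used, and the release number tells readers which archive site to check.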

0

We may be able to dig up some stats from when we redid genebuilds, but these will be patchy. The genebuild pipeline produces a stats file, but we don't keep these as a matter of course; it depends on whether the individual who ran the genebuild decided to keep them. Which species are you interested in?

10.8 years ago by Emily 23k

0

Thanks for the offer, but all things considered, for the publication we are working on (a couple of genes in ~20 species) I have decided to (ask the eds if we can) supply the FASTA strings of what we used as supplementary data. This simplifies everything for re-use and, besides, we extended many Ensembl ORFs using ESTs or TSAs. For the record, has the team ever looked at "churn" rates, i.e. changes in the ORF sets between gene builds?
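
To make "churn" concrete, here is a minimal Python sketch of one way it could be measured, assuming you have two ID-to-sequence mappings (for example, parsed from the protein FASTA files of two gene builds). The definition of the rate (changed plus retired, over everything in the old build) is my own assumption, not an Ensembl statistic, and the names `orf_churn` and `churn_rate` are hypothetical.

```python
from typing import Dict, NamedTuple, Set


class Churn(NamedTuple):
    unchanged: Set[str]  # same ID, same sequence in both builds
    changed: Set[str]    # same ID, different sequence
    retired: Set[str]    # ID present only in the old build
    added: Set[str]      # ID present only in the new build


def orf_churn(old: Dict[str, str], new: Dict[str, str]) -> Churn:
    """Compare two {stable_id: sequence} mappings from successive builds."""
    shared = old.keys() & new.keys()
    unchanged = {sid for sid in shared if old[sid] == new[sid]}
    return Churn(
        unchanged=unchanged,
        changed=set(shared) - unchanged,
        retired=set(old) - set(new),
        added=set(new) - set(old),
    )


def churn_rate(churn: Churn) -> float:
    """Fraction of old-build entries that were altered or dropped (assumed definition)."""
    total_old = len(churn.unchanged) + len(churn.changed) + len(churn.retired)
    if total_old == 0:
        return 0.0
    return (len(churn.changed) + len(churn.retired)) / total_old


if __name__ == "__main__":
    # Toy placeholder data, not real Ensembl records.
    old_build = {"P1": "MKT", "P2": "MAA", "P3": "MLL"}
    new_build = {"P1": "MKT", "P2": "MAAV", "P4": "MGG"}
    result = orf_churn(old_build, new_build)
    print(result)
    print(f"churn rate: {churn_rate(result):.2f}")
```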

10.8 years ago by cdsouthan ★ 1.9k