Hi all,
I have noticed something odd about the Mus musculus Ensembl annotation I'm currently using (Ensembl 87/GENCODE vM12). I've found a bunch of genes that have more than one gene ID for the same gene symbol. I originally thought these might just be located in alternatively assembled haplotypic regions, but that does not appear to be the case for any of the genes I've checked.
Here are two examples:
ENSMUSG00000051396.15 Hspa14
ENSMUSG00000109865.1 Hspa14
ENSMUSG00000030786.18 Itgam
ENSMUSG00000108596.1 Itgam
Neither pair appears to be located on a haplotype assembly region; in each case both gene IDs are on the same chromosome and at the same locus. Does anyone know why this is the case? Shouldn't all the transcripts at these loci be under the same gene ID and name?
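For anyone who wants to check their own annotation, here is a minimal sketch of how such duplicates can be listed from a GTF. It assumes the usual GENCODE/Ensembl GTF layout (tab-separated, feature type in column 3, `gene_id` and `gene_name` in the attribute column). The function name, the toy records, their coordinates and the `GeneX` entry are made up for illustration; only the Itgam IDs come from the examples above.

```python
import re
from collections import defaultdict

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def symbols_with_multiple_ids(gtf_lines):
    """Return {gene symbol: sorted gene IDs} for symbols that have more than one gene ID."""
    ids_by_symbol = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):          # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue                      # only look at gene-level records
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        symbol = attrs.get("gene_name")
        if gene_id and symbol:
            ids_by_symbol[symbol].add(gene_id)
    return {s: sorted(ids) for s, ids in ids_by_symbol.items() if len(ids) > 1}

# Toy records: coordinates and strand are placeholders, not real annotation.
toy_gtf = [
    "##description: toy example\n",
    'chr7\tHAVANA\tgene\t100\t200\t.\t.\t.\tgene_id "ENSMUSG00000030786.18"; gene_name "Itgam";\n',
    'chr7\tHAVANA\tgene\t100\t250\t.\t.\t.\tgene_id "ENSMUSG00000108596.1"; gene_name "Itgam";\n',
    'chr1\tHAVANA\tgene\t500\t900\t.\t.\t.\tgene_id "HYPOTHETICAL_ID.1"; gene_name "GeneX";\n',
]

print(symbols_with_multiple_ids(toy_gtf))
# {'Itgam': ['ENSMUSG00000030786.18', 'ENSMUSG00000108596.1']}
```

On a real file the same function can be fed an open file handle, e.g. `with open(path_to_gtf) as fh: symbols_with_multiple_ids(fh)`, where `path_to_gtf` is wherever the annotation was downloaded to.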
Thanks
Yeah, annotation can be a mess sometimes... Complaints should go to Ensembl directly :P
If you look at your Itgam example, both Ensembl annotations point to the same genomic region (more or less).
http://www.ensembl.org/Mus_musculus/Gene/Splice?db=core;g=ENSMUSG00000108596;r=7:128062683-128128160
http://www.ensembl.org/Mus_musculus/Gene/Splice?db=core;g=ENSMUSG00000030786;r=7:128062640-128118491
It probably has something to do with the last exon?
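To put numbers on "more or less", here is a rough sketch that pulls the `r=` region out of the two links above and compares them. It assumes the `r=` parameter reflects the span of each gene record, which I have not verified; the helper names are made up. It also does not look at strand or exon structure, so it only shows that the two records differ mainly at one end, not that the last exon is the cause.

```python
import re

# Extracts chromosome, start and end from an Ensembl-style "r=chr:start-end" URL parameter.
REGION_RE = re.compile(r"[?;&]r=([^:;&]+):(\d+)-(\d+)")

def region_from_url(url):
    """Return (chromosome, start, end) parsed from the r= parameter of the URL."""
    m = REGION_RE.search(url)
    if m is None:
        raise ValueError(f"no r= region found in {url!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))

def compare_regions(a, b):
    """Summarise how two 1-based inclusive regions differ."""
    chrom_a, start_a, end_a = a
    chrom_b, start_b, end_b = b
    if chrom_a != chrom_b:
        overlap = 0
    else:
        overlap = max(0, min(end_a, end_b) - max(start_a, start_b) + 1)
    return {
        "overlap_bp": overlap,
        "start_diff_bp": abs(start_a - start_b),
        "end_diff_bp": abs(end_a - end_b),
    }

url_108596 = "http://www.ensembl.org/Mus_musculus/Gene/Splice?db=core;g=ENSMUSG00000108596;r=7:128062683-128128160"
url_030786 = "http://www.ensembl.org/Mus_musculus/Gene/Splice?db=core;g=ENSMUSG00000030786;r=7:128062640-128118491"

summary = compare_regions(region_from_url(url_108596), region_from_url(url_030786))
print(summary)
# {'overlap_bp': 55809, 'start_diff_bp': 43, 'end_diff_bp': 9669}
```

So the two regions start within 43 bp of each other and differ by about 9.7 kb at the other end, with the shorter one almost entirely contained in the longer one.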