100% Identical Transcript Sequences - How Did They Manage To Put Them Into Different Loci?
10.2 years ago
PoGibas 5.1k

I stumbled upon two genes:

After a closer look I noticed that they are similar, so similar that I BLASTed the loci against each other.


Surprise, surprise, they are 100% identical.
These genes can be grouped as paralogues, but something is still unclear to me.

Is such similarity normal? If the transcripts are 100% identical, how did anyone manage to place them at different loci?


Edit

Three more transcripts from different loci are identical: ENST00000456123.1, ENST00000420149.1, ENST00000436568.1
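For anyone who wants to look for more cases like these, here is a minimal sketch of one way to group FASTA records with byte-for-byte identical sequences. It is not how the transcripts above were found. The function name `group_identical` and the input file `transcripts.fa` are made up for illustration, and the comparison ignores case and line wrapping only.

```shell
# Hypothetical helper: read FASTA on stdin and print one line per group of
# record IDs that share exactly the same sequence (case-insensitive).
group_identical() {
    awk '
        # Remember the record that was just finished, keyed by its sequence.
        function store() {
            if (id == "") return
            if (seq in ids) ids[seq] = ids[seq] " " id
            else ids[seq] = id
            n[seq]++
        }
        /^>/ { store(); id = substr($1, 2); seq = ""; next }
        { seq = seq toupper($0) }
        END {
            store()
            # Report only sequences seen under more than one ID.
            for (s in ids) if (n[s] > 1) print ids[s]
        }
    '
}

# Example call (transcripts.fa is an assumed file name):
# group_identical < transcripts.fa
```

Whole sequences are used as awk array keys, which is fine for a transcript set but may need a checksum instead for very large inputs. The order in which groups are printed is not defined.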

ensembl • 3.1k views

Are their flanking sequences different (UTRs?)?


After extending the loci by 1000 bp on each side I still get 100%:

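# Pad each locus by 1000 bp per side, extract its sequence from the hg19 FASTA
# (named hg.19 here), and align one locus against the other.
# Needs bash for the <(...) process substitution.
blastn \
    -subject <(echo chr15 83087381 83102960 |
                  awk '{OFS="\t"; print $1,$2-1000,$3+1000}' |
                  bedtools getfasta -fi hg.19 -bed - -fo -) \
    -query <(echo chr15 82710856 82726435 |
                awk '{OFS="\t"; print $1,$2-1000,$3+1000}' |
                bedtools getfasta -fi hg.19 -bed - -fo -) |
    grep -m1 'Identities'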

Identities = 17579/17579 (100%), Gaps = 0/17579 (0%)
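
As a quick sanity check, the alignment length reported above can be compared with the length of the padded intervals. This is only a sketch of the arithmetic: `padded_len` is a hypothetical helper, and it assumes the same convention as the awk step in the command (start minus padding, end plus padding).

```shell
# Hypothetical helper: length of an interval after padding both sides.
# $1 = start, $2 = end, $3 = padding in bp
padded_len() {
    echo "$1 $2 $3" | awk '{ print ($2 + $3) - ($1 - $3) }'
}

padded_len 83087381 83102960 1000   # first locus
padded_len 82710856 82726435 1000   # second locus
```

Both calls give 17579, so the single 17579/17579 hit spans the full padded sequence of each locus rather than just part of it.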

Interesting indeed. They might have had some long inserts etc., but I agree, that could be a mistake.


Could you tell me how you BLASTed the loci against each other?

10.2 years ago
Emily 23k

I've asked our friends at Havana to weigh in (either directly or via me) so we'll see what they say.

Follow up: heard back from Havana. Laurens says:

The duplication of this and neighbouring genes is due to a haplotype issue. There is an assembly gap between the two copies: the genomic sequence on the left side is from the RP13 haplotype, and the right side is from the RP11 haplotype.

In the next genome assembly (GRCh38 I presume) this region will be replaced with an ungapped single haplotype (RP11) and it will only have one copy of this gene (and of the flanking genes).


Emily, thank you for helping!
Can you also look at this question: Gencode V15 Exons < 3Bp (e.g., ENST00000605962 - last exon is 1bp long (same Havana annotation)).


On it (plus some extra characters so BioStar will let me post).


I'm replying so that it flags me when there are new answers. I'd like to hear the details.


Oh dear. I edited my original post so you might not get a notification. Well this comment will do that.
