Question

Ensembl REST API: getting an exon-only sequence

0

5.9 years ago

Ali.B • 0

Hello. First off, please forgive me: I have little knowledge of the subject, as I come from a computer-science-only background.

I'm building a web service around the Ensembl REST API. All was going well until I needed to get an exon-only sequence.

What I'm trying to do, given coordinates such as human:10:101654703-101659823:-1, is:

1- Get the sequence from the Ensembl REST API (easy enough).
2- Get the overlapping exons that are protein coding in that region (example from the API).
3- Use the start and end of the exons to get the whole overlapping sequence.

Here are the problems I'm facing:

1- I believe there are different sources for exons (ensembl, ensembl_havana, havana). Which should I use, and how? At the moment I'm prioritizing ensembl_havana and using only that, but I believe that is incorrect: since ensembl_havana means exons agreed upon by both teams, should I also add the remaining exons reported by only one of the teams?
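Steps 1 and 2 could be sketched roughly as follows. This is a minimal sketch, not a definitive client: it assumes the public server at `https://rest.ensembl.org` and the documented `sequence/region` and `overlap/region` endpoints, and the helper names (`sequence_url`, `exon_overlap_url`, `fetch_json`) are made up for illustration.

```python
import json
import urllib.request

# Assumption: the public Ensembl REST server.
SERVER = "https://rest.ensembl.org"


def sequence_url(species, chrom, start, end, strand):
    """URL for the sequence of a region on the requested strand (1 or -1)."""
    return f"{SERVER}/sequence/region/{species}/{chrom}:{start}..{end}:{strand}"


def exon_overlap_url(species, chrom, start, end):
    """URL for the exon features that overlap a region."""
    return f"{SERVER}/overlap/region/{species}/{chrom}:{start}-{end}?feature=exon"


def fetch_json(url):
    """GET a URL and decode the JSON response (performs a network call)."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example usage (needs network access, so it is left commented out):
# region = fetch_json(sequence_url("human", "10", 101654703, 101659823, -1))
# exons = fetch_json(exon_overlap_url("human", "10", 101654703, 101659823))
```

Keeping the URL construction separate from the network call makes the coordinate formatting easy to check without hitting the server.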

2- What is an exon rank? I didn't find any information about that.

3- Given a negative-strand region and positive-strand exons (or vice versa), what should I do?

4- What is the exon version?

I apologise again for the number of questions, but I've been struggling with this for a week; I'm getting some valid results and more invalid ones.

Thank you.

ensembl exon sequence • 1.4k views

score 4 · Accepted Answer · 2018-05-16

4

5.9 years ago

swbarnes2 14k

Exon rank is the exon's position within a given transcript: the first exon is rank 1, the second is rank 2, etc.

"+" and "-" tell you if the transcript runs forward or reverse on the genome. If you are getting your sequence from genome coordinates, you might need to reverse complement it to get the sequence in the right orientation with regard to the transcript. If you are asking for sequence by transcript, gene, or exon ID, it should be in the right orientation for that context.
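Because rank is defined per transcript, exons returned for a region usually need to be grouped by transcript before sorting. Here is a small sketch on toy data; it assumes each exon record carries `Parent` (the transcript ID) and `rank` fields, as the exon records from the overlap endpoint appear to, and the IDs below are invented.

```python
from collections import defaultdict


def exons_by_transcript(exons):
    """Group exon records by parent transcript and sort each group by rank."""
    grouped = defaultdict(list)
    for exon in exons:
        grouped[exon["Parent"]].append(exon)
    return {
        transcript: sorted(group, key=lambda e: e["rank"])
        for transcript, group in grouped.items()
    }


# Toy records (hypothetical IDs): the same exon can hold a different rank
# in different transcripts of the same gene.
toy_exons = [
    {"id": "exonB", "Parent": "tx1", "rank": 2},
    {"id": "exonA", "Parent": "tx1", "rank": 1},
    {"id": "exonB", "Parent": "tx2", "rank": 1},
]

ordered = exons_by_transcript(toy_exons)
print([e["id"] for e in ordered["tx1"]])
```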
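For reference, a reverse complement is short to write by hand. This sketch assumes plain DNA letters (A, C, G, T, N, in either case) and does not handle other IUPAC ambiguity codes; the function name is just illustrative.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # prints GCAT
```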

Version tells you which version of the annotation record you are looking at.

0

Thank you for the explanation. Just to make sure I understand exon rank: it doesn't matter if the rank 1 exon doesn't have the lowest start coordinate compared to the rest; when building the exon sequence, I should start with the sequence of the exon with rank 1?

ADD REPLY • link 5.9 years ago by Ali.B • 0

0

If the gene runs backwards (on the reverse strand), exon 1 should have the highest genomic position. But regardless of the direction of the gene, exon #1 is the first exon of its transcript (it might not be first in another transcript of the same gene). If you are pulling the sequence out by name or exon ID, it should be in the "right" orientation no matter what.
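To make that concrete, here is a rough sketch that builds one transcript's exon-only sequence from a forward-strand region sequence. The assumptions: coordinates are 1-based and inclusive, every exon lies fully inside the region, all exons belong to a single transcript, and each record has `start`, `end`, `strand`, and `rank` fields. The function names and toy data are invented for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def transcript_exon_sequence(region_seq, region_start, exons):
    """Concatenate exon sequences in rank order, in transcript orientation.

    region_seq is the forward-strand sequence whose first base sits at
    genomic position region_start.
    """
    pieces = []
    for exon in sorted(exons, key=lambda e: e["rank"]):
        # Convert 1-based inclusive genomic coordinates to a Python slice.
        piece = region_seq[exon["start"] - region_start : exon["end"] - region_start + 1]
        if exon["strand"] == -1:
            piece = reverse_complement(piece)
        pieces.append(piece)
    return "".join(pieces)


# Toy region covering positions 101..112 on the forward strand.
region = "AAACCCGGGTTT"

# Reverse-strand transcript: rank 1 has the highest genomic position.
reverse_exons = [
    {"start": 104, "end": 106, "strand": -1, "rank": 2},
    {"start": 110, "end": 112, "strand": -1, "rank": 1},
]
print(transcript_exon_sequence(region, 101, reverse_exons))  # prints AAAGGG
```

Sorting by rank rather than by start coordinate is what keeps reverse-strand transcripts in the right order.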

ADD REPLY • link 5.9 years ago by swbarnes2 14k