How to retrieve "UTR" sequence from "Exon" of "Canonical Transcript" from a list of gene symbols?
9.3 years ago
Hyun Jin ▴ 20

Hi,

I'm trying to retrieve "UTR" sequence (i.e. 5'UTR sequence) from "Exon" of "Canonical Transcript" using a list of gene symbols.

First I tried Ensembl martview to do the job (http://www.ensembl.org/biomart/martview/).

But the output contains more 5' UTR sequences (e.g. 150 results) than the number of input genes (e.g. 80).

I assume that one gene can correspond to multiple transcripts, and that this is why I get more results than the number of genes I supplied. Note that my input was gene IDs, not transcript IDs.

Therefore, I thought that retrieving the UTR exon sequence only from the "Canonical transcript" of each gene symbol in my list could avoid this problem.

Then I searched a bit and tried the Table Browser from the UCSC Genome Browser. I selected the knownCanonical table, which lets me pick only the canonical transcript for each gene, but it only provides the coordinates of the UTR, not the sequence itself.

Please advise me, or point me to any reference that I can look up. That would be very helpful.

I can do some basic R programming, but I have never used biomaRt. I'm willing to try it if biomaRt's 'getSequence' is the way to go.

Thank you very much!!!

Genome-Browser BiomaRt R • 2.8k views
9.3 years ago

If you have coordinates (ideally in BED format) then you can just use bedtools getfasta to retrieve the corresponding sequence.
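For reference, a typical invocation would look something like `bedtools getfasta -fi genome.fa -bed utr5.bed -s -name -fo utr5.fa`, where `genome.fa`, `utr5.bed` and `utr5.fa` are placeholder file names. The `-s` flag reverse-complements features on the minus strand, which matters for UTRs. If it helps to see what that step does, here is a minimal pure-Python sketch of the same idea. The function names and the toy genome/BED data are made up for illustration, and it assumes a 6-column BED with one interval per line (a UTR split across several exons would need its pieces joined afterwards).

```python
# Minimal sketch of what `bedtools getfasta -s` does, in pure Python.
# All names and the toy data below are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_fasta(text):
    """Parse FASTA text into a dict of {sequence_name: sequence}."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)
    return {k: "".join(v) for k, v in seqs.items()}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bed_to_sequences(bed_text, genome):
    """Yield (label, sequence) for each BED line.

    BED coordinates are 0-based and half-open, so they map directly
    onto a Python slice. Minus-strand features are reverse-complemented.
    """
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        label = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        strand = fields[5] if len(fields) > 5 else "+"
        seq = genome[chrom][start:end]
        if strand == "-":
            seq = reverse_complement(seq)
        yield label, seq


# Toy example: one 20 bp "chromosome" and two hypothetical 5' UTR intervals.
toy_genome = parse_fasta(">chr1\nACGTACGTAC\nGGGTTTAAAC\n")
toy_bed = "chr1\t0\t4\tgeneA_5utr\t0\t+\nchr1\t10\t16\tgeneB_5utr\t0\t-\n"
utr_sequences = dict(bed_to_sequences(toy_bed, toy_genome))

for label, seq in utr_sequences.items():
    print(f">{label}\n{seq}")
```

For a real genome, bedtools is the better choice because it uses a FASTA index instead of loading everything into memory.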

Thank you Devon. I do have a BED file with the 5' UTR coordinates of the canonical transcripts. I will try it.
