Finding polyA sites from an Ensembl GFF file
5.8 years ago
Wicklow • 0

Hi, I'm working on a project with the pig genome. I'm looking for polyA site information and downloaded this GFF file ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/003/025/GCF_000003025.6_Sscrofa11.1/GCF_000003025.6_Sscrofa11.1_rna.gbff.gz from Ensembl. There are 'three_prime_UTR' regions in this GFF file (field 3). Is there an existing method or tool that can be used directly to get polyA site information from these 3' UTRs?

I did go through this post: "How to find polyA sites from gtf/gff?"

Any help is much appreciated. Thanks in advance.

next-gen RNA-Seq • 1.7k views

What exactly do you want to do with the information? Is the first base after the 3'UTR not a "good-enough" proxy?
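As a rough illustration of the proxy suggested above, here is a minimal Python sketch that reads GFF3 lines and reports, per transcript, the first base after the 3' UTR in the direction of transcription. It assumes the 9-column GFF3 layout with a `Parent=` attribute on each `three_prime_UTR` row; the function name `utr_end_sites` and the demo records are made up for the example, not taken from the pig annotation.

```python
def utr_end_sites(gff_lines):
    """Map each parent transcript to (seqid, strand, proxy_site).

    proxy_site is the first base after the 3' UTR in the direction of
    transcription: end + 1 on the + strand, start - 1 on the - strand.
    Coordinates are 1-based, as in GFF3.
    """
    sites = {}
    for line in gff_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "three_prime_UTR":
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent is None or strand not in ("+", "-"):
            continue
        site = end + 1 if strand == "+" else start - 1
        prev = sites.get(parent)
        if prev is not None:
            # A UTR split across exons has several rows; keep the 3'-most one.
            site = max(prev[2], site) if strand == "+" else min(prev[2], site)
        sites[parent] = (seqid, strand, site)
    return sites


# Hypothetical records, for illustration only.
demo = [
    "##gff-version 3",
    "1\tensembl\tthree_prime_UTR\t100\t200\t.\t+\t.\tParent=transcript:T1",
    "1\tensembl\tthree_prime_UTR\t300\t350\t.\t+\t.\tParent=transcript:T1",
    "2\tensembl\tthree_prime_UTR\t500\t600\t.\t-\t.\tParent=transcript:T2",
]
sites = utr_end_sites(demo)
```

This yields one proxy site per annotated transcript, so it only reflects alternative polyA sites to the extent that they are annotated as separate transcripts.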


Hi, I want to capture all polyA sites (including alternative polyA sites) for all transcripts in the pig genome. But it doesn't look like the Ensembl GFF file has multiple 'three_prime_UTR' records for a given transcript. I thought it would have them.

5.8 years ago
Eric Lim ★ 2.1k

I'm not aware of any existing tool that finds polyA signals directly from a GTF. Typically, you need to convert your region coordinates into sequences first. Converting the coordinates annotated as 3' UTR in your GTF to sequences is a good start.
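To make the coordinate-to-sequence step concrete, here is a minimal Python sketch. It assumes the genome is already loaded as a dict of sequence name to string and that coordinates are 1-based and inclusive, as in GFF/GTF; `region_sequence` and the toy genome are hypothetical names used only for this example. For a real genome, an established tool such as `bedtools getfasta` (with `-s` for strand awareness) is likely the more practical route.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def region_sequence(genome, seqid, start, end, strand):
    """Extract a region given 1-based inclusive coordinates.

    Regions on the - strand are reverse-complemented so the result
    always reads 5' to 3' along the transcript.
    """
    seq = genome[seqid][start - 1:end]
    return reverse_complement(seq) if strand == "-" else seq


# Toy genome, for illustration only.
genome = {"chr1": "AAACCCGGGTTT"}
plus = region_sequence(genome, "chr1", 4, 9, "+")
minus = region_sequence(genome, "chr1", 1, 6, "-")
```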

People have implemented various machine learning algorithms, from simple motif finders to slightly more complicated approaches using SVMs or HMMs. You might find this page useful: https://omictools.com/polyadenylation-prediction-category
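For a sense of what the simplest end of that range looks like, here is a minimal motif-scan sketch in Python. It is not the method of any tool listed on that page; it only looks for the canonical polyadenylation signal hexamer AATAAA and its common variant ATTAAA near the 3' end of a UTR sequence. The 40 nt window, the function name `find_pas`, and the demo sequence are assumptions chosen for the example.

```python
import re

# Canonical polyadenylation signal and its most common variant (DNA alphabet).
PAS_HEXAMERS = ("AATAAA", "ATTAAA")


def find_pas(utr_seq, window=40):
    """Return [(motif, distance)] for hexamers in the last `window` nt.

    distance is the number of bases between the last base of the motif
    and the 3' end of the sequence. Hits are sorted nearest-first.
    """
    seq = utr_seq.upper().replace("U", "T")
    tail = seq[max(0, len(seq) - window):]
    hits = []
    for motif in PAS_HEXAMERS:
        # The lookahead lets overlapping occurrences be reported too.
        for m in re.finditer(f"(?={motif})", tail):
            hits.append((motif, len(tail) - (m.start() + len(motif))))
    return sorted(hits, key=lambda h: h[1])


# Made-up sequence: one AATAAA placed 20 nt upstream of the 3' end.
utr = "GCGC" * 5 + "AATAAA" + "C" * 20
hits = find_pas(utr)
```

A scan like this finds candidate signals inside annotated 3' UTRs; it cannot by itself tell which candidates are actually used, which is where the SVM/HMM tools or experimental data come in.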

You may also find some knockdown RNA-seq experiments (CFIm25, etc.) aimed at identifying alternative polyadenylation (APA) useful. There are quite a few out there.

Hope this info helps.


Thanks Eric. I did look at this tool, https://omictools.com/polya-svm-tool, but just wasn't sure if it was the way to go. I will give it a try. Thanks again!


Glad it helps. We implemented a simple motif-finding version internally, and it would be great to hear your feedback on these tools.
