Hi, I'm working on a project with the pig genome. I'm looking for polyA site information and downloaded this GFF file from Ensembl: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/003/025/GCF_000003025.6_Sscrofa11.1/GCF_000003025.6_Sscrofa11.1_rna.gbff.gz. There are 'three_prime_UTR' regions in this GFF file (field 3). Is there an existing method or tool that can be used directly to get polyA site information from these 3' UTRs?
I did go through this post: "How to find polyA sites from gtf/gff?"
Any help is much appreciated. Thanks in advance.
What exactly do you want to do with the information? Is the first base after the 3'UTR not a "good-enough" proxy?
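To make the proxy idea concrete, here is a minimal Python sketch. It assumes a tab-separated GFF3 file in which each 'three_prime_UTR' record carries a `Parent=` attribute naming its transcript. The function name `polya_site_proxies` and the example records are hypothetical, not taken from your file. For each transcript it reports the first base downstream of the 3' UTR, taking strand into account.

```python
def polya_site_proxies(gff_lines):
    """Map transcript ID -> (seqid, position, strand), where position is the
    first base downstream of the transcript's 3' UTR (1-based coordinates)."""
    sites = {}
    for line in gff_lines:
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "three_prime_UTR":
            continue
        seqid, strand = cols[0], cols[6]
        start, end = int(cols[3]), int(cols[4])
        if strand not in ("+", "-"):
            continue  # an unstranded feature has no defined 3' end
        # Parse the attribute column (key=value pairs separated by ';').
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # Downstream means higher coordinates on '+', lower on '-'.
        pos = end + 1 if strand == "+" else start - 1
        for tx in attrs.get("Parent", "").split(","):
            if not tx:
                continue
            prev = sites.get(tx)
            # A UTR can be split across exons: keep the most downstream segment.
            if (prev is None
                    or (strand == "+" and pos > prev[1])
                    or (strand == "-" and pos < prev[1])):
                sites[tx] = (seqid, pos, strand)
    return sites


# Hypothetical records, for illustration only.
example = [
    "##gff-version 3",
    "\t".join(["1", "ensembl", "three_prime_UTR", "100", "200", ".", "+", ".", "Parent=transcript:T1"]),
    "\t".join(["1", "ensembl", "three_prime_UTR", "300", "350", ".", "+", ".", "Parent=transcript:T1"]),
    "\t".join(["2", "ensembl", "three_prime_UTR", "500", "600", ".", "-", ".", "Parent=transcript:T2"]),
    "\t".join(["2", "ensembl", "three_prime_UTR", "400", "450", ".", "-", ".", "Parent=transcript:T2"]),
]
print(polya_site_proxies(example))
```

For a real file you could pass the handle from `gzip.open(path, "rt")` instead of the list. Note that this gives exactly one position per annotated transcript, so alternative polyA sites only show up where the annotation contains separate transcripts with different 3' ends.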
Hi, I want to capture all polyA sites (including alternative polyA sites) for all transcripts in the pig genome. But it doesn't look like the Ensembl GFF file has multiple 'three_prime_UTR' records for a given transcript. I thought it would have them.