Where can I download the GTF2.2 format file for mm10?
9.3 years ago
Gjain 5.8k

Hi Everyone,

I am looking for a gene annotation file in the GTF2.2 format. Specifically, I need the 5UTR, CDS and 3UTR annotations for the genes.

The format I need is the one described on the GTF2.2 readme page (excerpt below):

140    Twinscan    3UTR          65149    65487    .    -    .    gene_id "140.000"; transcript_id "140.000.1";
140    Twinscan    CDS           71696    71807    .    -    0    gene_id "140.000"; transcript_id "140.000.1";
140    Twinscan    start_codon   73222    73222    .    -    2    gene_id "140.000"; transcript_id "140.000.1";
140    Twinscan    CDS           73222    73222    .    -    0    gene_id "140.000"; transcript_id "140.000.1";
140    Twinscan    5UTR          73223    73504    .    -    .    gene_id "140.000"; transcript_id "140.000.1";

I have downloaded the current GTF file from Ensembl: ftp://ftp.ensembl.org/pub/release-78/gtf/mus_musculus/

Can someone please point me in the right direction to get the above-mentioned features for the genes?

Thanks in advance.


The Ensembl GTFs are compatible with GTF2.2. If you need the 3UTR and 5UTR lines, you can generate them with either BioMart or the appropriate TxDb package in R and a bit of typing.
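
If you would rather derive the UTR lines directly from the exon and CDS records of a GTF, something like the following might work. It is only a minimal Python sketch, not a method from the GTF2.2 readme or from Ensembl: the function name `split_utrs` is made up, intervals are assumed to be 1-based and inclusive, and the exon coordinates in the demo are assumed for illustration (only the CDS and UTR coordinates come from the excerpt in the question). Stop codons, if supplied, are treated as coding so they do not end up in the 3UTR.

```python
def split_utrs(exons, cds, strand, stop_codon=()):
    """Split exon sequence outside the coding span into 5' and 3' UTR pieces.

    All intervals are 1-based inclusive (start, end) tuples on one transcript.
    Returns (five_prime, three_prime) as lists of intervals.
    """
    coding = list(cds) + list(stop_codon)
    if not coding:
        return [], []  # non-coding transcript: nothing to report

    coding_start = min(start for start, _ in coding)
    coding_end = max(end for _, end in coding)

    left, right = [], []  # exon pieces left/right of the coding span
    for start, end in sorted(exons):
        if start < coding_start:
            left.append((start, min(end, coding_start - 1)))
        if end > coding_end:
            right.append((max(start, coding_end + 1), end))

    # On the minus strand the transcript reads right to left,
    # so the right-hand pieces are the 5' UTR.
    return (left, right) if strand == "+" else (right, left)


# Demo: CDS taken from the excerpt in the question, exons assumed.
exons = [(65149, 65487), (71696, 71807), (73222, 73504)]
cds = [(71696, 71807), (73222, 73222)]
five_prime, three_prime = split_utrs(exons, cds, "-")
print("5UTR:", five_prime)
print("3UTR:", three_prime)
```

For the minus-strand transcript above, this gives the same 5UTR and 3UTR intervals as the readme excerpt.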


Thank you for your answer Devon.

The TxDb package is a good resource. I was not aware of it.

I had been looking in the wrong place in BioMart. After your suggestion, I searched again and found what I needed using this query:

  • Dataset: Mus musculus genes (GRCm38.p3)
  • Filters: None
  • Attributes: Ensembl Gene ID, Ensembl Transcript ID, 5' UTR Start, 5' UTR End, 3' UTR Start, 3' UTR End, Transcript Start (bp), Transcript End (bp), Strand, Associated Gene Name, Chromosome Name
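
In case it helps anyone else, here is a rough Python sketch of turning a tab-separated export of that query into GTF2.2-style 5UTR/3UTR lines. Treat it as an illustration under assumptions: the column headers in `COLS` are my guess at what the export header row looks like and may need adjusting, the strand column is assumed to hold 1 or -1, the `biomart` source label is arbitrary, and the IDs `G1`/`T1` in the demo are placeholders rather than real Ensembl identifiers.

```python
import csv
import io

# Assumed header names; edit these to match the header row of your export.
COLS = {
    "gene": "Ensembl Gene ID",
    "tx": "Ensembl Transcript ID",
    "chrom": "Chromosome Name",
    "strand": "Strand",
    "5UTR": ("5' UTR Start", "5' UTR End"),
    "3UTR": ("3' UTR Start", "3' UTR End"),
}


def biomart_to_gtf(handle, source="biomart"):
    """Yield 5UTR/3UTR GTF lines from a tab-separated table with a header row."""
    seen = set()  # skip duplicate UTR intervals repeated across rows
    for row in csv.DictReader(handle, delimiter="\t"):
        strand = "-" if row[COLS["strand"]].strip().startswith("-") else "+"
        for feature in ("5UTR", "3UTR"):
            start_col, end_col = COLS[feature]
            start = (row.get(start_col) or "").strip()
            end = (row.get(end_col) or "").strip()
            if not start or not end:
                continue  # this row carries no UTR of this type
            key = (row[COLS["tx"]], feature, start, end)
            if key in seen:
                continue
            seen.add(key)
            attrs = 'gene_id "%s"; transcript_id "%s";' % (
                row[COLS["gene"]], row[COLS["tx"]])
            yield "\t".join([row[COLS["chrom"]], source, feature,
                             start, end, ".", strand, ".", attrs])


# Demo with two made-up rows (placeholder IDs, coordinates from the question).
demo = (
    "Ensembl Gene ID\tEnsembl Transcript ID\t5' UTR Start\t5' UTR End\t"
    "3' UTR Start\t3' UTR End\tStrand\tChromosome Name\n"
    "G1\tT1\t73223\t73504\t\t\t-1\t1\n"
    "G1\tT1\t\t\t65149\t65487\t-1\t1\n"
)
gtf_lines = list(biomart_to_gtf(io.StringIO(demo)))
for line in gtf_lines:
    print(line)
```

For a real export, pass an open file handle instead of the `io.StringIO` demo. CDS and codon lines would still have to come from the Ensembl GTF itself.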
