Retrieving gene name from its localisation in a .gff3 annotation file.
8.4 years ago
user31888 ▴ 130

Rookie question. I have a tab-delimited .gff3 annotation file of human alternative splicing events obtained here that looks like this (showing first record only):

chr1    A3SS    gene    15796    16765    .    -    .    ID=chr1:6470:6628:-@chr1:5805|5810:5659:-;Name=chr1:6470:6628:-@chr1:5805|5810:5659:-;gid=chr1:6470:6628:-@chr1:5805|5810:5659:-

I am trying to annotate the genes (getting an ID, a name, etc.) based on the information contained in this annotation file, in order to know whether they are associated with certain disease states.

I thought about extracting the chromosomal location (columns 1, 4 and 5) and converting it to an ANNOVAR-style input file (chromosome, start, end, and placeholder alleles) that I could use with ANNOVAR or a similar program:

1   15796   16765   0   0
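The conversion I have in mind could be sketched roughly like this in Python. This is only a minimal sketch: the function name `gff3_to_annovar_line` is made up for illustration, and it assumes that the `chr` prefix should be dropped and that `0` is acceptable as a placeholder for both alleles.

```python
def gff3_to_annovar_line(gff3_line):
    """Turn one GFF3 record into a 'chrom start end 0 0' line (hypothetical helper)."""
    fields = gff3_line.rstrip("\n").split("\t")
    # GFF3 columns: 1 = seqid, 4 = start, 5 = end (1-based column numbers)
    chrom, start, end = fields[0], fields[3], fields[4]
    # Assumption: the downstream tool wants "1" rather than "chr1"
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    # Assumption: "0" is used as a placeholder for the two allele columns
    return "\t".join([chrom, start, end, "0", "0"])


record = (
    "chr1\tA3SS\tgene\t15796\t16765\t.\t-\t.\t"
    "ID=chr1:6470:6628:-@chr1:5805|5810:5659:-"
)
print(gff3_to_annovar_line(record))
```

Applied to the record above, this keeps only the chromosome and the start/end coordinates from columns 4 and 5; the ID attribute is ignored entirely.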

However, I am not sure of the meaning of the event ID (i.e. ID=chr1:6470:6628:-@chr1:5805|5810:5659:-).
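To at least look at the structure of the ID, it can be split on its delimiters. The sketch below is purely mechanical: it assumes (without confirmation) that `@` separates two parts, `:` separates four fields within each part, and `|` separates alternative values. The function name `split_event_id` is made up, and the code says nothing about what the numbers actually mean.

```python
def split_event_id(attributes):
    """Split the ID attribute of column 9 into its delimited pieces (hypothetical helper)."""
    # Column 9 is a ';'-separated list of key=value pairs
    attrs = dict(pair.split("=", 1) for pair in attributes.split(";"))
    parts = []
    # Assumption: '@' separates the parts of the event
    for part in attrs["ID"].split("@"):
        # Assumption: each part has exactly four ':'-separated fields
        chrom, first, second, strand = part.split(":")
        # Assumption: '|' separates alternative values within a field
        parts.append((chrom, first.split("|"), second.split("|"), strand))
    return parts


column9 = (
    "ID=chr1:6470:6628:-@chr1:5805|5810:5659:-;"
    "Name=chr1:6470:6628:-@chr1:5805|5810:5659:-;"
    "gid=chr1:6470:6628:-@chr1:5805|5810:5659:-"
)
for part in split_event_id(column9):
    print(part)
```

Note that the numbers inside the ID do not match the start and end in columns 4 and 5 of the same record, which is part of why I am unsure how to read them.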

(a) Could someone explain the format of the ID?

(b) Are the yellow parts the chromosome localisation and could it be used to retrieve the gene name and other info?

(c) Is there a more straightforward way of annotating the gene based on this localisation?

gene UCSC annotation • 1.7k views