Can I get the same annotation data that is in Ensembl gene annotation sets through MySQL?
5.0 years ago
endrebak ▴ 960

Here is some sample data from zebrafish, release 95:

#!genome-build GRCz11
#!genome-version GRCz11
#!genome-date 2017-05
#!genome-build-accession NCBI:GCA_000002035.4
#!genebuild-last-updated 2018-04
4   ensembl gene    17308   18211   .   -   .   gene_id "ENSDARG00000102141"; gene_version "2"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding";
4   ensembl transcript  17308   18211   .   -   .   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding";
4   ensembl exon    18134   18211   .   -   .   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "1"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; exon_id "ENSDARE00001173708"; exon_version "2";
4   ensembl CDS 18134   18211   .   -   0   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "1"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; protein_id "ENSDARP00000130978"; protein_version "1";
4   ensembl exon    17948   18046   .   -   .   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "2"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; exon_id "ENSDARE00001162488"; exon_version "1";
4   ensembl CDS 17948   18046   .   -   0   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "2"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; protein_id "ENSDARP00000130978"; protein_version "1";
4   ensembl exon    17681   17772   .   -   .   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "3"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; exon_id "ENSDARE00001173438"; exon_version "1";
4   ensembl CDS 17681   17772   .   -   0   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "3"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; protein_id "ENSDARP00000130978"; protein_version "1";
4   ensembl exon    17308   17548   .   -   .   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "4"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; exon_id "ENSDARE00001189332"; exon_version "2";
4   ensembl CDS 17308   17548   .   -   1   gene_id "ENSDARG00000102141"; gene_version "2"; transcript_id "ENSDART00000171737"; transcript_version "2"; exon_number "4"; gene_name "ptpn12"; gene_source "ensembl"; gene_biotype "protein_coding"; transcript_name "ptpn12-201"; transcript_source "ensembl"; transcript_biotype "protein_coding"; protein_id "ENSDARP00000130978"; protein_version "1";

Is there an easy way to get the same data through MySQL, or is FTP the way to go? I do not care about the header.

ensembl gtf • 1.1k views
5.0 years ago
Emily 23k

The data is all stored in MySQL tables, just not in exactly that format. You would need to join several tables to get what you need, but it can be done. If the FTP files already provide just what you need, though, why bother?
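To give a rough idea of what "linking tables together" means for the `gene` lines, here is a minimal sketch. It assumes the Ensembl core schema has `gene`, `seq_region` and `xref` tables with the columns used in the query, and that the GTF `gene_name` comes from the gene's display xref; please check both against the schema documentation for your release. The helper names (`GENE_QUERY`, `gene_row_to_gtf`, `fetch_gene_lines`, `demo_connection`) are made up for illustration, and the in-memory SQLite database is only a stand-in so the sketch runs without network access. Transcript, exon and CDS lines would need further joins that are not shown.

```python
import sqlite3  # used only for the offline stand-in database below

# Assumed Ensembl core tables/columns; verify against your release's schema.
GENE_QUERY = """
SELECT sr.name, g.source, g.seq_region_start, g.seq_region_end,
       g.seq_region_strand, g.stable_id, g.version, x.display_label, g.biotype
FROM gene g
JOIN seq_region sr ON sr.seq_region_id = g.seq_region_id
LEFT JOIN xref x ON x.xref_id = g.display_xref_id
ORDER BY sr.name, g.seq_region_start
"""


def gene_row_to_gtf(row):
    """Format one result row of GENE_QUERY as a tab-separated GTF gene line."""
    seqname, source, start, end, strand, stable_id, version, name, biotype = row
    attrs = [("gene_id", stable_id), ("gene_version", version)]
    if name is not None:  # genes without a display xref get no gene_name
        attrs.append(("gene_name", name))
    attrs += [("gene_source", source), ("gene_biotype", biotype)]
    attr_text = " ".join(f'{key} "{value}";' for key, value in attrs)
    strand_symbol = "+" if strand == 1 else "-"  # schema stores strand as 1 / -1
    return "\t".join(
        [seqname, source, "gene", str(start), str(end), ".", strand_symbol, ".", attr_text]
    )


def fetch_gene_lines(conn):
    """Run GENE_QUERY on any DB-API 2.0 connection and return GTF lines."""
    cur = conn.cursor()
    cur.execute(GENE_QUERY)
    return [gene_row_to_gtf(row) for row in cur.fetchall()]


def demo_connection():
    """Build a tiny in-memory stand-in holding the ptpn12 gene from the question.

    The internal numeric ids (1, 10, 100) are invented for the demo.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE seq_region (seq_region_id INTEGER, name TEXT);
        CREATE TABLE xref (xref_id INTEGER, display_label TEXT);
        CREATE TABLE gene (
            gene_id INTEGER, seq_region_id INTEGER, seq_region_start INTEGER,
            seq_region_end INTEGER, seq_region_strand INTEGER,
            display_xref_id INTEGER, source TEXT, biotype TEXT,
            stable_id TEXT, version INTEGER
        );
        INSERT INTO seq_region VALUES (1, '4');
        INSERT INTO xref VALUES (10, 'ptpn12');
        INSERT INTO gene VALUES (100, 1, 17308, 18211, -1, 10, 'ensembl',
                                 'protein_coding', 'ENSDARG00000102141', 2);
        """
    )
    return conn


if __name__ == "__main__":
    for line in fetch_gene_lines(demo_connection()):
        print(line)
```

Against the real data you would pass in a connection from a MySQL client library instead of `demo_connection()`. As far as I know the public server is `ensembldb.ensembl.org` with user `anonymous`, and the zebrafish release 95 core database is named along the lines of `danio_rerio_core_95_11`, but treat those as assumptions to verify.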


Thanks. The reason is that MySQL is a safer protocol than FTP. But I guess FTP is the way to go :)


Note also that the Perl API makes SQL queries under the hood, so you could write a Perl script to get the same data via database queries without having to write the SQL yourself.
