featureCounts problem with annotation file
5.8 years ago
hsu ▴ 40

I downloaded the Saccharomyces cerevisiae (yeast) genome and annotation from Ensembl (R64-1-1).

The command I ran:

featureCounts ./Saccharomyces_cerevisiae/Ensembl/R64-1-1/Annotation/Archives/archive-2015-07-17-14-36-40/Genes/genes.gtf -o genecounts_888 -t gene -p -g Name .exprnasamout/SRX3084888.bam

It fails with this error:

Failed to open the annotation file /datc/wangjc/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Annotation/Archives/archive-2015-07-17-14-36-40/Genes/genes.gtf, or its format is incorrect, or it contains no 'gene' features.

Are there other sources for the Saccharomyces cerevisiae (yeast) genome and annotation?

genome • 2.1k views

Please post the output of head -n 20 genes.gtf. Also, the annotation file needs to be passed with -a (at least in the most recent version).

    I       ensembl start_codon     538     540     .       +       0       exon_number "1"; gene_biotype "protein_coding"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; gene_source "ensembl"; gene_version "1"; p_id "P5379"; transcript_biotype "protein_coding"; transcript_id "YAL068W-A"; transcript_source "ensembl"; transcript_version "1"; tss_id "TSS5441";
    I       ensembl transcript      538     792     .       +       .       gene_biotype "protein_coding"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; gene_source "ensembl"; gene_version "1"; p_id "P5379"; transcript_biotype "protein_coding"; transcript_id "YAL068W-A"; transcript_source "ensembl"; transcript_version "1"; tss_id "TSS5441";
    I       ensembl stop_codon      647     649     .       +       0       exon_number "1"; gene_biotype "protein_coding"; gene_id "YAL069W"; gene_name "YAL069W"; gene_source "ensembl"; gene_version "1"; p_id "P3634"; transcript_biotype "protein_coding"; transcript_id "YAL069W"; transcript_source "ensembl"; transcript_version "1"; tss_id "TSS1129";

Ok, looks like a normal file. Use:

featureCounts -a genes.gtf -t 'exon' -o countMatrix.txt input.bam
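
The command above omits -g, so featureCounts falls back to its default grouping attribute, gene_id. The original command asked for -g Name, but the records posted earlier only show keys such as gene_id and gene_name. A minimal sketch for checking which attribute keys a GTF actually uses, assuming a tab-delimited file; the function name list_attribute_keys is made up for illustration:

```shell
# Print the attribute keys found in column 9 of the first non-comment
# record of a tab-delimited GTF file, one key per line.
# list_attribute_keys is a hypothetical helper, not part of featureCounts.
list_attribute_keys() {
    awk -F '\t' '
        /^#/ { next }                         # skip comment lines
        {
            n = split($9, attrs, ";")         # attributes are separated by ";"
            for (i = 1; i <= n; i++) {
                sub(/^ +/, "", attrs[i])      # trim leading spaces
                if (attrs[i] != "") {
                    split(attrs[i], kv, " ")  # the key is the first word
                    print kv[1]
                }
            }
            exit                              # one record is enough
        }
    ' "$1"
}
```

Running list_attribute_keys genes.gtf on the file from this thread should list gene_id and gene_name among the keys, so either of those could be passed to -g.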
5.8 years ago
michael.ante ★ 3.8k

Usually you count at the exon feature level, and the counts are then grouped at the gene level. The gene feature type that you are specifying with -t gene is not a standard feature (see here) and may not be included in your file.

Nevertheless, you'll get a table with the read count per gene if you leave the -t parameter at its default value.
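
Whether -t gene can work depends on whether the GTF has any rows with gene in its third column, which is what the error message in the question is complaining about. A minimal sketch for tallying the feature types in a file, assuming a tab-delimited GTF; list_feature_types is a hypothetical helper name:

```shell
# Count how often each feature type (GTF column 3) occurs in a
# tab-delimited annotation file, skipping comment lines.
# list_feature_types is a hypothetical helper, not part of featureCounts.
list_feature_types() {
    awk -F '\t' '
        /^#/ { next }
        $3 != "" { count[$3]++ }
        END { for (t in count) print count[t], t }
    ' "$1" | sort -k2
}
```

If list_feature_types genes.gtf shows exon rows but no gene rows, -t gene has nothing to count, while the default -t exon does.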


Without -t gene I got these two files: genecounts_888 and genecounts_888.summary.

Is this correct?


Yes, that is correct: 'exon' is the default, and it is the right choice for standard RNA-seq.


Thank you very much


If michael.ante's answer solved your problem, please consider marking it as accepted to help others in the future.
