FeatureCounts Not Reading GTF File Correctly
7.9 years ago
dec986 ▴ 370

Hello,

I am writing my own GTF file, and featureCounts processes only some of its lines and not others. I have no idea why; as far as I can tell, the lines are formatted identically.

Why does featureCounts recognize lines like

chrM    ENSEMBL gene    15356   15422   .   -   .   gene_id "ENSMUSG00000064372.1"; gene_type "Mt_tRNA"; gene_status "KNOWN"; gene_name "mt-Tp"; level 3;
chrM    ENSEMBL transcript  15356   15422   .   -   .   gene_id "ENSMUSG00000064372.1"; transcript_id "ENSMUST00000082423.1"; gene_type "Mt_tRNA"; gene_status "KNOWN"; gene_name "mt-Tp"; transcript_type "Mt_tRNA"; transcript_status "KNOWN"; transcript_name "mt-Tp-201"; level 3; tag "basic"; transcript_support_level "NA";
chrM    ENSEMBL exon    15356   15422   .   -   .   gene_id "ENSMUSG00000064372.1"; transcript_id "ENSMUST00000082423.1"; gene_type "Mt_tRNA"; gene_status "KNOWN"; gene_name "mt-Tp"; transcript_type "Mt_tRNA"; transcript_status "KNOWN"; transcript_name "mt-Tp-201"; exon_number 1; exon_id "ENSMUSE00000521550.1"; level 3; tag "basic"; transcript_support_level "NA";

but not my added lines, written like this?

chr1    ENSEMBL transcript  13139159    13142763    .   -   .   gene_id "Unknown7"; transcript_id "Unknown7"; gene_type "TEC"; gene_status "PUTATIVE"; gene_name "Unknown7"; transcript_type "TEC"; transcript_status "PUTATIVE"; transcript_name "Unknown7"; exon_number 1; exon_id "Unknown7"; level 3;
chr1    ENSEMBL transcript  13139159    13142763    .   -   .   gene_id "Unknown8"; transcript_id "Unknown8"; gene_type "TEC"; gene_status "PUTATIVE"; gene_name "Unknown8"; transcript_type "TEC"; transcript_status "PUTATIVE"; transcript_name "Unknown8"; exon_number 1; exon_id "Unknown8"; level 3;
chr1    ENSEMBL transcript  13139159    13142763    .   -   .   gene_id "Unknown9"; transcript_id "Unknown9"; gene_type "TEC"; gene_status "PUTATIVE"; gene_name "Unknown9"; transcript_type "TEC"; transcript_status "PUTATIVE"; transcript_name "Unknown9"; exon_number 1; exon_id "Unknown9"; level 3;

I realize this is a tedious question... but I've spent hours on this and I can't see the problem :(

featureCounts RNA-Seq • 5.0k views

Could you add the command you use for featureCounts with these GTF files?


The command I use is:

featureCounts -g transcript_id -a ~/GENE_DATA/mm10/embryo_novel_transcripts_only.gtf -o transcript_id_featureCount.tsv sorted.bam

By default, featureCounts counts only features of type 'exon' (GTF column 3), and your added lines are all of type 'transcript'. You have to specify the -t flag; in your case:

-t transcript
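
Combined with the command posted earlier in this thread, the full call would presumably look like the sketch below. The paths and file names are copied from that comment. Keeping -g transcript_id is an assumption carried over from the posted command (without it, featureCounts groups by gene_id), and the guard around the call and the `status` variable are additions for illustration only.

```shell
# Sketch only: the command from this thread with -t transcript added.
# Paths and file names are copied from the earlier comment.
annotation="$HOME/GENE_DATA/mm10/embryo_novel_transcripts_only.gtf"
feature_type="transcript"      # -t: matched against GTF column 3 (default: exon)
group_attr="transcript_id"     # -g: attribute in GTF column 9 (default: gene_id)

# Guard (an addition for illustration): run only when the tool and inputs exist.
if command -v featureCounts >/dev/null 2>&1 && [ -f "$annotation" ] && [ -f sorted.bam ]; then
    featureCounts -t "$feature_type" -g "$group_attr" \
        -a "$annotation" -o transcript_id_featureCount.tsv sorted.bam
    status="ran"
else
    echo "featureCounts or input files not found; nothing was run" >&2
    status="skipped"
fi
```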

Yes! Using -t instead of -g solves my problem.

7.9 years ago
dec986 ▴ 370

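The key here is that -t selects the feature type to count, which is matched against the 3rd column of a GENCODE GTF (default: exon).

Also, -g selects the attribute in the 9th column that is used to group features (default: gene_id). The two options do different jobs, so one does not replace the other; in my case it was -t that needed changing.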
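
A quick way to see what each option would match is to tally column 3 and pull one attribute out of column 9 before running featureCounts. This is a minimal sketch with standard Unix tools; the two-line sample file is a shortened, hypothetical stand-in for a real annotation, and the variable names are made up for the example.

```shell
# Hypothetical two-line sample standing in for a real annotation file.
gtf=$(mktemp)
printf 'chrM\tENSEMBL\texon\t15356\t15422\t.\t-\t.\tgene_id "ENSMUSG00000064372.1"; transcript_id "ENSMUST00000082423.1";\n' > "$gtf"
printf 'chr1\tENSEMBL\ttranscript\t13139159\t13142763\t.\t-\t.\tgene_id "Unknown7"; transcript_id "Unknown7";\n' >> "$gtf"

# Column 3: the feature types that -t is compared with, and how many lines carry each.
feature_types=$(grep -v '^#' "$gtf" | cut -f3 | sort | uniq -c | awk '{print $2 "=" $1}' | paste -sd' ' -)
echo "$feature_types"

# Column 9: the values of the attribute that -g transcript_id would group by.
transcript_ids=$(grep -v '^#' "$gtf" | cut -f9 | sed -n 's/.*transcript_id "\([^"]*\)".*/\1/p' | paste -sd' ' -)
echo "$transcript_ids"

rm -f "$gtf"
```

If the tally shows only 'transcript' lines for the added entries, as in my file, the default -t exon has nothing to count for them.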
