Why does Rsubread featureCounts with the Ensembl GRCm38.96 GTF pick ensembl_exon_id instead of gene_id?
4.9 years ago
akh22 ▴ 110

I ran the following:

featureCounts(data.bam, isPairedEnd = TRUE, isGTFAnnotationFile = TRUE, annot.ext = "Mus_musculus.GRCm38.96.gtf", GTF.attrType = "gene_id")

Even though I set GTF.attrType to "gene_id", the GeneID column in the featureCounts output shows exon IDs ("ensembl_exon_id") instead of gene IDs ("ensembl_gene_id"). Is there any way to tell featureCounts to use "ensembl_gene_id"?

Thanks in advance.

RNA-Seq R assembly gene • 3.5k views
4.9 years ago
akh22 ▴ 110

I sort of figured this out. I ran the same script on another Mac mini of mine running Mojave with exactly the same versions of RStudio and R (and Bioconductor packages), and it worked as expected. I noticed RStudio was acting wacky on the first machine, so I deleted and reinstalled it. After that, the featureCounts script produced gene_id rather than exon_id as mentioned previously.

Anyway, thanks for all the help that pointed me in the right direction.

4.9 years ago
AB ▴ 360

Probably because your GTF file doesn't have a "gene_id" attribute. Try

featureCounts(data.bam, isPairedEnd = TRUE, isGTFAnnotationFile = TRUE, annot.ext = "Mus_musculus.GRCm38.96.gtf", GTF.attrType = "ensembl_gene_id")
ADD COMMENT
0

Ensembl GTF annotations do include a "gene_id" attribute. So, unless akh22 edited the Ensembl GTF annotation somehow, or downloaded it from another source, the issue must be elsewhere.

zcat Mus_musculus.GRCm38.96.gtf.gz | head
#!genome-build GRCm38.p6
#!genome-version GRCm38
#!genome-date 2012-01
#!genome-build-accession NCBI:GCA_000001635.8
#!genebuild-last-updated 2018-11
1 havana  gene    3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; gene_version "1"; gene_name "4933401J01Rik"; gene_source "havana"; gene_biotype "TEC";
1 havana  transcript  3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; gene_version "1"; transcript_id "ENSMUST00000193812"; transcript_version "1"; gene_name "4933401J01Rik"; gene_source "havana"; gene_biotype "TEC"; transcript_name "4933401J01Rik-201"; transcript_source "havana"; transcript_biotype "TEC"; tag "basic"; transcript_support_level "NA";
1 havana  exon    3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; gene_version "1"; transcript_id "ENSMUST00000193812"; transcript_version "1"; exon_number "1"; gene_name "4933401J01Rik"; gene_source "havana"; gene_biotype "TEC"; transcript_name "4933401J01Rik-201"; transcript_source "havana"; transcript_biotype "TEC"; exon_id "ENSMUSE00001343744"; exon_version "1"; tag "basic"; transcript_support_level "NA";
1 ensembl gene    3102016 3102125 .   +   .   gene_id "ENSMUSG00000064842"; gene_version "1"; gene_name "Gm26206"; gene_source "ensembl"; gene_biotype "snRNA";
1 ensembl transcript  3102016 3102125 .   +   .   gene_id "ENSMUSG00000064842"; gene_version "1"; transcript_id "ENSMUST00000082908"; transcript_version "1"; gene_name "Gm26206"; gene_source "ensembl"; gene_biotype "snRNA"; transcript_name "Gm26206-201"; transcript_source "ensembl"; transcript_biotype "snRNA"; tag "basic"; transcript_support_level "NA";
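
To check the attribute column programmatically rather than by eye, something like the following base R sketch could be used. The helper name get_gtf_attr is hypothetical (it is not part of Rsubread), and the two sample lines are taken from the excerpt above with the attribute list abbreviated.

```r
# Two lines from the GTF excerpt above, attribute column abbreviated.
gtf_lines <- c(
  paste("1", "havana", "exon", "3073253", "3074322", ".", "+", ".",
        'gene_id "ENSMUSG00000102693"; transcript_id "ENSMUST00000193812"; exon_id "ENSMUSE00001343744";',
        sep = "\t"),
  paste("1", "ensembl", "gene", "3102016", "3102125", ".", "+", ".",
        'gene_id "ENSMUSG00000064842"; gene_name "Gm26206";',
        sep = "\t")
)

# Hypothetical helper: return the value of one attribute for each GTF line,
# or NA when the line does not carry that attribute.
get_gtf_attr <- function(lines, attr) {
  # Column 9 of a GTF line holds the semicolon-separated attribute list.
  attrs <- vapply(strsplit(lines, "\t", fixed = TRUE),
                  function(f) f[9], character(1))
  # Match the attribute name only at the start or right after "; ".
  pattern <- paste0('(^|; )', attr, ' "([^"]*)"')
  hits <- regmatches(attrs, regexec(pattern, attrs))
  vapply(hits, function(h) if (length(h) == 3) h[3] else NA_character_,
         character(1))
}

print(get_gtf_attr(gtf_lines, "gene_id"))
print(get_gtf_attr(gtf_lines, "exon_id"))
```

On a real file, the lines could be read with readLines() after dropping the "#!" header lines. If gene_id comes back non-NA for every exon line, the annotation itself is unlikely to be the cause.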
  
ADD REPLY
0

No, I did not edit the GTF at all. I just downloaded it directly from the Ensembl FTP.

#!genome-build GRCm38.p6
#!genome-version GRCm38
#!genome-date 2012-01
#!genome-build-accession NCBI:GCA_000001635.8
#!genebuild-last-updated 2018-11
1   havana  gene    3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; gene_version "1"; gene_name "4933401J01Rik"; gene_source "havana"; gene_biotype "TEC";
1   havana  transcript  3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; gene_version "1"; transcript_id "ENSMUST00000193812"; transcript_version "1"; gene_name "4933401J01Rik"; gene_source "havana"; gene_biotype "TEC"; transcript_name "4933401J01Rik-201"; transcript_source "havana"; transcript_biotype "TEC"; tag "basic"; transcript_support_level "NA";
1   havana  exon    3073253 3074322 .   +   .   gene_id "ENSMUSG00000102693"; gene_version "1"; transcript_id "ENSMUST00000193812"; transcript_version "1"; exon_number "1"; gene_name "4933401J01Rik"; gene_source "havana"; gene_biotype "TEC"; transcript_name "4933401J01Rik-201"; transcript_source "havana"; transcript_biotype "TEC"; exon_id "ENSMUSE00001343744"; exon_version "1"; tag "basic"; transcript_support_level "NA";
1   ensembl gene    3102016 3102125 .   +   .   gene_id "ENSMUSG00000064842"; gene_version "1"; gene_name "Gm26206"; gene_source "ensembl"; gene_biotype "snRNA";
1   ensembl transcript  3102016 3102125 .   +   .   gene_id "ENSMUSG00000064842"; gene_version "1"; transcript_id "ENSMUST00000082908"; transcript_version "1"; gene_name "Gm26206"; gene_source "ensembl"; gene_biotype "snRNA"; transcript_name "Gm26206-201"; transcript_source "ensembl"; transcript_biotype "snRNA"; tag "basic"; transcript_support_level "NA";
1   ensembl exon    3102016 3102125 .   +   .   gene_id "ENSMUSG00000064842"; gene_version "1"; transcript_id "ENSMUST00000082908"; transcript_version "1"; exon_number "1"; gene_name "Gm26206"; gene_source "ensembl"; gene_biotype "snRNA"; transcript_name "Gm26206-201"; transcript_source "ensembl"; transcript_biotype "snRNA"; exon_id "ENSMUSE00000522066"; exon_version "1"; tag "basic"; transcript_support_level "NA";
1   ensembl_havana  gene    3205901 3671498 .   -   .   gene_id "ENSMUSG00000051951"; gene_version "5"; gene_name "Xkr4"; gene_source "ensembl_havana"; gene_biotype "protein_coding";
ADD REPLY
0

Can you show a snippet of the featureCounts output?

counts <- featureCounts(data.bam, isPairedEnd = TRUE, isGTFAnnotationFile = TRUE, annot.ext = "Mus_musculus.GRCm38.96.gtf", GTF.attrType = "gene_id")
head(counts, n = 10)
ADD REPLY
0

Here we go:

$counts
                   Skin.25.IT.1.21.19.bam Skin.26.IT.1.21.19.bam Skin.27.IT.1.21.19.bam Skin.28.IT.1.21.19.bam Skin.29.IT.1.21.19.bam
ENSMUSE00001343744                      0                      0                      0                      0                      0
ENSMUSE00000522066                      0                      0                      0                      0                      0
ENSMUSE00000866652                      0                      1                      1                      0                      0
ENSMUSE00000858910                      0                      0                      0                      0                      0
ENSMUSE00000867897                      0                      0                      0                      0                      0
ENSMUSE00000863980                      0                      0                      0                      1                      0
ENSMUSE00000448840                      1                      0                      1                      1                      0
ENSMUSE00000449517                      1                      2                      2                      0                      1
ENSMUSE00000485541                      6                      4                      3                      1                      8
ENSMUSE00001339323                      1                      0                      0                      0                      0
ENSMUSE00001343189                      0                      0                      0                      1                      0
ENSMUSE00001343686                      0                      0                      0                      0                      0
ENSMUSE00001337180                      0                      0                      0                      0                      0
ENSMUSE00000869502                      0                      0                      0                      0                      0
ENSMUSE00000864479                      0                      0                      0                      0                      0
ENSMUSE00001345667                      0                      0                      0                      0                      0
ENSMUSE00001343235                      0                      0                      2                      0                      0
ENSMUSE00001343966                      0                      1                      1                      0                      0
ENSMUSE00001339227                     33                     24                     23                     15                     29
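
The row names above start with ENSMUSE, the Ensembl prefix for mouse exon IDs, whereas gene IDs start with ENSMUSG. A quick way to confirm which kind of ID a run produced is sketched below in base R. The helper name classify_ensembl_id is hypothetical, and the IDs are copied from the output above.

```r
# Row names copied from the first rows of the count matrix above.
ids <- c("ENSMUSE00001343744", "ENSMUSE00000522066", "ENSMUSE00000866652")

# Hypothetical helper: map the 7-character mouse Ensembl prefix to a feature
# type. IDs with any other prefix come back as NA.
classify_ensembl_id <- function(ids) {
  kinds <- c(ENSMUSG = "gene", ENSMUST = "transcript", ENSMUSE = "exon")
  unname(kinds[substr(ids, 1, 7)])
}

# Tabulate how many IDs fall into each feature type.
print(table(classify_ensembl_id(ids), useNA = "ifany"))
```

With the real result object, the same check would presumably be run on rownames(counts$counts). A table containing only "gene" would indicate that GTF.attrType = "gene_id" was honoured.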