geneCounts from sorted bam all 0?
1
1
Entering edit mode
5.1 years ago

I appreciate any advice you could give! I'm using R version 3.4.4, platform 64-bit Linux-gnu.

I've aligned RNAseq reads to a reference genome using QuasR and created a gene model from a GFF file:

qAlign("sampleTable.txt","refGenome.fa,splicedAlignment=TRUE)

library(GenomicFeatures)

geneModel<-("path/geneModel.gff3.gz","gff3")
genes<-genes(geneModel)

I've confirmed my sorted bam file has alignments, but when I try to obtain gene counts using GenomicAlignments summarizeOverlaps() all counts are listed as 0 in my final matrix.

library(GenomicAlignments)

sortBam("alignment.bam","alignment.bam.bai",destination="Sorted") 
indexBam("Sorted.bam")
myBam<-BamFile("Sorted.bam",yieldSize=10000)
geneCounts<-summarizeOverlaps(genes,myBam,singleEnd=FALSE)

myGeneGR<-rowRanges(geneCounts)
myGeneCounts<-genecounts[genes$gene_id==myGeneGR$gene_id,]
library(SummarizedExperiments)
myGeneCounts_Matrix<-assay(myGeneCounts)
RNA-Seq R Bioconductor GenomicFeatures • 1.2k views
ADD COMMENT
1
Entering edit mode
5.1 years ago

The first obvious thing to check; are you sure that your gff and your reference genome came from the same place?

Specifically, do the chromosome names match between your reference and your gff?

ADD COMMENT
0
Entering edit mode

Yuhp, they are both of the same organism from the same assembly. Since the organism is a bacteria and there is only 1 chromosome the gene_id is the common identifier between files.

ADD REPLY
0
Entering edit mode

The gene id is in common between your gff and your reference?? Not the chromosome name?

ADD REPLY
0
Entering edit mode

If I understand what you're asking, it's the gene_id that's common. The gff lists

Chromosome     name     feature     rank     position     strand     ID

and the reference genome for the single chromosome lists

>Assembly_species_complete_genome
TTAAGGGCCC

After aligned, sorted bam file has gene_ids and so does the gene model made from the gff with makeTxDbFromGFF.

ADD REPLY
0
Entering edit mode

So your reference does not have gene_IDs? The entry under chromosome in your gff should be "Assembly_species_complete_genome". Is that what it is?

ADD REPLY
0
Entering edit mode

Yuhp, that's correct- the reference doesn't have a gene-id. There isn't a Chromosome heading in the gff file, that is just what each line starts with.

ADD REPLY

Login before adding your answer.

Traffic: 1952 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6