Hi all,
I am trying to normalize my read counts for differential gene expression analysis with edgeR.
I have a set of 21 BAM files from aligning my reads to a genome, corresponding to 3 replicates at each of my 7 time points.
I would like to do DGE using edgeR, but first I need to normalize for RNA abundance between replicates.
I was told I might be able to use RSEM or edgeR to produce a normalized count matrix. The issue is that my reads were generated using the QuantSeq library prep kit, so only one fragment is produced per transcript (and therefore the read count should be a direct reflection of the number of transcripts). For this reason QuantSeq recommends using HTSeq to produce a count matrix.
Is there a way to produce a count matrix with HTSeq and then normalise across the replicates, without interfering with the fact that the read counts should be a direct reflection of the transcript counts? Can edgeR normalise the count matrix?
I think I have to avoid using FPKM (part of RSEM?) but I am not sure if it is appropriate to use RPKM, TMM, upper quartile etc. I don't know much about these kinds of normalization other than that they exist.
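To make the distinction between these measures concrete, here is a rough Python sketch on made-up numbers (not real data). It compares plain library-size scaling (counts per million) with RPKM, which additionally divides by gene length. The function names `cpm` and `rpkm`, the toy counts, and the toy gene lengths are all hypothetical and only for illustration. It does not reproduce TMM or upper-quartile normalization, which edgeR computes itself via `calcNormFactors`.

```python
# Toy illustration only: pure-Python CPM vs RPKM on an invented count matrix.
# Rows are genes, columns are samples. All numbers are assumptions.

def cpm(counts):
    """Scale each sample (column) by its library size, per million reads."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
            for row in counts]

def rpkm(counts, gene_lengths_bp):
    """CPM further divided by gene length in kilobases."""
    return [[value / (length / 1000) for value in row]
            for row, length in zip(cpm(counts), gene_lengths_bp)]

# Sample 2 was sequenced exactly twice as deeply as sample 1.
toy_counts = [
    [100, 200],   # gene A
    [300, 600],   # gene B
    [600, 1200],  # gene C
]
toy_lengths_bp = [1000, 2000, 4000]  # hypothetical gene lengths

toy_cpm = cpm(toy_counts)
toy_rpkm = rpkm(toy_counts, toy_lengths_bp)

for name, c_row, r_row in zip("ABC", toy_cpm, toy_rpkm):
    print(name, c_row, r_row)
```

In this toy case CPM removes the sequencing-depth difference between the two samples without touching the relative ranking of genes. RPKM also shrinks the values for longer genes, which is the step that looks questionable if each transcript yields a single fragment regardless of its length.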
I was trying to work it out with RSEM, but it doesn't seem to accept my BAM files, as they were produced by aligning to a genome rather than a transcriptome.
Thanks, Chloe
Awesome, thanks! I'll give this a go.