Hi all,
I conducted an RNA-seq experiment and have mapped my reads to the ORFs of the reference organisms (bacteria). I would now like to generate a count table. Looking through the available tools (e.g., htseq-count, summarizeOverlaps, VERSE), it appears they typically require a GTF or GFF file. However, since I mapped to ORFs, my reads are already annotated and assigned to a gene.
I'm wondering:
-- Are there any issues with mapping to ORFs? I'm working with bacteria, so I don't think I need to worry about overlapping genes or exons shared between multiple transcripts (as you would with eukaryotes).
-- Are there any tools that can handle BAM files that are already "annotated" because they were mapped to ORFs?
-- Is there any reason why I should not parse the SAM files and make a count table using Python?
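To make the last question concrete, here is a rough sketch of the kind of parsing I have in mind. It is only an illustration under several assumptions: the input is uncompressed SAM text, the reads are single-end (paired-end mates would each be counted separately), and the ORF ID is whatever appears in the reference name (RNAME) column. The function name count_reads_per_orf and the min_mapq parameter are my own placeholders, not part of any existing tool.

```python
from collections import Counter

# SAM FLAG bits (from the SAM specification)
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_reads_per_orf(sam_lines, min_mapq=0):
    """Count primary alignments per reference sequence (here: per ORF).

    sam_lines: any iterable of SAM text lines, e.g. an open file handle.
    min_mapq:  hypothetical filter; alignments with a lower MAPQ are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (@HD, @SQ, @PG, ...)
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname = fields[2]
        mapq = int(fields[4])
        # Keep only mapped, primary alignments so each read is counted once
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        if rname == "*" or mapq < min_mapq:
            continue
        counts[rname] += 1
    return counts


# Tiny made-up example: two ORFs, three mapped reads, one unmapped read,
# and one secondary alignment that should not be counted.
example = [
    "@SQ\tSN:orfA\tLN:900",
    "@SQ\tSN:orfB\tLN:1200",
    "r1\t0\torfA\t10\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\torfA\t55\t30\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\torfB\t7\t5\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t256\torfA\t80\t0\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

for orf, n in sorted(count_reads_per_orf(example).items()):
    print(orf, n, sep="\t")
```

For a real file I would call it as count_reads_per_orf(open("sample.sam")), where sample.sam is a placeholder name, and then merge the per-sample counters into one table.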
Mostly I'm wondering whether I should start over, as the mapping is very time-consuming. For future work, I would request the scaffolds and GFF files.
Thank you for any input!