What annotation (GTF) file should I use for strand-specific 3' mRNA sequencing?
I am trying to use featureCounts with 3' mRNA sequencing data. I think I need an annotation file (for hg38) that covers only mRNA and is strand-specific. Can someone recommend where to get such a file?
My code:
featureCounts -T 14 -a $hg38_uscs.gtf -o $sample.featureCount.txt sample.sort.bam
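Since the question is about strand-specific data, it may help to set strandedness explicitly. The following is only a sketch: `-s`, `-t` and `-g` are standard featureCounts options, but the helper name `fc_command` and the file names are hypothetical, and the right `-s` value depends on the library kit, which is not stated here. The helper prints the command instead of running it, so it can be inspected first.

```shell
# Print (not run) a featureCounts command with an explicit strandedness setting.
# strand: 0 = unstranded, 1 = stranded, 2 = reversely stranded (featureCounts -s).
fc_command() {
  strand=$1; gtf=$2; out=$3; bam=$4
  case "$strand" in
    0|1|2) ;;
    *) echo "strand must be 0, 1 or 2" >&2; return 1 ;;
  esac
  # -t exon / -g gene_id: count reads over exons and summarise per gene_id.
  printf 'featureCounts -T 14 -s %s -t exon -g gene_id -a %s -o %s %s\n' \
    "$strand" "$gtf" "$out" "$bam"
}

# Hypothetical file names; the -s value here is an assumption, not a recommendation.
fc_command 1 annotation.gtf sample.featureCount.txt sample.sort.bam
```

Running the same BAM with `-s 1` and `-s 2` and comparing the two summary files is a common way to confirm which orientation the library actually has.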
I have tried the UCSC annotation table GENCODE24, but most of my reads cannot be counted by featureCounts and are reported as "Unassigned_Ambiguity" because, as far as I understand, they aligned to regions that overlap more than one feature. Is this a common issue? My output summary.txt:
Assigned 2030063
Unassigned_Unmapped 7166618
Unassigned_MappingQuality 0
Unassigned_Chimera 0
Unassigned_FragmentLength 0
Unassigned_Duplicate 0
Unassigned_MultiMapping 1302741
Unassigned_Secondary 0
Unassigned_Nonjunction 0
Unassigned_NoFeatures 583039
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 7609437
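A summary like this is easier to judge as percentages. Below is a minimal sketch that prints each category with its share of all reads. It assumes the usual two-column summary layout (category, count) with an optional `Status` header line. The function name `summarize_counts` and the file name in the usage comment are hypothetical.

```shell
# Print each featureCounts summary category with its share of all reads.
# Assumes two whitespace-separated columns: category name, read count.
summarize_counts() {
  awk '
    $1 == "Status" { next }                         # skip the header line if present
    NF >= 2 { name[++n] = $1; count[n] = $2; total += $2 }
    END {
      if (total == 0) exit 1                        # nothing to summarise
      for (i = 1; i <= n; i++)
        printf "%-32s %10d %6.1f%%\n", name[i], count[i], 100 * count[i] / total
    }
  ' "$1"
}

# Hypothetical file name, following the -o name used above:
# summarize_counts sample.featureCount.txt.summary
```

With the counts listed above, Unassigned_Ambiguity comes to roughly 41% of all reads and Assigned to roughly 11%.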
Update
I tried the annotation file that comes with featureCounts. It is in SAF format, so I ran:
featureCounts -T 14 -F 'SAF' -a $hg38_uscs.gtf -o $sample.featureCount.txt sample.sort.bam
The number of assigned reads improved significantly! I am now looking at the differences between the two files, but if anyone has any suggestions, I would still love to hear them. Thank you.
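When comparing the two annotation files, one thing worth checking is how the GTF groups exons into genes. This is an assumption about this particular file, not something confirmed in the thread: some GTF exports write the transcript ID into the `gene_id` attribute, so every transcript counts as its own "gene" and overlapping isoforms make reads ambiguous. The sketch below (the function name `gtf_id_check` is hypothetical) counts how many records have identical `gene_id` and `transcript_id` values.

```shell
# Report how many GTF records carry a gene_id identical to their transcript_id.
gtf_id_check() {
  awk -F'\t' '
    /^#/ { next }                                   # skip comment lines
    {
      gid = ""; tid = ""
      # Pull the quoted values out of the attribute column (column 9).
      if (match($9, /gene_id "[^"]*"/))       gid = substr($9, RSTART + 9,  RLENGTH - 10)
      if (match($9, /transcript_id "[^"]*"/)) tid = substr($9, RSTART + 15, RLENGTH - 16)
      if (gid == "" || tid == "") next              # e.g. gene lines lack transcript_id
      total++
      if (gid == tid) same++
    }
    END { printf "records=%d same_id=%d\n", total, same }
  ' "$1"
}

# Hypothetical file name:
# gtf_id_check annotation.gtf
```

If `same_id` is close to `records`, counting at the gene level with that file will treat isoforms of one gene as separate overlapping features, which could explain a large Unassigned_Ambiguity count.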
Can you post the exact command you used with featureCounts, as well as the summary file that it generates?
Thank you for the reply. I have edited the post with an update.