Hi,
I have a few NextSeq projects back, all with libraries made on the Neoprep by the same technician in the same lab. I used my standard pipeline, which trims adapters and low-quality bases (BBDuk), aligns with STAR and counts with featureCounts. I use the s=2 flag (featureCounts -s 2) for 'inversely stranded' libraries because that is what our Illumina libraries have been. I used the same setting for the new Neoprep/NextSeq data, and I checked with Illumina tech support after running into the issue below.
My problem: one project shows only ~5-10% counts (of total fragments) with s=2. However, when I switch to s=1, I get ~40-60% counts. Has anyone seen this behaviour before? Any idea what might cause it? As luck would have it, this is the most time-sensitive study, with the most precious samples. It is from fresh-frozen tumour tissue, so not great quality; the other projects are cell-line and PDX models, so a mix of perfect and very good quality, and their data is well behaved. All libraries have very low rRNA, and more than 80% of reads align to the transcriptome.
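For anyone wanting to compare the two settings in the same way, here is a minimal Python sketch of how the "percent counted" figure can be derived from a featureCounts `.summary` file (tab-separated, one `Status` column followed by one column per BAM). The function name `assigned_fraction` and the toy numbers are hypothetical and purely illustrative; they are not the values from this project.

```python
def assigned_fraction(summary_text):
    """Return {bam_name: assigned / total} for a featureCounts-style summary.

    Assumes the usual layout: a 'Status' header row, then one row per
    category (Assigned, Unassigned_NoFeatures, ...), tab-separated.
    """
    lines = [ln for ln in summary_text.strip().splitlines() if ln.strip()]
    bam_names = lines[0].split("\t")[1:]
    totals = [0] * len(bam_names)
    assigned = [0] * len(bam_names)
    for ln in lines[1:]:
        fields = ln.split("\t")
        status = fields[0]
        values = [int(v) for v in fields[1:]]
        for i, v in enumerate(values):
            totals[i] += v
            if status == "Assigned":
                assigned[i] = v
    return {
        name: (a / t if t else 0.0)
        for name, a, t in zip(bam_names, assigned, totals)
    }


# Toy summaries (made-up numbers), one per strandedness setting.
toy_s1 = "Status\tsample.bam\nAssigned\t500\nUnassigned_NoFeatures\t400\nUnassigned_Ambiguity\t100\n"
toy_s2 = "Status\tsample.bam\nAssigned\t80\nUnassigned_NoFeatures\t820\nUnassigned_Ambiguity\t100\n"

frac_s1 = assigned_fraction(toy_s1)["sample.bam"]
frac_s2 = assigned_fraction(toy_s2)["sample.bam"]
print(f"s=1 assigned: {frac_s1:.1%}, s=2 assigned: {frac_s2:.1%}")
```

Running the same calculation on the real `.summary` files from the s=1 and s=2 runs would also show which `Unassigned_*` category absorbs the fragments that are lost under each setting.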
Some basic run stats for a single, representative BAM: the read counts by SAM flag are 83 = 6,392,271 and 99 = 7,835,476, so reads are not unevenly distributed to one strand (as far as I understand the flags; open to correction).
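As a quick check on what those two flags encode, here is a small Python sketch that decodes a SAM flag into its bit names as defined in the SAM specification. The helper name `decode_flag` and the short labels are just illustrative choices.

```python
# Bit values follow the FLAG field definition in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


for flag in (83, 99):
    print(flag, decode_flag(flag))
```

Both flags describe read 1 of a properly paired fragment: 83 is read 1 on the reverse genomic strand, 99 is read 1 on the forward genomic strand. Since genes sit on both strands of the genome, a roughly even 83/99 split is expected for stranded and unstranded libraries alike, so on its own it probably does not say which strand protocol was used; that would need the read orientation relative to the annotated gene strand.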
Summary of count matrix ('forward-stranded' -> s=1):
summary(forward_stranded)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
     7,041  6,980,000 11,800,000 13,690,000 19,360,000 38,700,000
summary(inverse_stranded)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
       842  1,183,000  1,717,000  2,085,000  2,995,000  5,492,000
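To put the two summaries side by side, the sketch below computes, for each statistic, the share of counts that falls under the forward-stranded setting, i.e. forward / (forward + inverse). It uses only the numbers posted above. It assumes both summaries cover the same set of samples, and because quantiles of the two matrices are not paired sample by sample, only the mean-based share is a strict proportion; the rest are a rough guide.

```python
# Values copied from the two summary() outputs above.
forward = {
    "min": 7_041, "q1": 6_980_000, "median": 11_800_000,
    "mean": 13_690_000, "q3": 19_360_000, "max": 38_700_000,
}
inverse = {
    "min": 842, "q1": 1_183_000, "median": 1_717_000,
    "mean": 2_085_000, "q3": 2_995_000, "max": 5_492_000,
}


def forward_share(fwd, inv):
    """Fraction of (fwd + inv) counts that were obtained with s=1."""
    return fwd / (fwd + inv)


shares = {stat: forward_share(forward[stat], inverse[stat]) for stat in forward}
for stat, share in shares.items():
    print(f"{stat:>6}: forward share = {share:.3f}")
```

A share well above 0.5 on every statistic points towards the reads being oriented as a forward-stranded library would be, while a share near 0.5 would look more like an unstranded library.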