Question: NextSeq sequences map poorly to reference genome

Hi

We've just started using the Illumina NextSeq platform and haven't been getting good results sequencing our CEL-Seq2 library (75-cycle, R1 = 15 bp, R2 = 77 bp), in contrast to what we were getting previously on the HiSeq.

QC apparently "looks fine", but I'm not convinced, given that the deviation bars in FastQC are huge. %Q30 is ~78%.
The problem is that only ~20% of reads map to the reference genome, even after discarding reads with an average Q-score below 30. The technician suggested that we probably loaded too much DNA.

We paid for another run of the same library with 30% less loading (~0.9 pM) and a 20% PhiX spike-in. The same problem persists. %Q30 is better at 88%, but again the sequences are full of Ns (20% of R1 reads are all Ns) and only ~20% map.
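
For anyone wanting to reproduce the "all Ns" figure on their own R1 file, here is a minimal Python sketch. It assumes a plain 4-line-per-record FASTQ (optionally gzipped); the helper names `open_fastq`, `read_sequences` and `fraction_all_n` are made up for illustration and are not part of any existing tool.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_sequences(lines):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def fraction_all_n(seqs):
    """Return the fraction of non-empty-aware reads that consist only of N."""
    total = 0
    all_n = 0
    for seq in seqs:
        total += 1
        # A read counts as "all N" only if it is non-empty and every base is N.
        if seq and set(seq.upper()) == {"N"}:
            all_n += 1
    return all_n / total if total else 0.0
```

Typical use would be `with open_fastq("R1.fastq.gz") as fh: print(fraction_all_n(read_sequences(fh)))`, where the file name is a placeholder.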

We've been told it could be the library. But there is definitely enough DNA in the ~200-400 bp size range.

I'd like to get an opinion on whether this is more likely a library-prep problem (old reagents?) or a sequencing problem (settings?). What could be triggering it?

Thank you!!!

[Screenshot: Screen_Shot_2018_08_24_at_6_11_32_pm]

[Screenshot: Screen_Shot_2018_08_24_at_6_15_55_pm]

18 months ago by exin • 50 • updated 18 months ago by genomax 68k

What did your PhiX results look like? Same story?

18 months ago by Joe 12k

Hmmm.... I kind of assumed PhiX reads were not included in my fastq files and were only used for calculating the error rates. I might have to double check then. Thank you.

18 months ago by exin • 50

Not 100% sure with the NextSeq. The software on the machine may have filtered them out already, but they were sequenced, so I would expect the data to be available. You might have to recover them from the BCL files.

If the PhiX looks good, though, that would tell you the issue is your input DNA.

18 months ago by Joe 12k

Does the cluster density look okay?

18 months ago by Nandini • 810

1st run: ~190; 2nd run: ~95 (I'm not sure how this converts to the recommended 170-220 K/mm² range for NextSeq; probably the same scale?)

18 months ago by exin • 50

Have a look in the RunCompletionStatus.xml file for ClusterDensity.

18 months ago by Nandini • 810

That's the number: 95.

18 months ago by exin • 50

If it says <ClusterDensity>95</ClusterDensity>, then it's quite low. The run is underclustered.
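
A rough Python sketch of pulling that value out of the file and comparing it with the 170-220 range quoted above. Only the `<ClusterDensity>` tag is confirmed in this thread; the root element name in the example string, the helper names, and the assumption that the value is on the same K/mm² scale as the recommended range are all guesses made for illustration.

```python
import xml.etree.ElementTree as ET


def cluster_density(xml_text):
    """Return the first ClusterDensity value found anywhere in the XML, or None."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Strip a possible "{namespace}" prefix before comparing tag names.
        if elem.tag.rsplit("}", 1)[-1] == "ClusterDensity":
            text = (elem.text or "").strip()
            if text:
                return float(text)
    return None


def describe_density(density, low=170.0, high=220.0):
    """Classify a density against an assumed recommended range (same units)."""
    if density < low:
        return "underclustered"
    if density > high:
        return "overclustered"
    return "within range"


# Hypothetical file content: only the ClusterDensity element is taken from the thread.
example = "<RunCompletionStatus><ClusterDensity>95</ClusterDensity></RunCompletionStatus>"
density = cluster_density(example)
print(density, describe_density(density))
```

For a real run, the string would come from reading the RunCompletionStatus.xml file in the run folder.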

18 months ago by Nandini • 810

I think the problem here may be a particular bcl2fastq option. By default, a read shorter than 22 bp (which your R1 is) is automatically masked with Ns. I have a hunch that is what is happening with this run. You will need to specify --mask-short-adapter-reads 0 to turn that masking off. You should then be able to recover sequence for all R1 reads.
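
As a sketch of what re-running the conversion with that option might look like, the Python snippet below only assembles and prints a command line; it does not execute anything. The run-folder and output paths are placeholders, `bcl2fastq_command` is a made-up helper name, and a real run would normally need further options (sample sheet and so on) that depend on the facility's setup.

```python
import shlex


def bcl2fastq_command(run_folder, output_dir, mask_short_adapter_reads=0):
    """Build a bcl2fastq argument list with short-read masking set explicitly."""
    return [
        "bcl2fastq",
        "--runfolder-dir", run_folder,   # placeholder path to the sequencer run folder
        "--output-dir", output_dir,      # placeholder path for the FASTQ output
        # 0 disables the masking of short reads discussed above.
        "--mask-short-adapter-reads", str(mask_short_adapter_reads),
    ]


cmd = bcl2fastq_command("/path/to/run_folder", "/path/to/fastq_out")
print(shlex.join(cmd))
```

The printed line can be checked by eye and handed to whoever runs the demultiplexing.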

If that setting was already in use, a second possibility is that CEL-Seq (not sure what it is) leads to low nucleotide diversity in the first 15 cycles (e.g. all A's at a particular cycle). The NextSeq image analysis program may be having trouble telling clusters apart, leading to N base calls. Sequencers differ in their resilience when sequencing unusual libraries: MiSeq is generally the best, followed by HiSeq, and NextSeq is likely at the bottom of that list.
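
One way to check the low-diversity idea is to tabulate base composition per cycle over the R1 reads. The Python sketch below does that for a list of read sequences; the function names and the 0.8 cutoff for flagging a cycle as dominated by one base are arbitrary choices for illustration, not an Illumina recommendation.

```python
from collections import Counter


def per_cycle_base_fractions(seqs, n_cycles=15):
    """Return, for each cycle, a dict mapping base -> fraction of reads with that base."""
    counts = [Counter() for _ in range(n_cycles)]
    for seq in seqs:
        for cycle, base in enumerate(seq[:n_cycles].upper()):
            counts[cycle][base] += 1
    fractions = []
    for counter in counts:
        total = sum(counter.values())
        fractions.append(
            {base: n / total for base, n in counter.items()} if total else {}
        )
    return fractions


def low_diversity_cycles(fractions, threshold=0.8):
    """Return 1-based cycle numbers where a single base reaches the threshold."""
    return [
        i + 1
        for i, frac in enumerate(fractions)
        if frac and max(frac.values()) >= threshold
    ]
```

Feeding it the R1 sequences and looking at which of the first 15 cycles are flagged would show whether specific positions are close to a single base.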

18 months ago by genomax 68k

Thank you for your insights! I didn't have to do the conversion; the sequences were received as fastq files. Maybe the facility handled that; I might check with them. But all the R1 reads are 15 bp, so wouldn't they all be masked then? The first 15 cycles do have low nucleotide diversity. Looks like we're better off going back to HiSeq...

18 months ago by exin • 50
