Rapid GC content increase at the last 3' base on Illumina platforms
6.4 years ago

I have sequencing data from NextSeq and MiSeq, and the GC content increases rapidly at the last base of the 3' end:

[Three images, not recovered]

Is there something wrong with the library? How does this arise?

Illumina GC content sequencing • 2.5k views

Do you also see this on MiSeq? On NextSeq this could mean that you sequenced through the adapter and nothing is left to sequence. It uses two-colour chemistry, so no signal is read as G, and we see polyG stretches at the end of reads. I can't explain it for MiSeq though, since AFAIK that still uses four colours.
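One way to check whether polyG tails are actually present is to measure the run of G at the 3' end of each read. The following is a minimal sketch, not taken from any tool; the function names and the `min_len=10` cutoff are assumptions chosen for illustration.

```python
def trailing_g_length(seq: str) -> int:
    """Return the number of consecutive G bases at the 3' end of a read."""
    stripped = seq.rstrip("G")
    return len(seq) - len(stripped)


def fraction_with_polyg_tail(reads, min_len=10):
    """Fraction of reads whose 3' end is a G run of at least min_len bases.

    min_len is an arbitrary illustrative cutoff, not a recommended value.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for r in reads if trailing_g_length(r) >= min_len)
    return hits / len(reads)
```

A high fraction would point towards reads running past the insert; a fraction near zero would suggest the effect has another cause.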


Yes, MiSeq is the same. The insert size is about 250 and PE150 was used. As you can see above (maybe not clearly), it is caused by a drop in A rather than by polyG: G, C and T all go up, while A does not.
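The "A drops while G, C and T rise" observation can be checked directly by tabulating base fractions per cycle. This is a small sketch under the assumption that reads are available as plain sequence strings; `per_cycle_base_fractions` is a hypothetical helper, not part of fastp or FastQC.

```python
from collections import Counter


def per_cycle_base_fractions(reads):
    """Return one dict per cycle mapping each base (A, C, G, T, N) to its fraction."""
    counts = []
    for seq in reads:
        for i, base in enumerate(seq.upper()):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1

    fractions = []
    for c in counts:
        total = sum(c.values())
        # Counter returns 0 for bases that never occurred at this cycle.
        fractions.append({b: c[b] / total for b in "ACGTN"})
    return fractions
```

Comparing the last element of the returned list with the one before it shows which bases change at the final cycle.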


If the base quality at the 3' end is poor, then do not even consider this finding as valid. Have you produced a FastQC report of the reads? The 3' end of Illumina reads is notorious for being poor quality.
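If no FastQC report is at hand, the per-cycle mean quality can be approximated straight from the FASTQ quality strings. The sketch below assumes Phred+33 encoding (an assumption; check your data), and `mean_quality_per_cycle` is a hypothetical name.

```python
def mean_quality_per_cycle(quality_strings, offset=33):
    """Mean Phred score at each cycle, assuming Phred+offset ASCII encoding."""
    sums, counts = [], []
    for q in quality_strings:
        for i, ch in enumerate(q):
            if i == len(sums):
                sums.append(0)
                counts.append(0)
            sums[i] += ord(ch) - offset
            counts[i] += 1
    return [s / n for s, n in zip(sums, counts)]
```

A sharp drop in the final element would indicate that the last cycle is unreliable.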


Yes, the quality of the last cycle is quite poor; I have posted the quality distribution. Do you have any idea what happens in the last cycle?

6.4 years ago
chen ★ 2.5k

You didn't mention which one is NextSeq and which one is MiSeq, but I would guess that the upper one is NextSeq and the lower one is MiSeq.

Why does NextSeq produce more G?

It's caused by the two-colour chemistry, which encodes the four bases with two colours:

  • Green + Red = A
  • Green = T
  • Red = C
  • None = G

As sequencing approaches the last cycles, the signal strength decreases, so more bases are incorrectly called as G. That's why NextSeq produces more G.
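The colour table above can be written as a toy base caller to show why a fading signal drifts towards G. This is only an illustration: real base callers are far more involved, and the simple `threshold` cutoff and the `call_base` function are assumptions made for this sketch.

```python
def call_base(green: float, red: float, threshold: float = 0.5) -> str:
    """Toy two-colour base call: decide each channel on/off with one cutoff.

    Mapping follows the list above: green+red = A, green = T, red = C, none = G.
    The threshold is a hypothetical value, not an instrument parameter.
    """
    green_on = green >= threshold
    red_on = red >= threshold
    if green_on and red_on:
        return "A"
    if green_on:
        return "T"
    if red_on:
        return "C"
    return "G"
```

For example, `call_base(1.0, 1.0)` gives A, but if both intensities decay to 0.3 the same cluster is called G, because neither channel passes the cutoff.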


If you mean the last cycle, that's normal. The last cycle of Illumina data should be trimmed.

From your figures, I can tell you are using fastp; use the -t 1 option to trim the last cycle.


MiSeq and NextSeq look almost the same, and the figures do come from fastp. It's good to use, and I'll use -t 1. Do you have any idea what happens in the last cycle?


One reason is that Illumina sequencers use the N+1 cycle to calibrate the N cycle.

Since the last cycle has no following cycle, it is not calibrated.

Just trim the last cycle using fastp with the option -t 1.
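To make the effect of trimming one tail base concrete, here is a minimal sketch of the operation on a single read. It is not fastp's implementation, only an illustration of what removing the last cycle means; `trim_tail` is a hypothetical helper.

```python
def trim_tail(seq: str, qual: str, n: int = 1):
    """Drop the last n bases of a read together with their quality values."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if n <= 0:
        return seq, qual
    return seq[:-n], qual[:-n]
```

With `n=1`, a 150-base read becomes 149 bases, and the uncalibrated final cycle no longer contributes to per-cycle statistics.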


Hi chen, is this a related issue: "Can phasing or pre-phase during basecall cause indel?"


I just took a look at that thread, but it doesn't seem to be related.


Thanks for taking a look at least. I figured that the N+1 cycle was related to the calibration of errors in the context of phasing and pre-phasing (I still believe that this is related to this N+1 cycle).
