Interpretation of Trimmomatic Results after Paired-End Adapter Trimming
5.8 years ago
sevenless ▴ 30

Hi,

I have some questions about the output of Trimmomatic after adapter removal. I have 80 bp paired-end reads in Illumina 1.9 encoding (Phred+33). Using FastQC for quality control, I noticed some overrepresented sequences in the data, which were identified as TruSeq adapters. I therefore used Trimmomatic to trim the adapters and to drop any resulting reads shorter than 36 bp:

java -jar trimmomatic-0.38.jar PE -phred33 seq_1.fastq.gz seq_2.fastq.gz seq_1_trimmed_paired.fastq.gz seq_1_trimmed_unpaired.fastq.gz seq_2_trimmed_paired.fastq.gz seq_2_trimmed_unpaired.fastq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 MINLEN:36

As a result, both reads survive for ~99% of the pairs, only the forward read survives for ~1%, and there are no pairs in which only the reverse read survives:

ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 71446282 Both Surviving: 70983555 (99.35%) Forward Only Surviving: 453784 (0.64%) Reverse Only Surviving: 0 (0.00%) Dropped: 8943 (0.01%)

Is the 0% reverse only surviving the expected result? It seems that the reverse reads are the only ones affected by the adapter trimming. However, in the FastQC quality control, the warnings for overrepresented adapter sequences only showed up for the forward reads.

Also, what is the difference between the TruSeq3-PE.fa adapter file and TruSeq3-PE-2.fa, which also contains the reverse complements, and which of them should be used to trim adapters from paired-end reads?

I would be very grateful for any help or explanations.

RNA-Seq trimmomatic adapter trimming • 7.9k views
ADD COMMENT

Honestly, I don't have a clear answer for you on this, which is why I'm posting this as a comment and not as an answer. I do have some speculations, though (maybe they help):

  • As far as I know, adapter traces are more likely to show up in the reverse reads with TruSeq kits, so forward-only surviving pairs are probably more likely than reverse-only surviving pairs. Since your discard rate is ~1%, it might be by chance that you have no reverse-only pairs.

  • Did you run FastQC both before and after the trimming? What does the adapter content plot look like when you compare the two?
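To make the before/after comparison suggested above concrete, here is a minimal sketch that lists the overrepresented sequences recorded in a FastQC report. It assumes the report was extracted so that a `fastqc_data.txt` file is available, and that the module of interest is delimited by `>>Overrepresented sequences` and `>>END_MODULE` lines. The function name `overrep_seqs` and the paths in the usage comment are hypothetical.

```shell
# overrep_seqs FILE: print the sequence column of the
# "Overrepresented sequences" module of a fastqc_data.txt file.
overrep_seqs() {
    awk -F '\t' '
        /^>>Overrepresented sequences/ { in_mod = 1; next }  # module starts
        /^>>END_MODULE/                { in_mod = 0 }        # any module ends
        in_mod && !/^#/                { print $1 }          # skip header row
    ' "$1"
}

# Hypothetical usage: sequences still present after trimming.
# overrep_seqs before_fastqc/fastqc_data.txt | sort > before.txt
# overrep_seqs after_fastqc/fastqc_data.txt  | sort > after.txt
# comm -12 before.txt after.txt
```

Any sequence printed by the final `comm -12` appears in both reports, which would suggest that the adapter file used for trimming does not cover it.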

ADD REPLY

Thanks for your comment! I don't think it's by chance because I got the same result (0% reverse only surviving) for all my files. mastal511 also explained that "Trimmomatic's default behaviour is to drop the reverse reads when it trims adapters" (see answer below).
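A check like this across many files can be scripted. The following is a rough sketch that pulls the "Reverse Only Surviving" count out of Trimmomatic's paired-end summary line. It assumes each run's log was saved to a text file (Trimmomatic is expected to print the summary to standard error); the function name `reverse_only_count` and the `logs/` layout are hypothetical.

```shell
# reverse_only_count: read Trimmomatic PE log text on stdin and print the
# "Reverse Only Surviving" read count from each summary line.
reverse_only_count() {
    sed -n 's/.*Reverse Only Surviving: \([0-9][0-9]*\).*/\1/p'
}

# Hypothetical usage: one captured log per sample.
# for log in logs/*.trimmomatic.log; do
#     printf '%s\t%s\n' "$log" "$(reverse_only_count < "$log")"
# done
```

A count of 0 in every log points to systematic behaviour rather than chance.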

ADD REPLY
5.8 years ago
mastal511 ★ 2.1k

Trimmomatic's default behaviour is to drop the reverse reads when it trims adapters, so you get forward reads only surviving as a result.

The reasoning behind this is that reading into the adapter sequence means the insert is shorter than the read length, so the reverse read doesn't add any extra information: it is just the reverse complement of the forward read.

However, you can change the default behaviour if you want to keep the reverse reads after adapter trimming. As described in the Trimmomatic manual, you need to add TRUE as the last parameter (keepBothReads) of ILLUMINACLIP.
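As an illustration, the sketch below prints (rather than runs) a paired-end command with that last parameter set. The ILLUMINACLIP fields are positional, so the minimum adapter length has to be given before keepBothReads; the value 2 used here mirrors a command posted later in this thread and is an assumption to tune, as are the function name `trimmomatic_pe_cmd` and the output file naming.

```shell
# trimmomatic_pe_cmd FORWARD REVERSE: print a Trimmomatic PE command that
# keeps both reads after palindrome-mode adapter clipping.
trimmomatic_pe_cmd() {
    in1=$1                      # forward input, e.g. seq_1.fastq.gz
    in2=$2                      # reverse input, e.g. seq_2.fastq.gz
    base1=${in1%.fastq.gz}
    base2=${in2%.fastq.gz}
    # ILLUMINACLIP fields, in order:
    #   adapter file : seed mismatches : palindrome clip threshold :
    #   simple clip threshold : min adapter length : keepBothReads
    clip="ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:TRUE"
    echo "java -jar trimmomatic-0.38.jar PE -phred33 $in1 $in2" \
         "${base1}_trimmed_paired.fastq.gz ${base1}_trimmed_unpaired.fastq.gz" \
         "${base2}_trimmed_paired.fastq.gz ${base2}_trimmed_unpaired.fastq.gz" \
         "$clip MINLEN:36"
}

# Print the command for inspection before running it yourself.
trimmomatic_pe_cmd seq_1.fastq.gz seq_2.fastq.gz
```

Printing the command first makes it easy to verify that the four output files are in the order Trimmomatic expects: forward paired, forward unpaired, reverse paired, reverse unpaired.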

ADD COMMENT

Thank you very much for your reply and the explanation about Trimmomatic's default behaviour. I understand now why I get 0% reverse-only surviving.

However, I have now run FastQC again on the trimmed paired-end reads and, strangely enough, the adapters are still reported as overrepresented sequences, e.g. TruSeq Adapter, index 5 (GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC).

Is there another TruSeq adapter file available that I could use for this purpose (e.g. TruSeq3-PE-2.fa) or should I add this sequence to the adapter file manually?

ADD REPLY

There is also the TruSeq2 file, and I usually have to use that one (i.e. most of the data files I trimmed were prepared with a TruSeq2 kit).

ADD REPLY

It seems that TruSeq3-PE-2.fa does the trick. Interestingly, now I get reverse-only surviving reads as well.

ADD REPLY

I also have this problem with the same sequence, and I have seen that a lot of other people have it too. Could it be that the adapter list everyone is using doesn't contain this sequence? (It seems so, from reading the earlier messages.)

ADD REPLY

It depends on how your library was generated, which you should ask whoever prepared the sequencing libraries. If they used TruSeq2 and you're using the TruSeq3 file, then you should change that. Check the files available in the Trimmomatic install directory.
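One way to check those files is to search them for the sequence FastQC reported. The sketch below is only a starting point: it assumes the adapter FASTA files end in `.fa`, store each sequence on a single line, and live in a directory you pass in (the path in the usage comment is hypothetical). It matches only the first 20 bases by default, because the full FastQC sequence includes a sample index, and it does not search the reverse complement.

```shell
# adapters_containing SEQ DIR [N]: list the *.fa files in DIR that contain
# the first N bases (default 20) of SEQ, ignoring case.
adapters_containing() {
    seq=$1
    dir=$2
    n=${3:-20}
    prefix=$(printf '%s' "$seq" | cut -c "1-$n")
    # -l: print file names only; -F: treat the prefix as a fixed string
    grep -l -i -F -e "$prefix" "$dir"/*.fa
}

# Hypothetical usage with the sequence reported earlier in this thread:
# adapters_containing GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC \
#     /path/to/Trimmomatic-0.38/adapters
```

If no file is listed, the adapter file in use probably does not cover that sequence, and a different file or a manual addition may be needed.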

ADD REPLY
2.1 years ago
Pegasus ▴ 100

Hi all,

I am facing a similar issue, as shown in the log below:

TrimmomaticPE: Started with arguments:
 T3R1-F.fastq.gz T3R1-R.fastq.gz T3R1-F_paired.fq.gz T3R1-F_unpaired.fq.gz T3R1-R_paired.fq.gz T3R1-R_unpaired.fq.gz ILLUMINACLIP:adapter.fa:2:30:10:2:True LEADING:3 TRAILING:3 MINLEN:36
Using Long Clipping Sequence: 'CTCAAATTCATCATCGTATACGCGAACATAAACAAAAACAACTGCTTCAG'
ILLUMINACLIP: Using 0 prefix pairs, 1 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Quality encoding detected as phred33
Input Read Pairs: 19765141 Both Surviving: 19568511 (99.01%) Forward Only Surviving: 98216 (0.50%) Reverse Only Surviving: 98372 (0.50%) Dropped: 42 (0.00%)
TrimmomaticPE: Completed successfully

Any recommendations? Thanks

ADD COMMENT

It is unclear what your issue is. Please consider asking a separate question including all necessary details.

ADD REPLY
