Trimming paired end RNA-seq with Trimmomatic
9.8 years ago

Hello --

I've begun pre-processing my paired-end RNA-seq data (run on an Illumina HiSeq).

After running FastQC on my samples, I noticed that some have overrepresented sequences corresponding to adapters.

I've been trying to use Trimmomatic to remove the adapters. However, after trimming I get MORE overrepresented sequences than I did before trimming! I'm not sure what's going on.

For instance, an unprocessed sample will have a single overrepresented sequence corresponding to adapter index 1. Once it has been trimmed and processed by Trimmomatic, it will have 25 overrepresented sequences, all corresponding to different variants of the adapter index 1 sequence.

Here is my command line:

Code:

TrimmomaticPE -phred33 /R1_001.fastq.gz /R2_001.fastq.gz /R1_pairedout /R1_unpairedout /R2_pairedout /R2_unpairedout ILLUMINACLIP:/TruSeq3-PE.fa:2:30:10 LEADING:5 TRAILING:5 AVGQUAL:20

Any idea what I'm doing wrong? The same thing occurs even if I leave out the ILLUMINACLIP step.

hiseq preprocessing RNA-Seq trimmomatic • 13k views

Hello samantha_jeschonek!

It appears that your post has been cross-posted to another site: http://seqanswers.com/forums/showthread.php?t=44949

This is typically not recommended as it runs the risk of annoying people in both communities.


Whoops, I'm not sure how to delete the post so that it isn't in both places!

9.8 years ago

You are not actually ending up with more bad reads. FastQC is simply now able to flag more cases that looked more or less acceptable before trimming.

In general, the "overrepresented sequences" measure is not all that accurate. It could also be that you have a large number of fused adapters, so that when one adapter gets cut off, the next one shows up.
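
To make the counting effect concrete, here is a minimal Python sketch of one way a single overrepresented sequence can turn into many after trimming. It is a toy model, not FastQC's or Trimmomatic's actual code, and everything in it is an assumption for illustration: the 40-base `CONTAMINANT` is a made-up stand-in (not a real TruSeq adapter), the 0.1% cutoff is only meant to resemble how FastQC reports overrepresented sequences, and `trim_trailing` is a simplified imitation of a `TRAILING:5` step. The idea is that identical contaminant reads with low-quality tails of different lengths get cut to different lengths, so an exact-match counter sees several distinct strings instead of one. This is a different mechanism from the fused-adapter idea above, but it also does not depend on ILLUMINACLIP.

```python
from collections import Counter

# Hypothetical 40-base contaminant; NOT a real adapter sequence.
CONTAMINANT = "CTGA" * 10
HIGH_Q, LOW_Q = 30, 2


def overrepresented(reads, threshold=0.001):
    """Return {sequence: fraction} for exact sequences above `threshold` of all reads."""
    counts = Counter(reads)
    total = len(reads)
    return {seq: n / total for seq, n in counts.items() if n / total > threshold}


def trim_trailing(seq, quals, min_q=5):
    """Drop trailing bases with quality below min_q (simplified TRAILING-like step)."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end]


def index_to_seq(i, length=40):
    """Encode an integer as a base-4 string over ACGT, giving unique background reads."""
    bases = "ACGT"
    out = []
    for _ in range(length):
        out.append(bases[i % 4])
        i //= 4
    return "".join(reversed(out))


raw = []
# 100 identical contaminant reads whose low-quality tails vary from 0 to 24 bases.
for k in range(100):
    tail = k % 25
    raw.append((CONTAMINANT, [HIGH_Q] * (40 - tail) + [LOW_Q] * tail))
# 1900 unique, high-quality background reads that trimming leaves untouched.
for i in range(1900):
    raw.append((index_to_seq(i), [HIGH_Q] * 40))

before = overrepresented([seq for seq, _ in raw])
after = overrepresented([trim_trailing(seq, quals) for seq, quals in raw])

print(len(before), len(after))  # prints "1 25"
```

In this toy data set the number of contaminant reads is the same before and after trimming (100 out of 2000). Only the number of distinct strings above the cutoff changes, which is why a longer list of overrepresented sequences does not by itself mean the trimming made things worse.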
