Question

cutadapt low read adapters percentage

0

5.0 years ago

Morris_Chair ▴ 350

Dear Community, I finally received the fastq files from a sequencing service and I am now clipping and trimming them. I noticed that all my output files show a low percentage of reads with adapters, and I wonder how this is possible. Maybe I was given the wrong barcode?

Thank you

=== Summary ===

Total reads processed:              25,153,241
Reads with adapters:                   234,803 (0.9%)
Reads that were too short:                 472 (0.0%)
Reads written (passing filters):    25,152,769 (100.0%)

Total basepairs processed: 3,221,015,905 bp
Quality-trimmed:                 481,378 bp (0.0%)
Total written (filtered):  3,213,762,494 bp (99.8%)

RNA-Seq adapter • 1.9k views

ADD COMMENT • link 5.0 years ago by Morris_Chair ▴ 350

1

Typically the read length is smaller than the fragment size, so adapters are not expected to be found frequently. The only exceptions that come to mind are ATAC-seq and small RNA-seq. What kind of data is this? Did you run fastqc beforehand to check for adapter contamination? Note that a barcode is not the same as an adapter.

ADD REPLY • link 5.0 years ago by ATpoint 81k
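
To make the reply above concrete, here is a minimal sketch of why adapters rarely show up when reads are shorter than the fragments. The function names and the insert lengths are hypothetical illustrations, not data from this thread; the `min_overlap=3` default mirrors cutadapt's default minimum overlap.

```python
def adapter_bases_in_read(insert_len: int, read_len: int) -> int:
    """Adapter bases at the 3' end of a read.

    The sequencer only reads into the adapter once it has run through
    the whole insert, so this is zero whenever read_len <= insert_len.
    """
    return max(0, read_len - insert_len)


def fraction_with_adapter(insert_lengths, read_len, min_overlap=3):
    """Fraction of reads carrying at least min_overlap adapter bases."""
    hits = sum(
        1
        for n in insert_lengths
        if adapter_bases_in_read(n, read_len) >= min_overlap
    )
    return hits / len(insert_lengths)


# Hypothetical insert lengths (bp), chosen only for illustration.
inserts = [60, 74, 150, 200, 300, 250, 180, 90, 400, 220]
print(fraction_with_adapter(inserts, read_len=75))
```

Only the 60 bp insert yields a detectable adapter here (15 bases). The 74 bp insert contributes a single adapter base, which is below the overlap threshold. Libraries with short inserts, such as small RNA-seq, shift most fragments into the first category.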

1

Barcode/index sequences are not the same thing as adapters, as @ATpoint already noted. Adapters contain the index sequences, which are read as an independent read in Illumina sequencing.

In well-made libraries it is perfectly fine to find very little adapter contamination.

ADD REPLY • link 5.0 years ago by GenoMax 141k

0

Hi guys, this is RNA sequencing of whole human RNA. I asked for 8 million reads, 1x75 SE (and I am happy to have more than 8M reads per file). I checked the quality of each fastq file with fastqc, and only a couple of files show an exclamation mark for the adapter content parameter. When I run cutadapt I use the whole sequence that the service gave me, like this:

Index Adapter 5′ GATCGGAAGAGCACACGTCTGAACTCCAGTCAC CTTGTA GATCTCGTATGCCGTCTTCTGCTTGATGCCGTCTTCTGCTTG

 cutadapt -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAGATCTCGTATGCCGTCTTCTGCTTGATGCCGTCTTCTGCTTG -q 20 -m 25 -o folder/R1.fq.gz  R1.fq.gz

The barcode is in bold (CTTGTA in the sequence above) and is the only part that changes between my samples. Am I doing this right?

Thank you

ADD REPLY • link 5.0 years ago by Morris_Chair ▴ 350

1

The barcode is in bold which is the only part that changes for all my samples

If you had more than one sample then yes. Trimming programs will generally look for the core sequence GATCGGAAGAGCACACGTCTGAACTCCAGTCA, which is common to all adapters. Once they find it, they remove the core and everything 3' of it.

ADD REPLY • link 5.0 years ago by GenoMax 141k
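
The behaviour described in the reply above can be sketched as follows. This is a simplified, exact-match illustration and not cutadapt's actual algorithm, which uses error-tolerant alignment. The function name, the `min_overlap` parameter, and the example reads are assumptions made for the sketch; only the core sequence is taken from the thread.

```python
# Core sequence quoted in the thread; the sample-specific barcode follows it.
CORE = "GATCGGAAGAGCACACGTCTGAACTCCAGTCA"


def trim_3prime_adapter(read: str, adapter: str = CORE, min_overlap: int = 3) -> str:
    """Remove the adapter core and everything 3' of it (exact matches only)."""
    # Case 1: the full core occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends partway through the core, so only a prefix
    # of the core is present at the 3' end. Try the longest prefix first.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


insert = "ACGTACGTAC"  # made-up insert sequence
# The barcode after the core differs per sample, but the result is the same.
print(trim_3prime_adapter(insert + CORE + "CTTGTA"))
print(trim_3prime_adapter(insert + CORE + "ACAGTG"))
# A read that stops six bases into the core is still trimmed.
print(trim_3prime_adapter(insert + CORE[:6]))
```

Because matching stops at the core, the barcode downstream of it never needs to be part of the search sequence, which is why supplying only the common core to the trimmer is usually enough.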

0

Thank you GenoMax. I don't know if it is important to mention, but I am clipping my samples one by one.

ADD REPLY • link 5.0 years ago by Morris_Chair ▴ 350

1

Your samples have been demultiplexed (you have separate files, correct?). Then index sequences have already been taken into account in that process.

ADD REPLY • link 5.0 years ago by GenoMax 141k

0

Yes, I have separate files.

ADD REPLY • link 5.0 years ago by Morris_Chair ▴ 350

1

I would say you have a normal sample and the results are as expected. I would proceed with the downstream analysis.

ADD REPLY • link 5.0 years ago by ATpoint 81k