Interpreting trim_galore report
6.7 years ago
vaslanzadeh ▴ 20

Hello, I have single-end small RNA-seq data, and I used Trim Galore to remove adapters and trim low-quality bases. I am new to RNA-seq data analysis and have a question about the trim_galore report. I understand some parts of the report, but not "Total written (filtered):". What does it mean? Is it the number of good-quality bases that passed filtering (18.6%)? If so, does it mean that 81.4% of the bases had low quality and were trimmed off? In the end, trim_galore removes 75.7% of the reads. Thanks

This is cutadapt 1.13+1.g6b2366d with Python 2.7.10
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a TGGAATTCTCGG BacRNA_sequence.fq.gz
Trimming 1 adapter with at most 10.0% errors in single-end mode ...


Total reads processed:               3,876,983
Reads with adapters:                 3,668,488 (94.6%)
Reads written (passing filters):     3,876,983 (100.0%)

Total basepairs processed:   139,571,388 bp
Quality-trimmed:               2,391,351 bp (1.7%)
Total written (filtered):     26,018,981 bp (18.6%)

. . .

3876983 sequences processed in total

Sequences removed because they became shorter than the length cutoff of 16 bp: 2933098 (75.7%)

RNA-Seq • 3.9k views

Is it the number of good-quality bases that passed filtering (18.6%)? If so, does it mean that 81.4% of the bases had low quality and were trimmed off?

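Partly. "Total written (filtered)" is the number of bases left after both quality trimming and adapter trimming, so 18.6% of the input bases were kept and about 81.4% (100% - 18.6%) were removed. However, the "Quality-trimmed" line shows that only 1.7% of the bases were removed for low quality. Most of the rest is presumably adapter sequence, which is expected for small RNA-seq: the inserts are short, and 94.6% of your reads contain the adapter.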
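To make the arithmetic explicit, here is a minimal sketch that recomputes the percentages from the numbers in the report. The variable names are my own, and attributing every removed base that was not quality-trimmed to adapter removal is an assumption; the report excerpt does not break that part down.

```python
# Figures copied from the Trim Galore / cutadapt report in the question.
total_bp = 139_571_388           # Total basepairs processed
quality_trimmed_bp = 2_391_351   # Quality-trimmed
written_bp = 26_018_981          # Total written (filtered)
total_reads = 3_876_983          # Total reads processed
too_short_reads = 2_933_098      # Removed: shorter than the 16 bp cutoff


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


removed_bp = total_bp - written_bp
# Assumption: removed bases that were not quality-trimmed are adapter sequence.
other_removed_bp = removed_bp - quality_trimmed_bp
kept_reads = total_reads - too_short_reads

print("bases kept (%):            ", pct(written_bp, total_bp))
print("bases removed (%):         ", pct(removed_bp, total_bp))
print("  removed for quality (%): ", pct(quality_trimmed_bp, total_bp))
print("  removed otherwise (%):   ", pct(other_removed_bp, total_bp))
print("reads kept after cutoff:   ", kept_reads, f"({pct(kept_reads, total_reads)}%)")
```

The quality-trimmed share stays at 1.7%, so low base quality accounts for only a small part of the bases that were removed.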


In this case, can we simply calculate the sequencing depth as:

Total written (filtered): 26,018,981 bp * 2 / genome size ?
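For reference, the usual back-of-the-envelope estimate of mean depth is total sequenced bases divided by genome size. The sketch below assumes exactly that, with no factor of 2, since the data in this thread are single-end. The genome size is a hypothetical placeholder, not a value from this thread. Note also that the report shows 100% of reads written by cutadapt, so the 26,018,981 bp figure appears to include bases from reads that were later discarded by the 16 bp length cutoff; treat the result as an upper bound.

```python
def mean_depth(total_bases, genome_size_bp):
    """Rough mean depth: sequenced bases divided by genome length."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_bases / genome_size_bp


written_bp = 26_018_981       # "Total written (filtered)" from the report
genome_size_bp = 5_000_000    # hypothetical placeholder; use your organism's genome size

depth = mean_depth(written_bp, genome_size_bp)
print(f"approximate mean depth: {depth:.2f}x")
```

For small RNA-seq, reads come from a small part of the genome, so a genome-wide average like this says little about coverage of any individual transcript.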
