Standard value of quality score and sequence length in QC trimming of RNA-Seq
4.9 years ago
takoyaki ▴ 120

I would like to know the standard quality-score and read-length thresholds to use when trimming with fastx_toolkit.

In my RNA-Seq textbook, the author trims bases whose quality scores are below 20 and discards reads that end up shorter than 30 bp (the original read length is 100 bp). Are these standard values, or do they change case by case?

If there are standard quality-score or length thresholds that many biologists adopt, please tell me what they are. If they change case by case, please tell me how biologists decide on them.

rna-seq next-gen sequencing • 1.4k views

There is no standard for these things since every dataset is different. You will need to experiment and find out what your dataset looks like on QC. If you have a good reference to align to, you may be able to use data down to Q15.

If possible, avoid fastx_toolkit, which is old by NGS standards. bbduk.sh from the BBMap suite, trimmomatic, and cutadapt are all great alternatives.

4.9 years ago
vin.darb ▴ 300

Bases with a Phred score below 20 have a greater than 1% chance of having been called incorrectly.
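
To make the 1% figure concrete, here is a small sketch of the standard Phred relationship, P = 10^(-Q/10). The function names are only illustrative and do not come from any particular tool.

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call is wrong, given its Phred score Q."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

# Compare a few commonly discussed thresholds.
for q in (10, 15, 20, 30):
    print(f"Q{q}: error probability = {phred_to_error_prob(q):.4f}")
```

So Q20 corresponds to a 1 in 100 chance of error, Q30 to 1 in 1000, and Q15 to roughly 3 in 100.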

This threshold is often used because bases below it are judged as "bad" by fastqc.

But first, run fastqc to check the quality of the raw fastq files. If the data are bad, you can lower this threshold so that you do not discard too many reads.
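
One rough way to judge how a threshold affects your data is to count how many reads survive trimming at different cutoffs. The sketch below is only an illustration, not what any real trimmer does internally: it assumes Phred+33 encoded quality strings, uses a naive 3' trim that removes trailing bases below the cutoff, and works on made-up toy reads. All function names are hypothetical.

```python
def decode_phred33(quality_string):
    """Convert a FASTQ quality string (assumed Phred+33) to integer scores."""
    return [ord(c) - 33 for c in quality_string]

def trim_3prime(seq, quals, min_q=20):
    """Naive 3' trim: drop trailing bases whose quality is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]

def fraction_kept(reads, min_q=20, min_len=30):
    """Fraction of reads still at least min_len long after trimming."""
    if not reads:
        return 0.0
    kept = 0
    for seq, qual_str in reads:
        trimmed, _ = trim_3prime(seq, decode_phred33(qual_str), min_q)
        if len(trimmed) >= min_len:
            kept += 1
    return kept / len(reads)

# Toy reads (sequence, quality string); 'I' = Q40, '0' = Q15, '#' = Q2.
toy_reads = [
    ("A" * 40, "I" * 35 + "#" * 5),   # good read with a short bad tail
    ("A" * 40, "I" * 20 + "0" * 20),  # second half sits at Q15
]

for q in (15, 20):
    print(f"min_q={q}: fraction kept = {fraction_kept(toy_reads, min_q=q)}")
```

On real data you would do the same comparison with a proper trimmer and look at how many reads remain, and how well they align, at each cutoff.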
