Reducing the size of raw sequencing data in fastq format by using a simplified quality score
11 months ago
rls_08 ▴ 40

I am looking for suggestions for a tool that converts the quality scores of a fastq file into a binary pass/fail score, similar to what SRA-lite does. Ideally I want to go from file1.fastq.gz to file2.fastq.gz with a reduction in file size.

fastq compression • 695 views

Does it need to be fastq? CRAM compresses unaligned data quite well; a reduction to about 66% of the original fastq file size can usually be achieved. The samtools import command can convert fastq to unaligned CRAM. Keep in mind that manipulating fastq quality scores is, strictly speaking, already a form of analysis, because it alters the original data. Do you really need this, rather than just using CRAM or gzip compression at the maximum level (or pigz with -11)? There was recently an interesting discussion on Bioinformatics Stack Exchange that may give you other ideas => https://bioinformatics.stackexchange.com/questions/20858/good-recommended-way-to-archive-fastq-and-bam-files
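
If you do still want binary quality scores, here is a minimal Python sketch of the idea, not a tested tool. It assumes Phred+33 encoding and strict four-line fastq records. The function names, the threshold of Q20, and the output values Q30 (pass) and Q3 (fail) are my own illustrative choices, not something taken from SRA-lite, so adjust them to whatever you actually need.

```python
import gzip


def binarize_quality(qual, threshold=20, offset=33, pass_q=30, fail_q=3):
    """Map each per-base quality character to one of two values.

    Assumptions (hypothetical defaults): Phred+33 encoding, bases with
    Phred >= threshold become pass_q, all others become fail_q.
    """
    pass_char = chr(pass_q + offset)
    fail_char = chr(fail_q + offset)
    return "".join(
        pass_char if ord(c) - offset >= threshold else fail_char
        for c in qual
    )


def binarize_fastq(in_path, out_path, **kwargs):
    """Rewrite a gzipped fastq file with binarized quality strings.

    Assumes every record is exactly four lines (no wrapped sequences).
    """
    with gzip.open(in_path, "rt") as fin, gzip.open(out_path, "wt") as fout:
        while True:
            header = fin.readline()
            if not header:  # end of file
                break
            seq = fin.readline()
            fin.readline()  # skip the '+' separator line
            qual = fin.readline().rstrip("\n")
            fout.write(header)
            fout.write(seq)
            fout.write("+\n")
            fout.write(binarize_quality(qual, **kwargs) + "\n")
```

Usage would be something like binarize_fastq("file1.fastq.gz", "file2.fastq.gz"). With only two distinct quality characters the quality lines become highly repetitive, so gzip should shrink them considerably, but I have not measured how much you would gain on real data.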
