Removing duplicate reads from bisulfite sequencing data
3.2 years ago
qwzhang0601 ▴ 80

Hello:

I wonder whether there are tools that remove duplicate bisulfite sequencing reads (e.g., PCR duplicates) from raw FASTQ files without alignment. I also wonder whether removing duplicates before alignment is a usual approach for bisulfite sequencing reads. Will it work as well as removing duplicates based on alignment? Duplicates usually seem to be removed after alignment, but we ran into difficulties doing so with a new tool (please see below). So we wonder whether we could remove duplicate reads before alignment without losing any accuracy.

We aligned our bisulfite sequencing reads with this tool. To make full use of the reads, it aligns in two steps. (1) Step 1: it aligns the read pairs, and if one mate of a pair cannot be aligned, it still aligns the other mate on its own. (2) Step 2: unmapped reads are clipped into short fragments and realigned to rescue some of them. Because the alignment (BAM) file contains both paired-end and single-end alignments, we run into problems when we use Picard to remove duplicates.
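
To make the mixed-file situation concrete, here is a rough sketch of how the records in such a file could be separated by their SAM FLAG field, so that paired-end and single-end alignments can be inspected or processed apart. It assumes the aligner sets the standard FLAG bits (0x1 paired, 0x4 unmapped, 0x8 mate unmapped), which we have not verified for this tool; `classify_flag` and `split_sam_lines` are hypothetical helper names, not part of Picard or any aligner.

```python
# Standard SAM FLAG bits (from the SAM specification).
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate of this read is unmapped


def classify_flag(flag):
    """Label one alignment as 'paired', 'single', or 'unmapped' from its FLAG."""
    if flag & UNMAPPED:
        return "unmapped"
    if (flag & PAIRED) and not (flag & MATE_UNMAPPED):
        return "paired"
    # Either sequenced/aligned as single-end, or a mate whose partner did not align.
    return "single"


def split_sam_lines(lines):
    """Group SAM alignment lines by class; header lines (starting with '@') are skipped."""
    groups = {"paired": [], "single": [], "unmapped": []}
    for line in lines:
        if line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        groups[classify_flag(flag)].append(line)
    return groups


# Made-up records with only the first columns filled in, for illustration.
example = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100",   # paired, both mates mapped
    "r2\t73\tchr1\t200",   # paired in sequencing, but mate unmapped
    "r3\t0\tchr1\t300",    # aligned as single-end
    "r4\t4\t*\t0",         # unmapped
]
counts = {name: len(recs) for name, recs in split_sam_lines(example).items()}
print(counts)
```

Whether deduplicating the two groups separately is appropriate depends on how the tool reports rescued mates and clipped fragments, so treat this only as a way to examine the file.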

Thanks

DNA methylation remove duplicates • 839 views
3.2 years ago
GenoMax 141k

So we wonder whether we could remove duplicate reads before we do alignment

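Check Clumpify, introduced in the post "Introducing Clumpify: Create 30% Smaller, Faster Gzipped Fastq Files". It works on FASTQ files directly and can remove duplicate reads without alignment.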
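
To illustrate what alignment-free duplicate removal means in principle, here is a minimal Python sketch that keeps the first read pair for each distinct combination of mate 1 and mate 2 sequences. This is not how Clumpify is implemented; it is a simplified exact-match version, and the function names `read_fastq` and `dedup_pairs` and the example reads are made up for illustration.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def dedup_pairs(r1_handle, r2_handle):
    """Keep the first pair seen for each distinct (mate 1, mate 2) sequence pair."""
    seen = set()
    kept = []
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        key = (rec1[1], rec2[1])  # exact sequence match on both mates
        if key in seen:
            continue
        seen.add(key)
        kept.append((rec1, rec2))
    return kept


# Tiny in-memory example: pairs a and b are identical, pair c differs in mate 2.
R1 = "@a/1\nTTTAGG\n+\nIIIIII\n@b/1\nTTTAGG\n+\nIIIIII\n@c/1\nTTTAGG\n+\nIIIIII\n"
R2 = "@a/2\nCCAATT\n+\nIIIIII\n@b/2\nCCAATT\n+\nIIIIII\n@c/2\nCCGATT\n+\nIIIIII\n"
unique_pairs = dedup_pairs(io.StringIO(R1), io.StringIO(R2))
print([r1[0] for r1, _ in unique_pairs])  # ['@a/1', '@c/1']
```

Note that exact matching treats reads that differ by a single sequencing error as distinct, whereas alignment-based tools compare mapping positions instead, so the two approaches will not always remove the same reads.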


Thanks. Should we remove duplicates before quality control (adapter trimming and filtering of low-quality calls) or after it?
