Definition of PCR duplicates based on alignment coordinates
7.1 years ago
abascalfederico ★ 1.2k

Dear all,

I have to identify PCR duplicates by myself and would like to understand how tools like Picard's MarkDuplicates and Samtools' rmdup define them.

Do they require that both the start and end alignment coordinates be the same? Since read quality usually degrades during the last sequencing cycles, I was thinking it would be better to define duplicates as reads that share the same start coordinate, without also requiring them to share the end coordinate.

Is this how Picard/Samtools define them?

Thanks!

PCR duplicates • 1.8k views

What I've found out so far: I do not have an exact answer, but it seems the available software marks duplicates by comparing only the 5' coordinates (taking clipping into account, if present). For paired reads, the 5' coordinates of both the first and second mates have to be identical to those of another pair for the pair to be considered a duplicate.
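
To make the idea concrete, here is a minimal Python sketch of grouping single-end reads by their unclipped 5' coordinate. It is only an illustration of the rule as I understand it, not the actual Picard or Samtools code. The function names, the tuple layout of the reads, the 0-based positions, and the use of strand as part of the key are all my own assumptions. For paired reads, one would presumably combine the keys of both mates.

```python
import re
from collections import defaultdict

# Hypothetical helper names; this is a sketch, not Picard/Samtools code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference
CLIP = set("SH")              # soft and hard clips

def unclipped_five_prime(pos, cigar, is_reverse):
    """Return the 5' coordinate the read would have if clipped bases were aligned.

    pos is assumed to be the 0-based leftmost aligned position.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not is_reverse:
        # Forward strand: the 5' end is on the left, so subtract leading clips.
        lead = 0
        for n, op in ops:
            if op not in CLIP:
                break
            lead += n
        return pos - lead
    # Reverse strand: the 5' end is on the right, so add trailing clips
    # to the last aligned reference position.
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    trail = 0
    for n, op in reversed(ops):
        if op not in CLIP:
            break
        trail += n
    return pos + ref_len - 1 + trail

def group_duplicates(reads):
    """Group read names by (chrom, unclipped 5' coordinate, strand).

    reads is an iterable of (name, chrom, pos, cigar, is_reverse) tuples.
    Only groups with more than one read are returned.
    """
    groups = defaultdict(list)
    for name, chrom, pos, cigar, is_reverse in reads:
        key = (chrom, unclipped_five_prime(pos, cigar, is_reverse), is_reverse)
        groups[key].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Note that with this key the 3' end does not matter for single-end reads: a read aligned as 50M and one aligned as 40M from the same start and strand land in the same group, and a read with a 5-base soft clip at its 5' end is shifted back before comparison.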
