Adding UNPAIRED_READS_EXAMINED and READ_PAIRS_EXAMINED in the metrics of Picard MarkDuplicates
10.1 years ago
ccshao ▴ 10

Adding UNPAIRED_READS_EXAMINED and READ_PAIRS_EXAMINED (counting each pair as two reads) does NOT give me the number of reads in the original file.

Here is my data: the original file has 91137584 reads, which were kept as reliable reads by filtering with the command "samtools view -S -bq 1".

I want to remove the duplicates with Picard, so I set "REMOVE_DUPLICATES=true". The metrics output is:

UNPAIRED_READS_EXAMINED    READ_PAIRS_EXAMINED    UNMAPPED_READS    UNPAIRED_READ_DUPLICATES    READ_PAIR_DUPLICATES    READ_PAIR_OPTICAL_DUPLICATES    PERCENT_DUPLICATION    ESTIMATED_LIBRARY_SIZE
104579    45330048    0    68012    14161407    5034139    0.312796    74939630

The file with duplicates removed has 62746758 reads, and 91137584 - 14161407*2 - 68012 gives exactly that number of reads in the new BAM file.

However, 104579 + 45330048*2 gives 90764675, which is less than the actual number of reads in my file: 372909 reads are unaccounted for.
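
To make the arithmetic easy to re-check, here is a small Python sketch that recomputes the totals from the metrics line above. The variable names are only illustrative, and the numbers are copied from this post. It assumes each read pair counts as two reads. It does not explain where the missing reads went, it only confirms the size of the gap.

```python
# Values copied from the post; variable names are illustrative only.
total_reads_in_input = 91_137_584      # reads in the filtered input BAM
unpaired_reads_examined = 104_579      # UNPAIRED_READS_EXAMINED
read_pairs_examined = 45_330_048       # READ_PAIRS_EXAMINED
unpaired_read_duplicates = 68_012      # UNPAIRED_READ_DUPLICATES
read_pair_duplicates = 14_161_407      # READ_PAIR_DUPLICATES

# Assumption: every pair contributes two reads to the totals.
reads_examined = unpaired_reads_examined + 2 * read_pairs_examined
duplicate_reads = unpaired_read_duplicates + 2 * read_pair_duplicates

# Reads expected in the output when duplicates are removed.
reads_after_removal = total_reads_in_input - duplicate_reads

# Reads in the input that are not covered by the two "examined" counters.
reads_not_examined = total_reads_in_input - reads_examined

# Ratio of duplicate reads to examined reads, for comparison with
# the PERCENT_DUPLICATION column.
percent_duplication = duplicate_reads / reads_examined

print("reads examined:      ", reads_examined)
print("duplicate reads:     ", duplicate_reads)
print("reads after removal: ", reads_after_removal)
print("reads not examined:  ", reads_not_examined)
print("percent duplication: ", round(percent_duplication, 6))
```

The output count (62746758) and the duplication ratio (0.312796) both agree with what Picard reported, so the only inconsistency is the 372909 reads that appear in the input but in neither "examined" column.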

Could someone explain to me what is going on?

Thanks!

picard markduplicates • 2.3k views