Understanding Picard's HsMetrics
7.1 years ago

Hello,

I am having trouble understanding the output of Picard's CollectHsMetrics. Because I don't have a bait file, I used my BED file with the target regions to produce the interval list.

java -jar picard.jar BedToIntervalList I=target.bed SD=genome.dict O=target.interval_list

I then use this interval list for both the BAIT and the TARGET parameter:

java -jar picard.jar CollectHsMetrics I=Sample.bam O=hsmetrix.txt BI=target.interval_list TI=target.interval_list

As I understand it, ON_BAIT_BASES and ON_TARGET_BASES should be identical because the same interval list is used for both (and the same should hold for every other metric that is reported separately for bait and target). But in my case I get this result:

ON_BAIT_BASES: 1012933970

ON_TARGET_BASES: 678789480

Please help me understand where this difference comes from.

Thanks a lot.

fin swimmer

picardtools sam bam • 4.2k views
7.1 years ago

As I understand it, the 'bait' is the read and the 'target' is the reference. If you have insertions or clipped bases in your reads, there will be a higher number of bases for the reads.

Looking at the code in Picard:


                int onBaitBases = 0;

                if (!probes.isEmpty()) {
                    for (final Interval bait : probes) {
                        for (final AlignmentBlock block : record.getAlignmentBlocks()) {
                            final int end = CoordMath.getEnd(block.getReferenceStart(), block.getLength());

                            for (int pos = block.getReferenceStart(); pos <= end; ++pos) {
                                if (pos >= bait.getStart() && pos <= bait.getEnd()) ++onBaitBases;
                            }
                        }
                    }
                }
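
To make the counting logic in that excerpt easier to follow, here is a minimal, self-contained sketch of the same idea. The `Block` and `Interval` classes are hypothetical stand-ins written for this example, not the htsjdk types, and the per-position loop is replaced by an equivalent overlap calculation. As far as I understand htsjdk, alignment blocks describe only the gap-free aligned stretches of a read, so in this sketch inserted bases simply have no block of their own; treat that as an assumption rather than a statement about Picard's exact behaviour.

```java
import java.util.List;

public class OnBaitSketch {
    // Hypothetical stand-in for an alignment block: a gap-free stretch of
    // read bases aligned to the reference (1-based start, length in bases).
    static final class Block {
        final int referenceStart;
        final int length;
        Block(int referenceStart, int length) {
            this.referenceStart = referenceStart;
            this.length = length;
        }
    }

    // Hypothetical stand-in for a bait interval (1-based, inclusive).
    static final class Interval {
        final int start;
        final int end;
        Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Counts the reference positions covered by the blocks that fall inside a bait.
    // A position covered by two overlapping baits is counted once per bait,
    // just as it would be in the nested loops of the excerpt.
    static long countOnBaitBases(List<Block> blocks, List<Interval> baits) {
        long onBaitBases = 0;
        for (Interval bait : baits) {
            for (Block block : blocks) {
                int blockEnd = block.referenceStart + block.length - 1;
                int overlapStart = Math.max(block.referenceStart, bait.start);
                int overlapEnd = Math.min(blockEnd, bait.end);
                if (overlapEnd >= overlapStart) {
                    onBaitBases += overlapEnd - overlapStart + 1;
                }
            }
        }
        return onBaitBases;
    }

    public static void main(String[] args) {
        // A 22-base read aligned as 10M2I10M at position 95 yields two blocks:
        // reference 95..104 and 105..114. The two inserted bases have no block.
        List<Block> blocks = List.of(new Block(95, 10), new Block(105, 10));
        List<Interval> baits = List.of(new Interval(100, 200));
        System.out.println(countOnBaitBases(blocks, baits)); // prints 15
    }
}
```

In this toy case only 15 of the 22 read bases are counted: the five positions before the bait start and the two inserted bases contribute nothing.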

Hello Pierre, that makes it a little clearer.

So ON_TARGET here means only those bases of a read that are really mapped to a target position, while ON_BAIT also counts bases that are not strictly mapped to a reference position but lie "somewhere between" (insertions) or are clipped?

Is the ratio of my two values a common one?
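
For reference, the ratio in question can be worked out directly from the two counts reported above. The sketch below is only that arithmetic; the class and method names are made up for this example, and the result says nothing by itself about whether the value is typical.

```java
public class OnTargetRatioSketch {
    // Ratio of on-target to on-bait bases; both arguments are raw base counts.
    static double ratio(long onTargetBases, long onBaitBases) {
        return (double) onTargetBases / onBaitBases;
    }

    public static void main(String[] args) {
        long onBaitBases = 1_012_933_970L;  // ON_BAIT_BASES from the question
        long onTargetBases = 678_789_480L;  // ON_TARGET_BASES from the question
        System.out.printf("ON_TARGET_BASES / ON_BAIT_BASES = %.3f%n",
                ratio(onTargetBases, onBaitBases));
    }
}
```

With these numbers the ratio is about 0.67, i.e. roughly two thirds of the bases counted as on-bait are also counted as on-target.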

fin swimmer
