BWA mem alignment taking a while in a virtual machine
9.2 years ago
Robert Sicko ▴ 630

I've processed targeted resequencing data from a Haloplex panel run on a MiSeq using Agilent's Surecall software and generated variant calls. However, I want to generate calls with my own GATK pipeline to compare against Surecall's.

I'm having trouble with the run time of bwa mem when it runs inside VirtualBox.

VirtualBox (note the huge difference between real time and CPU time):

[main] Real time: 21280.065 sec; CPU: 2547.937 sec

Surecall

[main] Real time: 572.116 sec; CPU: 375.354 sec
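One way to read the two timing lines above: bwa reports wall-clock ("Real") time and CPU time summed over all threads, so a CPU-bound multi-threaded run would normally show CPU time at or above real time. A minimal sketch that divides one by the other (the helper name `cpu_to_real_ratio` is made up for illustration):

```shell
# Ratio of CPU time to wall-clock time from a bwa "[main] Real time" log line.
# A ratio well below 1 means the process spent most of its time waiting
# rather than computing.
cpu_to_real_ratio() {
  printf '%s\n' "$1" |
    sed -n 's/.*Real time: \([0-9.]*\) sec; CPU: \([0-9.]*\) sec.*/\1 \2/p' |
    awk '{ printf "%.2f\n", $2 / $1 }'
}

cpu_to_real_ratio '[main] Real time: 21280.065 sec; CPU: 2547.937 sec'   # VirtualBox: prints 0.12
cpu_to_real_ratio '[main] Real time: 572.116 sec; CPU: 375.354 sec'      # Surecall: prints 0.66
```

A ratio of 0.12 with `-t 6` suggests the VirtualBox run was idle for most of its wall-clock time, which would be consistent with waiting on disk or swap, though the log line alone cannot confirm the cause.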

Surecall uses bwa version 0.7.5a-r405 (Windows port version 1.2) and the following command:

bwa.exe, mem, -M, -D, 0,0, -B, 4, -A, 1.0, -w, 100, -k, 19, -R, @RG\tID:X\tSM:X, -t, 4, hg19.fasta, R1_Cut.fastq, R2_Cut.fastq, >, X.sam

In VirtualBox I'm using bwa version 0.7.12-r103 and the following command:

bwa mem -t 6 -M -R '@RG\tID:X\tSM:X' human_g1k_v37.fasta.gz R1_trimpaired.fastq.gz R2_trimpaired.fastq.gz > X.sam

In VirtualBox I'm running Bio-Linux 8 with 8 cores and 5 GB of memory dedicated to it. I suspect 5 GB is not enough and the guest is paging, which would explain the slowdown. I might be able to increase the allocation slightly, but the computer itself has only 8 GB in total and the host OS needs some of it. The only other difference is the -D parameter that Surecall passes to bwa 0.7.5. I dug through bwa's git and found:

-D FLOAT drop chains shorter than FLOAT fraction of the longest overlapping chain [%.2f]\n", opt->drop_ratio);

I did not see that in the bwa manual, so I did not specify it in my command (0 could be the default anyway as Surecall specified a few parameters with default values).

alignment next-gen bwa • 3.3k views

For human, 5.5 GB is the minimum; 6 GB is better.
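A rough way to sanity-check a figure like this on your own index, assuming bwa mem keeps the `.bwt`, `.sa` and `.pac` index files fully in memory (that assumption, and the helper name `index_bytes`, belong to this sketch rather than to the bwa documentation): add up their sizes on disk.

```shell
# Sum the on-disk sizes (in bytes) of the index files for a given bwa index prefix.
# Assumption: .bwt, .sa and .pac are the files held in memory during alignment.
index_bytes() {
  prefix=$1
  total=0
  for ext in bwt sa pac; do
    f="$prefix.$ext"
    [ -f "$f" ] || continue            # skip files that are not present
    size=$(wc -c < "$f")
    total=$((total + size))
  done
  echo "$total"
}

# Index prefix taken from the command in the question; prints 0 if no index is found.
echo "$(( $(index_bytes human_g1k_v37.fasta.gz) / 1024 / 1024 )) MiB"
```

Compare the result with the total reported by `free -m` inside the virtual machine. Read buffers and per-thread working memory come on top, so treat the sum as a lower bound rather than an exact requirement.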


Thanks Heng! After running with 6GB allocated to the virtual machine the run times are much better.

[main] Real time: 231.878 sec; CPU: 741.689 sec

Any chance you could elaborate on what the Surecall parameter -D, 0,0 is doing? Is this the default?
