Number of Threads for BWA MEM
6.9 years ago
haiying.kong ▴ 360

Is it recommended to use a power of two (2**N: 4, 8, 16, 32, ...) as the number of threads for BWA MEM? Or is any number just as good, as long as the machine has enough memory?

bwa • 15k views

You can't use more threads than your computer has available, for a start. Experiment with a small set of reads (100K) to see what number of cores works best on your system in terms of time to complete. You will eventually saturate something (PCI-E/memory bus) and see performance plateau beyond a certain number of cores/threads.
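A rough sketch of that experiment in Python, assuming bwa is on your PATH, the reference is already indexed, and you have a subsampled FASTQ to hand. The function names and the `run` hook are made up for illustration; only `bwa mem -t` itself comes from this thread.

```python
import subprocess
import time


def bwa_mem_command(threads, reference, reads):
    """Build the argument list for one `bwa mem` run using `threads` threads."""
    return ["bwa", "mem", "-t", str(threads), reference, reads]


def time_thread_counts(thread_counts, reference, reads, run=None):
    """Return {threads: wall-clock seconds} for each requested thread count.

    `run` is a hypothetical hook so the loop can be exercised without bwa
    installed. By default it executes the command and discards the output,
    so the cost of writing the SAM file is not part of the measurement.
    """
    if run is None:
        def run(cmd):
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=True,  # raise if bwa exits with an error
            )
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        run(bwa_mem_command(n, reference, reads))
        timings[n] = time.perf_counter() - start
    return timings
```

Calling `time_thread_counts([1, 2, 4, 6, 8, 12, 16], "ref.fa", "subset_100k.fq")` with your own paths (the ones here are placeholders) gives one timing per thread count. The point where the times stop dropping is the plateau. Mixing in non-powers of two, such as 6 and 12, also lets you check the 2**N question directly on your own hardware.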


If your machine/cluster can support it, using many threads with -t should be fine.


So there is no advantage to using a thread count of 2**N, is that correct?


I'd stick to the greatest number of threads your processor can support, unless you're on a workstation and you actually need to do other stuff while bwa runs.
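If you want to script that choice, here is a minimal sketch. `pick_threads` and its `reserve` parameter are hypothetical names for illustration; `os.cpu_count()` is the standard-library call that reports the number of logical CPUs.

```python
import os


def pick_threads(reserve=0, available=None):
    """Suggest a value for `bwa mem -t`.

    reserve:   logical CPUs to leave free (e.g. 1 or 2 on a workstation).
    available: logical CPUs to assume; defaults to os.cpu_count().
    """
    if available is None:
        # os.cpu_count() can return None when the count is undetermined.
        available = os.cpu_count() or 1
    # Never suggest fewer than one thread.
    return max(1, available - reserve)
```

On a dedicated machine `pick_threads()` uses everything; on a workstation something like `pick_threads(reserve=2)` leaves room for other work. On a shared cluster, `os.cpu_count()` reports the whole node rather than what the scheduler gave you, so pass the allocated core count as `available` instead.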


Sorry, I'm new to this. What are threads, and why do we have to specify them on the command line?


Most computer processors today are multi-core. A single program can also run several streams of execution at once, called threads, which lets it use those cores in parallel. Because each read can be aligned independently of the others, the alignment work can be split across threads, and the run completes faster. The -t option tells bwa mem how many threads to use.

Cores/threads are only one part of the equation. You also need to make sure that data is fed to the processor fast enough, and that is generally where the bottleneck is. Disks are only so fast (even SSDs), and the pathway that carries data from those disks to the CPU is slow relative to the CPU itself.

6.3 years ago

Are you looking for something like this?

[image not preserved]

Source: http://en.community.dell.com/techcenter/high-performance-computing/b/genomics

6.3 years ago
fwuffy ▴ 110

Bump. You cannot run bwa with 1000 threads on one host and expect it to be 1000x faster than a single thread. There will be a theoretical optimum, which is a function of the memory available per thread and the reference genome size. Has anyone actually done tests?


This will very likely depend on your computer architecture, for example I/O speed.


+1 - for most setups, alignment will be I/O bound before any other bottleneck gets hit.


"There will be a theoretical optimum"

Did you mean Amdahl's law?
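For anyone who hasn't met it: Amdahl's law says that if a fraction p of the work can be parallelized, the best possible speedup on n threads is 1 / ((1 - p) + p / n). A small sketch follows. The 0.95 used in the example is an assumed value for illustration, not a measured figure for bwa.

```python
def amdahl_speedup(n_threads, parallel_fraction):
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction is the share of the run time that can be spread
    across threads (between 0 and 1); the rest stays serial.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_threads)


# Illustrative only: 0.95 is an assumption, not a bwa measurement.
for n in (1, 8, 64, 1000):
    print(n, round(amdahl_speedup(n, 0.95), 1))
```

With that assumed 5% serial share, even 1000 threads stay below a 20x speedup, which matches the point above about not getting 1000x. Strictly speaking this gives a ceiling rather than an optimum, and it says nothing about I/O saturation, so real runs may flatten out earlier.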
