Indexing 34 Gbp Reference Genomes With bwa index -a bwtsw
12.2 years ago
Ahdf-Lell-Kocks ★ 1.6k

I tried to index 34 GB worth of primate genomes in a single run using bwa index -a bwtsw. It got most of the way through, but then crashed with a segmentation fault at this point:

[BWTIncConstructFromPacked] 7090 iterations done. 70465272750 characters processed.
[BWTIncConstructFromPacked] 7100 iterations done. 70493213214 characters processed.
[BWTIncConstructFromPacked] 7110 iterations done. 70518042750 characters processed.
[bwa_index] 118525.77 seconds elapse.
[bwa_index] Update BWT... /tools/lsf/spool/1327577148.2686228: line 8:  7412 Segmentation fault      (core dumped) ~/src/bwa/bwa-0.6.1/bwa index -a bwtsw primates.fa

I ran this as a job with a 30000M memory limit; I am trying again now (still chugging along) with 60000M.

Any suggestions on what might be happening here?


Can you give a bit more information? RAM size in particular might matter: I have seen bwa segfault when it ran out of memory, and it does not tell you so. There is no message like 'couldn't acquire 200G of memory'.

12.2 years ago

You can try to find the error using gdb.

In $BWA/Makefile, check that the -g option is set in CFLAGS, and that -m32 is not: a 32-bit build cannot address the memory an index of this size needs.

CFLAGS=         -g -Wall

Clean and recompile:

make clean
make

Run gdb and call backtrace after the failure:

$ gdb /path/to/bwa
(gdb) run index -a bwtsw primates.fa
(...)
Error
(...)
(gdb) backtrace

It should give us some insight into the problem.
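Since the original run was a batch job, an interactive gdb session may not be practical. A minimal sketch of the same steps in non-interactive form, using gdb's -batch, -ex and --args options, could look like this. The BWA_BIN and REF variables and the gdb_backtrace_cmd helper are hypothetical names for illustration; the path is a placeholder, not a real location.

```shell
#!/bin/sh
# Hypothetical placeholders: point these at your own debug build and reference.
BWA_BIN="${BWA_BIN:-/path/to/bwa}"
REF="${REF:-primates.fa}"

# Build a gdb command line that runs bwa index under the debugger and
# prints a backtrace once the program stops (for example on a segfault).
#   -batch        : exit gdb after the commands have run
#   -ex run       : start the program
#   -ex backtrace : print the call stack at the point of failure
#   --args        : everything after this is the program and its arguments
gdb_backtrace_cmd() {
    printf 'gdb -batch -ex run -ex backtrace --args %s index -a bwtsw %s\n' "$1" "$2"
}

# Print the command instead of executing it, so it can be reviewed
# and pasted into a job script.
gdb_backtrace_cmd "$BWA_BIN" "$REF"
```

Printing the command first is a deliberate choice here: the indexing run takes many hours, so it is worth checking the exact invocation before submitting it.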

12.2 years ago
Ahdf-Lell-Kocks ★ 1.6k

Answering my own question: the indexing needed about 60 GB of memory, and it now works. I also tried bwa bwasw -Z 10; it loads the index in about 64 GB and aligns normally.


I guess this is the first time that someone has run BWA on a 34GB genome. Yes, bwa/bwa-sw needs about twice as much memory as the size of your reference genome.
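As a rough illustration of that rule of thumb, the sketch below doubles a reference size in bytes and reports the result in megabytes, which is the unit used for the job limits earlier in the thread. The function name estimate_index_mem_mb is hypothetical, and the factor of two is only the approximation quoted here, not a guarantee from bwa.

```shell
#!/bin/sh
# Rule of thumb from this thread: memory needed is roughly 2 x reference size.
# This is an approximation only; actual usage may differ.
estimate_index_mem_mb() {
    ref_bytes=$1
    # Double the size, then convert bytes to megabytes (10^6 bytes).
    echo $(( ref_bytes * 2 / 1000000 ))
}

# Example: a 34 GB reference (34 * 10^9 bytes).
estimate_index_mem_mb 34000000000

# With a real file, the size could be taken from wc:
#   estimate_index_mem_mb "$(wc -c < primates.fa)"
```

For a 34 GB reference this prints 68000, which is in the same range as the roughly 60 GB reported above and well over the 30000M limit of the run that crashed.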


Excuse me, is twice the memory required only for indexing, or for mapping as well?
