Issue indexing a genome with bowtie2
3.4 years ago
Rox ★ 1.4k

Hello everyone,

I have been stuck for quite some time on a side task with Hi-C that requires first indexing the genome with bowtie2. I have used it on other reference genomes with no issues, but I just can't make it work for this one particular genome.

I am using bowtie2-2.3.5.1, and my command is:

bowtie2-build mother_raw_wtdbg2_58x_polished.fa bowtie2_index/mother_raw_wtdbg2_58x_polished

This is the output I am getting:

Settings:
  Output files: "bowtie2_index/mother_raw_wtdbg2_58x_polished.*.bt2"
  Line rate: 6 (line is 64 bytes)
  Lines per side: 1 (side is 64 bytes)
  Offset rate: 4 (one in 16)
  FTable chars: 10
  Strings: unpacked
  Max bucket size: default
  Max bucket size, sqrt multiplier: default
  Max bucket size, len divisor: 4
  Difference-cover sample period: 1024
  Endianness: little
  Actual local endianness: little
  Sanity checking: disabled
  Assertions: disabled
  Random seed: 0
  Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
  mother_raw_wtdbg2_58x_polished.fa
Building a SMALL index
Reading reference sizes
  Time reading reference sizes: 00:00:33
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
  Time to join reference sequences: 00:00:25
bmax according to bmaxDivN setting: 660338289
Using parameters --bmax 495253717 --dcv 1024
  Doing ahead-of-time memory usage test
  Passed!  Constructing with these parameters: --bmax 495253717 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
  Building sPrime
  Building sPrimeOrder
  V-Sorting samples
  V-Sorting samples time: 00:01:27
  Allocating rank array
  Ranking v-sort output
  Ranking v-sort output time: 00:00:18
  Invoking Larsson-Sadakane on ranks
  Invoking Larsson-Sadakane on ranks time: 00:00:44
  Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
  (Using difference cover)
  Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
  Splitting and merging time: 00:00:00
Avg bucket size: 2.64135e+09 (target: 495253716)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
  No samples; assembling all-inclusive block
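
The sizes reported in this log can be cross-checked with a little arithmetic. This is a minimal sketch, assuming that the "bmax according to bmaxDivN" value is the joined reference length divided by the "len divisor" of 4 shown in the settings (an assumption read off the log, not checked against the bowtie2 source). The variable names are illustrative only.

```python
# Values copied from the bowtie2-build log above.
bmax_div_n = 4                # "Max bucket size, len divisor: 4"
bmax_from_div = 660_338_289   # "bmax according to bmaxDivN setting"
bmax_used = 495_253_717       # "Using parameters --bmax ..."

# Assumption: bmax_from_div is the joined reference length divided by bmax_div_n.
implied_length = bmax_from_div * bmax_div_n

# A small index stores offsets as 32-bit numbers, so the reference must fit in 2**32.
fits_small_index = implied_length < 2**32

print(f"implied joined length: {implied_length:,} bases")
print(f"fits a small (32-bit) index: {fits_small_index}")
print(f"bmax used / bmax from divisor: {bmax_used / bmax_from_div:.4f}")
```

Under that assumption the implied length is about 2.64 billion bases, which agrees with the reported average bucket size of 2.64135e+09 and is below the 2^32 limit, so "Building a SMALL index" is the expected choice. It also suggests that the single block ("block 1 of 1") spans the whole reference, far above the target of 495253716.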

And this is the directory where it writes its output:

sbsuser@node125: /work/sbsuser/test/roxane/bowtie2 $ll
total 630M
-rw-r--r-- 1 sbsuser GET-PLAGE  72K Dec 10 10:48 bovin_genome_index.1.bt2
-rw-r--r-- 1 sbsuser GET-PLAGE    0 Dec 10 10:48 bovin_genome_index.2.bt2
-rw-r--r-- 1 sbsuser GET-PLAGE  43K Dec 10 10:48 bovin_genome_index.3.bt2
-rw-r--r-- 1 sbsuser GET-PLAGE 630M Dec 10 10:48 bovin_genome_index.4.bt2
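
A complete small bowtie2 index consists of six files (.1.bt2 to .4.bt2 plus .rev.1.bt2 and .rev.2.bt2), so a listing like this one can be checked automatically. A minimal sketch, where `check_index` is a hypothetical helper name and the prefix is whatever was passed to bowtie2-build:

```python
from pathlib import Path

# Suffixes of a complete small bowtie2 index (large indexes use .bt2l instead).
SMALL_INDEX_SUFFIXES = (
    ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2",
)

def check_index(prefix):
    """Return a list of problems (missing or empty files) for an index prefix."""
    problems = []
    for suffix in SMALL_INDEX_SUFFIXES:
        path = Path(f"{prefix}{suffix}")
        if not path.exists():
            problems.append(f"missing: {path.name}")
        elif path.stat().st_size == 0:
            problems.append(f"empty: {path.name}")
    return problems
```

Run against the directory above with the prefix `bovin_genome_index`, this would flag the empty .2.bt2 file and the two missing .rev files, which is consistent with a build that stopped partway through.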

So it looks like it starts doing something, then for some reason considers it "empty" and stops indexing.
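
When a build stops without an error message, the exit status can show whether the process ended by itself or was terminated from outside. A minimal sketch using Python's `subprocess` module, where `describe_exit` and `run_and_report` are hypothetical helper names, and the negative-return-code convention for signals applies on POSIX systems only:

```python
import subprocess

def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports termination by signal N as -N.
        return f"killed by signal {-returncode}"
    return f"exited with error code {returncode}"

def run_and_report(cmd, log_path):
    """Run cmd, write stdout and stderr to log_path, and describe how it ended."""
    with open(log_path, "w") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return describe_exit(result.returncode)
```

For the build in this post, the call would be `run_and_report(["bowtie2-build", "mother_raw_wtdbg2_58x_polished.fa", "bowtie2_index/mother_raw_wtdbg2_58x_polished"], "build.log")`. What the result means for this particular failure is not something the sketch can decide; it only records how the process ended.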

I have absolutely no idea what I am doing wrong, and I have been pulling my hair out over this for too long... Can someone please help me by pointing out the dumb mistake I am probably making?

Have a nice day,

Roxane

software error bowtie2 • 1.4k views

I don't think this is an issue on your side. There are a couple of GitHub issues about this kind of error, e.g. https://github.com/BenLangmead/bowtie2/issues/194. I would probably add a comment there and see what the developers have to say.
