How do the memory requirements of ABySS scale?
6.2 years ago

I have been running a few test runs with the ABySS assembler on a subset of my input data (to speed things up) in order to optimise the k-mer size. I'm now wondering whether I can guesstimate the memory requirements of my full run from the memory usage of these trial runs.

More specifically, does the memory scale with the input file size or rather with the genome size? I was thinking that, e.g., doubling my input size will not double the memory used, as the number of distinct k-mers to keep in memory most likely plateaus.

Any help or other user experiences are much appreciated.

thx

abyss Assembly • 1.7k views
6.2 years ago
benv

@lieven.sterck,

You have the right idea with respect to distinct k-mers -- the memory usage of ABySS is linear w.r.t. the number of distinct k-mers in the input reads.
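If memory really is linear in the number of distinct k-mers, a couple of trial runs are in principle enough to extrapolate. Below is a minimal sketch of that idea in Python. The trial numbers are made up for illustration, and `fit_linear` and `predict_mem_gb` are hypothetical helper names, not part of ABySS. Treat the result as a rough guide only, since a straight-line fit says nothing about overheads that do not scale with k-mer count.

```python
def fit_linear(points):
    """Least-squares fit of mem_gb = a * distinct_kmers + b.

    points: list of (distinct_kmers, peak_mem_gb) pairs from trial runs.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict_mem_gb(a, b, distinct_kmers):
    """Extrapolate peak memory for a run with the given distinct k-mer count."""
    return a * distinct_kmers + b


# Hypothetical trial runs: (distinct k-mers, observed peak memory in GB).
trials = [(1.0e9, 12.0), (2.0e9, 22.0), (4.0e9, 42.0)]

slope, intercept = fit_linear(trials)
estimate = predict_mem_gb(slope, intercept, 10.0e9)
print(f"about {slope * 1e9:.1f} GB per billion distinct k-mers, "
      f"{intercept:.1f} GB fixed overhead")
print(f"estimated peak memory for 10 billion distinct k-mers: {estimate:.0f} GB")
```

The distinct k-mer count for the full data set has to come from a k-mer counter, not from the file size, which is the whole point of the question.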

The number of distinct k-mers in the data set depends jointly on: (i) genome size, (ii) sequencing error rate (sequencing errors create unique k-mers), and (iii) read coverage.
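To see how these three factors interact, here is a toy simulation in Python. It is only a sketch under simplifying assumptions: a random 5 kb "genome", uniformly sampled forward-strand reads, substitution errors only, and no reverse-complement handling. All names and parameters are illustrative and have nothing to do with ABySS internals.

```python
import random


def distinct_kmers(reads, k):
    """Count distinct k-mers (forward strand only) across all reads."""
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            seen.add(read[i:i + k])
    return len(seen)


def simulate_reads(genome, coverage, read_len, error_rate, rng):
    """Sample reads uniformly from the genome, adding substitution errors."""
    n_reads = coverage * len(genome) // read_len
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        bases = list(genome[start:start + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(bases))
    return reads


rng = random.Random(42)
genome = "".join(rng.choice("ACGT") for _ in range(5000))
k = 21
genome_kmers = len(genome) - k + 1  # upper bound for error-free reads

counts = {}
for error_rate in (0.0, 0.01):
    for coverage in (5, 10, 20, 40):
        reads = simulate_reads(genome, coverage, 100, error_rate, rng)
        counts[(error_rate, coverage)] = distinct_kmers(reads, k)
        print(f"error={error_rate:<5} coverage={coverage:>2}x "
              f"distinct k-mers={counts[(error_rate, coverage)]}")
```

Without errors the count saturates near the number of k-mers in the genome, however much coverage is added. With errors it keeps growing with coverage, because each erroneous base can create up to k new k-mers.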

You can determine the number of distinct k-mers in a data set by using a k-mer counter tool. I recommend "ntCard" from our own lab because it is quite fast.
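For a small subset of reads, an exact count is also easy to sketch in plain Python. The example below is illustrative only and not how ntCard works. It treats a k-mer and its reverse complement as the same k-mer and skips k-mers containing ambiguous bases. Both choices are assumptions and may not match what a given tool reports. A set-based count like this uses far too much memory for billions of k-mers, which is why a dedicated tool is the better choice for real data.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID = set("ACGT")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def count_distinct_canonical(seqs, k):
    """Exact count of distinct canonical k-mers in an iterable of sequences."""
    seen = set()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing N or other ambiguity codes.
            if set(kmer) <= _VALID:
                seen.add(canonical(kmer))
    return len(seen)


# Tiny made-up example: the second read is the same sequence shifted,
# the third contains an N.
reads = ["ACGTAC", "GTACGT", "ACGNAC"]
n_distinct = count_distinct_canonical(reads, 3)
print(n_distinct)
```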

ABySS does not currently have a feature to estimate memory requirements before running the assembly, unfortunately. It is mostly a trial-and-error affair at the moment.

If you find that you do not have adequate RAM to assemble your target genome, the ABySS Bloom filter assembly mode is worth a look (see the README).


@benv

I started my full run, and with a 10-fold increase in input data I only observe a 2-fold increase in memory usage. So this seems to confirm our reasoning :-)

thx, L.


Perhaps even more informative: I went from 25 billion k-mers to 37 billion (k = 85), which indeed roughly corresponds to the 2-fold memory increase.
