Question

Checkpointing with ABySS after hashing input files

5.0 years ago

Dave • 0

I am attempting to assemble a large genome (~2.8 Gbp) using ABySS. The cluster I am running on has a maximum run time of 48 hours. From the documentation, it appears that ABySS can continue a run after premature termination, but when I try to continue the run it restarts from the beginning (hashing in the sequence files).

Here is the command I am currently running:

abyss-pe np=16 j=16 v=-v name=15S k=75 lib='pea' \
pea='L005_R1_001.fastq L005_R2_001.fastq' \
se='L005_R1_001_unpaired.fastq L005_R2_001_unpaired.fastq'

Here is the tail of my output file:

12: Removed 42751 marked k-mer.
15: Removed 42201 marked k-mer.
14: Removed 42966 marked k-mer.
10: Removed 43162 marked k-mer.
5: Removed 42565 marked k-mer.
13: Removed 42760 marked k-mer.
4: Removed 43183 marked k-mer.
1: Removed 42650 marked k-mer.
Pruned 682074 k-mer in 55894 tips.
Pruning tips shorter than 32 bp...

The file cuts off here because the 48-hour maximum run time was reached, and the only output file is coverage.hist. Is it possible to pick up from here? Or is the run not far enough along the workflow to have checkpointed, or am I not using the makefile correctly?

TIA

Assembly genome sequencing abyss de novo • 795 views


Is it possible that your job is not reaching a stage that can be checkpointed? A 48 h maximum job slot seems really restrictive, especially for a large genome. If you can't get more time, can you use additional cores to see whether the run progresses to a point where something gets checkpointed?
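
One way to check how far a run got is to look for the intermediate files that make would treat as finished targets. abyss-pe is a Makefile, so it can only resume from a stage whose output file was written completely. Below is a minimal sketch of such a check. The helper name report_stages is hypothetical, and the stage file names in the usage note (15S-1.fa and so on) are assumptions based on the usual abyss-pe naming for name=15S, so verify them against your ABySS version.

```shell
#!/bin/sh
# Hypothetical helper: report which candidate stage files exist and are
# non-empty in a run directory. An empty or absent file is reported as
# missing, since make could not resume from it.
# Usage: report_stages DIR FILE...
report_stages() {
    dir=$1
    shift
    for f in "$@"; do
        if [ -s "$dir/$f" ]; then
            echo "present $f"
        else
            echo "missing $f"
        fi
    done
}
```

For the run above, something like `report_stages . coverage.hist 15S-1.fa 15S-3.fa 15S-6.fa 15S-8.fa` would show which stages completed (again, the file names are assumed). If 15S-1.fa is missing, the initial assembly stage most likely did not finish, and rerunning the same command would start over from reading the input files. Because any make option can be passed to abyss-pe, appending `-n` to the original command should print the commands that would be run without executing them, which is another way to see where a rerun would start.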

5.0 years ago by GenoMax • 141k