Question

masurca runCA failed


5.1 years ago

mariachiaracascarano ▴ 10

Hi everybody, I'm still a newbie in bioinformatics and I'm stuck on MaSuRCA 3.3.0, so any help would be more than welcome. I am trying to assemble a bacterial genome from MiSeq paired-end reads with MaSuRCA 3.3.0, without the grid options. My configuration file looks like this:

PE= aa 519 844
/home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R1_001.fastq
/home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R2_001.fastq


#Illumina mate pair reads supplied as <two-character prefix> <fragment mean> <fragment stdev> <forward_reads> <reverse_reads>
#JUMP= sh 3600 200 
#pacbio OR nanopore reads must be in a single fasta or fastq file with absolute path, can be gzipped
#if you have both types of reads supply them both as NANOPORE type
#PACBIO=/FULL_PATH/pacbio.fa
#NANOPORE=/FULL_PATH/nanopore.fa
#Other reads (Sanger, 454, etc) one frg file, concatenate your frg files into one if you have many
#OTHER=/FULL_PATH/file.frg
END

PARAMETERS
#set this to 1 if your Illumina jumping library reads are shorter than 100bp
#EXTEND_JUMP_READS=0
#this is k-mer size for deBruijn graph values between 25 and 127 are supported, auto will compute the optimal size based on the read data and GC content
GRAPH_KMER_SIZE = auto
#set this to 1 for all Illumina-only assemblies
#set this to 0 if you have more than 15x coverage by long reads (Pacbio or Nanopore) or any other long reads/mate pairs (Illumina MP, Sanger, 454, etc)
USE_LINKING_MATES = 1
#specifies whether to run mega-reads correction on the grid
#USE_GRID=0
#specifies grid engine to use SGE or SLURM
#GRID_ENGINE=SLURM
#specifies queue (for SGE) or partition (for SLURM) to use when running on the grid MANDATORY
#GRID_QUEUE=all.q
#batch size in the amount of long read sequence for each batch on the grid
#GRID_BATCH_SIZE=300000000
#use at most this much coverage by the longest Pacbio or Nanopore reads, discard the rest of the reads
#LHE_COVERAGE=25
#set to 1 to only do one pass of mega-reads, for faster but worse quality assembly
MEGA_READS_ONE_PASS=0
#this parameter is useful if you have too many Illumina jumping library mates. Typically set it to 60 for bacteria and 300 for the other organisms
#LIMIT_JUMP_COVERAGE = 60
#these are the additional parameters to Celera Assembler.  do not worry about performance, number or processors or batch sizes -- these are computed automatically.
#set cgwErrorRate=0.25 for bacteria and 0.1<=cgwErrorRate<=0.15 for other organisms.
CA_PARAMETERS =  cgwErrorRate=0.25
#minimum count k-mers used in error correction 1 means all k-mers are used.  one can increase to 2 if Illumina coverage >100
KMER_COUNT_THRESHOLD = 1
#whether to attempt to close gaps in scaffolds with Illumina data
CLOSE_GAPS=1
#auto-detected number of cpus to use
NUM_THREADS = 20
#this is mandatory jellyfish hash size -- a safe value is estimated_genome_size*estimated_coverage
JF_SIZE = 460000000
#set this to 1 to use SOAPdenovo contigging/scaffolding module.  Assembly will be worse but will run faster. Useful for very large (>5Gbp) genomes from Illumina-only data
SOAP_ASSEMBLY=0
END

I get an error

[Mon Mar  4 12:18:14 EET 2019] Overlap/unitig failed, check output under CA/ and runCA1.out

with less on runCA1.out I get:

----------------------------------------
END Mon Mar  4 12:18:14 2019 (0 seconds)
Created 13 overlap jobs.  Last batch '001', last job '000013'.
----------------------------------------
START Mon Mar  4 12:18:14 2019
sbatch   -D `pwd` -J "ovl_genome[1-13]" -a 1-13 \
  -o /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/%A_%a.out \
  /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/overlap.sh
sh: 1: sbatch: not found
----------------------------------------
END Mon Mar  4 12:18:14 2019 (0 seconds)
ERROR: Failed with signal 127

================================================================================
runCA failed.
----------------------------------------
Stack trace:
at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 1613.
    main::caFailure("Failed to submit batch jobs.") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 87
    main::submitBatchJobs("   -D `pwd` -J \"ovl_genome[1-13]\" -a 1-13 \\\x{a}  -o /home1/casca"..., "ovl_genome[1-13]") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 3809
    main::createOverlapJobs("normal") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 6523
----------------------------------------
Failure message:
    Failed to submit batch jobs.

masurca assembly stack illumina • 2.1k views

updated 9 months ago by Ram 43k • written 5.1 years ago by mariachiaracascarano ▴ 10

score 0 · Answer 1 · 2019-03-04


5.1 years ago

Philipp Bayer 8.3k

Are you sure you posted the correct config file? The log shows runCA trying to submit a SLURM job (sbatch: not found). Perhaps the default for USE_GRID is 1, and in your file that line is commented out (#USE_GRID=0). It should read USE_GRID=0, with no leading #.
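One way to check which settings are actually active is to list only the uncommented assignment for a given parameter. The following is a minimal sketch, assuming a config file in the key=value style shown in the question. The helper name active_setting and the file name config.txt are hypothetical, and whether MaSuRCA 3.3.0 really defaults USE_GRID to 1 is not confirmed in this thread.

```shell
#!/bin/sh
# Print the active (uncommented) assignment of a parameter in a
# MaSuRCA-style config file, or "unset" if it is missing or commented out.
# active_setting is a hypothetical helper, not part of MaSuRCA.
active_setting() {
    param=$1
    config=$2
    # Match only lines that begin (after optional blanks) with the parameter
    # name followed by "="; lines starting with "#" can never match.
    line=$(grep -E "^[[:space:]]*${param}[[:space:]]*=" "$config" | tail -n 1)
    if [ -n "$line" ]; then
        # Remove blanks so "USE_GRID = 0" and "USE_GRID=0" print the same way.
        printf '%s\n' "$line" | sed 's/[[:space:]]//g'
    else
        echo "unset"
    fi
}
```

Calling active_setting USE_GRID config.txt on the file as posted would print "unset", because the only USE_GRID line starts with #.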


score 0 · Answer 2 · 2019-03-04


5.1 years ago

mariachiaracascarano ▴ 10

Hi Philipp, I am actually running it on a cluster node, but without the grid option. I used the grid option at the beginning and it gave me the same error. I thought the grid was the cause, so I disabled it, but I get the same result.

The last attempt is the one you see. Unfortunately, that is the configuration file I used.
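Since the error persists with and without the grid option, it may help to confirm whether sbatch is visible at all on the node where the assembly runs. The sketch below only checks the PATH of the current shell; have_command is a hypothetical helper name. The value 127 in the log is the conventional shell exit status for a command that could not be found, which is consistent with the "sbatch: not found" line.

```shell
#!/bin/sh
# Report whether a command can be found on the current PATH.
# have_command is a hypothetical helper used only for this check.
have_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

# Run this in the same environment (node, shell, modules) as the assembly.
have_command sbatch
```

If sbatch is reported as not found, any step that tries to submit jobs through SLURM from that environment will fail the same way, regardless of the rest of the configuration.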
