masurca runCA failed
5.1 years ago

Hi everybody, I'm still a newbie in bioinformatics and stuck on MaSuRCA 3.3.0, so any help will be more than welcome. I am trying to assemble a bacterial genome from MiSeq paired-end reads in MaSuRCA 3.3.0, without the grid options. My configuration file looks like this:

PE= aa 519 844 /home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R1_001.fastq /home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R2_001.fastq


#Illumina mate pair reads supplied as <two-character prefix> <fragment mean> <fragment stdev> <forward_reads> <reverse_reads>
#JUMP= sh 3600 200 
#pacbio OR nanopore reads must be in a single fasta or fastq file with absolute path, can be gzipped
#if you have both types of reads supply them both as NANOPORE type
#PACBIO=/FULL_PATH/pacbio.fa
#NANOPORE=/FULL_PATH/nanopore.fa
#Other reads (Sanger, 454, etc) one frg file, concatenate your frg files into one if you have many
#OTHER=/FULL_PATH/file.frg
END

PARAMETERS
#set this to 1 if your Illumina jumping library reads are shorter than 100bp
#EXTEND_JUMP_READS=0
#this is k-mer size for deBruijn graph values between 25 and 127 are supported, auto will compute the optimal size based on the read data and GC content
GRAPH_KMER_SIZE = auto
#set this to 1 for all Illumina-only assemblies
#set this to 0 if you have more than 15x coverage by long reads (Pacbio or Nanopore) or any other long reads/mate pairs (Illumina MP, Sanger, 454, etc)
USE_LINKING_MATES = 1
#specifies whether to run mega-reads correction on the grid
#USE_GRID=0
#specifies grid engine to use SGE or SLURM
#GRID_ENGINE=SLURM
#specifies queue (for SGE) or partition (for SLURM) to use when running on the grid MANDATORY
#GRID_QUEUE=all.q
#batch size in the amount of long read sequence for each batch on the grid
#GRID_BATCH_SIZE=300000000
#use at most this much coverage by the longest Pacbio or Nanopore reads, discard the rest of the reads
#LHE_COVERAGE=25
#set to 1 to only do one pass of mega-reads, for faster but worse quality assembly
MEGA_READS_ONE_PASS=0
#this parameter is useful if you have too many Illumina jumping library mates. Typically set it to 60 for bacteria and 300 for the other organisms
#LIMIT_JUMP_COVERAGE = 60
#these are the additional parameters to Celera Assembler.  do not worry about performance, number or processors or batch sizes -- these are computed automatically.
#set cgwErrorRate=0.25 for bacteria and 0.1<=cgwErrorRate<=0.15 for other organisms.
CA_PARAMETERS =  cgwErrorRate=0.25
#minimum count k-mers used in error correction 1 means all k-mers are used.  one can increase to 2 if Illumina coverage >100
KMER_COUNT_THRESHOLD = 1
#whether to attempt to close gaps in scaffolds with Illumina data
CLOSE_GAPS=1
#auto-detected number of cpus to use
NUM_THREADS = 20
#this is mandatory jellyfish hash size -- a safe value is estimated_genome_size*estimated_coverage
JF_SIZE = 460000000
#set this to 1 to use SOAPdenovo contigging/scaffolding module.  Assembly will be worse but will run faster. Useful for very large (>5Gbp) genomes from Illumina-only data
SOAP_ASSEMBLY=0
END

I get this error:

[Mon Mar  4 12:18:14 EET 2019] Overlap/unitig failed, check output under CA/ and runCA1.out

Viewing runCA1.out with less, I get:

----------------------------------------
END Mon Mar  4 12:18:14 2019 (0 seconds)
Created 13 overlap jobs.  Last batch '001', last job '000013'.
----------------------------------------
START Mon Mar  4 12:18:14 2019
sbatch   -D `pwd` -J "ovl_genome[1-13]" -a 1-13 \
  -o /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/%A_%a.out \
  /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/overlap.sh
sh: 1: sbatch: not found
----------------------------------------
END Mon Mar  4 12:18:14 2019 (0 seconds)
ERROR: Failed with signal 127

================================================================================
runCA failed.
----------------------------------------
Stack trace:
at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 1613.
    main::caFailure("Failed to submit batch jobs.") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 87
    main::submitBatchJobs("   -D `pwd` -J \"ovl_genome[1-13]\" -a 1-13 \\\x{a}  -o /home1/casca"..., "ovl_genome[1-13]") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 3809
    main::createOverlapJobs("normal") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 6523
----------------------------------------
Failure message:
    Failed to submit batch jobs.
5.1 years ago

Are you sure you posted the correct config file? The log shows runCA trying to submit a SLURM job (sbatch: not found). Perhaps the default for USE_GRID is 1, and because you commented that line out, the default is what gets used. The line should read USE_GRID=0, with no leading #.
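If it helps, here is a rough sketch of how one might check both things from the shell. The function names and the file name config.txt are my own placeholders, not anything MaSuRCA provides, and I am assuming that the last uncommented USE_GRID line is the one that counts. Exit status 127 from sh means "command not found", which is why the second helper simply tests whether sbatch is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented USE_GRID line
# in a MaSuRCA config file, or "unset" if every USE_GRID line is commented
# out or missing.
use_grid_setting() {
    # $1: path to the config file (e.g. config.txt, a placeholder name)
    value=$(sed -n 's/^[[:space:]]*USE_GRID[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" | tail -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'unset\n'
    fi
}

# Hypothetical helper: succeeds only if sbatch can be found on the PATH.
# "sh: 1: sbatch: not found" plus status 127 means this check would fail.
have_sbatch() {
    command -v sbatch >/dev/null 2>&1
}
```

For example, `use_grid_setting config.txt` would print "unset" for a file that only contains `#USE_GRID=0`, and `have_sbatch || echo "sbatch is not on PATH"` shows whether the node you are on can submit SLURM jobs at all.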

5.1 years ago

Hi Philip, I am actually running it on a cluster node, but without the grid option. I used the grid option at the beginning and it gave me the same error. I thought the grid was the problem, so I disabled it, but I get the same result.

The last attempt is the one you see. Unfortunately, that is the configuration file I used.
