ALLPATHS: "ConvertToFastbQualb.pl failed for group 'paired_ends'" when I run PrepareAllPathsInputs.pl
8.8 years ago
mafireyi ▴ 80

I am trying to use ALLPATHS for de novo assembly.

My data summary looks like the following.

Run                      Data size   Library type              Insert size
Hiseq_Run12_17122014     25 GB       Mate-pair                 size selected to 3 kb
Hiscan_Run20_12022012    15 GB       Paired-end (Nextera V1)   180 bp
Hiscan_Run19_17102012    17 GB       Paired-end (Nextera V2)   500 bp
Hiscan_Run15_12042012    6.5 GB      Paired-end (Nextera V1)   180 bp
Hiscan_Run14_22032012    3.94 GB     Paired-end (Nextera V1)   380 bp
Hiscan_Run12_01032012    3.75 GB     Paired-end (Nextera V1)   380 bp
Hiscan_Run5_08092011     3.58 GB     Single-end (Nextera V1)   380 bp
Hiscan_Run4re_26072011   1.3 GB      Single-end (Nextera V1)   180 bp
Hiseq_Run14_150313       XXGb        Paired-end                250 bp

I used the Hiseq_Run14_150313 paired-end reads as the fragment library and the Hiseq_Run12_17122014 mate pairs as the jumping library in my CSV files. I keep getting the error below when I run PrepareAllPathsInputs.pl.

Here's my PBS script:

#!/bin/bash
#PBS -N PrepareAllpaths
#PBS -q batch
#PBS -l nodes=1:ppn=16

cd $PBS_O_WORKDIR
mkdir -p NewGuava/data

#export PATH=/scratch/sysusers/godwin/allpaths-bin/bin:$PATH

/scratch/sysusers/godwin/allpaths-bin/bin/PrepareAllPathsInputs.pl DATA_DIR=$PBS_O_WORKDIR/NewGuava/data  PLOIDY=2 IN_GROUPS_CSV=in_groups.csv IN_LIBS_CSV=in_libs.csv OVERWRITE=True

exit 0

The error I see:

Call to new failed, memory usage before call = 17169108k.

AND

**** 2015-06-29 13:10:03 (CG): ConvertToFastbQualb.pl failed for group 'paired_ends'.
---- 2015-06-29 13:10:04 (CG): Importing group 'mate_ends'.

Please assist. What might be the problem?

next-gen Assembly • 3.0k views

It would help if you gave us the exact command you used.


Try adding a memory usage PBS directive explicitly to the PBS header.
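For instance, a header with an explicit memory request might look roughly like the sketch below. The `mem=64gb` value is only a placeholder, not a recommendation, and which resource names are actually enforced (`mem`, `vmem`, `pmem`) depends on how your cluster is configured, so check with your admins. For scale, the "memory usage before call = 17169108k" in your error is roughly 16 GiB.

```shell
#!/bin/bash
#PBS -N PrepareAllpaths
#PBS -q batch
#PBS -l nodes=1:ppn=16
# Assumed value: request memory explicitly; adjust to what your nodes offer.
#PBS -l mem=64gb

# Fall back to the current directory when run outside of PBS.
cd "${PBS_O_WORKDIR:-.}"

# Print the limits the job actually received, to confirm the request took effect.
ulimit -a
```

Checking the `ulimit -a` output in the job log is a cheap way to see whether the scheduler applied a memory cap you did not expect.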


Thanks, I have tried that. I will see the results tomorrow.


Wow, preparing datasets for ALLPATHS shouldn't take that long (unless you have tons of data). Also, FYI, ALLPATHS runs much faster on Intel than on AMD processors (we are talking 20 hrs vs. 120 hrs here).


I have about 40 GB of fragment library data and 25 GB of mate-pair library data. Is that considered tons of data? I have 100x coverage.


I had a total of 86 GB of compressed data (36 GB PE + 50 GB MP), at a little over 35x coverage. Preparing the dataset took 127 min of wall time (32 CPUs, 512 GB memory requested), whereas the actual assembly needed 565 GB of RAM and 32 processors and ran for 6166.73 min (both steps on an AMD machine). It was a different story on an Intel machine!


Interesting fact to know. Thanks.


Try adding ulimit -s unlimited to your PBS script. I know the ALLPATHS team recommends it, but I don't know what it does :)
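For reference, `ulimit -s` is the shell builtin that reads or sets the stack-size limit (in KiB) for the shell and everything it launches, and `unlimited` asks for that cap to be removed. A minimal sketch of how it could go into the job script follows. The function name `raise_stack_limit` is made up for illustration, and the fallback to the hard limit is my own addition for clusters where `unlimited` is not permitted.

```shell
#!/bin/bash
# Raise the soft stack-size limit as far as the hard limit allows.
raise_stack_limit() {
    # First try to remove the soft limit entirely.
    if ! ulimit -S -s unlimited 2>/dev/null; then
        # Not permitted: fall back to the hard limit, the highest value allowed.
        ulimit -S -s "$(ulimit -H -s)"
    fi
}

echo "stack limit before: $(ulimit -S -s)"
raise_stack_limit
echo "stack limit after:  $(ulimit -S -s)"
```

Put it before the PrepareAllPathsInputs.pl call so the new limit is inherited by that process.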


Also, make sure the FASTQ files have an fq or fastq extension (gzipped or uncompressed). There should be no spaces after the last comma in either CSV file, and a single space for each empty field, e.g.: 2000bp, trialrun, genspp, jump, 1, , , 2000, 500, outward, ,
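Both checks are easy to script before submitting the job. The sketch below is only an illustration: the function names are made up, and it assumes that in_groups.csv starts with a header row and keeps the file name in its third column.

```shell
#!/bin/bash
# Fail (and print the offending lines) if any line has whitespace
# after its final comma.
check_trailing_space() {
    # grep exits 0 when it finds such a line, so invert the result.
    if grep -nE ',[[:space:]]+$' "$1"; then
        return 1
    fi
    return 0
}

# Fail if a file name in column 3 lacks an fq/fastq extension,
# optionally followed by .gz. The first line is assumed to be a header.
check_fastq_ext() {
    awk -F',' 'NR > 1 {
        name = $3
        gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim surrounding blanks
        if (name !~ /\.(fq|fastq)(\.gz)?$/) {
            print "line " NR ": " name
            bad = 1
        }
    } END { exit bad }' "$1"
}
```

Usage would be something like `check_trailing_space in_libs.csv && check_trailing_space in_groups.csv && check_fastq_ext in_groups.csv`; no output and a zero exit status mean nothing suspicious was found.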


Oh, I saw your response late. My FASTQ files have fastq.gz extensions. Will it fail again?


Oh, I just realised you said gzipped or uncompressed. I thought that said gunzipped.


Sorry for the confusion. I meant compressed or uncompressed (fastq.gz or fastq)! I normally write the in_groups.csv entry like this:

103, 2000bp, /home/path/to/fastqfiles/2000bp/some_sample_number_R?.fastq.gz
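The `?` is a single-character wildcard, so `R?` is meant to cover both the R1 and R2 files. A quick sanity check is to count how many files the pattern matches. This is a rough sketch only: `count_matches` is a made-up helper, it assumes paired reads sit in exactly two files, and it will miscount paths that contain spaces.

```shell
#!/bin/bash
# Print how many existing files a wildcard pattern expands to.
count_matches() {
    # Intentionally unquoted so the shell expands the wildcard.
    set -- $1
    # With no match the pattern is left as-is and names no real file.
    [ -e "$1" ] || { echo 0; return; }
    echo "$#"
}
```

For a paired library, `count_matches '/home/path/to/fastqfiles/2000bp/some_sample_number_R?.fastq.gz'` should print 2; anything else suggests a wrong path or stray files matching the pattern.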

