ABySS path setup for temporary files
5.5 years ago
Mbillah ▴ 140

How can I increase my temporary storage? Where do I put this command? I am using a Linux server.

export TMPDIR=/var/tmp

TIA

assembly abyss • 3.1k views

What kmer settings are you using? ABySS's memory usage scales with kmer size. If you're running out of memory (RAM) rather than disk space (tmp), perhaps try a different kmer setting.

https://github.com/bcgsc/abyss/wiki/ABySS-Users-FAQ


Are you running this on a cluster with a scheduling system, or ‘locally’ on a bare-metal server?

If it's the former, you may simply need to request more resources in your job file. You might be seeing no output because the job is queued, suspended, or killed by the scheduler.


Please elaborate on your question. What are you trying to achieve?

You want to increase temporary storage available? Are you sure this is the real issue and not insufficient RAM or similar?


Indeed, ABySS should not use that much tmp storage.

You will have to set this before running ABySS, e.g. somewhere at the top of the script you use to run ABySS.

You could also try setting TMPDIR to some other location, e.g. a scratch share or similar.
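
A minimal sketch of where the export could go, assuming a hypothetical `SCRATCH_DIR` variable (defaulting here to `/var/tmp`) that names a disk with enough free space. It only sets `TMPDIR` and reports the free space there; the `abyss-pe` line is a commented placeholder.

```shell
#!/bin/sh
# Assumption: SCRATCH_DIR is a hypothetical variable naming a disk with plenty of free space.
scratch_dir="${SCRATCH_DIR:-/var/tmp}/abyss_tmp"

# Create the directory if needed and point TMPDIR at it for everything launched below.
mkdir -p "$scratch_dir"
export TMPDIR="$scratch_dir"

# Show how much space is available where temporary files will be written.
df -h "$TMPDIR"

# The assembly command goes after the export, for example:
# abyss-pe name=test k=52 in='read_1.fq read_2.fq'
```

Because the variable is exported, any program started later in the same script inherits it.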


How much temporary memory is needed for 140 M reads?


Do you know how much time abyss-pe needs for 140 M sequences?


That would be hard to estimate, since it depends on your data and the hardware you have access to.

If you see that the program is running (consuming CPU %), then the best thing is to wait and watch.


I have been waiting for 24 hours, but it has not produced a single output file.


And after 24 h you still have no output? Nothing? That should not be the case; you should at least get a coverage file once the input data has been read in.

What resources are you using, mainly how many cores? I was going to say it should run in a matter of hours for 140M reads, provided that you run it multi-core.


It is running on 12 cores and there is no output yet. Should I wait longer?


Hard to say, but my gut feeling (and user experience) says there should have been output already.

5.5 years ago

You can monitor the process with the following tools:

  • gotop
  • htop
  • glances

These should be easily installable under Ubuntu.

But please state your server specs!

  • RAM
  • hard disk space

140M PE Illumina sequences will require a significant amount of memory and HDD space. I would guess 64-128 GB, but I don't know for sure; I run all assemblies on servers with 256-512 GB.

Maybe you can put your TMP on external NFS storage, or even a large external HDD, to see if that helps.
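
The server specs asked for above could be collected with something like the following sketch. It assumes a Linux machine (it reads `/proc/meminfo`) and falls back to `/tmp` when `TMPDIR` is not set.

```shell
#!/bin/sh
# Total RAM in GiB, read from the Linux-specific /proc/meminfo (the value there is in kB).
mem_total_gib=$(awk '/^MemTotal:/ { printf "%.1f", $2 / 1024 / 1024 }' /proc/meminfo)
echo "RAM: ${mem_total_gib} GiB"

# Number of CPU cores available to this shell.
cores=$(nproc)
echo "Cores: ${cores}"

# Free space on the file system that holds the temporary directory.
df -h "${TMPDIR:-/tmp}"
```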


RAM: 64 GB, HDD: 1 TB. I have changed the temporary file location to the hard disk. In top I can see that 12 abyss processes are running. 40 hours have already passed and no output folder has been created yet. How much time did yours take? This is my script:

export TMPDIR=/data/masum

abyss-pe name=test k=52 np=12 in='read_1.fq read_2.fq'


With that cmdline ABySS will not create an output directory and will put all its output files in the directory where you launched it. You will need to add -C <dir> to use a specific output dir.

For future runs (or when restarting this one?) I would advise adding v=-v to the cmdline. This activates verbose mode, so you can more easily follow what is going on (or whether something is still going on).

64 GB is also not that much RAM, but it should be sufficient for the 140M reads you have (that number is correct, right? It seems rather low).
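
A sketch of the command line with both suggestions applied. `OUT_DIR` and `RUN_ABYSS` are hypothetical variables introduced only for this example; by default the script just prints the command, and it runs it only when `RUN_ABYSS=1` is set.

```shell
#!/bin/sh
# Assumptions: OUT_DIR and RUN_ABYSS are hypothetical variables used only in this sketch.
out_dir="${OUT_DIR:-./abyss_k52}"

# abyss-pe is a Makefile-based driver, so -C changes into an existing directory;
# create it first.
mkdir -p "$out_dir"

# With -C, relative paths given to in= are resolved from inside out_dir,
# so absolute paths to the reads are safer in a real run.
set -- abyss-pe -C "$out_dir" name=test k=52 np=12 v=-v in='read_1.fq read_2.fq'

# Print the command; run it only when RUN_ABYSS=1 is set.
echo "$*"
if [ "${RUN_ABYSS:-0}" = "1" ]; then
    "$@"
fi
```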


Total number of paired reads: 140958110


And still no output?

If so, kill the job and resubmit it with the option I mentioned above activated. But before you kill it, check the memory usage of that job, e.g. with top, ps ux, htop, or similar.
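
One way to do that memory check from the command line, as a rough sketch: sum the resident memory of the current user's processes whose command name contains "abyss". The helper name `sum_abyss_rss_mib` is made up for this example, and matching on the command name is an assumption about how the ABySS processes show up in `ps`.

```shell
#!/bin/sh
# Sum the resident memory (RSS, reported by ps in KiB) of every process whose
# command name contains "abyss" (case-insensitive) and print the total in MiB.
sum_abyss_rss_mib() {
    awk 'NR > 1 && tolower($3) ~ /abyss/ { kib += $2 }
         END { printf "%.1f\n", kib / 1024 }'
}

# Columns: PID, RSS, command name, for the current user's processes.
ps -u "$(id -un)" -o pid,rss,comm | sum_abyss_rss_mib
```

A total close to the machine's physical RAM would suggest that memory, not temporary storage, is the bottleneck.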


After starting again, this is the last 10 hours of work. It looks like everything is going right?

https://postimg.cc/nsZWy9hr


It is reading in the fastq files, yes, that looks OK.

Let's see what the next steps say.


OK, I will keep you informed.


I get an error when using 12 cores, and after reducing the number of cores to 8 I still get the same error. When I run it without -np and with -j (threads) instead, I notice the speed is better. A simple hello-world program also runs successfully under mpirun. As I have a single machine, how can I use all of my cores? According to the manual, "Without any MPI configuration, this will allow you to use multiple cores on a single machine." So what type of command do I need?

mpirun noticed that process rank 3 with PID 0 on node cvsau-vm exited on signal 9 (Killed).


Looks like one of your jobs is being killed by the server.

Since you're working on a single machine, I think you might indeed be better off using -j instead of np (np is only for MPI processes, i.e. if you run this in a cluster setting and send parts to different machines). I would also not set -j to the maximum of your server; I always leave one (or two) cores free, just in case.
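
The "leave one or two cores free" rule can be sketched as below. The value `reserve=2` is an assumption for illustration, and the script only prints the resulting command line rather than running it.

```shell
#!/bin/sh
# Assumption: two cores are kept free here; the advice above suggests one or two.
reserve=2
total_cores=$(nproc)

# Never go below one thread, even on a very small machine.
if [ "$total_cores" -gt "$reserve" ]; then
    threads=$(( total_cores - reserve ))
else
    threads=1
fi

# Print the resulting command line rather than running it.
echo "abyss-pe name=test k=52 j=$threads v=-v in='read_1.fq read_2.fq'"
```

On a 16-core machine this gives `j=14`.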


I want to maximize my CPU usage, which is currently at 16%. I have 16 cores and 16 threads, and at this stage I am using 14 threads. Can I speed up the process?

This is the update :

https://postimg.cc/R3VxvF87


How many input files do you have? Two, no? If so, that's fine: ABySS will assign one process/thread per input file for reading in the data (not sure if it can do multi-threaded reading of input files). In the next steps you should see the CPU usage rise.


Hello lieven.sterck, I have 2 fastq files. In the first 10 hours it was able to read 6-7 M reads. In the last 28 hours, it read 90 M sequences. It is now going more slowly.

If I want to find the optimum kmer, I need to run this process 20+ times. This is really time-consuming for me. Can you give me an idea?
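
For reference, running one assembly per kmer value is usually scripted as a loop with one output directory per k. The sketch below is an illustration only: `BASE_DIR`, `READS_1`, `READS_2` and `RUN_ABYSS` are hypothetical variables, the read paths are placeholders, and the k range (44 to 76 in steps of 8) is an arbitrary coarse grid rather than a recommendation. It prints the commands and runs them only when `RUN_ABYSS=1` is set.

```shell
#!/bin/sh
# Assumptions: BASE_DIR, READS_1, READS_2 and RUN_ABYSS are hypothetical variables,
# and the k range below is only an illustration.
base_dir="${BASE_DIR:-./abyss_sweep}"
reads_1="${READS_1:-/path/to/read_1.fq}"
reads_2="${READS_2:-/path/to/read_2.fq}"

for k in $(seq 44 8 76); do
    # One output directory per k keeps the runs from overwriting each other.
    out_dir="$base_dir/k$k"
    mkdir -p "$out_dir"

    set -- abyss-pe -C "$out_dir" name=goat k="$k" j=14 v=-v in="$reads_1 $reads_2"

    # Print the command; run it only when RUN_ABYSS=1 is set.
    echo "$*"
    if [ "${RUN_ABYSS:-0}" = "1" ]; then
        "$@"
    fi
done
```

A coarse grid like this means 5 runs instead of 20+. Running `abyss-fac` on the resulting `*-contigs.fa` files afterwards gives contiguity statistics for comparing the values of k.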

https://postimg.cc/rKtjV07x

https://postimg.cc/6TFFwGMx

Thank you


That is indeed pretty slow (and much slower than expected). Though it is at 90M reads in your screenshot, no? But even then, that's still slow.

Can you give the exact cmd you're running?


Yes 90 M. This is my command:

#!/bin/bash

export TMPDIR=/data/masum

abyss-pe name=goat k=52 j=14 v=-v -C /data/masum/abyss/k52 in='/data/masum/bowtie2/b143_1.fq /data/masum/bowtie2/b143_2.fq'
