Can snakemake work with multiple nodes from HPC?
5.5 years ago
wangdp123 ▴ 340

Hi there,

I am thinking about how to use snakemake under the HPC system. Can snakemake work with multiple nodes from HPC?

For example, in the batch script I request 8 nodes, each with 20 cores, so in total I request 8*20 = 160 cores.

And then I set the --cores=160.

snakemake --cores 160

Does this work reasonably for snakemake? Are all 160 cores used properly?

PS: I have tried this setting to run many programs and it seems to work well.

Thanks a lot,

Tom

snakemake HPC

Setting --cores/--jobs/-j to 160 is fine. However, in order to submit jobs to an HPC cluster, you need, at the very minimum, to provide a submission command using --cluster/-c. Read https://snakemake.readthedocs.io/en/stable/executable.html#cluster-execution for more information.
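For example (a sketch; the resource flags are assumptions for an SGE-style scheduler, adjust them to your site):

```shell
# Run snakemake on the login node; it calls qsub once per job and
# tracks the dependencies between jobs itself.
# {threads} is filled in from each rule's threads: directive.
snakemake --snakefile test.snakemake \
    --cluster "qsub -l h_rt=48:00:00 -l h_vmem=5G -pe smp {threads}" \
    --jobs 160
```

Note that with --cluster, --jobs limits the number of concurrently submitted cluster jobs rather than local cores.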


What if I only put --cores=160 in the snakemake command, without using the --cluster argument, and then qsub the batch script?


Snakemake will run jobs using up to 160 cores on the single machine where snakemake was invoked. Why do you want to qsub the .sh yourself? How would you deal with dependencies among jobs?
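For background (not from the thread): snakemake infers dependencies by matching one rule's inputs to another rule's outputs, and uses that to submit jobs in the right order. A toy Snakefile, with made-up file names:

```python
# Toy Snakefile: rule "plot" depends on rule "count" via counts.txt,
# so snakemake will never run "plot" before "count" has finished.
rule all:
    input: "plot.png"

rule count:
    output: "counts.txt"
    shell: "wc -l data.txt > {output}"

rule plot:
    input: "counts.txt"
    output: "plot.png"
    # 'plot_counts' is a hypothetical plotting command, for illustration only.
    shell: "plot_counts {input} {output}"
```

When you qsub the whole test.sh yourself, this entire workflow runs inside one job on one node; with --cluster, snakemake submits "count" and "plot" as separate cluster jobs and handles the ordering itself.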


To clarify the process, the batch script test.sh looks like this:

#Request running time
#$ -l h_rt=48:00:00

#Request the number of nodes (each node has 16 cores, so 160 cores in total across 10 nodes)
#$ -l nodes=10

#Request the memory
#$ -l h_vmem=5G

snakemake -p --cores 160 --snakefile test.snakemake

And then submit the job test.sh:

qsub test.sh

In the standard error file, we can see

Provided cores: 160

Rules claiming more threads will be scaled down.

It appears that snakemake can run through to the end and generate all the expected output files without errors.

I am not certain whether snakemake only uses the 16 cores of a single node this way, or makes use of all 160 cores across the 10 nodes. How can I check this?
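One way to check (a sketch; the rule, file names, and the cp stand-in for a real tool are made up) is to have each rule record the host it runs on:

```python
# If every .host file contains the same machine name, only one node
# was actually used, regardless of how many nodes qsub reserved.
rule heavy_step:
    input: "in/{sample}.txt"
    output: "out/{sample}.txt"
    threads: 16
    shell: "hostname > {output}.host && cp {input} {output}"
```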

Thanks,


No. Snakemake won't utilize all the cores on the 10 nodes you requested.

I'm still trying to wrap my head around what you're trying to accomplish. Are you doing installation or environment setup via test.sh? What exactly are you trying to do here?


In this scenario, will snakemake make use of only 16 cores (from a single node), while the other 9 nodes remain idle?

I have taken a look at the manual about --cluster.

I am confused about this usage:

snakemake --cluster qsub -j 32

Does this mean we need to run this command within the batch shell script test.sh, like this:

#Request running time
#$ -l h_rt=48:00:00

#Request the number of nodes (each node has 16 cores, so 160 cores in total across 10 nodes)
#$ -l nodes=10

#Request the memory
#$ -l h_vmem=5G

snakemake -p --cluster qsub --cores 160 --snakefile test.snakemake

And then submit the job test.sh:

qsub test.sh

Is this way correct? What are the different functions of the two "qsub" commands?

