Snakemake parallelization
12 weeks ago
Fadwa ▴ 10

Hi,

I am working with Snakemake to process a CSV file containing SRR IDs for downloading. In the initial rule, I use the SRA ID as a wildcard to fetch SRR files from NCBI. However, when I attempt to parallelize the job using the -j 2 option, the downloading step does not parallelize as expected. Can you please assist me with this issue?


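import os
import re

home = os.path.expanduser("~")
fichier_csv = os.path.join(home, 'sra_list.csv')

# Collect run accessions (SRR/ERR/DRR) from the first whitespace-separated field of each line
SRA_LIST = []
with open(fichier_csv, 'rt') as f:
    for line in f:
        fields = line.split()
        # Skip blank lines, which would otherwise raise an IndexError
        if fields and re.match(r'[SED]RR\d+$', fields[0]):
            SRA_LIST.append(fields[0])

rule fetch_fastq: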
    output:
        config["RESULTS"] + "Fastq_Files/{sra}.fastq.gz"
    log:
        config["RESULTS"] + "Supplementary_Data/Logs/{sra}.sratoolkit.log"
    benchmark:
        config["RESULTS"] + "Supplementary_Data/Benchmark/{sra}.sratoolkit.txt"
    message:
        "fetch fastq from NCBI"
    params:
        conda = "sratoolkit",
        outdir = config["RESULTS"] + "Fastq_Files"
    threads: 8
    shell:
        """
        set +eu &&
        . $(conda info --base)/etc/profile.d/conda.sh &&
        conda activate {params.conda}
        fastq-dump \
                --split-spot \
                --skip-technical {wildcards.sra} \
                --stdout 2>{log} \
        | gzip -c > {output}
        """

Can you please help me parallelize this?

snakemake order • 314 views

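Do you have enough resources on the machine? You're requesting 8 threads for a single-threaded process. With -j 2, Snakemake scales each job down to 2 threads, which uses up both cores, so only one download runs at a time.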


Yes, I have enough resources. It's just a test.


Try using -j 16
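
The reasoning behind this suggestion can be sketched in a few lines of Python. This is only a simplified model, assuming local execution where -j sets the number of available cores and no other resource limits apply; the function name schedule is hypothetical and is not part of Snakemake.

```python
def schedule(cores, rule_threads):
    """Return (threads each job actually gets, jobs that can run at once).

    Simplified model of local Snakemake scheduling (an assumption, not
    Snakemake's actual code).
    """
    # A rule's threads are capped at the number of cores given via -j.
    effective_threads = min(rule_threads, cores)
    # Jobs start only while their combined threads fit within the core budget.
    concurrent_jobs = cores // effective_threads
    return effective_threads, concurrent_jobs


for cores, threads in [(2, 8), (16, 8), (2, 1)]:
    eff, jobs = schedule(cores, threads)
    print(f"-j {cores}, threads: {threads} -> {eff} threads per job, {jobs} job(s) at a time")
```

Under this model, -j 2 with threads: 8 leaves room for one job at a time, while either -j 16 or threads: 1 with -j 2 allows two downloads to run concurrently. Since fastq-dump is single-threaded, lowering threads to 1 is probably the more economical choice.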
