Parallelizing MAFFT alignment of multiple FASTA files
7.6 years ago
ropolocan ▴ 810

Hello,

I want to align multiple FASTA files in a directory with MAFFT.

Is it correct to do this with GNU parallel (where n in --thread n is the maximum number of cores in the machine):

ls *.fasta | parallel 'mafft --adjustdirection --thread n {} > {.}_mafft.fasta'

If I am interpreting the line above correctly, each alignment will be performed in parallel using the maximum number of cores possible. Am I correct in setting the --thread option to the maximum number of cores for each alignment, or is parallel already taking care of that?

Or is it preferable to perform something like the for loop below and align the FASTA files sequentially:

for i in *.fasta; do
    mafft --adjustdirection --thread n "${i}" > "${i%.*}_mafft.fasta"
done

Thanks.

alignment mafft parallel • 4.0k views
7.6 years ago
GenoMax 141k

It would be more efficient to run the alignments sequentially and give each one all the cores you have available (since MAFFT can use multiple threads). If you started multiple parallel jobs, each with multiple threads, they would compete for the same cores and slow each other down.
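
As a rough illustration of the sequential approach, here is a minimal sketch. It assumes `mafft` is on the PATH and that `nproc` (GNU coreutils) is available to count cores. The function name `align_all` and the `MAFFT` override variable are made up for this example, and outputs from an earlier run are skipped so they are not aligned again.

```shell
#!/usr/bin/env bash
# Sketch only: align each FASTA file in the current directory, one at a time,
# giving every MAFFT run all available cores.

# Core count; falls back to 1 if nproc is missing (assumption: GNU coreutils).
threads=$(nproc 2>/dev/null || echo 1)

# align_all is a hypothetical helper name, not part of MAFFT or GNU parallel.
align_all() {
    local f
    for f in *.fasta; do
        [ -e "$f" ] || continue                    # no matching files
        case "$f" in *_mafft.fasta) continue ;; esac   # skip earlier outputs
        # MAFFT is an illustrative override; defaults to the mafft on PATH.
        "${MAFFT:-mafft}" --adjustdirection --thread "$threads" "$f" \
            > "${f%.*}_mafft.fasta"
    done
}
```

Calling `align_all` from the directory that holds the files runs the alignments. If you would still rather use GNU parallel, note that it starts one job per core by default, so combining it with `--thread n` oversubscribes the machine. Keeping jobs times threads at or below the core count (for example `parallel -j 4` with `--thread 2` on 8 cores) avoids that.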


Thanks @genomax2 for the clear and concise answer. Your explanation makes sense and shows why it is more efficient to run the for loop.
