Question: Parallel for shell script with different output

Dear all,

I need help with the parallel command. I have a shell script that I would like to run 12 times at the same time, but each run needs a different output name. The output is a .tsv file whose name should match the name of the input. Could you help me with how to do that?

Thanks a lot

ADD COMMENT • 5.3 years ago by Korsocius • 110 • updated 5.3 years ago by geek_y • 9.7k

Please post your standard shell command line.

The parallel command would be something like:

    parallel scriptname {1}.input {1}.output ::: prefixes
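
For example, with two hypothetical prefixes, sample1 and sample2, this expands to two jobs:

    parallel scriptname {1}.input {1}.output ::: sample1 sample2
    # runs: scriptname sample1.input sample1.output
    #       scriptname sample2.input sample2.output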

ADD REPLY • 5.3 years ago by russhh ♦ 4.4k

Show an example of what your input and output are.

ADD REPLY • 5.3 years ago by geek_y • 9.7k

I have a script named bin.sh whose input is 1.bam and whose output will be 1.tsv. Every input is in its own folder, and the folders are named 1 to 12. Folder 1, for example, contains 1.bam (input) and 1.bai (input), which the shell script reads; its output will be 1.tsv, and so on for each BAM file.
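
Schematically, the layout described here:

    1/1.bam    1/1.bai    ->  1/1.tsv
    2/2.bam    2/2.bai    ->  2/2.tsv
    ...
    12/12.bam  12/12.bai  ->  12/12.tsv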

ADD REPLY • 5.3 years ago by Korsocius • 110

    parallel myscript {} {.}.bai '>' {.}.tsv ::: */*.bam
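
To preview the generated commands before running anything, GNU Parallel's --dry-run option prints them instead of executing them:

    parallel --dry-run myscript {} {.}.bai '>' {.}.tsv ::: */*.bam
    # for 3/3.bam this prints: myscript 3/3.bam 3/3.bai > 3/3.tsv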

ADD REPLY • 5.3 years ago by ole.tange ♦ 3.4k

An example with echo:

    seq 1 12 | parallel echo "Hello" '>' 'result.{}'
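
This creates result.1 through result.12, each containing "Hello". Adapted to the question (a sketch, assuming folders named 1 to 12 and that bin.sh writes its TSV to stdout):

    seq 1 12 | parallel bin.sh {}/{}.bam '>' {}/{}.tsv
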
ADD COMMENT • 5.3 years ago by Pierre Lindenbaum • 120k

Also look into --results for a structured way of organizing the output files.
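
For instance, a sketch of the same echo example with --results (the directory name "out" here is arbitrary):

    seq 1 12 | parallel --results out echo "Hello"
    # stdout and stderr of each job land under something like out/1/<arg>/stdout and out/1/<arg>/stderr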

ADD REPLY • 5.3 years ago by ole.tange ♦ 3.4k
    for input in *.bam; do out=`echo $input | awk -F"." '{ print $1}'`; bin.sh $input $out.tsv & done

To understand:

for input in *.bam;  # for each bam file
do
    out=`echo $input | awk -F"." '{ print $1}'`  # get the unique output prefix
    bin.sh $input $out.tsv &  # run the script and push it to the background
done
ADD COMMENT • 5.3 years ago by geek_y • 9.7k • updated 5.3 years ago by RamRS • 21k

This won't work well if there are not enough cores: every job is pushed to the background at once, with nothing limiting how many run concurrently.
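
For completeness, a minimal sketch of throttling such a loop to a fixed number of concurrent jobs (assumes bash >= 4.3 for wait -n; MAXJOBS is a name chosen here for illustration):

    MAXJOBS=4                        # pick a limit that matches your cores
    for input in *.bam; do
        while (( $(jobs -rp | wc -l) >= MAXJOBS )); do
            wait -n                  # block until any one background job finishes
        done
        bin.sh "$input" "${input%.bam}.tsv" &
    done
    wait                             # let the remaining jobs finish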

ADD REPLY • 5.3 years ago by Pierre Lindenbaum • 120k

Yes. But the for loop helps me a lot in other cases, where the operation is not computationally expensive.

ADD REPLY • 5.3 years ago by geek_y • 9.7k

... and if something goes wrong, you'll have to (quickly) find & kill your PIDs ...
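
(If the jobs were started from the current shell, something like this kills them all at once:)

    kill $(jobs -p)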

ADD REPLY • 5.3 years ago by Pierre Lindenbaum • 120k

I used to do that.

ADD REPLY • 5.3 years ago by geek_y • 9.7k

I am trying to understand why people still use for loops for independent jobs.

Is it readability? Is the for loop really easier to read than 'parallel bin.sh {} {.}.tsv ::: *.bam'? Or, if the jobs were bigger or more complex, using a function:

myfunc() {
  bin.sh "$1" "$2"
  #more stuff here
}
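# export the function so the shells GNU Parallel spawns can see it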
export -f myfunc
parallel myfunc {} {.}.tsv ::: *.bam

For computationally cheap jobs I really do not see the benefit of a for loop.

The only advantage I can think of is that GNU Parallel may not be installed. But that advantage can vanish in just 10 seconds: wget -O - pi.dk/3|bash

@Geek_y, can you enlighten me: what do you see as the advantage?

ADD REPLY • 5.3 years ago by ole.tange ♦ 3.4k

I am used to using for loops, but I will definitely shift towards parallel. I have started reading your tutorial on parallel. I come from a biology background and am something of a beginner in core bioinformatics, so I need some time to learn best practices.

ADD REPLY • 5.3 years ago by geek_y • 9.7k

I wonder why one would dispatch computationally cheap operations to a background core in the first place. The gain in execution time will surely be cancelled out by the time it takes to write the loop and dispatch the jobs to different cores.

ADD REPLY • 5.3 years ago by RamRS • 21k

Since the input files are in different folders, you might want to use find (with an optional -maxdepth) to locate the files first, then run the script on them.
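
A sketch, assuming the numbered folders sit directly under the current directory and that bin.sh writes to stdout as in ole.tange's answer:

    find . -maxdepth 2 -name '*.bam' | parallel bin.sh {} '>' {.}.tsv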

ADD REPLY • 5.3 years ago by RamRS • 21k
