Bulk quality interleaving with bbmap reformat command
5.4 years ago
Longshotx ▴ 70

I have several fastq.gz files that look like this:

Sample1_R1.fastq.gz Sample1_R2.fastq.gz

Sample2_R1.fastq.gz Sample2_R2.fastq.gz

etc...

The output needs to be: Sample1_interleaved.fastq.gz, Sample2_interleaved.fastq.gz

I am still learning how to loop commands, so bear with me. This is what I tried to run:

for i in `ls -1 *_R#.fastq.gz | sed 's/_R#.fastq.gz//‘`
do
reformat.sh in=$i\_R#.fastq.gz out=$i\_interleaved.fastq.gz 
done

However this did not work. Can someone help me? Many Thanks!

bbmap bbduk reformat bash loop • 1.8k views
5.2 years ago
ross_whetten ▴ 10

The # symbol in ls -1 *_R#.fastq.gz won't match 1 or 2 - the shell wildcard you want in that position is ?, which matches any single character in the file names. Similarly, sed doesn't recognize # as a wildcard character, but . will work as a wildcard, or you can restrict the match to a single character chosen from 1 or 2 with [12]. An alternative to sed for removing the unwanted remainder of the filename is the basename command, which can also remove leading directory names. For example:

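for i in /path/to/files/*_R1.fastq.gz; do name=$(basename "$i" _R1.fastq.gz);
reformat.sh in=/path/to/files/${name}_R#.fastq.gz out=${name}_interleaved.fastq.gz; done

Note that there is a space between "$i" and _R1.fastq.gz within the basename command. basename removes the suffix only as a literal string, not as a pattern such as _R[12].fastq.gz, so the loop runs over the R1 files; this also calls reformat.sh once per sample rather than once per file. Because basename strips the directory as well, the path is given again in in=.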
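
The same loop can also be written without basename. This is a minimal sketch, assuming every sample has files ending in _R1.fastq.gz and _R2.fastq.gz; interleave_all is a hypothetical helper name, not part of BBTools. The ${var%suffix} expansion removes the suffix but keeps the directory part, so reformat.sh receives the full path:

```shell
# Interleave every R1/R2 pair found in a directory (default: current directory).
# interleave_all is a hypothetical helper name used for illustration.
interleave_all() {
    dir=${1:-.}
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue              # the glob matched no files
        sample=${r1%_R1.fastq.gz}             # drop the suffix, keep the directory part
        if [ ! -e "${sample}_R2.fastq.gz" ]; then
            echo "skipping $r1: no matching R2 file" >&2
            continue
        fi
        # reformat.sh substitutes 1 and 2 for '#', so this names both mates
        reformat.sh in="${sample}_R#.fastq.gz" out="${sample}_interleaved.fastq.gz"
    done
}
```

Calling interleave_all /path/to/files would write each interleaved file next to its inputs; samples without an R2 file are skipped with a message on stderr.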


In addition to using these instructions to fix problems with the loop, infenit101, you should explicitly provide two inputs to the reformat.sh command to get the interleaving.

Edit: As @ross points out below, using the # shortcut should indeed work.
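
For comparison, the explicit two-input form could look like the sketch below. It assumes the in= and in2= parameters of reformat.sh and the _R1/_R2 naming from the question; interleave_pair is a hypothetical helper name:

```shell
# Interleave one sample by naming both mates explicitly.
# interleave_pair is a hypothetical helper; pass it the path to the R1 file.
interleave_pair() {
    r1=$1
    r2=${r1%_R1.fastq.gz}_R2.fastq.gz             # derive the mate's file name
    dest=${r1%_R1.fastq.gz}_interleaved.fastq.gz  # output next to the inputs
    [ -e "$r1" ] && [ -e "$r2" ] || { echo "missing mate for $r1" >&2; return 1; }
    reformat.sh in="$r1" in2="$r2" out="$dest"
}
```

For example, interleave_pair Sample1_R1.fastq.gz would read Sample1_R1.fastq.gz and Sample1_R2.fastq.gz and write Sample1_interleaved.fastq.gz.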


@genomax - Based on the Reformat User Guide, I think the <name>_R#.fastq.gz syntax would work, although the escape before the underscore (<name>\_R#) will cause problems that I didn't mention.
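
How the shell reads the two spellings can be checked directly. This is a small sketch with hypothetical function names, run in bash or a POSIX sh: an unquoted backslash before the underscore ends the variable name in the same way braces do, so both forms should print the same argument, and the braced form is the more conventional one. Without either, $i_R would be read as a variable named i_R.

```shell
# Two ways to stop the shell from reading "$i_R" as a variable named i_R.
# Both function names are hypothetical and only echo the argument they build.
escaped_form() { i=$1; echo in=$i\_R#.fastq.gz; }    # unquoted backslash ends the name
braced_form()  { i=$1; echo in=${i}_R#.fastq.gz; }   # braces end the name

escaped_form Sample1
braced_form Sample1
```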
