for loop over bwa aln
2.1 years ago
@phylofun51674
My for loop around bwa aln and bwa sampe doesn't seem to work. First, I created paths to the read data:
ls -1r reads/*/cleaned/*READ1.fastq.gz > READ1_list
ls -1r reads/*/cleaned/*READ2.fastq.gz > READ2_list
Next, I made a for loop using READ1_list and READ2_list around the bwa aln function:
for i in READ1_list
do
bwa aln reference/reference.unaligned.fasta $i > ${i%READ1.fastq.gz}read1.sai
done
for i in READ2_list
do
bwa aln reference/reference.unaligned.fasta $i > ${i%READ2.fastq.gz}read1.sai
done
Then, I use a for loop around bwa sampe:
for i in $(cat READ1_list | sed s'/\-READ1.fastq.gz//' | sort)
do
bwa sampe reference/reference.unaligned.fasta $i\-READ1.sai $i\-READ2.sai $i\-READ1.fastq.gz $i\-READ2.fastq.gz \
| samtools view -bS - > $i\-pe.bam
done
Can anyone spot the issue?
bwa
alignment
next-gen
• 330 views
•
link
•
updated
2.1 years ago
Joe
12k
This is overkill, but you might be interested in my bash
code for aligning/remapping:
(You'll mainly want to look at Line 95 onward, everything before that is just picking up commandline args etc.)
With a little more bash
magic, you can loop your files and pass the relevant paths as arguments:
paste -- "READS1.txt" "READS2.txt" |
while IFS=$'\t' read -r reads1 reads2; do
bash align_and_remap.sh -r /path/to/reference.fasta -1 "$reads1" -2 "$reads2" -o /path/to/outdir
done
*Disclaimer, this script is not well tested at all. Do some testing of you own before letting it loose on all your data. This is also based on an old setup with bwa aln
where you had to provide each read file separately. That doesn't need to happen with bwa mem
I believe.
Login before adding your answer.
I can't spot the issue. What happens? Are there error messages?
Unless you reads are shorter than 70bp, you should be using
bwa mem
instead ofbwa aln
.There is an error message when I run bwa sampe, but I think the issue begins with bwa aln. No error message occurs when I run bwa aln; however, the .sai output file goes to the current working directory instead of following the path in READ1_list, and the sample name is not written for the .sai file (e.g. READ1_listread1.sai instead of sample_name-read1.sai)
Put an echo in your loop to see how it looks like:
Please note that you could also do the following and pipe directly to samtools sort, without samtools view:
I see two problems in your code:
Also it is hard to find the errors without any detail, are you running this on Linux/Mac or Windows?, what are the names of the input files?, what is inside the READ1_list and READ2_list files?
That shouldn't even work. You should instead e.g.
Just try e.g.
vs.
Only the latter works in bash (as intended)..
But anyway, I wouldn't even use a for loop for this but instead just define a function and use it with find and GNU parallel. Also, why are you aligning the reads separately? Surely it would make more sense to align read pairs properly in the same runs?