I have many .bam files and want to generate their .bai index files using samtools in the terminal.
I tried the following command:
samtools index *.bam
However, I did not get any .bai file.
Using GNU parallel:
parallel samtools index ::: *.bam
Thanks, this is useful. Do you know the optimal way of setting -j? If it is not set, does it just default to the maximum number of cores? Or should I set it to the total number of cores minus 1, so that with 8 cores I would set it to 7? Thanks.
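As far as I know, GNU parallel without -j runs one job per CPU (what counts as a CPU has varied between versions), and -j -1 should mean "all CPUs minus one". If you would rather compute the number yourself, here is a minimal sketch. The variable names cores and jobs are just illustrative, and it assumes nproc or getconf is available to report the CPU count.

```shell
# Count CPUs: nproc on GNU systems, getconf as a fallback elsewhere.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Leave one core free for the rest of the system, but never drop below one job.
jobs=$(( cores > 1 ? cores - 1 : 1 ))
echo "Using $jobs of $cores cores"

# Only run when both tools are installed and at least one BAM file is present.
if command -v parallel >/dev/null && command -v samtools >/dev/null && ls ./*.bam >/dev/null 2>&1; then
    parallel -j "$jobs" samtools index ::: ./*.bam
fi
```

Indexing is often limited by disk speed rather than CPU, so it may be worth timing a couple of -j values on your own machine rather than assuming more jobs is faster.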
Samtools index only accepts a single input file, so using a shell metacharacter to specify multiple files will not work. I usually use a shell wrapper to run samtools index on a single file at a time.
for INFILE in "$@"
do
    samtools index "$INFILE"
done
Then it is simple to run:
You need to point the results to a file to create this:
So for one file it would be
samtools index file.bam file.bam.bai
See this link for a great description: Here
In your case with many bam files I would do it in a shell script as follows:
for i in *.bam
do
    echo "Indexing: $i"
    samtools index "$i" "$i.bai"
done
The samtools index foo.bam foo.bam.bai syntax won't work with the two most recent versions of samtools. The way to do this is simply samtools index foo.bam. Enabling people to specify alternate index filenames is low on the priority list.
@[Devon Ryan](https://www.biostars.org/u/7403/): Yes indeed, with my samtools version I need to write samtools index foo.bam
So if I understood correctly, you are saying that *.bam won't work with my samtools version?
*.bam won't work with any samtools version. I should note that understanding why this is the case will be helpful for you in general (this will be the case for many tools), so I'll explain that.
Assume you're in a directory with three BAM files: A.bam, B.bam and C.bam. When you type samtools index *.bam, your shell sees *.bam and expands it before samtools ever runs. Consequently, what samtools actually receives is samtools index A.bam B.bam C.bam. That'd be fine if samtools index could accept more than one input file at a time, but it can't. It will then either treat this as the alternate syntax that [Jonathan Crowther](https://www.biostars.org/u/7338/) mentioned or simply die due to not knowing what to do (I'd have to check the source code, though I expect the former would happen).
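You can see the expansion for yourself without touching any real data. This is only a sketch: it creates three empty placeholder files in a temporary directory and uses echo in place of samtools, so what gets printed is the argument list the program would have received.

```shell
# Work in a scratch directory so nothing real is affected.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Empty placeholder files; they only need to match the *.bam pattern.
touch A.bam B.bam C.bam

# The shell expands *.bam before the command runs; echo shows the result.
expanded=$(echo samtools index *.bam)
echo "$expanded"
# prints: samtools index A.bam B.bam C.bam
```

Putting echo in front of a command like this is a handy way to check what any wildcard turns into before running the real thing.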
So, what you want to do is simply iterate over the files with a for loop:
for f in *.bam
do
    samtools index "$f"
done
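One thing to watch with a loop like this: if no file matches, most shells leave the pattern *.bam as literal text and the loop runs once on a file that does not exist. Below is a small sketch that guards against that. The function name index_all and its directory argument are made up for illustration and are not part of samtools.

```shell
# index_all DIR: run samtools index on every .bam file in DIR (hypothetical helper).
index_all() {
    for f in "$1"/*.bam
    do
        # If nothing matched, $f is the unexpanded pattern, so skip it.
        [ -e "$f" ] || continue
        echo "Indexing: $f"
        samtools index "$f"
    done
}
```

After defining it, index_all . would index the BAM files in the current directory, and index_all /some/other/dir would do the same elsewhere.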
Very helpful Devon Ryan Tx !
@[Jonathan Crowther](https://www.biostars.org/u/7338/) : Tx! I will give it a try