How to remove bam files that don't contain matching reads
5.1 years ago

Dear Biostars,

I have 45 directories, each containing 2500 bam and 2500 bam.bai files. Each bam file represents alignment results from aligning (shotgun metagenomic) sequences to a reference fasta file. Many of the bam files are empty and only contain the header and no matching/aligned reads. Is there a way to remove these bam files that don't contain matching reads?

Cheers,

Sam

bowtie2 samtools BAM • 1.2k views
5.1 years ago

Check with:

 find /path/to/dir -type f -name "*.bam" | while read -r F; do samtools view "${F}" | grep -v -E '^@' -m1 > /dev/null || echo "${F}"; done

This prints every bam file that has no alignment records. Once the list looks right, replace echo with rm.
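A possible variant, offered only as a sketch: instead of grepping the output, let samtools count the records with `view -c`, and add `-F 4` so that unmapped reads are not counted as matches. The function name `list_empty_bams` is made up for illustration, and the cleanup line assumes each index sits next to its bam file as `name.bam.bai`, as described in the question.

```shell
# List bam files under a directory that contain no mapped reads.
# Assumes samtools is on PATH. -c prints a record count instead of the records;
# -F 4 excludes reads flagged as unmapped.
# list_empty_bams is a hypothetical helper name, not part of samtools.
list_empty_bams() {
    find "$1" -type f -name "*.bam" | while IFS= read -r bam; do
        # Skip files samtools cannot read rather than treating them as empty.
        n=$(samtools view -c -F 4 "$bam") || continue
        if [ "$n" -eq 0 ]; then
            printf '%s\n' "$bam"
        fi
    done
}

# Dry run, prints what would be deleted:
#   list_empty_bams /path/to/dir
#
# Delete each empty bam together with its index:
#   list_empty_bams /path/to/dir | while IFS= read -r bam; do rm -f -- "$bam" "$bam.bai"; done
```

Running the dry run first is worth it here, since a truncated or unreadable bam is skipped by this sketch and would need to be checked by hand.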


Works perfectly! Thank you!


May I suggest using samtools view -H ${F}, so that only the header is extracted and inspected?


No, because there is always a header, so looking at it cannot tell an empty bam file from one that has reads.
