Biostar Beta. Not for public use.
0
Entering edit mode
2.3 years ago

I'm trying to extract the unmapped reads that appear in the "*" region when samtools idxstats is run. However, when I run samtools view -b [filename] '*' and then convert to a fastq using samtools bam2fq, the resulting fastq file is empty, and I get the message that 0 reads were processed. However, samtools idxstats definitely shows that there were unmapped reads under "*". Did I extract the reads incorrectly? If not, what else could be the problem? Thanks in advance!

genome alignment • 781 views
1
Entering edit mode

Doing samtools view -f 4 foo.bam | samtools bam2fq - may be better.

0
Entering edit mode

Thanks for the feedback. I'm aware of that flag and will use it if necessary, but I'd love to make my original command work as it would be much more efficient. Is there any way this could be possible?

1
Entering edit mode

I am not sure what version of samtools you are using but samtools view -b [filename] '*' does not work for me with samtools v. 1.8.

0
Entering edit mode

The related bug report is here, that should work provided the unmapped reads don't have mapped mates.

0
Entering edit mode

In which way would that be much more efficient?

0
Entering edit mode

It would only go through the reads classified with '*' rather than the entire genome, thus drastically reducing the number of reads to search

1
Entering edit mode

Interesting, I didn't think of it that way. However, you will miss "placed unmapped" reads when you do it like that.

/usr/bin/time -v samtools view 2D_PRO1898_S1_lib_DNA_CTTGTA_L001.bbduk.rg.markdup.bam '*' | wc -l
User time (seconds): 1.26
System time (seconds): 0.33
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.60

/usr/bin/time -v samtools view -f 4 -F 2 2D_PRO1898_S1_lib_DNA_CTTGTA_L001.bbduk.rg.markdup.bam | wc -l
User time (seconds): 854.41
System time (seconds): 19.23
Elapsed (wall clock) time (h:mm:ss or m:ss): 15:15.97


Edit: formatting

0
Entering edit mode
2.3 years ago
maarten • 0

Maybe you indexed the file with sambamba<0.67 which causes an error (see https://github.com/biod/sambamba/issues/322 ).

Try to index it again withsamtools index foo.bam and try again