Samtools Unmapped Reads
1
0
Entering edit mode
5.9 years ago

I'm trying to extract the unmapped reads that appear in the "*" region when samtools idxstats is run. However, when I run samtools view -b [filename] '*' and then convert to a fastq using samtools bam2fq, the resulting fastq file is empty, and I get the message that 0 reads were processed. However, samtools idxstats definitely shows that there were unmapped reads under "*". Did I extract the reads incorrectly? If not, what else could be the problem? Thanks in advance!

genome alignment • 4.2k views
ADD COMMENT
2
Entering edit mode

Doing samtools view -f 4 foo.bam | samtools bam2fq - may be better.

ADD REPLY
0
Entering edit mode

Thanks for the feedback. I'm aware of that flag and will use it if necessary, but I'd love to make my original command work as it would be much more efficient. Is there any way this could be possible?

ADD REPLY
1
Entering edit mode

I am not sure what version of samtools you are using but samtools view -b [filename] '*' does not work for me with samtools v. 1.8.

ADD REPLY
1
Entering edit mode

The related bug report is here, that should work provided the unmapped reads don't have mapped mates.

ADD REPLY
1
Entering edit mode

In which way would that be much more efficient?

ADD REPLY
0
Entering edit mode

It would only go through the reads classified with '*' rather than the entire genome, thus drastically reducing the number of reads to search

ADD REPLY
2
Entering edit mode

Interesting, I didn't think of it that way. However, you will miss "placed unmapped" reads when you do it like that.

/usr/bin/time -v samtools view 2D_PRO1898_S1_lib_DNA_CTTGTA_L001.bbduk.rg.markdup.bam '*' | wc -l
    User time (seconds): 1.26
    System time (seconds): 0.33
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.60
16258 # reads

/usr/bin/time -v samtools view -f 4 -F 2 2D_PRO1898_S1_lib_DNA_CTTGTA_L001.bbduk.rg.markdup.bam | wc -l
    User time (seconds): 854.41
    System time (seconds): 19.23
    Elapsed (wall clock) time (h:mm:ss or m:ss): 15:15.97
22950 # reads

Edit: formatting

ADD REPLY
1
Entering edit mode
5.8 years ago
maarten ▴ 10

Maybe you indexed the file with sambamba<0.67 which causes an error (see https://github.com/biod/sambamba/issues/322 ).

Try to index it again withsamtools index foo.bam and try again

ADD COMMENT

Login before adding your answer.

Traffic: 1933 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6