Biostar Beta. Not for public use.
Samtools Unmapped Reads
0
Entering edit mode
2.3 years ago

I'm trying to extract the unmapped reads that appear in the "*" region when samtools idxstats is run. However, when I run samtools view -b [filename] '*' and then convert to a fastq using samtools bam2fq, the resulting fastq file is empty, and I get the message that 0 reads were processed. However, samtools idxstats definitely shows that there were unmapped reads under "*". Did I extract the reads incorrectly? If not, what else could be the problem? Thanks in advance!

genome alignment • 781 views
ADD COMMENTlink
1
Entering edit mode

Doing samtools view -f 4 foo.bam | samtools bam2fq - may be better.

ADD REPLYlink
0
Entering edit mode

Thanks for the feedback. I'm aware of that flag and will use it if necessary, but I'd love to make my original command work as it would be much more efficient. Is there any way this could be possible?

ADD REPLYlink
1
Entering edit mode

I am not sure what version of samtools you are using but samtools view -b [filename] '*' does not work for me with samtools v. 1.8.

ADD REPLYlink
0
Entering edit mode

The related bug report is here, that should work provided the unmapped reads don't have mapped mates.

ADD REPLYlink
0
Entering edit mode

In which way would that be much more efficient?

ADD REPLYlink
0
Entering edit mode

It would only go through the reads classified with '*' rather than the entire genome, thus drastically reducing the number of reads to search

ADD REPLYlink
1
Entering edit mode

Interesting, I didn't think of it that way. However, you will miss "placed unmapped" reads when you do it like that.

/usr/bin/time -v samtools view 2D_PRO1898_S1_lib_DNA_CTTGTA_L001.bbduk.rg.markdup.bam '*' | wc -l
    User time (seconds): 1.26
    System time (seconds): 0.33
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.60
16258 # reads

/usr/bin/time -v samtools view -f 4 -F 2 2D_PRO1898_S1_lib_DNA_CTTGTA_L001.bbduk.rg.markdup.bam | wc -l
    User time (seconds): 854.41
    System time (seconds): 19.23
    Elapsed (wall clock) time (h:mm:ss or m:ss): 15:15.97
22950 # reads

Edit: formatting

ADD REPLYlink
0
Entering edit mode
2.3 years ago
maarten • 0

Maybe you indexed the file with sambamba<0.67 which causes an error (see https://github.com/biod/sambamba/issues/322 ).

Try to index it again withsamtools index foo.bam and try again

ADD COMMENTlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.3.1