Does BWA report, somewhere in its output, how many reads mapped and how many failed? Does the number labelled 'processed' include both, or only the mapped reads? Can I retrieve the failed reads, as in Bowtie?
You should check out bamtools.
You can run bamtools stats -in path/to/your/alignments.bam, which will (quickly) report general alignment statistics.
You can also run bamtools filter -isMapped false -in path/to/your/alignments.bam > unmapped.bam (or something quite close to that) to get your unmapped reads.
Not from the alignment program itself, as far as I can tell. But you can tell from the SAM file by checking the flags, or convert the file to a .bam with samtools and use samtools flagstat to count.
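If you would rather check the flags yourself, here is a minimal Python sketch of the idea. It assumes a plain-text SAM file read line by line; count_mapped and the three example records are made up for illustration. It skips secondary and supplementary records so that each read is counted once, which means its totals can differ from the record-level numbers that flagstat prints.

```python
import io

UNMAPPED = 0x4                 # SAM flag bit: segment unmapped
NOT_PRIMARY = 0x100 | 0x800    # secondary or supplementary alignment


def count_mapped(sam_lines):
    """Return (mapped, unmapped) counts of primary records in SAM text."""
    mapped = unmapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        if flag & NOT_PRIMARY:
            continue  # count each read once, via its primary record
        if flag & UNMAPPED:
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped


# Hypothetical records: two mapped reads and one unmapped read.
example = "\n".join([
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
    "r3\t16\tchr1\t200\t37\t4M\t*\t0\t0\tGGCC\tIIII",
])
print(count_mapped(io.StringIO(example)))  # (2, 1)
```

For a real file you would pass an open file handle instead of the StringIO object.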
I don't think you can pull out failed reads as such, but you can use samtools view to make a .bam or .sam file of the reads that failed to align (-f 0x0004). From there you can either pull out the read data yourself or, since some programs will take a .bam file as input instead of a fastq file, use that file directly.
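As a rough illustration of pulling out the read data yourself, the Python sketch below turns the unmapped records of a plain-text SAM file back into FASTQ entries. The function name unmapped_to_fastq and the example records are invented here. It assumes the SEQ and QUAL columns are present, restores the original orientation when the reverse-strand bit is set, and appends /1 or /2 to the read name for paired reads.

```python
UNMAPPED = 0x4                 # segment unmapped
REVERSE = 0x10                 # SEQ is stored reverse-complemented
FIRST, LAST = 0x40, 0x80       # first / last segment of a pair
NOT_PRIMARY = 0x100 | 0x800    # secondary or supplementary alignment

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def unmapped_to_fastq(sam_lines):
    """Yield one FASTQ entry (as a string) per unmapped primary record."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and headers
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if (flag & NOT_PRIMARY) or not (flag & UNMAPPED):
            continue
        if flag & REVERSE:
            # Undo the reverse complement so the read matches the input.
            seq = seq.translate(_COMPLEMENT)[::-1]
            qual = qual[::-1]
        if flag & FIRST:
            name += "/1"
        elif flag & LAST:
            name += "/2"
        yield f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical records: r1 is mapped, r2 and r3 are unmapped.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
    "r3\t20\t*\t0\t0\t*\t*\t0\t0\tAACG\tABCD",
]
print("".join(unmapped_to_fastq(example)), end="")
```

This only reads SAM text, so a .bam file would first have to be converted with samtools view.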
Thanks! I didn't know that the BAM alignments also include information on the failed reads. I use Bio::DB::Sam a lot, and it looks like I can get the information through it, too.
It's possible that some aligners, like bowtie, might omit unmapped reads. But bwa certainly includes them. Note that an unmapped read whose mate is mapped will be given the position information of its mapped partner, so you can't assume that a read with a position is mapped, or that unmapped reads won't have positions. You have to check the binary flag.
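To make the point about positions concrete, here is a small Python sketch with a made-up read pair in which only the first mate aligned. The describe helper and the records are hypothetical. The second mate carries chr1:100 in its RNAME and POS columns even though its flag (133 = paired + unmapped + last in pair) says it is unmapped.

```python
PAIRED = 0x1          # read is part of a pair
UNMAPPED = 0x4        # this segment is unmapped
MATE_UNMAPPED = 0x8   # the mate is unmapped


def describe(sam_record):
    """Summarise one SAM record: position present vs. mapped per the flag."""
    fields = sam_record.rstrip("\n").split("\t")
    flag = int(fields[1])
    return {
        "name": fields[0],
        # RNAME is '*' and POS is 0 only when no position was assigned.
        "has_position": fields[2] != "*" and int(fields[3]) > 0,
        # The flag, not the position, decides whether the read is mapped.
        "mapped": not (flag & UNMAPPED),
        "mate_mapped": bool(flag & PAIRED) and not (flag & MATE_UNMAPPED),
    }


# Hypothetical pair: mate 1 mapped (flag 73), mate 2 unmapped (flag 133)
# but placed at the position of its mapped partner.
pair = [
    "p1\t73\tchr1\t100\t37\t4M\t=\t100\t0\tACGT\tIIII",
    "p1\t133\tchr1\t100\t0\t*\t=\t100\t0\tTTTT\tIIII",
]
for record in pair:
    print(describe(record))
```

Filtering on the position columns alone would wrongly count the second record as mapped.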
Thanks, this is very good to know. The Perl module I mentioned above actually has methods to look at both mates of a pair, so I think that won't be a problem.