samtools idxstats reports zero aligned or unaligned reads
0
0
Entering edit mode
6.1 years ago
Fluorine ▴ 100

I've aligned ChIP-seq data that is publically available (SRP004897), with Bowtie2 -k 3 command. I need information on the number of aligned reads per chromosome for downstream processing, for which I've used samtools idxstats. The output of idxstats shows 0 aligned and 0 unaligned reads, see below:

head idxstats.txt

"chr" "chr_length" "mapped_reads" "unmapped_reads"

chr1 248956422 0 0

chr10 133797422 0 0

chr10_GL383545v1_alt 179254 0 0

chr10_GL383546v1_alt 309802 0 0

chr10_KI270824v1_alt 181496 0 0

chr10_KI270825v1_alt 188315 0 0

chr11 135086622 0 0

chr11_GL383547v1_alt 154407 0 0

In fact i should have around 10 million aligned reads, according to bowtie2 and also confirmed with samtools flagstats. Any idea what is the problem? I've tried with two different versions of samtools (v1.2 and v1.7).

samtools idxstats • 2.7k views
ADD COMMENT
0
Entering edit mode

make sure you have the correct index and not a leftover from another run

you can double check the numbers by counting with

samtools view -F 4 -c alignment.bam chr1

This is a way to count all mapped reads to chr1

ADD REPLY
0
Entering edit mode

The index was made yesterday and I have also tried re-doing the conversion sam to bam, sorting, and indexing. With samtools view command, i get the number of mapped reads so everything is correct, but still idxstats doesn't want to give me any reads.

ADD REPLY
0
Entering edit mode

Why not use the Bowtie alignment stats that are automatically output to terminal?

There may still be an issue with indexing, as per Istvan, because Bowtie uses different genome FASTA indices to SAMtools. Bowtie also encodes different flags into the SAM format, which are evidently incompatible with SAMtools in this case.

Typically one would use samtools idxstats on a bwa-aligned BAM. Bowtie, however, gives its own alignment stats, as to which I have implied above.

Kevin

ADD REPLY
0
Entering edit mode

Bowtie output doesn't give the number of mapped reads per chromosome. I'm actually trying to replicate the data from an article and they claim they have used Bowtie2 for the alignment and after that, they have a script on GitHub which automates the downstream processing. But the script doesn't work due to this problem with samtools idxstats. If what you are saying is true about Bowtie using different FASTA indices, then the pipeline they claim to have used in the article is not correct.

ADD REPLY
0
Entering edit mode

There is a previous thread here: samtools idxstats on indexed BAM failed

However, you appear to have already tried the eventual solution, ie., sorting and indexing the BAM with samtools

The samtools genome indices are created with samtools faidx, by the way, but I'm not sure if it's truly relevant in this case. You may want to just tr it.

ADD REPLY
0
Entering edit mode

By the way, you may want to follow this thread, in addition: Calculating amount of reads on certain chromosomes in bam file

ADD REPLY

Login before adding your answer.

Traffic: 2223 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6