Samtools Idxstats
12.4 years ago
KCC ★ 4.1k

I am reading the output from samtools idxstats. From the website for samtools, it says "The output is TAB delimited with each line consisting of reference sequence name, sequence length, # mapped reads and # unmapped reads."
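For reference, here is a minimal Python sketch of reading that four-column output. The sample text and the names `IdxstatsRow` and `parse_idxstats` are made up for illustration and are not part of samtools. The final `*` line is where idxstats reports unmapped reads that have no reference assigned at all.

```python
from typing import List, NamedTuple


class IdxstatsRow(NamedTuple):
    name: str      # reference sequence name
    length: int    # sequence length
    mapped: int    # number of mapped reads
    unmapped: int  # number of unmapped reads placed on this reference


def parse_idxstats(text: str) -> List[IdxstatsRow]:
    """Parse TAB-delimited idxstats output into typed rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, length, mapped, unmapped = line.split("\t")
        rows.append(IdxstatsRow(name, int(length), int(mapped), int(unmapped)))
    return rows


# Hypothetical idxstats output, invented for this example.
EXAMPLE = "chr1\t1000\t50\t2\nchr2\t800\t30\t0\n*\t0\t0\t5\n"

rows = parse_idxstats(EXAMPLE)
# References that carry unmapped reads, which is the case the question asks about.
placed_unmapped = [r for r in rows if r.name != "*" and r.unmapped > 0]
```

In this made-up sample, `placed_unmapped` holds only the `chr1` row, because two unmapped reads were assigned to that reference.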

My input to samtools is a BAM file that I generated from a SAM file produced by bwa, which was used to align some reads to the reference genome.

I am trying to understand why a chromosome can have an unmapped read i.e. how is it that a read can be unmapped and yet assigned to a chromosome?

12.4 years ago
Swbarnes2 ★ 1.6k

A few reasons. For one, bwa concatenates all the reference sequences together before aligning. So if a read hangs off the end of one sequence onto the start of the next, it's given the appropriate mapping position, and the unmapped flag is also set, as a sign that something is off about the alignment.
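To make "hanging off one sequence onto the next" concrete, here is a toy Python model of a concatenated reference. It is only an illustration of the idea, not bwa's actual code. The reference names, lengths, and the function `spanned_references` are hypothetical.

```python
from itertools import accumulate


def spanned_references(refs, start, read_len):
    """Return the names of references that a read overlaps.

    refs is a list of (name, length) pairs laid end to end;
    start is the 0-based read start in the concatenated coordinates.
    """
    names = [name for name, _ in refs]
    ends = list(accumulate(length for _, length in refs))  # exclusive end offsets
    starts = [0] + ends[:-1]                               # inclusive start offsets
    read_end = start + read_len                            # exclusive end of the read
    return [
        name
        for name, s, e in zip(names, starts, ends)
        if start < e and read_end > s  # half-open interval overlap test
    ]


# Hypothetical references: chr1 covers [0, 1000), chr2 covers [1000, 1800).
REFS = [("chr1", 1000), ("chr2", 800)]

inside = spanned_references(REFS, 100, 50)    # lies entirely within chr1
boundary = spanned_references(REFS, 990, 50)  # covers the chr1/chr2 junction
```

A read like `boundary` matches the last bases of chr1 followed by the first bases of chr2. That only happens because the sequences were joined, so it is not a real alignment, which is why it keeps a position but is flagged as unmapped.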

Second, the SAM spec calls for an unmapped read whose mate is mapped to be given the chromosome and position of that mate. This is so that when you sort the reads by chromosome and position, the unmapped read sorts next to its mapped mate. Again, the 0x4 flag tells you that the read really is unmapped. The SAM spec says that if the 0x4 flag is set, you can't rely on the chromosome, position, CIGAR string, mapping quality, or the other alignment fields in the .sam entry.
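If you want to see which reads these are, a rough Python sketch is to scan the SAM text for records that have the 0x4 bit set but still carry a reference name. The SAM records below are invented for illustration, and `placed_unmapped_reads` is a hypothetical helper, not a samtools function.

```python
UNMAPPED = 0x4  # SAM FLAG bit: this read is unmapped


def placed_unmapped_reads(sam_text):
    """Return (qname, flag, rname, pos) for unmapped records that still have a reference name."""
    found = []
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & UNMAPPED and rname != "*":
            found.append((qname, flag, rname, pos))
    return found


# Hypothetical records: a pair where only the first read mapped (flags 73 and 133),
# plus a single-end read that is unmapped with no reference assigned (flag 4).
EXAMPLE_SAM = "\n".join([
    "@SQ\tSN:chr1\tLN:1000",
    "pair1\t73\tchr1\t100\t60\t4M\t=\t100\t0\tACGT\tIIII",
    "pair1\t133\tchr1\t100\t0\t*\t=\t100\t0\tTTTT\tIIII",
    "single1\t4\t*\t0\t0\t*\t*\t0\t0\tGGGG\tIIII",
])

hits = placed_unmapped_reads(EXAMPLE_SAM)
```

Here `hits` contains only the second read of `pair1`. It is unmapped, but it borrows chr1 and position 100 from its mapped mate, so idxstats counts it as an unmapped read on chr1.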


"a read hangs off of one sequence onto the next": what does this sentence mean?

Does it mean a read that covers part of chr1 (for example) and part of chr2? Can such a read exist, and if so, how?
