Flag in SAM Format
6
1
12.5 years ago
Liyf ▴ 300

Hi, I am new to sequencing and I am confused about the FLAG field in the SAM format. I know that 0 means the read maps to the forward strand, 16 means it maps to the reverse strand, and 4 means it is unmapped. But what do the other flag values mean? I really do not understand any of them. I am a computer science student, so I do not know much biology.

sam • 17k views
9
12.5 years ago
Yunfei Li ▴ 310

You can find the explanation in the SAM format manual. To interpret a flag value, this website can be helpful: http://picard.sourceforge.net/explain-flags.html

0

This is a very useful website!

6
12.5 years ago
brentp 24k

Since you're a CS student, you probably recognize that these are bitwise flags, so they can be combined: you can | (OR) together flags that are powers of 2 to convey multiple pieces of information in a single number.

This Python script:

from functools import reduce  # reduce is no longer a builtin in Python 3

def asbin(n):
    """Convert a number to its binary representation, zero-padded to 17 digits."""
    return bin(n)[2:].zfill(17)

print("value\thex\tbinary")
for exp in range(17):
    val = 2 ** exp
    print("%-5d\t%-4x\t%s" % (val, val, asbin(val)))

# set all flags by OR-ing every power of two together
all_ones = reduce(lambda x, y: x | 2 ** y, range(17), 0)
print("\nall flags set:", asbin(all_ones))

Creates this output:

value   hex     binary
1       1       00000000000000001
2       2       00000000000000010
4       4       00000000000000100
8       8       00000000000001000
16      10      00000000000010000
32      20      00000000000100000
64      40      00000000001000000
128     80      00000000010000000
256     100     00000000100000000
512     200     00000001000000000
1024    400     00000010000000000
2048    800     00000100000000000
4096    1000    00001000000000000
8192    2000    00010000000000000
16384   4000    00100000000000000
32768   8000    01000000000000000
65536   10000   10000000000000000

all flags set: 11111111111111111
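
To make the combining concrete, here is a minimal sketch of building one flag value with | and testing individual bits with &. The constant names and the has_flag helper are my own labels (assumptions, not taken from the spec or any library); the numeric values are the SAM flag bits.

```python
# Hypothetical constant names; the numeric values are SAM flag bits.
PAIRED = 0x1          # template has multiple segments
PROPER_PAIR = 0x2     # each segment properly aligned
UNMAPPED = 0x4        # segment unmapped
REVERSE = 0x10        # SEQ is reverse complemented
FIRST_IN_PAIR = 0x40  # first segment in the template


def has_flag(flag, bit):
    """Return True if the given bit is set in the flag value."""
    return (flag & bit) != 0


# Combine several properties into one number with bitwise OR.
flag = PAIRED | PROPER_PAIR | REVERSE | FIRST_IN_PAIR
print(flag)  # 1 + 2 + 16 + 64 = 83

print(has_flag(flag, REVERSE))   # True
print(has_flag(flag, UNMAPPED))  # False
```

Because every flag is a distinct power of 2, OR-ing them is the same as adding them, and each bit can be read back independently of the others.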

3
12.5 years ago
Geparada ★ 1.5k

Here is a table of what these flags mean:

0x1 template having multiple segments in sequencing
0x2 each segment properly aligned according to the aligner
0x4 segment unmapped
0x8 next segment in the template unmapped
0x10 SEQ being reverse complemented
0x20 SEQ of the next segment in the template being reversed
0x40 the first segment in the template
0x80 the last segment in the template
0x100 secondary alignment
0x200 not passing quality controls
0x400 PCR or optical duplicate

The number in the second column of a SAM file is the sum of the flags that apply, written in decimal rather than hexadecimal.

For example, decimal 16 is 0x10 in hexadecimal, which means "SEQ being reverse complemented", as you already knew.
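
As a rough illustration of reading a decimal flag back into its meanings, the sketch below checks each bit from the table above against a flag value. The names FLAG_MEANINGS and decode_flag are hypothetical, chosen only for this example; the bit values and descriptions are copied from the table.

```python
# Bit values and descriptions copied from the table above.
# FLAG_MEANINGS and decode_flag are hypothetical names for this sketch.
FLAG_MEANINGS = [
    (0x1, "template having multiple segments in sequencing"),
    (0x2, "each segment properly aligned according to the aligner"),
    (0x4, "segment unmapped"),
    (0x8, "next segment in the template unmapped"),
    (0x10, "SEQ being reverse complemented"),
    (0x20, "SEQ of the next segment in the template being reversed"),
    (0x40, "the first segment in the template"),
    (0x80, "the last segment in the template"),
    (0x100, "secondary alignment"),
    (0x200, "not passing quality controls"),
    (0x400, "PCR or optical duplicate"),
]


def decode_flag(flag):
    """Return the descriptions of all bits that are set in a SAM flag value."""
    return [meaning for bit, meaning in FLAG_MEANINGS if flag & bit]


for value in (0, 4, 16, 99):
    print(value, decode_flag(value))
```

A flag of 0 decodes to an empty list: none of the conditions in the table apply, so the read is a single segment, mapped, and not reverse complemented. A value such as 99 (1 + 2 + 32 + 64) decodes to four entries at once.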

Cheers,

0

Thank you! I have been sick these past few days, so my reply is late. Haha.

0

But what does 0 mean? I think it means the read is mapped.

2
12.5 years ago

Have a look at the SAM specification.

1
8.6 years ago
-_- ★ 1.1k

For fast and handy interpretation of the flag, try http://www.samformat.info/#/flag

0
12.5 years ago
dli ▴ 250

This link explains the SAM format in detail: http://genome.sph.umich.edu/wiki/SAM
