Hi, this might be an obvious question... I have a SAM file that looks something like the one below. My question is: how do you find the contig name (if that is possible)? I have looked up the definitions of the header fields but do not quite understand whether there are any contig names. Is it the same as the QNAME, and if so, do you need to decode it like the flags?
QNAME FLAG RNAME POS MAPQ CIGAR MRNM MPOS ISIZE SEQ QUAL OPT
@HD VN:1.0 SO:unsorted
@SQ SN:1 LN:249250621
@SQ SN:2 LN:243199373
@SQ SN:3 LN:198022430
@SQ SN:4 LN:191154276
..........................................................
@SQ SN:GL000245.1 LN:36651
@SQ SN:GL000246.1 LN:38154
@SQ SN:GL000247.1 LN:36422
@SQ SN:GL000248.1 LN:39786
@SQ SN:GL000249.1 LN:38502
@RG ID:S0143_RC SM:S0143 PU:no_index LB:exome CN:Lab PL:ILLUMINA
@PG ID:bowtie2 PN:bowtie2 VN:2.0.2
HISEQ:179:H9A7VADXX:1:1101:2961:2239 83 11 58711034 42 101M = 58710857 -278 TCTGAAGTGGAGCTTCTAGTATCCCCAGGAGCGCGAAGTGAACACGGAAGGTACCTGCAGGATCCAATTGTGTCCATTGATCTCTCAGAGTGGCTGAGGNT BFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFIFIIIIFIIFFFIIIIIIFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIFIIIIIIIFFFFFFFFFB0#B AS:i:-1 XS:i:-36 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:99A1 YS:i:0 YT:Z:CP RG:Z:S0143_RC
HISEQ:179:H9A7VADXX:1:1101:2961:2239 163 11 58710857 42 99M = 58711034 278 CACTTTACCTTTTTTGTCTATAAATTCATTTTGACCACGAGGCACCCCCGGAGCCTCGGTGAATCTGCTGTGATTTTGTAGGCTGCCGGATTCACAAAT BBBFFFFFFFFFFIIIFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFBBBFBFFFFFFFFFBFFFFBBBBBFFFFFFFBFFFFFFFFB AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:99 YS:i:-1 YT:Z:CP RG:Z:S0143_RC
HISEQ:179:H9A7VADXX:1:1101:2867:2247 83 17 67173782 42 101M = 67173769 -114 AGTGAGAACTAAATTAAATAATGTATAGTGAGGCCAGATGTGGTGGCTCACACCTGTAATCCCAGCACTTTGAGAGGCTGAGGCGGGTGGATCACCTGAGG BBFBFFBBBBFFFFFFFFBFBFBBBBBBFFFFFFFFBFIFFFFFFF<BFFBFBFFBIIIFIFFFFFBBFBFBFIFFFFFFF<IIFFFIFFFFB0BFFFBBB AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:101 YS:i:0 YT:Z:CP RG:Z:S0143_RC
HISEQ:179:H9A7VADXX:1:1101:2867:2247 163 17 67173769 42 99M = 67173782 114 CTTCAGAGGCATTAGTGAGAACTAAATTAAATAATGTATAGTGAGGCCAGATGTGGTGGCTCACACCTGTAATCCCAGCACTTTGAGAGGCTGAGGCGG BBBFFFFBFFFFFIIFFFFFIIBFFFFIFFIFIIIIIFBBBFFIFFFIFFIIFFFBFFFIFBBF<BFFB7B<<BBBBFBB<BFFFBBFFFFFFBBFBBB AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:99 YS:i:0 YT:Z:CP RG:Z:S0143_RC
HISEQ:179:H9A7VADXX:1:1101:3056:2236 83 9 70883881 1 101M = 70883861 -121 ACACAAAGATCAAGGTACTTTAAAAAAGCTATTCCTATTAATAACAAATCATTTTAGTTATTAATAATAAACATTAAGTAATTGACAAATATGCTTGATNC FFFFFFFFFFFFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIFIIIIIIFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFB0#B AS:i:-1 XS:i:-1 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:99T1 YS:i:0 YT:Z:CP RG:Z:S0143_RC
This appears to be data aligned to the hg19 (GRCh37) human genome. The @SQ lines list the chromosome names. Which contig names are you referring to?
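In case it helps to see where those names live, here is a minimal Python sketch, not a definitive parser. It assumes the file is tab-delimited as the SAM format requires (the paste above shows spaces), and the function names `reference_names` and `read_to_reference` are made up for illustration. It collects the SN/LN values from the @SQ header lines and reads the third column (RNAME) of each alignment record, which is the reference sequence the read was aligned to. QNAME, the first column, is only the read name.

```python
def reference_names(sam_lines):
    """Return {reference name: length} taken from the @SQ header lines."""
    refs = {}
    for line in sam_lines:
        if not line.startswith("@SQ"):
            continue
        # Header fields after the record type look like TAG:VALUE.
        tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
        refs[tags["SN"]] = int(tags["LN"])
    return refs


def read_to_reference(sam_lines):
    """Yield (QNAME, RNAME) for every alignment record."""
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        yield cols[0], cols[2]  # column 1 is QNAME, column 3 is RNAME


# A few lines from the example above, rewritten with tabs and with
# SEQ/QUAL shortened for brevity (assumption: real files use tabs).
sample = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:1\tLN:249250621",
    "@SQ\tSN:GL000245.1\tLN:36651",
    "HISEQ:179:H9A7VADXX:1:1101:2961:2239\t83\t11\t58711034\t42\t101M"
    "\t=\t58710857\t-278\tTCTG\tBFFF",
]

if __name__ == "__main__":
    print(reference_names(sample))
    for qname, rname in read_to_reference(sample):
        print(qname, "->", rname)
```

For a real file you could pass an open file handle instead of the `sample` list, since both functions only iterate over lines.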