What's the difference between 'clipping' and 'unmapped' and 'skipped'?
6.0 years ago

Hi. I am using the Tablet sequence viewer, but I don't really understand the difference between 'clipping', 'unmapped', and 'skipped' in a CIGAR string. And if a CIGAR value is '36S76M', what is going on with bases 37~75? Thank you.

tablet cigar alignment • 1.4k views
6.0 years ago
said3427 ▴ 120

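36S76M means the first 36 nucleotides are soft-clipped (nucleotides that are present in the read but not part of the alignment), followed by 76 alignment matches (M, which can be a sequence match or a mismatch). Bases 37 to 112 of the read are therefore the aligned part.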
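To make the base ranges concrete, here is a minimal Python sketch that splits a CIGAR string into its operations and reports which 1-based read positions each one covers. The function names `parse_cigar` and `read_segments` are made up for this illustration and are not part of Tablet or any SAM library.

```python
import re

# One CIGAR operation: a length followed by an operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) tuples."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings that contain anything besides valid operations.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return ops

def read_segments(cigar):
    """Return (op, first_base, last_base) for every operation that
    consumes bases stored in SEQ (M, I, S, =, X), using 1-based positions."""
    consumes_read = set("MIS=X")
    segments = []
    pos = 1
    for length, op in parse_cigar(cigar):
        if op in consumes_read:
            segments.append((op, pos, pos + length - 1))
            pos += length
    return segments

print(read_segments("36S76M"))
# [('S', 1, 36), ('M', 37, 112)]
```

So for 36S76M, bases 1 to 36 are soft-clipped and bases 37 to 112 are aligned.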

6.0 years ago
d-cameron ★ 2.9k

I suggest reading the SAM specification document. Pages 1 and 2 have a good example that addresses your question about alignment and CIGAR operators.

6.0 years ago
Zhixue ▴ 10

Unmapped information always shows in FLAG with 0x4, with no information in CIGAR.

Clipping (S/H) and skipped-region (N) information is shown in the CIGAR string.

For example:

  1. If read A is unmapped, its FLAG includes the 0x4 bit (decimal 4) and its CIGAR is '*'.

  2. If read A is mapped, but only part of it is aligned, its FLAG does not include the 0x4 bit and its CIGAR looks like '36S76M' (36 nt are not aligned to the reference, but these soft-clipped bases are still stored in SEQ).

  • A similar concept is hard clipping (H), where the clipped bases are NOT present in SEQ. H can only be present as the first and/or last operation. S may only have H operations between it and the ends of the CIGAR string (e.g. 3S89M, 67M43S, 31S56M21S).
  3. If read A is mapped, but parts of it map to different positions with a long gap between them, its CIGAR looks like 56M1200N63M.
  • For mRNA-to-genome alignment, an N operation represents an intron.
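The three cases above can be sketched in a few lines of Python. This is only an illustration: `summarize` is a hypothetical helper, and the FLAG values 0 and 4 in the usage lines are made-up inputs, not taken from a real file.

```python
import re

# One CIGAR operation: a length followed by an operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def summarize(flag, cigar):
    """Report unmapped status plus clipped and skipped lengths for one read."""
    if flag & 0x4:
        # Unmapped: the CIGAR is not interpreted at all.
        return {"unmapped": True}
    totals = {}
    for n, op in CIGAR_OP.findall(cigar):
        totals[op] = totals.get(op, 0) + int(n)
    return {
        "unmapped": False,
        "soft_clipped": totals.get("S", 0),   # bases kept in SEQ
        "hard_clipped": totals.get("H", 0),   # bases removed from SEQ
        "skipped": totals.get("N", 0),        # reference bases jumped over
        # M, D, N, = and X are the operations that consume the reference.
        "reference_span": sum(totals.get(op, 0) for op in "MDN=X"),
    }

print(summarize(4, "*"))
print(summarize(0, "36S76M"))
print(summarize(0, "56M1200N63M"))
```

For 56M1200N63M the read covers 56 + 1200 + 63 = 1319 reference positions, although only 119 read bases are aligned.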

"Unmapped information always shows in FLAG with 0x4, with no information in CIGAR."

If a read is unmapped (0x4), the CIGAR can be any legal CIGAR. The SAM specifications do not require a CIGAR of * for unmapped reads.

If 0x4 is set, no assumptions can be made about RNAME, POS, CIGAR, MAPQ, and bits 0x2, 0x100, and 0x800.
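In practice this means testing the FLAG bit rather than looking for a CIGAR of `*`. A minimal sketch, assuming a tab-separated SAM record whose second column is FLAG and sixth column is CIGAR; the record below is invented for illustration and `is_unmapped` is a hypothetical helper:

```python
def is_unmapped(sam_line):
    """Decide unmapped status from FLAG bit 0x4 only, ignoring CIGAR."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 of a SAM record is FLAG
    return bool(flag & 0x4)

# Invented record: FLAG 4 (unmapped) but a non-'*' CIGAR in column 6.
line = "read1\t4\tchr1\t100\t0\t4M\t*\t0\t0\tACGT\tIIII"
cigar = line.split("\t")[5]

print(is_unmapped(line), cigar)
# True 4M
```

A check such as `cigar == "*"` would treat this record as mapped, which is why the flag is the safer test.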