Why "No real operator (M|I|D|N)" in picard?
1
0
Entering edit mode
9.9 years ago

Hello,

Using picard CollectAlignmentSummaryMetrics I get the error "No real operator (M|I|D|N) in CIGAR". I guess this happens when an operator other than M, I, D, or N is encountered (and in fact I have soft-clipped reads). I can override the error by setting VALIDATION_STRINGENCY=SILENT.

If my guess is correct, I would like to know why CollectAlignmentSummaryMetrics/picard is set to throw an error with operators other than M, I, D, N.

Thanks!

If relevant, here's the offending output:

java -jar -Xmx2g ~/applications/picard/picard-tools-1.92/CollectAlignmentSummaryMetrics.jar \
>     IS_BISULFITE_SEQUENCED=True \
>     INPUT=$bam \
>     OUTPUT=${bam%.bam}.AlnSmryMetr.txt \
>     REFERENCE_SEQUENCE=$ref
...
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Read name M00886:11:000000000-A88VV:1:1109:11516:16954, No real operator (M|I|D|N) in CIGAR
    at net.sf.samtools.SAMUtils.processValidationErrors(SAMUtils.java:448)
    at net.sf.samtools.BAMRecord.getCigar(BAMRecord.java:247)
    at net.sf.samtools.SAMRecord.getAlignmentEnd(SAMRecord.java:456)
    at net.sf.samtools.SAMRecord.computeIndexingBin(SAMRecord.java:1234)
    at net.sf.samtools.SAMRecord.isValid(SAMRecord.java:1644)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:540)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:522)
    at net.sf.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:481)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:672)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:650)
    at net.sf.picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:109)
    at net.sf.picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:55)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:119)
    at net.sf.picard.analysis.CollectAlignmentSummaryMetrics.main(CollectAlignmentSummaryMetrics.java:92)

And this is the problematic read:

samtools view $bam | grep 'M00886:11:000000000-A88VV:1:1109:11516:16954'
M00886:11:000000000-A88VV:1:1109:11516:16954    83    chr10    3012200    33    49M19S    =    3012200    -49    TAAACAAAATTATAACAAACATCAAACTCTAAATTTAAATAAAAGACCTACAAAAAACATACACTAAA    FGGGFGGGGGGGGFGGGGGGFCGGGGGGGGGGGGGGGGGFGFFGGGGGFGGGGFGGGGGGGGECCCCC    NM:i:0    MD:Z:49    AS:i:49    XS:i:46    RG:Z:grm029_pb_DALIHP.140422.DALIHPplas1_S1_L001_R_001_val_    YC:Z:CT    YD:Z:r
M00886:11:000000000-A88VV:1:1109:11516:16954    163    chr10    3012200    33    68S    =    3012200    49    TAAACAAAATTATAACAAACATCAAACTCTAAATTTAAATAAAAGACCTACAAAAAACATACACTAAA    66ACCGGCFGEGGFGFGFGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGG    AS:i:49    MD:Z:49    NM:i:0    RG:Z:grm029_pb_DALIHP.140422.DALIHPplas1_S1_L001_R_001_val_    XS:i:46    YC:Z:GA    YD:Z:r
9.9 years ago

Second line: `68S` means that the entire read is *ONLY* soft-clipped. Soft-clipped bases sit at the 5' or 3' end of the read and are not part of the alignment.

An aligned read with a cigar string `68S` makes no sense.

There should be at least one 'M' or '=' operator.
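To make the rule concrete, here is a minimal Python sketch of the check that the error message describes. It is not Picard's source code: the function names are made up for illustration, and the set of "real" operators is taken from the wording of the message (M, I, D, N). Pass a different `real` set if you also want to count '=' or 'X'.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")

# Operators the Picard message calls "real" (assumption: taken from the
# message text, not from Picard's code).
REAL_OPERATORS = frozenset("MIDN")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    if cigar == "*":
        return []
    elements = CIGAR_ELEMENT.findall(cigar)
    # Reject strings containing anything the pattern did not consume.
    if "".join(n + op for n, op in elements) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in elements]


def has_real_operator(cigar, real=REAL_OPERATORS):
    """True if at least one element uses an operator in `real`."""
    return any(op in real for _, op in parse_cigar(cigar))


def reference_span(cigar):
    """Number of reference bases covered (M, D, N, = and X consume reference)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")


for cigar in ("49M19S", "68S"):
    print(cigar, has_real_operator(cigar), reference_span(cigar))
```

The second CIGAR from the question covers zero reference bases, which is why an alignment end cannot be computed for it in any meaningful way.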


Thanks! Of course! I was misled by the error message. Completely soft-clipped reads are the result of clipping overlapping pairs. If the overlap is complete, one of the two mates is essentially ignored.
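In case it helps anyone else, a rough Python sketch for listing such reads from SAM text follows. The function name is hypothetical, and the two example records are cut down to the first six columns of the reads shown in the question. It only looks at the FLAG and CIGAR columns and skips reads flagged as unmapped.

```python
import re

# Any element with one of the operators named in the error message.
REAL_OP = re.compile(r"\d+[MIDN]")
UNMAPPED_FLAG = 0x4


def fully_clipped_reads(sam_lines):
    """Return (name, flag, cigar) for mapped reads with no M/I/D/N element."""
    hits = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:  # not a complete alignment record
            continue
        name, flag, cigar = fields[0], int(fields[1]), fields[5]
        if flag & UNMAPPED_FLAG or cigar == "*":
            continue
        if not REAL_OP.search(cigar):
            hits.append((name, flag, cigar))
    return hits


# First six columns of the two records from the question.
records = [
    "M00886:11:000000000-A88VV:1:1109:11516:16954\t83\tchr10\t3012200\t33\t49M19S",
    "M00886:11:000000000-A88VV:1:1109:11516:16954\t163\tchr10\t3012200\t33\t68S",
]
for name, flag, cigar in fully_clipped_reads(records):
    print(name, flag, cigar)
```

To run it on a real file, the output of `samtools view $bam` could be passed in line by line (for example via `sys.stdin`) instead of the `records` list.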
