3.0 years ago
Anna-Leigh
I'm interested in analyzing paired-end SLAM-seq data in a custom manner.
I want to split an aligned RNA-seq file in two: every read with more than one mismatch of a given type (for example, T>C) goes into one SAM/BAM, and all other reads go into another.
I'm sure I could do it with pysamtools, but was wondering if there is an easier way by directly parsing the MD tag. Could probably get partially there with a quick filter on the NM tag, but I also want specific types of mismatches.
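To make the MD-tag idea concrete, here is a rough sketch in plain Python of how one might count mismatches of a specific type from a read's SEQ, CIGAR and MD fields. The function names (`aligned_bases`, `count_mismatches`) are made up for illustration, and it assumes the MD tag is present and consistent with the CIGAR. It also only looks at the bases as stored in the SAM record, so for reads from the opposite strand (or the second mate in a paired-end library) a T>C conversion would presumably show up as A>G and need `ref_base="A", read_base="G"`.

```python
import re

# CIGAR operations: a length followed by a one-letter op code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# MD tokens: a run of matches, a deletion (^ACGT...), or one mismatched reference base.
MD_RE = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def aligned_bases(seq, cigar):
    """Return the read bases covered by M/=/X operations only.

    Insertions and soft clips consume read bases but are not described
    by the MD tag, so they are skipped here.
    """
    out = []
    pos = 0
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            out.append(seq[pos:pos + n])
            pos += n
        elif op in "IS":
            pos += n
        # D, N, H and P consume no read bases.
    return "".join(out)


def count_mismatches(seq, cigar, md, ref_base="T", read_base="C"):
    """Count positions where the reference has ref_base and the read has read_base."""
    query = aligned_bases(seq, cigar)
    pos = 0
    count = 0
    for matches, deletion, mismatch in MD_RE.findall(md):
        if matches:
            pos += int(matches)
        elif mismatch:
            if mismatch.upper() == ref_base and query[pos].upper() == read_base:
                count += 1
            pos += 1
        # Deletions are reference-only, so the read position does not move.
    return count


# Reference ACGTACGT vs read ACGCACGT: one T>C at the fourth base.
print(count_mismatches("ACGCACGT", "8M", "3T4"))
```

With pysam, the three inputs would likely come from `read.query_sequence`, `read.cigarstring` and `read.get_tag("MD")`, and reads where the count is greater than 1 could be written to one output file and the rest to another.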