Filter bam files using a bed file: Why is the mate missing?
6.6 years ago
komal.rathi ★ 4.1k

Hi everyone,

This is a sorted bam (by coordinates):

samtools view 7316-161-T_Aligned.out.sorted.bam | grep 'FCC78FRACXX:1:1101:6639:75204'

FCC78FRACXX:1:1101:6639:75204#  163 chr2    74156623    255 23M2445N77M =   74159237    8677    CGCCTATCAATCAGATTAAACTCCTGAACAAAGAAAATAAAGTGCTTAAAGGAGGTGTTGAGGTGGGCCTCCTCTTGCAGCTGCATCACAGAATCAAAGT    bbbeeeeegggggiiiiihiiiihhiihiiiihiihiifhf]Yafaccfggfcgh^ceghhhi_ceghggggeeceedddcbccbccccc_bbcccc`c]    NH:i:1  HI:i:1  AS:i:199    NM:i:0  MD:Z:100

FCC78FRACXX:1:1101:6639:75204#  83  chr2    74159237    255 18M5963N82M =   74156623    -8677   TTTTGGGAAATGGGACACCAATCTTAGAAGGAAAAAGAGTTTCATCATCAAGCTGATCTTGAACCCAAGTCATCAAATAGTCAATGTATTTTGGTGCAGA    ^cccddddb`deeeedbbdggggagiiihiiihhiiiiihhhgchihhiihhhhhfiihihfciieiiiiihhiehihiiihiiiiigggggeeeeebbb    NH:i:1  HI:i:1  AS:i:199    NM:i:0  MD:Z:100

I extracted the reads mapped to a particular region like this:

head genes_10000Flanks.bed
chr2    74109441        74156992        ACTG2

samtools view -b -L genes_10000Flanks.bed 7316-161-T_Aligned.out.sorted.bam -o 7316-161-T_Aligned.out.filtered_new.bam

But when I look at the filtered bam file I only get one mate:

samtools view 7316-161-T_Aligned.out.filtered_new.bam | grep 'FCC78FRACXX:1:1101:6639:75204'

FCC78FRACXX:1:1101:6639:75204#  163 chr2    74156623    255 23M2445N77M =   74159237    8677    CGCCTATCAATCAGATTAAACTCCTGAACAAAGAAAATAAAGTGCTTAAAGGAGGTGTTGAGGTGGGCCTCCTCTTGCAGCTGCATCACAGAATCAAAGT    bbbeeeeegggggiiiiihiiiihhiihiiiihiihiifhf]Yafaccfggfcgh^ceghhhi_ceghggggeeceedddcbccbccccc_bbcccc`c]    NH:i:1  HI:i:1  AS:i:199    NM:i:0  MD:Z:100

Why is the other mate missing from the filtered output?

RNA-Seq Samtools • 2.2k views
6.6 years ago

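Because the second mate starts at position 74159237, which lies outside the BED interval 74109441-74156992. The -L option does not work the way you expect: it keeps only the alignments that themselves overlap an interval in the BED file and does not pull in their mates. If you want to retrieve both reads of each pair, extend the intervals in your BED file, or add a second step that retrieves all the reads by name. https://www.google.com/search?q=site%3Abiostars.org+read+names+bam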
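
A minimal sketch of the "second step by name" approach, not a tested pipeline. It assumes a samtools release recent enough to support view -N (read-name file; I believe this arrived in 1.12, so check samtools view --help first). The function name keep_pairs and its argument order are made up for illustration.

```shell
# Hypothetical helper (name and argument order are illustrative only):
# keep both mates of every pair in which at least one mate overlaps
# an interval in the BED file.
# usage: keep_pairs in.sorted.bam regions.bed out.bam
keep_pairs() {
    in_bam=$1
    bed=$2
    out_bam=$3
    names=$(mktemp) || return 1

    # Step 1: collect the names of all reads that overlap the BED intervals.
    samtools view -L "$bed" "$in_bam" | cut -f1 | sort -u > "$names"

    # Step 2: extract every alignment carrying one of those names,
    # wherever it maps. Assumes samtools supports -N (see note above).
    samtools view -b -N "$names" -o "$out_bam" "$in_bam"
    status=$?

    rm -f "$names"
    return "$status"
}
```

For the files in the question this would be called as keep_pairs 7316-161-T_Aligned.out.sorted.bam genes_10000Flanks.bed 7316-161-T_Aligned.out.filtered_new.bam. Because the second pass streams through the input in its original order, the output should stay coordinate-sorted.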


Thanks. I just thought it would be simpler than this; I was hoping something like a partial match would work.
