BAM files: How to know which reads are actual pairs?
9.0 years ago

I have a bam file for a chromosome sorted by read names. For some mate pairs I get output like this (cropped):

42629192        42629262        2PJ3LS1:183:C5RR7ACXX:4:1101:21144:7110/2
43729562        43729595        2PJ3LS1:183:C5RR7ACXX:4:1101:21144:7110/1
78061166        78061267        2PJ3LS1:183:C5RR7ACXX:4:1101:21144:7110/2

I guess none of these should really be considered to be in a pair since they are so far apart. But this leads to the question, how do you know that two reads belong together in a pair? Is this metadata in the bam file somewhere?

If the three reads above aligned much closer together, it would be hard to tell which two made up the pair and which was the odd man out, right?

samtools paired-end bam • 3.2k views

Even cropped, this doesn't look like a BAM record: the read name should be the first column.


I know; it is a BAM file that has been converted to BED and then processed (mangled?) in Python. I hope the question is still understandable and the problem I describe is still valid: how do I know which reads belong together? Thanks for pointing it out, though, so that people reading this question do not misunderstand.


In the last column, the suffix "/1" marks the first mate and "/2" marks the second mate.

But why does it matter once you have converted them to BED?


It matters because I want to find the region covered by each mate pair.
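As a rough illustration of what "the region covered by a mate pair" could mean, here is a minimal Python sketch. It assumes both mates lie on the same chromosome (as in the question) and that each mate is a (start, end) tuple. The function name `pair_span` and the optional `max_span` cutoff are hypothetical, introduced only for this example; what distance counts as "too far apart" depends on the library and is not something the thread specifies.

```python
def pair_span(mate1, mate2, max_span=None):
    """Return (start, end) of the region covered by two mates.

    mate1, mate2: (start, end) tuples on the same chromosome (assumption).
    max_span: hypothetical cutoff; if the covered region is longer than
              this, the two reads are not treated as a pair and None
              is returned.
    """
    start = min(mate1[0], mate2[0])
    end = max(mate1[1], mate2[1])
    if max_span is not None and end - start > max_span:
        return None
    return (start, end)


# The /1 read and the first /2 read from the question.
mate1 = (43729562, 43729595)
mate2 = (42629192, 42629262)

span = pair_span(mate1, mate2)                       # no cutoff applied
rejected = pair_span(mate1, mate2, max_span=10_000)  # cutoff is an arbitrary example value
```

Without a cutoff the two reads above span more than a megabase, which is why the question suspects they are not a real pair.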


Okay, then in the sequence ID column the suffix "/1" marks the first mate and "/2" marks the second mate.
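The suffix rule can be sketched in Python as follows. This is only an illustration, assuming rows of (start, end, name) as in the cropped output in the question; the helper names `group_mates` and `is_unambiguous` are made up for this example. It groups records by read name with the suffix stripped, and reports whether a group has exactly one "/1" and one "/2" record.

```python
from collections import defaultdict


def group_mates(rows):
    """Group (start, end, name) rows by read name without the /1 or /2 suffix."""
    groups = defaultdict(lambda: {"1": [], "2": []})
    for start, end, name in rows:
        base, sep, mate = name.rpartition("/")
        if sep != "/" or mate not in ("1", "2"):
            continue  # skip names that lack a mate suffix
        groups[base][mate].append((start, end))
    return dict(groups)


def is_unambiguous(group):
    """True only if there is exactly one first mate and one second mate."""
    return len(group["1"]) == 1 and len(group["2"]) == 1


# The three records from the question.
rows = [
    (42629192, 42629262, "2PJ3LS1:183:C5RR7ACXX:4:1101:21144:7110/2"),
    (43729562, 43729595, "2PJ3LS1:183:C5RR7ACXX:4:1101:21144:7110/1"),
    (78061166, 78061267, "2PJ3LS1:183:C5RR7ACXX:4:1101:21144:7110/2"),
]

groups = group_mates(rows)
group = groups["2PJ3LS1:183:C5RR7ACXX:4:1101:21144:7110"]
```

For the example in the question, the group holds one "/1" record and two "/2" records, so the suffix alone does not say which "/2" alignment goes with the "/1" read.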
