Question

Extract R1 and R2 bam files from merged bam file

5.8 years ago

c_u ▴ 520

Hi,

I started with R1 and R2 FASTQ files and ran them through a pipeline (https://github.com/ArimaGenomics/mapping_pipeline/blob/master/Arima_Mapping_UserGuide.pdf) that combines them into a single merged BAM file (it also filters by mapping quality, adds read groups, and removes PCR duplicates).

The files are from a Hi-C experiment, and I want to analyze them with HiC-Pro. However, HiC-Pro cannot work with a merged BAM file; it needs separate BAM files for R1 and R2. Is there a way to split the merged BAM file back into the corresponding R1 and R2 BAM files? Searching online, I mostly found ways to convert the BAM file back to FASTQ files (which I could do, and then map FASTQ to BAM again, but that seems wasteful).

Any help would be great. Suggestions for improvement are welcome.

RNA-Seq samtools • 5.6k views

Updated 5.8 years ago by h.mon • written 5.8 years ago by c_u

Use samtools view with the flag filter for "first in pair" (-f 64) or "second in pair" (-f 128). See: How To Know From Which File (R1 Or R2) A Read Is Coming From Based On Sam Output

5.8 years ago by Pierre Lindenbaum

score 4 · Accepted Answer · 2018-07-10

5.8 years ago

swbarnes2 14k

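# -f 64 keeps reads flagged "first in pair" (R1); -f 128 keeps "second in pair" (R2); -b writes BAM output
samtools view -hbf 64 mydata.bam > R1.bam
samtools view -hbf 128 mydata.bam > R2.bam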
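
For anyone wondering where 64 and 128 come from: they are bits in the SAM FLAG field (0x40 = first in pair, 0x80 = second in pair). A minimal sketch of how a flag value decodes, where mate_of_flag is a made-up helper name for illustration and not part of samtools:

```shell
# Hypothetical helper (not a samtools command): report which mate a SAM FLAG marks.
# Bit 64 (0x40) = first in pair, bit 128 (0x80) = second in pair.
mate_of_flag() {
    flag=$1
    if [ $(( flag & 64 )) -ne 0 ]; then
        echo R1
    elif [ $(( flag & 128 )) -ne 0 ]; then
        echo R2
    else
        echo unpaired
    fi
}

mate_of_flag 99    # 99  = 1 + 2 + 32 + 64  -> prints R1
mate_of_flag 147   # 147 = 1 + 2 + 16 + 128 -> prints R2
```

samtools view -f N keeps a record only when every bit of N is set in its FLAG, which is why -f 64 and -f 128 split the mates.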

Thanks a lot for this!!

5.8 years ago by c_u

Hi,

I tried the commands you suggested, and they produced the R1 and R2 BAM files. But when I run the HiC analysis on them, the pipeline step that is supposed to merge the two BAM files fails with this error:

## Merging forward and reverse tags ... Forward and reverse reads not paired. Check that BAM files have the same read names and are sorted.

In other words, the two files (R1 and R2) are not paired. I sorted each of them individually but got the same error (I also looked at the files after sorting, and their lines did not match up). Can you suggest how to make the R1 and R2 files come out paired?

5.8 years ago by c_u

If the original FASTQ files contain every single read, and the BAM files contain each read once and only once, sorting both files by read name (samtools sort -n) should line things up.
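
One way to test that "once and only once" condition is to compare the read names in the two files. The sketch below is only an illustration: unpaired_names is a made-up helper, and the R1.names / R2.names files are assumed to hold one read name per line (for example from samtools view R1.bam | cut -f1). If a filtering step dropped one mate of a pair, its name will show up here.

```shell
# Hypothetical helper: print read names that appear in only one of two name lists.
# Each input file is assumed to contain one read name per line.
unpaired_names() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    # comm needs both inputs sorted in the same collation order
    LC_ALL=C sort -u "$1" > "$tmp1"
    LC_ALL=C sort -u "$2" > "$tmp2"
    # -3 hides names common to both lists; tr strips the tab comm uses for column 2
    LC_ALL=C comm -3 "$tmp1" "$tmp2" | tr -d '\t'
    rm -f "$tmp1" "$tmp2"
}

# Toy example: readB is missing from the R2 list
printf 'readA\nreadB\nreadC\n' > R1.names
printf 'readA\nreadC\n' > R2.names
unpaired_names R1.names R2.names   # prints readB
```

If this prints nothing for the real files, name-sorting both BAM files (samtools sort -n -o R1.namesorted.bam R1.bam, and the same for R2) should put the mates in matching order. If it does print names, those reads would need to be removed from both files before the pairing check can pass.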

5.8 years ago by swbarnes2