Remove singletons from bam
1
0
Entering edit mode
6.0 years ago
Gene_MMP8 ▴ 240

I have a tumor bam file and running samtools flagstat gave the following output

1020505173 + 0 in total (QC-passed reads + QC-failed reads)  
0 + 0 secondary  
0 + 0 supplementary  
0 + 0 duplicates  
1003947659 + 0 mapped (98.38% : N/A)  
1020505173 + 0 paired in sequencing  
508168252 + 0 read1  
512336921 + 0 read2  
991357546 + 0 properly paired (97.14% : N/A)  
996058067 + 0 with itself and mate mapped  
7889592 + 0 singletons (0.77% : N/A)  
3742219 + 0 with mate mapped to a different chr  
3349172 + 0 with mate mapped to a different chr (mapQ>=5)

The number of sequences in read1 and read2 differ, although they add up to the number of paired reads in sequencing. What can I do to make this right?

sequencing next-gen • 2.8k views
ADD COMMENT
0
Entering edit mode

What makes you think you have to correct this?

ADD REPLY
0
Entering edit mode

While aligning this will give an error, right? Or do aligners usually skip unmapped reads?

ADD REPLY
0
Entering edit mode

You have a bam file. Those reads are aligned.

ADD REPLY
0
Entering edit mode

Apologies for my confusion. If I run BWA-MEM will it matter if the number of reads are different? Thanks again!

ADD REPLY
0
Entering edit mode

Run bwa mem on what?

ADD REPLY
0
Entering edit mode

aligning read_1 and read_2 using bwamem

ADD REPLY
0
Entering edit mode

But the reads ARE aligned. What are you doing?

ADD REPLY
0
Entering edit mode

The reads were aligned using hg19 reference. I want to align them using hg38. So I converted the bam reads to paired fastq files and now I want to align them again.

ADD REPLY
1
Entering edit mode

Maybe you should have said that from the beginning?

Anyway: if you are aligning paired end reads then you need exactly as much reads in read1 as in read2, in the same order.

ADD REPLY
0
Entering edit mode

So is it a problem with bam2fq command? What's the way out?

ADD REPLY
0
Entering edit mode

. I want to align them using hg38. So I converted the bam reads to paired fastq files and now I want to align them again.

ADD REPLY
1
Entering edit mode
6.0 years ago

Take a look to this page https://broadinstitute.github.io/picard/explain-flags.html

Then try to use samtools view with the -f and the -F options to select those mapped reads that are of your interest.

If you have mapped paired reads, all your BAM flags will be an odd number, meaning that they come from a paired mapping

If you check the second option (read mapped in proper pair) you will see a value of 2 that can be used with samtools view -f to extract all the mapped reads that fit that condition

samtools view -f2 -b your_original.bam > your_new.bam
ADD COMMENT
0
Entering edit mode

Thanks for replying. So basically I am trying to read a bam file without the singletons. I tried out your suggestion but getting an error now.

[W::bgzf_read_block] EOF marker is absent. The input is probably truncated  
[main_samview] truncated file.

What to do here?

ADD REPLY
2
Entering edit mode

If you have this error when running the samtools command is that the original file is corrupted. If dealing with the final filtered file, give it a little time to finish the command. I have just tested the command and is working nicely

ADD REPLY

Login before adding your answer.

Traffic: 1503 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6