I have a tumor BAM file, and running samtools flagstat on it gave the following output:
1020505173 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
1003947659 + 0 mapped (98.38% : N/A)
1020505173 + 0 paired in sequencing
508168252 + 0 read1
512336921 + 0 read2
991357546 + 0 properly paired (97.14% : N/A)
996058067 + 0 with itself and mate mapped
7889592 + 0 singletons (0.77% : N/A)
3742219 + 0 with mate mapped to a different chr
3349172 + 0 with mate mapped to a different chr (mapQ>=5)
The numbers of read1 and read2 sequences differ, although they add up to the total number of paired reads in sequencing. What can I do to fix this?
What makes you think you have to correct this?
This will cause an error during alignment, right? Or do aligners usually skip unmapped reads?
You have a bam file. Those reads are aligned.
Apologies for my confusion. If I run BWA-MEM, will it matter if the number of reads is different? Thanks again!
Run bwa mem on what?
Aligning read_1 and read_2 using BWA-MEM.
But the reads ARE aligned. What are you doing?
The reads were aligned to the hg19 reference. I want to align them to hg38, so I converted the BAM reads to paired FASTQ files and now I want to align them again.
Maybe you should have said that from the beginning?
Anyway: if you are aligning paired-end reads, then you need exactly as many reads in read1 as in read2, and in the same order.
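If it helps, here is a minimal sketch of how you could check that requirement on the two FASTQ files before running the aligner. It is only an illustration, not part of any tool: the function names (`open_fastq`, `read_names`, `check_pairing`) are made up for this example, and it assumes well-formed 4-line FASTQ records whose read names may carry a trailing `/1` or `/2`.

```python
import gzip
import itertools


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_names(handle):
    """Yield the read name of each record (assumes exactly 4 lines per record)."""
    for i, line in enumerate(handle):
        if i % 4 == 0:  # header line of a record
            parts = line[1:].split()  # drop the leading '@' and any description
            name = parts[0] if parts else ""
            if name.endswith(("/1", "/2")):  # strip the mate suffix, if present
                name = name[:-2]
            yield name


def check_pairing(r1_path, r2_path):
    """Return (matched_pairs, problem); problem is None if the files are paired."""
    with open_fastq(r1_path) as f1, open_fastq(r2_path) as f2:
        n = 0
        for a, b in itertools.zip_longest(read_names(f1), read_names(f2)):
            if a is None or b is None:  # one file ran out of records first
                return n, "read counts differ after record %d" % n
            if a != b:  # same position, different read name
                return n, "name mismatch at record %d: %s vs %s" % (n + 1, a, b)
            n += 1
    return n, None
```

Calling `check_pairing("read_1.fq", "read_2.fq")` (file names are placeholders) returns the number of matching pairs seen and a description of the first problem, or `None` if the files are consistent. If it reports a problem, one likely cause is that some reads in the BAM have no mate; as far as I know, `samtools fastq` can write such singletons to a separate file with `-s`, and it expects name-collated input (for example from `samtools collate`), but please check the samtools documentation for your version.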
So is it a problem with the bam2fq command? What's the way out?