Question

Using paired-end data as single-end for mapping with bwa mem. Good, bad or ugly?

0

3.4 years ago

resug ▴ 30

Hi Biostars,

I am trying to align my paired-end reads to my assembly with bwa mem (the alignment will be used for polishing with Racon afterwards). However, due to downstream application requirements (Racon), my paired-end reads have to have the same ID, which bwa does not accept on its paired-end mode. So, in order to use all my reads, I treated them as single-end reads for alignment with bwa mem (by deleting everything after the first space in each '@' header line and merging both FASTQ files).

Now I am wondering whether this approach was a good call. If not, what would be problematic here? How significant would the difference in alignment quality be when the reads are treated this way versus properly as paired-end? Could this alignment cause spurious polishing later when using Racon?

I would much appreciate your thoughts. Thanks!

alignment bwa paired-end polishing Racon • 1.4k views

ADD COMMENT • link updated 3.4 years ago by GenoMax 141k • written 3.4 years ago by resug ▴ 30

score 1 · Answer 1 · 2020-11-26

1

3.4 years ago

GenoMax 141k

my paired-end reads have to have the same ID, which bwa does not accept on its paired-end mode.

That should be easy to change with reformat.sh from BBMap suite.

reformat.sh in1=your_R1.fq in2=your_R2.fq out1=Fixed_R1.fq out2=Fixed_R2.fq addcolon=t

addcolon=t              Append ' 1:' and ' 2:' to read names, if not already present.  Please include the flag 'int=t' if the reads are interleaved.

ADD COMMENT • link 3.4 years ago by GenoMax 141k

0

Thanks, GenoMax. For Racon, no two reads should have the same identifier up to the first whitespace, so Racon would accept this happily. However, BWA-mem would not accept it, because in PE mode it requires the identifiers of both mates to be the same up to the first whitespace. This is the discrepancy I am dealing with when using Racon for polishing with a SAM file generated by BWA-mem. So, to be compatible, I ran my PE reads through BWA-mem as single reads (though I am not sure how good this alignment is; maybe it's just fine). Alternatively, it would be great to know how to run BWA-mem in PE mode with identifiers that differ before the first whitespace. Thanks again.
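The uniqueness constraint described here can be checked before polishing. The following is only a minimal sketch: `dup_ids` is a hypothetical helper name, the file names in the usage comment are placeholders, and it assumes plain, uncompressed FASTQ with exactly four lines per record.

```shell
# Hypothetical helper: print every read ID (the header text up to the first
# whitespace, without the leading '@') that occurs more than once across the
# given FASTQ files. Assumes 4-line records, i.e. no wrapped sequence lines.
dup_ids() {
    awk 'FNR % 4 == 1 {
             id = $1
             sub(/^@/, "", id)
             if (++seen[id] == 2) print id   # report each duplicate once
         }' "$@"
}

# Usage (placeholder file names): empty output means all IDs are unique.
# dup_ids your_R1.fq your_R2.fq
```

Headers are located by line position rather than by a leading '@', because a quality line may also start with '@'.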

ADD REPLY • link 3.4 years ago by resug ▴ 30

0

However BWA-mem would not accept it because it requires the identifiers to be the same until the first whitespace in PE.

What addcolon= does is add a standard 1:N:0 after the first whitespace, so these reads should work for both.

You could also do the following as an alternative to addcolon=. This will create old-style Illumina read headers.

addslash=t              Append ' /1' and ' /2' to read names, if not already present.

This should make the reads unique without the whitespace.
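If BBMap is not at hand, a similar renaming for the single-end route can be sketched with awk alone. This is an illustration under assumptions, not what reformat.sh does: `tag_mate` is a hypothetical helper, the file names are placeholders, and an underscore suffix is used instead of a slash on the assumption that some aligners may treat a trailing /1 or /2 as a mate marker and strip it. It also assumes 4-line FASTQ records and IDs that do not already carry a mate suffix.

```shell
# Hypothetical helper: tag_mate SUFFIX FILE
# Rewrites each FASTQ header to "<id><SUFFIX>" and drops everything after the
# first whitespace, so IDs stay unique once both mate files are merged.
# Assumes 4-line records; all other lines are passed through unchanged.
tag_mate() {
    awk -v s="$1" 'FNR % 4 == 1 { print $1 s; next } { print }' "$2"
}

# Usage (placeholder file names): merge both mates into one single-end file,
# then align it in single-end mode.
# { tag_mate _1 your_R1.fq; tag_mate _2 your_R2.fq; } > all_reads.fq
# bwa mem assembly.fa all_reads.fq > aln.sam
```

The same merged file would then be the read set given to Racon, so that names in the SAM file and the reads file agree.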

ADD REPLY • link 3.4 years ago by GenoMax 141k