How to merge paired-end reads from sam files?
11 months ago
aquaq • 10

Hi,

I have paired-end sequencing data, and I have aligned the forward and reverse reads with bwa mem. Each read is 120 nucleotides long, and together they cover a 180-nucleotide part of a genome, so they overlap.

bwa mem -t 20 $REF $file1 $file2 > $sam

When I open the sam output file, the first lines begin like this:

M00135:404:HBJFESJSN:2:1101:2016:1297   53      ref   ...
M00135:404:HBJFESJSN:2:1101:2016:1297   133     ref   ...
M00135:404:HBJFESJSN:2:1101:2646:1297   53      ref   ...
M00135:404:HBJFESJSN:2:1101:2646:1297   133     ref   ...

For every pair, I have two lines, one for each read aligned to the reference from its own direction (I know, this is the normal output). Is it possible to combine the forward and reverse reads into one sequence, thus getting a single 180-nucleotide alignment for each pair?

Many thanks!

EDIT: Sorry for not being clear. I would like to merge the pairs after the alignment is done.
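
To make the goal concrete, here is a minimal Python sketch of what merging a pair after alignment could mean. It is not how BBMerge, pandaseq, or any other tool works. It assumes both mates align without indels or soft clipping (a CIGAR such as 120M), and it takes POS, SEQ and QUAL straight from the SAM records, where SEQ is already stored in reference orientation. The function name `merge_pair` and the rule of keeping the higher-quality base in the overlap are illustrative choices only.

```python
def merge_pair(pos1, seq1, qual1, pos2, seq2, qual2):
    """Merge two gapless alignments of one read pair into one sequence.

    pos1/pos2: 1-based leftmost reference positions (SAM POS field).
    seq*/qual*: SAM SEQ and QUAL strings (reference orientation).
    Assumption: both CIGARs are pure matches, e.g. 120M.
    Returns (start_position, merged_sequence).
    """
    # Make mate 1 the leftmost alignment.
    if pos2 < pos1:
        pos1, seq1, qual1, pos2, seq2, qual2 = pos2, seq2, qual2, pos1, seq1, qual1

    start = pos1
    end = max(pos1 + len(seq1), pos2 + len(seq2))  # exclusive end
    merged = []
    for ref_pos in range(start, end):
        i = ref_pos - pos1  # offset into mate 1
        j = ref_pos - pos2  # offset into mate 2
        in1 = 0 <= i < len(seq1)
        in2 = 0 <= j < len(seq2)
        if in1 and in2:
            # Overlap: keep the base with the higher quality character.
            # ASCII order equals Phred order for a fixed offset; ties go to mate 1.
            merged.append(seq1[i] if qual1[i] >= qual2[j] else seq2[j])
        elif in1:
            merged.append(seq1[i])
        elif in2:
            merged.append(seq2[j])
        else:
            merged.append("N")  # uncovered gap between the mates
    return start, "".join(merged)


# Two 120-nt mates starting at positions 1 and 61 give one 180-nt sequence.
start, merged = merge_pair(1, "A" * 120, "I" * 120, 61, "C" * 120, "I" * 120)
print(start, len(merged))
```

Real alignments with indels or clipping would need the CIGAR string to be walked as well, which this sketch deliberately leaves out.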

9 months ago
Belgium

BBMerge can do this :)

Thanks. I have used pandaseq for this problem as well, but I would like to merge sequences after alignment, not before... I am sorry, I was not clear on this.

I'm not sure what your biological motivation is for this objective, but I'm completely against tampering with alignment data. Which problem are you trying to solve?

It would be just a trial. In a specific part of the sequence that we are interested in, there is a large number of mutations/sequencing errors (it was a random sequence, but it was not supposed to be that random). Before continuing with further analysis, I just wanted to be sure that this is not caused by some weird behaviour of pandaseq that I am not aware of. But I can totally accept that this is unusual; I will find another way to confirm it (e.g. by running bbmerge and comparing the results). Thanks for the help!

I would also like to do this, and yes, after alignment. I am using a downstream application that needs a merged PE format, but the alignments contain < 1% of the total original fastq reads, so it would be much more efficient to merge only the aligned reads. Did you try using aftermerge? How did it go?
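
One way to restrict the merging to aligned reads could be to first collect the names of pairs whose two primary records are both mapped, and hand only those to a merger. The sketch below is only an illustration under that assumption: it reads plain SAM text lines, uses the standard SAM flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary), and the function name `mapped_pair_names` is made up for this example.

```python
def mapped_pair_names(sam_lines):
    """Return names of read pairs whose two primary records are both mapped."""
    mapped_counts = {}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        name, flag = fields[0], int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        mapped_counts[name] = mapped_counts.get(name, 0) + 1
    # Keep only pairs where both mates passed the filter.
    return [name for name, count in mapped_counts.items() if count == 2]


# Usage with a SAM file on disk (the path is a placeholder):
# with open("aln.sam") as handle:
#     names = mapped_pair_names(handle)
```

The resulting names could then be used to pull the matching records or fastq entries before merging, so the unaligned majority is never touched.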

9 months ago
Belgium

I just saw this tool by chance, but obviously I have no idea how well it works.
