Bwa: Should Sai/Sam Sequences Processed Match?
11.1 years ago
jomaco ▴ 200

Hi,

After running BWA the output includes information such as this:

[bwa_sai2sam_pe_core] print alignments... 0.66 sec
[bwa_sai2sam_pe_core] 197538574 sequences have been processed.
[main] Version: 0.6.1-r104

Should the number of sequences processed match between the sai and sam files?

For both sai files, the number of sequences processed is 147913358.

For the sam file, it is 197538574.

I ran this again but got the same result. For other sai to sam conversions, the numbers match.
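
The comparison described above can be automated. The following is a minimal sketch, assuming the BWA log was captured as text and that every stage reports its total in the same `N sequences have been processed.` form as the line quoted in the question. The helper name `final_processed_counts` is hypothetical, and the sai count is copied from the question rather than parsed from a real aln log.

```python
import re

# Matches lines such as:
# [bwa_sai2sam_pe_core] 197538574 sequences have been processed.
PROCESSED = re.compile(r"^\[(\w+)\]\s+(\d+) sequences have been processed\.")


def final_processed_counts(log_text):
    """Return {stage_tag: last reported count} for a captured BWA log.

    Hypothetical helper: if a stage reports its count more than once,
    the last line seen is kept.
    """
    counts = {}
    for line in log_text.splitlines():
        match = PROCESSED.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


# Log excerpt quoted in the question.
sampe_log = """[bwa_sai2sam_pe_core] print alignments... 0.66 sec
[bwa_sai2sam_pe_core] 197538574 sequences have been processed.
[main] Version: 0.6.1-r104
"""

counts = final_processed_counts(sampe_log)
sai_count = 147913358  # reported for both sai files in the question
sam_count = counts["bwa_sai2sam_pe_core"]

if sam_count != sai_count:
    print(f"Mismatch: sai stage saw {sai_count}, sam stage saw {sam_count}")
```

A mismatch like this one suggests that the two stages did not read the same input files.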

Thanks,

Jom

sam bwa • 2.2k views

Unfortunately I specified the wrong read files at the sai > sam stage. These read files contained 197538574 reads instead of 147913358.

Despite my error, I wonder how the conversion could continue with the wrong read files specified. What is the purpose of specifying the read files again at the sampe stage? The process appeared to complete successfully.
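
A likely explanation, which is an assumption and not confirmed in this thread, is that the sai files hold only alignment coordinates, so the sai > sam stage reads names, bases and qualities from the read files and has no way to notice that they are the wrong ones. One guard is to count the reads in each file before the conversion. The sketch below assumes plain or gzip-compressed FASTQ with exactly four lines per record; `count_fastq_reads` and `check_inputs` are hypothetical helper names.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming four lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def check_inputs(fastq_paths, expected_reads):
    """Return the paths whose read count differs from expected_reads."""
    return [p for p in fastq_paths if count_fastq_reads(p) != expected_reads]
```

Calling `check_inputs` with the two read files and the count reported at the sai stage (147913358 in this case) would return the files that do not match, and an empty list if both are consistent.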

Thanks for taking the time to look at my question.

11.1 years ago

The help for bwa sampe seems to indicate that it has the ability to output multiple hits for paired reads:

-n INT   maximum hits to output for paired reads [3]
-N INT   maximum hits to output for discordant pairs [10]

If there were more lines than sequences in the sam file (ignoring the header), then your answer would account for that. However, I am not sure how this would explain why BWA has apparently processed more sequences to produce the sam file. If it said "197538574 alignments have been processed", then this would make sense. Even so, shouldn't the number still be the same at the sai stage?
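
Whether the sam file has more lines than sequences can be checked directly. This is a rough sketch, assuming a tab-separated SAM file whose header lines start with `@`; the function name `sam_record_stats` and the toy records are hypothetical. It holds every read name in memory, so a file with hundreds of millions of reads would need a streaming or sorted approach instead.

```python
def sam_record_stats(lines):
    """Return (alignment_records, distinct_reads) for SAM text lines."""
    records = 0
    reads = set()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        # FLAG bit 0x40 marks the first mate, 0x80 the second mate.
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        records += 1
        reads.add((qname, mate))
    return records, len(reads)


# Hypothetical toy records: one pair plus an extra line for the first mate.
toy_sam = [
    "@SQ\tSN:chr1\tLN:1000\n",
    "read1\t99\tchr1\t10\t60\t5M\t=\t50\t45\tACGTA\tIIIII\n",
    "read1\t147\tchr1\t50\t60\t5M\t=\t10\t-45\tTTGCA\tIIIII\n",
    "read1\t355\tchr1\t200\t0\t5M\t=\t50\t-145\tACGTA\tIIIII\n",
]
records, reads = sam_record_stats(toy_sam)
```

If the two numbers are equal for the real file, extra hits per read do not account for the difference, and the input read files become the more likely cause. An open file handle can be passed in place of the list.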
