I mapped some metagenomic paired-end Illumina data to several complete bacterial genomes from NCBI and to some complete assemblies from our lab (see legend in graphs).
I used bbmap.sh with the parameter “ambig=all” and I see a strange effect that I do not fully understand: the order of the sequences in the “ref” FASTA file seems to matter significantly. With “ambig=all” this should not be the case, right (at least not to that extreme)? I would understand it for “ambig=best” (since there it uses the first hit on the first ref if all scores are the same).
My guess was that this happens because the reference chromosomes are very similar over large parts and the algorithm gives up after a certain number of hits. So I increased the “maxsites” parameter (up to 5000), which did not change the outcome. I also tried increasing “minid” and setting “slow=t”; neither changed the outcome.
I attached two examples of barplots reflecting the number of hits on every genome:
- First graph: This is the first run, where the strains “D76” and “CAUH18” are located at the beginning of the fasta file. Command: bbmap.sh ambig=all maxindel=80 ref=all.fasta in=R1.fastq.gz in2=.R2.fastq.gz out=paired_ambigAll.sam
- Second graph: This is the mapping after I moved the strains “D76” and “CAUH18” to the end of the Fasta file. Same command with changed ref: bbmap.sh ambig=all maxindel=80 ref=all_turned.fasta in=R1.fastq.gz in2=.R2.fastq.gz out=paired_ambigAll.sam
The mapping counts are normalized relative to the strain with the most mapped reads. “ch” means “chromosome”, “pl” stands for “plasmid”.
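For anyone who wants to reproduce the bar heights, here is a minimal sketch of the counting and normalization described above. It is not the exact script behind the graphs. It assumes that every reported site appears as its own record in the SAM output and that one mapped record counts as one hit. The function names and the reference names in the example are made up for illustration.

```python
from collections import Counter


def count_hits_per_reference(sam_lines):
    """Count mapped SAM records per reference name (RNAME, column 3)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        flag = int(fields[1])
        rname = fields[2]
        # FLAG bit 0x4 marks an unmapped read; RNAME '*' means no reference.
        if flag & 0x4 or rname == "*":
            continue
        counts[rname] += 1
    return counts


def normalize_to_max(counts):
    """Scale counts so the reference with the most hits becomes 1.0."""
    top = max(counts.values(), default=0)
    if top == 0:
        return {}
    return {name: n / top for name, n in counts.items()}


# Tiny in-memory example with hypothetical reference names.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tD76_ch\t1\t40\t4M\t*\t0\t0\tACGT\tIIII",
    "r1\t256\tCAUH18_ch\t1\t40\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tD76_ch\t5\t40\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
example_counts = count_hits_per_reference(example)
example_normalized = normalize_to_max(example_counts)
```

To run it on a real file, pass the open file handle instead of the list, e.g. `with open("paired_ambigAll.sam") as fh: counts = count_hits_per_reference(fh)`. Comparing the two dictionaries from the two runs (original and reordered reference) shows directly which genomes gain or lose hits.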
Thanks for some hints!
Edit: Update: I also have this problem if I provide only "in1" (so that bbmap runs in single-ended mode).
It looks like there's a bug in BBMap's handling of ambiguously-mapped paired reads, which are handled a bit differently from ambiguously-mapped single-ended reads. I'll investigate.
Thanks Brian! Just to let you know: I did not run bbmap with the requirement that the reads must map as pairs; I extracted only the pairs afterwards with samtools. EDIT: Brian, do you think you'll have a fix soon? I would like to go on with bbmap, but without this problem.
Yes, I'm working on it. Not sure about the timeline, though; it's not trivial. Hopefully by March. I'm very surprised about the single-ended result, though. Would it be possible for you to send me the reads and references?
Hi Brian, you got the download links via e-mail, right?
Yep, I got them. Sorry for my lack of speed on this issue!
No rush. Just wanted to make sure...