bbmap: Order of reference genomes seems to matter
7.2 years ago
mschmid ▴ 180

I mapped some metagenomic paired-end (PE) Illumina data to several complete bacterial genomes from NCBI and to some complete assemblies from our lab (see legend in graphs).

I used bbmap.sh with the parameter "ambig=all" and I see a strange effect that I do not fully understand. The order of the sequences in the "ref" FASTA file seems to matter significantly. But with "ambig=all" this should not be the case, right (at least not to that extent)? I would understand it for "ambig=best" (since there it uses the first hit on the first reference if all scores are the same).

My guess was that this happens because the reference chromosomes are very similar over large parts and the algorithm gives up after a certain number of hits. So I increased the parameter "maxsites" (up to 5000). This did not change the outcome. I also tried increasing "minid" and setting "slow=t". Neither changed the outcome.

I attached two examples of barplots reflecting the number of hits on every genome:

  1. First graph: This is the first run, where the strains "D76" and "CAUH18" are located at the beginning of the FASTA file. Command: bbmap.sh ambig=all maxindel=80 ref=all.fasta in=R1.fastq.gz in2=.R2.fastq.gz out=paired_ambigAll.sam
  2. Second graph: This is the mapping after I moved the strains "D76" and "CAUH18" to the end of the FASTA file. Same command with a changed ref: bbmap.sh ambig=all maxindel=80 ref=all_turned.fasta in=R1.fastq.gz in2=.R2.fastq.gz out=paired_ambigAll.sam
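
For anyone who wants to reproduce the reordering step, here is a minimal Python sketch. It assumes the strain name appears somewhere in each FASTA header line; the function names `read_fasta` and `move_records_to_end` and the toy sequences are hypothetical and not part of the original workflow.

```python
def read_fasta(text):
    """Split FASTA text into a list of (header, sequence_lines) records."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append((line, []))
        elif records and line.strip():
            records[-1][1].append(line)
    return records


def move_records_to_end(text, keywords):
    """Return FASTA text with records whose header contains any keyword moved to the end.

    The relative order inside both groups is preserved.
    """
    records = read_fasta(text)
    keep = [r for r in records if not any(k in r[0] for k in keywords)]
    moved = [r for r in records if any(k in r[0] for k in keywords)]
    out = []
    for header, seq_lines in keep + moved:
        out.append(header)
        out.extend(seq_lines)
    return "\n".join(out) + "\n"


# Toy input (made-up headers and sequences, for illustration only)
example = ">D76 ch\nACGT\n>other ch\nGGCC\n>CAUH18 pl\nTTAA\n"
reordered = move_records_to_end(example, ["D76", "CAUH18"])
print(reordered)
```

Mapping against both the original and the reordered file with otherwise identical parameters then isolates the effect of reference order.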

The mapping counts are normalized relative to the strain with the most mapped reads. "ch" means "chromosome", "pl" stands for "plasmid".
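
The normalization described here can be sketched in Python as follows. This is only an illustration under assumptions: it counts every mapped alignment record in the SAM output as one hit (including any secondary records), and the function names `count_hits` and `normalize_to_max` as well as the toy records are hypothetical. The actual counting used for the graphs may differ.

```python
from collections import Counter


def count_hits(sam_lines):
    """Count mapped alignment records per reference name (SAM column 3)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":  # 0x4 = read unmapped
            continue
        counts[rname] += 1
    return counts


def normalize_to_max(counts):
    """Scale counts so that the reference with the most hits gets 1.0."""
    top = max(counts.values(), default=0)
    return {name: (n / top if top else 0.0) for name, n in counts.items()}


# Toy SAM records (truncated to the first four columns, for illustration only)
toy_sam = [
    "@HD\tVN:1.4",
    "r1\t0\tD76_ch\t10",
    "r1\t256\tCAUH18_ch\t10",
    "r2\t0\tD76_ch\t50",
    "r3\t4\t*\t0",
]
hits = count_hits(toy_sam)
relative = normalize_to_max(hits)
print(relative)
```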

[Figure: barplot of normalized hits per genome, first run]

[Figure: barplot of normalized hits per genome, second run]

Thanks for any hints!

Edit: The problem also occurs if I provide only "in1" (so that bbmap runs in single-ended mode).

bbmap mapping illumina • 2.4k views

It looks like there's a bug in BBMap's handling of ambiguously-mapped paired reads, which are handled a bit differently from ambiguously-mapped single-ended reads. I'll investigate.


Thanks Brian! Just to let you know: I did not require bbmap to map the reads as pairs; I extracted only the pairs afterwards with samtools. EDIT: Brian, do you think you'll have a fix soon? I would like to continue with bbmap, but without this problem.
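
The exact samtools command is not given in this thread. As an illustration only, the sketch below assumes that "pairs" means records with the SAM proper-pair flag (0x2) set, which is what `samtools view -f 2` would select. The function name `keep_proper_pairs` and the toy records are hypothetical.

```python
PROPER_PAIR = 0x2  # SAM flag bit: each segment properly aligned


def keep_proper_pairs(sam_lines):
    """Keep header lines and alignment records that have the proper-pair bit set."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            kept.append(line)
            continue
        fields = line.split("\t")
        if len(fields) > 1 and int(fields[1]) & PROPER_PAIR:
            kept.append(line)
    return kept


# Toy SAM records (truncated to the first four columns, for illustration only)
toy_sam = [
    "@HD\tVN:1.4",
    "r1\t99\tD76_ch\t10",   # paired, proper pair, first in pair
    "r1\t147\tD76_ch\t60",  # paired, proper pair, second in pair
    "r2\t73\tD76_ch\t80",   # paired, mate unmapped, not a proper pair
    "r3\t0\tD76_ch\t90",    # unpaired
]
pairs_only = keep_proper_pairs(toy_sam)
print(len(pairs_only))
```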


Yes, I'm working on it. Not sure about the timeline, though; it's not trivial. Hopefully by March. I'm very surprised about the single-ended result, though. Would it be possible for you to send me the reads and references?


Hi Brian, you got the download links via e-mail, right?


Yep, I got them. Sorry for my lack of speed on this issue!


No rush. Just wanted to make sure...
