How does Pilon treat single N characters when polishing a genome?

5.7 years ago

ewilbanks • 0

Hi folks,

Does anyone know how Pilon treats single N characters?

I'm using Pilon with Illumina data to polish a PacBio-assembled genome that contains some single N characters as ambiguous bases, and I'm confused about how Pilon handles them. For many of these Ns there should be good read support to correct them to an A, C, G, or T, but my current attempts leave them untouched. Pilon is correcting other ambiguous bases (e.g. R, Y, K) to the correct base, but it is ignoring the Ns. These single Ns aren't gaps; they are ambiguous bases that arose from merging overlapping contigs with Geneious's assembler. Any ideas?

The command I'm running is:

java -Xmx120g -jar ~/software/anaconda2/pkgs/pilon-1.22-1/share/pilon-1.22-1/pilon-1.22.jar \
    --genome ref.fasta \
    --frags aln.sorted.bam \
    --unpaired u.sorted.bam \
    --changes --vcf --tracks \
    --threads 16 \
    --fix bases,amb \
    --outdir pilon_02
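
To make the problem easier to check, here is a minimal Python sketch (not part of Pilon) that lists every isolated N in a FASTA file, i.e. an N with no N directly before or after it. The function names read_fasta and isolated_n_positions are hypothetical helpers made up for this illustration, and the sketch assumes a plain, uncompressed FASTA. The reported positions are 1-based, so they can be compared by hand against the --changes and --vcf output from the run above.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the contig name.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# An N (either case) that is not adjacent to another N.
ISOLATED_N = re.compile(r"(?<![Nn])[Nn](?![Nn])")


def isolated_n_positions(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (contig, 1-based position) for every isolated N."""
    hits = []
    for name, seq in read_fasta(lines):
        # Sequence lines are joined first, so an N run split across
        # two lines is still treated as a run, not as single Ns.
        for match in ISOLATED_N.finditer(seq):
            hits.append((name, match.start() + 1))
    return hits
```

Passing an open handle on ref.fasta to isolated_n_positions would give the list of contig/position pairs to look up. This only locates the Ns; it says nothing about why Pilon leaves them alone.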

Assembly sequencing genome pilon polishing • 1.3k views
