Question: Printable visualizations of large-scale alignments

I would like to visualize a large ("large" as in 20 bacterial genomes) multi-sequence alignment such that (a) the sequences are wrapped within pages, (b) the individual nucleotide letters remain visible (at least minutely), and (c) nucleotide differences compared to a consensus are highlighted. I wish to convert these alignments into PDF documents for subsequent printing.

Popular alignment viewers such as AliView or Geneious have their difficulties with such alignment visualizations. AliView cannot provide a wrapped layout, whereas Geneious hangs for large alignments (24 GB RAM insufficient for alignment of 20 bacterial genomes).

Do you have any software suggestions?

Note: One option may be to use a console-based alignment viewer (such as alan or alv), if the resulting visualizations could be passed to cups-pdf.

Edit 1: The R package msa was among my earlier attempts. While a wonderful tool for generating small, publication-ready visualizations, it too stalls once the alignment becomes too large. For example, an alignment of 20 bacterial genomes is not converted within two hours of executing the command myFirstAlignment <- msa(mySequences). Given that msa essentially wraps texshade, I would guess that the processing time with texshade would not be much shorter either.

15 months ago Michael Gruenstaeudl • 40

I won't add this as an answer just yet since I'm not sure how well it would handle it either, but the other 2 options that occur to me are SeaView (which I think can save PDF or maybe PostScript), and ESPript.

I can't attest to how well either will deal with large data though.

Can I ask why it is you need such a large alignment 'printed'? It doesn't seem like it would be very useful.

15 months ago Joe 12k

When you say convert to PDF, do you want them to still be vector graphics? Or would bitmaps work too?

15 months ago thackl ♦ 2.6k
3

I found that I can achieve what I was looking for using the command-line alignment viewer alv. It may not be super pretty, but it is fast:

alv myReallyLargeAlignment.fasta -t dna -k -w 300 | aha | wkhtmltopdf - soughtVisualization.pdf
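For batch use, the same pipeline can be driven from a script. The following is only a minimal sketch, assuming that alv, aha and wkhtmltopdf are installed and on the PATH; the function names build_commands and run_pipeline are hypothetical, and the flag descriptions in the comments reflect my reading of the tools' help texts rather than anything authoritative.

```python
import shutil
import subprocess


def build_commands(fasta_path, pdf_path, width=300):
    """Return the three argv lists of the alv | aha | wkhtmltopdf pipeline."""
    return [
        # -t dna: treat the input as nucleotides; -k: keep colours although
        # stdout is a pipe; -w: alignment columns per wrapped block
        ["alv", fasta_path, "-t", "dna", "-k", "-w", str(width)],
        # aha converts the ANSI colour codes into an HTML page
        ["aha"],
        # "-" makes wkhtmltopdf read that HTML from standard input
        ["wkhtmltopdf", "-", pdf_path],
    ]


def run_pipeline(fasta_path, pdf_path, width=300):
    """Chain the commands with pipes, as the shell one-liner does."""
    commands = build_commands(fasta_path, pdf_path, width)
    missing = [cmd[0] for cmd in commands if shutil.which(cmd[0]) is None]
    if missing:
        raise FileNotFoundError("not on PATH: " + ", ".join(missing))
    procs = []
    upstream = None  # stdout of the previous process, if any
    for cmd in commands:
        is_last = cmd is commands[-1]
        proc = subprocess.Popen(
            cmd,
            stdin=upstream,
            stdout=None if is_last else subprocess.PIPE,
        )
        if upstream is not None:
            upstream.close()  # the child holds its own copy of the pipe
        upstream = proc.stdout
        procs.append(proc)
    # One exit code per stage; 0 everywhere means the PDF was written.
    return [proc.wait() for proc in procs]
```

Calling run_pipeline("myReallyLargeAlignment.fasta", "soughtVisualization.pdf") should then be equivalent to the one-liner; a smaller width gives narrower blocks and therefore more pages.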

In addition, the alignment viewer belvu of the SeqTools package can also generate visualizations quickly, although only the first 10 kbp or so are visualized (depending on the line-wrap setting).
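If colour is not essential, requirements (a) and (c) from the question can also be approximated as plain text, which any text-to-PDF route can print. The sketch below is not what alv or belvu do internally. It is a hypothetical illustration that assumes the aligned sequences are already loaded as equal-length strings (reading the FASTA file is left out), and it marks every position that matches a simple majority consensus with a dot, so that only the differences remain as letters.

```python
from collections import Counter


def consensus(seqs):
    """Most common character in each alignment column (ties: first seen)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def wrapped_diff(names, seqs, width=60):
    """Return wrapped text blocks; matches to the consensus become '.'."""
    cons = consensus(seqs)
    # Pad all labels to the longest name so the columns line up.
    label = max(len(n) for n in list(names) + ["consensus"])
    lines = []
    for start in range(0, len(cons), width):
        chunk = slice(start, start + width)
        lines.append(f"{'consensus':<{label}} {cons[chunk]}")
        for name, seq in zip(names, seqs):
            row = "".join(
                "." if a == b else a for a, b in zip(seq[chunk], cons[chunk])
            )
            lines.append(f"{name:<{label}} {row}")
        lines.append("")  # blank line between wrapped blocks
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy input; real genome alignments would be read from a file instead.
    print(wrapped_diff(["s1", "s2", "s3"], ["ACGT", "ACGA", "TCGT"], width=2))
```

Because the work is a single pass over the columns, this kind of rendering stays cheap even for long alignments, at the cost of losing the colour highlighting.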

15 months ago Michael Gruenstaeudl • 40

If you want PDF, there's a powerful LaTeX package: http://ftp.cvut.cz/tex-archive/macros/latex/contrib/texshade/texshade.pdf

And if you don't want to deal with TeX, there's an R interface, too: https://rdrr.io/bioc/msa/man/msaPrettyPrint.html

15 months ago thackl ♦ 2.6k

Beat me to it! TeXShade would be my suggestion. I used it extensively in my thesis. Just be aware that it can add some compile time to the document.

15 months ago Joe 12k

@jrj.healey In theory I would agree, but compiling a texshade section with 150,000+ bp in a LaTeX document would take ages. I am looking for something more lightweight.

15 months ago Michael Gruenstaeudl • 40

@thackl The R package msa was among my earlier attempts, but it is not a workable solution: it stalls when given a multiple sequence alignment of 20 or more bacterial genomes. Likewise, TeX documents with texshade sections would take ages to compile (if they compile at all) with such input alignments. In my experience, the only software tools that can open bacterial-genome-sized alignments for visualization in a reasonable time frame are terminal/console-based tools (such as the ones mentioned above).

15 months ago Michael Gruenstaeudl • 40

Ah, good to know. Haven't used it for alignments that big yet.

15 months ago thackl ♦ 2.6k
