Question: FastQC HTML report to PDF (with a script)

Hey there,

Does anybody have a solution for converting FastQC HTML output to PDF? If you've got 50 or 500 FASTQs to check and, more importantly, to share via email, the HTML output is a little clunky to deal with.

Pierre Lindenbaum suggested Apache FOP on Twitter. I tried it, but it seems to need an XSLT stylesheet to work.

I also tried wkhtmltopdf on my Linux machine and wkpdf on my MacBook. Both produced blank PDFs.

Opening each HTML file one by one and printing to PDF on my Mac works, but it is really slow.

The point is that I want to script this, ideally on Linux, and end up with a batch of PDFs.

Posted below are two links with test data: the first shows what the output looks like in a browser, and the second is the full output from FastQC.

Thanks!

fastQC example page

zip download of output

Asked 8.1 years ago by Caddymob (950), updated 8.1 years ago by Neilfws (48k)

Why non-interactive FastQC output comes as HTML in the first place is something I think I'll never understand.

Reply posted 6.8 years ago by gaelgarcia05 (190)

Surprised no one has mentioned HTMLDOC. On Ubuntu and similar distributions, simply run:

sudo apt-get install htmldoc

then:

htmldoc --webpage -f output.pdf index.html

or just "htmldoc" for the GUI.

Answer posted 8.0 years ago by Neilfws (48k)

Yes, this works nicely too! I played with the options to make things fit better and ended up running this:

htmldoc --webpage --browserwidth 800 --fontsize 7 -f output.pdf fastqc_report.html
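
Since the original question was about scripting this for 50 or 500 reports, here is a minimal sketch of a batch loop around the htmldoc options above. It assumes each report sits at <sample>_fastqc/fastqc_report.html, as in the test data; the function name and the output directory are made up for this example, so adjust them to your layout.

```shell
#!/usr/bin/env bash
# Sketch: convert every FastQC HTML report under a directory to PDF with htmldoc.
# Assumed layout: IN_DIR/<sample>_fastqc/fastqc_report.html

# convert_fastqc_reports IN_DIR OUT_DIR   (hypothetical helper name)
convert_fastqc_reports() {
    local in_dir out_dir report sample
    in_dir="${1:-.}"
    out_dir="${2:-pdf}"
    mkdir -p "$out_dir" || return 1
    for report in "$in_dir"/*_fastqc/fastqc_report.html; do
        [ -e "$report" ] || continue    # the glob matched nothing
        # Name each PDF after its report directory, e.g. sampleA_fastqc.pdf
        sample=$(basename "$(dirname "$report")")
        htmldoc --webpage --browserwidth 800 --fontsize 7 \
            -f "$out_dir/$sample.pdf" "$report" \
            || echo "conversion failed: $report" >&2
    done
}
```

Calling `convert_fastqc_reports /path/to/fastqc_output pdf_reports` should then leave one PDF per sample in pdf_reports. A failed conversion is reported on stderr and the loop carries on with the next report.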

Reply posted 8.0 years ago by Caddymob (950)

I installed this on macOS Mojave (10.14.13) and it works, but the output is all black and white and has no plots :/

Reply posted 12 months ago by msimmer92 (180)

I just installed wkhtmltopdf and used it on your html file with this command:

wkhtmltopdf 20A.R2.QC.fq_fastqc/fastqc_report.html test.pdf

And I got a test.pdf file back with the correct contents. Here is the pdf file uploaded to imgur: http://imgur.com/f4fCz

Imgur converted the .pdf to .png so the quality is not great.

Answer posted 8.1 years ago by Damian Kao (15k)

I get a "QPixmap: Cannot create a QPixmap when no GUI is being used" error. This seems to be a bug. I'm running on a 64-bit CentOS machine. Curious what you ran it on?
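
For anyone else hitting this on a headless server: a commonly suggested workaround is to run wkhtmltopdf under a virtual X display with xvfb-run. This is only a sketch, assuming xvfb-run is installed and that the missing display is really what causes the error; the helper name is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: run wkhtmltopdf inside a virtual X server when one is available.

# html2pdf_headless INPUT.html OUTPUT.pdf   (hypothetical helper name)
html2pdf_headless() {
    if command -v xvfb-run >/dev/null 2>&1; then
        # -a lets xvfb-run pick a free display number automatically
        xvfb-run -a wkhtmltopdf "$1" "$2"
    else
        echo "xvfb-run not found; trying wkhtmltopdf directly" >&2
        wkhtmltopdf "$1" "$2"
    fi
}
```

For example: `html2pdf_headless 20A.R2.QC.fq_fastqc/fastqc_report.html test.pdf`. This may not cure a PDF that contains only the header, which could be a separate problem.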

Reply posted 8.1 years ago by Caddymob (950)

That's crazy. When I do that, I get a header and nothing else: http://public.tgen.org/jcorneveaux/FASTQC/test.pdf

Reply posted 8.1 years ago by Caddymob (950)

I ran it on Ubuntu 11.04 64 bit.

Reply posted 8.1 years ago by Damian Kao (15k)

This is the script (qcimg2pdf.sh) I use as part of my FASTQ workflow. It uses some of the images from FastQC. Run it in the parent directory that holds the fastqc directories for the different lanes:

#!/bin/bash
## qcimg2pdf.sh
## Tiles selected FastQC plots from each *fastqc directory onto one PDF page,
## then merges all pages into qc-lanes.<output_prefix>.pdf.

usage() {
  echo "Usage: $0 -o output_prefix"
  echo "Run it where you have *fastqc directories"
}

# use ghostscript-9.02  ## site-specific environment command; uncomment only if gs is not already in your PATH

outprefix=""
while getopts o: option
do
  case $option in
    o) outprefix=$OPTARG ;;
    *) usage; exit 1 ;;
  esac
done

if [[ -z $outprefix ]]; then
  echo "use correct arguments with only -o"
  usage
  exit 1 # exit shell script
fi
echo "$outprefix"

for j in *fastqc/
do
  j=${j%/}                  # strip the trailing slash
  [[ -d $j ]] || continue   # no *fastqc directories here
  echo "$j"

  # 2x2 grid of plots, with the lines following "Filename" in fastqc_data.txt drawn as text
  convert \( -scale 500x500 "$j/Images/per_base_quality.png" "$j/Images/per_base_gc_content.png" +append \) \
          \( -scale 500x500 "$j/Images/per_sequence_quality.png" "$j/Images/per_sequence_gc_content.png" +append \) \
          -append -font Helvetica -pointsize 12 -gravity northeast \
          -draw "translate +5,+5 text 80,80 '$(grep -A5 Filename "$j/fastqc_data.txt")'" "QC.$j.pdf"
done

# Merge the per-lane pages into a single PDF
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile="qc-lanes.$outprefix.pdf" QC.*.pdf

Answer posted 8.0 years ago by Rm (7.8k)

Brilliant solution, and just the simple kind of approach I needed, thx RM!

Reply posted 8.0 years ago by Caddymob (950)

I'm not sure I understand how this works. What arguments do you need to give the script?

Reply posted 3.7 years ago by kyle.tingey (0)

In my case I ran it on Linux and it produced a PDF with three plots; it didn't convert the whole report to PDF.

Reply posted 12 months ago by msimmer92 (180)

I would be interested in extracting the raw FastQC data as generic tables to render plots in R

If anyone would like to participate I could create a github repository for this "project".

Answer posted 8.1 years ago by Jeremy Leipzig (18k)

I think the fastqc_data.txt file has much of it..
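
To make that concrete, here is a rough sketch of pulling a single module out of fastqc_data.txt as a tab-separated table. It assumes the usual layout, where each module starts with a ">>Module name<TAB>status" line, has a column header line starting with "#", and ends with ">>END_MODULE". The function name is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: print one module of fastqc_data.txt as a TSV table (header row first).

# fastqc_module FILE "MODULE NAME"   (hypothetical helper name)
fastqc_module() {
    awk -F'\t' -v module="$2" '
        /^>>END_MODULE/ { inside = 0; next }                          # end of the current module
        /^>>/           { inside = (substr($1, 3) == module); next }  # module line: ">>Name<TAB>status"
        inside          { sub(/^#/, ""); print }                      # header (minus "#") and data rows
    ' "$1"
}
```

For example, `fastqc_module sample_fastqc/fastqc_data.txt "Per base sequence quality" > per_base_quality.tsv` writes a file that R can load with read.delim("per_base_quality.tsv"). Modules that carry extra "#" lines before the column header will need a little more handling.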

Reply posted 8.1 years ago by Caddymob (950)

I did this a while ago, but never maintained it - you're free to cannibalise as much as you want/need! https://github.com/clark-lab-robot/Repitools-git/blob/master/pkg/Repitools/R/FastQC-class.R

Reply posted 8.0 years ago by Aaron Statham (♦ 1.1k)

There is also a bioc package called qrqc that will get you some of the fastqc stats as well. The nice thing about the package is that it does all the read processing "online" (you don't have to load the entire thing) in C code, like fastqc does.

Reply posted 8.1 years ago by Steve Lianoglou (5.0k)

Yes, been testing qrqc too..

Reply posted 8.1 years ago by Caddymob (950)

Thanks Aaron, looks like some good stuff :)

Reply posted 8.0 years ago by Caddymob (950)

Here is the XSLT stylesheet for FO:

https://gist.github.com/lindenb/6b66d062c74097766dcf47912d409448

The HTML document is not a valid XML document, so I used xsltproc (with its --html option) to parse and transform it before running FOP. Here is the Makefile:

all:fastqc.pdf

fastqc.fo:fastqc2fo.xsl fastqc_report.html
    xsltproc --html fastqc2fo.xsl fastqc_report.html > $@

fastqc.pdf:fastqc.fo
    fop  $< $@

The result was posted on slideshare: http://www.slideshare.net/lindenb/biostar17037

Edit: my output is missing one or two tables, but you get the idea.

Answer posted 18 months ago by Pierre Lindenbaum (120k)

My company, Expected Behavior, has a service called DocRaptor that converts HTML to PDF or Excel format. Unlike wkhtmltopdf, DocRaptor generates fully functional PDF files, not just a PNG.

Here's a link to the home page:

http://docraptor.com/

And a link to the code example page. You make an HTTP POST request to DocRaptor's server, and we send your file back to you.

http://docraptor.com/examples

Answer posted 8.0 years ago by Tyler Moore (20)

I just tried htmltolatex, and after a few minutes I decided that you are right: there must be an easier way.

@DK's approach seems the best.

http://htmltolatex.sourceforge.net/

Answer posted 8.1 years ago by Zev.Kronenberg (11k)

Yeah... I just wish it actually worked for me.

Reply posted 8.1 years ago by Caddymob (950)
