Generating PDF from HTML using DocRaptor on Heroku

There comes a time when one has to create PDFs for a Rails application. Searching the web will most likely bring you to libraries like “PDF Kit”:http://github.com/jdpace/PDFKit/ and “Wicked PDF”:http://github.com/mileszs/wicked_pdf/ that use “wkhtmltopdf”:http://code.google.com/p/wkhtmltopdf/ as a driver.

If your app is hosted on “Heroku”:http://heroku.com, you will wonder whether wkhtmltopdf is available so that you can use one of these awesome libraries. “Searching the Heroku docs”:http://docs.heroku.com/search?q=pdf, you will probably come to the same conclusion I did: nothing there!

As the Heroku support states:

You’re correct that there’s no official documentation. Most of our customers seem to be using wkhtmltopdf, generally with pdfkit. I do hope to document this usage soon.

http://github.com/jdpace/PDFKit
http://code.google.com/p/wkhtmltopdf/

If you want us to offer a Prince add-on, I encourage you to write them inviting them to check out our Add-on Provider Programme: http://addons.heroku.com/provider

Of course, you can always purchase your own license and run it on EC2 or another server of your choosing.

h2. PrinceXML

One of the libraries used at “my last job”:http://www.blaulabs.de/ was “PrinceXML”:http://princexml.com, which did a great job generating PDFs from HTML pages. It “supports most of HTML and CSS”:http://princexml.com/doc/7.0/, passing the “ACID2 test”:http://princexml.com/samples/acid2/. PrinceXML has “some additional CSS properties”:http://princexml.com/doc/7.0/page-size/ that let you configure PDF-specific layout settings.

h2. DocRaptor

Since PrinceXML is a commercial product, Heroku won’t support it, and I did not find anything on the web that would offer PDF generation as a service. Asking on the “PrinceXML Forum”:http://www.princexml.com/bb/viewtopic.php?f=3&t=3820&sid=cbbae46520b98f82e259f53e350d650d, I found out about “DocRaptor”:http://docraptor.com/. These guys provide a service to convert HTML to XLS or PDF over a web service interface, exactly what I was looking for. As an additional bonus, they “just implemented a gem”:http://rubygems.org/gems/doc_raptor for supporting the Heroku Add-on interface. Mail “Expected Behavior Support”:mailto:[email protected] if you want to participate in the private beta.

h3. Improvements

DocRaptor offers a great service, but it is still in early development. Some issues were resolved only recently. If you are a PrinceXML user, you probably know the _--baseurl_ option that allows the use of relative paths for images and stylesheets. DocRaptor has just “added support for command line options”:http://docraptor.com/documentation#pdf_options. The feature for “generating a PDF from a given URL”:http://docraptor.com/documentation#api_document_url is even better!
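To illustrate what a base URL buys you, here is a small sketch using Ruby's standard @URI.join@. It assumes that relative references are resolved against the base URL by the usual URL resolution rules. The helper name @resolve_asset@ and the asset paths are made up for this example and are not part of PrinceXML or DocRaptor.

```ruby
require "uri"

# Hypothetical helper: resolve a relative asset path against a base URL
# using the standard resolution rules implemented by Ruby's URI library.
def resolve_asset(base_url, path)
  URI.join(base_url, path).to_s
end

# Example asset paths (assumptions, not taken from a real page)
puts resolve_asset("http://nofail.de", "images/skull.png")
puts resolve_asset("http://nofail.de", "/stylesheets/style.css")
```

Without a base URL, a converter that only receives raw HTML has no host to resolve such paths against, which is why images and stylesheets would otherwise go missing.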

h3. Using DocRaptor from Rails

The latest “DocRaptor documentation for the Heroku Add-on”:http://doc-raptor-docs.heroku.com/ is decent and it provides “some nice examples”:http://docraptor.com/examples.

h4. PDF from raw HTML

Here is what I did to get it running on “my Rails 3 project”:http://phoet.de:

# Gemfile
gem "doc_raptor", "0.1.1"

# mime_types.rb
Mime::Type.register_alias "application/pdf", :pdf

# your_controller.rb
def your_pdf_action
  respond_to do |format|
    format.pdf do
      data = DocRaptor.create(
        :name => 'DocRaptor.pdf',
        :document_content => render_to_string,
        :document_type => "pdf",
        :prince_options => {:baseurl => 'http://nofail.de'}
      )
      send_data data, :type => 'application/pdf', :filename => 'DocRaptor.pdf'
    end
  end
end

If you registered the pdf mime-type, you will have to provide an additional layout for it. I added some PrinceXML-specific parameters to the styles to make it a fullscreen PDF. One thing that is essential for making the stylesheets work is the _:media => ‘screen, print’_ setting:

// application.pdf.haml
= stylesheet_link_tag 'style', :media => 'screen, print'
// you can provide a base tag for images and stylesheets
// %base{:href=>'http://blog.nofail.de'}

%style
  @page { size: A4 }
  @page { margin: 0px }
  @page { border: none }
  @page { padding: 0px }
  @page { prince-shrink-to-fit: auto }

h4. PDFs from a URL

The simplest solution for generating a PDF is to send a URL to the service, so you can re-use all your view logic:

data = DocRaptor.create(:name => "DocRaptor.pdf", :document_url => "http://blog.nofail.de", :document_type => "pdf")
send_data data, :type => 'application/pdf', :filename => "DocRaptor.pdf"

One caveat, though: you need at least two “dynos”:http://docs.heroku.com/dynos, so that one can serve the additional request from DocRaptor while the other is still waiting for the PDF!
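The two variants above differ only in the options passed to @DocRaptor.create@, so the choice can be kept in one place. The following is a minimal sketch, not part of the doc_raptor gem: the helper name @docraptor_pdf_options@ and its @:url@, @:html@ and @:base_url@ keys are assumptions made for this example. It only builds the options hash and does not call the service.

```ruby
# Hypothetical helper (not provided by the doc_raptor gem): builds the
# options hash for DocRaptor.create from either a URL or raw HTML.
def docraptor_pdf_options(name, source)
  options = { :name => name, :document_type => "pdf" }

  if source[:url]
    # Let the service fetch the page itself
    options[:document_url] = source[:url]
  elsif source[:html]
    # Send the rendered HTML, optionally with a base URL for relative paths
    options[:document_content] = source[:html]
    options[:prince_options] = { :baseurl => source[:base_url] } if source[:base_url]
  else
    raise ArgumentError, "need either :url or :html"
  end

  options
end

# Usage in a controller would then look like:
#   data = DocRaptor.create(docraptor_pdf_options("DocRaptor.pdf", :url => "http://blog.nofail.de"))
```

Switching between the two approaches, for example to work around the dyno limitation, then only means changing the hash passed to the helper.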

See a “working example on my homepage”:http://www.phoet.de/interest/curriculum/curriculum_table.pdf.

Generating PDFs from HTML without the hassle, thanks to DocRaptor!

5 thoughts on “Generating PDF from HTML using DocRaptor on Heroku”

  1. Joel Meador

    Thanks for the kind words, Peter! You’ve been giving us great feedback and ideas.

    As an aside, clicking on the skull images in the upper right of your blog takes one to nofail.com instead of nofail.de.

  2. phoet Post author

    you are welcome! thanks for providing such a nice service and fast problem resolution!

    i fixed the link, thanks for reporting.

  3. polarblau

    Thanks for that! I’ll make sure to check it.
    On a different account: Do you guys know that your skulls in the upper right, link the .com version of your blog, which is a parked site? Making some extra cash there ? ;)
