Generating PDF from HTML using DocRaptor on Heroku

Sooner or later you have to create PDFs for a Rails application. Searching the web will most likely lead you to libraries like PDFKit and Wicked PDF, which use wkhtmltopdf as a driver.

If your app is hosted on Heroku, you will wonder whether wkhtmltopdf is available so that you can use one of these awesome libraries. Searching the Heroku docs, you will probably come to the same conclusion I did: there is nothing on it!

As the Heroku support states:
You’re correct that there’s no official documentation. Most of our customers seem to be using wkhtmltopdf, generally with pdfkit. I do hope to document this usage soon.

http://github.com/jdpace/PDFKit
http://code.google.com/p/wkhtmltopdf/

If you want us to offer a Prince add-on, I encourage you to write them inviting them to check out our Add-on Provider Programme: http://addons.heroku.com/provider

Of course, you can always purchase your own license and run it on EC2 or another server of your choosing.

PrinceXML

One of the libraries used at my last job was PrinceXML, which did a great job generating PDFs from HTML pages. It supports most of HTML and CSS and passes the Acid2 test. PrinceXML also has some additional CSS properties that let you configure PDF-specific layout settings.

DocRaptor

Since PrinceXML is a commercial product, Heroku won’t support it, and I did not find anything on the web that offers PDF generation as a service. Asking in the PrinceXML forum, I found out about DocRaptor. These guys provide a web service that converts HTML to XLS or PDF, exactly what I was looking for. As an additional bonus, they just implemented a gem supporting the Heroku Add-on interface. Mail Expected Behavior support if you want to participate in the private beta.

Improvements

DocRaptor offers a great service, but it is still in early development. There were some issues that were resolved recently. If you are a PrinceXML user, you probably know the --baseurl option that allows the use of relative paths for images and stylesheets. DocRaptor has just added support for such command line options. The feature for generating a PDF from a given URL is even better!

Using DocRaptor from Rails

The latest DocRaptor documentation for the Heroku Add-on is decent and it provides some nice examples.

PDF from raw HTML

Here is what I did to get it running on my Rails 3 project:

# Gemfile
gem "doc_raptor", "0.1.1"

# mime_types.rb
Mime::Type.register_alias "application/pdf", :pdf

# your_controller.rb
def your_pdf_action
  respond_to do |format|
    format.pdf do
      data = DocRaptor.create(:name => 'DocRaptor.pdf', :document_content => render_to_string, :document_type => "pdf", :prince_options => {:baseurl => 'http://nofail.de'})
      send_data data, :type => 'application/pdf', :filename => 'DocRaptor.pdf'
    end
  end
end

If you register the pdf MIME type, you have to provide an additional layout for it. I added some PrinceXML-specific rules to the styles to make it a fullscreen PDF. One thing that is essential for making the stylesheets work is the :media => 'screen, print' setting:

// application.pdf.haml
= stylesheet_link_tag 'style', :media => 'screen, print'
// you can provide a base tag for images and stylesheets
// %base{:href=>'http://blog.nofail.de'}

%style
  @page { size: A4 }
  @page { margin: 0px }
  @page { border: none }
  @page { padding: 0px }
  @page { prince-shrink-to-fit: auto }

PDFs from a URL

The simplest solution for generating a PDF is to send a URL to the service, so you can reuse all your view logic:

data = DocRaptor.create(:name => "DocRaptor.pdf", :document_url => "http://blog.nofail.de", :document_type => "pdf")
send_data data, :type => 'application/pdf', :filename => "DocRaptor.pdf"

One caveat though: you need at least two dynos, because one of them has to serve the additional request from DocRaptor while the other is still busy with the original request!
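Both variants pass almost the same options to DocRaptor.create; only the source of the document differs. Here is a minimal sketch of a helper that builds the options hash for either case. The name docraptor_options is hypothetical (it is not part of the doc_raptor gem), and it only uses the option keys shown in the examples above:

```ruby
# Hypothetical helper (not part of the doc_raptor gem): builds the options
# hash that the examples above pass to DocRaptor.create.
def docraptor_options(name, source, base_url = nil)
  options = { :name => name, :document_type => "pdf" }

  # A source that looks like an absolute http(s) URL is handed over as a URL;
  # anything else is treated as raw HTML.
  if source =~ /\Ahttps?:\/\//
    options[:document_url] = source
  else
    options[:document_content] = source
  end

  # Optional: lets relative image and stylesheet paths resolve.
  options[:prince_options] = { :baseurl => base_url } if base_url
  options
end

html_options = docraptor_options("DocRaptor.pdf", "<h1>Hello</h1>", "http://nofail.de")
url_options  = docraptor_options("DocRaptor.pdf", "http://blog.nofail.de")
```

In a controller the result would simply be passed on, for example DocRaptor.create(docraptor_options('DocRaptor.pdf', render_to_string, 'http://nofail.de')).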

See a working example on my homepage.

Generating PDF from HTML without the hassle, thanks to DocRaptor!

Rails, getting started without the hassle

I just changed jobs and am now a Rails developer at tolingo.com, which is an online translation broker. When I started working at my new desk, I had to set up my iMac development environment. There are tons of articles on how to compile, install and run stuff like MySQL to get you started on OS X, but I think all one really needs is Homebrew and RVM.

Homebrew

Homebrew is a Ruby-based packaging tool for the Mac, and once you start using it, you immediately hate yourself for having wasted time on MacPorts.

“Homebrew is the easiest and most flexible way to install the UNIX tools Apple didn’t include with OS X.”

This quote is from the official website and I guess they are absolutely right!

Formula

Homebrew is built around formulas. A formula describes how a package should be downloaded from the web and installed on your system. Homebrew also takes care of package dependencies, paths and all the other ugly stuff:

require 'formula'

class Wget < Formula
  homepage 'http://www.gnu.org/wget/'
  url 'http://ftp.gnu.org/wget-1.12.tar.gz'
  md5 '308a5476fc096a8a525d07279a6f6aa3'

  def install
    system "./configure --prefix=#{prefix}"
    system 'make install'
  end
end
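To get a feeling for why a formula can be this short, here is a minimal sketch of how such class-level declarations can work in plain Ruby. MiniFormula is a made-up stand-in for illustration, not Homebrew's real Formula class, and deriving the version from the tarball name is my own simplification:

```ruby
# Illustrative only: a tiny stand-in for a formula base class, not Homebrew's
# real implementation. Class-level "macros" store metadata per subclass.
class MiniFormula
  class << self
    # Called with an argument it stores the value, without one it reads it.
    def homepage(value = nil)
      @homepage = value if value
      @homepage
    end

    def url(value = nil)
      @url = value if value
      @url
    end
  end

  # Package name derived from the class name, e.g. Wget -> "wget".
  def name
    self.class.name.split("::").last.downcase
  end

  # Version derived from the tarball name, e.g. wget-1.12.tar.gz -> "1.12".
  def version
    self.class.url[/-(\d+(?:\.\d+)*)\.tar/, 1]
  end

  # Install location, following the Cellar layout used later in this post.
  def prefix
    "/usr/local/Cellar/#{name}/#{version}"
  end
end

class Wget < MiniFormula
  homepage 'http://www.gnu.org/wget/'
  url 'http://ftp.gnu.org/wget-1.12.tar.gz'
end

puts Wget.new.prefix # prints /usr/local/Cellar/wget/1.12
```

The point is that homepage and url are ordinary class methods, so a formula is just a Ruby class whose body runs a few method calls.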

You can easily install packages from the shell with brew:

brew install wget

Homebrew puts all packages into /usr/local, so that it won’t interfere with other components of your system. To get your packages working, you need to add it to your $PATH. If you have any problems running something, Homebrew comes with the doctor command, which scans your setup for problems!

Installation

Just download Homebrew to your system and update it once in a while:

# install homebrew via curl
sudo mkdir -p /usr/local && sudo chown -R $USER /usr/local && curl -Lsf http://bit.ly/9H4NXH | tar xvz -C/usr/local --strip 1

# update homebrew
brew update

Git, MySQL, Sphinx and more

What else do you need? Just look for it with brew search or get more information about a package with brew info!

These are the packages that I needed for development:

# install mysql and set it up
brew install mysql
mysql_install_db
# add mysqld as launch agent
cp /usr/local/Cellar/mysql/#{MYSQL_VERSION}/com.mysql.mysqld.plist ~/Library/LaunchAgents
launchctl load -w ~/Library/LaunchAgents/com.mysql.mysqld.plist

# install git
brew install git git-flow

# add git bash completion (find path to your git with 'brew info git')
ln -s /usr/local/Cellar/git/#{GIT_VERSION}/etc/bash_completion.d/git-completion.bash ~/.git-completion.bash
source ~/.git-completion.bash

# install sphinx search daemon
brew install sphinx

# aspell with all spellings
brew install aspell --all

# libxml and imagemagick for sprites
brew install libxml2 imagemagick

RVM the Ruby Version Manager

RVM is a command line tool for managing your local Ruby environments. You can get some more information on the RVM homepage and in earlier articles.

Quick start with installing RVM to your machine:

# install rvm via curl !!! FOLLOW RVM INSTRUCTIONS !!!
bash < <( curl http://rvm.beginrescueend.com/releases/rvm-install-head )

# download and compile latest 1.8.7
rvm install 1.8.7

# create a .rvmrc file in your app's base directory
echo "rvm use [email protected]#{YOUR_APP} --create" > #{YOUR_APP}/.rvmrc
# execute it by cd-ing to your app's directory
cd #{YOUR_APP}

Now you can work on your app with a custom gem environment. Unless you are using Bundler, this is probably what you want for installing and removing gems painlessly.

Cucumber with Celerity

Behavior-driven development with Cucumber works nicely with Celerity, a JRuby implementation of a headless browser using HtmlUnit, and its companion Culerity, a Ruby wrapper. Culerity has recently been updated with some configuration points for registering your local JRuby environment:

# jruby config for culerity (from http://rvm.beginrescueend.com/integration/culerity/)
rvm install jruby
rvm use [email protected] --create
gem install celerity
rvm wrapper [email protected] celerity jruby
# add to .profile
export JRUBY_INVOCATION="$(readlink "$(which celerity_jruby)")"

If you are experiencing any weird Broken Pipe errors (like I did), have a look at this issue.

This is just an example of how you can set up your Rails development environment. Comments on this topic are appreciated!

DZone API and iPhone app

As I already mentioned, I am currently getting my hands dirty with Objective-C and iPhone application development.

The biggest problem with getting started was that I had no idea what application I could write for that device that would be somewhat useful. As I am a passionate tech reader, I consume a lot of articles posted on DZone. Usually I use a feed reader like NetNewsWire for that, which works very well on my MacBook but is nearly useless on the iPhone, because the DZone site is not very mobile friendly…

Problems

Since there was no DZone iPhone application on the market, I started working on one. Parsing the DZone feeds was easy, even though the built-in XML support on iOS sucks. There were some nice libraries that made my life easier.

No deeplink

The DZone RSS feed does not provide a deeplink to the actual linked article, so one would still land on the DZone page… Since DZone does not provide an API currently, I started working on my own Rails application hosted on Heroku. Spidering the RSS, calling the page and extracting the link to the article is fragile, but it works (currently).
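The first step, pulling the items out of the feed, can be sketched with Ruby's REXML library. The feed below is made up for illustration and the helper name feed_items is hypothetical; the second step, finding the article link on the DZone page itself, depends on the page markup and is not shown here:

```ruby
require "rexml/document"

# Made-up sample feed; the real application would fetch the feed over HTTP.
sample_feed = <<-XML
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First article</title>
      <link>http://example.com/links/first.html</link>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/links/second.html</link>
    </item>
  </channel>
</rss>
XML

# Hypothetical helper: returns one hash per <item> with its title and link.
def feed_items(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//item").map do |item|
    { :title => item.elements["title"].text,
      :link  => item.elements["link"].text }
  end
end

feed_items(sample_feed).each do |entry|
  puts "#{entry[:title]}: #{entry[:link]}"
end
```

Each link returned here would still point at the aggregator page, which is why a second request per item is needed to dig out the actual article URL.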

No voting

One of my goals was to let the iPhone user vote for the article while reading it. The lack of an API forced me to do some more fragile login and posting stuff to the DZone page, but it works too (currently)!

You can read more about the API I created on the actual page.

iPhone app

The first version of the “dzone mobile” app has passed the iTunes store review process and is available through the app store. A version with some minor bugfixes is currently being reviewed. Have a look at updates and documentation here or here.

Voting

You have to provide your DZone login credentials if you want to use the voting feature. Go to the iPhone Settings > DZone and add your username and password. I want you to know that there is NO SSL, so your credentials will be submitted UNENCRYPTED!

More features

If you are interested in pushing this further, you can add bug reports or feature requests on GitHub.

Screenshots

DZone iPhone sugar!

Using the Redis addon on Heroku

I am always playing around with new addons offered by Heroku. My latest discovery was the Redis addon that is provided by Redistogo. The addon is probably in private beta (docs are still on beta), but since they put up a link to it on their site, I managed to install it to my personal website that runs in the cloud.

Redis is “an advanced key-value store” and has some features that make it a perfect match for a cache! I use caching extensively on my site and keep trying out new ways to do it in order to work around Heroku’s read-only filesystem.

Like Memcache, Redis provides the ability to set a time to live (ttl) on a key. This comes in handy, if you have data that expires in a short period of time, like 3rd party data from Twitter etc.

Caching with Redis

Accessing Redis is very simple, since it uses a text-based protocol. The command reference is straightforward and there is a simple Ruby wrapper available:

require "redis"
redis = Redis.new
redis.set "foo", "bar"
# => "OK"
redis.get "foo"
# => "bar"

The redis-store gem already provides a Rails 3 compatible Cache Store implementation, but I needed some more configuration points, especially the ttl.

That’s why I wrote my own Rails 3 Redis Cache. It was also a great way to get used to working with Redis and the Redistogo addon.

Using Rails Redis Cache

There is some configuration needed for Rails to pick up the new cache store. If you want to use different or no caching for test, development and production, you should put the config in your environment files:

# config/environments/production.rb
config.action_controller.perform_caching = true
config.cache_store = ActiveSupport::Cache::RailsRedisCache.new(:url => ENV['REDISTOGO_URL'])

If there is a Redis server available in all environments, you can put it in your environment file:

# config/environment.rb
ActionController::Base.cache_store = ActiveSupport::Cache::RailsRedisCache.new(:url => ENV['REDISTOGO_URL'])
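The :url option carries everything needed to connect. As a rough sketch of what is inside such a URL, here it is taken apart with Ruby's URI library. The URL shape and the host, port and password values are placeholders I assume for illustration, and redis_connection_options is a hypothetical helper, not part of any gem:

```ruby
require "uri"

# Hypothetical helper: splits a redis:// URL into its connection settings.
def redis_connection_options(url)
  uri = URI.parse(url)
  { :host => uri.host, :port => uri.port, :password => uri.password }
end

# Placeholder URL with an assumed shape; on Heroku the real value would come
# from ENV['REDISTOGO_URL'].
sample_url = "redis://redistogo:secret@example.redistogo.com:9999/"
options = redis_connection_options(ENV["REDISTOGO_URL"] || sample_url)

puts options[:host]
```

Keeping the whole connection in one environment variable means the same code runs unchanged on Heroku and on a local machine with its own Redis server.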

The caching parts are mostly in my controllers:

@tweets = cache("tweets", :expires_in => 30.seconds){ Twitter::Search.new(...) }
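The idea behind that call is fetch-or-compute with an expiry: return the stored value while it is fresh, otherwise run the block and remember the result. Here is a minimal in-memory sketch of that behavior. TinyTtlCache is a made-up class for illustration; it is neither Redis nor the Rails cache API, and the injectable clock only exists to make the expiry visible without sleeping:

```ruby
# In-memory stand-in for a cache with expiry; not Redis and not the Rails
# cache API, just the fetch-or-compute idea behind the call above.
class TinyTtlCache
  def initialize(clock = lambda { Time.now })
    @clock = clock
    @entries = {}
  end

  # Returns the cached value while it is fresh, otherwise runs the block
  # and stores its result for expires_in seconds.
  def fetch(key, expires_in)
    entry = @entries[key]
    return entry[:value] if entry && @clock.call < entry[:expires_at]

    value = yield
    @entries[key] = { :value => value, :expires_at => @clock.call + expires_in }
    value
  end
end

now = Time.at(0)
cache = TinyTtlCache.new(lambda { now })
calls = 0

cache.fetch("tweets", 30) { calls += 1; "fresh tweets" } # block runs
cache.fetch("tweets", 30) { calls += 1; "fresh tweets" } # served from cache
now += 31                                                # entry has expired
cache.fetch("tweets", 30) { calls += 1; "fresh tweets" } # block runs again

puts calls # prints 2
```

With Redis the expiry bookkeeping is done by the server via the ttl on the key, so the client only has to handle the cache miss.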

The store is using the basic Rails cache store implementation which is broken in the Rails 3.0.0.beta1 version that runs on Heroku, so I added a monkey-patch for that using edge Rails.

Redis on localhost

Installing and running Redis on Mac OS X is really simple:

brew install redis
redis-server

There is also a command line client available for direct access:

redis-cli
redis> set "foo" "bar"
OK
redis> get "foo"
"bar"

It’s key value stores, stupid!

Using blocks in Objective-C

One of my pious intentions for the year 2010 is to start writing some application for the Mac. Apart from the hype about iPhone development, I think that starting out in that area is especially appealing, as it reduces the size of the API one has to learn. Getting to know the iOS libraries is a lot easier than handling the endless amount of Cocoa frameworks.

One thing that I discovered recently is the support for blocks, which was introduced with OS X 10.6 and iOS 4.0.

The environment matters

Consuming 3rd party data from the web is kind of a pain, especially compared to how easy it is in Ruby. So I was pleased to find Seriously, a framework for async calls and JSON/XML parsing. The Seriously examples made use of blocks:

NSString *url = @"http://api.twitter.com/1/users/show.json?screen_name=probablycorey";

[Seriously get:url handler:^(id body, NSHTTPURLResponse *response, NSError *error) {
    if (error) {
        NSLog(@"Error: %@", error);
    }
    else {
        NSLog(@"Look, JSON is parsed into a dictionary!");
        NSLog(@"%@", [body objectForKey:@"profile_background_image_url"]);
    }
}];

Executing this example in my app code raised an error that I could not easily understand:

"__NSConcreteGlobalBlock", referenced from: ___block_holder_tmp_1.1207 in DZoneController.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

The problem was that Xcode had set my execution environment to OS X 10.5, which does not support blocks. To fix this, one has to update the MACOSX_DEPLOYMENT_TARGET build variable.

Right-clicking on the Xcode project in the “Groups & Files” view brings up a context menu with “Get Info” (or press CMD+I while the project is selected). The info pane has a “General” tab that lets you select the “Base SDK for All Configurations”, which I set to iOS 4.0.
Another option is to search for “MACOSX_DEPLOYMENT_TARGET” in the “Build” tab and change that value accordingly. Make sure “Show” is set to “All Settings”.

At least some Objective-C sugar!