Using the Redis addon on Heroku

I am always playing around with new addons offered by Heroku. My latest discovery was the Redis addon that is provided by Redistogo. The addon is probably in private beta (the docs are still marked as beta), but since they put up a link to it on their site, I managed to install it on my personal website that runs in the cloud.

Redis is “an advanced key-value store” and has some features that make it a perfect match for a cache! I use caching extensively on my site and keep trying out new ways to do it to work around Heroku’s read-only filesystem.

Like Memcache, Redis provides the ability to set a time to live (ttl) on a key. This comes in handy if you have data that expires after a short period of time, like 3rd party data from Twitter.
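To make the ttl idea concrete, here is a minimal sketch of a toy in-memory store with the same semantics. It is not Redis and not the addon's API: the class name TtlStore and the injectable clock are made up for illustration, only the method name setex mirrors the Redis SETEX command.

```ruby
# Toy in-memory store with Redis-like ttl semantics (an illustration, not Redis itself).
class TtlStore
  # the clock is injectable so expiry can be shown without sleeping
  def initialize(clock = -> { Time.now })
    @clock = clock
    @data = {}   # key => [value, expires_at]
  end

  # store a value that expires ttl seconds from now
  def setex(key, ttl, value)
    @data[key] = [value, @clock.call + ttl]
    "OK"
  end

  # return the value, or nil once the key has expired
  def get(key)
    value, expires_at = @data[key]
    return nil if expires_at.nil?
    if @clock.call >= expires_at
      @data.delete(key)
      return nil
    end
    value
  end
end

now = Time.at(0)
store = TtlStore.new(-> { now })
store.setex("tweets", 30, "cached twitter data")
store.get("tweets")   # => "cached twitter data"
now += 31             # pretend 31 seconds have passed
store.get("tweets")   # => nil
```

With a real Redis server the expiry happens on the server side, so the client never has to compare timestamps itself.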

Caching with Redis

Accessing Redis is very simple, since it uses a text-based protocol. The command reference is straightforward and there is a simple Ruby wrapper available:

require "redis"
redis = Redis.new
redis.set "foo", "bar"
# => "OK"
redis.get "foo"
# => "bar"

The redis-store gem already provides a Rails 3 compatible Cache Store implementation, but I needed some more configuration points, especially the ttl.

That’s why I wrote my own Rails 3 Redis Cache, which was also a great way to get used to working with Redis and the Redistogo addon.

Using Rails Redis Cache

There is some configuration needed for Rails to pick up the new cache store. If you want to use different or no caching for test, development and production, you should put the config in your environment files:

# config/environments/production.rb
config.action_controller.perform_caching = true
config.cache_store = ActiveSupport::Cache::RailsRedisCache.new(:url => ENV['REDISTOGO_URL'])

If there is a Redis server available in all environments, you can put the config in your global environment file instead:

# config/environment.rb
ActionController::Base.cache_store = ActiveSupport::Cache::RailsRedisCache.new(:url => ENV['REDISTOGO_URL'])

The caching parts are mostly in my controllers:

@tweets = cache("tweets", :expires_in => 30.seconds){ Twitter::Search.new(...) }

The store is using the basic Rails cache store implementation which is broken in the Rails 3.0.0.beta1 version that runs on Heroku, so I added a monkey-patch for that using edge Rails.

Redis on localhost

Installing and running Redis on Mac OS X is really simple:

brew install redis
redis-server

There is also a command-line client available for direct access:

redis-cli
redis> set "foo" "bar"
OK
redis> get "foo"
"bar"

It’s key value stores, stupid!

Mongo Ruby Driver, Mongoid and MongoMapper

================================================================= =================================================================

Update Aug. 2010

On Whyday, I created a live demo of the examples that is running on Heroku.

=============================================================== ===============================================================

I am constantly looking around for different storage mechanisms on Heroku that can be used for caching 3rd party data. A recent update of their platform offered a MongoDB addon to access the MongoHQ service that drew my attention, so I started to evaluate this NoSQL document database…

MongoDB on OS X

It’s always a good starting point to have a local installation of a technology. Here is how you get it running on your Mac with Homebrew:

brew install mongodb
# create a place for MongoDB to store the data
mkdir -p /data/db
# run server with default config (adapt to the right version)
mongod run --config /usr/local/Cellar/mongodb/1.4.4-x86_64/mongod.conf

Using MongoHQ requires user authentication, so it’s nice to have the same credentials on your local MongoDB instance:

# start the client
mongo
> use test
> db.addUser("test", "test")

evaluating different APIs

The most basic approach is the Mongo Ruby Driver, which essentially wraps the MongoDB API in Ruby code, but there are also two higher-level APIs close to ActiveRecord called Mongoid and MongoMapper.

Mongo Ruby Driver

It’s pretty easy to connect to your MongoDB with the right connection string:

conn = Mongo::Connection.from_uri("mongodb://user:pass@host:port/db")
db = conn.db("db")

The Mongo Ruby Driver is very simple and close to the MongoDB API:

coll = db.collection('test')
coll.insert('a' => 1)
coll.find().each { |row| p row }

MongoMapper

MongoMapper can also be accessed with a connection string:

Mongo::Connection.from_uri(MONGO_URL)

Instead of using ActiveRecord::Base, MongoMapper provides the MongoMapper::Document module to handle the object document mapping. Since the structure of a document in MongoDB is open and not static like in a SQL database, you have to define the structure in code, so MongoMapper knows how to map the document to your Ruby objects:

class Person
  include MongoMapper::Document

  key :name, String
  key :age, Integer
  key :born_at, Time
  key :active, Boolean
  key :fav_colors, Array

  connection Mongo::Connection.from_uri(MONGO_URL)
  set_database_name 'basement'
end

person = Person.create({
  :name => 'Nunemaker',
  :age => 27,
  :born_at => Time.mktime(1981, 11, 25, 2, 30),
  :active => true,
  :fav_colors => %w(red green blue)
})

person.save

Person.all.each do |p|
  ...
end
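To show the idea behind declaring keys in code, here is a minimal sketch of a toy mapper in plain Ruby. It is not how MongoMapper is implemented: ToyDocument, ToyPerson, from_document and to_document are made-up names, and the declared types are only recorded, not used for casting.

```ruby
# Toy document mapper: NOT MongoMapper's implementation, just the idea behind `key`.
module ToyDocument
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # remember every declared key together with its type
    def keys
      @keys ||= {}
    end

    # declare a key and generate a reader and a writer for it
    def key(name, type)
      keys[name] = type
      attr_accessor name
    end

    # build an object from a schemaless document (a plain Hash);
    # fields that were not declared as keys are ignored
    def from_document(doc)
      object = new
      keys.each_key { |name| object.send("#{name}=", doc[name.to_s]) }
      object
    end
  end

  # turn the object back into a Hash that could be stored as a document
  def to_document
    self.class.keys.each_key.map { |name| [name.to_s, send(name)] }.to_h
  end
end

class ToyPerson
  include ToyDocument
  key :name, String
  key :age, Integer
end

person = ToyPerson.from_document("name" => "Nunemaker", "age" => 27, "unknown" => "ignored")
person.name          # => "Nunemaker"
person.to_document   # => {"name"=>"Nunemaker", "age"=>27}
```

The database itself does not care about the "unknown" field. It is the declaration in the class that decides what ends up on the Ruby object.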

Mongoid

Configuring Mongoid is somewhat different but easy:

Mongoid.database = Mongo::Connection.new(host, port).db(db)
Mongoid.database.authenticate(user, pass)
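Since these calls take host, port, database, user and password separately, a connection string has to be split up first. Here is a minimal sketch using Ruby's standard URI library. The helper name mongo_settings and the example URL are assumptions for illustration, not part of Mongoid.

```ruby
require "uri"

# Split a mongodb:// connection string into its separate pieces.
def mongo_settings(url)
  uri = URI.parse(url)
  {
    :host => uri.host,
    :port => uri.port || 27017,            # 27017 is MongoDB's default port
    :db   => uri.path.sub(%r{\A/}, ""),    # "/basement" -> "basement"
    :user => uri.user,
    :pass => uri.password
  }
end

settings = mongo_settings("mongodb://test:secret@localhost:27017/basement")
settings[:host]   # => "localhost"
settings[:db]     # => "basement"
```

The resulting values can then be passed on as host, port, db, user and pass.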

The DSL for defining Mongoid Documents is similar to MongoMapper and works mostly the same way. Querying the database is also similar to the API provided by ActiveRecord:

class Tweeter 
  include Mongoid::Document 
  field :user 
  embeds_many :tweets 
end 

class Tweet 
  include Mongoid::Document 
  field :status, :type => String 

  embedded_in :tweeter, :inverse_of => :tweets 
end

tweet = Tweet.new(:status => "This is a tweet!") 
tweet.tweeter = Tweeter.new(:user => 'ted') 
tweet.save

Tweeter.all.each do |tweeter| 
  ...
end

You can get the complete code and some more links from the GitHub project created for testing.

MongoDB is a great way to store document focused data and it’s simple to use with these great libraries!

ASIN vs ruby-aaws

I recently wrote about using ruby-aaws on Heroku. I used it for creating a virtual bookshelf on my website, so anybody interested in what I read can have a look at the ISBN, price, description and some reviews (in German). Since this is a trivial scenario, it covers only a fraction of the features that ruby-aaws offers.

I always felt that using ruby-aaws was way too complicated! This is how you call Amazon for the title of a book:

require "amazon"
require "amazon/aws"
require "amazon/aws/search"
il = Amazon::AWS::ItemLookup.new('ASIN', { 'ItemId'=>asin })
rg = Amazon::AWS::ResponseGroup.new('Medium')
req = Amazon::AWS::Search::Request.new
resp = req.search(il, rg)
puts resp.item_lookup_response.items[0].item.item_attributes.title.to_s

I also had to monkeypatch some stuff to get it working with Heroku the first time:

  • allow .amazonrc to be in a different location that can be used on Heroku
  • remove the restriction to Ruby 1.8.7 and patch related stuff

If you look into the source and documentation of ruby-aaws you will see that it is no fun to patch anything in there… I think I would not have done it without the help of Ian Macdonald.

Another thing was that I could not use the built-in caching facility of ruby-aaws, because it simply does not work on Heroku’s read-only filesystem.

simplicity with ASIN

Given these restrictions, I decided to build a minimum featureset gem tailored for my requirements:

  • provide access to the Amazon-E-Commerce-API via REST
  • simple configuration points
  • minimum amount of code to write for a request
  • maximum flexibility

If you have a look into the Amazon documentation, you see that it is quite easy to call the API via REST. Just append some query parameters to your desired endpoint (e.g. webservices.amazon.com) and you get the desired information back from Amazon. The tricky thing is that, as of recently, you have to sign your request with your AWS credentials. I did not find any specs on how to do that in the documentation, but Cloud Carpenters had a nice example using Python that I adapted for Ruby.
There is also the nice Amazon API signing service that frees you from signing your requests yourself. The reason I did not use it is that it supports the amazon.com endpoint only (I need amazon.de).
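For reference, here is a minimal sketch of what such a signing step can look like in Ruby. It is not the code of the ASIN gem or of the Python example. It follows the HMAC-SHA256 query signing scheme as commonly described for the Amazon API (sorted, percent-encoded parameters, signed together with verb, host and path). All helper names are made up, and the credentials, timestamp and path in the usage part are placeholders.

```ruby
require "openssl"

# Percent-encode per RFC 3986: only A-Z a-z 0-9 - _ . ~ stay unescaped.
def rfc3986_escape(value)
  value.to_s.gsub(/[^A-Za-z0-9\-_.~]/) do |char|
    char.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

# Sort the parameters by name and join them as name=value pairs.
def canonical_query(params)
  params.map { |name, value| [name.to_s, value.to_s] }
        .sort
        .map { |name, value| "#{rfc3986_escape(name)}=#{rfc3986_escape(value)}" }
        .join("&")
end

# The string that gets signed: verb, host, path and query, one per line.
def string_to_sign(host, path, params)
  ["GET", host.downcase, path, canonical_query(params)].join("\n")
end

# HMAC-SHA256 over the string to sign, Base64 encoded without line breaks.
def signature(secret, host, path, params)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("sha256"), secret,
                                string_to_sign(host, path, params))
  [digest].pack("m0")
end

# Append the (escaped) signature to the query string.
def signed_url(secret, host, path, params)
  sig = rfc3986_escape(signature(secret, host, path, params))
  "http://#{host}#{path}?#{canonical_query(params)}&Signature=#{sig}"
end

# placeholder values, replace them with real credentials and the current time
params = {
  "Service"        => "AWSECommerceService",
  "Operation"      => "ItemLookup",
  "ItemId"         => "1430218150",
  "AWSAccessKeyId" => "your-key",
  "Timestamp"      => "2010-06-01T12:00:00Z"
}
url = signed_url("your-secret", "webservices.amazon.de", "/onca/xml", params)
```

The escaping is written by hand here because the signature only validates if both sides encode every character the same way.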

requests with ASIN

Using ASIN is simple. You just have to provide your credentials to the configuration method, the rest is covered with sensible defaults that you can override if you wish:

require 'asin'
include ASIN

# use the configure method to setup your api credentials
configure :secret => 'your-secret', :key => 'your-key'

# you can override the api endpoint if you wish
configure :secret => 'your-secret', :key => 'your-key', :host => 'webservices.amazon.de'

After this setup you can call the REST API via the lookup method:

# lookup an item with the amazon standard identification number (asin)
item = lookup '1430218150'

# have a look at the title of the item
item.title
=> Learn Objective-C on the Mac (Learn Series)

# provide additional configuration options like the response group
lookup(asin, :ResponseGroup => :Medium)

Title is currently the only attribute that is directly supported by the Item class, but this is no restriction. ASIN uses Hashie::Mash for the internal data representation of the Amazon REST XML response. The Item class stores the response in a raw attribute that can be read:

# access the internal data representation (Hashie::Mash)
item.raw.ItemAttributes.ListPrice.FormattedPrice
=> $39.99

You can tailor the Item class to your needs by opening up the class and providing the methods you like, or by doing something entirely different with the raw attribute.

OR, just fork me on GitHub!

Maximum flexibility with some syntactic sugar!

Migrating to Rails 3 for Heroku Bamboo

Recently there were some interesting updates to the Heroku infrastructure, giving the opportunity to migrate my personal Rails 2 website to Rails 3.

Having an app with only a single model for caching data, there is no worry about database migration. A nice opportunity for starting out new:

rvm use 1.9.1
gem install rails --pre
rails basement-rails3
cd basement-rails3
heroku create basement-rails3 --stack bamboo-mri-1.9.1

business as usual?

Not really… Having Yehuda Katz as a core developer of Rails 3, it’s no surprise they adopted the Merb approach of just using one executable for everything. So the ‘script’ folder now contains just a ‘rails’ script. Creating controllers, running the server, jumping into the console - all through the ‘rails’ command:

rails -h
=> [...]
=>  generate    Generate new code (short-cut alias: "g")
=>  console     Start the Rails console (short-cut alias: "c")
=>  server      Start the Rails server (short-cut alias: "s")
=> [...]

I appreciate the shortcuts! No more discussions about what shortcut to use for ‘script/server’ (ss is not an option in Germany…)!

dependency management

Rails 3 has changed the way of working with gems. It uses bundler to deal with dependencies. Being a big fan of Java’s dependency management tools like Ivy or Maven, I think that separating out the dependency issue is a good idea.

All dependencies are now defined in a separate ‘Gemfile’ using an easy DSL to manage the gems:

gem "rails", "3.0.0.beta"
[...]
gem "sqlite3-ruby", :require => "sqlite3"
[...]
group :test do
  gem "test-unit", "1.2.3"
end

I had some trouble getting bundler working on my machine, but after reinstalling Rails 3 AFTER the bundler gem, everything worked fine.

The only Rails plugin in my app is Haml and I was confident that it would play well with the latest Rails version. Nevertheless, I was pleased to find RailsPlugins.org, where one can check the compatibility of plugins with Rails 3.

escaping vs. html_safe

There were only a few changes to the existing codebase of my application. One thing, though, forced changes to nearly all of the wrapper objects that encapsulate the data coming from external services like Twitter. The problem is that Rails 3 is strict about escaping. Every string rendered into the view will be escaped unless it is ‘html_safe’. Since my application uses a lot of pregenerated content with inline HTML, adding ‘html_safe’ markers is inevitable:

  def content
    @json["content"]["$t"].html_safe
  end
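The rule can be pictured with a toy model in plain Ruby. This is only a sketch of the principle, not Rails code: SafeString, mark_safe and render_value are made-up names (released Rails 3 versions track this with their own string subclass, ActiveSupport::SafeBuffer), and the escaping is done with ERB::Util from the standard library.

```ruby
require "erb"

# Toy model of the rule: strings are escaped on output unless marked as safe.
# SafeString is a made-up stand-in for the marker class Rails uses internally.
class SafeString < String
end

# the toy equivalent of calling html_safe on a string
def mark_safe(string)
  SafeString.new(string)
end

# what a view would do with each string it renders
def render_value(string)
  string.is_a?(SafeString) ? string.to_s : ERB::Util.html_escape(string)
end

render_value("<b>hi</b>")              # => "&lt;b&gt;hi&lt;/b&gt;"
render_value(mark_safe("<b>hi</b>"))   # => "<b>hi</b>"
```

Marking a string as safe does not clean it, it only switches the escaping off. That is fine for pregenerated markup from a trusted source, but not for anything typed in by a visitor.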

Ruby 1.9 is different

The biggest pile of migration problems resulted from using Ruby 1.9.1. The latest Ruby version is a lot faster, but it has changed some of the core functionality. The ‘enum_with_index’ method, for example, is gone, and I had to use ‘each_with_index’ on a hash instead.
Using old YAML files resulted in some strange behavior as these files have changed format slightly (because of the new symbol style that Ruby 1.9 is using, I guess):

# old
  id: home
# new
  :id: home
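Whatever caused the change, the two notations do load differently, which would explain lookups that suddenly come back empty. Here is a small sketch, assuming a current Ruby where YAML.safe_load accepts permitted_classes. The variable names are made up.

```ruby
require "yaml"

# a plain key is loaded as a String, a key with a leading colon as a Symbol
old_style = YAML.safe_load("id: home", permitted_classes: [Symbol])
new_style = YAML.safe_load(":id: home", permitted_classes: [Symbol])

old_style        # => {"id"=>"home"}
new_style        # => {:id=>"home"}
old_style[:id]   # => nil
new_style[:id]   # => "home"
```

Code that reads the file with symbol keys only finds the value in the second notation.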

Ruby 1.9 also changed the way of handling Unicode characters. Using these in code forces the developer to put a magic comment in the first line of the Ruby file:

# coding: utf-8
[...]

beta quirks

Most of the new Rails 3 stuff just works, but there are some reasons why it is still beta:

# rails console won't quit with control-c but exits without error typing ö.ö
rails c
=> Loading development environment (Rails 3.0.0.beta)
ruby-1.9.1-p378 > ö.ö
^C

# rails help doesn't work for commands
rails -h
=> [...]
=> All commands can be run with -h for more information.
rails generate -h
=> Could not find generator -h.

Beta but running!

Simple DB caching for Heroku

Heroku is a great platform. I like the style of the page, I appreciate the documentation and you can start up for free! One thing that I miss a lot is decent caching. The readonly filesystem eats up a lot of flexibility.

I played around with HTTP caching and Heroku’s Varnish works really well. The problem is that my app loads a lot of stuff from different 3rd party services like Twitter, so every new visitor gets the full load time on the first visit. No surprise that New Relic rated the request times as ‘Unacceptable’…

I would like to check out the ‘Memcached Basic’ plugin of Heroku, but I did not manage to get into the private beta. So there was no other option than implementing a DB cache.

There is just one requirement that I have: load stuff from a 3rd party service only if the cached copy is expired. For simplicity, expired means that the data is older than a predefined interval. In my test environment I like to use a shorter period than in production, so I define the interval in the environment files:

# config/environments/development.rb
CACHE_TIME = 30.seconds

# config/environments/production.rb
CACHE_TIME = 10.hours

A simple key-data pair is enough for my needs, because I always have a unique key for the values I want to cache. I am using Marshal.dump/Marshal.load for serialization, as they play well with anonymous inner classes that YAML can’t deal with. Encoding the data as Base64 helps to work around some SQLite issues with serialized data strings:

# app/models/storage.rb
class Storage < ActiveRecord::Base
  
  validates_presence_of :key, :data
  
  def data=(data)
    write_attribute :data, ActiveSupport::Base64.encode64(Marshal.dump(data))
  end
  
  def data
    Marshal.load(ActiveSupport::Base64.decode64(read_attribute :data))
  end
  
end
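The round trip that the two accessors perform can be tried out without ActiveRecord. Here is a minimal sketch in plain Ruby: CachedTweet, encode_for_db and decode_from_db are made-up names, and pack("m") / unpack("m") is used as the core Ruby way to do Base64.

```ruby
# Serialize with Marshal, then Base64-encode so the result is plain ASCII text.
CachedTweet = Struct.new(:user, :text)

def encode_for_db(object)
  [Marshal.dump(object)].pack("m")
end

# Reverse both steps: Base64-decode first, then unmarshal.
def decode_from_db(string)
  Marshal.load(string.unpack("m").first)
end

tweets = [CachedTweet.new("ted", "This is a tweet!")]
stored = encode_for_db(tweets)      # ASCII only, fits into a text column
decode_from_db(stored) == tweets    # => true
```

Keep in mind that Marshal.load should only ever be fed data the application has written itself.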

The actual caching logic is embedded in my application controller. I provide a simple cache method that can be called with a block. The block contains the remote call that I want to cache and is only executed if there is no data stored for the given key or the stored data is expired:

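  # app/controllers/application_controller.rb
  def cache(key)
    from_db = Storage.first(:conditions => {:key => key})
    # reload if nothing is stored yet or the stored data is older than CACHE_TIME
    if from_db.nil? || from_db.updated_at < Time.now - CACHE_TIME
      # run the block (the remote call) and turn the result into a plain array
      data = (yield).collect{|t|t}
      # nothing to store: return early, the instance variable is not set in this case
      return [] if data.empty?
      from_db ||= Storage.new
      from_db.key = key
      from_db.data = data
      from_db.save!
    end
    # expose the cached data to the views, e.g. as @tweets for the key :tweets
    instance_variable_set :"@#{key}", from_db.data
  end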

Finally, the data is pushed into an instance variable, so that I have access to it within my views.

Caching is now as simple as this:

  # cache all twitter posts and make them accessible via @tweets
  cache(:tweets){Helper::twitter_posts}
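The expiry decision can be played through without a database. Here is a minimal sketch in which a Hash stands in for the Storage model and the clock is injectable. ToyCache and fetch are made-up names, not part of the app.

```ruby
# In-memory stand-in for the cache method; a Hash replaces the Storage model.
class ToyCache
  def initialize(cache_time, clock = -> { Time.now })
    @cache_time = cache_time
    @clock = clock
    @entries = {}   # key => [data, updated_at]
  end

  # run the block only if nothing is stored or the stored data is too old
  def fetch(key)
    data, updated_at = @entries[key]
    if updated_at.nil? || updated_at < @clock.call - @cache_time
      data = yield
      @entries[key] = [data, @clock.call]
    end
    data
  end
end

now = Time.at(0)
calls = 0
cache = ToyCache.new(30, -> { now })
cache.fetch(:tweets) { calls += 1; ["first"] }    # nothing stored yet, block runs
cache.fetch(:tweets) { calls += 1; ["second"] }   # still fresh, returns ["first"]
now += 31                                         # pretend 31 seconds have passed
cache.fetch(:tweets) { calls += 1; ["third"] }    # expired, block runs again
calls                                             # => 2
```

The slow remote call sits inside the block, so it is only paid for when the stored copy is missing or too old.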

This little tweak noticeably improved the response time of my app:

  This week:
  Apdex Score: 0.70 [0.5] (Fair)

  Last week:
  Apdex Score: 0.06 [0.5] (Unacceptable)

Sugar on rails!