Mongo Ruby Driver, Mongoid and MongoMapper

================================================================= =================================================================

Update Aug. 2010

On Whyday, I created a live demo of the examples, that is running on Heroku.

=============================================================== ===============================================================

I am constantly looking around for different storage mechanisms on Heroku that can be used for caching 3rd party data. A recent update of their platform offered an MongoDB addon to access the MongoHQ service that drew my attention, so I started to evaluate this noSQL document database…

MongoDB on OS X

It’s always a good starting point to have a local installation of a technology, here is how you get it running on your Mac with Homebrew:

brew install mongodb
# create a place for MongoDB to store the data
mkdir -p /data/db
# run server with default config (adapt to the right version)
mongod run --config /usr/local/Cellar/mongodb/1.4.4-x86_64/mongod.conf

Using MongoHQ requires a user-authentication, so it’s nice to have the same credentials on your local MongoDB instance:

# start the client
mongo
> use test
> db.addUser("test", "test")

evaluating different APIs

A very basic approach, that basically wraps the MongoDB API into Ruby code is the Mongo Ruby Driver, but there are two higher level APIs close to ActiveRecord called Mongoid and MongoMapper.

Mongo Ruby Driver

It’s pretty easy to connect to your MongoDB with the right connection string:

conn = Mongo::Connection.from_uri("mongodb://user:[email protected]:port/db")
db = conn.db("db")

The Mongo Ruby Driver is very simple and close to the MongoDB API:

coll = db.collection('test')
coll.insert('a' => 1)
coll.find().each { |row| p row }

MongoMapper

MongoMapper can also be accessed with a connection string:

Mongo::Connection.from_uri(MONGO_URL)

Instead of using ActiveRecord::Base MongoMapper provides the MongoMapper::Document module to handle the object document mapping. Since the structure of a document in MongoDB is open and not static like in a SQL database, you have to define the structure in code, so MongoMapper knows how to map the document to your Ruby objects:

class Person
  include MongoMapper::Document

  key :name, String
  key :age, Integer
  key :born_at, Time
  key :active, Boolean
  key :fav_colors, Array

  connection Mongo::Connection.from_uri(MONGO_URL)
  set_database_name 'basement'
end

person = Person.create({
  :name => 'Nunemaker',
  :age => 27,
  :born_at => Time.mktime(1981, 11, 25, 2, 30),
  :active => true,
  :fav_colors => %w(red green blue)
})

person.save

Person.all.each do |p|
  ...
end

Mongoid

Configuring Mongoid is somewhat different but easy:

Mongoid.database = Mongo::Connection.new(host, port).db(db)
Mongoid.database.authenticate(user, pass)

The DSL for defining Mongoid Documents is similar to MongoMapper and works mostly the same way. Querying the database is also similar to the API provided by ActiveRecord:

class Tweeter 
  include Mongoid::Document 
  field :user 
  embeds_many :tweets 
end 

class Tweet 
  include Mongoid::Document 
  field :status, :type => String 

  embedded_in :tweeter, :inverse_of => :tweets 
end

tweet = Tweet.new(:status => "This is a tweet!") 
tweet.tweeter = Tweeter.new(:user => 'ted') 
tweet.save

Tweeter.all.each do |tweeter| 
  ...
end

You can get the complete code and some more links from the GitHub project created for testing.

MongoDB is a great way to store document focused data and it’s simple to use with these great libraries!

ASIN vs ruby-aaws

I recently wrote about using ruby-aaws on Heroku. I used it for creating a virtual bookshelf on my website, so anybody interested in what I read can have a look at the ISBN, price, description and some reviews (in german). Since this is a trivial scenario it covers only a fragment of features that ruby-aaws offers.

I always felt that using ruby-aaws was way too complicated! This is how you call Amazon for the title of a book:

require "amazon"
require "amazon/aws"
require "amazon/aws/search"
il = Amazon::AWS::ItemLookup.new('ASIN', { 'ItemId'=>asin })
rg = Amazon::AWS::ResponseGroup.new('Medium')
req = Amazon::AWS::Search::Request.new
resp = req.search(il, rg)
puts resp.item_lookup_response.items[0].item.item_attributes.title.to_s

I also had to monkeypatch some stuff to get it working with Heroku the first time:

  • allow .amazonrc to be on a different location that can be used on Heroku
  • remove restriction to Ruby 1.8.7 and patch related Stuff

If you look into the source and documentation of ruby-aaws you will see that it is no fun to patch anything in there… I think I would not have done it without the help of Ian Macdonald.

Another thing was, that I could not use the builtin caching facility of ruby-aaws, cause it simply does not work on Heroku’s readonly file-system.

simplicity with ASIN

Given these restrictions, I decided to build a minimum featureset gem tailored for my requirements:

  • provide access to the Amazon-E-Commerce-API via REST
  • simple configuration points
  • minimum amount of code to write for a request
  • maximum flexibility

If you have a look into the Amazon documentation you see that it is quite easy to call the API via REST. Just append some query parameters to your desired endpoint (f.e. webservices.amazon.com) and as a result you get the desired information from Amazon. The tricky thing is, that since recently you have to sign your request with your AWS credentials. I did not find any specs on how to do that on the documentation, but Cloud Carpenters had a nice example using Python that I adapted for Ruby.
There is also the nice Amazon API signing service that frees you from self signing your requests. The reason I did not use it, is that it supports the amazon.com endpoint only (I need amazon.de).

requests with ASIN

Using ASIN is simple. You just have to provide your credentials to the configuration method, the rest is covered with sensible defaults that you can override if you wish:

require 'asin'
include ASIN

# use the configure method to setup your api credentials
configure :secret => 'your-secret', :key => 'your-key'

# you can override the api endpoint if you wish
configure :secret => 'your-secret', :key => 'your-key', :host => 'webservices.amazon.de'

After this setup you can call the REST api via the lookup method:

# lookup an item with the amazon standard identification number (asin)
item = lookup '1430218150'

# have a look at the title of the item
item.title
=> Learn Objective-C on the Mac (Learn Series)

# provide additional configuration options like the response group
lookup(asin, :ResponseGroup => :Medium)

Title is currently the only attribute that is directly supported from the Item class, but this is no restriction. ASIN uses Hashie::Mash for the internal data representation of the Amazon REST XML response. The Item class stores the response in a raw attribute that can be accessed for read:

# access the internal data representation (Hashie::Mash)
item.raw.ItemAttributes.ListPrice.FormattedPrice
=> $39.99

You can tailor the Item class to your needs by opening up the class and provide the methods you like or doing something entirely different with the raw attribute.

OR, just fork me on GitHub!

Maximum flexibility with some syntactic sugar!

Distinguish Ruby Runtimes with WhichRuby

Nowadays there are several decent Ruby runtimes available besides MRI ranging from alpha-versions to production-ready status. Using RVM these different interpreters become more and more interchangeable.

current problems

Since switching between runtimes became as easy rvm use x more care has to be taken to support a wide range of interpreters and versions. This is especially true for shared code like gems.

Some engines like JRuby have limitations that prevent the usage of some Ruby features. In most cases it’s possible to work around these limitations and provide a different solution that works, but might be somewhat less performant.

checking runtimes

Ruby is great at introspection, but especially 1.8 misses some key information like RUBY_ENGINE to determine the current interpreter at runtime and one has to extract it from the RUBY_DESCRIPTION constant.

WhichRuby aims at simplifying this tedious task and providing a simple API:

# [email protected]
jruby-1.4.0 > require 'which_ruby'
 => true 
jruby-1.4.0 > include WhichRuby
 => Object 
jruby-1.4.0 > jruby?
 => true

Executing different code fragments becomes as easy as defining a scope:

ruby_scope(:jruby) do
  # custom jruby code here
end

This comes in very handy for stuff like accessing Java code natively via JRuby instead of using RJB.

I don’t like Rubbae - I love it!

Ruby in Java, Java in Ruby, JRuby or Ruby Java Bridge?

Hosting (J)Rails applications on a high availability Java infrastructure with clusters, loadbalancers and all that shit stuff is great, if you already have it in place.

But does running an app on the JRE make it a JRuby application by default? Do you really want to be stuck on JRuby?

I really like the idea of running code with the most appropriate runtime available, that is why I use RVM all the time. JRuby is still very slow on startup and tools like autotest/autospec take minutes for just a handfull of tests…

So I would like to host Rails applications on a Tomcat, but on the other hand I want the tests to be executed with MRI! This should not be a problem as long as you don’t want to share a common codebase within Ruby and Java.

Integration

Is there a best practice for combining Java and Ruby?

Since every project is different, I think that you’ve got to evaluate the possible solutions to pick the one that fits best.

The next sections cover some approaches on integration of Java and Ruby code. Make sure to have RVM installed if you want to execute the provided example code. I assume that a JRE is provided with every OS nowadays…

Java in Ruby

There are two common solutions for embeding Java into a Ruby application. The first and obvious one is via JRuby, which can only be run with the JRuby runtime:

require 'java'

puts "access java.util.UUID via JRuby"
puts java.util.UUID.randomUUID().toString()
rvm use jruby 
=> Using jruby 1.5.0.RC1

jruby lib/jruby.rb 
=> access java.util.UUID via JRuby
=> a16fda6a-c57d-43b9-8376-801e48fe56b3

The same thing is possible using the Ruby Java Bridge from MRI:

require 'rjb'

puts "access java.util.UUID via RJB"
puts Rjb::import('java.util.UUID').randomUUID().toString()
rvm use 1.8.7
=> Using ruby 1.8.7 p249

ruby lib/rjb.rb 
=> access java.util.UUID via RJB
=> 1db8298c-5486-4933-be00-cdb180388e38

But it is also possible to do it the other way around!

Ruby in Java

Evaluating Ruby code in Java is dead simple with help of the JRuby library. You just need to set up a scripting container that executes your scripts:

  @Test
  public void execute_jruby_scriptlet() {
    new ScriptingContainer().runScriptlet("puts 'hello jruby world'");
  }

A more advanced example is to wire a Ruby class as a Spring bean. You need to provide some configuration and a Java interface that can be used as the basis for the bean:


<lang:jruby id="identifier" script-interfaces="de.nofail.Identifier" script-source="classpath:/ruby/identifier.rb">
    <lang:property name="uuid" value="#{ T(java.util.UUID).randomUUID().toString() }" />
</lang:jruby>
# identifier.rb
class Identifier
  def setUuid(uuid)
    @uuid = uuid
  end
  def getUuid()
    @uuid
  end
  def to_s()
    "Identifier[#{@uuid}]"
  end
end

# don't forget to return an instance as a bean
Identifier.new
// Identifier.java
public interface Identifier {
  String getUuid();
}
// SpringJRubyBeanTest.java
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = "/applicationContext.xml")
public class SpringJRubyBeanTest {

  @Resource(name = "identifier")
  Identifier identifier;

  @Test
  public void generateUuid_success() {
    System.out.println("uuid: " + identifier);
  }
}

The code is currently not running with JRuby 1.5.0.RC1 so you need the latest stable release for testing.

checking Ruby enginges

It is even possible to mix and match all those approches! You just need to keep track of the Ruby engine that is evaluating your code. I created a little helper called which_ruby (available on rubygems) as a sidekick for this article, but more on that in the next week.

Some Ruby sugar in your cup of Java!

taming webapp logging with log4j

Logging is an aspect of programming that should follow some simple rules. If I should describe logging for a dictionary it would be something like: “Providing essential information with object introspection at a decent level of output that enables you to look into a running application or providing runtime information for debugging.”

But even the simplest thing can be complicated if you use Java…

There are different logging frameworks like log4j, commons logging or logback even though Java includes it’s own java.util.logging API. To make it even worse, the Java community came up with a simple logging facade so one can use the logging framework of choice but implement against the facade - WTF?!

a standard that sucks

I would guess that 99% of Java code is written against log4j. Nobody really want’s to switch a logging framework during implementation and most developers I know see log4j as a technical standard.

I won’t recommend using log4j to anyone. The log4j documentation is crap! There is no usefull literature either (I never thought that one would have to buy a book for logging…). The framework misses essential things like pattern matching filters (you will have to use log4j companions) and it is difficult to configure, especially on a tomcat.

Log4j is a bitch when it comes to deployment. One has to be careful where to put the log4j jar(s) and xml or plain text configuration files. If you do it wrong you end up with broken logging behavior; webapps log to the wrong log file, spit out stuff to the system out or stop logging at all.

this is how we do it

Our Java infrastructure is service oriented. We have a bunch of webservices hosted on serveral distributed tomcats and all wars depend on log4j for logging. We want to reload the logging configuration files at runtime (why is jboss able to do it and tomcat is not?!) and manipulate it via JMX.

additional log4j configuration

To achieve this, we have to put some additional configuration into every webapp. Log4j comes with a watchdog that looks for changes in the configuration files. The functionality can be enabled by using the configureAndWatch() method. There is a precanned Spring solution for this, but it has configuration overhead and environmental restrictions. So the best place to implement it ourselves is in a custom context listener:


   [...]
	
	<listener>
		<listener-class>de.nofail.TomcatLoggingListener</listener-class>
	</listener>
   [...]
   [...]
	public void contextInitialized(ServletContextEvent sce) {
		ServletContext servletContext = sce.getServletContext();
		try {
			String log4jFile = String.format("%s-log4j.xml", servletContext.getContextPath());
			String configFilename = new File(getClass().getResource(log4jFile).toURI()).getAbsolutePath();
			DOMConfigurator.configureAndWatch(configFilename, 1000);
		} catch (URISyntaxException e) {
			throw new IllegalStateException("Error resolving log4j configuration file for context=" + servletContext, e);
		}
	}
   [...]

This listener will use a config file named WEBAPP-log4j.xml lying in the root directory. The goal is, that every webapp has it’s own logging context. Every war should include a log4j.jar so that logging won’t be affected by other apps, the classloader hierarchie will ensure this. The different xml configuration files can than be placed in a shared tomcat directory for easy access of an operations team (which is probably yourself).

Using different configuration files has the advantage that you don’t have to mess around with appenders based on packages. That’s how one can configure different log levels for the same packages in each application, very usefull if you release a new application!

Java Management Extension

There are different ways to integrate JMX functionality into a webapp. The simplest approach is to use another servlet listener for propagating an mbean:


   [...]
	
	<listener>
		<listener-class>de.nofail.JMXListener</listener-class>
	</listener>
   [...]
   [...]
	public void contextInitialized(final ServletContextEvent sce) {
		try {
			TomcatLogging mbean = new TomcatLogging();
			ObjectName clutter = new ObjectName("de.nofail:type=TomcatLogging");
			ManagementFactory.getPlatformMBeanServer().registerMBean(mbean, clutter);
		} catch (Exception e) {
			throw new IllegalStateException("Could not create JMX context: " + e.getMessage(), e);
		}
	}
   [...]

If you are using Spring it’s easy to add JMX support via annotations.

Now you can access your running webapp via jconsole:

more information?

You can have a look at a working example based on Maven on github.

Drink a cup of Java™©®, but don’t forget the sugar!