Creating a Maven webapp from scratch

Doing a little research for an article about “log4j”:http://logging.apache.org/log4j/ configuration in Java webapps, I decided to use “Maven”:http://maven.apache.org/ as a sidekick. “I already pointed out earlier”:http://blog.nofail.de/2010/01/buildr-the-build-system-that-doesnt-suck/ that my favourite build tool is “buildr”:http://buildr.apache.org/. The reason I chose Maven is that it has “archetypes”:http://maven.apache.org/guides/introduction/introduction-to-archetypes.html and that there is a deprecated guide for using such an archetype “to create a blank webapp”:http://maven.apache.org/guides/mini/guide-webapp.html that would be valuable for me.

h2. installing Maven on OS X

The Maven “documentation”:http://maven.apache.org/documentation in general and the “installation instructions”:http://maven.apache.org/download.html#Installation in particular *SUCK*. They lack essential information as most Java documentation does. So here is a step by step guide for OS X:

# go to the place you want to install Maven to
cd ~/Library
# download latest Maven version
curl -O http://apache.autinity.de/maven/binaries/apache-maven-2.2.1-bin.zip
# unzip the archive
unzip apache-maven-2.2.1-bin.zip
# add a symlink for convenience
ln -s apache-maven-2.2.1 maven
# add executables to the PATH
echo export PATH="~/Library/maven/bin":\$PATH >> ~/.profile

# open a new bash and check Maven is running
mvn --version
=> Apache Maven 2.2.1 (r801777; 2009-08-06 21:16:01+0200)

h2. creating the webapp

If Maven is running on the system, you can use the archetype to create a blank webapp:

# go to the workspace
cd ~/Documents/workspace/
# let Maven create a webapp
mvn archetype:generate \
   -DgroupId=de.nofail \
   -DartifactId=tomcat-logging \
   -Dversion=1.0.0-SNAPSHOT \
   -DarchetypeArtifactId=maven-archetype-webapp

This takes several minutes, as it pulls in a thousand-odd broken dependencies for whatever is necessary to create some files and folders! But after that, you have a fresh, Apache-conformant webapp at hand:

# go to the webapp
cd tomcat-logging/
# let Maven create a distributable war
mvn clean package

This is very nice for a Java application! Zero configuration, and after another two thousand downloads you get a running webapp that you can start instantly from the command line:

mvn jetty:run &
open http://localhost:8080/tomcat-logging/

h2. all by myself

There is just one big problem with this so far: it won’t work beyond that! Maven lacks sensible defaults everywhere! Examples? Here you go:

h3. Java compiler

Maven uses Java 1.4 as the default compliance level for the Java compiler. So if you start adding some “Annotations”:http://java.sun.com/j2se/1.5.0/docs/guide/language/annotations.html or “Generics”:http://java.sun.com/j2se/1.5.0/docs/guide/language/generics.html to your code, which are Java 1.5 features, your build will fail until you configure the compiler plugin properly:

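	[...]
	<build>
		<plugins>
			[...]
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<version>2.0.2</version>
				<configuration>
					<source>1.6</source>
					<target>1.6</target>
				</configuration>
			</plugin>
			[...]
		</plugins>
	</build>
	[...]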

This should be a top level configuration!

h3. Jetty plugin

Making changes to the code won’t change anything in the webapp run by the “Jetty plugin”:http://docs.codehaus.org/display/JETTY/Maven+Jetty+Plugin. You need to configure the plugin so that it picks up changes. Since the plugin cannot do any hot code replacement, it has to restart the context after every change. A no-go for most webapps:

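	[...]
	<build>
		<plugins>
			[...]
			<plugin>
				<groupId>org.mortbay.jetty</groupId>
				<artifactId>maven-jetty-plugin</artifactId>
				<version>6.1.10</version>
				<configuration>
					<scanIntervalSeconds>10</scanIntervalSeconds>
				</configuration>
			</plugin>
			[...]
		</plugins>
	</build>
	[...]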

Why is this soooo much XML?

h2. Eclipse integration

Maven has support for integrating a project into “Eclipse”:http://eclipse.org/ via the “Maven-Eclipse-Plugin”:http://maven.apache.org/plugins/maven-eclipse-plugin/:

# configure your workspace (do not use ~ to point to your home!)
mvn eclipse:configure-workspace -Declipse.workspace=../
# create Eclipse files
mvn eclipse:eclipse

These goals add an M2_REPO classpath variable to your Eclipse environment that points to your local Maven repository and create a _.project_ and a _.classpath_ file from the existing pom. Just import the project with _Import > General > Existing Projects into Workspace_ and you’re done.

h3. better with Eclipse plugins

Since there are mature Eclipse plugins for Maven and Jetty, you should consider installing these from their update sites:

* “m2eclipse”:http://m2eclipse.sonatype.org/sites/m2e
* “run jetty run”:http://run-jetty-run.googlecode.com/svn/trunk/updatesite

It’s never easy doing stuff from scratch, but Maven should help flatten the rocky path to Java projects. Instead it piles up another Mount Everest of complexity to climb for a Java developer…

h2. additional information

Check out my github profile for “a working example project”:http://github.com/phoet/tomcat-logging that was created using these steps.

Maven – development for morons made easy!

Migrating to Rails 3 for Heroku Bamboo

Recently there were some interesting “updates to the Heroku infrastructure”:http://blog.heroku.com/archives/2010/3/5/public_beta_deployment_stacks/, giving the opportunity to migrate “my personal Rails 2 website”:http://www.phoet.de/ to “Rails 3 (beta)”:http://weblog.rubyonrails.org/2010/2/5/rails-3-0-beta-release/.

Having an app with only a single model “for caching data”:http://blog.nofail.de/2010/02/simple-db-caching-for-heroku/, there is no worry about database migration. A nice opportunity for starting out new:

rvm use 1.9.1
gem install rails --pre
rails basement-rails3
cd basement-rails3
heroku create basement-rails3 --stack bamboo-mri-1.9.1

h2. business as usual?

Not really… Having “Yehuda Katz”:http://yehudakatz.com/ as a core developer of Rails 3, it’s no surprise they adopted the “Merb”:http://merbivore.com/ approach of just using one executable for everything. So the ‘script’ folder now contains just a ‘rails’ script. Creating controllers, running the server, jumping into the console – all through the ‘rails’ command:

rails -h
=> [...]
=>  generate    Generate new code (short-cut alias: "g")
=>  console     Start the Rails console (short-cut alias: "c")
=>  server      Start the Rails server (short-cut alias: "s")
=> [...]

I appreciate the shortcuts! No more discussions about what shortcut to use for ‘script/server’ (ss is not an option in Germany…)!

h2. dependency management

Rails 3 has changed the way of working with gems. It uses “bundler”:http://github.com/carlhuda/bundler/ to deal with dependencies. Being a big fan of Java’s dependency management tools like “Ivy”:http://ant.apache.org/ivy/ or “Maven”:http://maven.apache.org/, I think that separating out the dependency issue is a good idea.

All dependencies are now defined in a separate ‘Gemfile’ using an easy DSL to manage the gems:

gem "rails", "3.0.0.beta"
[...]
gem "sqlite3-ruby", :require => "sqlite3"
[...]
group :test do
  gem "test-unit", "1.2.3"
end

I had some trouble getting bundler working on my machine, but after reinstalling Rails 3 AFTER the bundler gem, everything worked fine.

The only Rails plugin in my app is “Haml”:http://haml-lang.com/ and I was confident that it would play well with the latest Rails version. Nevertheless, I was pleased to find “RailsPlugins.org”:http://www.railsplugins.org/ where one can check the compatibility of plugins with Rails 3.

h2. escaping vs. html_safe

There were very few changes to the existing codebase of my application. One thing, though, forced changes to nearly all of the wrapper objects that encapsulate the data coming from external services like “twitter”:http://twitter.com/phoet/. The problem is that “Rails 3 has a strict way of dealing with escaping”:http://yehudakatz.com/2010/02/01/safebuffers-and-rails-3-0/. Every string rendered into the view will be escaped unless it is marked as ‘html_safe’. Since my application uses a lot of pregenerated content with inline HTML, adding ‘html_safe’ markers is inevitable:

  def content
    @json["content"]["$t"].html_safe
  end

h2. Ruby 1.9 is different

The biggest pile of migration problems resulted from using Ruby 1.9.1. The latest Ruby version is a lot faster, but it has changed some of the core functionality. The ‘enum_with_index’ method, for example, has been replaced by the ‘each_with_index’ method on a hash.
Using old YAML files resulted in some strange behavior as these files have changed format slightly (because of the new symbol style that Ruby 1.9 is using, I guess):

# old
  id: home
# new
  :id: home
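
The difference between the two styles shows up as soon as the file is loaded: the old style yields string keys, the new one symbol keys. Here is a minimal sketch, assuming a current Ruby with the Psych-based YAML library, where symbols have to be permitted explicitly for ‘safe_load’ (the inline documents are just the two snippets from above):

```ruby
require 'yaml'

# old style: a plain key is loaded as a String
old_style = YAML.safe_load("id: home")

# new style: a key with a leading colon is loaded as a Symbol,
# which safe_load only accepts when Symbol is explicitly permitted
new_style = YAML.safe_load(":id: home", permitted_classes: [Symbol])

p old_style.keys.first.class
p new_style.keys.first.class
```

Code that looks up ‘config[:id]’ will therefore silently get nil for a file written in the old style.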

Ruby 1.9 also changed the way of handling unicode characters. Using these in code forces the developer to put a “magic comment”:http://blog.grayproductions.net/articles/ruby_19s_three_default_encodings/ in the first line of the ruby file:

# coding: utf-8
[...]

h2. beta quirks

Most of the new Rails 3 stuff just works, but there are some reasons why it is still beta:

# rails console won't quit with control-c but exits without error typing ö.ö
rails c
=> Loading development environment (Rails 3.0.0.beta)
ruby-1.9.1-p378 > ö.ö
^C

# rails help doesn't work for commands
rails -h
=> [...]
=> All commands can be run with -h for more information.
rails generate -h
=> Could not find generator -h.

“Beta but running!”:http://basement-rails3.heroku.com/

noSQL – Rails models with SOAP

Using a DB is a natural thing for a Rails developer. Since Rails is a database driven application framework, that does not come as a big surprise. But there are times where environmental constraints do not allow the freedom to use the weapon of choice…

Imagine a legacy Java SOA landscape that provides tons of webservices but does not permit access to a transaction DB. Sounds phoney? Ask your local J2EE consultant!

Working around this constraint, it would be great if one could just wire a SOAP service into Rails as a backing of model data. Using Rails without a database is a little bit tricky, especially if you don’t want to forego the power of ActiveRecord!

h2. so why use Rails then?

There are a lot of people that would say “Why don’t you use “Sinatra”:http://www.sinatrarb.com/ instead?”.

First of all, most Ruby developers know how to use Rails. The Rails community is large, lively and a great resource for knowledge. Features like REST come for free and nobody wants to miss model validations. In general, Rails plugins are a lazy programmer’s best friend!

h2. working with ActiveForm

A simple way to get your SOAP backed noSQL model working with ActiveRecord::Validations is probably by using “ActiveForm”:http://github.com/remvee/active_form. It provides validations for non ActiveRecord models and is available on github.

You can install the Rails plugin via:

# (re)install from git as a plugin
script/plugin install --force git://github.com/remvee/active_form.git

Using the plugin in your code is simple. Inherit from ActiveForm instead of ActiveRecord::Base:

# app/models/blog.rb
class Blog < ActiveForm
  column :title
  column :message, :type => :text
  
  validates_presence_of :title, :message
  [...]
end

It’s possible to remove all evidence of database connectivity. Just kick ActiveRecord from the list of Rails frameworks and re-add it as a gem (this step is not necessary, so you might skip it and work with Rails sqlite3 default):

# config/environment.rb
Rails::Initializer.run do |config|
  [...]
  config.gem "activerecord", :version => '2.3.5'
  [...]
  config.frameworks -= [ :active_record, :active_resource, :action_mailer ]
  [...]
end

Doing so will allow you to delete the database.yml file in your application.

h2. Savon for multi-tier persistence

Accessing an enterprise SOAP service with “Savon”:http://github.com/rubiii/savon/ is easy and integrating Savon into a Rails model requires just two steps:

* implementing a to_hash method
* implementing a save hook

Since Savon communication is based on data hashes, you have to provide a thin mapping layer to convert your model into a request hash that matches your SOAP interface:

# app/models/blog.rb
  def to_hash
    { :data => {:title=>title, :message=>message} }
  end

Pushing the data to the webservice requires some custom ‘persistence’ code to be implemented. A good place for that code should be in one of the model’s save hooks:

# app/models/blog.rb
  def after_save
    client = Savon::Client.new "http://localhost:8080/"

    client.post! do |soap|
      soap.namespace = "urn:savon:blog"
      soap.body = to_hash
    end
  end

Overriding the after_save method is a neat way to keep the model code readable for other Rails developers. Sticking to conventions is a best practice and reduces complexity greatly!
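
To see how the two pieces play together without a Rails app or a SOAP server at hand, here is a plain Ruby sketch. It is not the plugin or Savon code: ‘RecordingClient’ and ‘BlogPost’ are hypothetical stand-ins, and the validation is reduced to a simple presence check.

```ruby
# Hypothetical stand-in for a SOAP client: it only records the request hashes.
class RecordingClient
  attr_reader :requests

  def initialize
    @requests = []
  end

  def post(body)
    @requests << body
  end
end

# Hypothetical model that mimics the validate -> save -> after_save flow.
class BlogPost
  attr_accessor :title, :message

  def initialize(client, title, message)
    @client, @title, @message = client, title, message
  end

  # simplified replacement for validates_presence_of
  def valid?
    [title, message].none? { |value| value.nil? || value.empty? }
  end

  # thin mapping layer from the model to the request hash
  def to_hash
    { :data => { :title => title, :message => message } }
  end

  def save
    return false unless valid?
    after_save
    true
  end

  private

  # the save hook pushes the mapped data to the service
  def after_save
    @client.post(to_hash)
  end
end

client = RecordingClient.new
saved   = BlogPost.new(client, 'noSQL', 'Rails models with SOAP').save
skipped = BlogPost.new(client, '', 'no title given').save
```

Only the valid post reaches the client, so the service never sees data that failed validation.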

h2. more information?

There is a working example using a local “soap4r”:http://dev.ctor.org/soap4r server available on “github”:http://github.com/phoet/savon_nosql_example.

no SQL – no problem!

Simple DB caching for Heroku

Heroku is a great platform. I like the style of the page, I appreciate the documentation and you can start up for free! One thing that I miss a lot is decent caching. The read-only filesystem eats up a lot of flexibility.

I played around with HTTP caching and Heroku’s Varnish works really well. The problem is that my app loads a lot of stuff from different 3rd party services like Twitter, so every new visitor will have all the load time on his first visit. Not a surprise that New Relic indicates that request times were ‘Unacceptable’…

I would like to check out the ‘Memcached Basic’ plugin of Heroku, but I did not manage to get into the private beta. So there was no other option than implementing a DB cache.

There is just one requirement that I have: load stuff from a 3rd party service only if it’s expired. For simplicity, expired means that the data is older than a predefined interval. In my test environment I like to use a shorter period than in production, so I define the interval in the environment files:

# config/environments/development.rb
CACHE_TIME = 30.seconds

# config/environments/production.rb
CACHE_TIME = 10.hours

A simple key-data pair is enough for my needs, because I always have a unique key for the values I want to cache. I am using Marshal.dump/Marshal.load for serialization, as they play well with anonymous inner classes that YAML can’t deal with. Base64-encoding the data helps to work around some SQLite issues with serialized data strings:

# app/models/storage.rb
class Storage < ActiveRecord::Base
  
  validates_presence_of :key, :data
  
  def data=(data)
    write_attribute :data, ActiveSupport::Base64.encode64(Marshal.dump(data))
  end
  
  def data
    Marshal.load(ActiveSupport::Base64.decode64(read_attribute :data))
  end
  
end
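
The round trip itself can be tried without ActiveRecord. The following sketch uses Ruby’s built-in ‘pack’/‘unpack’ with the ‘m’ (Base64) directive instead of ActiveSupport::Base64; the helper names and the sample data are made up for illustration:

```ruby
# Ruby object -> Marshal bytes -> Base64 text that is safe to store
def encode_for_db(object)
  [Marshal.dump(object)].pack('m')
end

# Base64 text -> Marshal bytes -> Ruby object
def decode_from_db(text)
  Marshal.load(text.unpack('m').first)
end

tweets  = [{ :user => 'phoet', :text => 'caching on heroku' }]
encoded = encode_for_db(tweets)
decoded = decode_from_db(encoded)
```

The encoded string contains only Base64 characters and newlines, which is what makes it harmless for the text column.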

The actual caching logic is embedded in my application controller. I provide a simple cache method that can be called with a block. The block contains the remote call that I want to cache and is only executed if there is no data stored for the given key or the stored data is expired:

  # app/controllers/application_controller.rb
  # Loads the value for key from the DB cache. The block (the remote call)
  # only runs if nothing is stored yet or the stored data has expired.
  def cache(key)
    from_db = Storage.first(:conditions => {:key => key})
    if from_db.nil? || from_db.updated_at < Time.now - CACHE_TIME
      data = Array(yield) # nil becomes [], other enumerables become arrays
      return [] if data.empty? # never cache empty results
      from_db ||= Storage.new
      from_db.key = key
      from_db.data = data
      from_db.save!
    end
    instance_variable_set :"@#{key}", from_db.data
  end

Finally the data is pushed into an instance variable, so that I have access to it within my views.

Caching is now as simple as this:

  # cache all twitter posts and make them accessible via @tweets
  cache(:tweets){Helper::twitter_posts}
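
If you want to play with the expiry logic outside of Rails, the same idea fits into a few lines of plain Ruby. This is only a sketch: ‘ExpiringCache’ is a hypothetical in-memory stand-in for the Storage table, and the current time is passed in so that expiry can be simulated.

```ruby
# Hypothetical in-memory variant of the DB cache, for illustration only.
class ExpiringCache
  Entry = Struct.new(:data, :updated_at)

  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @entries = {}
  end

  # Returns the cached data for key. The block is only executed if there is
  # no entry yet or the entry is older than the configured interval.
  def fetch(key, now = Time.now)
    entry = @entries[key]
    if entry.nil? || entry.updated_at < now - @ttl
      data = Array(yield)
      return [] if data.empty? # never cache empty results
      entry = @entries[key] = Entry.new(data, now)
    end
    entry.data
  end
end

cache  = ExpiringCache.new(30)
calls  = 0
start  = Time.at(0)
first  = cache.fetch(:tweets, start)      { calls += 1; ['first'] }
second = cache.fetch(:tweets, start + 10) { calls += 1; ['second'] }
third  = cache.fetch(:tweets, start + 31) { calls += 1; ['third'] }
```

The second call is served from the cache, the third one hits the block again because the 30 second interval has passed.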

This little tweak noticeably improved the response time of my app:

  This week:
  Apdex Score: 0.70 [0.5] (Fair)

  Last week:
  Apdex Score: 0.06 [0.5] (Unacceptable)

Sugar on rails!

Writing your own DSL with Ruby

I am a big fan of Ruby. There are so many beautiful libraries out there and most of them are based on some kind of domain specific language. Take builder as an example:

  Builder::XmlMarkup.new.person { |b| b.name("Jim"); b.phone("555-1234") }
  #=> <person><name>Jim</name><phone>555-1234</phone></person>

Generating XML in this manner is pretty cool! There is no crazy XML editor in the world that gives you as much flexibility as this straightforward DSL. I think that a good DSL makes coding feel natural.

h2. HOWTO DSL

If you have read “Ruby Best Practices”:http://oreilly.com/catalog/9780596523015 or stuff like that, this article won’t bring anything new to you. In case you have not, you should start right away!

Anyhow, there are some simple rules behind writing a good DSL or API:

* let the user choose how to use it
* make options optional
* make use of an option hash for defaults
* make use of scoped blocks
* make use of instance_eval
* consider implementing method_missing

I am going to explain these rules for you, so that you can start writing your own cool DSL and help make programming Ruby even more fun.

h2. the Regexp DSL

I think that regular expressions are a great tool and they are ubiquitous in Ruby. Regexps are a first class citizen in Ruby and so you get a lot of built-in power for free. There are also some decent Regexp guides on the web waiting for you:

* “Programming Ruby, Regular Expressions”:http://www.ruby-doc.org/docs/ProgrammingRuby/html/language.html#UJ
* “Struggling With Ruby, Regular Expressions in ruby”:http://strugglingwithruby.blogspot.com/2009/05/regular-expressions-in-ruby.html
* […tons of other great sites…]

Regular expressions are a framework of their own, embedded into lots of programming languages. The biggest problem that I have with the Regexp DSL is missing readability. Take a look at the monster that lives in the dark pit of URI::REGEXP::PATTERN:

/
        ([a-zA-Z][-+.a-zA-Z\d]*):                     (?# 1: scheme)
        (?:
           ((?:[-_.!~*'()a-zA-Z\d;?:@&=+$,]|%[a-fA-F\d]{2})(?:[-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*)              (?# 2: opaque)
        |
           (?:(?:
             \/\/(?:
                 (?:(?:((?:[-_.!~*'()a-zA-Z\d;:&=+$,]|%[a-fA-F\d]{2})*)@)?  (?# 3: userinfo)
                   (?:((?:(?:(?:[a-zA-Z\d](?:[-a-zA-Z\d]*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:[-a-zA-Z\d]*[a-zA-Z\d])?)\.?|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|\[(?:(?:[a-fA-F\d]{1,4}:)*(?:[a-fA-F\d]{1,4}|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(?:(?:[a-fA-F\d]{1,4}:)*[a-fA-F\d]{1,4})?::(?:(?:[a-fA-F\d]{1,4}:)*(?:[a-fA-F\d]{1,4}|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))?)\]))(?::(\d*))?))?(?# 4: host, 5: port)
               |
                 ((?:[-_.!~*'()a-zA-Z\d$,;:@&=+]|%[a-fA-F\d]{2})+)           (?# 6: registry)
               )
             |
             (?!\/\/))                              (?# XXX: '\/\/' is the mark for hostport)
             (\/(?:[-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*(?:;(?:[-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*)*(?:\/(?:[-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*(?:;(?:[-_.!~*'()a-zA-Z\d:@&=+$,]|%[a-fA-F\d]{2})*)*)*)?              (?# 7: path)
           )(?:\?((?:[-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*))?           (?# 8: query)
        )
        (?:\#((?:[-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]|%[a-fA-F\d]{2})*))?            (?# 9: fragment)
      /xn

You don’t need to write such complex expressions to get to the point where you don’t understand the Regexp you wrote just 2 minutes ago. That’s why Ruby provides an _extended_ mode that allows you to insert spaces, newlines and comments in the pattern:

  def test_multiline_regex_with_comments
    assert_match(%r[a # the a
                    n # the n
                    ]x, 'anna')
  end

This is nothing that feels natural to me… So I started working on “my own regular expression DSL”:http://github.com/phoet/rebuil.

h2. lessons learned from Rebuil

It is always a good idea to write a DSL. There is no better way of becoming a domain expert than implementing a custom language for that specific domain. The other great thing is that you get in-depth knowledge of the key Ruby features that help you create crisp APIs and neat frameworks.

The vision that I have about using Regexps is a very descriptive one. There are probably some people that would call it verbose, but there is always a tradeoff:

  exp = rebuil.many.group('rebuil', :cool_name)
  puts "hello #{exp.match('hello world with rebuil')[:cool_name]} world!" 
  #=> hello rebuil world!

This code basically creates the Regexp _’.*(rebuil)’_. The cool thing is that you get named groups like in Ruby 1.9 for accessing them with descriptive symbols instead of an index.
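
For comparison, this is roughly what the same match looks like with a hand-written Regexp and a native named group in Ruby 1.9 or later. It is plain Ruby, not Rebuil code:

```ruby
# (?<cool_name>...) is Ruby's native syntax for a named group
match    = /.*(?<cool_name>rebuil)/.match('hello world with rebuil')
greeting = "hello #{match[:cool_name]} world!"
puts greeting
```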

h3. let the user choose how to use it

The first piece of advice for creating a DSL is important if you want to make your library sexy for other developers. Since most hackers have different coding styles, they prefer different types of expressions. A good example of this are the different ways one can create Rebuil objects:

  # standard approach
  Rebuil::Expression.new.group('a')

  # helper within object
  rebuil.group('a')

  # helper with a block
  rebuil do |exp|
    exp.group('a')
  end

  # helper with some starting pattern
  rebuil("").group('a')

As all implemented Rebuil methods return the object instance itself, one can chain method calls for convenience.

h3. make options optional

The rebuil method can take a string as an argument for the start of the expression. As you can see in the example above, this argument is optional.

A well designed API does not force the user to provide arguments that are not mandatory.

h3. make use of an option hash for defaults

If you end up with a lot of optional parameters in your method signature, you should consider using an optional parameter hash as the last method argument. Using carefully picked keys for the parameters gives the impression of named parameters, which improves readability greatly:

  def some_method(man, da, tory, options={:some=>'defaults'})
    ...
  end
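
One thing to watch out for: with a literal default hash in the signature, a caller who passes just one option replaces the whole hash and loses all other defaults. A common way around this is to merge the given options into the defaults inside the method. A small sketch, with a made-up method and made-up option names:

```ruby
# Hypothetical defaults, chosen only for illustration.
DEFAULT_OPTIONS = { :greeting => 'hello', :punctuation => '!' }.freeze

def greet(name, options = {})
  # options given by the caller win, everything else falls back to the defaults
  opts = DEFAULT_OPTIONS.merge(options)
  "#{opts[:greeting]} #{name}#{opts[:punctuation]}"
end

plain    = greet('world')
question = greet('world', :punctuation => '?')
```

The second call only overrides the punctuation and still gets the default greeting.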

h3. make use of scoped blocks

Providing scopes is another way to improve the readability of your code. Ruby makes it easy to do scope based programming with blocks:

  # no scope
  exp = rebuil
  exp.group('a')
  exp.characters('a')
  exp.many

  # better with a block
  rebuil do |exp|
    exp.group('a')
    exp.characters('a')
    exp.many
  end

This can be achieved by just yielding a Rebuil::Expression if a block is given.

h3. make use of instance_eval

Blocks can also be used to reduce the code you will have to write. The scoped block example above can be rewritten with less code:

  # smooth with instance_eval
  rebuil do
    group('a')
    characters('a')
    many
  end

Evaluating the block in the context of the current object is as simple as passing it on to Ruby’s instance_eval. The only drawback is that inside the block you can’t access stuff from your current scope like instance variables or methods of the surrounding object. But there is a way to make both solutions work, just ask the block for the number of arguments:

  def rebuil(expression="", &block)
    ...
    # block.arity returns the number of expected arguments of the block
    block.arity < 1 ? re.instance_eval(&block) : block.call(re) if block_given?
    ...
  end
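
Putting chaining, scoped blocks and instance_eval together, a complete miniature version could look like the following. Mind that this is a sketch for illustration and not the actual Rebuil source: ‘MiniExpression’ and the ‘mini’ helper are hypothetical names, and only two builder methods are implemented.

```ruby
# Hypothetical mini expression builder, not the real Rebuil implementation.
class MiniExpression
  def initialize(start = "")
    @pattern = start.dup
  end

  # every builder method returns self, so calls can be chained
  def many
    @pattern << ".*"
    self
  end

  def group(text)
    @pattern << "(#{Regexp.escape(text)})"
    self
  end

  def to_regexp
    Regexp.new(@pattern)
  end
end

def mini(start = "", &block)
  exp = MiniExpression.new(start)
  if block
    # a block without parameters is evaluated in the context of the expression,
    # a block with a parameter gets the expression passed in
    block.arity < 1 ? exp.instance_eval(&block) : block.call(exp)
  end
  exp
end

chained   = mini.many.group('rebuil').to_regexp
scoped    = mini { |exp| exp.many.group('rebuil') }.to_regexp
evaluated = mini { many; group('rebuil') }.to_regexp
```

All three styles build the same Regexp, so the user is free to pick whichever reads best.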

h3. consider implementing method_missing

Highly dynamic DSLs like Builder follow a different approach. They implement method_missing. The great advantage is that the user can possibly call anything on your object. The domain logic lies within your implementation of method_missing, which receives the name of the originally called method, the parameters and an optional block. Since regular expressions are deterministic, this dynamic approach does not suit Rebuil well.
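
To get a feeling for the technique, here is a tiny tag builder based on method_missing. It is a hypothetical sketch and not how Builder is actually implemented; ‘TinyXml’ inherits from BasicObject so that nearly every method name ends up in method_missing:

```ruby
# Hypothetical mini XML builder, for illustration only.
class TinyXml < BasicObject
  def initialize
    @out = "".dup
  end

  # any unknown method name becomes a tag; the content is either
  # the first argument or whatever the block adds to the output
  def method_missing(name, content = nil, &block)
    @out << "<#{name}>"
    if block
      block.call(self)
    else
      @out << content.to_s
    end
    @out << "</#{name}>"
    self
  end

  def to_s
    @out
  end
end

xml = TinyXml.new.person { |b| b.name('Jim'); b.phone('555-1234') }
puts xml.to_s
```

The price for this flexibility is that typos are no longer reported as errors, they simply become tags.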

With great power comes great responsibility. Always try to find the most appropriate implementation for your specific domain.

And don't forget the sugar!