Learn how to abstract database rows using
Maybe I’m unique. Maybe I’m the only programmer in
the world who hates (re) typing the same, boring
"i">SQL to update a column in a row of interest:
SET start_date = "2006-09-21"
WHERE cruise.id = 27;
But then again, given the number of CPAN
modules that abstract SQL, I’m evidently not alone.
In fact, there are many, very nice “object-relational
mapper” (ORM) frameworks in the Perl
world. The goal of an ORM is to present each database row as a
single object and use traditional, object-oriented” getter” and”
setter” methods to update a row (possibly delayed until the object
goes out of scope or is explicitly saved). I’m a big fan of a
good ORM, because I’d much rather write the previous SQL as
the far more natural:
Of course, every framework or wrapper introduces some overhead.
A good framework mostly hides the fact that
objects are really rows, yet provides enough low-level hints that
I’m encouraged to write code that maps closely to the real
SQL queries that must eventually be executed.
The Long Reign of Class::DBI
In previous columns, I’ve extolled the virtues of the
popular ORM called Class::DBI, often
abbreviated CDBI. CDBI started out as an interesting object
wrapping of rows written by Michael Schwern. Eventually, the whole
thing got wrapped with two more layers of abstraction by Tony
Bowdon, who then propelled CDBI into the “nearly
And for a good three or four years, CDBI was the ORM of choice.
I wrote articles about it; other people plugged it in to larger
frameworks, such as Catalyst (formerly
Maypole); and still others used it like
there was nothing else — which is a bit odd, because there
were other options, such as
"c">Tangram and Alzabo, which worked
hard to hide all that nasty SQL.
But then one day (and I’m sorry if I’m mispresenting
this, because it’s the way I remember it), Tony got a little
irritated at being the benevolent dictator
for what had become the most popular ORM for Perl. And he quit. He
took his toys and went home. He stopped accepting patches and
refused to reply to the community. Moreover, the next release of
CDBI included this caveat on the CDBI man
There is an active Class::DBI community.
However, I am not part of it. I am not on the mailing list, and I
don’t follow the wiki. I also do not follow Perl Monks, CPAN
reviews, annoCPAN, or whatever the tool du jour happens to
Seemingly, if there were going to be any updates to CDBI,
it’d be exactly and only what Tony wanted of the code. And
that’s certainly his right, as the owner of the code.
However, it left many other developers in an embarassing position.
For example, I applied CDBI in a number of commercial projects, and
no more support of the code meant that I’d essentially have
to fork a version of my own (legal within the Perl license) to keep
things working. And what of my clients? With the CDBI community
fatally fractured, there was really no place to go for maintenance
on what had become this key component of their applications.
Within a few months of Tony’s apparent withdrawal, Matt
Trout (one of the big users of CDBI) stepped forward to create
DBIx::Class (DBIC), essentially a
reimplementation of most of the good things about CDBI, yet
omitting some of the layers that made CDBI both slow and hard to
Initially, I was quite intrigued by the prospect: DBIC seemed to
have most of the features I had grown to like in CDBI, with half
the overhead and far more “community” support than
CDBI. However, I realized that part of what I
"i">didn’t like about CDBI — how columns were
processed when stored or updated — was duplicated
(necessarily) in DBIC. The CDBI design was a wart upon a pimple,
and the new design had to duplicate the ugliness.
So, while I made plans to migrate my frequent clients from CDBI
to DBIC, I wasn’t entirely comfortable with the decision,
thinking that even though CDBI was no longer
"i">supported, at least it worked for my customers. And
I’m glad that I waited, because along came
The Rose: Heavenly Sent
Now, I’m not sure if John Siracusa (the creator and
maintainer of many things in the Rose
namespace) went through a similar experience as mine — all I
know is that when RDBO first came out, I was thinking “Oh,
another ORM. How quaint.” And then I started looking at the
benchmarks, and my jaw dropped.
Unlike most other ORMs, which try to hide the fact that rows of
the database map directly into objects, RDBO takes the opposite
approach: that you should know that each
object is a database row. And this shows up by making most database
operations map to method calls more closely.
For example, you can create an abstract
"i">row in memory, but until you call
"c">save() on that row, it doesn’t really exist in the
database. With CDBI, I had to cheat a bit with that, hoping that a
random, memory-only object that I created in the shape of a row
wasn’t going to get accidentally realized in the
And because of many optimizations under the hood, RDBO runs
faster and with more immediate control of database actions. The
benchmarks that compare RDBO against CDBI and DBIC are very
convincing: in many cases, RDBO comes within striking distance of
hand-crafted DBI calls, beating CDBI sometimes by an order of
magnitude in speed.
Another thing I like about RDBO is that John Siracusa is
well-versed in good design practices for object-oriented
programming and large frameworks. I say this having read through a
huge number of frameworks in my life, and even having written a few
that are now popular. The design for RDBO seems quite elegant, with
the right ability to subclass or replace parts with different
components, but defaulting to the most common thing to eliminate
the need to “over-specify” for straightforward
Another thing that RDBO does “just enough” of is
hiding a few of the database differences from me. For example, I
recently wrote a pile of code for a client that would eventually
run on MySQL. Since I’d never choose
MySQL for a client now that I know about
"i">PostgreSQL, I decided that I wanted to do some early
testing without MySQL, and chose sqlite
instead. Once I got my code running with sqlite, it was a slight
matter of changing one or two methods in my code to obtain a
working MySQL implementation as well. This even included some
date/time fields, which were transparently rewritten for both
sqlite and MySQL.
RDBO creates a level of abstraction for both the database and
for rows in the database. The reason the database and rows are
separate (instead of being bound together, as they are in CDBI) is
that you might have a row in one database that you’d like to
move to another database, or you might want to mix your application
across multiple databases.
For your database, you subclass Rose::DB.
Each row is a subclass of
My personal convention for a project named
"c">ABC is to create these classes:
is a subclass of Rose::DB for my
X represents classes that deal with
the entire database.
"c">ABC::DBO is a subclass of
"c">Rose::DB::Object, the parent class of all my rows.
"c">ABC::DBO::Foo is a subclass of
"c">ABC::DBO for the Foo row.
And because I’m consistent, I can come back to this
application in a year and still know what I did. (I hope.)
The other thing that RDBO needs beyond the database information
and the row class is the metadata for a row. The metadata includes
the database columns in each table and any relationships between
one table and other tables, such as a “one to many”
RDBO can operate in two modes (well, actually two and a half, as
I describe in a moment). RDBO can either probe the database for all
metadata — this foreign key points at that table’s
primary key, and that column is a date field — or be given
the metadata explictly.
If your database is simple enough, probing the database for
metadata ensures that your metadata is never out of sync with the
existing database. However, for a large database, RDBO’s
probing-and-thinking time may exceed your limits for startup time.
And that’s where the “other half mode” comes in.
When you decide that you’re no longer willing to pay the
startup price for dynamically probed metadata, you can perform the
operation one last time and spit out fairly well-formed Perl code
that can be loaded to set up the same metadata. You can capture
this code and use it as the basis for further refinement.
For example, a TEXT field in the database
might actually be restricted by business rules to be one of
greeen, yellow, or
red. A simple edit to the Perl code spit out
by the “half” stage, and we’ll have automatic
checking against this list of enumerated values, something that
would never have been represented in the schema of the
One thing that RDBO doesn’t (yet) do is map from metadata
to the SQL data definition language that would create the
equivalent database. (Other ORM’s, including
"c">Alzabo, are able to perform the mapping.) I’ve
chatted with John recently, and he just hasn’t had the need
for the feature yet, so that’s why RDBO doesn’t do it,
and I understand his reasoning.
Next month, I plan to show some sample code using RDBO with a
typical, small database. Hopefully, the code will illustrate more
clearly some of the concepts discussed here. Until then, enjoy!
Randal Schwartz is the chief Perl guru at Stonehenge
Consulting. You can reach Randal at