Testing Catalyst and DBIC with an In-Memory Database

| 1 Comment

Writing good tests for code you care about is a creative process as much as it is an engineering discipline. You sometimes must find a balance between doing things the right way and doing things the effective way. The closer your tests are to the real usage patterns of your code the more accurate they are, but the faster and easier, the better your tests are to help you find defects.

Slow but accurate tests won't get run, while fast bug buggy tests won't produce any trust.

This dichotomy is often obvious in tests which must interact with multiple parts of a system, such as a database and a business logic layer. While many people say "Pull out your favorite mock object library and get to work!", I find mock objects are scalpels, not hammers, in that you need them for only a few specialized and precise operations, not a tool you'd use every day.

Consider databases.

One of my projects at work is a database-backed B2C web site. The schema is under fairly active development (I deployed a new version of the site with a big change to the schema an hour ago and have at least two more updates this week). I have a version of the live database handy for testing purposes (with user information scrubbed out, or where that's not possible, salted and hashed), but I almost never use it. (I only use it when verifying that a deployment will succeed before performing the deployment.)

Instead, I use DBICx::TestDatabase and a hand-selected subset of data for all of the tests. DBICx::TD takes my existing schema and creates a SQLite database in memory (not on the disk at all) and returns a DBIx::Class schema object connected to that database. My initialization code dumps in some fixture data, and then it's ready to use in all of my tests.

Again, that database is not on disk. This has at least two benefits. First, it's a lot faster than a database on disk, because it has to do less IO. Even if I'm very careful about wrapping operations in transactions, waiting for data to hit a platter is still a lot slower than waiting for data to hit RAM. Second, every process has its own unique database in memory and there's no contention for the disk or a single database file. My tests are trivially parallelizable in this respect.

All of the setup code is in my test library, t::DB. (I could use lib 't/lib';, but I haven't needed that yet.) The relevant code looks like this:

package t::DB;

use strict;
use warnings;

use Try::Tiny;
use DBICx::TestDatabase;
use MyApp ();
use Test::WWW::Mechanize::Catalyst 'MyApp';
use File::Touch;


my $schema;
sub make_schema { $schema ||= DBICx::TestDatabase->new( shift ) }

sub install_test_database
    my ($app, $schema) = @_;
    MyApp->model( 'DB' )->schema( $schema );
    MyApp->log->disable( 'warn' );

sub import
    my $self        = shift;
    my $appname     = 'MyApp';
    my $schema_name = $appname . '::Schema';
    my $schema      = make_schema( $schema_name );
    install_test_database( $appname, $schema );

        create_security_questions( $schema->resultset( 'SecurityQuestion' ) );
        create_user_roles(         $schema->resultset( 'Role'       ) );
        create_users(              $schema->resultset( 'User'       ) );
        create_invitations(        $schema->resultset( 'Invitation' ) );
        create_sectors(            $schema->resultset( 'Sector'     ) );
        create_industries(         $schema->resultset( 'Industry'   ) );
        create_stocks(             $schema->resultset( 'Stock'      ) );
        my $exception = $_;
        BAIL_OUT( 'Fixture creation failed: ' . $exception );

    # ...

When a test file uses this library, the import() method creates a new test database for my application's schema. (The schema classes are already in memory thanks to the use lines which bring them in as a dependency of the app.) The make_schema() lines initialize a closure variable containing the schema object.

Before today, this code went to some contortions using Catalyst::Test to swap the existing application's schema with the generated schema, but now all it has to do is to call install_test_database(), and Catalyst behaves as if it had connected to the schema using the in-memory database from the start.

As a bonus, I removed a couple of dependencies and deleted a few lines of ugly code.

You can take this approach and use it on your own (unless you have complex database logic SQLite can't handle, and even then I'd work with this design for a long time before I gave it up voluntarily). You can take a more general principle from it as well: make it easy to swap components in and out of your application. The schema layer adds business logic to my data storage, but it's also valuable because it presents an interface between a database and my application. Anything polymorphic (or even allomorphic) to that interface should suffice to give me confidence between tests and deployed application.

(Of course the deployed database on disk has to match the schema in memory, but I did mention we have another mechanism to test that.)

1 Comment

For Galileo, by cleverly abstracting my initial database setup code, which uses the Mojo::Command cli system, I can hook into it in my testing to create an in-memory database as you do, and inject the data quickly right before testing.

Modern Perl: The Book

cover image for Modern Perl: the book

The best Perl Programmers read Modern Perl: The Book.

sponsored by the How to Make a Smoothie guide



About this Entry

This page contains a single entry by chromatic published on August 27, 2012 3:30 PM.

Simplified HTML Testing with Mojo::DOM and Mech was the previous entry in this blog.

You Don't Get to Choose How Other People Feel is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.

Powered by the Perl programming language

what is programming?