Writing good tests for code you care about is a creative process as much as it is an engineering discipline. You sometimes must find a balance between doing things the right way and doing things the effective way. The closer your tests are to the real usage patterns of your code, the more accurate they are; but the faster and easier they are to run, the better they help you find defects.
Slow but accurate tests won't get run, while fast but buggy tests won't produce any trust.
This dichotomy is often obvious in tests which must interact with multiple parts of a system, such as a database and a business logic layer. While many people say "Pull out your favorite mock object library and get to work!", I find mock objects are scalpels, not hammers: tools you need for a few specialized and precise operations, not tools you reach for every day.
Consider databases.
One of my projects at work is a database-backed B2C web site. The schema is under fairly active development (I deployed a new version of the site with a big change to the schema an hour ago and have at least two more updates this week). I have a version of the live database handy for testing purposes (with user information scrubbed out, or where that's not possible, salted and hashed), but I almost never use it. (I only use it when verifying that a deployment will succeed before performing the deployment.)
Instead, I use DBICx::TestDatabase and a hand-selected subset of data for all of the tests. DBICx::TD takes my existing schema and creates a SQLite database in memory (not on the disk at all) and returns a DBIx::Class schema object connected to that database. My initialization code dumps in some fixture data, and then it's ready to use in all of my tests.
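If you haven't used it before, the interesting part of DBICx::TestDatabase is a single constructor call. Here's a minimal sketch of the idea, assuming a DBIx::Class schema class named MyApp::Schema; the fixture rows are invented for illustration:

use DBICx::TestDatabase;

# deploys MyApp::Schema to a fresh in-memory SQLite database and
# returns a connected DBIx::Class schema object
my $schema = DBICx::TestDatabase->new( 'MyApp::Schema' );

# populate() bulk-inserts rows; the first arrayref names the columns
$schema->resultset( 'User' )->populate([
    [qw( name email )],
    [ 'Test User',    'test@example.com'    ],
    [ 'Another User', 'another@example.com' ],
]);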
Again, that database is not on disk. This has at least two benefits. First, it's a lot faster than a database on disk, because it has to do less IO. Even if I'm very careful about wrapping operations in transactions, waiting for data to hit a platter is still a lot slower than waiting for data to hit RAM. Second, every process has its own unique database in memory and there's no contention for the disk or a single database file. My tests are trivially parallelizable in this respect.
All of the setup code is in my test library, t::DB. (I could use lib 't/lib';, but I haven't needed that yet.) The relevant code looks like this:
package t::DB;

use strict;
use warnings;

use Try::Tiny;
use Test::More;    # provides BAIL_OUT()
use DBICx::TestDatabase;

use MyApp ();
use Test::WWW::Mechanize::Catalyst 'MyApp';
use File::Touch;

make_cached_components();

my $schema;
sub make_schema { $schema ||= DBICx::TestDatabase->new( shift ) }

sub install_test_database
{
    my ($app, $schema) = @_;

    # point the Catalyst model at the in-memory schema
    MyApp->model( 'DB' )->schema( $schema );
    MyApp->log->disable( 'warn' );
}

sub import
{
    my $self        = shift;
    my $appname     = 'MyApp';
    my $schema_name = $appname . '::Schema';
    my $schema      = make_schema( $schema_name );

    install_test_database( $appname, $schema );

    try
    {
        create_security_questions( $schema->resultset( 'SecurityQuestion' ) );
        create_user_roles( $schema->resultset( 'Role' ) );
        create_users( $schema->resultset( 'User' ) );
        create_invitations( $schema->resultset( 'Invitation' ) );
        create_sectors( $schema->resultset( 'Sector' ) );
        create_industries( $schema->resultset( 'Industry' ) );
        create_stocks( $schema->resultset( 'Stock' ) );
    }
    catch
    {
        my $exception = $_;
        BAIL_OUT( 'Fixture creation failed: ' . $exception );
    };

    # ...
}
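The create_*() fixture functions aren't shown here; each one inserts a small, known set of rows into one result set. A hypothetical create_users() might look something like this (the column names are invented, not my actual schema):

sub create_users
{
    my $users = shift;    # the User result set passed in from import()

    $users->populate([
        [qw( name email password )],
        [ 'Test User',  'test@example.com',  'a known password' ],
        [ 'Other User', 'other@example.com', 'another password' ],
    ]);
}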
When a test file uses this library, the import() method creates a new test database for my application's schema. (The schema classes are already in memory thanks to the use lines which bring them in as a dependency of the app.) The make_schema() function initializes a closure variable containing the schema object.
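An individual test file needs very little setup of its own; a sketch of what one might look like (the path and assertion are invented for illustration):

# t/root.t (hypothetical)
use strict;
use warnings;

use Test::More;
use t::DB;    # builds the in-memory database and wires it into MyApp

my $mech = Test::WWW::Mechanize::Catalyst->new( catalyst_app => 'MyApp' );
$mech->get_ok( '/', 'the front page should be reachable' );

done_testing();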
Before today, this code went to some contortions using Catalyst::Test to swap the existing application's schema with the generated schema, but now all it has to do is call install_test_database(), and Catalyst behaves as if it had connected to the in-memory database from the start.
As a bonus, I removed a couple of dependencies and deleted a few lines of ugly code.
You can take this approach and use it on your own (unless you have complex database logic SQLite can't handle, and even then I'd work with this design for a long time before I gave it up voluntarily). You can take a more general principle from it as well: make it easy to swap components in and out of your application. The schema layer adds business logic to my data storage, but it's also valuable because it presents an interface between a database and my application. Anything polymorphic (or even allomorphic) to that interface should suffice to give me confidence that my tests and my deployed application behave the same way.
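To make that concrete, the same one-line swap shown in install_test_database() works for any schema-shaped object, whether it lives in memory or on disk. These lines are illustrative, not from the application; the file name and environment variable are invented:

# swap in whichever schema this test run needs
my $in_memory = DBICx::TestDatabase->new( 'MyApp::Schema' );
my $on_disk   = MyApp::Schema->connect( 'dbi:SQLite:dbname=t/smoke.db' );

MyApp->model( 'DB' )->schema( $ENV{TEST_ON_DISK} ? $on_disk : $in_memory );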
(Of course the deployed database on disk has to match the schema in memory, but I did mention we have another mechanism to test that.)
For Galileo, by abstracting my initial database setup code (which uses the Mojo::Command CLI system), I can hook into it from my tests to create an in-memory database, as you do here, and inject the data quickly right before testing.
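In generic terms (these names are invented, not Galileo's actual API), the trick is to let the setup code take a DSN instead of hard-coding one, so tests can pass an in-memory SQLite DSN and get a freshly deployed schema:

sub setup_database
{
    my ($dsn) = @_;
    $dsn //= 'dbi:SQLite::memory:';    # tests default to an in-memory database

    my $schema = My::App::Schema->connect( $dsn );
    $schema->deploy;                   # create the tables from the schema classes

    return $schema;
}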