Keep Your ResultSets DRY

For your amusement, here's my fuzzy-headed bug of the day. I wrote this code.

This application processes documents. A document may refer to one or more images outside of the document. Where this happens, we fetch all of those external images until we find one which can serve as a useful thumbnail for our document summary.

One stage of the document analysis pipeline identifies potential images. Another stage fetches and analyzes those images. We use DBIx::Class as our storage layer. We set up a request for valid potential images with:

my $valid_image_rs    = $image_rs->search({
    state                => { '!=' => INVALID },
    'fetchable.validity' => { '>=' => 0 },
}, { join => 'fetchable' });

We have, of course, an image table. We also have a fetchable table. The latter represents anything which has a URI and which we may make a network connection to fetch. This allows us to store metadata such as fetch time, etag, and last-modified header values along with a validity score. Every time we fail to fetch an image for reasons of network connectivity (on either side), we remove a point from this score. After n unsuccessful fetches, we don't try anymore without manual intervention.

This is all well and good.

Documents have a state attribute, just like images do. A document in the WAITFORIMAGE state will receive no processing until we've processed all of its images and either found an appropriate thumbnail for the summary or invalidated all of the candidates. At that point, the document should move on to another stage.

For various uninteresting technical reasons, the code to move a document from the WAITFORIMAGE stage to the WRITESUMMARY stage needs to query the database about the related images. That code looks like:

my $images_rs = $self->entry_images->search_related( 'image' );

my $image_count = $images_rs->search({ state => { '!=' => 'INVALID' } })
                            ->count;

You probably see the bug right there, in that there's no check of image fetchable validity.

I've said more than a few times that I really like how DBIC lets you enhance your ResultSet classes with custom methods. I use this quite a bit to store searches I find myself using over and over. (This isn't an MVC application, but this technique is still a great way to separate model concerns from controller actions, such that your controller can call methods on the model and leave the searching code to the model classes entirely.)

If I'd moved this ad hoc query to the appropriate ResultSet class, I wouldn't have had this problem. (I wouldn't have spent a day debugging it.)

DBIC gives you the power to encapsulate interesting queries behind method calls. Take advantage of this. Anything more complicated than a very simple search probably warrants this treatment.

Tags:

Modern Perl: The Book

Categories

Monthly Archives

Pages

About this Entry