For your amusement, here's my fuzzy-headed bug of the day. I wrote this code.
This application processes documents. A document may refer to one or more images outside of the document. Where this happens, we fetch all of those external images until we find one which can serve as a useful thumbnail for our document summary.
One stage of the document analysis pipeline identifies potential images. Another stage fetches and analyzes those images. We use DBIx::Class as our storage layer. We set up a request for valid potential images with:
my $valid_image_rs = $image_rs->search({
state => { '!=' => INVALID },
'fetchable.validity' => { '>=' => 0 },
}, { join => 'fetchable' });
We have, of course, an image
table. We also have a
fetchable
table. The latter represents anything which has a URI
and which we may make a network connection to fetch. This allows us to store
metadata such as fetch time, etag, and last-modified header values along with a
validity
score. Every time we fail to fetch an image for reasons
of network connectivity (on either side), we remove a point from this score.
After n unsuccessful fetches, we don't try anymore without manual
intervention.
This is all well and good.
Documents have a state
attribute, just like images do. A
document in the WAITFORIMAGE
state will receive no processing
until we've processed all of its images and either found an appropriate
thumbnail for the summary or invalidated all of the candidates. At that point,
the document should move on to another stage.
For various uninteresting technical reasons, the code to move a document
from the WAITFORIMAGE
stage to the WRITESUMMARY
stage
needs to query the database about the related images. That code looks like:
my $images_rs = $self->entry_images->search_related( 'image' );
my $image_count = $images_rs->search({ state => { '!=' => 'INVALID' } })
->count;
You probably see the bug right there, in that there's no check of image fetchable validity.
I've said more than a few times that I really like how DBIC lets you enhance your ResultSet classes with custom methods. I use this quite a bit to store searches I find myself using over and over. (This isn't an MVC application, but this technique is still a great way to separate model concerns from controller actions, such that your controller can call methods on the model and leave the searching code to the model classes entirely.)
If I'd moved this ad hoc query to the appropriate ResultSet class, I wouldn't have had this problem. (I wouldn't have spent a day debugging it.)
DBIC gives you the power to encapsulate interesting queries behind method calls. Take advantage of this. Anything more complicated than a very simple search probably warrants this treatment.