A Practical Use for Macros in Perl


People occasionally ask for practical examples of macros when I lament the lack of macros in Perl. While I'm usually pleased at the degree to which Perl lets me design code to get and stay out of my way, sometimes its abstractions just aren't quite enough to remove all of the duplication available.

(I've been refactoring one of our business projects in preparation for another round of deployment in the next couple of weeks. We could launch without these improvements, but administrative work took almost two weeks longer than the afternoon I'd planned for it, so I decided it was worth my time to reduce technical friction so that further improvements are easier. More users means more work, so why not accelerate that work while I have the chance? I have another longer technical post to write to praise the use of Moose roles for a plugin system and to show off the stupidly-great task launcher, but that's for later.)

I found myself writing two code couplets that were similar enough they triggered my "Hey, refactor away this duplication!" alert. It's extra sensitive, because I know I'll have a few more couplets like this in the very near future:

while (my $stock = $stock_rs->next)
{
    my $pe_update = $self->analyze_pe( $stock );
    $stock_txn->add( $pe_update ) if $pe_update;

    my $cash_yield_update = $self->analyze_cash_yield( $stock );
    $analysis_txn->add( $cash_yield_update ) if $cash_yield_update;
}

The *_txn variables contain objects representing deferred and scoped SQL updates. I'll talk about that at YAPC::NA 2012 in When Wrong is Better.

The general pattern is this: for every stock in the appropriate resultset, call a method in this plugin. The method will return nothing if it fails (or has nothing to do) or it will return data to be added to the appropriate transaction. I have at least two types of transactions available here at the moment, and may have more later: one transaction updates stock data and the other updates analysis data.

I have several options. I could rework the data model so that this stage only ever updates one transaction, in which case the loop body could instead look like:

{
    for my $method (qw( analyze_pe analyze_cash_yield ))
    {
        next unless my $result = $self->$method( $stock );
        $txn->add( $result );
    }
}

This technique of hoisting the variants into an ad hoc data structure and using existing looping techniques works well sometimes. (I use it in other parts of the system.) It's relatively easy to expand, even though it moves interesting information ("I'm calling the analyze_pe method!") to a place where tools have more trouble finding it. (I search for >analyze_pe when I want to find method calls.) You may have used something similar to define several parametric methods at BEGIN time. It's the same type of pattern, and while Perl provides most of the tools necessary to allow this, it doesn't natively express this pattern well.
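To make the "parametric methods at BEGIN time" pattern concrete, here is a minimal sketch. The Sketch::Analyzer class and the hash-based stock are hypothetical stand-ins, not code from the project described above; the method bodies only look up a field so that the example runs on its own.

```perl
use strict;
use warnings;

{
    package Sketch::Analyzer;    # hypothetical class, for illustration only

    BEGIN {
        # Install one closure per metric into the symbol table at compile time.
        for my $metric (qw( pe cash_yield )) {
            no strict 'refs';
            *{ __PACKAGE__ . "::analyze_$metric" } = sub {
                my ($self, $stock) = @_;
                return $stock->{$metric};    # stand-in for the real analysis
            };
        }
    }

    sub new { return bless {}, shift }
}

my $analyzer = Sketch::Analyzer->new;
my $stock    = { pe => 12.5, cash_yield => 0.04 };    # hypothetical data

print $analyzer->analyze_pe( $stock ), "\n";
print $analyzer->analyze_cash_yield( $stock ), "\n";
```

The trade-off is the one described above: the methods exist and can be called by name, but a search for "sub analyze_pe" finds nothing, because the name is only ever assembled from a string.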

I could also change the transaction object's add() method to do nothing when it receives an empty list of arguments. I like that in some ways, but I don't like it in others. I've come down on the side of keeping its invariant (it always takes only one scalar as an object) pure for now. If I change it to take a list of updates, that might be the right time to reconsider this.
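For comparison, the alternative of letting add() accept an empty list might look roughly like the sketch below. Sketch::Txn, nothing_to_do(), and one_update() are hypothetical names invented for this example; the real transaction objects are not shown in this post.

```perl
use strict;
use warnings;

{
    package Sketch::Txn;    # hypothetical; not the real transaction class

    sub new { return bless { updates => [] }, shift }

    # Accept any number of updates; an empty list is a silent no-op.
    sub add {
        my ($self, @updates) = @_;
        push @{ $self->{updates} }, @updates;
        return $self;
    }

    sub pending { return scalar @{ $_[0]{updates} } }
}

# Hypothetical analysis results. A bare return yields an empty list in
# list context, and a call's argument list provides list context.
sub nothing_to_do { return }
sub one_update    { return { ticker => 'AAA', pe => 12.5 } }

my $txn = Sketch::Txn->new;
$txn->add( nothing_to_do() );    # add() receives no arguments
$txn->add( one_update() );       # add() receives one update

print $txn->pending, "\n";
```

This only works if the analysis methods signal "nothing to do" with a bare return. A method that ends with an explicit "return undef" would still pass one undefined value to add(), which is part of why loosening the invariant is a real design decision rather than a free cleanup.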

What I notice in the code as it stands right now is that the individual variables $pe_update and $cash_yield_update are synthetic variables. They only exist to support the code as written; they're not necessary for the algorithm. If I were to modify this code but only this code, I'd really rather write:

{
    ADD_TXN_WITH( $self, analyze_pe,         $stock, $stock_txn    );
    ADD_TXN_WITH( $self, analyze_cash_yield, $stock, $analysis_txn );
}

... though that syntax doesn't thrill me either. The clearest possibility I see right now is:

{
    $stock_txn->add(    SKIP unless $self->analyze_pe( $stock )         );
    $analysis_txn->add( SKIP unless $self->analyze_cash_yield( $stock ) );
}

... where SKIP does some magic to move to the next statement, not the next loop iteration. (I have some ideas how to write XS to make this work, but that creepy yak needs a shave and some mouthwash.)

The second best option right now is adding a function or method as indirection to encapsulate the synthetic code. I'd rather avoid synthetic code, but at least it reduces the possibility of copy and paste bugs.

For now, with only two steps in this analysis, I'm leaving it as it is. Two repetitions of something this similar set off my refactoring alarm, but I resist the urge for refactorings this small until I see three instances of near-duplicate code.

10 Comments

Filter::Template on CPAN allows you to do something like this:

use warnings;
use strict;

sub Filter::Template::DEBUG () { 1 }
use Filter::Template;

template ADD_TXN_WITH (analyze_object, analyze_method, analyze_stock, txn_obj) {
{
my $result = analyze_object->analyze_method( analyze_stock );
txn_obj->add( $result ) unless $result;
}
}

{% ADD_TXN_WITH $self, analyze_pe, $stock, $stock_txn %}
{% ADD_TXN_WITH $self, analyze_cash_yield, $stock, $analysis_txn %}

A source filter registers template definitions and expands invocations inline before the program is compiled. All the usual source-filter caveats apply until someone submits a patch to use PPI, a B module, or something even better.

#line directives preserve the sanity of errors and warnings. Because DEBUG is set, the example above generates the output below. You may need to widen your browser or shrink your font to see the full effect.

   6 |: 
   7 D: # template ADD_TXN_WITH (analyze_object, analyze_method, analyze_stock, txn_obj) {
   8 M: # mac 4:   {
   9 M: # mac 4:     my $result = analyze_object->analyze_method( analyze_stock );
  10 M: # mac 4:     txn_obj->add( $result ) unless $result;
  11 M: # mac 4:   }
  12 M: # mac 4: }
  13 |: 
  14 S: # line 14 "template ADD_TXN_WITH (defined in filter-template.pl at line 8) invoked from filter-template.pl"
  14 S: {
  14 S: # line 14 "template ADD_TXN_WITH (defined in filter-template.pl at line 9) invoked from filter-template.pl"
  14 S:     my $result = $self->analyze_pe( $stock );
  14 S: # line 14 "template ADD_TXN_WITH (defined in filter-template.pl at line 10) invoked from filter-template.pl"
  14 S:     $stock_txn->add( $result ) unless $result;
  14 S: # line 14 "template ADD_TXN_WITH (defined in filter-template.pl at line 11) invoked from filter-template.pl"
  14 S:   }
  14 S: # line 15 "filter-template.pl"
  15 S: # line 15 "template ADD_TXN_WITH (defined in filter-template.pl at line 8) invoked from filter-template.pl"
  15 S: {
  15 S: # line 15 "template ADD_TXN_WITH (defined in filter-template.pl at line 9) invoked from filter-template.pl"
  15 S:     my $result = $self->analyze_cash_yield( $stock );
  15 S: # line 15 "template ADD_TXN_WITH (defined in filter-template.pl at line 10) invoked from filter-template.pl"
  15 S:     $analysis_txn->add( $result ) unless $result;
  15 S: # line 15 "template ADD_TXN_WITH (defined in filter-template.pl at line 11) invoked from filter-template.pl"
  15 S:   }
  15 S: # line 16 "filter-template.pl"
Global symbol "$self" requires explicit package name at template ADD_TXN_WITH (defined in filter-template.pl at line 9) invoked from filter-template.pl line 14.
Global symbol "$stock" requires explicit package name at template ADD_TXN_WITH (defined in filter-template.pl at line 9) invoked from filter-template.pl line 14.
Global symbol "$stock_txn" requires explicit package name at template ADD_TXN_WITH (defined in filter-template.pl at line 10) invoked from filter-template.pl line 14.
Global symbol "$self" requires explicit package name at template ADD_TXN_WITH (defined in filter-template.pl at line 9) invoked from filter-template.pl line 15.
Global symbol "$stock" requires explicit package name at template ADD_TXN_WITH (defined in filter-template.pl at line 9) invoked from filter-template.pl line 15.
Global symbol "$analysis_txn" requires explicit package name at template ADD_TXN_WITH (defined in filter-template.pl at line 10) invoked from filter-template.pl line 15.
Execution of filter-template.pl aborted due to compilation errors.

Now that I've actually read what you wrote earlier about macros, Filter::Template isn't the type you're looking for. It is one way to solve the immediate problem here, though.

I may be dense, but to me this does not look like a case of needing macros, but rather a case of the add_txn method forcing a check onto the calling code. If we look at what you consider the clearest code, we can see that what you really want is for add_txn to ignore undefs (or false values, but I suspect they are effectively the same here). Why not modify add_txn to reject undefs? Is there ever a case where you want to add an undef to a transaction?

How about so called "bare loops" with next?

while (my $stock = $stock_rs->next)
{
    PE: { $txn->add( $self->analyze_pe($stock)         or next ) }
    CY: { $txn->add( $self->analyze_cash_yield($stock) or next ) }
}

Here, if analyze_pe or analyze_cash_yield return false, next is triggered before add thus jumping out of the bare loop before add ever happens.
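A small self-contained script shows the behavior. It relies on the documented rule that a bare block counts as a loop that runs once, so next leaves the block rather than the enclosing loop. The add() and analyze() subs here are hypothetical stand-ins for the transaction and analysis methods.

```perl
use strict;
use warnings;

my @added;

# Hypothetical stand-ins: add() records its argument, and analyze()
# returns a false value for odd numbers.
sub add     { push @added, $_[0]; return }
sub analyze { my $n = shift; return $n % 2 ? undef : "even:$n" }

for my $n (1 .. 4) {
    # next leaves only the bare block it appears in, so the second
    # block still runs when the first one bails out.
    FIRST:  { add( analyze( $n )     or next ) }
    SECOND: { add( analyze( $n + 1 ) or next ) }
}

print "@added\n";
```

The labels are optional here; they serve as documentation, as in the example above.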

You can have your suggested macro syntax already and have it be even nicer:

while ( my $stock = $stock_rs->next ) {
    $self->try_add( %$_, $stock ) for ( { pe => $stock_txn }, { cash_yield => $analysis_txn } );
}

sub try_add {
    my ( $self, $method, $txn, $stock ) = @_;

    $method = "analyze_$method";
    my $result = $self->$method( $stock );
    $txn->add( $result ) if $result;

    return;
}

That's absolutely not what Perl needs macros for, since it already has enough expressiveness to do almost anything they could do. However, that doesn't mean there's no need for them. In fact, there is a very pressing need:

Performance

In certain situations Perl's function call overhead can become a liability and necessitate code like this:

http://dwarvis.googlecode.com/svn/trunk/lifevis/Lifevis/Viewer.pm

In a number of places there I would've liked to not have for loops nested in for loops stretching over more than 100 lines. I would've loved to have specific actions separated into subroutines and named sensibly. However, I was forced to dissolve subs into those loops because calling them was too much of a performance drain.

This is a perfect case where macros would've helped keep the code speedy, while also making it readable.

I thought about your suggestion while I was coding, but it didn't seem nicer to me. I like and use transient data structures often in situations like this, but it's more indirection than I usually want to maintain.

Your point about the abstraction of names without the overhead of dispatch is well taken.

The problem here is ultimately that the declaration does not take effect until the end of the statement. Sometimes that’s an incredibly useful property but I have also often found it a hindrance. Consider:

my $pe_update;
$pe_update = $self->analyze_pe( $stock ) and $stock_txn->add( $pe_update );

You can do away with the declaration, but weaving in the conditional then requires ugliness:

$stock_txn->add( $_ ) for grep { $_ } $self->analyze_pe( $stock );

If we could have in Perl 5 what Perl 6 allows, namely combining two modifiers as long as one is a loop and one a conditional, then it could even look pretty nice:

$stock_txn->add( $_ ) if $_ for $self->analyze_pe( $stock );

Of course object-heads will shake their heads. If analyze_foo returned an object with an add_to method you could just write this:

$self->analyze_pe( $stock )->add_to( $stock_txn );

If the analyze_foo methods return a null class instance for a thumbs-down result then polymorphism sorts it all out for you, as Kay intended.
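A minimal sketch of that null-object arrangement follows. Every name in it (Sketch::Update, Sketch::NoUpdate, the free-standing analyze_pe function, and the array reference standing in for a transaction) is an assumption made for illustration, not part of the code under discussion.

```perl
use strict;
use warnings;

{
    package Sketch::Update;      # hypothetical result object
    sub new    { my ($class, %args) = @_; return bless {%args}, $class }
    sub add_to { my ($self, $txn) = @_; push @$txn, $self; return }
}

{
    package Sketch::NoUpdate;    # hypothetical null object: same interface, no effect
    sub new    { return bless {}, shift }
    sub add_to { return }
}

# Hypothetical analysis: always returns an object, never a false value.
sub analyze_pe {
    my ($stock) = @_;
    return Sketch::NoUpdate->new unless defined $stock->{pe};
    return Sketch::Update->new( ticker => $stock->{ticker}, pe => $stock->{pe} );
}

my @stock_txn;    # a plain array stands in for the transaction object
my @stocks = ( { ticker => 'AAA', pe => 12.5 }, { ticker => 'BBB' } );

# No conditional at the call site; the null object absorbs the failure case.
analyze_pe( $_ )->add_to( \@stock_txn ) for @stocks;

print scalar @stock_txn, "\n";
```

The conditional has not disappeared; it has moved into analyze_pe, where the decision about which object to return is made once.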

In the meantime, people who know monads will be chuckling and shaking their heads, barely perceptibly.

Well, if that solution is too clever, this works too:

while ( my $stock = $stock_rs->next ) {
    ADD_TXN_WITH( $self, analyze_pe => $stock, $stock_txn );
    ADD_TXN_WITH( $self, analyze_cash_yield => $stock, $analysis_txn );
}

sub ADD_TXN_WITH {
    my ( $self, $method, $stock, $txn ) = @_;

    my $result = $self->$method( $stock );
    $txn->add( $result ) if $result;

    return;
}

Just uploaded a bit of sugar to CPAN.

use PerlX::Perform;
...

while (my $stock = $stock_rs->next)
{
    perform { $txn->add($_) } wherever $self->analyze_pe($stock);
    perform { $txn->add($_) } wherever $self->analyze_cash_yield($stock);
}


It assumes that analyze_pe and analyze_cash_yield return undef or the empty list for failure. If they return a false but defined value, then the perform {...} block is still triggered.

I once wrote Scalar::Andand for this purpose, though I consider it a horrible hack and don't really recommend it to anyone.

About this Entry

This page contains a single entry by chromatic published on February 3, 2012 12:46 PM.
