Inadvertent Inconsistencies: each versus Autoderef

Perl 5.12 allows you to use each, keys, and values on arrays. Perl 5.14 will automatically dereference references used as operands to the aggregate operators. The combination produces a worrisome inconsistency.

Perl 5.12's each had no obvious inconsistency problem; you had to write each @$kittens or each @{ $kittens } when using an array reference as its operand. Sure, you could write each %{ $kittens } when $kittens holds an array reference, but you'd get a runtime error, just as you would for dereferencing the wrong type of reference anywhere else.
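As a minimal sketch of that Perl 5.12 behavior (the $kittens data here is made up for illustration), explicit dereferencing names the type of the operand, and dereferencing as the wrong type fails at run time:

```perl
use strict;
use warnings;
use feature 'say';

# $kittens is a hypothetical array reference used only for illustration.
my $kittens = [qw( Jack Brad Lucky )];

# each on an array returns (index, value) pairs; the explicit @$kittens
# states that the operand is an array.
while (my ($index, $name) = each @$kittens)
{
    say "$index: $name";
}

# keys yields the indices and values yields the elements.
say join ', ', keys   @{ $kittens };
say join ', ', values @{ $kittens };

# Dereferencing an array reference as a hash dies at run time.
my $ok    = eval { my @pair = each %{ $kittens }; 1 };
my $error = $@;
say "Error: $error" unless $ok;
```

Nothing here relies on autodereferencing, so the sigil always tells the reader which kind of aggregate is in play.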

With Perl 5.14, you have the curious situation where it's possible to give one of these polymorphic aggregate operators an operand which can behave both as a hash and as an array. By overloading an object, you can make it respond to array operations, or hash operations, or both.

If you use one of these objects as the operand to each, keys, or values, what is Perl to do? It's easy to test:

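use Modern::Perl;

package DestroyerOfHope;

# Overload both dereferencing operators so that one object
# can act as a hash and as an array.
use overload
    '%{}' => \&gethash,
    '@{}' => \&getarray;

sub new
{
    my $class = shift;
    bless [qw( I am an array )], $class;
}

# The unary plus makes the braces unambiguously an anonymous hash.
sub gethash  { +{ I => 'hash' } }

# Returning the object itself uses the underlying blessed array directly.
sub getarray { $_[0] }

package main;

my $d = DestroyerOfHope->new;

# Autodereferencing: should $d act as a hash or as an array here?
say each $d;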

As of Perl 5.14, you get a runtime error "Type of argument to each on reference must be unblessed hashref or arrayref...". (The rationale was partly "Uh oh, this could go wrong!" and partly "Why would you want to iterate over something blessed?" The latter seems to me to ignore the fact that blessing is the only way to produce this kind of desirable overloading, but that's an argument for another time.)

While that decision certainly closes the door on this type of error, it's hardly the only way to solve this inconsistency. I see five other options:

Forbid autodereferencing on operands with any overloading
Always choose one overloading over the other (array always wins! hash always wins!), preferably producing a run-time warning
Forbid autodereferencing on operands with both types of overloading, giving a run-time error
Forbid autodereferencing with each, keys, and values
Revert the polymorphism of each, keys, and values
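To make the third option concrete, here is a rough user-level sketch of the check it implies. The helper names (deref_overloads, assert_unambiguous) and the two example classes are hypothetical; the only library call assumed is overload::Method, which returns the implementation of an overloaded operator or undef. This is not how perl itself implements anything:

```perl
use strict;
use warnings;
use feature 'say';
use overload ();                # load the module without overloading anything
use Scalar::Util 'blessed';

# Hypothetical helper: list the dereferencing overloads an operand provides.
sub deref_overloads
{
    my $operand = shift;
    return unless blessed $operand;
    return grep { overload::Method( $operand, $_ ) } '@{}', '%{}';
}

# Hypothetical policy check: die when an operand overloads both operators.
sub assert_unambiguous
{
    my $operand   = shift;
    my @overloads = deref_overloads( $operand );
    die "Ambiguous operand: overloads both @overloads\n" if @overloads > 1;
    return $operand;
}

{
    package BothWays;
    use overload
        '@{}' => sub { [ 1, 2, 3 ] },
        '%{}' => sub { +{ I => 'hash' } };
    sub new { my $class = shift; my $data; return bless \$data, $class }
}

{
    package ArrayOnly;
    use overload '@{}' => sub { [qw( only an array )] };
    sub new { my $class = shift; my $data; return bless \$data, $class }
}

my $ok = eval { assert_unambiguous( BothWays->new ); 1 };
say "Rejected: $@" unless $ok;

say join ' ', @{ assert_unambiguous( ArrayOnly->new ) };
```

An operand with a single dereferencing overload passes through untouched; only the genuinely ambiguous case produces an error.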

Keeping the existing behavior is probably the easiest, but it has two problems. First, it's inconsistent with Perl's nature. Sure, Perl deserves opaque objects, but what we have now are blessed references. Why are some references autodereferenceable and others not (especially in the presence of overload)? Second, the existing behavior papers over a real problem. The interaction of these two features is inconsistent because one of the features ignores a longstanding design principle of Perl.

The real problem was making each, keys, and values work on arrays as well as hashes.

I understand the desire to make this feature work. It's easy to say "I want something like each that works on arrays!" The obvious next step is to expand that feature to include other hash aggregate operators. (The pursuit and implementation of a small consistency is easy. The pursuit and implementation of a language-wide consistency is very difficult.)

It's also much easier to hang new behavior off of existing keywords than it is both to find the right new keyword and to add a new keyword (adding new keywords is a perilous process). Would you want to type while (my ($index, $value) = arrayeach $kittens) { ... } every time you wanted to iterate over an array and get its index and value? Probably me neither.
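For comparison, a separate array iterator doesn't strictly need a new keyword. The following is only a sketch of a user-level stand-in; array_each is a hypothetical helper, not a builtin, and it returns a closure rather than keeping an iterator inside the array the way each does:

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper, not a Perl builtin: returns an iterator which
# yields (index, value) pairs and an empty list when the array runs out.
sub array_each
{
    my $array = shift;
    my $index = 0;

    return sub
    {
        return if $index >= @$array;
        my $current = $index++;
        return ( $current, $array->[$current] );
    };
}

# Illustrative data only.
my $kittens = [qw( Jack Brad Lucky )];
my $next    = array_each( $kittens );

while (my ($index, $value) = $next->())
{
    say "$index: $value";
}
```

It's more typing than each, which is rather the point of the paragraph above.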

Yet the problem remains. By making each, keys, and values polymorphic with respect to the types of their operands, Perl 5 has removed its ability to provide greater consistency across the language. (That consistency isn't just for the compiler; it's for people reading the code.)

The purest response, from the point of view of language design, is to deprecate the use of hash aggregates on anything but hashes and to find new keywords to perform the same functions on arrays. Enabling the feature set of Perl 5.12 or Perl 5.14 (or, by now, Perl 5.16) could re-enable this polymorphic behavior, but p5p could contain the damage to those releases alone and provide better options in the future.

The practical response is to acknowledge yet another wart on the language and keep the existing runtime error.

In user code, the best option is probably to avoid autodereferencing altogether, as tempting as it seems. (This is a controversial statement, but I believe it's better to resist a feature when the human brain's desire for pattern recognition and consistency may lead you down a path to using the inconsistent operators, and then where will you be?)
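A sketch of what that advice looks like with the DestroyerOfHope class from earlier (restated here so the example stands alone, with core pragmas in place of Modern::Perl): an explicit sigil tells both Perl and the reader which overload you mean.

```perl
use strict;
use warnings;
use feature 'say';

{
    package DestroyerOfHope;

    use overload
        '%{}' => \&gethash,
        '@{}' => \&getarray;

    sub new
    {
        my $class = shift;
        return bless [qw( I am an array )], $class;
    }

    sub gethash  { return +{ I => 'hash' } }
    sub getarray { return $_[0] }
}

my $d = DestroyerOfHope->new;

# @$d invokes the '@{}' overload; each returns an (index, value) pair.
my @array_pair = each @$d;

# %$d invokes the '%{}' overload; each returns a (key, value) pair.
my @hash_pair = each %$d;

say "As an array: @array_pair";
say "As a hash:   @hash_pair";
```

With the dereference spelled out, there is no ambiguity for Perl to reject.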

What's the solution in the future to avoid further inconsistencies like this? Always hew to Perl 5's fundamental principles. (Note that the biggest problem with the controversial and soon-to-be-bowdlerized smartmatch operator is that it also is a polymorphic operator and no one can memorize exactly what it does in every common situation, let alone every edge case.)

2 Comments

joel-a-berger [launchpad.net] | March 23, 2012 5:33 PM

I was excited for the auto-dereference feature until it bit me in an odd way. I know I've blogged about this once before; if you've seen this already then I apologize.

My problem was that when writing a module (targeting back at least as far as 5.10 if not farther), I ACCIDENTALLY auto-dereferenced a variable ($var rather than %$var). One little character, and the kicker was that the code ran fine on my box. It wasn't until it got to CPANtesters that I discovered the problem. I love the concept, but in practice there seem to be all kinds of strange edge cases. Thanks for highlighting another one.

https://me.yahoo.com/a/7FIfCvBhrdc66LF7ls8WriYMbY02uO8I#36749 | March 26, 2012 6:01 AM

I'm a little concerned about the auto-dereferencing feature. As the first commenter notes, it's not too difficult to accidentally trigger, and the engineer side of me wonders what problem it's trying to solve.

If $foo is an arrayref, I'm perfectly happy to write @{$foo} to dereference it. To have dereferencing happen magically, just to save me typing three characters (please don't tell me that's the problem it's trying to solve), is dangerous. It can also be potentially misleading for the next person to read the code.

And yes, I prefer @{$foo} rather than @$foo, as a strong visual reminder that there's some casting going on. I prefer to write code that's as clear to follow as possible.

Doing The Appropriate Thing in The Appropriate Context is all well and good, but in this case, I think it's getting to be a little too clever.

--talexb
