Perl 5.12 allows you to use each, keys, and values on arrays. Perl 5.14 will automatically dereference references used as operands to the aggregate operators. The combination produces a worrisome inconsistency.
Perl 5.12's each had no obvious inconsistency problem; you had to write each @$kittens or each @{ $kittens } when using an array reference as its operand. Sure, you could write each %{ $kittens } when $kittens holds an array reference, but you'd get an error when the program runs, just as you would for dereferencing the wrong type of reference anywhere else.
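A minimal sketch of both cases, assuming Perl 5.12 or newer and some made-up kitten names (the names and variables here are illustrative only). It uses explicit dereferencing throughout, so it does not depend on the 5.14 autodereferencing behavior:

```perl
use 5.012;
use strict;
use warnings;

# Illustrative data; the names are assumptions
my $kittens = [qw( Jack Brad Mars )];

# each on an explicitly dereferenced array returns (index, value) pairs
my @pairs;
while (my ($index, $name) = each @$kittens) {
    push @pairs, "$index=$name";
}
print "@pairs\n";

# Dereferencing the array reference as a hash fails when the program runs
my $ok    = eval { my @pair = each %{ $kittens }; 1 };
my $error = $ok ? '' : $@;
print "each %{ \$kittens } failed: $error";
```

The sigil in the explicit dereference tells both perl and the reader which kind of aggregate you expect, which is exactly the information autodereferencing throws away.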
With Perl 5.14, you have the curious situation where it's possible to give one of these polymorphic aggregate operators an operand which can behave both as a hash and as an array. By overloading an object, you can make it respond to array operations, or hash operations, or both.
If you use one of these objects as the operand to each, keys, or values, what is Perl to do? It's easy to test:
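    use Modern::Perl;

    package DestroyerOfHope;

    # respond to both hash and array dereferencing
    use overload
        '%{}' => \&gethash,
        '@{}' => \&getarray;

    sub new
    {
        my $class = shift;
        bless [qw( I am an array )], $class;
    }

    # the explicit return keeps the inner braces from reading as a block
    sub gethash  { return { I => 'hash' } }

    # returning the object itself uses the underlying array directly
    sub getarray { $_[0] }

    package main;

    my $d = DestroyerOfHope->new;
    say each $d;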
As of Perl 5.14, you get a runtime error "Type of argument to each on reference must be unblessed hashref or arrayref...". (The rationale was partly "Uh oh, this could go wrong!" and partly "Why would you want to iterate over something blessed?" The latter seems to me to ignore the fact that blessing is the only way to produce this kind of desirable overloading, but that's an argument for another time.)
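For contrast, here is a sketch of the same class used with explicit dereferencing, assuming Perl 5.12 or newer. The variable names in the main package are mine, not part of the original test. With a sigil in place there is nothing to guess: each dereference picks exactly one overload.

```perl
use 5.012;
use strict;
use warnings;

package DestroyerOfHope;

use overload
    '%{}' => \&gethash,
    '@{}' => \&getarray;

sub new      { my $class = shift; return bless [qw( I am an array )], $class }
sub gethash  { return { I => 'hash' } }
sub getarray { return $_[0] }    # returning the object uses its array directly

package main;

my $d = DestroyerOfHope->new;

my @as_array = @$d;         # goes through the '@{}' overload
my %as_hash  = %$d;         # goes through the '%{}' overload
my @indices  = keys @$d;    # array indices
my @hashkeys = keys %$d;    # hash keys

print "array: @as_array\n";
print "indices: @indices\n";
print "hash keys: @hashkeys\n";
```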
While that decision certainly closes the door on this type of error, it's hardly the only way to solve this inconsistency. I see five other options:
- Forbid autodereferencing on operands with any overloading
- Always choose one overloading over the other (array always wins! hash always wins!), preferably producing a run-time warning
- Forbid autodereferencing on operands with both types of overloading, giving a run-time error
- Forbid autodereferencing with each, keys, and values
- Revert the polymorphism of each, keys, and values
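To make the third option concrete, here is a user-level sketch of the check it implies. This is not how perl implements anything; aggregate_kind and the Both class are hypothetical names invented for illustration, and the sketch assumes that a blessed reference without the relevant overloads should fall back to its underlying reference type (which is not what Perl 5.14 does).

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed reftype );
use overload ();

# Hypothetical helper: decide how an operand would be dereferenced,
# refusing operands that overload both array and hash dereferencing.
sub aggregate_kind {
    my $operand = shift;

    if (blessed $operand) {
        my $as_array = overload::Method($operand, '@{}');
        my $as_hash  = overload::Method($operand, '%{}');

        die "Ambiguous operand: both array and hash dereferencing are overloaded\n"
            if $as_array && $as_hash;
        return 'array' if $as_array;
        return 'hash'  if $as_hash;
    }

    # No relevant overloading: fall back to the underlying reference type
    my $type = reftype($operand) // '';
    return 'array' if $type eq 'ARRAY';
    return 'hash'  if $type eq 'HASH';
    die "Operand is neither an array nor a hash reference\n";
}

{
    # Hypothetical class that overloads both dereferencing operators
    package Both;
    use overload
        '@{}' => sub { return [] },
        '%{}' => sub { return {} };
    sub new { my $class = shift; my $scalar = ''; return bless \$scalar, $class }
}

print aggregate_kind([ 1, 2, 3 ]), "\n";
print aggregate_kind({ a => 1 }), "\n";
print "refused: $@" unless eval { aggregate_kind(Both->new); 1 };
```

The point of the sketch is that the ambiguity only exists when both overloads are present, so an error could be limited to that one case.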
Keeping the existing behavior is probably the easiest, but it has two problems. First, it's inconsistent with Perl's nature. Sure, Perl deserves opaque objects, but what we have now are blessed references. Why are some references autodereferenceable and others not (especially in the presence of overload)? Second, the existing behavior papers over a real problem. The interaction of these two features is inconsistent because one of the features ignores a longstanding design principle of Perl. The real problem was making each, keys, and values work on arrays as well as hashes.
I understand the desire to make this feature work. It's easy to say "I want something like each that works on arrays!" The obvious next step is to expand that feature to include other hash aggregate operators. (The pursuit and implementation of a small consistency is easy. The pursuit and implementation of a language-wide consistency is very difficult.)
It's also much easier to hang new behavior off of existing keywords than it is both to find the right new keyword and to add a new keyword (adding new keywords is a perilous process). Would you want to type while (my ($index, $value) = arrayeach $kittens) { ... } every time you wanted to iterate over an array and get its index and value? Probably not. Me neither.
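You can approximate an array-only iterator in user code without any new keyword. The following is a rough sketch, not a proposal for the core; array_iterator is a hypothetical name, and the kitten names are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: return a closure which yields (index, value)
# pairs for an array reference and an empty list when exhausted.
sub array_iterator {
    my $array = shift;
    my $index = 0;

    return sub {
        return if $index > $#$array;
        my @pair = ($index, $array->[$index]);
        $index++;
        return @pair;
    };
}

my $kittens = [qw( Jack Brad Mars )];
my $next    = array_iterator($kittens);
my @seen;

while (my ($index, $name) = $next->()) {
    push @seen, "$index:$name";
}

print "@seen\n";
```

Unlike each, the closure keeps its position in its own variable rather than in the array, so two iterators over the same array don't interfere with each other.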
Yet the problem remains. By making each, keys, and values polymorphic with respect to the types of their operands, Perl 5 has removed its ability to provide greater consistency across the language. (It's not just for the compiler; it's for people reading the code.)
The purest response, from the point of view of language design, is to deprecate the use of hash aggregates on anything but hashes and to find new keywords to perform the same functions on arrays. Enabling the feature set of Perl 5.12 or Perl 5.14 (or, by now, Perl 5.16) could re-enable this polymorphic behavior, but p5p could contain the damage to those releases alone and provide better options in the future.
The practical response is to acknowledge yet another wart on the language and keep the existing run-time error.
In user code, the best option is probably to avoid autodereferencing altogether, even as tempting as it seems. (This is a controversial statement, but I believe it's probably better to avoid the temptation to use a feature when the human brain's desire for pattern recognition and consistency may lead you down a path to using the inconsistent operators, and then where will you be?)
What's the solution in the future to avoid further inconsistencies like this? Always hew to Perl 5's fundamental principles. (Note that the biggest problem with the controversial and soon-to-be-bowdlerized smartmatch operator is that it also is a polymorphic operator and no one can memorize exactly what it does in every common situation, let alone every edge case.)
I was excited for the auto-dereference feature until it bit me in an odd way. I know I've blogged about this once before; if you've seen this already then I apologize.
My problem was that when writing a module (targeting back at least as far as 5.10 if not farther), I ACCIDENTALLY auto-dereferenced a variable ($var rather than %$var). One little character, and the kicker was that the code ran fine on my box. It wasn't until it got to CPANtesters that I discovered the problem. I love the concept, but in practice there seem to be all kinds of strange edge cases. Thanks for highlighting another one.
I'm a little concerned about the auto-dereferencing feature. As the first commenter notes, it's not too difficult to accidentally trigger, and the engineer side of me wonders what problem it's trying to solve.
If $foo is an arrayref, I'm perfectly happy to write @{$foo} to dereference it. To have dereferencing happen magically, just to save me typing three characters (please don't tell me that's the problem it's trying to solve), is dangerous. It can also be potentially misleading for the next person to read the code.
And yes, I prefer @{$foo} rather than @$foo, as a strong visual reminder that there's some casting going on. I prefer to write code that's as clear to follow as possible.
Doing The Appropriate Thing in The Appropriate Context is all well and good, but in this case, I think it's getting to be a little too clever.
--talexb