When Do You Report Semantics Errors?

I haven't commented on David Golden's work to allow references as the first operands to push and pop because I have mixed feelings about the feature. The simple explanation is that Perl 5.12 requires an explicit array as the first operand of both keywords:

push @some_array, $some_scalar;

my $other_scalar = pop @{ $some_array_reference };

With David's changes, you will also be able to write this code for Perl 5.14:

push $some_array_reference, $some_scalar;

my $other_scalar = pop $another_array_reference;

I like this change for a couple of reasons. First, it reduces the visual clutter of dereferencing. I've never cared for Perl 5's dereferencing syntax (it may be my least favorite syntactic part of the language), and David's right that it's unnecessary in many cases here. Second, this change improves consistency: all of these array-manipulating functions obviously operate on containers, not on the values in those containers. That is to say, pushing onto or shifting from an array modifies the contents of the array as a whole. It doesn't coerce those contents into a list and perform a transformation on the list. I like being able to say, in effect, "It doesn't matter what kind of syntactic element represents an array (a bare array, a reference, or even something with tied array-like magic); as long as it behaves properly, that's the right thing to use here." (See also The Why of Perl Roles for an exploration of this design principle.)

Even so, I perceive a lessening of compile-time safety, even as I know that's probably an illusion.

If I write:

my @array = qw( some values here );
push @array, 'some other value';

... everything is fine. @array is obviously an array and is obviously correct. Yet if I write:

my @array = qw( some values here );
push @rray, 'some other value';

... then strict will catch the typo. @rray is obviously an array, but it's (probably) not present in the program at this point. So far so good. Yet if I write with 5.12:

my @array = qw( some values here );
push $array, 'some other value';

... that's obviously an error, because @array and $array are different variables. (The latter is most likely a typo, but that's just the peculiarly silly naming convention of this example, so don't count on it.)
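To see the compile-time half of this for yourself, here is a minimal sketch that compiles snippets with a string eval and never runs them. The helper name compiles_ok is made up for this illustration, and the sketch only exercises the @rray typo under strict, which behaves the same way on any Perl 5 release I know of.

```perl
use strict;
use warnings;

# Hypothetical helper for this illustration: returns 1 if the snippet
# compiles under strict, 0 otherwise.
sub compiles_ok {
    my ($source) = @_;

    # Wrapping the snippet in an anonymous sub means it is compiled
    # but never executed, so only compile-time errors can show up.
    my $code = eval "use strict; sub { $source }";
    return defined $code ? 1 : 0;
}

my $good = compiles_ok(q{ my @array = qw( some values here ); push @array, 'some other value'; });
my $typo = compiles_ok(q{ my @array = qw( some values here ); push @rray,  'some other value'; });

print "correct name compiles: $good\n";
print "typo compiles:         $typo\n";
```

The second snippet never gets a chance to run; strict rejects the undeclared @rray while compiling.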

With 5.14, I could also write:

my @array = qw( some values here );
push $hash_ref, 'some other value';

... and the error won't be visible until runtime, when push realizes its first operand isn't anything array-like at all. Of course, nothing prevents me from making a similar typo with Perl 5.12:

my @array = qw( some values here );
push @{ $hash_ref }, 'some other value';

... where the difference is a bit of extra (and, to my mind, somewhat ugly) syntax.
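Here is a small sketch of that run-time failure using the explicit dereference, so it does not depend on the new syntax at all. The contents of $hash_ref are invented for the example; the point is only that the code compiles cleanly and fails when the push actually executes.

```perl
use strict;
use warnings;

# Made-up data for the example: a hash reference, not an array reference.
my $hash_ref = { key => 'value' };

# This compiles without complaint. Perl cannot know at compile time
# what kind of reference $hash_ref will hold.
my $push = sub { push @{ $hash_ref }, 'some other value' };

# The block eval traps the run-time exception instead of letting
# the whole program die.
my $ok    = eval { $push->(); 1 };
my $error = $@;

print "push succeeded: ", ( $ok ? "yes" : "no" ), "\n";
print "error: $error";
```

The exception only appears when the anonymous sub is called, which is exactly the timing I'm uneasy about.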

I know this isn't a disadvantage peculiar to the change, and I like the change overall, but it still seems to me to trade a little bit of compile-time safety for the potential for run-time errors. I'll use it and I'll get used to it and I'm certain I'll like it, especially with complex data structures. Perhaps it's my familiarity with explicit dereferencing that gives the illusion of compile-time safety.

(You can use Vincent Pit's autovivification pragma to avoid some of the potential damage of mistakes with explicit or implicit dereferencing, but again that's not a compile-time fix. I suspect what I really want here is Perl 6's gradual typing system.)
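For anyone unsure what kind of damage I mean, here is a sketch in core Perl only of the behavior that pragma guards against. The %config structure and its keys are invented for the example. I've deliberately left out the pragma itself (you enable it with "no autovivification;" after installing it from CPAN) so that the code runs without any non-core modules.

```perl
use strict;
use warnings;

# Invented data structure for the example, empty on purpose.
my %config;

# Dereferencing a missing slot as an array silently creates the array.
push @{ $config{plugins} }, 'logger';

# Even an innocent-looking existence test creates the intermediate hash
# at $config{server}, though it never creates the 'port' key itself.
my $has_port = exists $config{server}{port};

print join( ', ', sort keys %config ), "\n";
```

Neither line is an error as far as Perl is concerned, which is how a mistyped key or the wrong reference can quietly grow new structure instead of failing.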

4 Comments

Caleb Cushing ( xenoterracide ) | November 29, 2010 1:54 PM

I kinda wish that there was some more obvious syntax for references... that wasn't so verbose. Maybe ?value or something, something that made it obvious I was looking at a reference in the code and not a normal scalar. @{ $value } does that, but uglily...

brian.d.foy.myopenid.com | November 29, 2010 4:00 PM

This is one of the points I (probably unsuccessfully) made in What do you care what type it is?. I'm not picking one side or the other, but this changes the notion of when you know something and when Perl can tell you that you messed up. Since it does that, I imagine new sorts of misuses and errors will come out of it.

dagolden.com | November 29, 2010 7:01 PM

I understand the concerns that people raise about the timing of errors. It's worth noting that for the push case (and for other array container functions), the internal implementation is exactly the same as if the dereference were explicit. (They produce the same op-tree.)

Assuming that you're running under strictures, that means that the only new situation possible is pushing to the wrong declared variable -- like pushing to $hash instead of to $list. And even that isn't really new. I would imagine that doing push @list2, $thing when one meant push @list1, $thing happens far more often in practice than push $hash, $thing ever does. You're more likely to make a typo between names of the same type than to swap in a variable of a completely different type, particularly if your names are clear and distinct. And if you really think you're prone to that sort of mistake, you can always dereference explicitly.

stevenharyanto.myopenid.com | November 29, 2010 10:07 PM

I mostly differentiate variables using plural forms, e.g. $thing vs @things and %things, as I don't want to be too dependent on Perl's prefixes when coding in other languages.

But *occasionally* I do also write $things and @things/%things. And the funny thing is, whenever I do this, it's usually in the quick-and-dirty code that's quite convoluted and needs to be refactored anyway.

So I'm still favoring the new feature. The dereferencing syntax in Perl *is* really ugly :-)
