The Problems with Indirect Object Notation

This excerpt from Modern Perl: the book discusses another feature of Perl 5 which makes parsing Perl 5 difficult. Avoiding this feature in your own code will make it more reliable and easier to debug.

Read a few Perl 5 object tutorials (or the documentation of too many CPAN modules), and you might believe that new is a language keyword just as in C++ and Java:

    my $q = new CGI; # DO NOT USE

As the discussion of objects has made clear, a constructor in Perl 5 is anything which returns an object. By convention, constructors are class methods named new(), but you have the flexibility to choose a different approach to meet your needs. Because new() is an ordinary class method rather than a keyword, the standard method call syntax applies:

    my $q = CGI->new();

These syntaxes are equivalent in behavior, except when they're not.

The first form is the indirect object form (more precisely, the dative case), where the verb (the method) precedes the noun to which it refers (the object). This is fine in spoken languages, but it introduces difficult-to-debug ambiguities in Perl 5.
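
As a minimal sketch of the two call forms, the following uses a hypothetical Counter class (not part of the original text) to show both spellings reaching the same constructor when nothing interferes with the parser's guess. It assumes the indirect syntax is enabled, which is the default unless the code opts in to a newer feature bundle such as use v5.36:

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only.
package Counter {
    sub new {
        my ($class, %args) = @_;
        return bless { count => $args{start} // 0 }, $class;
    }

    sub count { $_[0]{count} }
}

package main;

# Indirect object form: method name first, then the class. DO NOT USE.
my $indirect = new Counter( start => 5 );

# Direct form: the invocant is explicit, so the parser has nothing to guess.
my $direct = Counter->new( start => 5 );

printf "%s %d\n", ref($indirect), $indirect->count;
printf "%s %d\n", ref($direct),   $direct->count;
```

The two calls agree here only because no subroutine named new or Counter is visible in the calling package when the line is compiled.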

Bareword indirect invocations

One problem is that the name of the method is a bareword, requiring the Perl 5 parser to perform several heuristics to determine the proper interpretation. While these heuristics are well-tested and almost always correct, their failure modes can be very confusing and difficult to debug. Worse, they're fragile in the face of the order of compilation and module loading.

Parsing is more difficult for humans and the computer when the constructor takes arguments. The Java-style approach may resemble:

    # DO NOT USE
    my $obj = new Class( arg => $value );

... thus making the classname Class look like a subroutine call. Perl 5 can disambiguate many of these cases, but its heuristics depend on which package names the parser has seen so far, which barewords it has already resolved (and how it resolved them), and the names of subroutines already declared in the current package.

Imagine a subroutine with a prototype whose name happens to collide with the name of a class or of a method called indirectly. This happens infrequently, but it's difficult enough to debug that it's worth making impossible by avoiding this syntax.

Indirect notation scalar limitations

Another danger of the syntax is that the parser expects a single scalar expression as the object. You may have had trouble printing to a filehandle stored in an aggregate variable:

    # DOES NOT WORK AS WRITTEN
    say $config->{output} "This is a diagnostic message!";

print, close, and say -- all keywords which operate on filehandles -- operate in an indirect fashion. This was fine when filehandles were package globals, but with lexical filehandles the problem is more apparent: in the example above, Perl 5 tries to call the say method on the $config object. The solution is to disambiguate the expression which produces the intended invocant:

    say {$config->{output}} "This is a diagnostic message!";
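
A small runnable sketch of the block form follows. The $config hash and the in-memory filehandle are assumptions made for illustration, chosen so that the written output is easy to inspect:

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical configuration holding a lexical filehandle. The handle writes
# to an in-memory buffer so the output can be examined afterwards.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
my $config = { output => $fh };

# The block tells the parser exactly which expression yields the filehandle.
say   { $config->{output} } "This is a diagnostic message!";
print { $config->{output} } "No trailing newline here";

close( $config->{output} ) or die "Cannot close handle: $!";
```

The same block syntax works for any expression that produces a filehandle, such as an array element or a method call result.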

Alternatives to indirect notation

Direct invocation notation does not suffer this ambiguity problem. To construct an object, call the constructor method on the class name directly:

    my $q   = CGI->new();
    my $obj = Class->new( arg => $value );

For filehandle operations, which are limited, known to the Perl 5 parser directly, and pervasive in their idiomatic use of the dative case, use curly brackets to remove ambiguity about your intended invocant. Alternatively, consider loading the core IO::Handle module, which allows you to perform IO operations by calling methods on filehandle objects (such as lexical filehandles).
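
The IO::Handle approach might look like the following sketch. As before, the $config hash and the in-memory filehandle are hypothetical stand-ins; say, printf, and close are methods provided by IO::Handle:

```perl
use strict;
use warnings;
use IO::Handle;

my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
my $config = { output => $fh };

# With IO::Handle loaded, the handle accepts ordinary method calls, so the
# invocant can be any expression without special bracketing.
$config->{output}->say('This is a diagnostic message!');
$config->{output}->printf( "%d warnings\n", 2 );
$config->{output}->close or die "Cannot close handle: $!";
```

Because these are plain method calls, they follow the same direct invocation rules as Class->new().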

To identify indirect calls in your code, use the CPAN module Perl::Critic::Policy::Dynamic::NoIndirect (a plugin for Perl::Critic). To forbid their use at compile time, use the CPAN module indirect.
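
Perl 5.32 and later, which postdate this text, also ship a core indirect feature that can be switched off for a similar compile-time check without a CPAN dependency. The sketch below is an addition to the advice above rather than part of the original recommendation, and the Counter class is hypothetical. It compiles each snippet in a string eval so that a compile error can be inspected instead of aborting the program:

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only.
package Counter {
    sub new {
        my ($class, %args) = @_;
        return bless { %args }, $class;
    }
}

package main;

# With the feature disabled, the indirect call no longer compiles.
# Requires Perl 5.32 or later.
my $compiled = eval q{
    no feature 'indirect';
    my $obj = new Counter( start => 1 );
    1;
};
print "indirect call rejected at compile time\n" unless $compiled;

# The direct form is unaffected by the feature setting.
my $ok = eval q{
    no feature 'indirect';
    my $obj = Counter->new( start => 1 );
    1;
};
print "direct call still compiles\n" if $ok;
```

The feature is lexically scoped, so it can be disabled file by file while older code is cleaned up.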

5 Comments

matt_trout | August 22, 2009 10:06 AM

And of course there's already an explanation of -how- perl disambiguates and an example of the ways it can screw you up in the "indirect but still fatal" post on my blog

SebastianLikesChaosInVacuum | August 22, 2009 3:48 PM

There is a way to avoid the ambiguity. I found this in the camel book.

    my $object = new Class::($args);

Therefore I see no technical reason not to use this form when no method dispatching is wanted or needed. But beginners should know, and be taught, the differences between the two forms.

http://www.flickr.com/photos/markstos | August 22, 2009 8:00 PM

Thanks for this reminder. I thought I had purged indirect object notation from CGI.pm in the 3.43 release, but there were clearly plenty of cases left to address.

I've patched those now, and they will be removed from the next release. Unfortunately, it seems too late to get this update into 5.10.1, so it will be 5.10.2 when the change appears in the core.

Reference:
http://github.com/markstos/CGI.pm/commit/a36e451716f8cee8ac02376617bb98a33d5ac9c0

https://me.yahoo.com/a/cGs4PI5l1OXfE.Ztq8nXfQ35Nw--#00721 | August 27, 2009 9:06 PM

It's too bad that the word "indirect" was assigned to this notation. A high value name for a notation that we typically don't use. I understand the linguistic reason for the usage, but programmers have their own use for the word "indirect". It'd be nice to have a concise description of the Class->$method(@args) style notation.

james2vegas.myopenid.com | February 18, 2011 2:24 AM

As mentioned elsewhere:

    use Class;
    sub Class {
        warn 'Called Class sub not Class package';
        'Class'
    }
    my $q = Class->new;   # calls the Class sub above
    my $s = new Class;    # throws a 'Bareword found where operator expected' error
    my $t = Class::->new; # this works
    my $u = new Class::;  # this also works (even with sub main in the current package)
