UNIVERSAL and API Decisions

I explained some of the troubles with Perl 5's UNIVERSAL in Is It, Can It, Does It, and Robust Perl 5 OO. More — and subtler — problems lurk.

A common antipattern in Perl 5 APIs is to allow spectacular flexibility in argument passing. It's too common to see functions which take some kind of reference and manually switch based on the argument type. (Some APIs do need this flexibility, such as a pretty printer for nested data structures or a serializer. Even so, languages with support for multiple dispatch or pattern matching of the non-regex kind provide much cleaner, simpler, and more correct code. I take advantage of this feature all of the time in Perl 6.) An example might be:

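use feature 'switch';    # given/when needs Perl 5.10+ (experimental since 5.18)

sub my_awesome_api_does_everything
{
    my $arg = shift;

    # dispatch on the kind of reference passed in
    given (ref($arg))
    {
        when ('ARRAY')  { ... }
        when ('SCALAR') { ... }
        when ('HASH')   { ... }
    }
}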

That can be messy, and better API design can ameliorate that in many cases. OO fans may already have looked up Replace Conditional with Polymorphism to post in the little comment box, but that occasionally runs into the "is it a primitive or an object" conundrum that multi-paradigm languages provide.

Yes, you can make my_awesome_api_does_everything() into a method on a data type, but unless you've steadfastly avoided primitive obsession in your code, you'll have circumstances where you want to pass in a simple data structure instead of an object and vice versa.

The so-called solution of extension methods on base types has its own problems. Globals are tricky, even if they're namespaced methods: anything you could possibly want gets crammed into poor Array, potentially multiple and conflicting times. (If you turn your head the right direction, it's obvious that only a PHP programmer could have created Ruby on Rails; there but for namespaces and some degree of encapsulation....)

The real problem is that you have to manage all of that complexity somewhere. In the absence of a sane API which refuses to handle that complexity (usually the best solution) and language features which hide that complexity for you (and have plenty of experience from plenty of real programs and a copious test suite to ensure that correctness), you will get it wrong in myriad, subtle, conflicting ways.

For example, consider a bog-standard blessed hash Perl 5 object. If you pass it into my_awesome_api_does_everything(), what happens? What should happen?

You can argue that the switch statement needs another case to handle the object. You can also argue that the HASH approach should suffice. Yet accessing the object as a hash breaks encapsulation. Adding another case makes the code a little bit less maintainable (the combinations increase, adding to the complexity of this code, to say nothing of the additional documentation and testing requirements).
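To see what actually happens, here is a minimal sketch. The My::Point class and the classify() function are hypothetical names invented for this illustration; classify() stands in for the switch above and dispatches on ref() alone.

```perl
use strict;
use warnings;

# Hypothetical class for illustration; not part of the original example.
package My::Point;

sub new
{
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

package main;

# Dispatch on ref() alone, mirroring the switch in the example API.
sub classify
{
    my $arg  = shift;
    my $type = ref $arg;

    return 'array'  if $type eq 'ARRAY';
    return 'scalar' if $type eq 'SCALAR';
    return 'hash'   if $type eq 'HASH';
    return 'unhandled';
}

my $plain  = { x => 1, y => 2 };
my $object = My::Point->new( x => 1, y => 2 );

print ref($plain),       "\n";   # HASH
print ref($object),      "\n";   # My::Point
print classify($plain),  "\n";   # hash
print classify($object), "\n";   # unhandled
```

For a blessed reference, ref() returns the class name rather than the underlying storage type, so the object falls through every case. Whether it should be treated as a hash or handled in a case of its own is exactly the design question at hand.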

On the other side, the more genericity and polymorphism you can support in your APIs, the less the coupling of your program in general and the better the reuse possibilities. The ultimate goal of maintaining a system is to be able to delete code while adding features and removing bugs. Net negative SLOC production is wonderful.

You want to avoid unnecessary data conversions to use the API, but you also want to maintain safety and correctness. You don't want to rule out useful behavior, but you want to enforce consistency.

ref() is almost never the answer here, but at least it's not actively misleading like the code people often use instead, when they realize that ref() doesn't answer the interesting question.
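As a rough illustration of why ref() doesn't answer the interesting question, this sketch separates the two facts it folds into one return value. It uses blessed() and reftype() from the core Scalar::Util module. The Counter class is hypothetical, and the sketch only shows what information is available; it doesn't claim that either function is the right check for any particular API.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed reftype );

# Hypothetical class for illustration only.
package Counter;

sub new { return bless { count => 0 }, shift }

package main;

my $plain  = { count => 0 };
my $object = Counter->new;

for my $ref ($plain, $object)
{
    # ref() gives the class name if blessed, the storage type if not;
    # blessed() and reftype() report those two facts separately.
    printf "ref: %s, blessed: %s, reftype: %s\n",
        ref($ref),
        blessed($ref) // '(none)',
        reftype($ref);
}
```

For the plain hash, ref() and reftype() both report HASH and blessed() returns undef. For the object, ref() and blessed() both report Counter while reftype() still reports HASH. Knowing the storage type still says nothing about whether poking inside the object is a good idea.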

When the right answer for the API is to poke in the reference as if it were a hash, you often see this bad code:

if (UNIVERSAL::isa( $ref, 'HASH' )) { ... } # buggy; do not use

The reason this is wrong is subtler than you may think; part of the answer is in my previous article. I'll explain why in the next installment.
