Improving Perl 5's Core Exceptions


As I wrote in The Stringceptional Difficulty of Changing Error Messages, using strings in place of what should be structured data when reporting errors from Perl 5 makes improving Perl 5 more difficult than it has to be.

This is fixable.

From the conceptual side, all someone has to do is to change what Perl 5 throws for its core exceptions and warnings from a string to an object. That object can overload stringification so that all Perl code which treats it as a string will continue to get the string value. All code which treats it as an object will continue to work correctly even if the string value changes. (I haven't thought about how this might break XS code which pokes into SV guts with macros....)

Someone could even provide numeric overloading so that you can compare exception types numerically without having to call methods to figure out exception information.
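Here's a rough sketch of that idea, written in C to match the Perl 5 core rather than as Perl code with real overloading. Everything in it is hypothetical: the exception struct, the EXC_* codes, the helper functions, and the reworded message are invented for illustration and do not exist in the Perl 5 source. It only shows why a handler which checks a numeric type survives a rewording of the message, while a handler which matches the text does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical exception type codes; none of these exist in the Perl 5 core. */
enum exc_type {
    EXC_NON_INVOCANT   = 1,
    EXC_DIVIDE_BY_ZERO = 2
};

/* A structured exception: a stable numeric type plus human-readable text. */
struct exception {
    enum exc_type type;
    const char   *message;
};

/* Rough analogue of overloaded stringification:
 * code which only wants text still gets text. */
static const char *exception_to_string(const struct exception *e)
{
    assert(e != NULL);
    return e->message;
}

/* Fragile check: stops matching as soon as anyone rewords the message. */
static int is_non_invocant_by_text(const struct exception *e)
{
    return strcmp(exception_to_string(e), "Can't invoke non-invocant") == 0;
}

/* Robust check: compares the numeric type and ignores the wording. */
static int is_non_invocant_by_type(const struct exception *e)
{
    assert(e != NULL);
    return e->type == EXC_NON_INVOCANT;
}
```

Given one exception carrying the current wording and another of the same type carrying a reworded message, is_non_invocant_by_type() accepts both, while is_non_invocant_by_text() accepts only the first.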

Reconfiguring Perl 5's guts to make this possible is fairly simple, at least at the point of the API which actually throws the errors. A few functions in util.c such as Perl_croak() need to build an object instead of a string, but that's a modest amount of code. It's much more difficult to find every place in the Perl 5 core which calls Perl_croak and friends to change them to use the new API...

... because the right way to make this API work better is not to pass C strings as the text of error messages but instead to pass symbolic constants which represent error messages. For example, instead of calling Perl_croak( "Can't invoke non-invocant" ), the calling code should use something like Perl_croak( PERL5_NON_INVOCANT_EXCEPTION ). This allows a quick lookup of the right exception information and brings another benefit: localization of exception messages.
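As a minimal sketch of what such a lookup might look like, assume a table indexed by the symbolic constant. Only the name PERL5_NON_INVOCANT_EXCEPTION comes from the paragraph above; its numeric value, the second constant, the exc_locale enum, the table layout, both lookup functions, and the French text are assumptions made up for this example, not anything in bleadperl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exception ids; the numeric values are made up. */
enum perl5_exception_id {
    PERL5_NON_INVOCANT_EXCEPTION = 0,
    PERL5_DIVIDE_BY_ZERO_EXCEPTION,
    PERL5_EXCEPTION_COUNT
};

/* Hypothetical locale selector; EXC_LOCALE_C is the untranslated default. */
enum exc_locale { EXC_LOCALE_C = 0, EXC_LOCALE_FR, EXC_LOCALE_COUNT };

struct exception_info {
    const char *name;                   /* identical in every locale */
    const char *text[EXC_LOCALE_COUNT]; /* NULL means "not translated" */
};

static const struct exception_info exception_table[PERL5_EXCEPTION_COUNT] = {
    [PERL5_NON_INVOCANT_EXCEPTION] = {
        "PERL5_NON_INVOCANT_EXCEPTION",
        { "Can't invoke non-invocant",
          "Impossible d'invoquer un non-invocant" } /* illustrative translation */
    },
    [PERL5_DIVIDE_BY_ZERO_EXCEPTION] = {
        "PERL5_DIVIDE_BY_ZERO_EXCEPTION",
        { "Illegal division by zero", NULL }        /* no translation yet */
    }
};

/* Return the locale-independent name of an exception. */
static const char *exception_name(enum perl5_exception_id id)
{
    assert((unsigned)id < PERL5_EXCEPTION_COUNT);
    return exception_table[id].name;
}

/* Return the message for a locale, falling back to the C locale text. */
static const char *exception_message(enum perl5_exception_id id,
                                     enum exc_locale locale)
{
    assert((unsigned)id < PERL5_EXCEPTION_COUNT);
    assert((unsigned)locale < EXC_LOCALE_COUNT);

    const char *text = exception_table[id].text[locale];
    return text != NULL ? text : exception_table[id].text[EXC_LOCALE_C];
}
```

With a table like this, rewording or translating a message touches one table entry and no call sites, and the name field gives every error an identifier which reads the same in any locale.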

Even so, that's still a lot of code to change (590 uses of Perl_croak* in the .c files of bleadperl alone, not to mention everything on the CPAN)—and this code won't be available for wide use until 5.14 next spring at the earliest. In a few years, maybe enough people will use exception objects by default that it's possible to clean up error messages throughout the core without worrying about breaking fragile old code. Then again, fragile code tends to do the lazy thing, not the correct thing.

As an intermediate step, perhaps it's possible to refactor the core to use exception type concepts instead of literal (or sprintf-style) strings within the bleadperl source code. That would allow for localization, if that's desirable, and it gets Perl 5.13.x closer to making real exception objects useful.

The real question is whether real exception objects in the core are sufficiently worthwhile to justify this change. If the best rationale for this work is "Someday, we may be able to fix old, misleading, crufty, or wrong diagnostic messages!" then ... well, how soon will be too late? This is why you should plan for making, and correcting, mistakes in your language design.


chromatic wrote:

"[T]hat's still a lot of code to change (590 uses of Perl_croak* in the .c files of bleadperl alone, not to mention everything on the CPAN)"

But we don't have to be quite as intimidated by what's out there on CPAN as we once were. There are a number of distributions on CPAN that enable you to traverse a minicpan repository for the purpose of identifying characteristics of distributions. As part of work I was doing over the last year with David Golden on a refactoring of ExtUtils::ParseXS, I wrote CPAN::Mini::Visit::Simple and used it to identify all CPAN distributions with .xs files. By definition, instances of Perl_croak* in CPAN distributions have to be found in .xs files.

"This allows a quick lookup of the right exception information as well as another benefit: localization of exception messages."

Please don't do that. Localized error messages are a nightmare - they're usually uninformative and "ungooglable".

The googlability is easy enough to fix as long as you have a unique error string that's language independent.

Googleable error messages are just a 'LC_ALL=C' away ;-)

LC_ALL=C is fine if you can easily and quickly reproduce the error, and the error behavior isn't itself locale-specific.

About this Entry

This page contains a single entry by chromatic published on August 9, 2010 1:33 PM.
