A Decades-Old Technique to Improve Programming Languages

I promised in Testing Your Templates to explain how to solve the problem of the divergence between testable, debuggable code in your host language and a big wad of logic in a template language.

This problem is an example of the pattern of Why Writing Your Own DSL is More Difficult Than You Think. Certainly Template Toolkit is among the better templating systems (I've written a couple myself), but it exhibits problems endemic to the process. (Then again, so does PHP. Now multiply that by the fact that some people use templating systems written in PHP and if you have to lie down for a while before the feeling passes, please accept my apologies.)

The semantics of Template Toolkit are great, when they work, but then everything's great when it works the way you expect. Robust software handles the cases you don't expect with aplomb, or at least without a boom.

A simple workaround for Template Toolkit is to avoid the fallback from potential method lookup to keyed hash access when dealing with an object. In other words, if $blessed_hash->do_something() fails, try $blessed_hash->{do_something}.

... except that that doesn't work when you want to call virtual methods on unblessed references, such as calling methods on arrays or hashes.

Another option is to change the syntax such that calling a method is visibly different from accessing a member of an aggregate. Perl does this. It works pretty well, in the sense that if you use the right operator (access element versus invoke method), you've expressed your intent in a visually unambiguous fashion).

... except that people complain about the Perl dereferencing arrow quite a bit. (Okay, you don't need an arrow to do this; as the Modern Perl book explains, the postfix indexed access or postfix keyed operators of {} and [] determine the type of operation effectively.)

... and except that one of the design goals of Template Toolkit was to be robust in the face of changing values provided to the template, such that it provides a loosely coupled interface for the data it expects. That's a fine goal, but it isn't free.

Here's the thing, though. The last time I looked, Template Toolkit compiles templates into Perl code as an optimization. (The last template system I wrote did the same thing, but not as well. We should have used TT, but in our defense, TT didn't exist then.) This transliteration/compilation stage must be very, very cautious to allow standard Perl debugging and introspection tools to treat this generated code correctly. That is to say, I don't want to debug a big wad of generated code. I want to debug the code I actually wrote.

As usual, the solution is another layer of abstraction.

Perl exists in two forms. The first is the source code you and I write. The second is the optree which the Perl VM executes. There's nothing in between. You have one or the other. When your code runs, you have the optree, and the optree has references to the relevant location in the source code it came from, but the correspondence is often less useful than you might like.

While the generated code from Template Toolkit could include the correct file and line positions from templates, that's again less useful than you might like. (It's useful, but it doesn't solve every problem.)

If Perl had instead an intermediate form separate from raw code and raw optrees, something more suitable to introspection and manipulation, we could produce tools which worked with this intermediate form to improve debugging, introspection, and better code generation.

We could even inject new code to add features (fall back to attribute access; prevent the fallback to attribute access) to code, even within lexical scopes. That is to say, we could manipulate how libraries behave from the outside in, and ensure that our changes would not leak out from our desired scopes.

It's certainly possible to replace the Perl opcodes yourself, if you're comfortable reading Perl source code, writing XS, relying on black magic, and dealing with strange issues of thread safety and manipulating global or at least interpreter-global values in a lexical fashion (while dealing with the fact that use is recursive in a sense)—but isn't Perl about not making people write C to do interesting things?

Certainly this isn't a technique you'd use every day, and it's not obviously a way to make Perl run faster (though many optimizations become much easier), but the possibility for better abstraction and extension and correctness has much to recommend it.

And, yes, Lisp demonstrated this idea ages ago.

A Decades-Old Technique to Improve Programming Languages

Tags:

1 Comment

Modern Perl: The Book

Categories

Monthly Archives

Pages

About this Entry