I write a lot about Perl 5 and its ecosystem, but I spend most of my hacking time helping with Rakudo and contributing to Parrot.
I realize this leaves me open to charges of hypocrisy (though what a tired claim; the only modern sin for which you cannot blame your upbringing, your parents, your diet, society, poverty, or your circumstances is when someone believes your words do not match your actions to a surgical degree) -- if I know what Perl 5 needs, why don't I donate my precious time to make it happen?
That's a good question, but I want to talk about something else first.
"Just Write it in C!" is a Terrible Long-Term Strategy
One of my long-term goals in Parrot is to remove as much C code as possible from the project to make it run faster.
This seems counterintuitive to people used to dynamic languages such as
Perl, Python, and Ruby. "If it's too slow, you can always drop down to C!", or
so the claim goes. Sometimes that's even true.
TraceMonkey shows that it's
not always true (at least if "fast" is enough and "really fast" is better). Dynamo, HotSpot, Strongtalk, Forth, and Smalltalk are also good sources of
inspiration and information -- especially the latter two.
C's overhead for bit-twiddling, character-by-character banging, and dispatch in tight loops is minimal. If you can coerce your algorithm into C code that performs a lot of work before returning, your code can get faster with smart use of C.
Of course, if your language has a smart compiler, you've likely thrown away a lot of chances for optimization: in particular, escape analysis and inlining (specialized or not) are difficult, if not impossible. You also have to pay the penalty of converting between your language's calling conventions and C's calling conventions. (You have to run a lot of C code very fast to make this worthwhile sometimes.)
Parrot crosses this C boundary often.
Sometimes it's okay to write code in PIR (Parrot's native language) that operates a little bit more slowly than the corresponding C code because calling the PIR code is faster than calling C code from PIR code (or worse, calling back into PIR code from C code called from PIR code, and so on).
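To make that trade-off concrete, here is a minimal cost-model sketch. Everything in it is an assumption for illustration: the function names are hypothetical, the numbers are invented unit-less costs rather than measurements of Parrot, and the model treats each call as a fixed overhead plus a per-item cost.

```python
def total_cost(call_overhead, per_item_cost, items):
    """Cost of one call that processes `items` items (arbitrary units)."""
    return call_overhead + per_item_cost * items


def break_even_items(c_call_overhead, c_per_item,
                     native_call_overhead, native_per_item):
    """Items per call at which crossing into C starts to pay off.

    Assumes crossing into C costs more per call than staying native.
    Returns None when the native code is not slower per item, because
    under that assumption C never catches up.
    """
    if native_per_item <= c_per_item:
        return None
    extra_overhead = c_call_overhead - native_call_overhead
    saving_per_item = native_per_item - c_per_item
    return max(extra_overhead, 0) / saving_per_item


# Invented numbers: C is four times faster per item, but ten times
# more expensive to call.
C_OVERHEAD, C_PER_ITEM = 50, 1
NATIVE_OVERHEAD, NATIVE_PER_ITEM = 5, 4

threshold = break_even_items(C_OVERHEAD, C_PER_ITEM,
                             NATIVE_OVERHEAD, NATIVE_PER_ITEM)
small_native = total_cost(NATIVE_OVERHEAD, NATIVE_PER_ITEM, 10)
small_c = total_cost(C_OVERHEAD, C_PER_ITEM, 10)
large_native = total_cost(NATIVE_OVERHEAD, NATIVE_PER_ITEM, 100)
large_c = total_cost(C_OVERHEAD, C_PER_ITEM, 100)
```

With these made-up figures the slower native code wins for any call that handles fewer than 15 items, and C only wins once each call does enough work to amortize the crossing.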
My goal of removing as much C code as possible from Parrot -- writing Parrot in itself -- is a Parrot 3.0 goal. Parrot 1.3 comes out next week.
I believe we can achieve all of these goals without stopping the world, scrapping large parts of Parrot, and leaving piles of steaming debris in our wake.
Rebuilding a Charging Locomotive on the Go
How's that?
One of the not-so-hidden themes of all of my writings is gradual and inexorable progress in software. I believe that all Parrot contributors help to improve Parrot with every commit. (Some commits are regressions and some need further commits, but we're heading in the right direction and fixing more bugs than we create.)
A lot of projects can say that.
One of the poorly hidden themes in my writings is that backwards compatibility, like a dairy product, has an expiration date beyond which your fridge -- and your project -- gets scary.
I've written some great code for Parrot and I've maintained some awful code.
I believe the awful code is less awful for my work, but one of the greatest
joys of software development for me is deleting bad code because it's
unnecessary.
The strategy we've discussed for removing as much C code from Parrot as possible has several phases.
Our built-in data structures -- hashes, arrays, subroutines, et cetera --
are special files written in a mixture of an unnamed language and C. A series
of Perl libraries parses this code and generates a (much longer) plain C file
for compilation and linking into Parrot itself.
Step one of the plan is to write a compiler using Parrot's compiler tools that understands this unnamed language and can emit C code equivalent to what the current parser/generator produces. This step is underway.
Step two of the plan is to develop a new language (or extend the unnamed language) that the compiler from the first step can transliterate to C code equivalent to what the current parser/generator produces. This means that the new language can represent the same operations that the C code in the current language uses.
Step three of the plan is to replace the current Perl 5 parser/generator with the PCT-based compiler.
Step four is to add an emitter to the PCT-based compiler to produce opcodes Parrot understands which perform the same functions that the existing C code performs.
Step five is to remove the C emitter and run everything through the native Parrot instructions.
Although this process has several dramatic changes, only one is visible to
users: the switch from writing PMCs in the current mixture of semi-parsed C to
writing PMCs in a PCT-hosted language. Though we can deprecate the C version
at step two, we only have to remove it at step five.
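The shape of that plan, one front end with swappable back ends, can be sketched in a few lines. This is a toy under loud assumptions: it is not PCT, the expression language is invented, the C output is a bare expression string, and the instruction names (`push_const`, `push_var`, `add`) are hypothetical and not real Parrot opcodes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, Add]


def parse(source: str) -> Expr:
    """Front end: parse a toy expression such as 'x + 2 + y'."""
    terms = [term.strip() for term in source.split("+")]
    nodes = [Num(int(t)) if t.isdigit() else Var(t) for t in terms]
    tree = nodes[0]
    for node in nodes[1:]:
        tree = Add(tree, node)  # left-associative
    return tree


def emit_c(node: Expr) -> str:
    """Back end one: emit a C expression (steps one through three)."""
    if isinstance(node, Num):
        return str(node.value)
    if isinstance(node, Var):
        return node.name
    return f"({emit_c(node.left)} + {emit_c(node.right)})"


def emit_ops(node: Expr) -> list:
    """Back end two: emit made-up stack instructions (step four)."""
    if isinstance(node, Num):
        return [("push_const", node.value)]
    if isinstance(node, Var):
        return [("push_var", node.name)]
    return emit_ops(node.left) + emit_ops(node.right) + [("add",)]


def run_ops(ops: list, env: dict) -> int:
    """Tiny interpreter for the made-up instructions."""
    stack = []
    for op in ops:
        if op[0] == "push_const":
            stack.append(op[1])
        elif op[0] == "push_var":
            stack.append(env[op[1]])
        else:  # "add"
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack.pop()


tree = parse("x + 2 + y")
c_source = emit_c(tree)
ops = emit_ops(tree)
result = run_ops(ops, {"x": 1, "y": 4})
```

Both emitters consume the same tree, so in this toy, removing `emit_c` (the analogue of step five) changes nothing for whoever wrote the source expression.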
I believe that this is possible both technically and socially primarily because of the project's organization:
- We have a defined interface with which to write code which interacts with Parrot. (Admittedly we haven't formalized this interface yet, but that's in progress.)
- We have a documented support policy which notifies users as to our deprecation schedules and defines our backwards compatibility plans.
- We have a regular release schedule which, when combined with our documented deprecation and backwards compatibility policy, allows us to refactor and modify Parrot on a predictable schedule at a pace which allows for frequent improvements.
I'm sure some people could argue -- some of them even successfully -- that these types of changes after a project has reached 1.0 would be unnecessary if we'd only tried to plan the project in more detail years ago. Perhaps they're even right. Yet when was the last project you know of that both tried to do something new and successfully predicted the future such that it needed no architectural changes?
I don't mind rewriting code or even writing code I'll throw away in six months if it progresses toward something better that's faster, cleaner, easier to understand, easier to maintain, simpler, better designed, and/or more featureful.
Sources of my Unmotivation to Contribute to Perl 5
... which brings me to the long-promised explanation of why I have so much trouble contributing to Perl 5.
When I posted the final version of the patch to add class { ... } to Perl 5, I had just finished wrestling with the Perl 5 parser to work around the Perl 4 apostrophe package separator, which double-colons superseded and which Perl 5 recommends against. You may recall the release of Perl 5 in 1994.
I wrote that patch to Perl 5 because I felt guilty about complaining about Perl 5's flaws (despite my contributions to Perl 5, such as they are) without at least attempting to address them.
I had little hope the patch would be accepted, but I was open to the possibility that someone might say "You know, it doesn't break backwards compatibility, it does make boilerplate code easier to read, to write, and to explain to novice programmers, and it is compatible with Moose, MooseX::Declare, and Perl 6."
Maybe it'll be a part of Perl 5.12, if Perl 5.12 ever comes out. Maybe someone else'll pick it up and maintain it and argue for it and get it committed to the core.
Meanwhile I can get the urge to fix a memory leak in Parrot or Rakudo, fire up Valgrind, and commit the fix today. People can use it from the stable monthly release next Tuesday (for Parrot) and next Thursday (for Rakudo), or from the packaged release included in free software distributions in July, and everyone's life is a little bit better in days or weeks. The same goes for improving performance or fixing another bug or adding a feature.
If it's a big feature or an architecture change or something otherwise disruptive I still have to discuss it with other developers and figure out how it fits with our backwards compatibility and deprecation policies, but if we decide it's worthwhile, it won't languish for years between releases. The lag between producing something wonderful and other people getting to enjoy it is months at most.
Perhaps people critical of my views are right in that it's inappropriate to compare Parrot to Perl 5. Perl 5 certainly has more users. Perl 5 has had more contributors. Several orders of magnitude more people and businesses rely on Perl 5 as it exists now.
Yet I wonder if they're perfectly happy with Perl 5 as it is -- or do they secretly wish that the fix for 5.10's argument passing performance problem (found and fixed in source control three weeks after the release of 5.10) were released in a form they could use, or that Perl 5 had working subroutine signatures, or that Perl 5 had declarative classes, or that strict and warnings weren't optional for all new code?
The Perl 5 committers are right in that the best way to ensure that there's a Perl 5.10.1 sometime this year is to fix bugs, to write documentation, and to perform other thankless maintenance tasks. I hope to write about how to do that and I may do some of it myself. (Goodness knows I've done plenty of that for Parrot.) That's noble work and it deserves far better praise and far more respect than it gets. Even though I criticize Perl 5's project management, I still believe that maintainers deserve much credit and appreciation for their work.
Still, the long-term viability of the project concerns me. If Perl 5 has a
volunteer time and effort problem (and it's clear that it does), I wonder how
many other potential contributors have declined to participate for reasons
similar to mine. (After several people have told me in person that they agree
with most of what I've written but don't want to say anything in public, I
wonder how many other ideas for process improvements the community has
lost.)
I take two lessons from this. First, sometimes it's easier to rebuild a train barrelling down the tracks than a train stopped in a weedy cow pasture. I wouldn't have believed that either before now.
Second, any project management strategy which relies on a sense of guilt to recruit new and prodigal developers has problems no technical mechanism can solve.