As I mentioned in Why corehackers Matters, the ability to fork and modify your own version of bleadperl -- and perhaps get it merged back into Chip's staging tree -- opens a lot of room for experimentation.
I alluded to a minor feature branch I've worked on for a couple of days: unilaterally enabling strict for all code not run through -e. This is available from my strict_by_default bleadperl tree on GitHub. You're welcome to download it, play with it, fork it, submit patches, or do whatever you want.
If Perl is a Shinto shrine, forking is an act of love... provided there's a merge sometime in the future.
Playing with strictperl
To build strictperl, first clone my bleadperl tree from GitHub. Check out the strict_by_default branch:
$ git clone git://github.com/chromatic/perl.git
$ cd perl
$ git checkout origin/strict_by_default
Then configure and build Perl as normal:
$ sh Configure -de -Dusedevel
$ make
This will build the familiar perl binary. Now build strictperl:
$ make strictperl
This will build a separate binary named strictperl. If I've written the code (and especially the Makefile rules) correctly, these will be two completely separate binaries with different behaviors:
$ ./perl -e 'print $c' # no error
$ ./strictperl -e 'print $c' # no error
$ echo 'print $c' > printc.pl
$ ./perl printc.pl # no error
$ ./strictperl printc.pl
Global symbol "$c" requires explicit package name at printc.pl line 1.
Execution of printc.pl aborted due to compilation errors.
You can use strictperl in place of regular perl any place you like... except that several core modules are not strict safe. In particular, Exporter and vars are the first two problematic core libraries.
Similarly, any module which does not use strict may have strictness errors when running under strictperl.
I don't think that's a bad thing, however; think of it as an opportunity to make lots of code strict safe even if it doesn't use strict right now. (You could argue "Why in the world would you ever want to touch all of that code for no benefit?" You can also argue why you'd want to make your C code lint-safe, or run Perl::Critic on a codebase.) These "errors" may not be errors in practice, but if we evaluate them all, we can note declaratively in our source code that we've considered each one carefully and avoid further potential maintenance problems. Right now strictperl is an experiment and a tool to help us identify these situations.
Patches and pull requests are very welcome to help patch up the core modules for strict safety.
How it Works
strictperl works by changing the default hintset of nextstate nodes in the Perl 5 optree.
Don't be scared. The implementation is slightly ugly (thanks to the way strict itself works), but it's much less invasive or difficult than rewriting optrees as something like Devel::Declare must do.
If you look in the strict pragma, you'll see several auspicious lines:
my %bitmask = (
    refs => 0x00000002,
    subs => 0x00000200,
    vars => 0x00000400
);
# ...
sub import {
    shift;
    $^H |= @_ ? bits(@_) : $default_bits;
}
This code ORs together a bitmask of all of the strict features you've requested and toggles them on in the magic $^H pseudo global variable. These constants correspond to three constants #defined in perl.h:
#define HINT_STRICT_REFS 0x00000002 /* strict pragma */
/* ... */
#define HINT_STRICT_SUBS 0x00000200 /* strict pragma */
#define HINT_STRICT_VARS 0x00000400 /* strict pragma */
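To make the bit arithmetic concrete, here is a minimal sketch in C of what ORing stricture bits together looks like. The three constants are copied from the perl.h excerpt above; the helpers strict_bit and enable_strictures are hypothetical names invented for this illustration and do not exist in perl's source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the perl.h excerpt above. */
#define HINT_STRICT_REFS 0x00000002
#define HINT_STRICT_SUBS 0x00000200
#define HINT_STRICT_VARS 0x00000400

/* Hypothetical helper: map one stricture name to its hint bit
 * (0 if the name is not recognized). */
uint32_t strict_bit(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "refs") == 0) return HINT_STRICT_REFS;
    if (strcmp(name, "subs") == 0) return HINT_STRICT_SUBS;
    if (strcmp(name, "vars") == 0) return HINT_STRICT_VARS;
    return 0;
}

/* Hypothetical helper: OR the bit for each requested name into `hints`,
 * leaving any bits that were already set untouched. */
uint32_t enable_strictures(uint32_t hints,
                           const char *const *names, size_t count)
{
    for (size_t i = 0; i < count; i++)
        hints |= strict_bit(names[i]);
    return hints;
}
```

Passing the names "refs" and "vars" with an initial value of 0 yields 0x402; passing all three yields 0x602, which is the mask a bare use strict turns on. Because the update is an OR, unrelated hint bits already present survive.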
These hints are part of a particular type of node in the optree called a COP (control op, I presume). These are normally nodes of type nextstate (dbstate when running under the debugger); you see them often when you use B::Concise, for example:
$ perl -MO=Concise
print "Hello, world!"
6 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -:1) v:{ ->3
5 <@> print vK ->6
3 <0> pushmark s ->4
4 <$> const[PV "Hello, world!"] s ->5
- syntax OK
Each COP contains information about the package and line number of the Perl code the next ops represent, as well as hint information such as which strict pragma features are in effect. (They contain more information as well.)
When you modify the hints through $^H, you modify the flags in the previously-executed nextstate op. (If you're very curious, see the cop_hints member of the cop struct in cop.h.)
There's a complicating factor. nextstate hints nest in much the same way that lexical scopes nest. If you enable strict in an outer scope, its effect remains in place in inner scopes unless they explicitly disable it.
That's actually fortunate, in this case.
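The nesting behavior can be modeled with a small save-and-restore stack. This is a toy sketch in C, not perl's real data structures: struct toy_parser, toy_block_start, and toy_block_end are hypothetical names made up for the illustration, and the assumption is simply that entering a scope remembers the current hints and leaving it restores them.

```c
#include <assert.h>
#include <stdint.h>

#define HINT_STRICT_REFS 0x00000002
#define HINT_STRICT_SUBS 0x00000200
#define HINT_STRICT_VARS 0x00000400

#define MAX_SCOPE_DEPTH 64

/* Toy stand-in for compile-time state (hypothetical, not perl's own). */
struct toy_parser {
    uint32_t hints;                   /* hints in effect right now      */
    uint32_t saved[MAX_SCOPE_DEPTH];  /* hints saved at each open scope */
    int depth;                        /* number of currently open scopes */
};

/* Entering a block: remember the current hints. The inner scope
 * starts with the same hints, so it inherits the outer strictures. */
void toy_block_start(struct toy_parser *p)
{
    assert(p->depth < MAX_SCOPE_DEPTH);
    p->saved[p->depth++] = p->hints;
}

/* Leaving a block: discard whatever the inner scope changed. */
void toy_block_end(struct toy_parser *p)
{
    assert(p->depth > 0);
    p->hints = p->saved[--p->depth];
}
```

In this model, setting HINT_STRICT_VARS before toy_block_start leaves it visible inside the block; clearing it inside the block (the equivalent of no strict 'vars') only lasts until toy_block_end. That is why a flag set at the outermost scope reaches everything beneath it.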
I knew that enabling strict meant setting the appropriate hints flags when building COP nodes in the optree. That meant modifying Perl's parser. My original approach was to modify the function used to create new COP nodes, a function named newSTATEOP. That's where I discovered the pseudo-inheritance scheme which allows strict nesting. (I admit that I don't understand all of its implications.)
After a couple of blind alleys, I realized that the only way to enable strict pervasively was to find the creation point of the parentmost COP node in the optree and set these hint flags there.
Perl 5's grammar (perly.y) reads top-down; it starts with the most general rule and descends into subrules to describe a whole program. The topmost rule is prog; a program matches the progstart and lineseq rules. progstart is simple:
progstart: { PL_parser->expect = XSTATE; $$ = block_start(TRUE); };
You can ignore the contents of this rule. The important point is that this
is the first rule matched in a program -- a file, actually.
There's one more piece of the puzzle. If you look in the implementation of the newSTATEOP function, you'll see that it uses a globalish (interpreter-local, anyhow) variable PL_hints to set the hints flags on the newly-created COP:
CopHINTS_set(cop, PL_hints);
Thus my patch is very simple; progstart now reads:
progstart:
    {
        PL_hints |= PL_e_script ? DEFAULT_CLI_HINTS : DEFAULT_PROGRAM_HINTS;
        PL_parser->expect = XSTATE; $$ = block_start(TRUE);
    }
    ;
PL_e_script is another interpreter-local variable which contains the text of code run with -e. It's empty unless the invoking command line used the -e flag.
DEFAULT_CLI_HINTS and DEFAULT_PROGRAM_HINTS are new constants I added to perl.h:
/* which hints are in $^H by default */
#define DEFAULT_CLI_HINTS 0
#ifdef STRICTPERL
#  define DEFAULT_PROGRAM_HINTS \
       HINT_STRICT_REFS | HINT_STRICT_VARS | HINT_STRICT_SUBS
#else
#  define DEFAULT_PROGRAM_HINTS 0
#endif
I made them conditional on the STRICTPERL symbol for one specific reason: the compilation rules I added to the Makefile to build strictperl define -DSTRICTPERL and rebuild the Perl 5 parser. Thus the DEFAULT_PROGRAM_HINTS constant enables all strictures only when building strictperl.
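The effect of the compile-time switch can be tried outside of perl with a small standalone sketch. The macro names follow the perl.h excerpt above, but toy_apply_default_hints is a hypothetical function, and representing "no -e" as a NULL pointer is an assumption made for this illustration. The macro body is also wrapped in parentheses here, which the excerpt does not do, so that it stays safe in any expression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HINT_STRICT_REFS 0x00000002
#define HINT_STRICT_SUBS 0x00000200
#define HINT_STRICT_VARS 0x00000400

/* which hints are on by default */
#define DEFAULT_CLI_HINTS 0
#ifdef STRICTPERL
#  define DEFAULT_PROGRAM_HINTS \
       (HINT_STRICT_REFS | HINT_STRICT_VARS | HINT_STRICT_SUBS)
#else
#  define DEFAULT_PROGRAM_HINTS 0
#endif

/* Hypothetical mirror of the progstart action: OR in the default
 * hints, choosing by whether the code came from -e.
 * Assumption: e_script is NULL when -e was not used. */
void toy_apply_default_hints(uint32_t *hints, const char *e_script)
{
    assert(hints != NULL);
    *hints |= e_script ? DEFAULT_CLI_HINTS : DEFAULT_PROGRAM_HINTS;
}
```

Compiled normally, the function leaves the hints unchanged in both cases. Compiled with -DSTRICTPERL, it sets all three stricture bits when e_script is NULL and still leaves -e code alone, which matches the behavior shown in the printc.pl session earlier.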
(Yes, cautious Makefile hackers, those rules clean up after themselves so that the relevant files always get rebuilt when building strictperl and get removed after building strictperl so that any subsequent non-strictperl builds do not use object files with the wrong constants defined.)
The hardest part of this whole process was getting the Makefile rules right. I'm not quite sure they're cross-platform enough, but they work in my testing.
The Value of strictperl
Was this process worthwhile? It was entertaining. It gave me the chance to
write code to implement a feature I believe is worth considering. It helped me
understand the optree in a bit more detail. It gave me a good opportunity to
explain some of that here.
Perhaps the best result of this process is that we now do have a Perl with strictures enabled by default. We can experiment with that to see how writing code works in this case. Admittedly there's a lot of work necessary to make core libraries play nicely with strictperl, but we can do that in pieces because this is an optional feature you have to enable explicitly, one which does not interfere with regular perl.
Those are the kinds of experiments I want to encourage.