I fixed a bug in Perl 5 this morning. I'm not asking for praise; lots of people fix bugs in Perl 5. In particular, Perl 5 pumpkings deserve tremendous praise for managing the bug queue -- and often fixing nasty bugs no one else wants to explore.
The bug was small and the problem seemed obvious. The fix was simple. Thus the process of fixing the bug may be valuable to document. In particular, corehackers exists in part to recruit new developers. Its articles about Perl 5 internals needs more information.
Here's a start; here's how I fixed a small, obvious bug in Perl 5.
The Bug
Dave Taylor reported a bug with the syswrite
builtin. Using syswrite
on an empty string with an offset writes garbage to a filehandle:
$ /usr/local/refperl/5.10.0/bin/perl -e 'my $foo = ""; syswrite
STDOUT, $foo, 100, 1' | less
<DC>8 /null^@^@^@^Y^@^@^@^A^@^@^@ ^@^@^@^P^@^@^@X^V9
^@^@^@^@<89>^@^@^@<80><A6>^U^H^@^@^@^@^@<A6>^U^H^@^@^@^@<80><A7>^U^H^@^@^@^@^@
<A7>^U^H^@^@^@^@<80><A8>^U^H^@^@^@^@^@<A9>^U^H^@^@^@^@^@<A5>^U^H^@^@^@^@<80><A4>^U^H^@
(END)
His test case is simple and his diagnosis looks reasonable.
At this point, the bug is obvious to a C programmer. Even though Perl 5 knows that $foo
is an empty string, the syswrite
code attempts to read 100 characters from the string starting from the second character of the string (at offset 1). The garbage shown in the bug report is whatever's in memory one byte after the memory address of the null string in $foo
.
Whatever reads this data to write with syswrite
checks the length of the string in $foo
incorrectly.
Finding the Culprit
That seemed like a reasonable hypothesis. Where was this code?
I already knew that it was likely in a C function called pp_syswrite
in one of the pp_*.c files in the Perl 5 core. If I hadn't known this, I could have used B::Concise to figure out where to look:
$ perl -MO=Concise -e 'my $foo = ""; syswrite STDOUT, $foo, 100, 1'
d <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
5 <2> sassign vKS/2 ->6
3 <$> const[PV ""] s ->4
4 <0> padsv[$foo:1,2] sRM*/LVINTRO ->5
6 <;> nextstate(main 2 -e:1) v:{ ->7
c <@> syswrite[t3] vK/4 ->d
7 <0> pushmark s ->8
8 <#> gv[*STDOUT] s ->9
9 <0> padsv[$foo:1,2] s ->a
a <$> const[IV 100] s ->b
b <$> const[IV 1] s ->c
-e syntax OK
B::Concise
compiles a snippet of code into an optree, then
walks that optree and serializes it to textual output. That output represents
the operations Perl 5 performs when it runs the program. I've emboldened the
important line. That line shows a LISTOP (a type of node in the optree which
has multiple child nodes) which performs an operation called
syswrite
.
The node type isn't important; what's important is the name of the
operation. Each operation implies the existence of a function in the Perl 5
core called pp_operation
. PP is, as I understand it,
short for push/pop, which means that these functions operate on the Perl 5
stack (more or less equivalent to @_
in Perl 5 code).
I used App::Ack to
search for pp_syswrite
in the appropriate *.c files. It
led me to mathoms.c, which is sort of a limbo for functions that used
to be in the core and exist now for compatibility reasons.
pp_syswrite
is now:
PP(pp_syswrite)
{
return pp_send();
}
Thus the implementation of syswrite
is in the function
pp_send()
, which is in pp_sys.c. That function is too
long to reproduce here, but there's an interesting branch around 80 lines
in:
if (op_type == OP_SYSWRITE) {
Size_t length = 0; /* This length is in characters. */
STRLEN blen_chars;
/* set blen_chars to the length of the string in C chars */
/* ... */
if (MARK < SP) {
offset = SvIVx(*++MARK);
if (offset < 0) {
if (-offset > (IV)blen_chars) {
Safefree(tmpbuf);
DIE(aTHX_ "Offset outside string");
}
offset += blen_chars;
} else if (offset >= (IV)blen_chars && blen_chars > 0) {
Safefree(tmpbuf);
DIE(aTHX_ "Offset outside string");
}
} else
offset = 0;
/* ... */
This code goes through some gyrations, counting the length of the string in
C chars (this is more complex than you think when you have UTF-8 in
the string). Eventually blen_chars
contains the length of the
string. offset
is the offset value in this branch.
I've emboldened the offending code. The intent of that line's conditional is to check that the offset isn't outside of the string. Unfortunately, when the string is zero chars long, the second part of that line's conditional is false and the entire conditional fails.
Fixing the Bug
If blen_chars
is 0, then any offset of one or more chars will
be outside the range of the string, so the exception is necessary. (Deleting
that conditional also means that using an offset of 0 with an empty string will
also throw that exception. I could argue that behavior both ways.)
Vincent Pit checked in the patch for RT #67912, as well as a test case derived from Dave's code:
eval { my $buf = ''; syswrite(O, $buf, 1, 0) };
like($@, qr/^Offset outside string /);
I used Dave's command-line invocation as a quick test while creating the patch. After running the entire core test suite to verify that there were no obvious regressions, I mailed the simple patch to p5p:
--- a/pp_sys.c
+++ b/pp_sys.c
@@ -1919,7 +1919,7 @@ PP(pp_send)
DIE(aTHX_ "Offset outside string");
}
offset += blen_chars;
- } else if (offset >= (IV)blen_chars && blen_chars > 0) {
+ } else if (offset >= (IV)blen_chars) {
Safefree(tmpbuf);
DIE(aTHX_ "Offset outside string");
}
This is an interesting case where deleting code can make the project's behavior more correct.
Five hours elapsed from the time of reporting to the time of the commit of the fix. In addition, Vincent also refactored the sysio.t test for clarity and ease of maintenance. This is how community-developed software should work!
It's a simple bugfix. Anyone reading this could have figured it out without too much work. Not all bugs are this simple, but the process is reasonably easy and there are plenty of people willing to help you get started improving Perl. (It's also very satisfying to think of all of the people who won't run into this problem in the future because you fixed it for them.)