Tom Christiansen's What's Wrong with sort and How to Fix It (blame me for the title) drew a lot of much-needed attention to the need for collation when sorting data in various languages.
It also sparked a small discussion about "What in the world does that mean and why would you do that?" regarding a single line of Tom's code:
@sorted_lines = Unicode::Collate::->new->sort(@lines);
In particular, a few people asked "Why would you write Unicode::Collate::?" As with far too many grotty parts of Perl 5, the answer is "To avoid bareword parsing ambiguity."
Ambiguity? Sure. Unicode::Collate is a bareword. Oh, it's clearly a class name, unless it's a function call.
A function call?
Sure. It could be a call to Unicode::Collate(). This is a form of the same problem you get when making a dative (colloquially "indirect object") method call:
# buggy code; do not use
my $object = create Some::Class; # buggy code; do not use
That is to say, the meaning of this code can change depending on what else the Perl 5 parser has seen when it compiles this code.
If you're interested in gory details and you don't mind reading heavily
macroized and partially documented accreted C code, look at the
Appending the package separator (amusingly '
; did you think I
wouldn't try it?) makes the class name obviously a class name and not a
function call. Ambiguity removed, at the cost of slightly uglier code.
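To make the ambiguity concrete, here is a minimal sketch. The names Demo::Collator, Demo::Impostor, and the colliding function are hypothetical, invented for illustration only; nothing like them exists in Unicode::Collate. It assumes Perl 5.14 or newer for the package BLOCK syntax, and it relies on the function being declared before the method calls are compiled.

```perl
use strict;
use warnings;

# Hypothetical classes, for illustration only
package Demo::Collator {
    sub new { my ($class) = @_; return bless {}, $class }
}

package Demo::Impostor {
    sub new { my ($class) = @_; return bless {}, $class }
}

package Demo {
    # A function whose fully qualified name, Demo::Collator,
    # collides with the class name above
    sub Collator { return 'Demo::Impostor' }
}

package main;

# The parser has already seen sub Demo::Collator, so this
# compiles as Demo::Collator()->new
my $ambiguous = Demo::Collator->new;

# The trailing :: marks the bareword as a package name
my $explicit = Demo::Collator::->new;

# Quoting the class name also sidesteps the bareword
my $quoted = 'Demo::Collator'->new;

printf "%-10s %s\n", 'bareword:', ref($ambiguous);
printf "%-10s %s\n", 'trailing:', ref($explicit);
printf "%-10s %s\n", 'quoted:',   ref($quoted);
```

The plain bareword call should hand back a Demo::Impostor object, while the other two produce a Demo::Collator. Move the function declaration below the calls (or into a file loaded later) and the first line quietly goes back to meaning what it appears to mean, which is exactly the sort of action at a distance the trailing :: guards against.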
With that said, the ugliness bothers me enough that I never use this syntax, even as I admit its advantages. Instead, I rely on coding standards to avoid potential ambiguity by using lowercase for method names. So far I've been fortunate, but I cannot blame someone once burned for avoiding the problem at the parser level.
(A sigil to identify classes could fix this, as would a unique operator to instantiate or look up classes. None of these solutions completely satisfy me.)