In a comment on A Modern Perl Success Story for the Internet CLI, Darian Patrick asked for an explanation of my file-slurping one-liner:
my $contents = do { local $/ = <$fh> }
While File::Slurp exists on the CPAN to encapsulate these sorts of tricks and Perl 6 provides the slurp method on filehandles, this one-liner is in my toolbox because it's very simple (for me) to remember and type.
Perl Slurp Explained
As you may remember from perldoc perlvar, $/ is Perl 5's input record separator variable. Its contents are a literal string used to identify the end of a record when using readline on a filehandle. Its default is the platform-default newline combination, \n, whatever that translates to on your platform and filesystem.
To read a file with different record separators (perhaps double newlines), set $/ to a different value.
To read the whole file in one go, set $/ to the undefined value.
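As a rough illustration of those three settings, the sketch below reads the same data with the default separator, with a double newline, and with the undefined value. The read_records helper and the sample data are made up for this example, and it reads from an in-memory filehandle so that it runs without any real file.

```perl
use strict;
use warnings;

# Sample data (invented for this sketch): two paragraphs separated by a blank line.
my $data = "alpha\nbeta\n\ngamma\ndelta\n";

# Hypothetical helper: read every record from an in-memory filehandle
# using the given record separator.
sub read_records {
    my ($separator) = @_;

    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

    # Limit the change to $/ to this subroutine call.
    local $/ = $separator;

    my @records = <$fh>;
    close $fh;

    return @records;
}

my @lines      = read_records("\n");     # default: one record per line
my @paragraphs = read_records("\n\n");   # records end at a double newline
my @whole      = read_records(undef);    # undef: the whole file at once

printf "%d lines, %d paragraphs, %d slurped record\n",
    scalar @lines, scalar @paragraphs, scalar @whole;
```

With the undefined value, the list-context readline returns a single record holding everything in the file.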
It's always good practice in Perl 5 to localize all changes to the magic global variables, especially in the smallest possible scope. This helps prevent confusing action at a distance. (I appreciate that Perl 6 moves these globals to attributes on filehandles.)
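A minimal sketch of that scoping, assuming nothing else in the program has changed $/: inside the bare block the variable is undefined, and it gets its previous value back as soon as the block exits. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $before = $/;          # normally "\n"
my $inside;

{
    local $/;             # $/ is undef only within this block
    $inside = defined $/ ? 'defined' : 'undef';
}

my $after = $/;           # restored when the block exits

print "inside: $inside; restored: ", ( $after eq $before ? 'yes' : 'no' ), "\n";
```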
That explains how this code works:
my $contents = do { local $/; <$fh> };
(I may have first encountered this idiom in perldoc perlsub.)
How does my code work?
Idiomatic Perl Slurp
The localization has to occur before the assignment, for obvious reasons. As it happens, it occurs before the readline. As the readline uses the contents of $/ to determine how much to read, it sees the undefined value, reads the entire file, and assigns its contents to $/. Even though leaving the do block immediately restores the previous contents of $/, the assignment expression occurred in scalar context, thanks to the assignment of the block's result to $contents. An assignment expression evaluates to the value assigned: the slurped contents of the file.
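The last two points can be seen in isolation with an ordinary package variable. This sketch deliberately avoids $/ and readline, so it says nothing about evaluation order; it only shows that a do block ending in a localized assignment evaluates to the value assigned, and that the variable is restored afterwards. $setting is a hypothetical stand-in.

```perl
use strict;
use warnings;

our $setting = 'original';   # hypothetical package variable standing in for $/

# The block's last expression is an assignment, so the block evaluates to
# the value assigned, even though $setting is restored on exit.
my $result = do { local $setting = 'temporary' };

print "result: $result; setting: $setting\n";
```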
As you may have determined already, the do block both limits the scope of the localization and makes all of the file manipulation into a single expression suitable for assignment to $contents.
Perl Slurp and Clarity
Using slurp from a module is likely clearer. As well, localizing and copying file contents may be somewhat inefficient. In the case of my Identi.ca poster, files will rarely be larger than 140 characters, the program is short-lived, it blocks on user input, and it immediately produces and consumes network traffic, so this is unlikely to be a bottleneck in any sense.
I skimmed the relevant documentation and couldn't find a guarantee that the order of operation of localization and readline will remain as it is; I tried a few experiments with B::Concise to confirm my suspicions, but ran afoul of the Perl 5 optimizer. It may be safer to use two expressions in the block:
my $contents = do { local $/; <$fh> }
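For context, here is one way the two-expression form might sit in a complete program. The slurp_file name, the error messages, and the temporary file are assumptions made for this sketch, not part of the original poster's code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the entire contents of the named file.
sub slurp_file {
    my ($path) = @_;

    open my $fh, '<', $path or die "Cannot open '$path': $!";

    # Two separate statements: localize $/ first, then read.
    my $contents = do { local $/; <$fh> };

    close $fh;
    return $contents;
}

# Write a small temporary file to slurp back.
my ( $out, $filename ) = tempfile( UNLINK => 1 );
print {$out} "first line\nsecond line\n";
close $out or die "Cannot close '$filename': $!";

my $contents = slurp_file($filename);
print length($contents), " characters read\n";
```

Wrapping the idiom in a subroutine keeps the localization in one small scope, which is roughly what a module's slurp function would do on your behalf.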
Even so, a silly little idiom like this demonstrates some of the interesting features of Perl 5.
To learn more about the idioms of Perl 5, and to learn how to use the language effectively, see Modern Perl: The Book.
I always considered Uri's Perl Slurp-Eaze article to have the definitive one line slurp. This is truly one line as it does not need to separately open a file handle.
my $text = do { local( @ARGV, $/ ) = $file ; <> };
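A small runnable sketch of that idiom, assuming $file holds the name of a readable file; here it points at a temporary file created just for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small temporary file; $file holds its name.
my ( $out, $file ) = tempfile( UNLINK => 1 );
print {$out} "line one\nline two\n";
close $out or die "Cannot close '$file': $!";

# The list assignment puts $file into @ARGV and leaves $/ undefined, so the
# null filehandle <> opens that file and reads it in one go. Both variables
# are restored when the block exits.
my $text = do { local( @ARGV, $/ ) = $file; <> };

print $text;
```

Because <> opens the file itself, a failure to open shows up as a warning and an undefined result, not as an exception, so it is worth checking whether $text is defined in real code.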
Also note that if you're already (or thinking about) using Path-Class, there is a 'slurp' method in Path::Class::File.
Hello there,
I think my version is shorter :)
my $contents = join '', ;
And now the benchmarks with a 100MB ASCII text file.
> my $text = do { local( @ARGV, $/ ) = $file ; <> };
real 0m1.181s
user 0m0.228s
sys 0m0.868s
> my $contents = join '', ;
real 0m6.485s
user 0m4.616s
sys 0m1.360s
your code wins :)