Suppose you have two hashes which (when you inspect them) contain:
%first = (
dog => 'Rodney',
cat => [qw( Daisy Jack )],
);
%second = (
dog => 'Rodney',
cat => [qw( Daisy Jack )],
);
Are they the same?
Your answer depends on what identity means to you. At this point in time,
they have what appear to be identical contents. They're probably
different variables, though. If you look at things from the point of view of
the perl
program itself, the two names %first
and
%second
refer to two different HV
structs.
That's not very interesting to most users, I suspect. The questions of how these two hashes behave when you treat them the same way is more interesting. For example, given the state of the two hashes as demonstrated earlier, do you expect this test to pass?
use Test::More;
my @first_keys = keys %first;
my @second_keys = keys %second;
is_deeply \@first_keys, \@second_keys, 'keys returned in the same order';
done_testing;
The official answer is "No, you should not." As of Perl 5.18, the official answer is "If this test ever passes, it's an unlikely coincidence."
A hash map data structure (what Perl calls "hashes") associates keys with values by calculating a simple hash of the key and using that as a cheap index into another data structure. That calculated hash is not necessarily unique, and the indexed data structure needs to be able to handle collisions of the calculated hash (even though the keys are different) as well as insertions of new key/value pairs and removals of key/value pairs.
When you traverse the Perl hash with keys
or
values
or each
, perl
traverses that data
structure in a particular order that makes sense given the state of that
data structure. If and when that data structure changes, the way
perl
traverses it will change. The details of that data structure
are hidden inside perl
; they're not obvious from the point of view
of the Perl programmer.
The implication of this is not obvious at first, but it's easy to explain:
even if two hashes have the same keys and the same values, the order in which
you traverse that hash with keys
or values
or
each
may be different between the hashes.
After all, you don't know whether %first
was defined all at
once, or whether it started empty and had the two key/value pairs added one at
a time or both at a time. You don't know whether %second
originally had three key/value pairs or four or ten. You can't look at
what a hash currently contains and guess at the details of the internal data
structure perl
uses to represent those keys and values.
As of Perl 5.17.10—what will become Perl 5.18 very soon—this behavior gets a little more predictable. With a recent patch to bleadperl, the calculated hash of two hashes defined with the same keys and values in the same way will be slightly different. In other words, even if you write:
my %first = (
dog => 'Rodney',
cat => [qw( Daisy Jack )],
);
my %second = (
dog => 'Rodney',
cat => [qw( Daisy Jack )],
);
... the test from earlier may still not pass. (It's very unlikely to pass.) Any code that relies on an accident of implementation which causes that test to pass occasionally is, sadly, buggy code.
The good news is that you almost never need to rely on that behavior which
happens by accident. When I come across code which relies on a specific
ordering of keys or values of a hash, it's often in tests which care more that
the contents of the hashes are equivalent than that the order of traversal is
the same between the hashes. (In those cases, you can use a function like Test::More's
is_deeply
to compare keys or values or both, or sort
the keys before comparing them.
In other cases, you must rely on the documented guarantee that
using keys
and values
on the same hash (not
a hash with the same contents but a hash which uses the same underlying
HV
) will always traverse that data structure in the same order
until you modify the hash.
Anything else has the potential to do the wrong thing depending on the implementation. Perl 5.18 will be a little more strict about this, which is actually a good thing, as it will identify buggy code much more easily.