The relentless pursuit of user efficiency exposes drawbacks now and then. I added HARNESS_OPTIONS=j9 to my .bashrc a while ago, and then noticed that my regular CPAN updates (cpan-outdated -p | cpanm) had a lot more failures than usual.
Test::Harness (and its internals, TAP::Harness) uses the environment variable HARNESS_OPTIONS to customize some of its behavior. This is very useful when running Perl tests through make test or ./Build test or any other mechanism where you don't launch the harness directly.
The j9 works well on my four-core machine; your numbers will vary based on your workloads and hardware.
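For reference, it's just an environment variable, so one line in a shell startup file enables it everywhere the harness honors it (j9 is what suits my machine; tune the number to your core count):

```shell
# In .bashrc (or any shell startup file): ask TAP::Harness to run
# nine test files in parallel wherever HARNESS_OPTIONS is consulted.
export HARNESS_OPTIONS=j9
```

For a single run without touching your environment, prefixing works too: HARNESS_OPTIONS=j9 make test.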
Unfortunately, it's easy to write simple tests which just don't work in a parallel world. Consider the TestServer.pm module used to test Test::WWW::Mechanize. (I chose this as an example because Andy's a good sport, and because I've already opened a pull request for it.) This module starts a server for each test to control the responses returned to the Mech object. That's all well and good; it tests network communication in a mostly real way (yes, the loopback interface isn't exactly the same as a remote server, but it's real enough for most testing uses).
The TestServer constructor in 1.38 is:
our $pid;

sub new {
    my $class = shift;

    die 'An instance of TestServer has already been started.' if $pid;

    # XXX This should really be a random port.
    return $class->SUPER::new(13432, @_);
}
You can probably see the problem already from the comment. If multiple .t files use this module (and they do), and if these files each run in separate processes (and they do), then if these files run simultaneously (as they do in a parallel testing environment), only one file will be able to bind to this port and the others will all abort and cause test failures.
In fact, this is what happens.
I submitted a silly little patch which changes the port to:
return $class->SUPER::new(13432 + $$, @_);
... which should reduce the likelihood of collisions. (For more safety, the code should check that the given port number is available, but then you have to deal with race conditions and so forth, and there's a point at which adding more complexity to your test just isn't worth it. Also, $$ can be greater than 65535, as Pete Krawczyk points out, so there ought to be a sane modulus in there.)
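With a modulus folded in, the constructor might look something like this (a sketch, not the actual patch; the 20000 range is my arbitrary choice, keeping ports roughly in 13432..33431):

```perl
our $pid;

sub new {
    my $class = shift;

    die 'An instance of TestServer has already been started.' if $pid;

    # Unique per process, but bounded: $$ % 20000 keeps the result
    # well under 65535 no matter how large the PID gets.
    my $port = 13432 + ( $$ % 20000 );

    return $class->SUPER::new( $port, @_ );
}
```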
The principle is this:
Manipulating external state in a test file reduces the possible parallelism of your test suite.
You can see the same thing when you write to hard-coded directories in certain tests. (Use File::Temp to create temporary directories—which can clean themselves up!). You can also see the problem when you use a single database for testing (use something like DBICx::TestDatabase to create and populate a database in memory).
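For the temporary-directory case, a minimal File::Temp sketch (CLEANUP => 1 removes the directory automatically when the program exits; the file name is made up for illustration):

```perl
use File::Temp qw(tempdir);

# Each test process gets its own scratch directory, so no two
# parallel test files can collide on a hard-coded path.
my $dir = tempdir( CLEANUP => 1 );

open my $fh, '>', "$dir/output.txt" or die "Cannot write: $!";
print {$fh} "test data\n";
close $fh;
```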
Anti-parallelism bugs in test suites are unnecessary and in most cases are easy to fix, once you know what to look for. As the CPAN continues to grow and as our applications rely on more and more great dependencies, the mechanisms we use to manage our code become ever more important. It's easy to avoid these problems—and it's even easier to understand why parallel testing is valuable when you can cut your test run wallclock time in half.
Probably we need a few CPAN Testers smokers with parallel tests enabled, in order to get a grip on this problem over the long term. Otherwise it’ll just be spotty one-off corrections without coherent progress.
Someone has to be able to diagnose parallel testing failures as parallel testing failures and not weird one-off failures from weird smoker configurations. Maybe that's as simple as trying to run parallel tests first, then running serial tests and seeing if things change.
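With prove, that comparison is just two runs (assuming the conventional t/ layout, and a -j value matched to your cores):

```shell
# Run in parallel first and note the failures.
prove -j9 -lr t/

# Then run serially; failures that vanish here are parallelism bugs,
# not flaky tests or weird smoker configurations.
prove -lr t/
```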
In the test suites I have maintained for various non-CPAN projects, speed has rarely been an issue. However, the test suite I am working with now takes a long time to run, and parallelism would speed parts of it up (but only parts, since some tests must wait for something external to happen, often at some point in the future and outside the control of the test suite itself).
The most notable long test suite is for Perl itself - I wouldn't mind that one running more quickly.
As for CPAN modules, few I install take much more than a minute (many far less). Take one of my modules, DBD::ODBC. Obviously the test speed is dependent on comms with your database but for me, here, it takes 21s to run - not long. The DBD::ODBC test suite goes out of its way to ensure all test tables/procedures/functions/views created are removed after the test but many of the tests use the same names in different tests. Also, running with -j9 might not be possible due to limits on simultaneous connections to the database. Are you really suggesting I should recode this test suite just so someone can run -j9 and speed the test up?
I believe it's good CPAN citizenship not to preclude people from running your tests in parallel, when possible. Sure, 20 seconds isn't too bad for you running your full test suite just before you commit a big change, but 20 seconds per CPAN distribution used as a dependency for some projects means minutes and hours.
If it's truly impossible to run certain tests in parallel due to serial access to a shared resource, some sort of locking strategy might help.
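One possible shape for that, assuming the contended resource is reachable from every test process: take an exclusive flock on a shared lock file before touching it (the lock file path here is made up for illustration):

```perl
use Fcntl qw(:flock);

# Serialize access to a shared resource across parallel test
# processes. flock blocks until whichever test holds the lock
# releases it (which happens automatically when its filehandle
# closes or its process exits).
open my $lock, '>', '/tmp/my-test-suite.lock' or die "Cannot open lock: $!";
flock $lock, LOCK_EX or die "Cannot lock: $!";

# ... exercise the shared database, port, or fixture here ...

close $lock;    # releases the lock
```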
OK, it may be good CPAN citizenship not to preclude people from running tests in parallel, but it is also potentially a lot of work to change an existing suite to work in parallel. In all the years I've been looking after modules, I've never once had anyone report that they don't work in parallel. Plus:
So all I'm saying is that the effort is probably not worth the gain for some modules.
Thanks for that, now all I need to do is work out how to make perlbrew do that.
Only within the past couple of years has TAP::Harness reliably been able to run tests in parallel, been available on enough systems that it matters, and seen people start to take advantage of it. I don't take its relative unpopularity until now as anything more than the relative obscurity of its existence.
Agreed that scouring existing test suites for parallelism blockers doesn't always make sense in terms of effort, but I believe it's worth at least considering.
Making a test suite not break under parallelism doesn’t necessitate making it run in parallel.
However I believe something like say 95% of tests on CPAN will already run fine in parallel with no further ado, and of the rest, easily the majority will be very simple to fix.
In the quasi-infinitesimal remainder of cases, sure, if the effort is not worth it, just forcibly serialise the tests and move on.
I expect a push to test parallelism to require little housekeeping effort all told. There just needs to be a reliable pressure that steers the CPAN towards it.
Yes, chromatic, you are right that it is a little more involved to diagnose test failures as parallelism errors.
Some of my tests also suffer from exactly that port number issue. Just one thought I had now: use the number of the test file - if you have numbered them - in addition to the fixed port number. Then those should be unique.
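A sketch of that idea, assuming test files carry numeric prefixes like t/07-redirect.t (the base port and the PID fallback are made up for illustration):

```perl
# Pull the first run of digits out of the test script's own name ($0),
# e.g. t/07-redirect.t yields 7, and offset the base port by it.
my ($file_num) = $0 =~ /(\d+)/;
defined $file_num or $file_num = $$ % 20000;    # fallback for unnumbered files

my $port = 13432 + $file_num;
```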