Controlling Test Parallelism with prove

If you're fortunate enough to have a test suite which allows parallel execution, one small prove feature can save you a lot of time.

prove, of course, is a relatively new utility included with Test::Harness and TAP::Harness. It handles many of the little details of running test programs and collecting and reporting the output; it's one of those utilities that looks really silly before you use it, and then becomes indispensable within a week.

prove also has an option to run tests in parallel. The -j option takes a number (written here as -j#) which specifies how many test files to run at once. I've had good success with -j9 on my desktop machine; the right number depends on your tasks, the number of cores available, the amount of memory used by each process, and the runtime characteristics of each process.
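If you'd rather not hard-code the number, one rough starting point is to derive it from the core count and then tune by timing real runs. This is only a sketch: the cores-plus-one heuristic and the variable names are my own assumptions, not anything prove recommends, and it assumes a POSIX shell with either nproc or getconf available.

```shell
# Hypothetical heuristic: one job per core plus one, so that a core stays
# busy while another test file waits on IO. Measure and adjust.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
jobs=$((cores + 1))

# Print the command instead of running it, so this is safe to paste anywhere.
echo "suggested: prove -j${jobs} -lr t/"
```

IO-bound suites may do better with a higher number than this; memory-hungry ones may need a lower one.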

prove's -l option adds the relative lib/ directory to Perl's include path, so that you can test pure-Perl code without running it through a build cycle or without having to add use lib '...'; lines to your test files.

The -r option searches a given directory recursively for .t files.

Thus, the command prove -lr -j9 t/ runs all of the .t files found under t/, up to nine at a time, and prefers modules found under lib/. This is useful.

Of course I have a shell alias with one more feature:

alias proveall='prove -j9 --state=slow,save -lr t'

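prove's --state option takes a comma-separated list of values. The save value records information about the test run, such as how long each test file took. If you save state, subsequent runs can use that information (through values such as slow) to determine how to run tests again.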

I often have several types of tests, especially for code with user interfaces and data models. The data model tests exercise business logic, and the UI tests exercise control flow and error handling. Usually the business tests take the longest to run, and usually only one or two test files take most of the time. When prove saves the state of the test run, it can schedule those slow tests first so that the fast tests can fill the gaps while a slow test blocks.

Again, this all depends on your workload. Much of my code is more IO bound than CPU bound. I've seen slow tests take 20% or more of total suite execution time after everything else has finished just because they have so many points where they have to wait.

I regularly have test suite times under 30 seconds (often closer to 10 or 12 seconds) on moderately large projects because I can exploit easy opportunities for parallelism. Certainly the right tweaking and scheduling could get me more benefit, but running proveall and making sure that parallelism is possible from the start gets me most of that benefit with almost no additional work.

(This isn't solely an academic obsession; in my measured personal experience, the more often I can run the entire test suite, the easier it is to find and fix bugs. I won't go as far as to say that continuous integration is a crutch, but if you're using CI and can't run the most important tests covering most of your code in 30 seconds, you're shortchanging yourself.)
