One of my work projects has a test suite that's grown a little too slow for my liking. With only a few hundred tests, I've seen our test runs take over 30 seconds in a couple of cases. We're diligent about separating tests into individual logical files, so that when we work on a feature we can run only its tests as we develop and save running the whole suite for the last verification before checkin, but 30 seconds is still too long.
(Before you ask, our test suite is fully parallelizable, so that 30 seconds keeps all of the cores on a development box at full usage.)
I realize that one fad in software of the last few years is continuous integration, where you submit your commits to a box somewhere for that to run the full test suite and email you if something goes wrong, sometime in the future, hopefully less than an hour from now. Hopefully.
We can't afford to wait that long to know if our code is right. (I like to push to our production box as soon as we have well-tested fixes that we know work reliably. That's often more than once per day. I'm the only one working on the site today, and I've already pushed three changes, with at least one more coming.)
When I worked on Parrot, I spent a lot of time profiling the VM to sneak out small performance improvements here and there, using less memory and far fewer CPU cycles. I'm at home in a profiler, so I picked a very small test program and ran it through Devel::NYTProf.
As you might expect, startup time dominates most of the tests. We have 21 individual test files, and startup took close to two seconds for each of them. No matter how much we parallelized, we were always going to spend no less than 40 total seconds running tests. (The non-parallel test suite runs in ~67 seconds, so parallelization already gives a 55% improvement.)
When I looked at the profile, I saw that most of the startup time came from a couple of dependencies we didn't need.
In fact, even though I use HTML::FormHandler quite happily on another project, we don't use it on this project at all. We did use a Catalyst plugin which brought it in as a dependency, but we never used that feature.
I spent an hour revising that part of our application to remove that plugin (which we used to use, then customized our code so that the plugin was almost vestigial) and the parallel test run went down to 23 seconds.
We trimmed 10% of our dependencies, just by removing one plugin and the code it brought in.
(Tip: use Devel::TraceUse to find out which modules load which modules. It's a good way to get the conceptual weight of a dependency.)
I don't shy away from dependencies when they're useful, but I ended up revising one file and ending up with less code overall than we had before I started.
With that said, there's clearly room for improvement. I traced through some Moose code which handles initialization (especially of attributes) and it seems like there should be some efficiency gains there. Similarly, both Catalyst and DBIC go through some contortions to load plugins and components that in my case were unnecessary. (I have a patch to consider for the former, but not the latter.)
Two things I learned do bother me somewhat. If you use
Devel::TraceUse on a project of any size, you'll see a lot of
inconsistency and duplication, thanks to years of TIMTOWTDI. Catalyst brings in
Moose, but DBIC eventually ends up bringing in Moo. No one can decide whether to
use base '...'; or
'...';. A lot of larger projects (DateTime, Moose extensions) have lots
of helper modules spread out in lots of tiny .pm files which each need
to be located, loaded, and compiled.
Worse, all of the nice declarative syntax you get with some of these modern libraries has a cost behind the scenes every time you run the program even if nothing has changed.
If I were to pick an optimization target for future versions of Perl 5, I'd want a good way for Moose or Catalyst or whatever to precompile my nice sugared code into a cached representation that could then load faster on subsequent invocations. (I know you're not getting an order of magnitude improvement here: you still have to set up stubs for metaclasses and the like, but even getting attributes and methods in place would be an improvement!)
I'd settle today for a preforking test suite that doesn't require me to rewrite all of my tests.
Then again, that ~25% improvement in test execution from removing one dependency isn't bad for the week either.