Running your tests in a random order is a good idea to help shake out implicit dependencies between tests. Running your tests in a deterministic random order is even better.
What’s an implicit dependency?
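It’s easy to accidentally create order-dependent tests. One test changes shared state, such as a global setting or a database record, and a later test passes only because that change is still in place.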
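Here is a minimal sketch of such a dependency in plain Ruby. The `AppConfig` module and both "tests" are hypothetical, and they are written as lambdas rather than RSpec or MiniTest cases so the example runs on its own. The same pair of tests passes in one order and fails in the other:

```ruby
# Hypothetical global configuration, standing in for any shared state.
module AppConfig
  class << self
    attr_accessor :locale
  end
  self.locale = "en"
end

# Each "test" is a lambda that returns true when it passes, so the sketch
# runs without a test framework.
TESTS = {
  switches_locale: lambda {
    AppConfig.locale = "fr" # mutates shared state and never restores it
    AppConfig.locale == "fr"
  },
  uses_default_locale: lambda {
    AppConfig.locale == "en" # silently assumes nobody touched the setting
  }
}.freeze

# Run the named tests in the given order, starting from a clean configuration.
def run_in_order(names)
  AppConfig.locale = "en"
  names.map { |name| [name, TESTS.fetch(name).call] }.to_h
end

lucky_order = run_in_order(%i[uses_default_locale switches_locale])
unlucky_order = run_in_order(%i[switches_locale uses_default_locale])

puts "lucky:   #{lucky_order}"
puts "unlucky: #{unlucky_order}"
```

Neither test looks wrong in isolation, which is what makes this kind of failure hard to spot until the order changes.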
Why should I care?
Dependencies between tests are bad for a number of reasons:
- When a single test fails, you need to run many tests to reproduce the failure. This makes reproduction slower and more annoying.
- The test method is no longer complete documentation. The required setup for the test is located in many different methods.
- The complexity of the test is hidden. What looks like a two-line test may actually comprise hundreds of lines of code. Complex test code is often an excellent indicator of complex production code.
Running tests in a random order isn’t enough; you need to be able to reproduce the same random order before you can track down and fix an order-dependent failure! RSpec and MiniTest both offer a way to specify the random seed on the command line or with environment variables. Unfortunately, the Surefire plugin for Maven does not offer a way to specify the seed, even though it allows random ordering.
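To see why a seed makes a random order reproducible, here is a small sketch using Ruby's own `Array#shuffle` with a seeded `Random`. The test names are made up, and this is not how RSpec or MiniTest order tests internally; it only illustrates the principle:

```ruby
# Hypothetical test names; any list of runnable tests would do.
TEST_NAMES = %w[test_login test_logout test_signup test_profile test_search].freeze

# Shuffle with a generator built from the seed, so the same seed always
# produces the same order.
def ordered_tests(seed)
  TEST_NAMES.shuffle(random: Random.new(seed))
end

seed = 12_345
first_run = ordered_tests(seed)
second_run = ordered_tests(seed)

puts "seed=#{seed}: #{first_run.join(', ')}"
puts "same order on rerun: #{first_run == second_run}"
```

In practice you would pass the seed to the test runner instead, for example `rspec --seed 12345`, or `--seed 12345` (or the `SEED` environment variable) for MiniTest.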
Continuous integration servers
At work, we use Gerrit for code reviews and Jenkins as our CI server. Whenever a new or updated commit is pushed to Gerrit, a build is started in Jenkins. There is also a Jenkins job to build origin/master every 15 minutes if it has been updated.
The Gerrit/Jenkins combination allows you to retrigger a specific build in case there were environmental issues that have since been fixed. Unfortunately for us, retriggering was being used as a way to avoid dealing with test failures due to order dependencies. To encourage us to stop and address our order dependency problem, we updated both jobs to use a deterministic seed.
For the Gerrit builds, we used the Gerrit change number, which remains constant across multiple revisions of the same commit. The Gerrit plugin makes this value available as an environment variable during script execution.
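A rough sketch of how a build script might pick up that value is below. It assumes the variable is named `GERRIT_CHANGE_NUMBER` (the name the Jenkins Gerrit Trigger plugin uses) and that the suite runs under RSpec via Bundler; the helper names are hypothetical, and our actual job configuration is not shown here:

```ruby
# Read the Gerrit change number and use it as the test seed.
# ASSUMPTION: the CI plugin exports it as GERRIT_CHANGE_NUMBER.
# Raises KeyError if the variable is missing, so a misconfigured job fails
# loudly instead of silently falling back to a random seed.
def gerrit_seed(env = ENV)
  Integer(env.fetch("GERRIT_CHANGE_NUMBER"), 10)
end

# Build the command line for a seeded RSpec run (hypothetical helper).
def rspec_command(seed)
  ["bundle", "exec", "rspec", "--seed", seed.to_s]
end

# In the CI script this would be invoked as:
#   system(*rspec_command(gerrit_seed)) or exit(1)
```

Because the change number stays the same across patch sets, retriggering a build reruns the tests in the same order rather than rolling the dice again.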
For the origin/master build, we chose to use the Git hash of the commit. Since the hash contains letters, we used a shell one-liner to scrape out something that looks reasonable as a seed.
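The one-liner itself isn't reproduced here, but the idea can be sketched in Ruby. This version is one possible approach and an assumption on my part, not necessarily what our script did: it drops every non-digit character from the hash and keeps the first few digits.

```ruby
# Turn a Git commit hash into a small integer seed.
# ASSUMPTION: stripping letters and keeping the first `digits` digits is
# good enough; the helper name and the default of 5 are illustrative only.
def seed_from_git_hash(hash, digits: 5)
  # delete("^0-9") removes every character that is not a decimal digit.
  # A hash with no digits at all yields "", which to_i turns into 0.
  hash.delete("^0-9")[0, digits].to_i
end

# In the CI script the hash would come from Git, for example:
#   seed = seed_from_git_hash(`git rev-parse HEAD`.strip)
example_seed = seed_from_git_hash("a1b2c3d4e5f6a7b8")
puts "seed: #{example_seed}"
```

The same commit always maps to the same seed, so a failure on origin/master can be replayed locally by checking out that commit and reusing the number.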
Does it work?
Just a few days after making the above changes, another developer came to me with a strange problem. His commit was unable to pass the tests in Gerrit, but the failing test had nothing to do with his changes. We ran the tests locally using the seed from the Jenkins server and were able to reproduce the problem. Ultimately, we traced the problem to a request spec that modified some core configuration settings and didn’t reset them successfully. Success!