Prevent Test Rot

October 17, 2017

Nobody likes buggy software, so we must test our code. Whether we conduct these tests manually or automatically, they are a crucial part of the development process. All too often, though, testing strategies and automated test suites begin to fail because they are not given the care they need to survive.

One of the most difficult things I’ve seen development teams struggle with over the years is maintaining a solid test suite as their application grows. Whatever the style (TDD or otherwise), the tests begin to rot. To figure out how to prevent a destructive end, we must first understand what a good suite looks like.

Characteristics of a Good Test Suite

There are a number of characteristics a good suite must have to avoid the lingering threat of test rot. You’ve probably seen the items in this list before. What I’d like to stress here is that when any one of these qualities is lost, it can be catastrophic for the entire collection.

  1. fast - Slow test suites are just not run. If they’re not run, broken tests aren’t noticed until it’s too late. When can you tell that your tests are taking too long? When a single developer stops running them because they’re “too slow”. The exact time is different for every team.
  2. green - A failing test that isn’t fixed right away is a rust spot that will spread quickly. Even a yellow (skipped) test is a signal to the team that care is slipping, and that is a very slippery slope.
  3. durable - Fragile suites do not survive. If a small refactor breaks a dozen tests in unexpected ways, one of two things will happen: refactoring will stop, or the collection will die.
  4. meaningful - Each test must be meaningful. That is, if it fails, it means there is a bug in the system that the failing test just caught for you. If you discover a failure that does not fit this description, action must be taken.
  5. consistent - Flaky tests are as bad as, or worse than, those that are always failing. If a test does not pass on every run, it is flaky. On a related note, a suite should be able to run its tests in a random order and still pass consistently.
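The random-order point in the last item can be checked mechanically. Here is a minimal sketch in Python's standard unittest module, not a prescription for any particular test runner. The helper names (run_shuffled, find_order_dependent_seeds) and the SharedStateTests class are hypothetical and exist only for illustration. The example class shares state between its two tests on purpose, so it passes in one order and fails in the other.

```python
import random
import unittest


def run_shuffled(test_case_class, seed):
    """Run every test in test_case_class in an order determined by seed."""
    names = list(unittest.TestLoader().getTestCaseNames(test_case_class))
    random.Random(seed).shuffle(names)  # same seed -> same order, so failures can be replayed
    suite = unittest.TestSuite(test_case_class(name) for name in names)
    result = unittest.TestResult()
    suite.run(result)
    return names, result


def find_order_dependent_seeds(test_case_class, seeds):
    """Return the seeds whose shuffled run did not pass cleanly."""
    return [
        seed for seed in seeds
        if not run_shuffled(test_case_class, seed)[1].wasSuccessful()
    ]


class SharedStateTests(unittest.TestCase):
    """Hypothetical example: two tests coupled through class-level state."""

    @classmethod
    def setUpClass(cls):
        cls.registry = []  # reset once per run, but shared by both tests within a run

    def test_a_starts_empty(self):
        # Only passes if it runs before test_b_adds_item.
        self.assertEqual(self.registry, [])

    def test_b_adds_item(self):
        self.registry.append("item")
        self.assertIn("item", self.registry)
```

Calling find_order_dependent_seeds(SharedStateTests, range(20)) returns the seeds that put test_b_adds_item first. Any non-empty answer means the suite only passes by luck of ordering, and each seed can be replayed to reproduce the failure.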

Test Suite Life Cycle

The beginning of any project is great. Every developer is happy to write and run tests around their new code. As time moves forward and the system gets more and more complex, the code begins to decay. Time is often (though not always) taken to care for the main code of the system, but test code is overlooked and neglected.

If there is one developer on the team who doesn’t take testing seriously, the rot worsens much more quickly. At first, the team fixes the tests left broken by the rogue developer. Eventually, however, they tire and the suite is forgotten. Even the most eager test-writing developers stop writing tests when the collection is neglected by everyone else.

The Broken Window

David Thomas and Andrew Hunt talk about the “Broken Window Theory” in their book, The Pragmatic Programmer. The idea is that one broken window left unrepaired leads to another. Then litter begins to appear, then graffiti. And so it goes until the whole building is condemned.

In my experience, this problem is indisputable in test suites. One neglected broken test leads to another, then another, until none of them are run and the entire group is abandoned. In fact, I believe a single skipped test (i.e. that yellow asterisk shown in RSpec) is also a broken window that needs to be repaired.

A completely green suite of meaningful tests is easy to understand. If a test run has to be manually interpreted by looking at each failing or skipped test to know if the code is working correctly, the suite is in trouble.
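One way to make "completely green" something a machine can check is to treat a skipped test as a failed run. The sketch below uses Python's standard unittest module as a stand-in for whatever runner a team actually uses. The names is_strictly_green, run_quietly, and ExampleTests are hypothetical, made up for this illustration. It relies on the fact that unittest's own wasSuccessful() still reports success when tests were skipped.

```python
import unittest


def is_strictly_green(result):
    """True only if nothing failed or errored AND nothing was skipped."""
    return result.wasSuccessful() and not result.skipped


def run_quietly(test_case_class):
    """Run one TestCase class and return its TestResult without printing."""
    suite = unittest.TestLoader().loadTestsFromTestCase(test_case_class)
    result = unittest.TestResult()
    suite.run(result)
    return result


class ExampleTests(unittest.TestCase):
    """Hypothetical suite: one passing test and one skipped test."""

    def test_passes(self):
        self.assertEqual(1 + 1, 2)

    @unittest.skip("hypothetical: disabled until someone has time")
    def test_skipped(self):
        self.fail("never runs")
```

For ExampleTests, run_quietly(ExampleTests).wasSuccessful() is True because the skip does not count against it, while is_strictly_green returns False. A build script that exits non-zero on the stricter check means nobody has to read individual results to decide whether the code is healthy.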

Only You Can Prevent Test Rot

If a test is found to be brittle, useless, flaky, or slower than it’s worth, fix it. If you don’t have time to fix it, delete it with extreme prejudice.

There is certainly an argument to be had around the words “if you don’t have time to fix it”. I am all for having that colorful and difficult conversation. Barring that discussion, however, if someone feels they just don’t have the time to get a test up to the proper quality, it is far better to delete it than to let it hang around in a bad state.

If a test suite has 90% code coverage but is never run, it might as well be 0%. If deleting a bunch of tests results in only 50% code coverage, that is still better than zero.

Those of us who solve the slow test suite problem by running long tests on a build server should understand that this can be quite tricky. Those tests are easily forgotten and ignored when they start turning red. The moment they are ignored, they might as well not exist.

Final Thoughts

It is often difficult to write good tests. It is even more difficult to ensure those tests stay good. Today’s perfectly constructed test is tomorrow’s fragile or broken one. I know it is hard to delete code that we worked so hard on. I know that it feels wrong, that we might allow bugs to slip into the system without that test being there. But the choice has to be to fix it or delete it. It’s the only way the rest can survive.