Ned Batchelder
tl;dw: Stop mocking, start testing
June 2012
At PyCon 2012, Augie Fackler and Nathaniel Manista gave a talk entitled "Stop Mocking, Start Testing." This is my textual summary of the talk, the first of a series of summaries. You can look at Augie and Nathaniel's slides themselves, or watch the video.
If you not only don't have time to watch the video, but don't even want to read this summary, here's the tl;dr:
Here's (roughly) what Augie and Nathaniel said:
We work on Google Code, which has been a project since July 2006. There are about 50 engineer-years of work on it so far. Median time on the project is 2 years; people rotate in and out, which is usual for Google. Google Code offers svn, hg, git, wiki, issue tracker, download service, offline batch, etc. They started off with a few implementation languages; now there are at least eight.
There are many servers, processes, and components, including RPC services, all talking to each other, until finally at the bottom there's persistence. Your code is probably like this too: stateless components, messages sent between components, user data stored statefully at the bottom.
What's been the evolution of the testing process? Standard operating procedure as of 2006: Limited test coverage. We inherited the svn test suite, but it had to be run manually against a preconfigured dev environment, and then the output had to be examined manually. It took all afternoon!
"Tests? We have users to test!" An effective but stressful way to find bugs. Users are not a test infrastructure. Tests that cost more people time than CPU time are bad. A project can't grow this way. If the feature surface area grows linearly, the time spent testing grows quadratically.
Starting to Test (2009): A new crew of engineers rolled onto the project, but they didn't understand the existing code. Policy: tests are required for new and modified code. Untouched code remained untested. The core persistence layer changes a lot, so it's well tested, but the layers above might not be, and that untested code would break on deploy. We set up a continuous build server with a red/green light, though a few engineers are red/green color-blind, so we had to find just the right colors!
We thought we were doing well, and adding tests was helping, but the tests were problems themselves. Everyone made their own mock objects, so we had N different implementations of a mock. When the real code changed, you had to find all N mocks and update them.
The problem wasn't just having N mocks: even a single mock would tell us what we wanted to hear. The mocks did what we told them to do, instead of accurately modeling the real code. Tests would pass, then the product would break on deploy. The mocks had diverged from the real code.
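To make the divergence concrete, here is a minimal sketch in Python. The `IssueStore` class and `issue_title` function are hypothetical, invented for illustration, and are not taken from the talk. The sketch uses the standard library's `unittest.mock.create_autospec` as one way to surface the drift; that is a technique swapped in here, not necessarily what Augie and Nathaniel's team did.

```python
from unittest import mock


class IssueStore:
    """Hypothetical real component. Its API has changed: get_issue now
    requires a project argument as well as an issue id."""

    def get_issue(self, project, issue_id):
        return {"project": project, "id": issue_id, "title": "untitled"}


def issue_title(store, issue_id):
    # Stale caller: still uses the old one-argument signature.
    return store.get_issue(issue_id)["title"]


# A hand-rolled, permissive mock accepts any call, so the test passes
# even though the same call would fail against the real IssueStore.
loose = mock.Mock()
loose.get_issue.return_value = {"title": "Crash on login"}
assert issue_title(loose, 7) == "Crash on login"

# A mock specced from the real class checks call signatures, so the
# stale call is rejected at test time instead of on deploy.
strict = mock.create_autospec(IssueStore, instance=True)
strict.get_issue.return_value = {"title": "Crash on login"}
try:
    issue_title(strict, 7)
    drift_detected = False
except TypeError:
    drift_detected = True
```

The permissive mock "tells us what we want to hear"; the specced one at least fails when the signature of the real code moves. It still cannot check that the canned return value matches what the real component would return.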
Lessons so far:
We tried to use full Selenium system tests to make up for gaps in unit coverage. Selenium is slow, race conditions creep in, and problems are difficult to diagnose. System tests weren't a good replacement for unit tests; unit tests give much better information.
We tested user stories with full system tests, and this worked much better. We still use system tests, but they test the user story, not the edge conditions.
We went through Enlightenment, now we have modern mocking:
Testing today: Tests are written to the interface, not the implementation. When writing tests ask yourself, "how much could the implementation change without my having to change the test?" Running against mocks in CI makes the tests go faster, and reduces cycles.
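As an illustration of testing to the interface, here is a small sketch. `InMemoryStore` and `StarCounter` are hypothetical names made up for this example, not anything from Google Code. The test only calls public methods and checks visible results, so `StarCounter` could change how and when it talks to the store without the test changing.

```python
class InMemoryStore:
    """Hypothetical fake persistence layer: a dict behind get/put."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class StarCounter:
    """Hypothetical component under test: counts stars per project."""

    def __init__(self, store):
        self._store = store

    def star(self, project):
        self._store.put(project, self._store.get(project, 0) + 1)

    def count(self, project):
        return self._store.get(project, 0)


def test_star_increments_count():
    counter = StarCounter(InMemoryStore())
    counter.star("hg")
    counter.star("hg")
    # Assert on observable behavior, not on which store calls were made.
    assert counter.count("hg") == 2
    assert counter.count("git") == 0


test_star_increments_count()
```

By contrast, a test that asserted something like "`put` was called exactly once with `('hg', 1)`" would break as soon as the implementation batched or reordered its writes, even if the behavior users see stayed the same.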
We used to do bad things:
Now we do good things:
Define clear interfaces between components. If you can't figure out how to write a test, it's a code smell: you need to think more about the product code.
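One way to picture a clear interface between components is sketched below, assuming a hypothetical `UserStore` interface with a SQLite-backed implementation and an in-memory fake (all names are invented for this example). Running the same contract check against both is one possible guard against the fake diverging from the real code; it is offered as a sketch, not as the speakers' actual setup.

```python
import sqlite3
from typing import Optional, Protocol


class UserStore(Protocol):
    """Hypothetical interface that both implementations must satisfy."""

    def save(self, username: str, email: str) -> None: ...
    def email_for(self, username: str) -> Optional[str]: ...


class SqliteUserStore:
    """Stands in for the 'real' persistence-backed implementation."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE users (username TEXT PRIMARY KEY, email TEXT)"
        )

    def save(self, username, email):
        self._db.execute(
            "INSERT OR REPLACE INTO users VALUES (?, ?)", (username, email)
        )

    def email_for(self, username):
        row = self._db.execute(
            "SELECT email FROM users WHERE username = ?", (username,)
        ).fetchone()
        return row[0] if row else None


class FakeUserStore:
    """Fast in-memory fake for use in other components' unit tests."""

    def __init__(self):
        self._emails = {}

    def save(self, username, email):
        self._emails[username] = email

    def email_for(self, username):
        return self._emails.get(username)


def check_user_store_contract(store: UserStore) -> None:
    # The same expectations apply to every implementation.
    assert store.email_for("alice") is None
    store.save("alice", "a@example.com")
    store.save("alice", "b@example.com")  # saving again overwrites
    assert store.email_for("alice") == "b@example.com"


for make_store in (SqliteUserStore, FakeUserStore):
    check_user_store_contract(make_store())
```

If the real implementation's behavior changes, the shared check fails for the fake too, which points at the one place that needs updating rather than at N separate mocks.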