Integration tests - what level are you testing at?
Hi there,
My name is Michiel. I'm a software engineer from Nottingham currently employed by Experian Consumer Services. In my daily job, I come into contact with a lot of code written by a lot of different people. This story follows one of my recent forays into our codebase and shares what I found, so that you can avoid the same pitfall.
The title of this post should actually be "E2E tests considered harmful", as per Edsger Dijkstra's famous article "Go To Statement Considered Harmful." You might be thinking to yourself, "I run E2E tests all the time and they find lots of bugs!" I'm not arguing for the absence of any E2E tests. I am, however, arguing for fewer E2E tests in favour of more integration-level tests.
Recently I've been looking at our test suite and wondering why our tests seem to fail sporadically. The way we run our automated test suites (excluding unit tests, mind you) is by triggering a Jenkins job from a Travis build after it has finished deploying the latest code to our development environment. This means the Pull Request, in GitHub parlance, has already been merged to mainline.
You can already start to see one issue with this: what if the integration tests fail? You just end up having to branch, commit, approve, merge - again. To me this seemed like excessive overhead. We didn't need a Jenkins job for our integration tests, and we didn't need such a long feedback loop. Ideally, we'd want to run our integration tests against the service in the Pull Request build.
E2E or integration?
When I first tried making these changes, I found out that some work had already been done towards running integration tests against a locally running service. In our microservice architecture, every service may have to call out to multiple other services. To mock these calls and make assertions against the requests, we used WireMock. This allowed us to stub out all our external services and run assertions against only the service under test.
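To make that concrete, here is a minimal skeleton of what such a test can look like. It assumes JUnit 4 and the WireMock Java library; the port, paths, payloads and class names are purely illustrative, and the commented placeholder where the locally running service is exercised depends on how you start your service under test.

```java
import static com.github.tomakehurst.wiremock.client.WireMock.*;

import com.github.tomakehurst.wiremock.WireMockServer;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class ServiceAIntegrationTest {

    // Stand-in for service B: the service under test is configured to call
    // http://localhost:8089 instead of the real downstream service.
    private WireMockServer serviceB;

    @Before
    public void startStub() {
        serviceB = new WireMockServer(8089);
        serviceB.start();
    }

    @After
    public void stopStub() {
        serviceB.stop();
    }

    @Test
    public void callsServiceBWithTheExpectedRequest() {
        // Stub the downstream endpoint so no real service B is needed.
        serviceB.stubFor(get(urlEqualTo("/accounts/42"))
                .willReturn(aResponse()
                        .withStatus(200)
                        .withHeader("Content-Type", "application/json")
                        .withBody("{\"id\": 42, \"status\": \"active\"}")));

        // (placeholder) exercise the locally running service A here, e.g. via
        // its public HTTP API, so that it calls out to the stub.

        // Assert against the request service A made to its dependency.
        serviceB.verify(getRequestedFor(urlEqualTo("/accounts/42")));
    }
}
```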
Interestingly, this work revealed certain cracks in our testing approach. In service A, we found multiple tests that would call service B in order to test something unrelated to both! (An authentication proxy, as it turns out.) So what should have been an integration spec had been upgraded to an E2E spec, as it relied on service A, service B and the authentication proxy.
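One way to pull such a spec back down to the integration level - continuing the hypothetical skeleton above - is to stub the authentication call as well, so the test once again exercises only service A. The token endpoint and payload below are illustrative, not our actual contract.

```java
@Test
public void authenticatesWithoutTouchingServiceB() {
    // Stub the token endpoint that the spec previously reached through the
    // authentication proxy and service B.
    serviceB.stubFor(post(urlEqualTo("/oauth/token"))
            .willReturn(aResponse()
                    .withStatus(200)
                    .withHeader("Content-Type", "application/json")
                    .withBody("{\"access_token\": \"test-token\"}")));

    // (placeholder) exercise an authenticated endpoint on service A here.

    // The assertion stays within the service under test: we only check the
    // request it made, not the behaviour of service B or the proxy.
    serviceB.verify(postRequestedFor(urlEqualTo("/oauth/token")));
}
```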
To explain why this is a problem, consider the testing pyramid (or one variation thereof) below.
As we go up the pyramid, test execution times go up (often drastically) and test reliability goes down. This means that E2E tests are the slowest and most unreliable tests of the lot. By turning our integration suite into an E2E suite, we had decreased the reliability of our test suite and thereby decreased its effectiveness.
When to do an E2E suite
This all is not to say E2E tests don't have any place within your testing toolbelt. They're a very useful asset to have, especially when reporting to the business: if something breaks, it's better to be able to say "we tested it all together" than "our component works as expected!" These are just general findings as a result of our architecture - they might not apply to you.
E2E suites can be effective tools when testing the platform as a whole; they are not there to test the individual components. Because of the large scope of these tests, it is often not advisable to run them on every integration - as we had been doing. If you need to test the whole platform (for example, when releasing a major upgrade or platform-wide changes), they can be very useful and can catch issues before they make their way to production.
Conclusion
As a result of these changes, we've been able to remove our Jenkins job, and I hope to do the same for many, many other components in the near future. Instead, we run all the integration tests on every Pull Request, on Travis. So not only did we make the tests easier to understand, we reduced the complexity of our infrastructure too. This means a shorter feedback loop for developers, and it has promoted cooperation between our developers and our QA engineers.
Evaluate what you are testing with each new test case you write. Don't "test the world" by writing E2E specs where a simpler integration test would have sufficed. Pay attention to separation of concerns: neglecting it can make your test suite flaky and difficult to maintain.
And most of all: keep on testing!