Neal Ford writes about Coalmine Canary Tests. Canaries were carried by miners in the ‘olden days’ as an early warning system for a buildup of gas, giving them time to exit the mine before the air became deadly. Neal describes using a set of simple tests that verify basic execution of parts of your application. When changes are made, the canary tests can be run first to give earlier feedback when something has gone horribly wrong.
Recently I started using the same name ‘canary tests’ for a slightly different type of test. The team I’ve been working with has inherited a large web application that suffers badly from a couple of environmental dependencies. It has to be started up just right, with components started in the right order and with the right timing. We’ve been doing archaeology on the system, learning its tricks and failure modes, but we’ve been limited in how much time we can spend on finding root causes. Most of the time, when the application has been deployed or started in a bad state, we can find some corresponding symptom. One by one, the team has collected these symptoms and built a set of ‘canary tests’ that are run before the deployed environment is handed over for testing. If one of those tests fails, we need to work out what has changed in the environment, and probably at least restart and check again before continuing.
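To make the idea concrete, here is a minimal sketch of what such a canary run could look like. This is not our actual suite: the `Canary` class, the `run_canaries` and `environment_ready` functions, and the idea of one check per known symptom are all assumptions made for illustration. Each check returns True when the environment looks healthy, and a check that throws is treated as a failed canary rather than being skipped.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Canary:
    """One known symptom of a badly started environment (hypothetical structure)."""
    name: str                   # human-readable description of the symptom
    check: Callable[[], bool]   # returns True when the environment looks healthy


def run_canaries(canaries: List[Canary]) -> List[str]:
    """Run every canary and return the names of the ones that failed."""
    failed = []
    for canary in canaries:
        try:
            healthy = canary.check()
        except Exception:
            # A check that raises counts as a failure, not as a skipped test.
            healthy = False
        if not healthy:
            failed.append(canary.name)
    return failed


def environment_ready(canaries: List[Canary]) -> bool:
    """Report failed canaries and say whether the environment can be handed over."""
    failed = run_canaries(canaries)
    for name in failed:
        print(f"CANARY FAILED: {name}")
    return not failed
```

In use, each symptom the team discovers would become one more `Canary` in the list, and the deployment script would refuse to hand the environment over for testing unless `environment_ready(...)` returns True.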
The canary tests are not a solution; they just help us make sure we don’t continue working with the environment while it’s in a bad state. We are separately investigating each problem to find its root cause and put in place a permanent solution if possible. Some of these may take a long time to put in place, but at least the impact can be managed.