Goals of Test Automation
About This Chapter
In the Test Smells narrative I introduce the various "test smells" that can act as symptoms of problems with our automated testing. In this chapter I describe the goals we should be striving to reach to ensure successful automated unit tests and customer tests. I start off with a general discussion of why we automate tests before describing the overall goals of test automation including reducing cost, improving quality and improving the understanding of our code. Each of these areas has more detailed named goals that I discuss briefly. I don't describe how to achieve these goals in this chapter; that will come in subsequent chapters where I use these goals as rationale for many of the principles and patterns.
Why Test
Much has been written about the need for automated unit and acceptance tests as part of agile software development. Writing good test code is hard and maintaining obtuse test code is even harder. Since test code is optional (not what the customer is paying for), there is a strong temptation to give up testing when the tests become difficult or expensive to maintain. Once we have given up on the principle of "keep the bar green to keep the code clean", much of the value of the automated tests is lost.
Over a series of projects the teams I have worked with faced a number of challenges to automated testing. The cost of writing and maintaining test suites has been a particular challenge, especially on projects with thousands of tests. Fortunately, necessity is the mother of invention and we, and others, have developed a number of solutions to address these challenges. I have since gone on to introspect about these solutions to ask why they are good solutions. I have divided them into goals (things to achieve) and principles (ways to achieve them.) I believe that adherence to the goals and principles will result in automated tests that are easier to write, read, and maintain.
Economics of Test Automation
Of course there will always be a cost to building and maintaining an automated test suite. Ardent test automation advocates will argue that it is worth spending more to have the ability to change the software later. This "pay me now so you don't have to pay me later" argument doesn't go very far in a tough economic climate. (The argument that the quality improvement is worth the extra cost often doesn't go very far either in these days of "just good enough" software quality.)
Our goal should be to make the decision to do test automation a "no-brainer" by ensuring that it does not increase the cost of software development. This means that the additional cost of building and maintaining automated tests must be offset by savings from reduced manual unit testing and debugging/troubleshooting, as well as by avoiding the cost of remediating defects that would otherwise have gone undetected until the formal test phase of the project or early production use of the application. This figure shows how the cost of automation is offset by the savings received from automation.
Sketch Economics-Good embedded from Economics-Good.gif
The cost-benefit trade off where the total cost is reduced by good test practices.
Initially, the cost of learning the new technology and practices takes additional effort but once we get past the "hump", we should settle down to a steady state where the added cost (the part above the line) is at least fully offset by the savings (the part below the line.) If tests are hard to write, hard to understand and require frequent, expensive maintenance, the total cost of software development (the heights of the vertical arrows) goes up as illustrated in this version of the figure:
Sketch Economics-Bad embedded from Economics-Bad.gif
The cost-benefit trade off where the total cost is increased by poor test practices.
Note how the added work above the line is more than in the first diagram and continues to increase over time. Also, the saved effort below the line is reduced. This results in an increase in overall effort, which now exceeds the original effort without test automation.
Goals of Test Automation
We all come to test automation with some notion of why having automated tests would be a "good thing". Here are some high level objectives that might apply:
- Tests should help us improve quality.
- Tests should help us understand the system under test (SUT).
- Tests should reduce (and not introduce) risk.
- Tests should be easy to run.
- Tests should be easy to write and maintain.
- Tests should require minimal maintenance as the system evolves around them.
The first three objectives are the value provided by the tests while the last three objectives are focused on the characteristics of the tests themselves. Most of these objectives can be decomposed into more concrete (and measurable) goals. I have given these short catchy names so that I can refer to them as motivators of specific principles or patterns.
Tests should help us improve quality
The traditional reason for doing testing is for "quality assurance" or "QA" as it is often abbreviated. What exactly do we mean by this? What exactly is quality? Traditional definitions include two main categories of quality: 1) Is the software built correctly? and 2) Have we built the right software?
Goal: Tests as Specification
Also known as: Executable Specification
If we are doing test-driven development or test-first development, the tests give us a way to capture what the SUT should be doing before we start implementing it. They give us a way to specify the behavior in various scenarios captured in a form that we can then execute (essentially an "executable specification".) To ensure we are "building the right software", we must ensure that our tests reflect how the SUT will actually be used. This can be facilitated by using user interface mock ups to capture just enough detail about how the application appears and behaves to write our tests.
The very act of thinking through various scenarios in enough detail to turn them into tests helps us to identify those areas where the requirements are ambiguous or self-contradictory. This helps us improve the quality of the specification which in turn helps improve the quality of the software it specifies.
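As a rough sketch of what an executable specification might look like (the Invoice and LineItem classes and their methods are hypothetical, introduced here only for illustration), a test written before the code exists captures the expected behavior in a runnable form:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class InvoiceSpecificationTest {

    // Written before Invoice is implemented, this test specifies the
    // expected behavior: an extended cost is unit price times quantity.
    @Test
    public void extendedCostIsUnitPriceTimesQuantity() {
        Invoice invoice = new Invoice();
        invoice.addItem(new LineItem("widget", 5 /* quantity */, 30 /* unit price */));

        assertEquals(150, invoice.getExtendedCost("widget"));
    }
}
```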
Goal: Bug Repellent
Yes, tests find bugs but that really isn't what automated testing is about. It is about preventing bugs from happening. Think of automated tests as "bug repellent" that keeps nasty little bugs from crawling back into our software after we have made sure it doesn't contain any bugs. Where we have regression tests, we won't have bugs because the tests will point them out to us before we check in our code. (We are running all the tests before every check-in, aren't we?)
Goal: Defect Localization
Mistakes happen! Some mistakes are much more expensive to prevent than to fix. Suppose a bug does slip through somehow and it shows up in the Integration Build[SCM]. If we have made our unit tests fairly small by testing only a single behavior in each, we should be able to pinpoint the bug pretty quickly based on which test is failing. This is one of the big advantages of unit tests over customer tests. The customer tests will tell us that some behavior expected by the customer isn't working. The unit test will tell us why. We call this phenomenon Defect Localization. If we have a failing customer test with no unit tests failing, that is an indication of a Missing Unit Test (see Production Bugs on page X).
All these benefits are wonderful but we cannot achieve them if we don't write tests that cover all the scenarios each unit of software needs to handle. Nor will we get the benefit if the tests themselves have bugs in them. Therefore it is crucial to keep the tests as simple as possible so that they can be easily seen to be correct. Writing unit tests for our unit tests is not a practical solution but we can and should write unit tests for any Test Utility Method (page X) to which we delegate complex algorithms needed by the test methods.
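As a minimal sketch of tests that are small enough to localize a defect and simple enough to be obviously correct (the ShoppingCart and Item classes are hypothetical), each Test Method verifies exactly one behavior, so a single red bar points directly at the behavior that is broken:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ShoppingCartTest {

    // A failure here means the empty-cart behavior is broken ...
    @Test
    public void totalOfEmptyCartIsZero() {
        ShoppingCart cart = new ShoppingCart();
        assertEquals(0, cart.getTotal());
    }

    // ... while a failure here points at the summing logic.
    @Test
    public void totalIsSumOfItemPrices() {
        ShoppingCart cart = new ShoppingCart();
        cart.addItem(new Item("book", 25));
        cart.addItem(new Item("pen", 5));
        assertEquals(30, cart.getTotal());
    }
}
```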
Tests should help us understand the SUT
Repelling bugs isn't the only thing the tests can do for us. They can show the test reader how the code is supposed to work. When viewed from the outside of any specific component (a black box view), we are in effect talking about the requirements of that piece of software.
Goal: Tests as Documentation
Without automated tests we would need to pore through the SUT code trying to answer the question "What should be the result if ...?" With automated tests, we simply use the corresponding Tests as Documentation and they should tell us what the result should be (recall that a Self-Checking Test states the expected outcome in one or more assertions.) If we want to know how the system does something, we can turn on our debugger, run the test and single-step through the code to see how it works. In this sense, the tests act as a form of documentation for the SUT.
Tests should reduce (and not introduce) risk
We've already addressed how tests should help us improve quality by helping us better document the requirements and prevent bugs from creeping in during incremental development. This is certainly one form of risk reduction. Other forms of risk reduction relate to verifying the behavior of the software in the "impossible" circumstances that cannot be induced when doing traditional customer testing of the entire application as a black box. It is a very useful exercise to review all the project risks and brainstorm about which kinds of risks could be at least partially mitigated through the use of Fully Automated Tests (see Goals of Test Automation on page X).
Goal: Tests as Safety Net
Also known as: Safety Net
When working on legacy code, I feel nervous. By definition, legacy code doesn't have a suite of automated regression tests. Making changes to this code is risky because we never know what we might break and we have no way of knowing whether we have broken something! This forces us to work very slowly and carefully by doing a lot of manual analysis before we make changes.
When working with code that has a regression test suite we can work much more quickly. We can adopt a more experimental style of changing the software. "I wonder what would happen if I changed this? Interesting! So that's what this parameter is for." The tests act as a "safety net" that allows us to take chances. (Imagine trying to learn to be a trapeze artist in the circus without having that big net that allows us to make mistakes; we'd never progress beyond swinging back and forth!)
The effectiveness of the safety net is determined by how completely our tests verify the behavior of the system. Missing tests are like holes in the safety net. Incomplete assertions are like broken strands. Each can let bugs of various sizes through.
The effectiveness of the safety net is amplified by the version control capabilities of modern software development environments. A source code Repository[SCM] like CVS, Subversion or SourceSafe lets us roll back our changes to a known point if our tests are telling us that the current set of changes has too high an impact. The built-in "undo" or "local history" features of our IDE let us turn the clock back 5 seconds, 5 minutes or even 5 hours.
Goal: Do No Harm
Also known as: No Test Risk
Then there is the flip side of this discussion: How might automated tests introduce risk? We have to be careful to make sure we don't introduce new kinds of potential problems into the SUT as a result of doing automated testing. The Keep Test Logic out of Production Code principle directs us to avoid putting test-specific hooks into the SUT. It is certainly desirable to design the system for testability but any test-specific code should be plugged in by the test and only in the test environment; it should not exist in the SUT when it is in production.
Another form of risk is believing that some code is reliable because it has been well tested when in fact it has not. A common mistake made by developers new to the use of Test Doubles (page X) is replacing too much of the SUT with Test Doubles. This leads us to another important principle: Don't Modify the SUT. That is, we must be clear about what SUT we are testing and make sure that we don't replace the parts we are testing with test-specific logic.
Sketch SUT Example embedded from SUT Example.gif
An application, component, or unit is only the SUT with respect to a specific set of tests. The “Unit1 SUT” plays the role of DOC (part of the fixture) to the “Unit2 Test” and is part of the “Comp1 SUT.”
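As a rough sketch of what Don't Modify the SUT means in practice (the Invoice and TaxCalculator types, and constructor injection of the calculator, are assumptions for illustration only), the test plugs a hand-coded Test Double in for a depended-on component while the Invoice code under test remains exactly the code that runs in production:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class InvoiceTaxTest {

    // A hand-coded Test Double for the depended-on component (DOC);
    // the SUT itself (Invoice) is not replaced or modified.
    private static class FixedRateTaxCalculator implements TaxCalculator {
        public double rateFor(String province) {
            return 0.05; // canned answer; no test logic lives in the SUT
        }
    }

    @Test
    public void taxIsAppliedToInvoiceTotal() {
        // The double is plugged in from outside via the constructor.
        Invoice invoice = new Invoice(new FixedRateTaxCalculator());
        invoice.addItemTotal(100);

        assertEquals(105.0, invoice.getTotalWithTax(), 0.001);
    }
}
```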
Tests should be easy to run
Most of us want to write code. Testing is just a necessary evil. Automated tests give us a nice "safety net" so that we can work more quickly ("with less paranoia" is probably more accurate!) but we will only run the automated tests frequently if they are really easy to run.
What makes tests easy to run? There are four specific goals: They must be Fully Automated Tests so they can be run without any effort. They must be Self-Checking Tests so they detect and report any errors without manual inspection. They must be Repeatable Tests so they can be run multiple times with the same result. Ideally, each test should be an Independent Test that can be run all by itself. With these four goals satisfied, one push of a button (or keyboard shortcut) is all it should take to get the valuable feedback the tests provide. Let's look at these goals in a bit more detail.
Goal: Fully Automated Test
A test that can be run without any Manual Intervention (page X) is a Fully Automated Test. It is a prerequisite to many of the other goals. Yes, it is possible to write Fully Automated Tests that don't check the results and that can only be run once. A main() program that exercises the code and prints its output to the console is a good example. I consider these two aspects of test automation so important to making tests easy to run that I have made them separate goals: Self-Checking Test and Repeatable Test.
Goal: Self-Checking Test
A Self-Checking Test is one that has encoded within it everything it needs to verify that the expected outcome is correct. Self-checking tests apply the "Hollywood Principle" ("don't call us, we'll call you") to running tests. That is, the Test Runner (page X) only "calls us" when a test did not pass, so a clean test run requires zero manual effort. Many members of the xUnit family provide a Graphical Test Runner (see Test Runner) that uses a green bar to tell us that everything is "A-OK"; the bar turns red to tell us that a test has failed and needs our further investigation.
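The contrast can be sketched like this (the PriceCalculator class is hypothetical): the main() version produces output that a person must read and judge, while the xUnit version encodes the expected outcome in an assertion so only a failure demands our attention:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PriceCalculatorTest {

    // Automated but not self-checking: someone has to read the console
    // output and decide whether the printed price is correct.
    public static void main(String[] args) {
        System.out.println(new PriceCalculator().discountedPrice(200, 0.10));
    }

    // Self-checking: the expected outcome is encoded in the assertion,
    // so a green bar means no manual inspection is needed.
    @Test
    public void tenPercentDiscountIsAppliedToPrice() {
        assertEquals(180.0, new PriceCalculator().discountedPrice(200, 0.10), 0.001);
    }
}
```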
Goal: Repeatable Test
A Repeatable Test is one that can be run many times in a row with exactly the same results without any human intervention between runs. Unrepeatable Tests (see Erratic Test on page X) increase the overhead of running tests significantly. This is very undesirable because we want all developers to be able to run the tests very frequently; as often as after every "save". Unrepeatable Tests can only be run once before whoever is running the tests must do Manual Intervention. Just as bad are Nondeterministic Tests (see Erratic Test) which have different results at different times; they cause us to spend lots of time chasing down failing tests. The power of the red bar diminishes significantly when we are seeing it regularly without good reason. Pretty soon, we find ourselves ignoring the red bar, assuming that it will go away if we wait long enough. Once this has happened, we've lost a lot of the value of our automated tests because the feedback that tells us we have introduced a bug, and should fix it right away, disappears. The longer we wait, the more effort it takes to find the cause of the failing test.
Tests that run only in memory and which use only local variables or fields usually end up being repeatable without any additional effort. Unrepeatable Tests usually come about because we are using a Shared Fixture (page X) of some sort (and I include any persistence of data implemented within the SUT in this definition.) When this is the case, we must ensure that our tests are "self-cleaning" as well. When cleaning is necessary, the most consistent and foolproof way is to use a generic Automated Teardown (page X) mechanism; it is possible to write tear down code for each test but this can result in Erratic Tests when not implemented correctly in every test.
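A generic Automated Teardown mechanism can be sketched roughly as follows (the Customer and CustomerRepository types, and the in-memory repository used here, are assumptions for illustration): every object a test creates is registered so that a single, generic teardown removes it whether the test passed or failed, keeping the shared fixture clean and the test repeatable:

```java
import static org.junit.Assert.assertEquals;

import java.util.ArrayList;
import java.util.List;

import org.junit.After;
import org.junit.Test;

public class CustomerPersistenceTest {

    private final CustomerRepository repository = new InMemoryCustomerRepository();

    // Everything a test creates is registered here so one generic
    // teardown can remove it, whether the test passed or failed.
    private final List<Customer> createdCustomers = new ArrayList<Customer>();

    private Customer createCustomer(String name) {
        Customer customer = repository.save(new Customer(name));
        createdCustomers.add(customer); // register for automated teardown
        return customer;
    }

    @After
    public void automatedTeardown() {
        for (Customer customer : createdCustomers) {
            repository.delete(customer); // leaves the shared fixture clean
        }
    }

    @Test
    public void savedCustomerCanBeFoundById() {
        Customer saved = createCustomer("Pat");
        assertEquals("Pat", repository.findById(saved.getId()).getName());
    }
}
```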
Tests should be easy to write and maintain
Coding is a fundamentally hard activity. We need to keep a lot of information in our head as we work. When we are writing tests, we should be focused on testing and not on the coding of the tests. This means that tests need to be simple; simple to read and simple to write. They need to be simple to read and understand because it is hard to test the automated tests themselves. They can really only be tested properly by introducing the very bug that they are intended to detect into the SUT; this is hard to do in an automated way so it is usually only done once (if at all), when the test is first written. So we need to rely on our eyes to catch any problems that creep into the tests and to do that we need to keep them simple enough to read quickly.
Of course, if we are changing the behavior of part of our system we do expect some tests to be affected but this should be a relatively small number. We want to Minimize Test Overlap so that only a few tests are affected by any one change. Contrary to popular opinion, having more tests pass through the same code doesn't improve the quality of the code if most of the tests are doing exactly the same thing.
Tests become complicated for two reasons:
- Trying to verify too much functionality in a single test, and
- Too large an "expressiveness gap" between the test scripting language (e.g. Java) and the before/after relationships between domain concepts we are trying to express in our test.
Goal: Simple Tests
The former can be addressed by keeping our tests small and testing one thing at a time. It is particularly important when we do test-driven development because we write our code to pass one test at a time and we want each test to introduce only one new bit of behavior into the SUT. We should strive to Verify One Condition per Test by creating a separate Test Method (page X) for each unique combination of pre-test state and input. Each Test Method should drive the SUT through a single code path. (There should be at least one Test Method for each unique path through the code; often there will be several, one for each boundary value of the equivalence class.)
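As a minimal sketch of Verify One Condition per Test (the Account class and InsufficientFundsException are hypothetical, and the exception is assumed to be unchecked), each combination of pre-test state and input gets its own Test Method, and each method drives the SUT through a single code path:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class AccountWithdrawalTest {

    // Happy path: sufficient funds.
    @Test
    public void withdrawalReducesBalanceWhenFundsAreSufficient() {
        Account account = new Account(100);
        account.withdraw(40);
        assertEquals(60, account.getBalance());
    }

    // Boundary value of the same equivalence class.
    @Test
    public void withdrawalOfEntireBalanceLeavesZero() {
        Account account = new Account(100);
        account.withdraw(100);
        assertEquals(0, account.getBalance());
    }

    // The error path gets its own Test Method too.
    @Test(expected = InsufficientFundsException.class)
    public void withdrawalLargerThanBalanceIsRejected() {
        new Account(100).withdraw(101);
    }
}
```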
The one main exception to Test Methods being short is those customer tests which express real usage scenarios of the application. They are a useful way to document how a potential user of the software would go about using it; if these involve long sequences of steps then the Test Methods should reflect this.
Goal: Expressive Tests
The "expressiveness gap" can be addressed by building up a library of Test Utility Methods that constitute a domain-specific testing language. This allows the test automater to express the concepts what they wish to test without having to translate the thoughts into much more detailed code. Creation Methods (page X) and Custom Assertion (page X) are good examples of the building blocks that make up this Higher Level Language.
The key to solving this dilemma is avoiding duplication within tests. The DRY principle (Don't Repeat Yourself) of the Pragmatic Programmers (http://www.pragmaticprogrammer.com) should be applied to test code in the same way it is applied to production code. There is, however, a counter-force at play. We want the tests to Communicate Intent, so it is best to keep the core test logic in each Test Method where it can be seen in one place; this doesn't preclude moving a lot of supporting code into Test Utility Methods, where it only needs to be modified in one place if it is affected by a change.
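Putting these ideas together, a rough sketch of such a domain-specific testing language might look like this (the Flight and Passenger classes are hypothetical): a Creation Method hides the mechanics of building a suitable fixture and a Custom Assertion expresses the expected outcome in domain terms, leaving the Test Method to read almost like the requirement it verifies:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class FlightBookingTest {

    @Test
    public void bookingAFullFlightPutsPassengerOnWaitlist() {
        Flight flight = createFullFlight();          // Creation Method

        flight.book(new Passenger("Pat"));

        assertOnWaitlist(flight, "Pat");             // Custom Assertion
    }

    // --- building blocks of the Higher Level Language ---

    private Flight createFullFlight() {
        Flight flight = new Flight("YYC", "YVR", 1 /* seat */);
        flight.book(new Passenger("Earlier Passenger"));
        return flight;
    }

    private void assertOnWaitlist(Flight flight, String passengerName) {
        assertEquals(passengerName, flight.getWaitlist().get(0).getName());
    }
}
```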
Goal: Separation of Concerns
Separation of Concerns applies in two dimensions: First, we want to keep test code separate from our production code (Keep Test Logic Out of Production Code) and second, we want each test to focus on a single concern by applying Test Concerns Separately to avoid Obscure Tests (page X). A good example of what not to do is testing the business logic in the same tests as the user interface. This would involve testing two concerns at the same time. If either concern is modified (say, the UI is changed), the tests need to be modified. Testing one concern at a time may require separating the logic into different components. This is a key aspect of design for testability that is explored further in the Using Test Doubles narrative chapter.
Tests should require minimal maintenance as the system evolves around them
Change is a fact of life. In fact, we write automated tests mostly to make change easier. So we should strive to ensure that the tests don't make change hard.
Suppose we want to change the signature of some method on a class. We go ahead and add a new parameter and all of a sudden we have 50 tests that don't compile. Does that encourage us to make the change? Probably not. So we introduce a new method with the parameter and arrange to have the old method call the new method defaulting the missing parameter to some value. Now all the tests compile but 30 of them still fail! Are the tests helping us make the change?
Goal: Robust Test
There are many kinds of changes that we want to make to the code as our project unfolds and the requirements evolve. We want to write our tests in such a way that the number of tests impacted by any one change is quite small. That means we need to minimize overlap between tests. We also need to ensure that changes to the test environment don't impact our tests by isolating the SUT from the environment as much as possible. This results in much more Robust Tests.
We should strive to Verify One Condition per Test. Ideally, there should be only one kind of change that would cause a test to require maintenance. System changes that impact fixture set up or tear down code can be encapsulated behind Test Utility Methods to further reduce the number of tests directly impacted by the change.
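Returning to the signature-change scenario above, one sketch of this kind of encapsulation (the Customer class and its constructor are hypothetical) is to route fixture construction through a Creation Method; if the constructor later gains a parameter, only this one method needs to change rather than every test that needs a Customer:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class CustomerDiscountTest {

    // Tests call this Creation Method instead of the constructor directly.
    // A new constructor parameter is absorbed here, in one place.
    private Customer createLoyalCustomer() {
        return new Customer("Pat", 10 /* years as a customer */);
    }

    @Test
    public void loyalCustomerGetsLoyaltyDiscount() {
        assertEquals(0.05, createLoyalCustomer().getDiscountRate(), 0.001);
    }
}
```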
What's Next?
In this chapter I have discussed why we have automated tests and specific goals we want to try to achieve as we write our Fully Automated Tests. Before moving on to the Principles of Test Automation narrative we need to take a short side-trip to the Philosophy Of Test Automation narrative to understand different mindsets of test automaters.
Copyright © 2003-2008 Gerard Meszaros all rights reserved