by James Bach
I've run test teams at Apple and Borland. We tried to automate our tests. We had some success with it, but mostly we failed. Test automation for modern GUI software is very challenging. Along the way, though, I've collected this list of useful features and caveats that you might want to consider in doing your automation.
Suite is structured to support team development.
Break large monolithic source files down into smaller, cohesive units. Put the system under source control to prevent team members from overwriting each other's work. Naturally, this applies only if the suite is being developed jointly, but beware of those small projects that become big projects. It might be worthwhile to plan ahead.
Suite can be distributed across a network of test execution systems.
As your test suite grows in size, and as your organization gains more test suites and products to test, you will find it increasingly difficult to make efficient use of your test machines. One way of maximizing efficiency is to centralize a group of test machines (at Borland there is a lab with 50 or more identical, centrally controlled systems), and create test suites that can be distributed to a number of machines at once. This can substantially reduce the time needed for a test cycle, and eliminate the possibility that a problem with one machine will stop the whole suite from running. Another idea is to make the suite distributable to machines that are not otherwise dedicated to testing, such as computers normally used in development or administration. There are obvious risks to this strategy (such as the possibility of an automated test destroying a programmer's hard disk), but if your company has very few computers and a big need to test, it's useful to have the option of borrowing a few systems and getting each of them to run part of the test cycle.
Suite can execute tests individually, or by group.
You might design the suite such that it can run an individual test, a set of specific tests, a group of related tests, all tests, all tests except specific tests or groups of tests, or only tests that failed the last time through. Also, allow the order of tests to be modified. You get the idea: the suite should provide flexibility in test execution.
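As a minimal sketch of that flexibility, assuming each test carries a name and a group, a single selection function can cover most of these cases. The names Case, select_tests, and SUITE are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    group: str


def select_tests(tests, names=None, groups=None, exclude=(), only_failed=None):
    """Return the tests to run, preserving the order of `tests`.

    names / groups: run only these tests or groups (None means no restriction).
    exclude: test names or group names to skip.
    only_failed: names that failed in the previous cycle (None means ignore).
    """
    selected = []
    for t in tests:
        if names is not None and t.name not in names:
            continue
        if groups is not None and t.group not in groups:
            continue
        if t.name in exclude or t.group in exclude:
            continue
        if only_failed is not None and t.name not in only_failed:
            continue
        selected.append(t)
    return selected


SUITE = [
    Case("open_file", "file"),
    Case("save_file", "file"),
    Case("print_page", "print"),
]

# Everything except the "print" group, run in reverse order.
plan = list(reversed(select_tests(SUITE, exclude={"print"})))
```

Because the function returns an ordinary list, reordering is left to the caller, which keeps selection and ordering as separate decisions.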
Suite interoperates with bug tracking system (bugs and tests are linkable).
Depending on the kind of testing that you do, enabling the test suite to record a failure directly into your bug tracking system (whether that system is a flat file database or something more elaborate) may save time and effort. It may also waste time, if a high percentage of failures are due to automation problems and not defects in the product. Another possible linkage might be the ability of the test automation to look in the bug tracking system for all fixed bugs and verify that they are still fixed. This requires that each bug report be accompanied by an automated test. A similar idea is to mark a test as a "known failure until bug #3453 is fixed" and design the suite to execute that test but ignore the failure until the associated bug is marked as fixed in the database. The most feasible linkage I can think of is the ability to navigate directly from a bug in the tracking system to the part of the test suite that relates to it, and vice versa.
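The "known failure" idea can be sketched as follows. This is only an illustration: the bug_status dictionary stands in for a query against whatever bug tracking system you use, and the function and test names are hypothetical.

```python
def classify(test_name, passed, known_failures, bug_status):
    """Classify one test result using a test-to-bug mapping.

    known_failures: dict of test name -> id of the bug it is waiting on.
    bug_status: dict of bug id -> "open" or "fixed" (a stand-in for a
    lookup in the real bug tracking system).
    """
    if passed:
        return "pass"
    bug = known_failures.get(test_name)
    if bug is not None and bug_status.get(bug) == "open":
        # The test still ran, but its failure is expected for now.
        return "known failure (bug #%d)" % bug
    return "FAIL"


known = {"print_landscape": 3453}
while_open = classify("print_landscape", False, known, {3453: "open"})
after_fix = classify("print_landscape", False, known, {3453: "fixed"})
```

Once the bug is marked fixed, the same failure is reported as a real failure again, so a regression is not silently ignored.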
Suite can perform hard reset of test machines in case of system crashes.
It's common for test machines to crash during testing, so you want some way to restart the hardware if that happens. In one project, we used a software-controllable power strip attached to each test machine, and used another computer to monitor the status of each test machine. If a crash was detected, the monitoring computer would cycle the power on that system. With modern operating systems, it's sometimes possible to monitor the status of one process from within another process on the same computer. That may be easier and cheaper to arrange than the hardware reset.
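One way the monitoring side might decide which machines to reset is sketched below. It assumes each test machine periodically reports a heartbeat timestamp, which is my assumption rather than a description of the project above; the function name and the five-minute timeout are hypothetical.

```python
def machines_to_reset(last_heartbeat, now, timeout_seconds=300):
    """Return the machines whose last heartbeat is older than the timeout.

    last_heartbeat: dict of machine name -> time of last report, in seconds.
    now: current time in the same units.
    """
    return sorted(
        machine
        for machine, seen in last_heartbeat.items()
        if now - seen > timeout_seconds
    )


# lab-02 last reported 700 seconds ago, so it is presumed crashed.
stale = machines_to_reset({"lab-01": 1000, "lab-02": 400}, now=1100)
```

The actual power cycling depends entirely on the hardware, so it is left out of the sketch.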
Suite can execute unattended.
While a test suite that must be continually monitored can still be a lot better than manual testing, it's often more valuable to design it to run to completion without any help.
Suite execution can be restarted from the point of interruption in case of catastrophic failure (e.g., power loss).
The more tests that you automate, the less you want your tests to start all over from the beginning if the suite is halted in the middle of execution. This isn't as easy as it may sound, if your test drivers are dependent on variables and structures stored in memory. By designing the system to write checkpoints out to disk, and to have an automatic start process that activates on reboot, and to have a means of resynchronizing to any other systems that it is connected to (such as file servers, which typically take longer to reset after the power goes out) your suite will survive a power outage and still be kicking.
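The checkpoint part of this can be sketched in a few lines. The sketch assumes results are small enough to store as JSON and that test names are unique; run_with_checkpoint and the file layout are hypothetical. It does not cover the automatic restart on reboot or the resynchronizing with other systems.

```python
import json
import os


def run_with_checkpoint(test_names, run_test, checkpoint_path):
    """Run tests in order, recording each finished test on disk.

    On restart, tests already present in the checkpoint are skipped, so
    execution resumes with the test that was interrupted.
    """
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for name in test_names:
        if name in done:
            continue  # finished before the interruption
        done[name] = run_test(name)
        # Write to a temporary file and rename it, so that a crash in the
        # middle of a write cannot corrupt the existing checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(done, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, checkpoint_path)
    return done
```

Keeping the progress on disk rather than in memory is the point: the driver can be killed at any moment and still pick up where it left off.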
Suite can be paused, single-stepped, and resumed.
In debugging the suite, or monitoring it for any reason, it's often important to be able to stop it or slow it down, perhaps so as to diagnose a problem or adjust a test, and then set it going again from where it left off.
Suite can be executed remotely.
Unless you live with your test machines, as some do, it's nice to be able to send a command from your workstation and get the test machines to wake up and start testing. Otherwise, you do a lot of walking to the lab. It's especially nice if you can query the status of a suite, start it, stop it, or adjust it, over the phone. Even from home, then, you'd be able to get those machines cracking.
Suite is executable on a variety of system configurations to support compatibility testing.
Automated suites should be designed with a minimum of assumptions about the configuration of the test machine. Since compatibility testing is so important, try to parameterize and centralize all configuration details, so that you can make the suite run on a variety of test machines.
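A small sketch of what centralizing might look like: every machine-specific detail sits in one table, and the tests ask the table instead of embedding literals. The machine names, paths, fields, and command-line flags here are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineConfig:
    os_name: str
    screen_width: int
    screen_height: int
    app_path: str


# All machine-specific details live in one place (hypothetical values).
CONFIGS = {
    "lab-01": MachineConfig("Windows", 1024, 768, r"C:\apps\product.exe"),
    "lab-02": MachineConfig("Windows", 800, 600, r"D:\product\product.exe"),
}


def launch_command(machine):
    """Build the launch command from configuration, not from literals in a test."""
    cfg = CONFIGS[machine]
    # The --width and --height flags are invented for this illustration.
    return [cfg.app_path, "--width", str(cfg.screen_width),
            "--height", str(cfg.screen_height)]
```

Adding a new test machine then means adding one entry to the table rather than editing tests.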
Suite architecture is modular for maximum flexibility.
This is as true for testware as it is for software. You will reuse your testware. You will maintain and enhance it. So, build it such that you can replace or improve one part of it, say the test reporting mechanism, without having to rewrite every test. Remember: TESTWARE IS SOFTWARE. Just as software requires careful thought and design, so you will discover that testware requires the same. Another aspect of good structure is to centralize all suite configuration parameters in one place. Here are some factors that might be controlled by a configuration file:
There are good reasons to reset the test machine to a clean, known state, and there are also good reasons not to do that. Resetting helps in the process of investigating a problem, but not resetting is a more realistic test, since presumably your users will not be rebooting their computers between opening a file and printing it! A good idea is to make it a selectable option, so the tests can run either way.
Suite execution and analysis take less time and trouble than hand-testing.
I know, it sounds pretty obvious. Alas, it needs to be said. Too many testers and managers approach test automation for its own sake. Instead, look critically at how much time and effort you are heaping on your automation. In the beginning automation costs more, yes, but too often even after a couple of years it still takes more effort to manage the automation than it would just to do the same thing by hand. The most common problem, in my experience, is false fails. Every time the test suite reports a fail that turns out to be a problem with the suite itself, or a trivial misalignment between the test and the latest conception of the product, all the time needed to solve that problem is pure automation cost. Generally speaking, keep the suite architecture as simple as possible, to keep maintenance costs down.
Suite creates summary, coverage, result, debug, and performance logs.
A summary log is an overview of the results of the test suite: how many tests were executed, how many passed, failed, or are unknown, etc. A coverage log shows what features were tested. You can achieve this by maintaining an electronic test outline that is associated with the test suite. If you have the appropriate tool, a coverage log should also report on code-level coverage. A result log records the outcome of each test. A debug log contains messages that track the progress of the test suite. Each entry should be time-stamped. This helps in locating problems with the suite when it mysteriously stops working. A performance log tracks how long each test took to execute. This helps in spotting absolute performance problems as well as unexpected changes in relative performance.
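Here is a rough sketch of a runner that produces four of these logs in memory, assuming each test is a callable that returns True on a pass. The coverage log is omitted because it depends on your test outline or coverage tool, and run_and_log is a hypothetical name.

```python
import time
from collections import Counter


def run_and_log(tests):
    """Run (name, callable) pairs; return result, performance, debug, and summary logs."""
    result_log, performance_log, debug_log = {}, {}, []
    for name, test in tests:
        # Every debug entry is time-stamped.
        debug_log.append((time.time(), "starting " + name))
        start = time.perf_counter()
        try:
            outcome = "pass" if test() else "fail"
        except Exception as exc:
            # A test that blows up is neither a pass nor a fail.
            outcome = "unknown"
            debug_log.append((time.time(), "%s raised %r" % (name, exc)))
        performance_log[name] = time.perf_counter() - start
        result_log[name] = outcome
    summary = Counter(result_log.values())
    summary["executed"] = len(result_log)
    return result_log, performance_log, debug_log, summary


results, perf, debug, summary = run_and_log([
    ("ok", lambda: True),
    ("bad", lambda: False),
    ("boom", lambda: 1 / 0),
])
```

A real suite would write these to files, but the separation into distinct logs is the part worth keeping.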
Suite creates global (suite-wide) and local (test-specific) logs.
Except for the summary log, all the logs should have global and local versions. The local versions of the logs should be cumulative from test cycle to test cycle. Each local log pertains to a single test and is stored next to that test. They form a history of the execution of that test. Global logs should be reinitialized at every test cycle, and include information about all of the tests.
Suite logs are accessible and readable.
Logs ought to be both machine readable and human readable. I also recommend that they be tied into an icon or some other convenient front end so that they are easy to get to.
Tests can be selectively activated or deactivated.
There should be a mechanism (other than commenting out test code) to deactivate a test, such that it does not execute along with all the other tests. This is useful for when a test reveals a crash bug, and there is no reason to run it again until the bug is fixed.
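One possible mechanism, sketched under the assumption that tests are registered through a decorator: the switch and the reason are recorded next to the test, and the runner reports deactivated tests instead of silently dropping them. The names suite_test, REGISTRY, and run_active are hypothetical.

```python
REGISTRY = []


def suite_test(active=True, reason=""):
    """Decorator that registers a test together with an on/off switch."""
    def register(func):
        REGISTRY.append({"func": func, "active": active, "reason": reason})
        return func
    return register


@suite_test()
def opens_document():
    return True


@suite_test(active=False, reason="crashes the machine until the bug is fixed")
def prints_document():
    raise SystemError("would crash the test machine")


def run_active():
    """Run only the active tests; report the others as skipped, with the reason."""
    report = {}
    for entry in REGISTRY:
        name = entry["func"].__name__
        if not entry["active"]:
            report[name] = "skipped: " + entry["reason"]
        else:
            report[name] = "pass" if entry["func"]() else "fail"
    return report
```

The deactivated test stays visible in every report, which makes it harder to forget that it needs to be switched back on.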
Tests are easily reconfigured, replicated, or modified.
An example of this is a functional test of a program with a graphical user interface that can be configured to simulate either mouse input or keyboard input. Rather than create different sets of tests to operate in different modes and contexts, design a single test with selectable behavior. Avoid hard-coding basic operations in a given test. Instead, engineer each test in layers and allow its behavior to be controlled from a central configuration file. Move sharable/reusable code into separate include files.
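The mouse-or-keyboard example might be layered like this. It is a sketch only: the drivers return descriptions of the input events rather than sending real input, and the class names, the CONFIG dictionary, and the event strings are hypothetical.

```python
class KeyboardDriver:
    def choose_menu(self, menu, item):
        # Use the first letter of each label as the accelerator key.
        return ["key:Alt+" + menu[0], "key:" + item[0]]


class MouseDriver:
    def choose_menu(self, menu, item):
        return ["click:" + menu, "click:" + item]


DRIVERS = {"keyboard": KeyboardDriver, "mouse": MouseDriver}

# In a real suite this value would come from the central configuration file.
CONFIG = {"input_mode": "keyboard"}


def save_document_steps(config=CONFIG):
    """One test body; the input layer is chosen by configuration."""
    driver = DRIVERS[config["input_mode"]]()
    return driver.choose_menu("File", "Save")


keyboard_steps = save_document_steps()
mouse_steps = save_document_steps({"input_mode": "mouse"})
```

The test itself never mentions keys or clicks, so adding a third input mode would not touch any test.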
Tests are important and unique.
You might think that the best test suite is one with the best chance of finding a problem. Not so. Remember, you want to find a *lot* of *important* problems, and be productive in getting them reported and fixed. So, in a well-designed test suite, for every significant bug in the product, one and only one test will fail. This is an ideal, of course, but we can come close to it. In other words, if an enterprising developer changes the background color of the screen, or the spelling of a menu item, you don't want 500 tests to fail (I call it the "500 failures scenario" and it gives me chills). You want one test to fail, at most. Whenever a test fails, you know that at least one thing went wrong, but you can't know if more than one thing went wrong until you investigate. If five hundred tests fail, it would help you to budget your time if you were confident that 500 different and important problems had been detected. That way, even before your investigation began, you would have some idea of the quality of the product.
Likewise, you don't want tests to fail on bugs so trivial that they won't be fixed, while great big whopper bugs go unnoticed. Therefore, I suggest automating interesting and important tests before doing the trivial ones. Avoid full-screen snapshots, and use partial screen shots or pattern recognition instead. Maybe have one single test that takes a series of global snapshots, just to catch some of those annoying little UI bugs. Also, consider code inspection, instead of test automation, to test data-intensive functionality, like online help. It's a lot easier to read files than it is to manipulate screen shots.
Dependencies between tests can be specified.
Tests that depend on other tests can be useful. Perhaps you know that if one particular test fails, there's no reason to run any other tests in a particular group. However, this should be explicitly specified to the test suite, so that it will skip dependent tests. Otherwise, many child tests will appear to fail due to one failure in a parent test.
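A minimal sketch of explicit dependencies, assuming tests are listed with parents before children and each test returns True on a pass. The function name and the sample test names are hypothetical.

```python
def run_with_dependencies(tests, depends_on):
    """Run tests in order, skipping any test whose parent did not pass.

    tests: dict of name -> callable returning True on a pass
           (insertion-ordered, parents listed before children).
    depends_on: dict of child name -> parent name.
    """
    results = {}
    for name, func in tests.items():
        parent = depends_on.get(name)
        if parent is not None and results.get(parent) != "pass":
            # Skipped, not failed: the parent already explains the problem.
            results[name] = "skipped (depends on %s)" % parent
            continue
        results[name] = "pass" if func() else "fail"
    return results


suite = {
    "import_file": lambda: False,
    "edit_imported": lambda: True,
    "print_imported": lambda: True,
    "about_box": lambda: True,
}
parents = {"edit_imported": "import_file", "print_imported": "edit_imported"}
outcome = run_with_dependencies(suite, parents)
```

With the dependency declared, one broken parent shows up as one failure and a list of skips, not as a cascade of failures.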
Tests cover specific functionality without covering more than necessary.
Narrowly defined tests help to focus on specific failures and avoid the 500 failures scenario. The downside is that overly narrow tests generally miss failures that occur on a system level. For that reason, specify a combination of narrow and broader tests. One way to do that is to create narrow tests that do individual functions; then create a few broad tests, dependent on the narrow ones, that perform the same functions in various combinations.
Tests can be executed on a similar product or a new version of the product without major modification.
Consider that the test suite will need to evolve as the software that it tests evolves. It's common for software to be extended, ported, or unbundled into smaller applications. Consider how your suite will accommodate that. One example is localization. If you are called upon to test a French version of the software, will that require a complete rewrite of each and every test? This desirable attribute of test automation can be achieved, but may lead to very complex test suites. Be careful not to trade a simple suite that can be quickly thrown out and rewritten for a super-complex suite that is theoretically flexible but also full of bugs.
Test programs are reviewable.
Tests must be maintained, and that means they must be revisited. Reviewability is how easy it is to come back to a test and understand it.
Test programs are easily added to suites.
In some suites I've seen, it's major surgery just to add a new test. Make it easy and the suite will grow more quickly.
Tests are rapidly accessible.
I once saw a test management system where it took more than 30 seconds to navigate to a single test from the top level of the system. That's awful. It discouraged test review and test development. Design the system such that you can access a given test in no more than a few seconds. Also, make sure the tests are accessible by anyone on the team.
Tests are traceable to a test outline.
To some it is obvious; for me it was a lesson that came the hard way: Do not automate tests that cannot already be executed by hand. That way, when the automation breaks, you will still be able to get your testing done. Furthermore, if your automated tests are connected to a test outline, you can theoretically assess functional test coverage in real-time. The challenge is to keep the outline in sync with both the product and the tests.
Tests are reviewed, and their review status is documented in-line.
This is especially important if several people are involved in writing the tests. Believe it or not, I have seen many examples of very poorly written tests that were overlooked for years. It's an unfortunate weakness (and most of us have it) that we will quickly assume that our test suite is full of useful tests, whether or not we personally wrote those tests. Many bad experiences have convinced me that it's dangerous to assume this! Test suites most often contain nonsense. Review them! In order to help manage the review process, I suggest recording the date of review within each test. That will enable you to periodically re-review tests that haven't been touched in a while. Also, review tests that never fail, and ones that fail often. Review, especially, those tests that fail falsely.
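If the review date is recorded in a consistent form, finding tests that are due for re-review becomes a simple scan. The sketch below assumes a tag of the form "Reviewed: YYYY-MM-DD" in each test's source; the tag format, the function name, and the 180-day threshold are my own assumptions.

```python
import re
from datetime import date

# Hypothetical in-line tag, for example:  # Reviewed: 2024-05-01 by JB
REVIEW_TAG = re.compile(r"Reviewed:\s*(\d{4})-(\d{2})-(\d{2})")


def needs_review(source_text, today, max_age_days=180):
    """True if the test has no review tag or its latest review is too old."""
    dates = [
        date(int(year), int(month), int(day))
        for year, month, day in REVIEW_TAG.findall(source_text)
    ]
    if not dates:
        return True  # never reviewed
    return (today - max(dates)).days > max_age_days
```

Run over every test file in the suite, this gives a list of tests that nobody has looked at recently.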
Test hacks and temporary patches are documented in-line.
Adopt a system for recording the assumptions behind the test design, and any hacks or work-arounds that are built into the test code. This is important, as there are always hacks and temporary changes in the tests that are easy to forget about. I recommend creating a code to indicate hacks and notes, perhaps three asterisks, as in: "*** Disabled the 80387 switch until co-processor bug is fixed." Periodically, you should search for triple-asterisk codes in order not to forget about intentionally crippled tests.
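The periodic search for triple-asterisk codes is easy to automate. A minimal sketch, assuming the marker and its note sit on a single line; find_hacks is a hypothetical name.

```python
import re

# Three asterisks followed by the note, as in the example above.
HACK_MARKER = re.compile(r"\*\*\*\s*(.*)")


def find_hacks(source_text):
    """Return (line number, note) for every line carrying a *** marker."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        match = HACK_MARKER.search(line)
        if match:
            hits.append((number, match.group(1).strip()))
    return hits


sample = (
    "run_test()\n"
    "# *** Disabled the 80387 switch until co-processor bug is fixed.\n"
    "check()\n"
)
hacks = find_hacks(sample)
```

Reporting the line number along with the note makes it quick to jump to each crippled test.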
Suite is well documented.
You should be able to explain to someone else how to run the test suite. At Borland I used the "Dave" test: When one of my testers claimed to have finished documenting how to run a suite, I'd send Dave, another one of the testers, to try to run it. That invariably flushed out problems.