MickeyMake Test Plan

Brian Marick

Contents
  Risks to Address
    Key Victims
    Key Problems
    Product
    Major Threats
  Milestones Known About Today
  Constraints
  Open Questions and Assumptions
  Strategy
    Automation Strategy
    Test Development Strategy
      Happy Scientists
      Usability Testing
      Functional Testing
      Configuration Testing
      Installation Tests
      Stroke Tests
  Milestones
  Omissions, Risks, and Problems
  Improving the Plan During Development

This document explains how I might write a test plan. It also gives some of the reasons I write test plans in the way I do. The test plan proper is in normal font. The explanatory text is in this italic font.

My main goal is to explain the thinking that should go into a good test plan. This particular test plan, for a hypothetical product called MickeyMake, isn't as good (thorough, well-considered) as a test plan for a real product should be.

Before reading this document, you should understand the context. The Software Carpentry project is funding projects to create some open source programming tools. These projects will be small. They will be distributed. Although the initial work will be funded, the developers will be - like most open source developers are - self-organizing enthusiasts. The kind of test plan suitable for such a project would be a disaster in one that will, say, rewrite the European air traffic control system. But it might work well in a typical fast-paced "we've got the funding, now what do we do?" startup.

Consider carefully whether your project is operating in a context that fits this test plan. And, if this test plan appears idiotic to you, please consider whether that's because your context is different. One size does not fit all. At the end of the plan, I have some summary notes about why I've written this the way that I have.

Before reading this document, you should also read the MickeyMake proposal. The Software Carpentry project is a design competition. The winners will get money to implement their designs. The MickeyMake proposal was an example of what a first round design might look like. The document you're reading describes how MickeyMake might be tested. It may not make sense unless you know the proposal.


What's the purpose of a test plan? For a project the size of MickeyMake, it's to be a first part of a project-long conversation about what testing to do and how to do it. It's also a way to jog people's memories, later in the project, about how they got to where they are. It is not to be a "living document". It should die. Don't update it. The parts of the plan that need updating and revision are the schedule and task lists. They should be posted on a wall somewhere, which is where they should be changed. The room facing that wall is where the conversation should continue. Updating this document isn't worth the trouble.

Risks to Address

I follow James Bach's Threat -> Product -> Problem -> Victim model of risk[1]:

[Figure: Threat -> Product -> Problem -> Victim]

What does it mean to say that a product has a bug? First, the product must be presented with some Threat --- something in the input or environment that it cannot handle correctly. In this case, the threat is two cute but identical twins. (This corresponds to the common case of software being unable to handle duplicate inputs.) Next, the Product must perform its processing on that input or in the context of that environment. If it cannot handle the threat, it produces a Problem in the form of a bad result. But a problem means little or nothing unless there is some Victim who is damaged by it.

By thinking about these four categories, you can come up with more test ideas that are better tailored to what really matters.

Key Victims

Who can be hurt if MickeyMake doesn't work?

Scientists
They are the most important class of victim. (They are, after all, paying for MickeyMake.) We distinguish two cases: (1) the scientist who's excited by a tool that will help her, tries MickeyMake, and is immediately disappointed; (2) the scientist who's come to depend on MickeyMake and suddenly hits a brick wall.

Open source programmers
MickeyMake's continued viability depends on whether these people use it, modify it, port it, etc. If it falls short of their needs, they won't adopt it.[2]

The installer
This is the person who downloads a copy of the source or an executable with great anticipation, has a lot of problems installing, and finally gives up in disgust. As with the scientist, a good early experience will be very important.

The maintainer
If working with the MickeyMake code to add new features or to port it is unpleasant, people won't do it. Note that the MickeyMake developers count as maintainers, since the tool will be developed iteratively.

Key Problems

What kind of problems would be particularly bad?

Note: we believe the following will not be problems:

Product

This is, internally, a fairly simple product. The main risky area is the interface to the OS and system tools. They are used to gather dependency information.

Although the first round design doesn't mention it, the product will have a GUI. The large-scale product architecture will look like this:

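+-----+---------------------------------------+
| GUI | Command-line and MickeyFile interface |
+-----+---------------------------------------+
| MickeyMake core (with API)                  |
+---------------------------------------------+
| Python                                      |
+---------------------------------------------+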

It's safe to assume that the GUI will change often. They always do. Once a few scientists have built Mickeyfiles, we may assume that their format (and the command-line arguments) will stay backward-compatible. The API may change.

Major Threats

Configuration dependencies are likely to be a major problem. There will have to be a lot of configuration testing.

Multi-level directories are already known to be a questionable area (for usability), and may provoke bugs.

The input syntax (Mickeyfiles, etc.) isn't overly baroque. We have to try all the different features and options, but we shouldn't have to worry a great deal about combinations and such.

Milestones Known About Today

In "New Models for Test Development", I claim the testing effort should be organized around the project schedule. In particular, the test plan should be constructed by identifying handoffs of code from its developers to a consumer, thinking about what problems the code could cause that consumer (making her a victim), and writing tests to see if it will cause those problems.

The design document has proposed these stages:

Development of internal classes (RuleBase and Target).
The result is code that is handed from one MickeyMake developer to those doing the parser and rule execution engine. (These may be the same people.) If there are bugs, development of those other components will be slowed down.
 
Concurrent development of parser and the rule execution engine.
(Note: I'm guessing the parser also includes main() and the handling of the command line.)

These result in components that are glued together to make up the whole of MickeyMake. That first version is likely to be used by only a few people, namely MickeyMake developers themselves and a few scientists. All of these are expected to be tolerant of problems, so bugs are more acceptable at this point.
 
Addition of variables.
Handoff is to the same people as above. At some point after this milestone, MickeyMake will go to open beta.
 
GUI
Handoff is primarily to scientists and open source programmers, scientists being the most important audience.
 

My suspicion is that there will be other milestones that we don't know about yet. For example, some command-line options won't be supported until after variables.

Constraints

  1. A career tester will do some amount of testing (approx. US$20K worth). That scarce resource should be used well.
  2. Additional testing is going to be done by developers, though it's possible that we may find an open-source enthusiast tester to test this.
  3. Tests will originally be developed on Unix. They need to be portable to odd versions of Unix (e.g., what runs on SGIs, Crays, etc.). They also need to be portable to Windows NT. It's OK to insist that tests cannot be run without a Unix emulation package (e.g., Cygwin).
  4. The product will be built in Python, so it's OK for tests to depend on Python and its standard libraries.

Open Questions and Assumptions

  1. Some developers (e.g., the Extreme Programming people) are fans of testing individual classes and of designing the tests before writing the code. Others are not. Which class do the MickeyMake developers fall into?

    We will assume that they are not unit testing fans. Testing of new code will probably be done by the developer trying a few cases out in the Python interpreter, followed at some later point by more repeatable tests through the core API or command-line interface.

  2. We assume that the command-line interface, the Mickeyfile interface, and the GUI interface are simple "transducers". They take information in one form, convert it to another form, then call the Core API. For example, we assume a simple 1-1 correspondence between GUI elements and the API. (But we'll be prepared to change this assumption.)

Strategy

This section begins with the high-level answer to the questions "What testing are you going to do?" and "How are you going to go about testing?" It's important to have an overview --- it's the only thing people will remember. If it's not simple and memorable, people will go off in their own direction.

Here's the big picture.

We're going to concentrate on configuration testing and on making early adopter scientists happy. We'll also have the usual sort of functional tests.

We'll rely on scientists for usability testing.

We'll automate most tests through the Core API, using the freely available PyUnit unit testing framework. We'll have relatively few manual tests.

We'll have at least two distinct test suites. A smoke test suite will be run by developers frequently during development. It should take less than five minutes to run. The full test suite will be run after nightly builds. If the full test suite runs in less than ½ hour, it will be the installation test suite and will also be run by developers before they integrate their code. If it runs too long, a ½ hour subset will be used for those purposes. However, the full suite will always be available to installers who want to run it. Hence, it must be portable.
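One way the smoke/full split might be arranged, sketched with the standard library's unittest module (which is what PyUnit became). The test class names and the `smoke` class attribute are my own conventions for illustration; nothing in the MickeyMake design dictates them.

```python
import unittest


class ParserSmokeTest(unittest.TestCase):
    """Hypothetical quick check; marked as part of the smoke suite."""
    smoke = True

    def test_placeholder(self):
        self.assertTrue(True)


class DeepDirectoryTest(unittest.TestCase):
    """Hypothetical slower check; runs only in the full suite."""
    smoke = False

    def test_placeholder(self):
        self.assertTrue(True)


ALL_TEST_CLASSES = [ParserSmokeTest, DeepDirectoryTest]


def build_suite(smoke_only):
    """Collect every test, or only those from classes marked smoke = True."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for test_class in ALL_TEST_CLASSES:
        if smoke_only and not test_class.smoke:
            continue
        suite.addTests(loader.loadTestsFromTestCase(test_class))
    return suite
```

A developer would run `build_suite(smoke_only=True)` many times a day; the nightly build would run `build_suite(smoke_only=False)`. Keeping both suites in one list makes it cheap to promote or demote a test when the five-minute budget is threatened.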

After reading the following sections, the reader should be able to see how all the points in the sections before Strategy were addressed. How much you want to belabor the points (how explicit the traceability should be) is a matter of your audience. But omissions that might surprise should be called out.

Automation Strategy

We have three interfaces: the Core API, the command-line/Mickeyfile interface, and the GUI interface. The permanent tests will make use of those interfaces. (The developers may run tests against new or changed classes using the Python interpreter, but those tests won't be saved unless they're through one of these three interfaces.)

The bulk of the tests will be through the Core API. They will use the PyUnit unit test framework (even though the tests will not be what people would normally call unit tests)[3]. This part of the test suite looks like this:

PyUnit tests
MickeyMake core (with API)
Python
File System

We emphasize the file system in the diagram because tests need directory trees of files. These trees, and the tests themselves, need to be portable between Unix variants and Windows variants (except, perhaps, for a small subset of platform-specific configuration tests).
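Here is a minimal sketch of how a test might build its directory tree portably. It assumes nothing about MickeyMake itself: `build_tree` is a hypothetical helper, and the file names are made up. The point is only that paths are assembled from parts, so no separator is hard-coded, and that each test gets a scratch directory that is removed afterward.

```python
import os
import shutil
import tempfile
import unittest


def build_tree(root, relative_paths):
    """Create empty files under root.

    Each path is a tuple of parts, so the same test data works with
    either '/' or '\\' as the separator.
    """
    created = []
    for parts in relative_paths:
        path = os.path.join(root, *parts)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write("")
        created.append(path)
    return created


class TreeFixtureTest(unittest.TestCase):
    def setUp(self):
        # A fresh scratch directory per test, cleaned up even on failure.
        self.root = tempfile.mkdtemp(prefix="mickeymake-test-")
        self.addCleanup(shutil.rmtree, self.root)

    def test_tree_is_created(self):
        paths = build_tree(
            self.root,
            [("src", "main.c"), ("src", "lib", "util.c")],
        )
        for path in paths:
            self.assertTrue(os.path.isfile(path))
```

Saved in a file, this can be run with `python -m unittest`. A real test would go on to hand the tree to the Core API; that call is left out because the API isn't designed yet.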

We will also have a suite of tests that exercise the command-line and Mickeyfile parser. These only check that the right calls are made to the API. A fake API that records how it's been called is inserted in place of the MickeyMake core, like this:

Command-line and MickeyFile parsing tests
Command-line and MickeyFile interface
Fake MickeyMake API

These tests might be written in PyUnit, or they might use shell scripts (and depend on the Cygwin library on Windows). Probably PyUnit is preferable.

Note that it might be more convenient to have the Fake MickeyMake API be the real one run in a special logging mode; that's up to the developers.
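A sketch of what such a parsing test might look like. Everything named here is an assumption made for illustration: the `build` method, the `-n` dry-run flag, and the default target `all` are invented, since the real command line is defined by the MickeyMake proposal, not by this plan. What matters is the shape: the front end is handed a fake API, and the test inspects the recorded calls.

```python
import argparse
import unittest


class FakeCoreAPI:
    """Hypothetical stand-in for the MickeyMake core; it only records calls."""

    def __init__(self):
        self.calls = []

    def build(self, target, dry_run=False):
        self.calls.append(("build", target, dry_run))


def run_command_line(argv, api):
    """Hypothetical front end: parse argv, then make the matching API calls."""
    parser = argparse.ArgumentParser(prog="mickeymake")
    parser.add_argument("-n", "--dry-run", action="store_true")
    parser.add_argument("targets", nargs="*", default=["all"])
    options = parser.parse_args(argv)
    for target in options.targets:
        api.build(target, dry_run=options.dry_run)


class CommandLineParsingTest(unittest.TestCase):
    def test_dry_run_flag_reaches_api(self):
        api = FakeCoreAPI()
        run_command_line(["-n", "prog"], api)
        self.assertEqual(api.calls, [("build", "prog", True)])

    def test_default_target(self):
        api = FakeCoreAPI()
        run_command_line([], api)
        self.assertEqual(api.calls, [("build", "all", False)])
```

Passing the API in as an argument is what makes the substitution easy. If the developers prefer the real core in a logging mode, the tests stay the same and only the object passed in changes.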

Skippable design note: Note that we have two sets of tests. One purports to show that the command line and Mickeyfiles are parsed correctly and cause the right API calls, but it does not check whether the API routines work. The other checks the API routines, but nothing about parsing. It might be better to have all tests driven through the command line and Mickeyfiles, getting a good set of end-to-end tests. It's too often the case that part A seems to work, and part B seems to work, but something breaks when you put them together. Moreover, we earlier noted that the command-line interface is likely to be more stable than the API. We chose not to do end-to-end testing for these reasons:

Still, we have some end-to-end tests to provide extra reassurance.

It's too early to decide what to do about the GUI tests. If the GUI is simple (for example, a single screen), and the GUI is automatable, it would make sense to build some automated tests. These would punch a button on the GUI by calling (via Python) the particular "press button 12" command and record that the right API call results.

However, if the GUI is at all complicated, such tests are likely to break. For example, the path to that button might be through several screens, each of which is subject to redesign. In that case, the GUI tests should be manual, except for a few automated smoke tests that we'll commit to maintaining.

Test Development Strategy

Here is how the different types of tests will be created.

Happy Scientists

Our main goal with this testing is that no bugs prevent the product from acquiring the key core of early adopters necessary to success.

We will accomplish this by recruiting scientists. We will ask to copy their program's directory hierarchy and dependency structure (but not the actual code). We'll execute tests by modifying the dates on source files in the structure (or by creating or deleting object files), running MickeyMake, and seeing if it asks for the right commands to be executed. The tests will be designed using standard functional testing techniques.
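The mechanics of one such test might look like the sketch below. The function `stale_targets` is a hypothetical stand-in for whatever the MickeyMake core ends up providing; here it is a bare timestamp comparison so that the example runs on its own. Timestamps are set explicitly with `os.utime` rather than by sleeping, so the test doesn't depend on the file system's clock resolution.

```python
import os
import tempfile


def stale_targets(rules):
    """Hypothetical stand-in for the core: list targets needing a rebuild.

    rules maps each target path to the list of source paths it depends on.
    A target is stale if it is missing or older than any of its sources.
    """
    stale = []
    for target, sources in rules.items():
        if not os.path.exists(target):
            stale.append(target)
            continue
        target_time = os.path.getmtime(target)
        if any(os.path.getmtime(src) > target_time for src in sources):
            stale.append(target)
    return stale


def set_mtime(path, timestamp):
    """Set both access and modification time of path to timestamp."""
    os.utime(path, (timestamp, timestamp))


with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "main.c")
    obj = os.path.join(root, "main.o")
    for path in (source, obj):
        open(path, "w").close()
    rules = {obj: [source]}

    # Object newer than source: nothing should be rebuilt.
    set_mtime(source, 1_000_000_000)
    set_mtime(obj, 1_000_000_100)
    up_to_date_result = stale_targets(rules)

    # "Edit" the source by making it newer than the object.
    set_mtime(source, 1_000_000_200)
    after_edit_result = [os.path.basename(t) for t in stale_targets(rules)]

    # Delete the object file outright.
    os.remove(obj)
    after_delete_result = [os.path.basename(t) for t in stale_targets(rules)]
```

With a scientist's real directory structure, the rules and the set of files would come from the copied hierarchy, and the expected commands would be worked out by hand for each modification.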

The bulk of these tests will be run through the Core API. A subset will use the command line, Mickeyfiles, and (if automation is practical) the GUI. Those will be part of the smoke test.

We'll devote the bulk of any paid testing time to this task.

We expect the Software Carpentry coordinator (Greg) and judges to find these scientists. We'll look for a diversity of platforms (e.g., Unix and Windows), programming languages, and code layouts. (In particular, we want someone with a complicated directory structure. Preferably more than one.)

Usability Testing

At the point where there's a command-line interface and installable product, we'll recruit scientists (possibly the same as above) to be early adopters of MickeyMake. We'll lavish them with support and pay close attention to their problems. (It's a pity support will have to be remote.)

Note that we'll get usability testing (as well as other types) as the developers "eat their own dogfood" --- use MickeyMake in its own development.

Functional Testing

We'll exercise all the elements of the product interfaces in the usual way. These tests are a safety net for the "happy scientists" tests. They'll exercise only what isn't already exercised in those tests.

These functional tests will be written by the developers. The paid tester will spend some time teaching them how. (This will be a problem if the developers are scattered throughout the world.)

Configuration Testing

Here is the matrix of machines and operating systems to be supported by the product. Cells marked with "X" represent configurations available to the developers; empty cells are ones about which more information is needed. Gray cells are untargeted (perhaps nonsensical) configurations:

    Pentium-III     Cray     Sparc     Acorn  
Windows NT 4.0 sp6    X      
Red Hat Linux 6.2 X   X  
Solaris 7     X  
Windows 2000 X      

In real life, the configuration table would be rather more complicated. It might make more sense to make it a two-level list (or N-level list).

We will early on recruit volunteers who have access to the missing configurations. We will periodically (but not more often than weekly) send them the full automated test suite to run. They will mail back the results file. Ideally, the product will have enough logging that we won't have to ask them to do any debugging. In practice, configuration problems don't lend themselves to that solution, so these people will have to commit to some debugging.
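As a rough sketch of what the results file could contain: a header describing the configuration, followed by the ordinary test runner output. The function names and the header layout are my own invention; the `platform` and `unittest` calls are standard library.

```python
import io
import platform
import unittest


def describe_configuration():
    """Gather the facts we would otherwise have to ask the volunteer for."""
    return {
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
    }


def run_and_report(suite):
    """Run suite; return (success flag, text of the results file)."""
    stream = io.StringIO()
    for key, value in sorted(describe_configuration().items()):
        stream.write("%s: %s\n" % (key, value))
    stream.write("\n")
    runner = unittest.TextTestRunner(stream=stream, verbosity=2)
    result = runner.run(suite)
    return result.wasSuccessful(), stream.getvalue()
```

The volunteer would write the returned text to a file and mail it back. Putting the configuration at the top means a failure report is never separated from the machine it came from.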

The tests already described in previous sections, when run on multiple configurations, will form the bulk of the configuration tests. Those tests, however, are not specifically designed to probe configuration dependencies. So we'll augment them with tests that are. See the next section.

Installation Tests

We hope that the configuration bugs mentioned in the previous paragraph will, for this application, be the sort that crop up during installation. (Or, more precisely, could be detected during installation.)

Here are examples of these sorts of bugs:

(These are the sorts of problems that will get the scientist mad at the installer; hence, these tests address the installer as Victim.)

So, designing installation tests is a matter of knowing zillions of little facts. We don't have enough of this experience right now. We will seek it out with a web search --- and also with an early open beta. Then we'll make that experience concrete in the form of tests.

Note that installation tests will be partially manual. The installation tester will have to choose file systems, etc.

Stroke Tests

Stroke tests check every example and statement of fact in the user documentation. Incorrect documentation makes users unhappy and could prevent us from getting our core set of early adopter scientists. (Often, early adopters don't mind bad documentation; we don't think that's true of our target audience.)

Stroke tests should be run just after the documentation is finished. Whenever a sizable new group of people will get MickeyMake, stroke tests should be rechecked. The only exception is if they will also get some experienced user to help them.
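Some stroke tests can be automated. The sketch below assumes that at least part of the user documentation shows examples as Python interpreter sessions against the API, which may not be true of MickeyMake's documentation; examples of command-line use would need a different harness. The function `object_name` is a made-up API function. The standard library's doctest module does the checking.

```python
import doctest
import os


def object_name(source):
    """Hypothetical API function that the documentation describes."""
    return os.path.splitext(source)[0] + ".o"


def check_documentation(text, namespace):
    """Run every '>>>' example in text.

    Returns (failed, attempted, report), where report is the text of any
    failure messages.
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, dict(namespace), "user-docs", None, 0)
    messages = []
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=messages.append)
    return results.failed, results.attempted, "".join(messages)


USER_DOC = """
The object file for a C source has the same base name:

    >>> object_name("main.c")
    'main.o'
"""

failed, attempted, report = check_documentation(
    USER_DOC, {"object_name": object_name})
```

Statements of fact that aren't examples ("the default target is the first one in the file") still have to be checked by a person reading the documentation with the product in front of them.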

Milestones

Notes:

By the Development of internal classes (RuleBase, Target) milestone, we will have:

By the parser milestone, we will have:

By the rule execution milestone, we will have:

By the installation milestone, we will have:

The installation milestone is the point at which configuration testers and scientists get the software.

By the Addition of variables milestone, we will have:

At some point after variables have been added, we'll go to open beta. Before that happens:

By the GUI Design milestone, we will have:

By the GUI Implementation milestone, we will have:

All other milestones that appear during implementation will simply lead to more functional tests.

Omissions, Risks, and Problems

We're conspicuously ignoring everyone in the user base except scientists. We figure that if we please scientists, we'll come up with something acceptable to other users. Moreover, we'll then have the breathing room to improve MickeyMake to make everyone else happy.

Improving the Plan During Development

We don't plan to use coverage tools. There's a small budget for testing; the money can be better spent elsewhere.

We'll use Bugzilla to store user bug reports. Biweekly during the main development cycle, we'll look at all bugs that escaped testing and were found by early adopters or beta testers. If they suggest classes of missed tests, we'll revise this plan.

It's important to remember that one of the main goals of early testing is to improve this test plan. By getting some tests of all types out there, running and finding bugs, it will become much more clear how much testing is required. Testing's first deliverable is an early handle on scope and effort.


Let me now give some summary comments, based on feedback I've gotten.

First, this is a test plan produced early in the design process. I believe that many decisions made early in a project will be wrong, no matter how hard you try to be right. So you shouldn't try too hard. And you should avoid detail. By the time you get to the point where you'll need it, it will be useless. This sort of project test plan should cover only broad strategies and decisions that can't be put off.

If you are more confident about your ability to get things right early, you should add more detail.

Note: as I did in the original note, I want to emphasize that this is not a "one size fits all" test plan. In some contexts, you have to provide the detail. Or you really do know enough to plan the detail early on.

Second, I was serious when I said "the plan is a first part of a project-long conversation about what testing to do and how to do it." Ideally, the test plan is the result of people brainstorming in a room, arguing over risks, writing up drafts, letting the drafts clarify their thinking, and so on. As a result of that process, not the document, people come to consensus about what's important for this project. (As Eisenhower said, "the plan is nothing, the planning is everything".)

Note: I'm assuming that there will be no auditors swooping in to see that you've hewed to the plan. As a third party observer, I'd usually be happier to see people thoughtfully deviating from a plan than adhering to it.

Finally, I was also serious when I said the plan should be discarded, not updated. What comes out of the Milestones section of the plan is a task list or PERT chart (as well as related devices like McConnell's top ten risks list from Rapid Development). The document might be used for reference, but the project should be run from those lists and charts. I prefer them to be very public, like James Bach's Testing Dashboards or Jerry Weinberg's Public Project Progress Posters (Quality Software Management, Volume 2) or the use of 3x5 cards in Extreme Programming (Extreme Programming Explained by Kent Beck or http://www.xprogramming.com/).

Note: such devices probably work best with small projects.

I've received some specific objections, which I'll handle in the forum.


Thanks to Greg Wilson for converting an earlier version of this into HTML.

Footnotes

  1. James! Write this up!
  2. Note: since MickeyMake is intentionally simpler than the baroque tools programmers are currently using, the fact that it can't feed the baby and wash the cat doesn't really count as a bug. The real Software Carpentry Build entries should be more extensible and scalable, so expectations will be different.
  3. Note that the SC Test product will not be available in time for this project.


There is discussion in the Wiki Forum at page MickeyMakeTestPlan.
(The Forum is explained in its FrontPage.)


Copyright © 2000 by Brian Marick (marick@testing.com, http://www.testing.com). This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, draft v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub).