2000-07-19 14:19:49 -03:00
|
|
|
Writing Python Regression Tests
|
|
|
|
-------------------------------
|
2000-06-30 03:08:35 -03:00
|
|
|
Skip Montanaro
|
2000-07-19 14:19:49 -03:00
|
|
|
(skip@mojam.com)
|
|
|
|
|
|
|
|
|
|
|
|
Introduction
|
2000-06-30 03:08:35 -03:00
|
|
|
|
|
|
|
If you add a new module to Python or modify the functionality of an existing
|
2000-07-19 14:19:49 -03:00
|
|
|
module, you should write one or more test cases to exercise that new
|
2001-05-23 01:57:49 -03:00
|
|
|
functionality. There are different ways to do this within the regression
|
|
|
|
testing facility provided with Python; any particular test should use only
|
|
|
|
one of these options. Each option requires writing a test module using the
|
|
|
|
conventions of the the selected option:
|
|
|
|
|
|
|
|
- PyUnit based tests
|
|
|
|
- doctest based tests
|
|
|
|
- "traditional" Python test modules
|
2000-07-19 14:19:49 -03:00
|
|
|
|
2001-05-23 01:57:49 -03:00
|
|
|
Regardless of the mechanics of the testing approach you choose,
|
|
|
|
you will be writing unit tests (isolated tests of functions and objects
|
2000-07-19 14:19:49 -03:00
|
|
|
defined by the module) using white box techniques. Unlike black box
|
|
|
|
testing, where you only have the external interfaces to guide your test case
|
|
|
|
writing, in white box testing you can see the code being tested and tailor
|
|
|
|
your test cases to exercise it more completely. In particular, you will be
|
|
|
|
able to refer to the C and Python code in the CVS repository when writing
|
|
|
|
your regression test cases.
|
|
|
|
|
2001-05-23 01:57:49 -03:00
|
|
|
PyUnit based tests
|
|
|
|
|
|
|
|
The PyUnit framework is based on the ideas of unit testing as espoused
|
|
|
|
by Kent Beck and the Extreme Programming (XP) movement. The specific
|
|
|
|
interface provided by the framework is tightly based on the JUnit
|
|
|
|
Java implementation of Beck's original SmallTalk test framework. Please
|
|
|
|
see the documentation of the unittest module for detailed information on
|
|
|
|
the interface and general guidelines on writing PyUnit based tests.
|
|
|
|
|
|
|
|
The test_support helper module provides a single function for use by
|
|
|
|
PyUnit based tests in the Python regression testing framework:
|
|
|
|
run_unittest() takes a unittest.TestCase derived class as a parameter
|
|
|
|
and runs the tests defined in that class. All test methods in the
|
|
|
|
Python regression framework have names that start with "test_" and use
|
|
|
|
lower-case names with words separated with underscores.
|
|
|
|
|
|
|
|
doctest based tests
|
|
|
|
|
|
|
|
Tests written to use doctest are actually part of the docstrings for
|
|
|
|
the module being tested. Each test is written as a display of an
|
|
|
|
interactive session, including the Python prompts, statements that would
|
|
|
|
be typed by the user, and the output of those statements (including
|
|
|
|
tracebacks!). The module in the test package is simply a wrapper that
|
|
|
|
causes doctest to run over the tests in the module. The test for the
|
|
|
|
doctest module provides a convenient example:
|
|
|
|
|
|
|
|
import doctest
|
|
|
|
doctest.testmod(doctest, verbose=1)
|
|
|
|
|
|
|
|
See the documentation for the doctest module for information on
|
|
|
|
writing tests using the doctest framework.
|
|
|
|
|
|
|
|
"traditional" Python test modules
|
|
|
|
|
|
|
|
The mechanics of how the "traditional" test system operates are fairly
|
|
|
|
straightforward. When a test case is run, the output is compared with the
|
|
|
|
expected output that is stored in .../Lib/test/output. If the test runs to
|
|
|
|
completion and the actual and expected outputs match, the test succeeds, if
|
|
|
|
not, it fails. If an ImportError or test_support.TestSkipped error is
|
|
|
|
raised, the test is not run.
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
|
|
|
|
Executing Test Cases
|
|
|
|
|
|
|
|
If you are writing test cases for module spam, you need to create a file
|
|
|
|
in .../Lib/test named test_spam.py and an expected output file in
|
|
|
|
.../Lib/test/output named test_spam ("..." represents the top-level
|
|
|
|
directory in the Python source tree, the directory containing the configure
|
|
|
|
script). From the top-level directory, generate the initial version of the
|
|
|
|
test output file by executing:
|
|
|
|
|
|
|
|
./python Lib/test/regrtest.py -g test_spam.py
|
|
|
|
|
2001-05-23 01:57:49 -03:00
|
|
|
(If your test does not generate any output when run successfully, this
|
|
|
|
step may be skipped; no file containing expected output will be needed
|
|
|
|
in this case.)
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
Any time you modify test_spam.py you need to generate a new expected
|
2000-06-30 03:08:35 -03:00
|
|
|
output file. Don't forget to desk check the generated output to make sure
|
|
|
|
it's really what you expected to find! To run a single test after modifying
|
|
|
|
a module, simply run regrtest.py without the -g flag:
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
./python Lib/test/regrtest.py test_spam.py
|
|
|
|
|
|
|
|
While debugging a regression test, you can of course execute it
|
|
|
|
independently of the regression testing framework and see what it prints:
|
|
|
|
|
|
|
|
./python Lib/test/test_spam.py
|
2000-06-30 03:08:35 -03:00
|
|
|
|
|
|
|
To run the entire test suite, make the "test" target at the top level:
|
|
|
|
|
|
|
|
make test
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
On non-Unix platforms where make may not be available, you can simply
|
|
|
|
execute the two runs of regrtest (optimized and non-optimized) directly:
|
|
|
|
|
|
|
|
./python Lib/test/regrtest.py
|
|
|
|
./python -O Lib/test/regrtest.py
|
|
|
|
|
|
|
|
|
|
|
|
Test cases generate output based upon values computed by the test code.
|
|
|
|
When executed, regrtest.py compares the actual output generated by executing
|
|
|
|
the test case with the expected output and reports success or failure. It
|
|
|
|
stands to reason that if the actual and expected outputs are to match, they
|
|
|
|
must not contain any machine dependencies. This means your test cases
|
|
|
|
should not print out absolute machine addresses (e.g. the return value of
|
|
|
|
the id() builtin function) or floating point numbers with large numbers of
|
|
|
|
significant digits (unless you understand what you are doing!).
|
|
|
|
|
|
|
|
|
|
|
|
Test Case Writing Tips
|
2000-06-30 03:08:35 -03:00
|
|
|
|
|
|
|
Writing good test cases is a skilled task and is too complex to discuss in
|
|
|
|
detail in this short document. Many books have been written on the subject.
|
|
|
|
I'll show my age by suggesting that Glenford Myers' "The Art of Software
|
|
|
|
Testing", published in 1979, is still the best introduction to the subject
|
|
|
|
available. It is short (177 pages), easy to read, and discusses the major
|
|
|
|
elements of software testing, though its publication predates the
|
|
|
|
object-oriented software revolution, so doesn't cover that subject at all.
|
|
|
|
Unfortunately, it is very expensive (about $100 new). If you can borrow it
|
|
|
|
or find it used (around $20), I strongly urge you to pick up a copy.
|
|
|
|
|
|
|
|
The most important goal when writing test cases is to break things. A test
|
2000-07-19 14:19:49 -03:00
|
|
|
case that doesn't uncover a bug is much less valuable than one that does.
|
|
|
|
In designing test cases you should pay attention to the following:
|
|
|
|
|
|
|
|
* Your test cases should exercise all the functions and objects defined
|
|
|
|
in the module, not just the ones meant to be called by users of your
|
|
|
|
module. This may require you to write test code that uses the module
|
|
|
|
in ways you don't expect (explicitly calling internal functions, for
|
|
|
|
example - see test_atexit.py).
|
|
|
|
|
|
|
|
* You should consider any boundary values that may tickle exceptional
|
|
|
|
conditions (e.g. if you were writing regression tests for division,
|
|
|
|
you might well want to generate tests with numerators and denominators
|
|
|
|
at the limits of floating point and integer numbers on the machine
|
|
|
|
performing the tests as well as a denominator of zero).
|
|
|
|
|
|
|
|
* You should exercise as many paths through the code as possible. This
|
|
|
|
may not always be possible, but is a goal to strive for. In
|
|
|
|
particular, when considering if statements (or their equivalent), you
|
|
|
|
want to create test cases that exercise both the true and false
|
|
|
|
branches. For loops, you should create test cases that exercise the
|
|
|
|
loop zero, one and multiple times.
|
|
|
|
|
|
|
|
* You should test with obviously invalid input. If you know that a
|
|
|
|
function requires an integer input, try calling it with other types of
|
|
|
|
objects to see how it responds.
|
|
|
|
|
|
|
|
* You should test with obviously out-of-range input. If the domain of a
|
|
|
|
function is only defined for positive integers, try calling it with a
|
|
|
|
negative integer.
|
|
|
|
|
|
|
|
* If you are going to fix a bug that wasn't uncovered by an existing
|
|
|
|
test, try to write a test case that exposes the bug (preferably before
|
|
|
|
fixing it).
|
|
|
|
|
2000-10-23 13:37:14 -03:00
|
|
|
* If you need to create a temporary file, you can use the filename in
|
|
|
|
test_support.TESTFN to do so. It is important to remove the file
|
|
|
|
when done; other tests should be able to use the name without cleaning
|
|
|
|
up after your test.
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
|
|
|
|
Regression Test Writing Rules
|
|
|
|
|
|
|
|
Each test case is different. There is no "standard" form for a Python
|
|
|
|
regression test case, though there are some general rules:
|
|
|
|
|
|
|
|
* If your test case detects a failure, raise TestFailed (found in
|
|
|
|
test_support).
|
|
|
|
|
|
|
|
* Import everything you'll need as early as possible.
|
|
|
|
|
|
|
|
* If you'll be importing objects from a module that is at least
|
|
|
|
partially platform-dependent, only import those objects you need for
|
|
|
|
the current test case to avoid spurious ImportError exceptions that
|
|
|
|
prevent the test from running to completion.
|
|
|
|
|
|
|
|
* Print all your test case results using the print statement. For
|
|
|
|
non-fatal errors, print an error message (or omit a successful
|
|
|
|
completion print) to indicate the failure, but proceed instead of
|
|
|
|
raising TestFailed.
|
|
|
|
|
2000-08-23 02:28:45 -03:00
|
|
|
* Use "assert" sparingly, if at all. It's usually better to just print
|
|
|
|
what you got, and rely on regrtest's got-vs-expected comparison to
|
|
|
|
catch deviations from what you expect. assert statements aren't
|
|
|
|
executed at all when regrtest is run in -O mode; and, because they
|
|
|
|
cause the test to stop immediately, can lead to a long & tedious
|
|
|
|
test-fix, test-fix, test-fix, ... cycle when things are badly broken
|
|
|
|
(and note that "badly broken" often includes running the test suite
|
|
|
|
for the first time on new platforms or under new implementations of
|
|
|
|
the language).
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
|
|
|
|
Miscellaneous
|
|
|
|
|
|
|
|
There is a test_support module you can import from your test case. It
|
|
|
|
provides the following useful objects:
|
|
|
|
|
|
|
|
* TestFailed - raise this exception when your regression test detects a
|
|
|
|
failure.
|
|
|
|
|
2000-08-21 13:55:57 -03:00
|
|
|
* TestSkipped - raise this if the test could not be run because the
|
|
|
|
platform doesn't offer all the required facilities (like large
|
|
|
|
file support), even if all the required modules are available.
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
* findfile(file) - you can call this function to locate a file somewhere
|
|
|
|
along sys.path or in the Lib/test tree - see test_linuxaudiodev.py for
|
|
|
|
an example of its use.
|
|
|
|
|
|
|
|
* verbose - you can use this variable to control print output. Many
|
|
|
|
modules use it. Search for "verbose" in the test_*.py files to see
|
|
|
|
lots of examples.
|
|
|
|
|
2000-08-23 02:28:45 -03:00
|
|
|
* use_large_resources - true iff tests requiring large time or space
|
|
|
|
should be run.
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
* fcmp(x,y) - you can call this function to compare two floating point
|
|
|
|
numbers when you expect them to only be approximately equal withing a
|
|
|
|
fuzz factor (test_support.FUZZ, which defaults to 1e-6).
|
|
|
|
|
2000-08-23 02:28:45 -03:00
|
|
|
NOTE: Always import something from test_support like so:
|
|
|
|
|
|
|
|
from test_support import verbose
|
|
|
|
|
|
|
|
or like so:
|
|
|
|
|
|
|
|
import test_support
|
|
|
|
... use test_support.verbose in the code ...
|
|
|
|
|
|
|
|
Never import anything from test_support like this:
|
|
|
|
|
|
|
|
from test.test_support import verbose
|
|
|
|
|
|
|
|
"test" is a package already, so can refer to modules it contains without
|
|
|
|
"test." qualification. If you do an explicit "test.xxx" qualification, that
|
|
|
|
can fool Python into believing test.xxx is a module distinct from the xxx
|
|
|
|
in the current package, and you can end up importing two distinct copies of
|
|
|
|
xxx. This is especially bad if xxx=test_support, as regrtest.py can (and
|
|
|
|
routinely does) overwrite its "verbose" and "use_large_resources"
|
|
|
|
attributes: if you get a second copy of test_support loaded, it may not
|
|
|
|
have the same values for those as regrtest intended.
|
|
|
|
|
|
|
|
|
2000-07-19 14:19:49 -03:00
|
|
|
Python and C statement coverage results are currently available at
|
|
|
|
|
|
|
|
http://www.musi-cal.com/~skip/python/Python/dist/src/
|
|
|
|
|
|
|
|
As of this writing (July, 2000) these results are being generated nightly.
|
|
|
|
You can refer to the summaries and the test coverage output files to see
|
|
|
|
where coverage is adequate or lacking and write test cases to beef up the
|
|
|
|
coverage.
|