bpo-37531, bpo-38207: On timeout, regrtest no longer attempts to call
`popen.communicate() again: it can hang until all child processes
using stdout and stderr pipes completes. Kill the worker process and
ignores its output.
Reenable test_regrtest.test_multiprocessing_timeout().
bpo-37531: Change also the faulthandler timeout of the main process
from 1 minute to 5 minutes, for Python slowest buildbots.
* Add log() method: add timestamp and load average prefixes
to main messages.
* WindowsLoadTracker:
* LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL
* Initialize the load to the arithmetic mean of the first 5 values
of the Processor Queue Length value (so over 5 seconds), rather
than 0.0.
* Handle BrokenPipeError and when typeperf exit.
* format_duration(1.5) now returns '1.5 sec', rather than
'1 sec 500 ms'
* Fix TestWorkerProcess.__repr__(): start_time is only valid
if _popen is not None.
* Fix _kill(): don't set _killed to True if _popen is None.
* _run_process(): only set _killed to False after calling
run_test_in_subprocess().
* Windows: Fix counter name in WindowsLoadTracker. Counter names are
localized: use the registry to get the counter name. Original
change written by Lorenz Mende.
* Regrtest.main() now ensures that the Windows load tracker is also
killed if an exception is raised
* TestWorkerProcess now ensures that worker processes are no longer
running before exiting: kill also worker processes when an
exception is raised.
* Enhance regrtest messages and warnings: include test name,
duration, add a worker identifier, etc.
* Rename MultiprocessRunner to TestWorkerProcess
* Use print_warning() to display warnings.
Co-Authored-By: Lorenz Mende <Lorenz.mende@gmail.com>
When using multiprocesss (-jN), the main process now uses a timeout
of 60 seconds instead of the double of the --timeout value. The
buildbot server stops a job which does not produce any output in 1200
seconds.
A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
`pyconfig.h` generated by `./configure`) but actually applies to all
similarly-named files in the tree (e.g., `PC/pyconfig.h`.)
Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree. This means that if we
write the rules in the most obvious-looking way, then
* for specific files we want to ignore that happen to be in
subdirectories (like `Modules/config.c`), the rule will work
as intended, staying "rooted" to the top of the tree; but
* when a specific file we want to ignore happens to be at the root of
the repo (like `platform`), then the obvious rule (`platform`) will
apply much more broadly than intended: if someone tries to add a
file or directory named `platform` somewhere else in the tree, it
will unexpectedly get ignored.
That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.
To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.
Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.
For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed. In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash. If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.
* Write a message when killing a worker process
* Put a timeout on the second popen.communicate() call
(after killing the process)
* Put a timeout on popen.wait() call
* Catch popen.kill() and popen.wait() exceptions
Mark some individual tests to skip when --pgo is used. The tests
marked increase the PGO task time significantly and likely don't
help improve optimization of the final executable.
Reduce the number of unit tests run for the PGO generation task. This
speeds up the task by a factor of about 15x. Running the full unit test
suite is slow. This change may result in a slightly less optimized build
since not as many code branches will be executed. If you are willing to
wait for the much slower build, the old behavior can be restored using
'./configure [..] PROFILE_TASK="-m test --pgo-extended"'. We make no
guarantees as to which PGO task set produces a faster build. Users who
care should run their own relevant benchmarks as results can depend on
the environment, workload, and compiler tool chain.
bpo-15386, bpo-37473: test_import, regrtest and libregrtest no longer
import importlib as soon as possible, as the first import, "to test
bpo-15386".
It is tested by test_import.test_there_can_be_only_one().
Sort test_import imports.
urllib.request tests now call urlcleanup() to remove temporary files
created by urlretrieve() tests and to clear the _opener global
variable set by urlopen() and functions calling indirectly urlopen().
regrtest now checks if urllib.request._url_tempfiles and
urllib.request._opener are changed by tests.
Under some conditions the earlier fix for bpo-18075, "Infinite recursion
tests triggering a segfault on Mac OS X", now causes failures on macOS
when attempting to change stack limit with resource.setrlimit
resource.RLIMIT_STACK, like regrtest does when running the test suite.
The reverted change had specified a non-default stack size when linking
the python executable on macOS. As of macOS 10.14.4, the previous
code causes a hard failure when running tests, although similar
failures had been seen under some conditions under some earlier
systems. Reverting the change to the interpreter stack size at link
time helped for release builds but caused some tests to fail when
built --with-pydebug. Try the opposite approach: continue to build
the interpreter with an increased stack size on macOS and remove
the failing setrlimit call in regrtest initialization. This will
definitely avoid the resource.RLIMIT_STACK error and should have
no, or fewer, side effects.
* regrtest: Add --cleanup option to remove "test_python_*" directories
of previous failed test jobs.
* Add "make cleantest" to run "python3 -m test --cleanup".
regrtest now uses sys.unraisablehook() to mark a test as "environment
altered" (ENV_CHANGED) if it emits an "unraisable exception".
Moreover, regrtest logs a warning in this case.
Use "python3 -m test --fail-env-changed" to catch unraisable
exceptions in tests.
When using multiprocessing (-jN option), worker processes now create
their temporary directory inside the temporary directory of the
main process. So the main process is able to remove temporary
directories of worker processes even if they crash or when they are
killed by regrtest on KeyboardInterrupt (CTRL+c).
Rework also how multiprocessing arguments are parsed in main.py.
"python3 -m test -jN ..." now continues the execution of next tests
when a worker process crash (CHILD_ERROR state). Previously, the test
suite stopped immediately. Use --failfast to stop at the first error.
Moreover, --forever now also implies --failfast.
regrtest now always detects uncollectable objects. Previously, the
check was only enabled by --findleaks. The check now also works with
-jN/--multiprocess N.
--findleaks becomes a deprecated alias to --fail-env-changed.
Rewrite run_tests_multiprocess() function as a new MultiprocessRunner
class with multiple methods to better report errors and stop
immediately when needed.
Changes:
* Worker processes are now killed immediately if tests are
interrupted or if a test does crash (CHILD_ERROR): worker
processes are killed.
* Rewrite how errors in a worker thread are reported to
the main thread. No longer ignore BaseException or parsing errors
silently.
* Remove 'finished' variable: use worker.is_alive() instead
* Always compute omitted tests. Add Regrtest.get_executed() method.
* Add TestResult and MultiprocessResult types to ensure that results
always have the same fields.
* runtest() now handles KeyboardInterrupt
* accumulate_result() and format_test_result() now takes a TestResult
* cleanup_test_droppings() is now called by runtest() and mark the
test as ENV_CHANGED if the test leaks support.TESTFN file.
* runtest() now includes code "around" the test in the test timing
* Add print_warning() in test.libregrtest.utils to standardize how
libregrtest logs warnings to ease parsing the test output.
* support.unload() is now called with abstest rather than test_name
* Rename 'test' variable/parameter to 'test_name'
* dash_R(): remove unused the_module parameter
* Remove unused imports
dash_R() function of libregrtest doesn't call support.gc_collect()
directly anymore: it's already called by dash_R_cleanup().
Call dash_R_cleanup() before starting the loop.
Fix reference leak hunting in regrtest: compute also deltas (of
reference count, allocated memory blocks, file descriptor count)
during warmup, to ensure that everything is initialized before
starting to hunt reference leaks.
Other changes:
* Replace gc.collect() with support.gc_collect()
* Move calls to read memory statistics from dash_R_cleanup() to
dash_R()
* Pass regrtest 'ns' to dash_R()
* dash_R() is now more quiet with --quiet option (don't display
progress).
* Precompute the full range for "for it in range(repcount):" to
ensure that the iteration doesn't allocate anything new.
* dash_R() now is responsible to call warm_caches().
While Windows exposes the system processor queue length, the raw value
used for load calculations on Unix systems, it does not provide an API
to access the averaged value. Hence to calculate the load we must track
and average it ourselves. We can't use multiprocessing or a thread to
read it in the background while the tests run since using those would
conflict with test_multiprocessing and test_xxsubprocess.
Thus, we use Window's asynchronous IO API to run the tracker in the
background with it sampling at the correct rate. When we wish to access
the load we check to see if there's new data on the stream, if there is,
we update our load values.
TextTestRunner of unittest.runner now uses time.perf_counter() rather
than time.time() to measure the execution time of a test: time.time()
can go backwards, whereas time.perf_counter() is monotonic.
Similar change made in libregrtest, pprint and random.
Fix bug in `Lib/test/libregrtest/runtest.py` that makes running tests an extra time than the specified number of runs.
Add check for invalid --huntrleaks/-R parameters.
If tests are re-run, use "xxx then yyy" result format (ex: "FAILURE
then SUCCESS") to show that some failing tests have been re-run.
Add also test_regrtest.test_rerun_fail() test.
* "running:" progress: Format number of seconds as hours and minutes
* format_duration(): count also minutes as hours
* Create Lib/test/libregrtest/utils.py
* No longer clear filters, like --match, to re-run failed tests in
verbose mode (-w option).
* Tests result: always indicate if tests have been interrupted.
* Enhance tests summary
* Rename support._match_test() to support.match_test(): make it
public
* Remove support.match_tests global variable. It is replaced with a
new support.set_match_tests() function, so match_test() doesn't
have to check each time if patterns were modified.
* Rewrite match_test(): use different code paths depending on the
kind of patterns for best performances.
Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
kB (*kilo* byte) unit means 1000 bytes, whereas KiB ("kibibyte")
means 1024 bytes. KB was misused: replace kB or KB with KiB when
appropriate.
Same change for MB and GB which become MiB and GiB.
Change the output of Tools/iobench/iobench.py.
Round also the size of the documentation from 5.5 MB to 5 MiB.
When regrtest in run inside IDLE, sys.stdout and sys.stderr are not
TextIOWrapper objects and have no file descriptor associated:
sys.stderr.fileno() raises io.UnsupportedOperation.
Disable faulthandler and don't replace sys.stdout in that case.
Use a pool of integer objects toprevent false alarm when checking for
memory block leaks. Fill the pool with values in -1000..1000 which
are the most common (reference, memory block, file descriptor)
differences.
Co-Authored-By: Antoine Pitrou <pitrou@free.fr>
* Add Lib/test/pythoninfo.py: script collecting various informations
about Python to help debugging test failures.
* regrtest: remove sys.hash_info and sys.flags from header.
* Travis CI, Appveyor: run pythoninfo before tests
* bpo-26732: fix too many fds in processes started with the "forkserver" method
A child process would inherit as many fds as the number of still-running children.
* Add blurb and test comment
When running the test suite using --use=all / -u all, exclude tzdata
since it makes test_datetime too slow (15-20 min on some buildbots)
which then times out on some buildbots.
-u tzdata must now be enabled explicitly, -u tzdata or -u all,tzdata,
to run all test_datetime tests.
Fix also regrtest command line parser to allow passing -u
extralargefile to run test_zipfile64.
Travis CI: remove -tzdata. Replace -u all,-tzdata,-cpu with -u all,-cpu since tzdata is now excluded from -u all.
If threading_cleanup() fails to cleanup threads, set a a new
support.environment_altered flag to true, flag uses by save_env which
is used by regrtest to check if a test altered the environment. At
the end, the test file fails with ENV_CHANGED instead of SUCCESS, to
report that it altered the environment.
* Change the regrtest --huntrleaks checker to decide if a test file
leaks or not. Require that each run leaks at least 1 reference.
* Warmup runs are now completely ignored: ignored in the checker test
and not used anymore to compute the sum.
* Add an unit test for a reference leak.
Example of reference differences previously considered a failure
(leak) and now considered as success (success, no leak):
[3, 0, 0]
[0, 1, 0]
[8, -8, 1]
* bpo-30764: regrtest: change exit code on failure
* Exit code 2 if failed tests ("bad")
* Exit code 3 if interrupted
* bpo-30764: regrtest: add --fail-env-changed option
If the option is set, mark a test as failed if it alters the
environment, for example if it creates a file without removing it.
* regrtest --list-cases now supports --match and --match-file options.
Example: ./python -m test --list-cases -m FileTests test_os
* --list-cases now also sets support.verbose to False to prevent
messages to stdout when loading test modules.
* Add support._match_test() private function.
Use a build/ directory in the build directory, not in the source
directory, since the source directory may be read-only and must not
be modified.
Fallback on the source directory if the build directory is not
available (missing "abs_builddir" sysconfig variable).
* Add a new option taking a filename to get a list of test names to
filter tests.
* support.match_tests becomes a list.
* Modify run_unittest() to accept to match the whole test identifier,
not just a part of a test identifier.
For example, the following command only runs test_default_timeout()
of the BarrierTests class of test_threading:
$ ./python -m test -v test_threading -m test.test_threading.BarrierTests.test_default_timeout
Remove also some empty lines from test_regrtest.py to make flake8
tool happy.
Buildbots don't run tests with -vv and so only log "xxx was modified
by test_xxx" which is not enough to debug such random issue. In many
cases, I'm unable to reproduce the warning and so unable to fix it.
Always logging the value before and value after should help to debug
such warning on buildbots.
Issue #29362: Catch a crash of a worker process as a normal failure and
continue to run next tests. It allows to get the usual test summary: single
line result (OK/FAIL), total duration, etc.
It's sometimes hard to check quickly if tests succeeded, failed or something
bad happened. I added a final "Result: xxx" line which summarizes all outputs
into a single line, written at the end (it should always be the last line of
the output).
* regrtest now uses subprocesses when the -j1 command line option
is used: each test file runs in a fresh child process. Before, the -j1 option
was ignored.
* Tools/buildbot/test.bat script now uses -j1 by default to run
each test file in fresh child process.
* Replace get/restore methods with a Resource class and Resource subclasses
* Create ModuleAttr, ModuleAttrList and ModuleAttrDict helper classes
* Use __subclasses__() to get resource classes instead of using an hardcoded
list (2 shutil resources were missinged in the list!)
* Don't define MultiprocessingProcessDangling resource if the multiprocessing
module is missing
* Nicer diff for dictionaries. Useful for the big os.environ dict
* Reorder code to group resources
* Rename libregrtest.main_in_temp_cwd() to libregrtest.main()
* Add regrtest.main_in_temp_cwd() alias to libregrtest.main()
* Move old main_in_temp_cwd() code into libregrtest.Regrtest.main()
* Update multiple scripts to call libregrtest.main()
Issue #26538: libregrtest: Fix setup_tests() to keep module.__path__ type
(_NamespacePath), don't convert to a list.
Add _NamespacePath.__setitem__() method to importlib._bootstrap_external.
* Fix accumulate_result(): don't use time on interrupted and failed test
* Add unit test for interrupted test
* Add unit test on --slow with interrupted test, with and without
multiprocessing
* Fix "-m test --forever": replace _test_forever() with self._test_forever()
* Add unit test for --forever
* Add unit test for a failing test
* Fix also some pyflakes warnings in libregrtest
* Remove runtest_ns(): pass directly ns to runtest().
* Create also Regrtest.rerun_failed_tests() method.
* Inline again Regrtest.run_test(): it's no more justified to have a method
Slaves (child processes running tests for regrtest -jN) now inherit
--memlimit/-M, --threshold/-t and --nowindows/-n options.
* -M, -t and -n are now supported with -jN
* Factorize code to run tests.
* run_test_in_subprocess() now pass the whole "ns" namespace to the child
process.
Running the Python test suite with -jN now:
- Display the duration of tests which took longer than 30 seconds
- Display the tests currently running since at least 30 seconds
- Display the tests we are waiting for when the test suite is interrupted
Clenaup also run_test_in_subprocess() code.
Python doesn't display the refcount anymore by default. It only displays it
when -X showrefcount command line option is used, which is not the case here.
regrtest can be run with -X showrefcount, the option is not inherited by child
processes.
Move the code to run tests in multiple processes using threading and subprocess
to a new submodule.
Move also slave_runner() (renamed to run_tests_slave()) and
run_test_in_subprocess() (renamed to run_tests_in_subprocess()) there.
with attributes and methods.
The --threshold command line option is now ignored if the gc module is missing.
* Convert main() variables to Regrtest attributes, document some attributes
* Convert accumulate_result() function to a method
* Create setup_python() function and setup_regrtest() method.
* Import gc at top level
* Move resource.setrlimit() and the code to make the module paths absolute into
the new setup_python() function. So this code is no more executed when the
module is imported, only when main() is executed. We have a better control on
when the setup is done.
* Move textwrap import from printlist() to the top level.
* Some other minor cleanup.
Start to split regrtest.py into smaller parts with the creation of
Lib/test/libregrtest/cmdline.py: code to handle the command line, especially
parsing command line arguments. This part of the code is tested by
test_regrtest.