Refator Regrtest class:
* Rename finalize() finalize_tests().
* Pass tracer to run_test() and finalize_tests(). Remove Regrtest.tracer.
* run_test() does less things: move code to its caller.
* Regrtest.__init__() now copies 'ns' namespace attributes to
Regrtest attributes. Regrtest match_tests and ignore_tests
attributes have type FilterTuple (tuple), instead of a list.
* Add RunTests.copy(). Regrtest._rerun_failed_tests() now uses
RunTests.copy().
* Replace Regrtest.all_tests (list) with Regrtest.first_runtests
(RunTests).
* Make random_seed maximum 10x larger (9 digits, instead of 8).
* main() now calls _parse_args() and pass 'ns' to Regrtest
constructor. Remove kwargs argument from Regrtest.main().
* _parse_args() checks ns.huntrleaks.
* set_temp_dir() is now responsible to call expanduser().
* Regrtest.main() sets self.tests earlier.
* Add TestTuple and TestList types.
* Rename MatchTests to FilterTuple and rename MatchTestsDict
to FilterTestDict.
* TestResult.get_rerun_match_tests() return type
is now FilterTuple: return a tuple instead of a list.
RunTests.tests type becomes TestTuple.
When using --rerun option, regrtest now re-runs failed tests
in verbose mode in fresh worker processes to have more
deterministic behavior. So it can write its final report even
if a test killed a worker progress.
Add --fail-rerun option to regrtest: exit with non-zero exit code
if a test failed pass passed when re-run in verbose mode (in a
fresh process). That's now more useful since tests can pass
when re-run in a fresh worker progress, whereas they failed
when run after other tests when tests are run sequentially.
Rename --verbose2 option (-w) to --rerun. Keep --verbose2 as a
deprecated alias.
Changes:
* Fix and enhance statistics in regrtest summary. Add "(filtered)"
when --match and/or --ignore options are used.
* Add RunTests class.
* Add TestResult.get_rerun_match_tests() method
* Rewrite code to serialize/deserialize worker arguments as JSON
using a new WorkerJob class.
* Fix stats when a test is run with --forever --rerun.
* If failed test names cannot be parsed, log a warning and don't
filter tests.
* test_regrtest.test_rerun_success() now uses a marker file, since
the test is re-run in a separated process.
* Add tests on normalize_test_name() function.
* Add test_success() and test_skip() tests to test_regrtest.
test_netrc, test_pep646_syntax and test_xml_etree now return results
in the test_main() function.
Changes:
* Rewrite TestResult as a dataclass with a new State class.
* Add test.support.TestStats class and Regrtest.stats_dict attribute.
* libregrtest.runtest functions now modify a TestResult instance
in-place.
* libregrtest summary lists the number of run tests and skipped
tests, and denied resources.
* Add TestResult.has_meaningful_duration() method.
* Compute TestResult duration in the upper function.
* Use time.perf_counter() instead of time.monotonic().
* Regrtest: rename 'resource_denieds' attribute to 'resource_denied'.
* Rename CHILD_ERROR to MULTIPROCESSING_ERROR.
* Use match/case syntadx to have different code depending on the
test state.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Split test_multiprocessing_fork, test_multiprocessing_forkserver and
test_multiprocessing_spawn into test packages. Each package is made
of 4 sub-tests: processes, threads, manager and misc. It allows
running more tests in parallel and so reduce the total test duration.
Currently, test_asyncio package is only splitted into sub-tests when
using command "./python -m test". With this change, it's also
splitted when passing it on the command line:
"./python -m test test_asyncio".
Remove the concept of "STDTESTS". Python is now mature enough to not
have to bother with that anymore. Removing STDTESTS simplify the
code.
When running the Python test suite with -jN option, if a worker stdout
cannot be decoded from the locale encoding report a failed testn so the
exitcode is non-zero.
Display the sanitizers present in libregrtest.
Having this in the CI output for tests with the relevant environment
variable displayed will help make it easier to do what we need to
create an equivalent local test run.
These are stubs to be used for adding hypothesis (https://hypothesis.readthedocs.io/en/latest/) tests to the standard library.
When the tests are run in an environment where `hypothesis` and its various dependencies are not installed, the stubs will turn any tests with examples into simple parameterized tests and any tests without examples are skipped.
It also adds hypothesis tests for the `zoneinfo` module, and a Github Actions workflow to run the hypothesis tests as a non-required CI job.
The full hypothesis interface is not stubbed out — missing stubs can be added as necessary.
Co-authored-by: Zac Hatfield-Dodds <zac.hatfield.dodds@gmail.com>
This runs test_asyncio sub-tests in parallel using sharding from Cinder. This suite is typically the longest-pole in runs because it is a test package with a lot of further sub-tests otherwise run serially. By breaking out the sub-tests as independent modules we can run a lot more in parallel.
After porting we can see the direct impact on a multicore system.
Without this change:
Running make test is 5 min 26 seconds
With this change:
Running make test takes 3 min 39 seconds
That'll vary based on system and parallelism. On a `-j 4` run similar to what CI and buildbot systems often do, it reduced the overall test suite completion latency by 10%.
The drawbacks are that this implementation is hacky and due to the sorting of the tests it obscures when the asyncio tests occur and involves changing CPython test infrastructure but, the wall time saved it is worth it, especially in low-core count CI runs as it pulls a long tail. The win for productivity and reserved CI resource usage is significant.
Future tests that deserve to be refactored into split up suites to benefit from are test_concurrent_futures and the way the _test_multiprocessing suite gets run for all start methods. As exposed by passing the -o flag to python -m test to get a list of the 10 longest running tests.
---------
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google, LLC]
This is the implementation of PEP683
Motivation:
The PR introduces the ability to immortalize instances in CPython which bypasses reference counting. Tagging objects as immortal allows up to skip certain operations when we know that the object will be around for the entire execution of the runtime.
Note that this by itself will bring a performance regression to the runtime due to the extra reference count checks. However, this brings the ability of having truly immutable objects that are useful in other contexts such as immutable data sharing between sub-interpreters.
Remove the distutils package. It was deprecated in Python 3.10 by PEP
632 "Deprecate distutils module". For projects still using distutils
and cannot be updated to something else, the setuptools project can
be installed: it still provides distutils.
* Remove Lib/distutils/ directory
* Remove test_distutils
* Remove references to distutils
* Skip test_check_c_globals and test_peg_generator since they use
distutils
The Python test suite now fails wit exit code 4 if no tests ran. It
should help detecting typos in test names and test methods.
* Add "EXITCODE_" constants to Lib/test/libregrtest/main.py.
* Fix a typo: "NO TEST RUN" becomes "NO TESTS RAN"
On Windows, when the Python test suite is run with the -jN option,
the ANSI code page is now used as the encoding for the stdout
temporary file, rather than using UTF-8 which can lead to decoding
errors.
- Emscripten's default umask is too strict, see
https://github.com/emscripten-core/emscripten/issues/17269
- getuid/getgid and geteuid/getegid are stubs that always return 0
(root). Disable effective uid/gid syscalls and fix tests that use
chmod() current user.
- Cannot drop X bit from directory.
regrtest now also implements checking for leaked temporary files and
directories when using -jN for N >= 2. Use tempfile.mkdtemp() to
create the temporary directory. Skip this check on WASI.
When running tests with -jN, create a temporary directory per process
and mark a test as "environment changed" if a test leaks a temporary
file or directory.
Replace the child process `typeperf.exe` with a daemon thread that reads the performance counters directly. This prevents the issues that arise from inherited handles in grandchild processes (see issue37531 for discussion).
We only use the load tracker when running tests in multiprocess mode. This prevents inadvertent interactions with tests expecting a single threaded environment. Displaying load is really only helpful for buildbots running in multiprocess mode anyway.
Remove the asyncore and asynchat modules, deprecated in Python
3.6: use the asyncio module instead.
Remove the smtpd module, deprecated in Python 3.6: the aiosmtpd
module can be used instead, it is based on asyncio.
* Remove asyncore, asynchat and smtpd documentation
* Remove test_asyncore, test_asynchat and test_smtpd
* Rename Lib/asynchat.py to Lib/test/support/_asynchat.py
* Rename Lib/asyncore.py to Lib/test/support/_asyncore.py
* Rename Lib/smtpd.py to Lib/test/support/_smtpd.py
* Remove DeprecationWarning from private _asyncore, _asynchat and
_smtpd modules
* _smtpd: remove deprecated properties
Remove the --findleaks command line option of regrtest: use the
--fail-env-changed option instead. Since Python 3.7, it was a
deprecated alias to the --fail-env-changed option.
support.print_warning() now stores the original value of
sys.__stderr__ and uses it to log warnings. libregrtest uses the same
stream to log unraisable exceptions and uncaught threading
exceptions.
Partially revert commit dbe213de7ef28712bbfdb9d94a33abb9c33ef0c2:
libregrtest no longer replaces sys.__stdout__, sys.__stderr__, and
stdout and stderr file descriptors.
Remove also a few unused imports in libregrtest.
libregrtest -W/--verbose3 now also replace sys.__stdout__,
sys.__stderr__, and stdout and stderr file descriptors (fd 1 and fd
2).
support.print_warning() messages are now logged in the expected
order.
The "./python -m test test_eintr -W" command no longer logs into
stdout if the test pass.
When libregrtest spawns a worker process, stderr is now written into
stdout to keep messages order. Use a single pipe for stdout and
stderr, rather than two pipes. Previously, messages were out of order
which made analysis of buildbot logs harder
libregrtest now clears the type cache later to reduce the risk of
false alarm when checking for reference leaks. Previously, the type
cache was cleared too early and libregrtest raised a false alarm
about reference leaks under very specific conditions.
Move also support.gc_collect() outside clear/cleanup functions to
make the garbage collection more explicit.
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
* Move to a static argparse.Namespace subclass
* Roughly annotate runtest.py
* Refactor libregrtest to use lossless test result objects
* Only re-run test methods that match names of previously failing test methods
* Adopt tests to cover test method name matching
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
* Add co_firstinstr field to code object.
* Implement barebones quickening.
* Use non-quickened bytecode when tracing.
* Add NEWS item
* Add new file to Windows build.
* Don't specialize instructions with EXTENDED_ARG.
test.libregrtest now marks a test as ENV_CHANGED (altered the
execution environment) if a thread raises an exception but does not
catch it. It sets a hook on threading.excepthook. Use
--fail-env-changed option to mark the test as failed.
libregrtest regrtest_unraisable_hook() explicitly flushs
sys.stdout, sys.stderr and sys.__stderr__.
RegressionTestResult.USE_XML must now be set to True to get the JUnit
XML output.
Reduce the number of imports when --junit-xml=FILE option is not
used: 153 => 144 (-9).
Move clear_caches() from libregrtest.refleak to libregrtest.utils to
avoid importing libregrtest.refleak when it's not needed.
clear_caches() now only calls re.purge() if 're' is in sys.modules.
Reduce the number of modules imported by libregrtest.
saved_test_environment no longer imports modules at startup, but try
to get them from sys.modules. If an module is missing, skip the test.
It also sets directly support.environment_altered.
runtest() now now two saved_test_environment instances: one before
importing the test module, one after importing it.
Remove imports from test.libregrtest.save_env:
* asyncio
* logging
* multiprocessing
* shutil
* sysconfig
* urllib.request
* warnings
When a test method imports a module (ex: warnings) and the test
has a side effect (ex: add a warnings filter), the side effect is not
detected, because the module was not imported when Python
enters the saved_test_environment context manager.
test_repl.test_close_stdin() now calls
support.suppress_msvcrt_asserts() to fix the test on Windows.
* Move suppress_msvcrt_asserts() from test.libregrtest.setup to
test.support. Make its verbose parameter optional: verbose=False by
default.
* Add msvcrt.GetErrorMode().
* SuppressCrashReport now uses GetErrorMode() and SetErrorMode() of
the msvcrt module, rather than using ctypes.
* Remove also an unused variable (deadline) in wait_process().
Log "Warning -- ..." test warnings into sys.__stderr__ rather than
sys.stderr, to ensure to display them even if sys.stderr is captured.
test.libregrtest.utils.print_warning() now calls
test.support.print_warning().
When building Python in some uncommon platforms there are some known tests that will fail. Right now, the test suite has the ability to ignore entire tests using the -x option and to receive a filter file using the --matchfile filter. The problem with the --matchfile option is that it receives a file with patterns to accept and when you want to ignore a couple of tests and subtests, is too cumbersome to lists ALL tests that are not the ones that you want to accept and he problem with -x is that is not easy to ignore just a subtests that fail and the whole test needs to be ignored.
For these reasons, add a new option to allow to ignore a list of test and subtests for these situations.
test.regrtest now uses process groups in the multiprocessing mode
(-jN command line option) if process groups are available: if
os.setsid() and os.killpg() functions are available.
bpo-37531, bpo-38207: On timeout, regrtest no longer attempts to call
`popen.communicate() again: it can hang until all child processes
using stdout and stderr pipes completes. Kill the worker process and
ignores its output.
Reenable test_regrtest.test_multiprocessing_timeout().
bpo-37531: Change also the faulthandler timeout of the main process
from 1 minute to 5 minutes, for Python slowest buildbots.
* Add log() method: add timestamp and load average prefixes
to main messages.
* WindowsLoadTracker:
* LOAD_FACTOR_1 is now computed using SAMPLING_INTERVAL
* Initialize the load to the arithmetic mean of the first 5 values
of the Processor Queue Length value (so over 5 seconds), rather
than 0.0.
* Handle BrokenPipeError and when typeperf exit.
* format_duration(1.5) now returns '1.5 sec', rather than
'1 sec 500 ms'
* Fix TestWorkerProcess.__repr__(): start_time is only valid
if _popen is not None.
* Fix _kill(): don't set _killed to True if _popen is None.
* _run_process(): only set _killed to False after calling
run_test_in_subprocess().
* Windows: Fix counter name in WindowsLoadTracker. Counter names are
localized: use the registry to get the counter name. Original
change written by Lorenz Mende.
* Regrtest.main() now ensures that the Windows load tracker is also
killed if an exception is raised
* TestWorkerProcess now ensures that worker processes are no longer
running before exiting: kill also worker processes when an
exception is raised.
* Enhance regrtest messages and warnings: include test name,
duration, add a worker identifier, etc.
* Rename MultiprocessRunner to TestWorkerProcess
* Use print_warning() to display warnings.
Co-Authored-By: Lorenz Mende <Lorenz.mende@gmail.com>
When using multiprocesss (-jN), the main process now uses a timeout
of 60 seconds instead of the double of the --timeout value. The
buildbot server stops a job which does not produce any output in 1200
seconds.
A root cause of bpo-37936 is that it's easy to write a .gitignore
rule that's intended to apply to a specific file (e.g., the
`pyconfig.h` generated by `./configure`) but actually applies to all
similarly-named files in the tree (e.g., `PC/pyconfig.h`.)
Specifically, any rule with no non-trailing slashes is applied in an
"unrooted" way, to files anywhere in the tree. This means that if we
write the rules in the most obvious-looking way, then
* for specific files we want to ignore that happen to be in
subdirectories (like `Modules/config.c`), the rule will work
as intended, staying "rooted" to the top of the tree; but
* when a specific file we want to ignore happens to be at the root of
the repo (like `platform`), then the obvious rule (`platform`) will
apply much more broadly than intended: if someone tries to add a
file or directory named `platform` somewhere else in the tree, it
will unexpectedly get ignored.
That's surprising behavior that can make the .gitignore file's
behavior feel finicky and unpredictable.
To avoid it, we can simply always give a rule "rooted" behavior when
that's what's intended, by systematically using leading slashes.
Further, to help make the pattern obvious when looking at the file and
minimize any need for thinking about the syntax when adding new rules:
separate the rules into one group for each type, with brief comments
identifying them.
For most of these rules it's clear whether they're meant to be rooted
or unrooted, but in a handful of cases I've only guessed. In that
case the safer default (the choice that won't hide information) is the
narrower, rooted meaning, with a leading slash. If for some of these
the unrooted meaning is desired after all, it'll be easy to move them
to the unrooted section at the top.