* gh-109989: Fix test_c_locale_coercion when PYTHONIOENCODING is set
This fixes the existing tests when PYTHONIOENCODING is
set by unsetting PYTHONIOENCODING.
Also add a test that explicitly checks what happens
when PYTHONIOENCODING is set.
Co-authored-by: Nikita Sobolev <mail@sobolevn.me>
Instead of explicitly enumerate test classes for run_unittest()
use the unittest ability to discover tests. This also makes these
tests discoverable and runnable with unittest.
load_tests() can be used for dynamic generating tests and adding
doctests. setUpModule(), tearDownModule() and addModuleCleanup()
can be used for running code before and after all module tests.
Python initialization now ensures that sys stream encoding
names are always normalized by codecs.lookup(encoding).name.
Simplify test_c_locale_coercion: it doesn't have to normalize
encoding names anymore.
* Revert "bpo-34589: Add -X coerce_c_locale command line option (GH-9378)"
This reverts commit dbdee0073c.
* Revert "bpo-34589: C locale coercion off by default (GH-9073)"
This reverts commit 7a0791b699.
* Revert "bpo-34589: Make _PyCoreConfig.coerce_c_locale private (GH-9371)"
This reverts commit 188ebfa475.
* Add support.MS_WINDOWS: True if Python is running on Microsoft Windows.
* Add support.MACOS: True if Python is running on Apple macOS.
* Replace support.is_android with support.ANDROID
* Replace support.is_jython with support.JYTHON
* Cleanup code to initialize unix_shell
Exactly which locale requests will end up giving
you the "C" locale is actually platform dependent.
A blank locale and "POSIX" will translate to "C"
on most Linux distros, but may not do so on other platforms, so this adjusts the way the tests are structured to better account for that.
This is an initial step towards fixing the current
test failure on Cygwin (hence the issue reference)
bpo-29240, bpo-32030: If the encoding change (C locale coerced or
UTF-8 Mode changed), Py_Main() now reads again the configuration with
the new encoding.
Changes:
* Add _Py_UnixMain() called by main().
* Rename pymain_free_pymain() to pymain_clear_pymain(), it can now be
called multipled times.
* Rename pymain_parse_cmdline_envvars() to pymain_read_conf().
* Py_Main() now clears orig_argc and orig_argv at exit.
* Remove argv_copy2, Py_Main() doesn't modify argv anymore. There is
no need anymore to get two copies of the wchar_t** argv.
* _PyCoreConfig: add coerce_c_locale and coerce_c_locale_warn.
* Py_UTF8Mode is now initialized to -1.
* Locale coercion (PEP 538) now respects -I and -E options.
* Add -X utf8 command line option, PYTHONUTF8 environment variable
and a new sys.flags.utf8_mode flag.
* If the LC_CTYPE locale is "C" at startup: enable automatically the
UTF-8 mode.
* Add _winapi.GetACP(). encodings._alias_mbcs() now calls
_winapi.GetACP() to get the ANSI code page
* locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8
mode. As a side effect, open() now uses the UTF-8 encoding by
default in this mode.
* Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding
in the UTF-8 Mode.
* Update subprocess._args_from_interpreter_flags() to handle -X utf8
* Skip some tests relying on the current locale if the UTF-8 mode is
enabled.
* Add test_utf8mode.py.
* _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to
return also the length (number of wide characters).
* pymain_get_global_config() and pymain_set_global_config() now
always copy flag values, rather than only copying if the new value
is greater than the old value.
AIX uses iso8859-1 in the C locale, not ASCII
AIX doesn't currently provide any of the locale
coercion locales, but we leave locale coercion
enabled in case one gets added in the future.
- On some versions of FreeBSD, setting the "UTF-8" locale
succeeds, but a subsequent "nl_langinfo(CODESET)" fails
- adding a check for this in the coercion logic means that
coercion will happen on systems where this check succeeds,
and will be skipped otherwise
- that way CPython should automatically adapt to changes in
platform behaviour, rather than needing a new release to
enable coercion at build time
- this also allows UTF-8 to be re-enabled as a coercion
target, restoring the locale coercion behaviour on Mac OS X
- removes PY_WARN_ON_C_LOCALE build time flag
- locale coercion and compatibility warnings are now always compiled
in, but are off by default
- adds PYTHONCOERCECLOCALE=warn runtime option to aid in
debugging potentially locale related compatibility problems
Due to not-yet-resolved test failures on *BSD systems (including
Mac OS X), this also temporarily disables UTF-8 as a locale coercion
target, and skips testing the interpreter's behavior in the POSIX locale.
In the C locale on Mac OS X, the default filesystem encoding
used for operating system interfaces is UTF-8, but the
default encoding used on the standard streams is still ASCII.
Setting the POSIX locale also behaves differently from setting
other locales on Mac OS X, so skip that in the test suite for now.
When checking for reference leaks, test_c_locale_coercion is run
multiple times and so _LocaleCoercionTargetsTestCase.setUpClass() is
called multiple times. setUpClass() appends new value at each call,
so it looks like a reference leak.
Moving the setup from setUpClass() to setUpModule() avoids
this, eliminating the false alarm.
- new PYTHONCOERCECLOCALE config setting
- coerces legacy C locale to C.UTF-8, C.utf8 or UTF-8 by default
- always uses C.UTF-8 on Android
- uses `surrogateescape` on stdin and stdout in the coercion
target locales
- configure option to disable locale coercion at build time
- configure option to disable C locale warning at build time