cpython/Python
Victor Stinner e662c398d8
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.

In May 2010 (commit b744ba1d14), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.

In August 2020 (commit 94908bbc15), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to avoid guessing the encoding and to investigate user
configurations where this case occurs.

In 10 years (2010 to 2020), I saw zero user reports about the error
message related to nl_langinfo(CODESET) returning an empty string.

Today, UTF-8 has become the de facto standard and it's safe to
assume that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.
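
The fallback described above can be sketched in a few lines of Python. This
is only an illustration, not the code from this commit: the real change is
in C (the listing below shows fileutils.c and initconfig.c touched by it),
and the helper names filesystem_encoding_from_codeset() and
current_codeset() are hypothetical. The sketch also ignores every other
input Python may consider when selecting the filesystem encoding.

```python
import locale


def filesystem_encoding_from_codeset(codeset):
    """Hypothetical helper: pick an encoding name from a CODESET value.

    Mirrors the rule stated in the commit message: an empty string from
    nl_langinfo(CODESET) now means "use UTF-8".
    """
    if not codeset:
        # Empty string: the locale did not name an encoding, so assume UTF-8
        # instead of warning (2010 behavior) or failing (August 2020 behavior).
        return "utf-8"
    return codeset


def current_codeset():
    """Hypothetical helper: query the C library where that is possible."""
    # locale.nl_langinfo() and locale.CODESET are not available on every
    # platform, so treat their absence like an empty answer.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        return locale.nl_langinfo(locale.CODESET)
    return ""


if __name__ == "__main__":
    print(filesystem_encoding_from_codeset(current_codeset()))
```

Passing an empty string exercises the new fallback directly, while any
non-empty value is returned unchanged.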

While this change is unlikely to affect anyone in practice, it
should make UTF-8 lovers happy ;-)

Also rewrite the documentation explaining how Python selects the
filesystem encoding and error handler.
2020-11-01 23:07:23 +01:00
clinic bpo-40471: Fix grammar typo in 'issubclass' docstring (GH-19847) 2020-06-03 06:19:45 -07:00
Python-ast.c bpo-41746: Add type information to asdl_seq objects (GH-22223) 2020-09-16 19:42:00 +01:00
README Issue #18093: Factor out the programs that embed the runtime 2014-07-25 21:52:14 +10:00
_warnings.c bpo-42161: Use _PyLong_GetZero() and _PyLong_GetOne() (GH-22995) 2020-10-27 02:24:34 +01:00
asdl.c bpo-41746: Add type information to asdl_seq objects (GH-22223) 2020-09-16 19:42:00 +01:00
ast.c bpo-42000: Cleanup the AST related C-code (GH-22641) 2020-10-10 10:14:59 -07:00
ast_opt.c bpo-38605: Make 'from __future__ import annotations' the default (GH-20434) 2020-10-06 13:03:02 -07:00
ast_unparse.c bpo-41746: Add type information to asdl_seq objects (GH-22223) 2020-09-16 19:42:00 +01:00
bltinmodule.c bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986) 2020-10-26 12:47:57 +02:00
bootstrap_hash.c bpo-40910: Export Py_GetArgcArgv() function (GH-20721) 2020-06-08 18:12:59 +02:00
ceval.c bpo-42099: Fix reference to ob_type in unionobject.c and ceval (GH-22829) 2020-10-27 18:55:52 +00:00
ceval_gil.h bpo-40513: Per-interpreter GIL (GH-19943) 2020-05-05 20:27:47 +02:00
codecs.c bpo-42157: unicodedata avoids references to UCD_Type (GH-22990) 2020-10-26 19:19:36 +01:00
compile.c bpo-42161: Use _PyLong_GetZero() and _PyLong_GetOne() (GH-22995) 2020-10-27 02:24:34 +01:00
condvar.h Typo fix: "throuhgh" should be "through". (GH-16704) 2019-10-10 20:43:13 -07:00
context.c bpo-40521: Fix _PyContext_Fini() (GH-21103) 2020-06-24 03:21:15 +02:00
dtoa.c bpo-40780: Fix failure of _Py_dg_dtoa to remove trailing zeros (GH-20435) 2020-05-29 14:23:57 +01:00
dup2.c bpo-32150: Expand tabs to spaces in C files. (#4583) 2017-11-28 17:56:10 +02:00
dynamic_annotations.c bpo-32241: Add the const qualifire to declarations of umodifiable strings. (#4748) 2017-12-12 13:55:04 +02:00
dynload_hpux.c bpo-41894: Fix UnicodeDecodeError while loading native module (GH-22466) 2020-10-15 10:53:27 +09:00
dynload_shlib.c bpo-41894: Fix UnicodeDecodeError while loading native module (GH-22466) 2020-10-15 10:53:27 +09:00
dynload_stub.c Issue #13959: Re-implement imp.get_suffixes() in Lib/imp.py. 2012-05-04 15:20:40 -04:00
dynload_win.c bpo-36346: Make using the legacy Unicode C API optional (GH-21437) 2020-07-10 23:26:06 +03:00
errors.c bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986) 2020-10-26 12:47:57 +02:00
fileutils.c bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086) 2020-11-01 23:07:23 +01:00
formatter_unicode.c bpo-41681: Fix for `f-string/str.format` error description when using 2 `,` in format specifier (GH-22036) 2020-09-01 10:34:29 -04:00
frozen.c bpo-36540: PEP 570 -- Implementation (GH-12701) 2019-04-29 13:36:57 +01:00
frozenmain.c bpo-40268: Remove a few pycore_pystate.h includes (GH-19510) 2020-04-14 17:52:15 +02:00
future.c bpo-38605: Make 'from __future__ import annotations' the default (GH-20434) 2020-10-06 13:03:02 -07:00
getargs.c bpo-41078: Rename pycore_tupleobject.h to pycore_tuple.h (GH-21056) 2020-06-22 17:27:35 +02:00
getcompiler.c closes bpo-31696: don't mention GCC in sys.version when building with clang (#3891) 2017-10-05 21:15:14 -07:00
getcopyright.c Bring Python into the next decade. (GH-17801) 2020-01-02 18:56:34 -08:00
getopt.c bpo-40527: Fix command line argument parsing (GH-19955) 2020-05-06 22:22:17 +09:00
getplatform.c bpo-32150: Expand tabs to spaces in C files. (#4583) 2017-11-28 17:56:10 +02:00
getversion.c bpo-32150: Expand tabs to spaces in C files. (#4583) 2017-11-28 17:56:10 +02:00
hamt.c bpo-29882: Add _Py_popcount32() function (GH-20518) 2020-06-08 16:30:33 +02:00
hashtable.c bpo-41061: Fix incorrect expressions in hashtable (GH-21028) 2020-06-22 00:41:48 -07:00
import.c bpo-42208: Move _PyImport_Cleanup() to pylifecycle.c (GH-23040) 2020-10-30 18:03:28 +01:00
importdl.c bpo-39573: Finish converting to new Py_IS_TYPE() macro (GH-18601) 2020-03-04 14:15:20 +01:00
importdl.h PEP 489: Multi-phase extension module initialization 2015-05-23 22:24:10 +10:00
importlib.h bpo-41323: Perform 'peephole' optimizations directly on the CFG. (GH-21517) 2020-07-30 10:03:00 +01:00
importlib_external.h bpo-38605: bump the magic number for 'annotations' future (#22630) 2020-10-10 15:19:46 -07:00
importlib_zipimport.h bpo-41323: Perform 'peephole' optimizations directly on the CFG. (GH-21517) 2020-07-30 10:03:00 +01:00
initconfig.c bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086) 2020-11-01 23:07:23 +01:00
makeopcodetargets.py makeopcodetargets.py: we need to import Lib/opcode.py 2016-03-26 01:04:37 +01:00
marshal.c bpo-1635741: Port mashal module to multi-phase init (#22149) 2020-09-08 15:33:52 +02:00
modsupport.c closes bpo-41533: Fix a potential memory leak when allocating a stack (GH-21847) 2020-08-29 23:53:08 -05:00
mysnprintf.c bpo-36020: Require vsnprintf() to build Python (GH-20899) 2020-06-16 00:54:44 +02:00
mystrtoul.c bpo-37752: Delete redundant Py_CHARMASK in normalizestring() (GH-15095) 2019-09-10 17:04:08 +01:00
opcode_targets.h bpo-39320: Handle unpacking of **values in compiler (GH-18141) 2020-01-27 09:57:45 +00:00
pathconfig.c bpo-29778: test_embed tests the path configuration (GH-21306) 2020-07-08 00:20:37 +02:00
preconfig.c _PyPreConfig_Read() decodes argv at each iteration (GH-20786) 2020-06-10 19:33:11 +02:00
pyarena.c bpo-36254: Fix invalid uses of %d in format strings in C. (GH-12264) 2019-03-13 22:59:55 +02:00
pyctype.c
pyfpe.c bpo-29137: Remove fpectl module (#4789) 2018-01-05 23:15:34 -08:00
pyhash.c bpo-40943: Replace PY_FORMAT_SIZE_T with "z" (GH-20781) 2020-06-10 18:38:05 +02:00
pylifecycle.c bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044) 2020-10-30 22:51:02 +01:00
pymath.c bpo-29782: Consolidate _Py_Bit_Length() (GH-20739) 2020-06-15 14:33:48 +02:00
pystate.c bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044) 2020-10-30 22:51:02 +01:00
pystrcmp.c bpo-41524: fix pointer bug in PyOS_mystr{n}icmp (GH-21845) 2020-08-27 14:45:25 +09:00
pystrhex.c bpo-40313: speed up bytes.hex() (GH-19594) 2020-04-20 17:17:52 -07:00
pystrtod.c bpo-35081: Move dtoa.h header to the internal C API (GH-18489) 2020-02-12 22:54:42 +01:00
pythonrun.c bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648) 2020-10-26 08:43:39 +02:00
pytime.c bpo-40650: Include winsock2.h in pytime.c, instead of a full windows.h (GH-20137) 2020-05-18 17:22:53 +01:00
structmember.c bpo-40268: Remove unused structmember.h includes (GH-19530) 2020-04-15 02:35:41 +02:00
symtable.c bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648) 2020-10-26 08:43:39 +02:00
sysmodule.c bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648) 2020-10-26 08:43:39 +02:00
thread.c bpo-40268: Remove explicit pythread.h includes (#19529) 2020-04-15 02:04:42 +02:00
thread_nt.h bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509) 2020-04-14 15:14:01 +02:00
thread_pthread.h Fix -Wstrict-prototypes warning in thread_pthread.h. (GH-21477) 2020-07-15 08:12:05 -05:00
traceback.c bpo-40421: Add PyFrame_GetBack() function (GH-19765) 2020-04-29 03:28:46 +02:00
wordcode_helpers.h bpo-31338 (#3374) 2017-09-14 18:13:16 -07:00

README

Miscellaneous source files for the main Python shared library