Enhance the documentation of the Python startup, filesystem encoding
and error handling, locale encoding. Add a new "Python UTF-8 Mode"
section.
* Add "locale encoding" and "filesystem encoding and error handler"
to the glossary
* Remove documentation from Include/cpython/initconfig.h: move it to
Doc/c-api/init_config.rst.
* Doc/c-api/init_config.rst:
* Document command line options and environment variables
* Document default values.
* Add a new "Python UTF-8 Mode" section in Doc/library/os.rst.
* Add warnings to Py_DecodeLocale() and Py_EncodeLocale() docs.
* Document how Python selects the filesystem encoding and error
handler at a single place: PyConfig.filesystem_encoding and
PyConfig.filesystem_errors.
* PyConfig: move orig_argv member at the right place.
For example, fix the following Sphinx 3 errors:
Doc/c-api/buffer.rst:102: WARNING: Error in declarator or parameters
Invalid C declaration: Expected identifier in nested name. [error at 5]
void \*obj
-----^
Doc/c-api/arg.rst:130: WARNING: Unparseable C cross-reference: 'PyObject*'
Invalid C declaration: Expected end of definition. [error at 8]
PyObject*
--------^
The modified documentation is compatible with Sphinx 2 and Sphinx 3.
Fix two Sphinx 3 issues:
Doc/c-api/buffer.rst:304: WARNING: Duplicate C declaration, also defined in 'c-api/buffer'.
Declaration is 'PyBUF_ND'.
Doc/c-api/unicode.rst:1603: WARNING: Duplicate C declaration, also defined in 'c-api/unicode'.
Declaration is 'PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)'.
From the source for `PyUnicode_Decode`, the implementation is:
```
if (encoding == NULL) {
return PyUnicode_DecodeUTF8Stateful(s, size, errors, NULL);
}
```
which is pretty clearly not defaulting to ASCII.
---
I assume this needs neither a news entry nor bpo link.
Use UTF-8 as the system encoding on VxWorks.
The main reason are:
1. The locale is frequently misconfigured.
2. Missing some functions to deal with locale in VxWorks C library.
* Add %T format to PyUnicode_FromFormatV(), and so to
PyUnicode_FromFormat() and PyErr_Format(), to format an object type
name: equivalent to "%s" with Py_TYPE(obj)->tp_name.
* Replace Py_TYPE(obj)->tp_name with %T format in unicodeobject.c.
* Add unit test on %T format.
* Rename unicode_fromformat_write_cstr() to
unicode_fromformat_write_utf8(), to make the intent more explicit.
PyUnicode_DecodeLocaleAndSize(), PyUnicode_DecodeLocale() and
PyUnicode_EncodeLocale() now use always use the UTF-8 encoding on
Android, instead of the current locale encoding.
On Android API 19, mbstowcs() and wcstombs() are broken and cannot be
used.
Modify locale.localeconv(), time.tzname, os.strerror() and other
functions to ignore the UTF-8 Mode: always use the current locale
encoding.
Changes:
* Add _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx(). On decoding or
encoding error, they return the position of the error and an error
message which are used to raise Unicode errors in
PyUnicode_DecodeLocale() and PyUnicode_EncodeLocale().
* Replace _Py_DecodeCurrentLocale() with _Py_DecodeLocaleEx().
* PyUnicode_DecodeLocale() now uses _Py_DecodeLocaleEx() for all
cases, especially for the strict error handler.
* Add _Py_DecodeUTF8Ex(): return more information on decoding error
and supports the strict error handler.
* Rename _Py_EncodeUTF8_surrogateescape() to _Py_EncodeUTF8Ex().
* Replace _Py_EncodeCurrentLocale() with _Py_EncodeLocaleEx().
* Ignore the UTF-8 mode to encode/decode localeconv(), strerror()
and time zone name.
* Remove PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize()
and PyUnicode_EncodeLocale() now ignore the UTF-8 mode: always use
the "current" locale.
* Remove _PyUnicode_DecodeCurrentLocale(),
_PyUnicode_DecodeCurrentLocaleAndSize() and
_PyUnicode_EncodeCurrentLocale().