* Rename _PyImport_FindExtension() to _PyImport_FindExtensionUnicode():
the filename becomes a Unicode object instead of a byte string
* Rename _PyImport_FixupExtension() to _PyImport_FixupExtensionUnicode():
the filename becomes a Unicode object instead of a byte string
_Py_char2wchar() callers usually need the result size in characters. Since it's
trivial to compute it in _Py_char2wchar() (O(1) whereas wcslen() is O(n)), add
an option to get it.
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
Py_FileSystemDefaultEncoding is NULL
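A rough way to observe the filesystem codec from Python code is sketched below. It assumes that os.fsencode() and os.fsdecode() are the Python-level counterparts of the C functions named above; the helper names describe_fs_codec() and roundtrip() are hypothetical and exist only for illustration.

```python
import os
import sys


def describe_fs_codec():
    """Return the (encoding, error handler) pair Python uses for filenames."""
    return sys.getfilesystemencoding(), sys.getfilesystemencodeerrors()


def roundtrip(name):
    """Encode a str filename to bytes, then decode it back to str."""
    return os.fsdecode(os.fsencode(name))


if __name__ == "__main__":
    print(describe_fs_codec())
    print(roundtrip("spam.txt"))
```

The reported encoding and error handler depend on the platform and locale, so no particular value should be relied upon.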
* redecode_filenames() functions and _Py_code_object_list (issue #9630)
are no longer needed: remove them
* Don't define _Py_wstat() on Windows: Windows has its own _wstat() function
with a different API (the stat buffer has a different type)
* Include windows.h
* _Py_fopen() and _Py_stat() come from Python/import.c
* (_Py)_wrealpath() comes from Python/sysmodule.c
* _Py_char2wchar(), _Py_wchar2char() and _Py_wfopen() come from Modules/main.c
* (_Py)_wstat(), (_Py)_wgetcwd(), _Py_wreadlink() come from Modules/getpath.c
Redecode the filenames of:
- all modules: __file__ and __path__ attributes
- all code objects: co_filename attribute
- sys.path
- sys.meta_path
- sys.executable
- sys.path_importer_cache (keys)
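What "redecoding" a filename means can be sketched in Python as follows. This is an illustration only, not the C implementation: it assumes the name was first decoded with the surrogateescape error handler, so the original bytes can be recovered and decoded again with the new encoding. The helper names are hypothetical.

```python
def redecode_filename(name, old_encoding, new_encoding):
    """Recover the raw bytes of name and decode them with new_encoding.

    Assumes name was produced by decoding with old_encoding and the
    surrogateescape error handler, which makes the decoding reversible.
    """
    raw = name.encode(old_encoding, "surrogateescape")
    return raw.decode(new_encoding, "surrogateescape")


def redecode_all(names, old_encoding, new_encoding):
    """Apply redecode_filename() to every entry, e.g. a copy of sys.path."""
    return [redecode_filename(n, old_encoding, new_encoding) for n in names]


if __name__ == "__main__":
    # UTF-8 bytes that were decoded too early with ASCII.
    early = b"caf\xc3\xa9.py".decode("ascii", "surrogateescape")
    print(ascii(early))
    print(redecode_filename(early, "ascii", "utf-8"))
```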
Keep weak references to all code objects until initfsencoding() is called, to
be able to redecode the co_filename attribute of all code objects.
retry the select() loop instead of bailing out. This is because select()
can incorrectly report a socket as ready for reading (for example, if it
received some data with an invalid checksum).
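The retry idea can be sketched in Python as follows. The surrounding context of the original change is not shown here, so this is a generic illustration rather than the patched code: the function name recv_when_ready() and the max_spurious limit are assumptions, and the timeout is restarted on every retry for simplicity.

```python
import select
import socket


def recv_when_ready(sock, nbytes, timeout=1.0, max_spurious=5):
    """Wait with select(), then recv(); retry after a spurious wakeup.

    sock must be non-blocking. Returns the received bytes, or None if
    select() times out or the (illustrative) retry budget is exhausted.
    """
    for _attempt in range(max_spurious + 1):
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            return None  # genuine timeout: nothing to read
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # select() reported the socket as readable, but no data was
            # available: go back to select() instead of bailing out.
            continue
    return None


if __name__ == "__main__":
    a, b = socket.socketpair()
    b.setblocking(False)
    a.sendall(b"ping")
    print(recv_when_ready(b, 4))
    a.close()
    b.close()
```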
environment variable to set the filesystem encoding at Python startup.
sys.setfilesystemencoding() creates inconsistencies because it is unable to
reencode all filenames in all objects.
namespace if it occurs as a free variable in a nested block. This limitation
of the compiler has been lifted, and a new opcode introduced (DELETE_DEREF).
This sample was valid in 2.6, but fails to compile in 3.x without this change::
>>> def f():
...     def print_error():
...         print(e)
...     try:
...         something
...     except Exception as e:
...         print_error()
...         # implicit "del e" here
This sample has always been invalid in Python, and now works::
>>> def outer(x):
...     def inner():
...         return x
...     inner()
...     del x
There is no need to bump the PYC magic number: the new opcode is used
for code that did not compile before.
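The behaviour can be checked from Python with a small script such as the one below. It is a sketch: the helper opnames() is hypothetical, and opcode names are a CPython implementation detail that may differ between versions.

```python
import dis


def outer(x):
    def inner():
        return x

    before = inner()
    del x  # x is a free variable of inner(); this used to be a SyntaxError
    try:
        inner()
    except NameError:
        # The cell is now empty, so the nested lookup fails.
        return before, True
    return before, False


def opnames(func):
    """Return the set of opcode names used by func (implementation detail)."""
    return {ins.opname for ins in dis.get_instructions(func)}


if __name__ == "__main__":
    print(outer(5))
    print(sorted(opnames(outer)))
```

On interpreters that include this change, the printed opcode names are expected to include DELETE_DEREF.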
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).
The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
now return the correct value for large code points
- repr() may consider more characters as printable.
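As an illustration of the first point, the numeric properties of a code point outside the Basic Multilingual Plane can be queried as below. The helper numeric_properties() is hypothetical; U+1D7D0 (MATHEMATICAL BOLD DIGIT TWO) is chosen only as an example of a large code point.

```python
import unicodedata


def numeric_properties(ch):
    """Return the (decimal, digit, numeric) values of a single character."""
    return (
        unicodedata.decimal(ch),
        unicodedata.digit(ch),
        unicodedata.numeric(ch),
    )


if __name__ == "__main__":
    ch = chr(0x1D7D0)  # MATHEMATICAL BOLD DIGIT TWO
    print(unicodedata.name(ch), numeric_properties(ch))
```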
... to get the filename as a unicode object, instead of a byte string. This
function is needed to support unencodable filenames. Deprecate
PyModule_GetFilename() in favor of the new function.
Call _wfopen() on Windows, or fopen() otherwise. Return the new file object on
success, or NULL if the file cannot be opened or (if PyErr_Occurred()) on a
Unicode error.
It's a ParseTuple converter: decode bytes objects to unicode using
PyUnicode_DecodeFSDefaultAndSize(); str objects are output as-is.
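The converter's behaviour can be approximated in Python as follows. This is an analogue for illustration, not the C code: fs_decoder() is a hypothetical name, os.fsdecode() is assumed to stand in for PyUnicode_DecodeFSDefaultAndSize(), and rejecting other types with TypeError is an assumption of this sketch.

```python
import os


def fs_decoder(arg):
    """Return arg as str: bytes are decoded with the filesystem codec."""
    if isinstance(arg, str):
        return arg  # str objects are passed through unchanged
    if isinstance(arg, bytes):
        return os.fsdecode(arg)
    # Assumption for this sketch: anything else is rejected.
    raise TypeError("expected str or bytes, got %s" % type(arg).__name__)


if __name__ == "__main__":
    print(fs_decoder("setup.py"))
    print(fs_decoder(b"setup.py"))
```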
* Don't name the surrogateescape error handler in the comments or the
documentation; refer to PyUnicode_DecodeFSDefaultAndSize() and
PyUnicode_EncodeFSDefault() instead, because these functions use the strict
error handler for the mbcs encoding (on Windows).
* Remove PyUnicode_FSConverter() comment in unicodeobject.c to avoid
inconsistency with unicodeobject.h.