Commit Graph

155 Commits

Author SHA1 Message Date
Victor Stinner 32b7bcffba
gh-93103: Py_DecodeLocale() uses _PyRuntime.preconfig (#93187)
The Py_DecodeLocale() and Py_EncodeLocale() now use
_PyRuntime.preconfig, rather than Py_UTF8Mode and
Py_LegacyWindowsFSEncodingFlag global configuration varibles, to
decide if the UTF-8 encoding is used or not.

As documented, these functions must not be called before Python is
preinitialized. The new PyConfig API should now be used, rather than
using deprecated functions like Py_SetPath() or PySys_SetArgv().
2022-05-25 00:09:48 +02:00
Inada Naoki f9c9354a7a
gh-92536: PEP 623: Remove wstr and legacy APIs from Unicode (GH-92537) 2022-05-12 14:48:38 +09:00
Victor Stinner 7cdaf87ec5
gh-91731: Replace Py_BUILD_ASSERT() with static_assert() (#91730)
Python 3.11 now uses C11 standard which adds static_assert()
to <assert.h>.

* In pytime.c, replace Py_BUILD_ASSERT() with preprocessor checks on
  SIZEOF_TIME_T with #error.
* On macOS, py_mach_timebase_info() now accepts timebase members with
  the same size than _PyTime_t.
* py_get_monotonic_clock() now saturates GetTickCount64() to
  _PyTime_MAX: GetTickCount64() is unsigned, whereas _PyTime_t is
  signed.
2022-04-20 19:26:40 +02:00
Inada Naoki 6773203487
bpo-47000: Add `locale.getencoding()` (GH-32068) 2022-04-09 09:54:54 +09:00
Alexey Izbyshev 1c8b3b5d66
bpo-47260: Fix os.closerange() potentially being a no-op in a seccomp sandbox (GH-32418)
_Py_closerange() currently assumes that close_range() closes
all file descriptors even if it returns an error (other than ENOSYS).
This assumption can be wrong on Linux if a seccomp sandbox denies
the underlying syscall, pretending that it returns EPERM or EACCES.
In this case _Py_closerange() won't close any descriptors at all,
which in the worst case can be a security issue.

Fix this by falling back to other methods in case of any close_range()
error. Note that fallbacks will not be triggered on any problems with
closing individual file descriptors because close_range() is documented
to ignore such errors on both Linux[1] and FreeBSD[2].

[1] https://man7.org/linux/man-pages/man2/close_range.2.html
[2] https://www.freebsd.org/cgi/man.cgi?query=close_range&sektion=2
2022-04-08 10:40:39 -07:00
Serhiy Storchaka 6927632492
Remove trailing spaces (GH-31695) 2022-03-05 17:47:00 +02:00
neonene d4e64cd4b0
bpo-46362: Ensure ntpath.abspath() uses the Windows API correctly (GH-30571)
This makes ntpath.abspath()/getpath_abspath() follow normpath(), since some WinAPIs such as PathCchSkipRoot() require backslashed paths.
2022-01-13 23:35:42 +00:00
Steve Dower d81182b8ec
bpo-46217: Revert use of Windows constant that is newer than what we support (GH-30473) 2022-01-08 00:06:53 +00:00
neonene 9c5fa9c97c
bpo-46208: Fix normalization of relative paths in _Py_normpath()/os.path.normpath (GH-30362) 2022-01-06 19:13:10 +00:00
Steve Dower 99fcf15052
bpo-45582: Port getpath[p].c to Python (GH-29041)
The getpath.py file is frozen at build time and executed as code over a namespace. It is never imported, nor is it meant to be importable or reusable. However, it should be easier to read, modify, and patch than the previous code.

This commit attempts to preserve every previously tested quirk, but these may be changed in the future to better align platforms.
2021-12-03 00:08:42 +00:00
Eric Snow 17c61045c5
bpo-45506: Normalize _PyPathConfig.stdlib_dir when calculated. (#29040)
The recently added PyConfig.stdlib_dir was being set with ".." entries. When __file__ was added for from modules this caused a problem on out-of-tree builds. This PR fixes that by normalizing "stdlib_dir" when it is calculated in getpath.c.

https://bugs.python.org/issue45506
2021-10-22 17:20:03 -06:00
Eric Snow b9cdd0fb9c
bpo-45020: Default to using frozen modules unless running from source tree. (gh-28940)
The default was "off".  Switching it to "on" means users get the benefit of frozen stdlib modules without having to do anything.  There's a special-case for running-in-source-tree, so contributors don't get surprised when their stdlib changes don't get used.

https://bugs.python.org/issue45020
2021-10-16 13:16:08 -06:00
Victor Stinner aac29af678
bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914)
Include <stdlib.h> explicitly in C files.

Python.h includes <wchar.h>.
2021-10-13 19:25:53 +02:00
Christian Clauss db693df3e1
Fix typos in the Python directory (GH-28767) 2021-10-06 15:55:27 -07:00
Eric Snow 16b5bc6896
Do not check isabs() on Windows. (gh-28584)
I missed this in gh-28550.

https://bugs.python.org/issue45211
2021-09-27 10:52:19 -06:00
Eric Snow ae7839bbe8
bpo-45211: Move helpers from getpath.c to internal API. (gh-28550)
This accomplishes 2 things:

* consolidates some common code between getpath.c and getpathp.c
* makes the helpers available to code in other files

FWIW, the signature of the join_relfile() function (in fileutils.c) intentionally mirrors that of Windows' PathCchCombineEx().

Note that this change is mostly moving code around. No behavior is meant to change.

https://bugs.python.org/issue45211
2021-09-27 10:00:32 -06:00
Vincent Michel 06148b1870
bpo-44219: Release the GIL during isatty syscalls (GH-28250)
Release the GIL while performing isatty() system calls on arbitrary
file descriptors. In particular, this affects os.isatty(),
os.device_encoding() and io.TextIOWrapper. By extension,
io.open() in text mode is also affected.
2021-09-09 15:12:03 +02:00
Victor Stinner c24896c0e3
bpo-44849: Fix os.set_inheritable() on FreeBSD 14 with O_PATH (GH-27623)
Fix the os.set_inheritable() function on FreeBSD 14 for file
descriptor opened with the O_PATH flag: ignore the EBADF error on
ioctl(), fallback on the fcntl() implementation.
2021-08-06 15:15:10 +02:00
Jakub Kulík 9032cf5cb1
bpo-43667: Fix broken Unicode encoding in non-UTF locales on Solaris (GH-25096) 2021-04-30 15:21:42 +02:00
Segev Finer 5e437fb872
bpo-30555: Fix WindowsConsoleIO fails in the presence of fd redirection (GH-1927)
This works by not caching the handle and instead getting the handle from
the file descriptor each time, so that if the actual handle changes by
fd redirection closing/opening the console handle beneath our feet, we
will keep working correctly.
2021-04-23 23:00:27 +01:00
Victor Stinner 9976834f80
bpo-35883: Py_DecodeLocale() escapes invalid Unicode characters (GH-24843)
Python no longer fails at startup with a fatal error if a command
line argument contains an invalid Unicode character.

The Py_DecodeLocale() function now escapes byte sequences which would
be decoded as Unicode characters outside the [U+0000; U+10ffff]
range.

Use MAX_UNICODE constant in unicodeobject.c.
2021-03-17 21:46:53 +01:00
cptpcrd 7dc71c425c
bpo-42780: Fix set_inheritable() for O_PATH file descriptors on Linux (GH-24172) 2021-01-20 15:05:51 +01:00
Victor Stinner ca06440207
bpo-32381: Remove unused _Py_fopen() function (GH-23711)
Remove the private _Py_fopen() function which is no longer needed.
Use _Py_wfopen() or _Py_fopen_obj() instead.
2020-12-09 20:54:31 +01:00
pxinwr 06afac6c57
bpo-41462: Add os.set_blocking() support for VxWorks RTOS (GH-21713) 2020-12-07 21:41:12 +01:00
Victor Stinner 3529718925
bpo-42236: os.device_encoding() respects UTF-8 Mode (GH-23119)
On Unix, the os.device_encoding() function now returns 'UTF-8' rather
than the device encoding if the Python UTF-8 Mode is enabled.
2020-11-04 11:20:10 +01:00
Victor Stinner e662c398d8
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.

In May 2010 (commit b744ba1d14), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.

In August 2020 (commit 94908bbc15), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.

In 10 years (2010 to 2020), I saw zero user report about the error
message related to nl_langinfo(CODESET) returning an empty string.

Today, UTF-8 became the defacto standard and it's safe to make the
assumption that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.

While this change is likely to not affect anyone in practice, it
should make UTF-8 lover happy ;-)

Rewrite also the documentation explaining how Python selects the
filesystem encoding and error handler.
2020-11-01 23:07:23 +01:00
Victor Stinner 82458b6cdb
bpo-42236: Enhance _locale._get_locale_encoding() (GH-23083)
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
  share code between _Py_GetLocaleEncodingObject()
  and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
  from the current locale encoding with surrogateescape,
  rather than using UTF-8.
2020-11-01 20:59:35 +01:00
Victor Stinner 710e826307
bpo-42208: Add _Py_GetLocaleEncoding() (GH-23050)
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.

Add config_get_fs_encoding() sub-function. Reorganize also
config_get_locale_encoding() code.
2020-10-31 01:02:09 +01:00
TIGirardi f2312037e3
bpo-38324: Fix test__locale.py Windows failures (GH-20529)
Use wide-char _W_* fields of lconv structure on Windows
Remove "ps_AF" from test__locale.known_numerics on Windows
2020-10-20 12:39:52 +01:00
Kyle Evans 7992579cd2
bpo-40422: Move _Py_closerange to fileutils.c (GH-22680)
This API is relatively lightweight and organizationally, given that it's
used by multiple modules, it makes sense to move it to fileutils.

Requires making sure that _posixsubprocess is compiled with the appropriate
Py_BUIILD_CORE_BUILTIN macro.
2020-10-13 22:04:44 +02:00
Serhiy Storchaka 4c8f09d7ce
bpo-36346: Make using the legacy Unicode C API optional (GH-21437)
Add compile time option USE_UNICODE_WCHAR_CACHE. Setting it to 0
makes the interpreter not using the wchar_t cache and the legacy Unicode C API.
2020-07-10 23:26:06 +03:00
Serhiy Storchaka 6c6810d989
bpo-41094: Fix decoding errors with audit when open files. (GH-21095) 2020-06-24 08:46:05 +03:00
Christian Heimes 9672912e8f
bpo-40957: Fix refleak in _Py_fopen_obj() (GH-20827)
Signed-off-by: Christian Heimes <christian@python.org>
2020-06-14 00:57:22 +09:00
Victor Stinner 361dcdcefc
bpo-40268: Remove unused osdefs.h includes (GH-19532)
When the include is needed, add required symbol in a comment.
2020-04-15 03:24:57 +02:00
Serhiy Storchaka 8f87eefe7f
bpo-39943: Add the const qualifier to pointers on non-mutable PyBytes data. (GH-19472) 2020-04-12 14:58:27 +03:00
Victor Stinner 03a8a56fac
bpo-38353: Add subfunctions to getpath.c (GH-16572)
Following symbolic links is now limited to 40 attempts, just to
prevent loops.

Add subfunctions:

* Add resolve_symlinks()
* Add calculate_argv0_path_framework()
* Add calculate_which()
* Add calculate_program_macos()

Fix also _Py_wreadlink(): readlink() result type is Py_ssize_t, not
int.
2019-10-04 02:22:39 +02:00
Zackery Spytz 5be666010e bpo-37549: os.dup() fails for standard streams on Windows 7 (GH-15389) 2019-08-23 11:38:41 -07:00
Steve Dower df2d4a6f3d
bpo-37834: Normalise handling of reparse points on Windows (GH-15231)
bpo-37834: Normalise handling of reparse points on Windows
* ntpath.realpath() and nt.stat() will traverse all supported reparse points (previously was mixed)
* nt.lstat() will let the OS traverse reparse points that are not name surrogates (previously would not traverse any reparse point)
* nt.[l]stat() will only set S_IFLNK for symlinks (previous behaviour)
* nt.readlink() will read destinations for symlinks and junction points only

bpo-1311: os.path.exists('nul') now returns True on Windows
* nt.stat('nul').st_mode is now S_IFCHR (previously was an error)
2019-08-21 15:27:33 -07:00
Victor Stinner 3939c321c9
bpo-20443: _PyConfig_Read() gets the absolute path of run_filename (GH-14053)
Python now gets the absolute path of the script filename specified on
the command line (ex: "python3 script.py"): the __file__ attribute of
the __main__ module, sys.argv[0] and sys.path[0] become an absolute
path, rather than a relative path.

* Add _Py_isabs() and _Py_abspath() functions.
* _PyConfig_Read() now tries to get the absolute path of
  run_filename, but keeps the relative path if _Py_abspath() fails.
* Reimplement os._getfullpathname() using _Py_abspath().
* Use _Py_isabs() in getpath.c.
2019-06-25 15:02:43 +02:00
Zackery Spytz 28fca0c422 bpo-37267: Do not check for FILE_TYPE_CHAR in os.dup() on Windows (GH-14051)
On Windows, os.dup() no longer creates an inheritable fd when handling a
character file.
2019-06-17 09:17:14 +02:00
Steve Dower b82e17e626
bpo-36842: Implement PEP 578 (GH-12613)
Adds sys.audit, sys.addaudithook, io.open_code, and associated C APIs.
2019-05-23 08:45:22 -07:00
Victor Stinner e251095a3f
bpo-36775: Add _Py_FORCE_UTF8_FS_ENCODING macro (GH-13056)
Add _Py_FORCE_UTF8_LOCALE and _Py_FORCE_UTF8_FS_ENCODING macros to
avoid factorize "#if defined(__ANDROID__) || defined(__VXWORKS__)"
and "#if defined(__APPLE__)".

Cleanup also config_init_fs_encoding().
2019-05-02 11:28:57 -04:00
Victor Stinner faddaedd05
bpo-36352: Avoid hardcoded MAXPATHLEN size in getpath.c (GH-12423)
* Use Py_ARRAY_LENGTH() rather than hardcoded MAXPATHLEN in getpath.c.
* Pass string length to functions modifying strings.
2019-03-19 02:58:14 +01:00
Victor Stinner 1be0d1135f
bpo-36352: Clarify fileutils.h documentation (GH-12406)
The last parameter of _Py_wreadlink(), _Py_wrealpath() and
_Py_wgetcwd() is a length, not a size: number of characters including
the trailing NUL character.

Enhance also documentation of error conditions.
2019-03-18 17:47:26 +01:00
pxinwr f4b0a1c0da bpo-31904: Add encoding support for VxWorks RTOS (GH-12051)
Use UTF-8 as the system encoding on VxWorks.

The main reason are:

1. The locale is frequently misconfigured.
2. Missing some functions to deal with locale in VxWorks C library.
2019-03-04 10:02:06 +01:00
Victor Stinner 353933e712
bpo-34523: Fix C locale coercion on FreeBSD CURRENT (GH-10672)
bpo-34523, bpo-35290: C locale coercion now resets the Python
internal "force ASCII" mode. This change fix the filesystem encoding
on FreeBSD CURRENT, which has a new "C.UTF-8" locale, when
the UTF-8 mode is disabled.

Add _Py_ResetForceASCII(): _Py_SetLocaleFromEnv() now calls it.
2018-11-23 13:08:26 +01:00
Victor Stinner 02e6bf7f20
bpo-28604: Fix localeconv() for different LC_MONETARY (GH-10606)
locale.localeconv() now sets temporarily the LC_CTYPE locale to the
LC_MONETARY locale if the two locales are different and monetary
strings are non-ASCII. This temporary change affects other threads.

Changes:

* locale.localeconv() can now set LC_CTYPE to LC_MONETARY to decode
  monetary fields.
* Add LocaleInfo.grouping_buffer: copy localeconv() grouping string
  since it can be replaced anytime if a different thread calls
  localeconv().
* _Py_GetLocaleconvNumeric() now requires a "struct lconv *"
  structure, so locale.localeconv() now longer calls localeconv()
  twice. Moreover, the function now requires all arguments to be
  non-NULL.
* Rename STATIC_LOCALE_INFO_INIT to LocaleInfo_STATIC_INIT.
* Move _Py_GetLocaleconvNumeric() definition from fileutils.h
  to pycore_fileutils.h. pycore_fileutils.h now includes locale.h.
* The _locale module is now built with Py_BUILD_CORE defined.
2018-11-20 16:20:16 +01:00
Victor Stinner 9fc57a3848
bpo-35081: Add pycore_fileutils.h (GH-10371)
Move Py_BUILD_CORE code from Include/fileutils.h to a new
Include/internal/pycore_fileutils.h file.
2018-11-07 00:44:03 +01:00
Stéphane Wirtel 74a8b6ea7e bpo-24658: Fix read/write greater than 2 GiB on macOS (GH-1705)
On macOS, fix reading from and writing into a file with a size larger than 2 GiB.
2018-10-18 01:05:04 +02:00
Victor Stinner 3d4226a832
bpo-34523: Support surrogatepass in locale codecs (GH-8995)
Add support for the "surrogatepass" error handler in
PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault()
for the UTF-8 encoding.

Changes:

* _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the
  surrogatepass error handler (_Py_ERROR_SURROGATEPASS).
* _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use
  the _Py_error_handler enum instead of "int surrogateescape" to pass
  the error handler. These functions now return -3 if the error
  handler is unknown.
* Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx()
  in test_codecs.
* Rename get_error_handler() to _Py_GetErrorHandler() and expose it
  as a private function.
* _freeze_importlib doesn't need config.filesystem_errors="strict"
  workaround anymore.
2018-08-29 22:21:32 +02:00