Commit Graph

6692 Commits

Author SHA1 Message Date
Victor Stinner fd957c124c
bpo-41796: Call _PyAST_Fini() earlier to fix a leak (GH-23131)
Call _PyAST_Fini() on all interpreters, not only on the main
interpreter. Also, call it ealier to fix a reference leak.

Python types contain a reference to themselves in in their
PyTypeObject.tp_mro member. _PyAST_Fini() must called before the last
GC collection to destroy AST types.

_PyInterpreterState_Clear() now calls _PyAST_Fini(). It now also
calls _PyWarnings_Fini() on subinterpeters, not only on the main
interpreter.

Add an assertion in AST init_types() to ensure that the _ast module
is no longer used after _PyAST_Fini() has been called.
2020-11-03 18:07:15 +01:00
Victor Stinner 45df61fd2d
bpo-26789: Fix logging.FileHandler._open() at exit (GH-23053)
The logging.FileHandler class now keeps a reference to the builtin
open() function to be able to open or reopen the file during Python
finalization.

Fix errors like:

    Exception ignored in: (...)
    Traceback (most recent call last):
      (...)
      File ".../logging/__init__.py", line 1463, in error
      File ".../logging/__init__.py", line 1577, in _log
      File ".../logging/__init__.py", line 1587, in handle
      File ".../logging/__init__.py", line 1649, in callHandlers
      File ".../logging/__init__.py", line 948, in handle
      File ".../logging/__init__.py", line 1182, in emit
      File ".../logging/__init__.py", line 1171, in _open
    NameError: name 'open' is not defined
2020-11-02 23:17:46 +01:00
Victor Stinner 5cf4782a26
bpo-41796: Make _ast module state per interpreter (GH-23024)
The ast module internal state is now per interpreter.

* Rename "astmodulestate" to "struct ast_state"
* Add pycore_ast.h internal header: the ast_state structure is now
  declared in pycore_ast.h.
* Add PyInterpreterState.ast (struct ast_state)
* Remove get_ast_state()
* Rename get_global_ast_state() to get_ast_state()
* PyAST_obj2mod() now handles get_ast_state() failures
2020-11-02 22:03:28 +01:00
Victor Stinner 4b9aad4999
bpo-42236: Enhance init and encoding documentation (GH-23109)
Enhance the documentation of the Python startup, filesystem encoding
and error handling, locale encoding. Add a new "Python UTF-8 Mode"
section.

* Add "locale encoding" and "filesystem encoding and error handler"
  to the glossary
* Remove documentation from Include/cpython/initconfig.h: move it to
  Doc/c-api/init_config.rst.
* Doc/c-api/init_config.rst:

  * Document command line options and environment variables
  * Document default values.

* Add a new "Python UTF-8 Mode" section in Doc/library/os.rst.
* Add warnings to Py_DecodeLocale() and Py_EncodeLocale() docs.
* Document how Python selects the filesystem encoding and error
  handler at a single place: PyConfig.filesystem_encoding and
  PyConfig.filesystem_errors.
* PyConfig: move orig_argv member at the right place.
2020-11-02 16:49:54 +01:00
Julien Danjou 64366fa9b3
bpo-41435: Add sys._current_exceptions() function (GH-21689)
This adds a new function named sys._current_exceptions() which is equivalent ot
sys._current_frames() except that it returns the exceptions currently handled
by other threads. It is equivalent to calling sys.exc_info() for each running
thread.
2020-11-02 16:16:25 +02:00
Victor Stinner e662c398d8
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.

In May 2010 (commit b744ba1d14), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.

In August 2020 (commit 94908bbc15), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.

In 10 years (2010 to 2020), I saw zero user report about the error
message related to nl_langinfo(CODESET) returning an empty string.

Today, UTF-8 became the defacto standard and it's safe to make the
assumption that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.

While this change is likely to not affect anyone in practice, it
should make UTF-8 lover happy ;-)

Rewrite also the documentation explaining how Python selects the
filesystem encoding and error handler.
2020-11-01 23:07:23 +01:00
Victor Stinner 82458b6cdb
bpo-42236: Enhance _locale._get_locale_encoding() (GH-23083)
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
  share code between _Py_GetLocaleEncodingObject()
  and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
  from the current locale encoding with surrogateescape,
  rather than using UTF-8.
2020-11-01 20:59:35 +01:00
Victor Stinner 710e826307
bpo-42208: Add _Py_GetLocaleEncoding() (GH-23050)
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.

Add config_get_fs_encoding() sub-function. Reorganize also
config_get_locale_encoding() code.
2020-10-31 01:02:09 +01:00
Victor Stinner eba5bf2f56
bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044)
The last GC collection is now done before clearing builtins and sys
dictionaries. Add also assertions to ensure that gc.collect() is no
longer called after _PyGC_Fini().

Pass also the tstate to PyInterpreterState_Clear() to pass the
correct tstate to _PyGC_CollectNoFail() and _PyGC_Fini().
2020-10-30 22:51:02 +01:00
Victor Stinner dff1ad5090
bpo-42208: Move _PyImport_Cleanup() to pylifecycle.c (GH-23040)
Move _PyImport_Cleanup() to pylifecycle.c, rename it to
finalize_modules(), split it (200 lines) into many smaller
sub-functions and cleanup the code.
2020-10-30 18:03:28 +01:00
Victor Stinner 8b3414818f
bpo-42208: Pass tstate to _PyGC_CollectNoFail() (GH-23038)
Move private _PyGC_CollectNoFail() to the internal C API.

Remove the private _PyGC_CollectIfEnabled() which was just an alias
to the public PyGC_Collect() function since Python 3.8.

Rename functions:

* collect() => gc_collect_main()
* collect_with_callback() => gc_collect_with_callback()
* collect_generations() => gc_collect_generations()
2020-10-30 17:00:00 +01:00
Neil Schemenauer 0564aafb71
bpo-42099: Fix reference to ob_type in unionobject.c and ceval (GH-22829)
* Use Py_TYPE() rather than o->ob_type.
2020-10-27 18:55:52 +00:00
Victor Stinner c9bc290dd6
bpo-42161: Use _PyLong_GetZero() and _PyLong_GetOne() (GH-22995)
Use _PyLong_GetZero() and _PyLong_GetOne()
in Objects/ and Python/ directories.
2020-10-27 02:24:34 +01:00
Victor Stinner 920cb647ba
bpo-42157: unicodedata avoids references to UCD_Type (GH-22990)
* UCD_Check() uses PyModule_Check()
* Simplify the internal _PyUnicode_Name_CAPI structure:

  * Remove size and state members
  * Remove state and self parameters of getcode() and getname()
    functions

* Remove global_module_state
2020-10-26 19:19:36 +01:00
Victor Stinner 47e1afd2a1
bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713)
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.

* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
2020-10-26 16:43:47 +01:00
Serhiy Storchaka b510e101f8
bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986)
If PyDict_GetItemWithError is only used to check whether the key is in dict,
it is better to use PyDict_Contains instead.

And if it is used in combination with PyDict_SetItem, PyDict_SetDefault can
replace the combination.
2020-10-26 12:47:57 +02:00
Serhiy Storchaka fb5db7ec58
bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648)
These functions are considered not safe because they suppress all internal errors
and can return wrong result.  PyDict_GetItemString and _PyDict_GetItemId can
also silence current exception in rare cases.

Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
2020-10-26 08:43:39 +02:00
TIGirardi f2312037e3
bpo-38324: Fix test__locale.py Windows failures (GH-20529)
Use wide-char _W_* fields of lconv structure on Windows
Remove "ps_AF" from test__locale.known_numerics on Windows
2020-10-20 12:39:52 +01:00
Pablo Galindo 109826c850
bpo-42093: Add opcode cache for LOAD_ATTR (GH-22803) 2020-10-20 06:22:44 +01:00
Kevin Adler 1dd6d956a3
closes bpo-42030: Remove legacy AIX dynload support (GH-22717)
Since c19c5a6, AIX builds have defaulted to using dynload_shlib over
dynload_aix when dlopen is available. This function has been available
since AIX 4.3, which went out of support in 2003, the same year the
previously referenced commit was made. It has been nearly 20 years
since a version of AIX has been supported which has not used
dynload_shlib so there's no reason to keep this legacy code around.
2020-10-16 13:03:28 -05:00
Hai Shi c9f696cb96
bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513)
* Move the codecs' (un)register operation to testcases.
* Remove _codecs._forget_codec() and _PyCodec_Forget()
2020-10-16 10:34:15 +02:00
Kevin Adler 2d2af320d9
bpo-41894: Fix UnicodeDecodeError while loading native module (GH-22466)
When running in a non-UTF-8 locale, if an error occurs while importing a
native Python module (say because a dependent share library is missing),
the error message string returned may contain non-ASCII code points
causing a UnicodeDecodeError.

PyUnicode_DecodeFSDefault is used for buffers which may contain
filesystem  paths. For consistency with os.strerror(),
PyUnicode_DecodeLocale is used for buffers which contain system error
messages. While the shortname parameter is always encoded in ASCII
according to PEP 489, it is left decoded using PyUnicode_FromString to
minimize the changes and since it should not affect the decoding (albeit
_potentially_ slower).

In dynload_hpux, since the error buffer contains a message generated
from a static ASCII string and the module filesystem path,
PyUnicode_DecodeFSDefault is used instead of PyUnicode_DecodeLocale as
is used elsewhere.

* bpo-41894: Fix bugs in dynload error msg handling

For both dynload_aix and dynload_hpux, properly handle the possibility
that decoding strings may return NULL and when such an error happens,
properly decrement any previously decoded strings and return early.

In addition, in dynload_aix, ensure that we pass the decoded string
*object* pathname_ob to PyErr_SetImportError instead of the original
pathname buffer.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2020-10-15 10:53:27 +09:00
Kevin Adler 0cafcd3c56
closes bpo-42029: Remove dynload_dl (GH-22687)
All references to this dynamic loading method were removed in b9949db,
when support for this method was dropped, but the implementation code
was not dropped (seemingly in oversight).
2020-10-13 20:49:24 -05:00
Kyle Evans 7992579cd2
bpo-40422: Move _Py_closerange to fileutils.c (GH-22680)
This API is relatively lightweight and organizationally, given that it's
used by multiple modules, it makes sense to move it to fileutils.

Requires making sure that _posixsubprocess is compiled with the appropriate
Py_BUIILD_CORE_BUILTIN macro.
2020-10-13 22:04:44 +02:00
Serhiy Storchaka 8287aadb75
bpo-41993: Fix possible issues in remove_module() (GH-22631)
* PyMapping_HasKey() is not safe because it silences all exceptions and can return incorrect result.
* Informative exceptions from PyMapping_DelItem() are overridden with RuntimeError and
  the original exception raised before calling remove_module() is lost.
* There is a race condition between PyMapping_HasKey() and PyMapping_DelItem().
2020-10-11 16:51:07 +03:00
Serhiy Storchaka fa1d83db62
bpo-42002: Clean up initialization of the sys module. (GH-22642)
Makes the code clearer and make errors handling more correct.
2020-10-11 15:30:43 +03:00
Batuhan Taskaya 22220ae216
bpo-38605: bump the magic number for 'annotations' future (#22630) 2020-10-10 15:19:46 -07:00
Serhiy Storchaka 98c4433a81
bpo-41991: Remove _PyObject_HasAttrId (GH-22629)
It can silence arbitrary exceptions.
2020-10-10 22:23:42 +03:00
Batuhan Taskaya 02a1603f91
bpo-42000: Cleanup the AST related C-code (GH-22641)
- Use the proper asdl sequence when creating empty arguments
- Remove reduntant casts (thanks to new typed asdl_sequences)
- Remove MarshalPrototypeVisitor and some utilities from asdl generator
- Fix the header of `Python/ast.c` (kept from pgen times)

Automerge-Triggered-By: @pablogsal
2020-10-10 10:14:59 -07:00
Vladimir Matveev 037245c5ac
bpo-41756: Add PyIter_Send function (#22443) 2020-10-09 17:15:15 -07:00
Batuhan Taskaya 044a1048ca
bpo-38605: Make 'from __future__ import annotations' the default (GH-20434)
The hard part was making all the tests pass; there are some subtle issues here, because apparently the future import wasn't tested very thoroughly in previous Python versions.

For example, `inspect.signature()` returned type objects normally (except for forward references), but strings with the future import. We changed it to try and return type objects by calling `typing.get_type_hints()`, but fall back on returning strings if that function fails (which it may do if there are future references in the annotations that require passing in a specific namespace to resolve).
2020-10-06 13:03:02 -07:00
Serhiy Storchaka dcc54215ac
bpo-41936. Remove macros Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION (GH-22552) 2020-10-05 12:32:00 +03:00
Victor Stinner bd0a08ea90
bpo-21955: Change my nickname in BINARY_ADD comment (GH-22481) 2020-10-01 18:57:37 +02:00
Mark Shannon 17b5be0c0a
bpo-41670: Remove outdated predict macro invocation. (GH-22026)
Remove PREDICTion of POP_BLOCK from FOR_ITER.
2020-09-29 10:09:13 +01:00
Hai Shi d332e7b816
bpo-41842: Add codecs.unregister() function (GH-22360)
Add codecs.unregister() and PyCodec_Unregister() functions
to unregister a codec search function.
2020-09-28 23:41:11 +02:00
Mark Shannon 02d126aa09
bpo-39934: Account for control blocks in 'except' in compiler. (GH-22395)
* Account for control blocks in 'except' in compiler. Fixes #39934.
2020-09-25 14:04:19 +01:00
Victor Stinner b7d8d8dbfe
bpo-40941: Fix stackdepth compiler warnings (GH-22377)
Explicitly cast a difference of two pointers to int:
PyFrameObject.f_stackdepth is an int.
2020-09-23 14:07:16 +02:00
Victor Stinner 71f2ff4ccf
bpo-40941: Fix fold_tuple_on_constants() compiler warnings (GH-22378)
Add explicit casts to fix compiler warnings in
fold_tuple_on_constants().

The limit of constants per code is now INT_MAX, rather than UINT_MAX.
2020-09-23 14:06:55 +02:00
Victor Stinner 19c3ac92bf
bpo-41834: Remove _Py_CheckRecursionLimit variable (GH-22359)
Remove the global _Py_CheckRecursionLimit variable: it has been
replaced by ceval.recursion_limit of the PyInterpreterState
structure.

There is no need to keep the variable for the stable ABI, since
Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() were not usable
in Python 3.8 and older: these macros accessed PyThreadState members,
whereas the PyThreadState structure is opaque in the limited C API.
2020-09-23 14:04:57 +02:00
Samuel Marks c322948892
bpo-41819: Fix compiler warning in init_dump_ascii_wstr() (GH-22332)
Fix the compiler warning:

format specifies type `wint_t` (aka `int`) but the argument has type `unsigned int`
2020-09-21 10:35:17 +02:00
Vladimir Matveev 2b05361bf7
bpo-41756: Introduce PyGen_Send C API (GH-22196)
The new API allows to efficiently send values into native generators
and coroutines avoiding use of StopIteration exceptions to signal 
returns.

ceval loop now uses this method instead of the old "private"
_PyGen_Send C API. This translates to 1.6x increased performance
of 'await' calls in micro-benchmarks.

Aside from CPython core improvements, this new API will also allow 
Cython to generate more efficient code, benefiting high-performance
IO libraries like uvloop.
2020-09-18 18:38:38 -07:00
Pablo Galindo a5634c4067
bpo-41746: Add type information to asdl_seq objects (GH-22223)
* Add new capability to the PEG parser to type variable assignments. For instance:
```
       | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```

* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`ast_seq_SET` macros now are typed.
* New `asdl_seq_GET_UNTYPED`/`ast_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Changes all possible `asdl_seq` types to use specific versions everywhere.
2020-09-16 19:42:00 +01:00
Victor Stinner e5fbe0cbd4
bpo-41631: _ast module uses again a global state (#21961)
Partially revert commit ac46eb4ad6662cf6d771b20d8963658b2186c48c:
"bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions.
2020-09-15 18:03:34 +02:00
Victor Stinner 640e8e1d5f
Fix compiler warnings in init_dump_ascii_wstr() (GH-22150)
Fix GCC 9.3 (using -O3) warnings on x86:

initconfig.c: In function ‘init_dump_ascii_wstr’:
initconfig.c:2679:34: warning: format ‘%lc’ expects argument of type
‘wint_t’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2679 |             PySys_WriteStderr("%lc", ch);
initconfig.c:2682:38: warning: format ‘%x’ expects argument of type
‘unsigned int’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2682 |             PySys_WriteStderr("\\x%02x", ch);
initconfig.c:2686:38: warning: format ‘%x’ expects argument of type
‘unsigned int’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2686 |             PySys_WriteStderr("\\U%08x", ch);
initconfig.c:2690:38: warning: format ‘%x’ expects argument of type
‘unsigned int’, but argument 2 has type ‘wchar_t’ {aka ‘long int’}
 2690 |             PySys_WriteStderr("\\u%04x", ch);
2020-09-09 12:07:17 +02:00
Serhiy Storchaka 58de1dd6a8
bpo-41525: Make the Python program help ASCII-only (GH-21836) 2020-09-09 01:28:02 +01:00
Victor Stinner f315142ddc
bpo-1635741: Port mashal module to multi-phase init (#22149)
Port the 'mashal' extension module to the multi-phase initialization
API (PEP 489).
2020-09-08 15:33:52 +02:00
han-solo 0d6aa7f0ee
bpo-41681: Fix for `f-string/str.format` error description when using 2 `,` in format specifier (GH-22036)
* Fixed `f-string/str.format` error description when using two `,` in format specifier.

Co-authored-by: millefalcon <hanish0019@hmail.com>
2020-09-01 10:34:29 -04:00
Tony Solomonik 75c80b0bda
closes bpo-41533: Fix a potential memory leak when allocating a stack (GH-21847)
Free the stack allocated in va_build_stack if do_mkstack fails
and the stack is not a small_stack
2020-08-29 23:53:08 -05:00
wmeehan 97eaf2b5e5
bpo-41524: fix pointer bug in PyOS_mystr{n}icmp (GH-21845)
* bpo-41524: fix pointer bug in PyOS_mystr{n}icmp

The existing implementations of PyOS_mystrnicmp and PyOS_mystricmp
can increment pointers beyond the end of a string.

This commit fixes those cases by moving the mutation out of the condition.

* 📜🤖 Added by blurb_it.

* Address comments

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2020-08-27 14:45:25 +09:00
Hai Shi 8aa163eea6
bpo-1635741: Explict GC collect after PyInterpreterState_Clear() (GH-21902)
Fix a reference cycle by triggering an explicit GC collection
after calling PyInterpreterState_Clear().
2020-08-17 22:36:19 +02:00