Commit Graph

285 Commits

Author SHA1 Message Date
Petr Viktorin 9769b7ae06
[3.13] gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520) (GH-120945)
* Add an InternalDocs file describing how interning should work and how to use it.

* Add internal functions to *explicitly* request what kind of interning is done:
  - `_PyUnicode_InternMortal`
  - `_PyUnicode_InternImmortal`
  - `_PyUnicode_InternStatic`

* Switch uses of `PyUnicode_InternInPlace` to those.

* Disallow using `_Py_SetImmortal` on strings directly.
  You should use `_PyUnicode_InternImmortal` instead:
  - Strings should be interned before immortalization, otherwise you're possibly
    interning a immortalizing copy.
  - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
    `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
    backports, as they are now part of public API and version-specific ABI.

* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
  - `_Py_ID`
  - `_Py_STR` (including the empty string)
  - one-character latin-1 singletons

  Now, when you intern a singleton, that exact singleton will be interned.

* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

* Intern `_Py_STR` singletons at startup.

* For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup.

* Beef up the tests. Cover internal details (marked with `@cpython_only`).

* Add lots of assertions

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
2024-06-24 20:24:19 +02:00
Brett Simmers c2627d6eea
gh-116322: Add Py_mod_gil module slot (#116882)
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.

PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.

A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
2024-05-03 11:30:55 -04:00
Russell Keith-Magee ab99438900
gh-114099: Modify preprocessor symbol usage to support older macOS SDKs (GH-118073)
Co-authored-by: Joshua Root jmr@macports.org
2024-04-19 14:56:33 -04:00
Donghee Na 94444ea45a
gh-112069: Add _PySet_NextEntryRef to be thread-safe. (gh-117990) 2024-04-19 00:18:22 +09:00
Russell Keith-Magee f006338017
gh-114099: Additions to standard library to support iOS (GH-117052)
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Malcolm Smith <smith@chaquo.com>
Co-authored-by: Ned Deily <nad@python.org>
2024-03-28 03:59:33 -04:00
Serhiy Storchaka d2d8332f71
gh-113626: Add allow_code parameter in marshal functions (GH-113648)
Passing allow_code=False prevents serialization and de-serialization of
code objects which is incompatible between Python versions.
2024-01-16 18:05:15 +02:00
Victor Stinner 5e4af2a3e9
gh-106320: Move private _PySet API to the internal API (#107041)
* Add pycore_setobject.h header file.
* Move the following API to the internal C API:

  * _PySet_Dummy
  * _PySet_NextEntry()
  * _PySet_Update()
2023-07-22 17:04:34 +02:00
Inada Naoki d5bd32fb48
gh-104922: remove PY_SSIZE_T_CLEAN (#106315) 2023-07-02 15:07:46 +09:00
Serhiy Storchaka 8bf6904b22
gh-101006: Improve error handling when read marshal data (GH-101007)
* EOFError no longer overrides other errors such as MemoryError or OSError at
  the start of the object.
* Raise more relevant error when the NULL object occurs as a code object
  component.
* Minimize an overhead of calling PyErr_Occurred().
2023-06-29 12:22:19 +03:00
Victor Stinner 00e75a3372
gh-106084: Remove old PyObject call aliases (#106085)
Remove old aliases which were kept backwards compatibility with
Python 3.8:

* _PyObject_CallMethodNoArgs()
* _PyObject_CallMethodOneArg()
* _PyObject_CallOneArg()
* _PyObject_FastCallDict()
* _PyObject_Vectorcall()
* _PyObject_VectorcallMethod()
* _PyVectorcall_Function()

Update code which used these aliases to use new names.
2023-06-26 08:08:12 +02:00
Dong-hee Na 058b960535
gh-103906: Remove immortal refcounting in compile/marshal.c (gh-103922) 2023-06-05 22:38:36 +09:00
Irit Katriel ee26ca13a1
gh-105184: document that marshal functions can fail and need to be checked with PyErr_Occurred (#105185) 2023-06-02 08:59:18 +01:00
Eric Snow a9c6e0618f
gh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (gh-104205)
Here we are doing no more than adding the value for Py_mod_multiple_interpreters and using it for stdlib modules.  We will start checking for it in gh-104206 (once PyInterpreterState.ceval.own_gil is added in gh-104204).
2023-05-05 21:11:27 +00:00
Mark Shannon 7559f5fda9
GH-101291: Rearrange the size bits in PyLongObject (GH-102464)
* Eliminate all remaining uses of Py_SIZE and Py_SET_SIZE on PyLongObject, adding asserts.

* Change layout of size/sign bits in longobject to support future addition of immortal ints and tagged medium ints.

* Add functions to hide some internals of long object, and for setting sign and digit count.

* Replace uses of IS_MEDIUM_VALUE macro with _PyLong_IsCompact().
2023-03-22 14:49:51 +00:00
Mark Shannon c1b1f51cd1
GH-101291: Refactor the `PyLongObject` struct into object header and PyLongValue struct. (GH-101292) 2023-01-30 10:03:04 +00:00
Victor Stinner 81f7359f67
gh-99537: Use Py_SETREF(var, NULL) in C code (#99687)
Replace "Py_DECREF(var); var = NULL;" with "Py_SETREF(var, NULL);".
2022-11-23 14:57:50 +01:00
Victor Stinner 8211cf5d28
gh-99300: Replace Py_INCREF() with Py_NewRef() (#99530)
Replace Py_INCREF() and Py_XINCREF() using a cast with Py_NewRef()
and Py_XNewRef().
2022-11-16 18:34:24 +01:00
Victor Stinner d8f239d86e
gh-99300: Use Py_NewRef() in Python/ directory (#99302)
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and
Py_XNewRef() in C files of the Python/ directory.
2022-11-10 09:03:39 +01:00
Brett Cannon 9711265182
gh-98925: Lower marshal recursion depth for WASI (GH-98938)
For wasmtime 2.0, the stack depth cost is 6% higher. This causes the default max `marshal` recursion depth to blow the stack.

As the default marshal depth is 2000 and Windows is set to 1000, split the difference and choose 1500 for WASI to be safe.
2022-11-01 15:51:05 -07:00
Inada Naoki 6dcfd6c5e3
gh-78214: marshal: Stabilize FLAG_REF usage (GH-8226)
Use FLAG_REF always for interned strings.

Refcounts of interned string is very unstable.
When compiling same source, refcounts of interned string in the output may be 1 or >1.
It makes FLAG_REF usage unstable.

To help reproducible build, use FLAG_REF for interned string even if refcnt(obj)==1.
2022-05-04 10:01:15 +09:00
Mark Shannon 944fffee89
GH-88116: Use a compact format to represent end line and column offsets. (GH-91666)
* Stores all location info in linetable to conform to PEP 626.

* Remove column table from code objects.

* Remove end-line table from code objects.

* Document new location table format
2022-04-21 16:10:37 +01:00
Victor Stinner 85addfb9c6
bpo-35134: Remove the Include/code.h header file (GH-32385)
Remove the Include/code.h header file. C extensions should only
include the main <Python.h> header file.

Python.h includes directly Include/cpython/code.h instead.
2022-04-07 02:29:52 +02:00
Brandt Bucher 2bde6827ea
bpo-46841: Quicken code in-place (GH-31888)
* Moves the bytecode to the end of the corresponding PyCodeObject, and quickens it in-place.

* Removes the almost-always-unused co_varnames, co_freevars, and co_cellvars member caches

* _PyOpcode_Deopt is a new mapping from all opcodes to their un-quickened forms.

* _PyOpcode_InlineCacheEntries is renamed to _PyOpcode_Caches

* _Py_IncrementCountAndMaybeQuicken is renamed to _PyCode_Warmup

* _Py_Quicken is renamed to _PyCode_Quicken

* _co_quickened is renamed to _co_code_adaptive (and is now a read-only memoryview).

* Do not emit unused nonzero opargs anymore in the compiler.
2022-03-21 11:11:17 +00:00
Victor Stinner 882d8096c2
bpo-46906: Add PyFloat_Pack8() to the C API (GH-31657)
Add new functions to pack and unpack C double (serialize and
deserialize):

* PyFloat_Pack2(), PyFloat_Pack4(), PyFloat_Pack8()
* PyFloat_Unpack2(), PyFloat_Unpack4(), PyFloat_Unpack8()

Document these functions and add unit tests.

Rename private functions and move them from the internal C API
to the public C API:

* _PyFloat_Pack2() => PyFloat_Pack2()
* _PyFloat_Pack4() => PyFloat_Pack4()
* _PyFloat_Pack8() => PyFloat_Pack8()
* _PyFloat_Unpack2() => PyFloat_Unpack2()
* _PyFloat_Unpack4() => PyFloat_Unpack4()
* _PyFloat_Unpack8() => PyFloat_Unpack8()

Replace the "unsigned char*" type with "char*" which is more common
and easy to use.
2022-03-12 00:10:02 +01:00
Eric Snow 81c72044a1
bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. (gh-30928)
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code.  It is still used in a number of non-builtin stdlib modules.

The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime.  A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).

https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.

The core of the change is in:

* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers

I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings.  That check is added to the PR CI config.

The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()).  This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.

The following are not changed (yet):

* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init

https://bugs.python.org/issue46541
2022-02-08 13:39:07 -07:00
Victor Stinner 8e5de40f90
bpo-35134: Move classobject.h to Include/cpython/ (GH-28968)
Move classobject.h, context.h, genobject.h and longintrepr.h header
files from Include/ to Include/cpython/.

Remove redundant "#ifndef Py_LIMITED_API" in context.h.

Remove explicit #include "longintrepr.h" in C files. It's not needed,
Python.h already includes it.
2021-10-15 09:46:29 +02:00
Victor Stinner 0a883a76cd
bpo-35134: Add Include/cpython/floatobject.h (GH-28957)
Split Include/floatobject.h into sub-files: add
Include/cpython/floatobject.h and
Include/internal/pycore_floatobject.h.
2021-10-14 23:41:06 +02:00
Victor Stinner d943d19172
bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895)
* Move _PyObject_CallNoArgs() to pycore_call.h (internal C API).
* _ssl, _sqlite and _testcapi extensions now call the public
  PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs().
* _lsprof extension is now built with Py_BUILD_CORE_MODULE macro
  defined to get access to internal _PyObject_CallNoArgs().
2021-10-12 08:38:19 +02:00
Victor Stinner ce3489cfdb
bpo-45439: Rename _PyObject_CallNoArg() to _PyObject_CallNoArgs() (GH-28891)
Fix typo in the private _PyObject_CallNoArg() function name: rename
it to _PyObject_CallNoArgs() to be consistent with the public
function PyObject_CallNoArgs().
2021-10-12 00:42:23 +02:00
Victor Stinner 7974c30b9f
bpo-45094: Add Py_NO_INLINE macro (GH-28140)
* Rename _Py_NO_INLINE macro to Py_NO_INLINE: make it public and
  document it.
* Sort macros in the C API documentation.
2021-09-03 16:44:02 +02:00
Brandt Bucher 51999c960e
bpo-37596: Clean up the set/frozenset marshalling code (GH-28068) 2021-08-31 09:18:33 -07:00
Brandt Bucher 33d95c6fac
bpo-37596: Make `set` and `frozenset` marshalling deterministic (GH-27926) 2021-08-25 13:14:34 +02:00
Gabriele N. Tornetta 2f180ce2cb
bpo-44530: Add co_qualname field to PyCodeObject (GH-26941) 2021-07-07 12:21:51 +01:00
Pablo Galindo 98eee94421
bpo-43950: Add code.co_positions (PEP 657) (GH-26955)
This PR is part of PEP 657 and augments the compiler to emit ending
line numbers as well as starting and ending columns from the AST
into compiled code objects. This allows bytecodes to be correlated
to the exact source code ranges that generated them.

This information is made available through the following public APIs:

* The `co_positions` method on code objects.
* The C API function `PyCode_Addr2Location`.

Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
Co-authored-by: Ammar Askar <ammar@ammaraskar.com>
2021-07-02 15:10:11 +01:00
Pablo Galindo 66c53b48e1
Fix compiler errors for unused variables in marshal.c (GH-26977) 2021-06-30 21:55:57 +01:00
Steve Dower 139de04518
bpo-41180: Replace marshal code.__new__ audit event with marshal.load[s] and marshal.dumps (GH-26961) 2021-06-30 17:21:37 +01:00
Guido van Rossum 355f5dd36a
bpo-43693: Turn localspluskinds into an object (GH-26749)
Managing it as a bare pointer to malloc'ed bytes is just too awkward in a few places.
2021-06-21 13:53:04 -07:00
Eric Snow 2ab27c4af4
bpo-43693: Un-revert commits 2c1e258 and b2bf2bc. (gh-26577)
These were reverted in gh-26530 (commit 17c4edc) due to refleaks.

* 2c1e258 - Compute deref offsets in compiler (gh-25152)
* b2bf2bc - Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)

This change fixes the refleaks.

https://bugs.python.org/issue43693
2021-06-07 12:22:26 -06:00
Pablo Galindo 17c4edc4e0
bpo-43693: Revert commits 2c1e2583fd and b2bf2bc1ec (GH-26530)
* Revert "bpo-43693: Compute deref offsets in compiler (gh-25152)"

This reverts commit b2bf2bc1ec.

* Revert "bpo-43693: Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)"

This reverts commit 2c1e2583fd.

These two commits are breaking the refleak buildbots.
2021-06-04 17:51:05 +01:00
Eric Snow 2c1e2583fd
bpo-43693: Add new internal code objects fields: co_fastlocalnames and co_fastlocalkinds. (gh-26388)
A number of places in the code base (notably ceval.c and frameobject.c) rely on mapping variable names to indices in the frame "locals plus" array (AKA fast locals), and thus opargs.  Currently the compiler indirectly encodes that information on the code object as the tuples co_varnames, co_cellvars, and co_freevars.  At runtime the dependent code must calculate the proper mapping from those, which isn't ideal and impacts performance-sensitive sections.  This is something we can easily address in the compiler instead.

This change addresses the situation by replacing internal use of co_varnames, etc. with a single combined tuple of names in locals-plus order, along with a minimal array mapping each to its kind (local vs. cell vs. free).  These two new PyCodeObject fields, co_fastlocalnames and co_fastllocalkinds, are not exposed to Python code for now, but co_varnames, etc. are still available with the same values as before (though computed lazily).

Aside from the (mild) performance impact, there are a number of other benefits:

* there's now a clear, direct relationship between locals-plus and variables
* code that relies on the locals-plus-to-name mapping is simpler
* marshaled code objects are smaller and serialize/de-serialize faster

Also note that we can take this approach further by expanding the possible values in co_fastlocalkinds to include specific argument types (e.g. positional-only, kwargs).  Doing so would allow further speed-ups in _PyEval_MakeFrameVector(), which is where args get unpacked into the locals-plus array.  It would also allow us to shrink marshaled code objects even further.

https://bugs.python.org/issue43693
2021-06-03 10:28:27 -06:00
Eric Snow 9f494d4929
bpo-43693: Add _PyCode_New(). (gh-26375)
This is an internal-only API that helps us manage the many values used to create a code object.

https://bugs.python.org/issue43693
2021-05-27 09:54:34 -06:00
Mark Shannon adcd220556
bpo-40222: "Zero cost" exception handling (GH-25729)
"Zero cost" exception handling.

* Uses a lookup table to determine how to handle exceptions.
* Removes SETUP_FINALLY and POP_TOP block instructions, eliminating (most of) the runtime overhead of try statements.
* Reduces the size of the frame object by about 60%.
2021-05-07 15:19:19 +01:00
Victor Stinner 00d7abd7ef
bpo-42519: Replace PyMem_MALLOC() with PyMem_Malloc() (GH-23586)
No longer use deprecated aliases to functions:

* Replace PyMem_MALLOC() with PyMem_Malloc()
* Replace PyMem_REALLOC() with PyMem_Realloc()
* Replace PyMem_FREE() with PyMem_Free()
* Replace PyMem_Del() with PyMem_Free()
* Replace PyMem_DEL() with PyMem_Free()

Modify also the PyMem_DEL() macro to use directly PyMem_Free().
2020-12-01 09:56:42 +01:00
Mark Shannon 877df851c3
bpo-42246: Partial implementation of PEP 626. (GH-23113)
* Implement new line number table format, as defined in PEP 626.
2020-11-12 09:43:29 +00:00
Victor Stinner f315142ddc
bpo-1635741: Port mashal module to multi-phase init (#22149)
Port the 'mashal' extension module to the multi-phase initialization
API (PEP 489).
2020-09-08 15:33:52 +02:00
tkmikan d160e0f8e2
bpo-41180: Audit code.__new__ when unmarshalling (GH-21271) 2020-07-03 21:56:30 +01:00
Victor Stinner a482dc500b
bpo-40602: Write unit tests for _Py_hashtable_t (GH-20091)
Cleanup also hashtable.c.
Rename _Py_hashtable_t members:

* Rename entries to nentries
* Rename num_buckets to nbuckets
2020-05-14 21:55:47 +02:00
Victor Stinner 5b0a30354d
bpo-40609: _Py_hashtable_t values become void* (GH-20065)
_Py_hashtable_t values become regular "void *" pointers.

* Add _Py_hashtable_entry_t.data member
* Remove _Py_hashtable_t.data_size member
* Remove _Py_hashtable_t.get_func member. It is no longer needed
  to specialize _Py_hashtable_get() for a specific value size, since
  all entries now have the same size (void*).
* Remove the following macros:

  * _Py_HASHTABLE_GET()
  * _Py_HASHTABLE_SET()
  * _Py_HASHTABLE_SET_NODATA()
  * _Py_HASHTABLE_POP()

* Rename _Py_hashtable_pop() to _Py_hashtable_steal()
* _Py_hashtable_foreach() callback now gets key and value rather than
  entry.
* Remove _Py_hashtable_value_destroy_func type. value_destroy_func
  callback now only has a single parameter: data (void*).
2020-05-13 04:40:30 +02:00
Victor Stinner 2d0a3d682f
bpo-40609: Add destroy functions to _Py_hashtable (GH-20062)
Add key_destroy_func and value_destroy_func parameters to
_Py_hashtable_new_full().

marshal.c and _tracemalloc.c use these destroy functions.
2020-05-13 02:50:18 +02:00
Victor Stinner f9b3b582b8
bpo-40609: Remove _Py_hashtable_t.key_size (GH-20060)
Rewrite _Py_hashtable_t type to always store the key as
a "const void *" pointer. Add an explicit "key" member to
_Py_hashtable_entry_t.

Remove _Py_hashtable_t.key_size member.

hash and compare functions drop their hash table parameter, and their
'key' parameter type becomes "const void *".
2020-05-13 02:26:02 +02:00