From 2ab5b092e5a82390c236708b7c163a32dfc928a1 Mon Sep 17 00:00:00 2001 From: Nick Coghlan Date: Fri, 3 Jul 2015 19:49:15 +1000 Subject: [PATCH] Close #24458: PEP 489 documentation Patch by Petr Viktorin. --- Doc/c-api/init.rst | 2 + Doc/c-api/module.rst | 320 +++++++++++++++++++++++++++--------- Doc/extending/building.rst | 63 +++++-- Doc/extending/extending.rst | 7 + Doc/extending/windows.rst | 5 +- Doc/whatsnew/3.5.rst | 4 +- Misc/NEWS | 3 + 7 files changed, 306 insertions(+), 98 deletions(-) diff --git a/Doc/c-api/init.rst b/Doc/c-api/init.rst index e6d6d9fa87f..81823bf3830 100644 --- a/Doc/c-api/init.rst +++ b/Doc/c-api/init.rst @@ -873,6 +873,8 @@ been created. instead. +.. _sub-interpreter-support: + Sub-interpreter support ======================= diff --git a/Doc/c-api/module.rst b/Doc/c-api/module.rst index df9301f987b..ef778ccaedb 100644 --- a/Doc/c-api/module.rst +++ b/Doc/c-api/module.rst @@ -82,6 +82,18 @@ Module Objects Similar to :c:func:`PyModule_GetNameObject` but return the name encoded to ``'utf-8'``. +.. c:function:: void* PyModule_GetState(PyObject *module) + + Return the "state" of the module, that is, a pointer to the block of memory + allocated at module creation time, or *NULL*. See + :c:member:`PyModuleDef.m_size`. + + +.. c:function:: PyModuleDef* PyModule_GetDef(PyObject *module) + + Return a pointer to the :c:type:`PyModuleDef` struct from which the module was + created, or *NULL* if the module wasn't created from a definition. + .. c:function:: PyObject* PyModule_GetFilenameObject(PyObject *module) @@ -107,57 +119,25 @@ Module Objects unencodable filenames, use :c:func:`PyModule_GetFilenameObject` instead. -Per-interpreter module state -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Single-phase initialization creates singleton modules that can store additional -information as part of the interpreter, allow that state to be retrieved later -with only a reference to the module definition, rather than to the module -itself. - -.. 
c:function:: void* PyModule_GetState(PyObject *module) - - Return the "state" of the module, that is, a pointer to the block of memory - allocated at module creation time, or *NULL*. See - :c:member:`PyModuleDef.m_size`. - - -.. c:function:: PyModuleDef* PyModule_GetDef(PyObject *module) - - Return a pointer to the :c:type:`PyModuleDef` struct from which the module was - created, or *NULL* if the module wasn't created with - :c:func:`PyModule_Create`. - -.. c:function:: PyObject* PyState_FindModule(PyModuleDef *def) - - Returns the module object that was created from *def* for the current interpreter. - This method requires that the module object has been attached to the interpreter state with - :c:func:`PyState_AddModule` beforehand. In case the corresponding module object is not - found or has not been attached to the interpreter state yet, it returns NULL. - -.. c:function:: int PyState_AddModule(PyObject *module, PyModuleDef *def) - - Attaches the module object passed to the function to the interpreter state. This allows - the module object to be accessible via - :c:func:`PyState_FindModule`. - - .. versionadded:: 3.3 - -.. c:function:: int PyState_RemoveModule(PyModuleDef *def) - - Removes the module object created from *def* from the interpreter state. - - .. versionadded:: 3.3 +.. _initializing-modules: Initializing C modules ^^^^^^^^^^^^^^^^^^^^^^ +Module objects are usually created from extension modules (shared libraries +which export an initialization function), or compiled-in modules +(where the initialization function is added using :c:func:`PyImport_AppendInittab`). +See :ref:`building` or :ref:`extending-with-embedding` for details. + +The initialization function can either pass a module definition instance +to :c:func:`PyModule_Create`, and return the resulting module object, +or request "multi-phase initialization" by returning the definition struct itself. + .. 
c:type:: PyModuleDef - This struct holds all information that is needed to create a module object. - There is usually only one static variable of that type for each module, which - is statically initialized and then passed to :c:func:`PyModule_Create` in the - module initialization function. + The module definition struct, which holds all information needed to create + a module object. There is usually only one statically initialized variable + of this type for each module. .. c:member:: PyModuleDef_Base m_base @@ -174,19 +154,21 @@ Initializing C modules .. c:member:: Py_ssize_t m_size - Some modules allow re-initialization (calling their ``PyInit_*`` function - more than once). These modules should keep their state in a per-module - memory area that can be retrieved with :c:func:`PyModule_GetState`. + Module state may be kept in a per-module memory area that can be + retrieved with :c:func:`PyModule_GetState`, rather than in static globals. + This makes modules safe for use in multiple sub-interpreters. - This memory should be used, rather than static globals, to hold per-module - state, since it is then safe for use in multiple sub-interpreters. It is - freed when the module object is deallocated, after the :c:member:`m_free` - function has been called, if present. + This memory area is allocated based on *m_size* on module creation, + and freed when the module object is deallocated, after the + :c:member:`m_free` function has been called, if present. - Setting ``m_size`` to ``-1`` means that the module can not be - re-initialized because it has global state. Setting it to a non-negative - value means that the module can be re-initialized and specifies the - additional amount of memory it requires for its state. + Setting ``m_size`` to ``-1`` means that the module does not support + sub-interpreters, because it has global state. 
+ + Setting it to a non-negative value means that the module can be + re-initialized and specifies the additional amount of memory it requires + for its state. Non-negative ``m_size`` is required for multi-phase + initialization. See :PEP:`3121` for more details. @@ -198,7 +180,15 @@ Initializing C modules .. c:member:: PyModuleDef_Slot* m_slots An array of slot definitions for multi-phase initialization, terminated by - a *NULL* entry. + a ``{0, NULL}`` entry. + When using single-phase initialization, *m_slots* must be *NULL*. + + .. versionchanged:: 3.5 + + Prior to version 3.5, this member was always set to *NULL*, + and was defined as: + + .. c:member:: inquiry m_reload .. c:member:: traverseproc m_traverse @@ -215,20 +205,23 @@ Initializing C modules A function to call during deallocation of the module object, or *NULL* if not needed. +Single-phase initialization +........................... + The module initialization function may create and return the module object directly. This is referred to as "single-phase initialization", and uses one of the following two module creation functions: -.. c:function:: PyObject* PyModule_Create(PyModuleDef *module) +.. c:function:: PyObject* PyModule_Create(PyModuleDef *def) - Create a new module object, given the definition in *module*. This behaves + Create a new module object, given the definition in *def*. This behaves like :c:func:`PyModule_Create2` with *module_api_version* set to :const:`PYTHON_API_VERSION`. -.. c:function:: PyObject* PyModule_Create2(PyModuleDef *module, int module_api_version) +.. c:function:: PyObject* PyModule_Create2(PyModuleDef *def, int module_api_version) - Create a new module object, given the definition in *module*, assuming the + Create a new module object, given the definition in *def*, assuming the API version *module_api_version*. If that version does not match the version of the running interpreter, a :exc:`RuntimeWarning` is emitted. 
@@ -237,39 +230,179 @@ of the following two module creation functions: Most uses of this function should be using :c:func:`PyModule_Create` instead; only use this if you are sure you need it. +Before it is returned from the initialization function, the resulting module +object is typically populated using functions like :c:func:`PyModule_AddObject`. -Alternatively, the module initialization function may instead return a -:c:type:`PyModuleDef` instance with a non-empty ``m_slots`` array. This is -referred to as "multi-phase initialization", and ``PyModuleDef`` instance -should be initialized with the following function: +.. _multi-phase-initialization: -.. c:function:: PyObject* PyModuleDef_Init(PyModuleDef *module) +Multi-phase initialization +.......................... + +An alternate way to specify extensions is to request "multi-phase initialization". +Extension modules created this way behave more like Python modules: the +initialization is split between the *creation phase*, when the module object +is created, and the *execution phase*, when it is populated. +The distinction is similar to the :py:meth:`__new__` and :py:meth:`__init__` methods +of classes. + +Unlike modules created using single-phase initialization, these modules are not +singletons: if the *sys.modules* entry is removed and the module is re-imported, +a new module object is created, and the old module is subject to normal garbage +collection -- as with Python modules. +By default, multiple modules created from the same definition should be +independent: changes to one should not affect the others. +This means that all state should be specific to the module object (e.g. +using :c:func:`PyModule_GetState`), or its contents (such as the module's +:attr:`__dict__` or individual classes created with :c:func:`PyType_FromSpec`). + +All modules created using multi-phase initialization are expected to support +:ref:`sub-interpreters <sub-interpreter-support>`. 
Making sure multiple modules +are independent is typically enough to achieve this. + +To request multi-phase initialization, the initialization function +(PyInit_modulename) returns a :c:type:`PyModuleDef` instance with non-empty +:c:member:`~PyModuleDef.m_slots`. Before it is returned, the ``PyModuleDef`` +instance must be initialized with the following function: + +.. c:function:: PyObject* PyModuleDef_Init(PyModuleDef *def) Ensures a module definition is a properly initialized Python object that correctly reports its type and reference count. -.. XXX (ncoghlan): It's not clear if it makes sense to document PyModule_ExecDef - PyModule_FromDefAndSpec or PyModule_FromDefAndSpec2 here, as end user code - generally shouldn't be calling those. + Returns *def* cast to ``PyObject*``, or *NULL* if an error occurred. -The module initialization function (if using single phase initialization) or -a function called from a module execution slot (if using multiphase -initialization), can use the following functions to help initialize the module -state: + .. versionadded:: 3.5 + +The *m_slots* member of the module definition must point to an array of +``PyModuleDef_Slot`` structures: + +.. c:type:: PyModuleDef_Slot + + .. c:member:: int slot + + A slot ID, chosen from the available values explained below. + + .. c:member:: void* value + + Value of the slot, whose meaning depends on the slot ID. + + .. versionadded:: 3.5 + +The *m_slots* array must be terminated by a slot with id 0. + +The available slot types are: + +.. c:var:: Py_mod_create + + Specifies a function that is called to create the module object itself. + The *value* pointer of this slot must point to a function of the signature: + + .. c:function:: PyObject* create_module(PyObject *spec, PyModuleDef *def) + + The function receives a :py:class:`~importlib.machinery.ModuleSpec` + instance, as defined in :PEP:`451`, and the module definition. 
+ It should return a new module object, or set an error + and return *NULL*. + + This function should be kept minimal. In particular, it should not + call arbitrary Python code, as trying to import the same module again may + result in an infinite loop. + + Multiple ``Py_mod_create`` slots may not be specified in one module + definition. + + If ``Py_mod_create`` is not specified, the import machinery will create + a normal module object using :c:func:`PyModule_New`. The name is taken from + *spec*, not the definition, to allow extension modules to dynamically adjust + to their place in the module hierarchy and be imported under different + names through symlinks, all while sharing a single module definition. + + There is no requirement for the returned object to be an instance of + :c:type:`PyModule_Type`. Any type can be used, as long as it supports + setting and getting import-related attributes. + However, only ``PyModule_Type`` instances may be returned if the + ``PyModuleDef`` has non-*NULL* ``m_methods``, ``m_traverse``, ``m_clear``, + ``m_free``; non-zero ``m_size``; or slots other than ``Py_mod_create``. + +.. c:var:: Py_mod_exec + + Specifies a function that is called to *execute* the module. + This is equivalent to executing the code of a Python module: typically, + this function adds classes and constants to the module. + The signature of the function is: + + .. c:function:: int exec_module(PyObject* module) + + If multiple ``Py_mod_exec`` slots are specified, they are processed in the + order they appear in the *m_slots* array. + +See :PEP:`489` for more details on multi-phase initialization. + +Low-level module creation functions +................................... + +The following functions are called under the hood when using multi-phase +initialization. They can be used directly, for example when creating module +objects dynamically. Note that both ``PyModule_FromDefAndSpec`` and +``PyModule_ExecDef`` must be called to fully initialize a module. 
+ +.. c:function:: PyObject * PyModule_FromDefAndSpec(PyModuleDef *def, PyObject *spec) + + Create a new module object, given the definition in *def* and the + ModuleSpec *spec*. This behaves like :c:func:`PyModule_FromDefAndSpec2` + with *module_api_version* set to :const:`PYTHON_API_VERSION`. + + .. versionadded:: 3.5 + +.. c:function:: PyObject * PyModule_FromDefAndSpec2(PyModuleDef *def, PyObject *spec, int module_api_version) + + Create a new module object, given the definition in *def* and the + ModuleSpec *spec*, assuming the API version *module_api_version*. + If that version does not match the version of the running interpreter, + a :exc:`RuntimeWarning` is emitted. + + .. note:: + + Most uses of this function should be using :c:func:`PyModule_FromDefAndSpec` + instead; only use this if you are sure you need it. + + .. versionadded:: 3.5 + +.. c:function:: int PyModule_ExecDef(PyObject *module, PyModuleDef *def) + + Process any execution slots (:c:data:`Py_mod_exec`) given in *def*. + + .. versionadded:: 3.5 .. c:function:: int PyModule_SetDocString(PyObject *module, const char *docstring) - Set the docstring for *module* to *docstring*. Return ``-1`` on error, ``0`` - on success. + Set the docstring for *module* to *docstring*. + This function is called automatically when creating a module from + ``PyModuleDef``, using either ``PyModule_Create`` or + ``PyModule_FromDefAndSpec``. + + .. versionadded:: 3.5 .. c:function:: int PyModule_AddFunctions(PyObject *module, PyMethodDef *functions) - Add the functions from the ``NULL`` terminated *functions* array to *module*. + Add the functions from the *NULL* terminated *functions* array to *module*. Refer to the :c:type:`PyMethodDef` documentation for details on individual entries (due to the lack of a shared module namespace, module level "functions" implemented in C typically receive the module as their first parameter, making them similar to instance methods on Python classes). 
+ This function is called automatically when creating a module from + ``PyModuleDef``, using either ``PyModule_Create`` or + ``PyModule_FromDefAndSpec``. + .. versionadded:: 3.5 + +Support functions +................. + +The module initialization function (if using single phase initialization) or +a function called from a module execution slot (if using multi-phase +initialization), can use the following functions to help initialize the module +state: .. c:function:: int PyModule_AddObject(PyObject *module, const char *name, PyObject *value) @@ -288,7 +421,7 @@ state: Add a string constant to *module* as *name*. This convenience function can be used from the module's initialization function. The string *value* must be - null-terminated. Return ``-1`` on error, ``0`` on success. + *NULL*-terminated. Return ``-1`` on error, ``0`` on success. .. c:function:: int PyModule_AddIntMacro(PyObject *module, macro) @@ -302,3 +435,36 @@ state: .. c:function:: int PyModule_AddStringMacro(PyObject *module, macro) Add a string constant to *module*. + + +Module lookup +^^^^^^^^^^^^^ + +Single-phase initialization creates singleton modules that can be looked up +in the context of the current interpreter. This allows the module object to be +retrieved later with only a reference to the module definition. + +These functions will not work on modules created using multi-phase initialization, +since multiple such modules can be created from a single definition. + +.. c:function:: PyObject* PyState_FindModule(PyModuleDef *def) + + Returns the module object that was created from *def* for the current interpreter. + This method requires that the module object has been attached to the interpreter state with + :c:func:`PyState_AddModule` beforehand. In case the corresponding module object is not + found or has not been attached to the interpreter state yet, it returns *NULL*. + +.. 
c:function:: int PyState_AddModule(PyObject *module, PyModuleDef *def) + + Attaches the module object passed to the function to the interpreter state. This allows + the module object to be accessible via :c:func:`PyState_FindModule`. + + Only effective on modules created using single-phase initialization. + + .. versionadded:: 3.3 + +.. c:function:: int PyState_RemoveModule(PyModuleDef *def) + + Removes the module object created from *def* from the interpreter state. + + .. versionadded:: 3.3 diff --git a/Doc/extending/building.rst b/Doc/extending/building.rst index 06d300547d4..aafa3d89d0a 100644 --- a/Doc/extending/building.rst +++ b/Doc/extending/building.rst @@ -1,27 +1,58 @@ .. highlightlang:: c - .. _building: -******************************************** +***************************** +Building C and C++ Extensions +***************************** + +A C extension for CPython is a shared library (e.g. a ``.so`` file on Linux, +``.pyd`` on Windows), which exports an *initialization function*. + +To be importable, the shared library must be available on :envvar:`PYTHONPATH`, +and must be named after the module name, with an appropriate extension. +When using distutils, the correct filename is generated automatically. + +The initialization function has the signature: + +.. c:function:: PyObject* PyInit_modulename(void) + +It returns either a fully-initialized module, or a :c:type:`PyModuleDef` +instance. See :ref:`initializing-modules` for details. + +.. highlightlang:: python + +For modules with ASCII-only names, the function must be named +``PyInit_<modulename>``, with ``<modulename>`` replaced by the name of the +module. When using :ref:`multi-phase-initialization`, non-ASCII module names +are allowed. In this case, the initialization function name is +``PyInitU_<modulename>``, with ``<modulename>`` encoded using Python's +*punycode* encoding with hyphens replaced by underscores. 
In Python:: + + def initfunc_name(name): + try: + suffix = b'_' + name.encode('ascii') + except UnicodeEncodeError: + suffix = b'U_' + name.encode('punycode').replace(b'-', b'_') + return b'PyInit' + suffix + +It is possible to export multiple modules from a single shared library by +defining multiple initialization functions. However, importing them requires +using symbolic links or a custom importer, because by default only the +function corresponding to the filename is found. +See :PEP:`489#multiple-modules-in-one-library` for details. + + +.. highlightlang:: c + Building C and C++ Extensions with distutils -******************************************** +============================================ .. sectionauthor:: Martin v. Löwis - -Starting in Python 1.4, Python provides, on Unix, a special make file for -building make files for building dynamically-linked extensions and custom -interpreters. Starting with Python 2.0, this mechanism (known as related to -Makefile.pre.in, and Setup files) is no longer supported. Building custom -interpreters was rarely used, and extension modules can be built using -distutils. - -Building an extension module using distutils requires that distutils is -installed on the build machine, which is included in Python 2.x and available -separately for Python 1.5. Since distutils also supports creation of binary -packages, users don't necessarily need a compiler and distutils to install the -extension. +Extension modules can be built using distutils, which is included in Python. +Since distutils also supports creation of binary packages, users don't +necessarily need a compiler and distutils to install the extension. A distutils package contains a driver script, :file:`setup.py`. 
This is a plain Python file, which, in the most simple case, could look like this:: diff --git a/Doc/extending/extending.rst b/Doc/extending/extending.rst index a83fb6e5799..8cc41840d7c 100644 --- a/Doc/extending/extending.rst +++ b/Doc/extending/extending.rst @@ -413,6 +413,13 @@ A more substantial example module is included in the Python source distribution as :file:`Modules/xxmodule.c`. This file may be used as a template or simply read as an example. +.. note:: + + Unlike our ``spam`` example, ``xxmodule`` uses *multi-phase initialization* + (new in Python 3.5), where a PyModuleDef structure is returned from + ``PyInit_spam``, and creation of the module is left to the import machinery. + For details on multi-phase initialization, see :PEP:`489`. + .. _compilation: diff --git a/Doc/extending/windows.rst b/Doc/extending/windows.rst index 3fd5e576de0..f0c69b851e7 100644 --- a/Doc/extending/windows.rst +++ b/Doc/extending/windows.rst @@ -98,9 +98,8 @@ described here are distributed with the Python sources in the it. Copy your C sources into it. Note that the module source file name does not necessarily have to match the module name, but the name of the initialization function should match the module name --- you can only import a - module :mod:`spam` if its initialization function is called :c:func:`initspam`, - and it should call :c:func:`Py_InitModule` with the string ``"spam"`` as its - first argument (use the minimal :file:`example.c` in this directory as a guide). + module :mod:`spam` if its initialization function is called :c:func:`PyInit_spam`, + (see :ref:`building`, or use the minimal :file:`Modules/xxmodule.c` as a guide). By convention, it lives in a file called :file:`spam.c` or :file:`spammodule.c`. The output file should be called :file:`spam.pyd` (in Release mode) or :file:`spam_d.pyd` (in Debug mode). 
The extension :file:`.pyd` was chosen diff --git a/Doc/whatsnew/3.5.rst b/Doc/whatsnew/3.5.rst index b73c80df3d4..3239ce537a5 100644 --- a/Doc/whatsnew/3.5.rst +++ b/Doc/whatsnew/3.5.rst @@ -283,7 +283,7 @@ two step module loading mechanism introduced by :pep:`451` in Python 3.4. This change brings the import semantics of extension modules that opt-in to using the new mechanism much closer to those of Python source and bytecode -modules, including the ability to any valid identifier as a module name, +modules, including the ability to use any valid identifier as a module name, rather than being restricted to ASCII. .. seealso:: @@ -763,7 +763,7 @@ unicodedata ----------- * The :mod:`unicodedata` module now uses data from `Unicode 8.0.0 -`_. + `_. wsgiref diff --git a/Misc/NEWS b/Misc/NEWS index 75e717f8115..09d79a40cda 100644 --- a/Misc/NEWS +++ b/Misc/NEWS @@ -92,6 +92,9 @@ Tests Documentation ------------- +- Issue #24458: Update documentation to cover multi-phase initialization for + extension modules (PEP 489). Patch by Petr Viktorin. + - Issue #24351: Clarify what is meant by "identifier" in the context of string.Template instances.