.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made leaving Python 2's API intact
   impossible.  In fact, some changes such as :func:`int` and
   :func:`long` unification are more obvious on the C level.  This
   document describes the incompatibilities and how they can
   be worked around.


Conditional compilation
=======================

The easiest way to compile only some code for Python 3 is to check
if :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

   #if PY_MAJOR_VERSION >= 3
   #define IS_PY3K
   #endif

API functions that are not present can be aliased to their equivalents within
conditional blocks.


Changes to Object APIs
======================

Python 3 merged together some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------

Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the C
functions are called ``PyUnicode_*`` for both.  The old 8-bit string type
(``PyString_*`` functions in C) has become :func:`bytes`.  Python 2.6 and later
provide a compatibility header,
:file:`bytesobject.h`, mapping ``PyBytes`` names to ``PyString`` ones.  For best
compatibility with Python 3, :c:type:`PyUnicode` should be used for textual data and
:c:type:`PyBytes` for binary data.  It's also important to remember that
:c:type:`PyBytes` and :c:type:`PyUnicode` in Python 3 are not interchangeable like
:c:type:`PyString` and :c:type:`PyUnicode` are in Python 2.  The following example
shows best practices with regard to :c:type:`PyUnicode`, :c:type:`PyString`,
and :c:type:`PyBytes`. ::

   #include "Python.h"      /* must come before any standard headers */
   #include <stdlib.h>
   #include "bytesobject.h"

   /* text example: takes and returns unicode (text) objects */
   static PyObject *
   say_hello(PyObject *self, PyObject *args) {
       PyObject *name, *result;

       if (!PyArg_ParseTuple(args, "U:say_hello", &name))
           return NULL;

       result = PyUnicode_FromFormat("Hello, %S!", name);
       return result;
   }

   /* forward declaration only; this example frees the returned buffer
      with free() and treats NULL as an error */
   static char * do_encode(PyObject *);

   /* bytes example: returns binary data as a bytes object */
   static PyObject *
   encode_object(PyObject *self, PyObject *args) {
       char *encoded;
       PyObject *result, *myobj;

       if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
           return NULL;

       encoded = do_encode(myobj);
       if (encoded == NULL)
           return NULL;
       result = PyBytes_FromString(encoded);
       free(encoded);
       return result;
   }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`.  But it actually
corresponds to Python 2's :func:`long` type; the :func:`int` type
used in Python 2 was removed.  In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.

The best course of action here is using the ``PyInt_*`` functions aliased to
``PyLong_*`` found in :file:`intobject.h`.  Note that this header was removed
in Python 3.1, so extensions targeting 3.1 or later have to define the aliases
themselves.  The abstract ``PyNumber_*`` APIs
can also be used in some cases. ::

   #include "Python.h"
   #include "intobject.h"

   static PyObject *
   add_ints(PyObject *self, PyObject *args) {
       int one, two;

       if (!PyArg_ParseTuple(args, "ii:add_ints", &one, &two))
           return NULL;

       /* PyInt_FromLong is an alias for PyLong_FromLong on Python 3 */
       return PyInt_FromLong(one + two);
   }


Module initialization and state
===============================

Python 3 has a revamped extension module initialization system.  (See
:pep:`3121`.)  Instead of storing module state in globals, it should
be stored in an interpreter-specific structure.  Creating modules that
act correctly in both Python 2 and Python 3 is tricky.  The following
simple example demonstrates how. ::

   #include "Python.h"

   struct module_state {
       PyObject *error;
   };

   /* Python 3 keeps the state per module; Python 2 falls back to a global */
   #if PY_MAJOR_VERSION >= 3
   #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
   #else
   #define GETSTATE(m) (&_state)
   static struct module_state _state;
   #endif

   static PyObject *
   error_out(PyObject *m) {
       struct module_state *st = GETSTATE(m);
       PyErr_SetString(st->error, "something bad happened");
       return NULL;
   }

   static PyMethodDef myextension_methods[] = {
       {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
       {NULL, NULL}
   };

   #if PY_MAJOR_VERSION >= 3

   static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
       Py_VISIT(GETSTATE(m)->error);
       return 0;
   }

   static int myextension_clear(PyObject *m) {
       Py_CLEAR(GETSTATE(m)->error);
       return 0;
   }


   static struct PyModuleDef moduledef = {
           PyModuleDef_HEAD_INIT,
           "myextension",                 /* m_name */
           NULL,                          /* m_doc */
           sizeof(struct module_state),   /* m_size */
           myextension_methods,           /* m_methods */
           NULL,
           myextension_traverse,          /* m_traverse */
           myextension_clear,             /* m_clear */
           NULL                           /* m_free */
   };

   #define INITERROR return NULL

   PyObject *
   PyInit_myextension(void)

   #else
   #define INITERROR return

   void
   initmyextension(void)
   #endif
   {
       struct module_state *st;
   #if PY_MAJOR_VERSION >= 3
       PyObject *module = PyModule_Create(&moduledef);
   #else
       PyObject *module = Py_InitModule("myextension", myextension_methods);
   #endif

       if (module == NULL)
           INITERROR;
       st = GETSTATE(module);

       st->error = PyErr_NewException("myextension.Error", NULL, NULL);
       if (st->error == NULL) {
   #if PY_MAJOR_VERSION >= 3
           /* PyModule_Create() returns a new reference;
              Py_InitModule() returns a borrowed one */
           Py_DECREF(module);
   #endif
           INITERROR;
       }

   #if PY_MAJOR_VERSION >= 3
       return module;
   #endif
   }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`.  CObjects were useful,
but the :c:type:`CObject` API was problematic: it didn't permit distinguishing
between valid CObjects, which allowed mismatched CObjects to crash the
interpreter, and some of its APIs relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2.  If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`.  If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)


The following example header file :file:`capsulethunk.h` may
solve the problem for you.  Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`.  Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.


:file:`capsulethunk.h` simulates Capsules using CObjects.  However,
:c:type:`CObject` provides no place to store the capsule's "name".  As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules.  Specifically:

  * The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

  * The name parameter passed in to :c:func:`PyCapsule_IsValid` and
    :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
    of the name is performed.

  * :c:func:`PyCapsule_GetName` always returns NULL.

  * :c:func:`PyCapsule_SetName` always raises an exception and
    returns failure.  (Since there's no way to store a name
    in a CObject, noisy failure of :c:func:`PyCapsule_SetName`
    was deemed preferable to silent failure here.  If this is
    inconvenient, feel free to modify your local
    copy as you see fit.)


You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`.  We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h


Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://www.cython.org>`_.  It translates a Python-like language to C.  The
extension modules it creates are compatible with Python 3 and Python 2.