gh-98712: Clarify "readonly bytes-like object" semantics in C arg-parsing docs (#98710)

Petr Viktorin 2022-12-23 16:00:21 +01:00 committed by GitHub
parent 88d565f32a
commit 49f6ff719c
GPG Key ID: 4AEE18F83AFDEB23
1 changed file with 35 additions and 19 deletions


@@ -34,24 +34,39 @@ These formats allow accessing an object as a contiguous chunk of memory.
 You don't have to provide raw storage for the returned unicode or bytes
 area.
-In general, when a format sets a pointer to a buffer, the buffer is
-managed by the corresponding Python object, and the buffer shares
-the lifetime of this object. You won't have to release any memory yourself.
-The only exceptions are ``es``, ``es#``, ``et`` and ``et#``.
-However, when a :c:type:`Py_buffer` structure gets filled, the underlying
-buffer is locked so that the caller can subsequently use the buffer even
-inside a :c:type:`Py_BEGIN_ALLOW_THREADS` block without the risk of mutable data
-being resized or destroyed. As a result, **you have to call**
-:c:func:`PyBuffer_Release` after you have finished processing the data (or
-in any early abort case).
 Unless otherwise stated, buffers are not NUL-terminated.
-Some formats require a read-only :term:`bytes-like object`, and set a
-pointer instead of a buffer structure. They work by checking that
-the object's :c:member:`PyBufferProcs.bf_releasebuffer` field is ``NULL``,
-which disallows mutable objects such as :class:`bytearray`.
+There are three ways strings and buffers can be converted to C:
+* Formats such as ``y*`` and ``s*`` fill a :c:type:`Py_buffer` structure.
+  This locks the underlying buffer so that the caller can subsequently use
+  the buffer even inside a :c:type:`Py_BEGIN_ALLOW_THREADS`
+  block without the risk of mutable data being resized or destroyed.
+  As a result, **you have to call** :c:func:`PyBuffer_Release` after you have
+  finished processing the data (or in any early abort case).
+* The ``es``, ``es#``, ``et`` and ``et#`` formats allocate the result buffer.
+  **You have to call** :c:func:`PyMem_Free` after you have finished
+  processing the data (or in any early abort case).
+* .. _c-arg-borrowed-buffer:
+  Other formats take a :class:`str` or a read-only :term:`bytes-like object`,
+  such as :class:`bytes`, and provide a ``const char *`` pointer to
+  its buffer.
+  In this case the buffer is "borrowed": it is managed by the corresponding
+  Python object, and shares the lifetime of this object.
+  You won't have to release any memory yourself.
+  To ensure that the underlying buffer may be safely borrowed, the object's
+  :c:member:`PyBufferProcs.bf_releasebuffer` field must be ``NULL``.
+  This disallows common mutable objects such as :class:`bytearray`,
+  but also some read-only objects such as :class:`memoryview` of
+  :class:`bytes`.
+  Besides this ``bf_releasebuffer`` requirement, there is no check to verify
+  whether the input object is immutable (e.g. whether it would honor a request
+  for a writable buffer, or whether another thread can mutate the data).
 .. note::
@@ -89,7 +104,7 @@ which disallows mutable objects such as :class:`bytearray`.
    Unicode objects are converted to C strings using ``'utf-8'`` encoding.
 ``s#`` (:class:`str`, read-only :term:`bytes-like object`) [const char \*, :c:type:`Py_ssize_t`]
-   Like ``s*``, except that it doesn't accept mutable objects.
+   Like ``s*``, except that it provides a :ref:`borrowed buffer <c-arg-borrowed-buffer>`.
    The result is stored into two C variables,
    the first one a pointer to a C string, the second one its length.
    The string may contain embedded null bytes. Unicode objects are converted
@@ -108,8 +123,9 @@ which disallows mutable objects such as :class:`bytearray`.
    pointer is set to ``NULL``.
 ``y`` (read-only :term:`bytes-like object`) [const char \*]
-   This format converts a bytes-like object to a C pointer to a character
-   string; it does not accept Unicode objects. The bytes buffer must not
+   This format converts a bytes-like object to a C pointer to a
+   :ref:`borrowed <c-arg-borrowed-buffer>` character string;
+   it does not accept Unicode objects. The bytes buffer must not
    contain embedded null bytes; if it does, a :exc:`ValueError`
    exception is raised.