BufferedWriter() was buffering calls that are the exact same size as the buffer. it's a very common case to read/write in blocks of the exact buffer size.
it's pointless to copy a full buffer, it's costing extra memory copy and the full buffer will have to be written in the next call anyway.
Co-authored-by: rmorotti <romain.morotti@man.com>
lseek() always returns 0 for character pseudo-devices like
`/dev/urandom` (for other non-regular files, e.g. `/dev/stdin`, it
always returns -1, to which CPython reacts by raising appropriate
exceptions). They are thus technically seekable despite not having seek
semantics.
When calling read() on e.g. an instance of `io.BufferedReader` that
wraps such a file, `BufferedReader` reads ahead, filling its buffer,
creating a discrepancy between the number of bytes read and the internal
`tell()` always returning 0, which previously resulted in e.g.
`BufferedReader.tell()` or `BufferedReader.seek()` being able to return
positions < 0 even though these are supposed to be always >= 0.
Invariably keep the return value non-negative by returning
max(former_return_value, 0) instead, and add some corresponding tests.
The `@critical_section` directive instructs Argument Clinic to generate calls
to `Py_BEGIN_CRITICAL_SECTION()` and `Py_END_CRITICAL_SECTION()` around the
bound function. In `--disable-gil` builds, these calls will lock and unlock
the `self` object. They are no-ops in the default build.
This is used in one place (`_io._Buffered.close`) as a demonstration.
Subsequent PRs will use it more widely in the `_io.Buffered` bindings.
Move private _PyBytes functions to the internal C API
(pycore_bytesobject.h):
* _PyBytes_DecodeEscape()
* _PyBytes_FormatEx()
* _PyBytes_FromHex()
* _PyBytes_Join()
No longer export these functions.
Remove private pylifecycle.h functions: move them to the internal C
API ( pycore_atexit.h, pycore_pylifecycle.h and pycore_signal.h). No
longer export most of these functions.
Move _testcapi.test_atexit() to _testinternalcapi.
When preparing the _io extension module for isolation, many methods were
adapted to Argument Clinic. Some of these used the '*args: object'
signature, which is incorrect. These are now corrected to an exact
signature, and marked unused, since they are stub methods.
- Replace query with parameter in bufferediobase_unsupported()
- Replace query with parameter in iobase_unsupported()
- Hide delegate: Add method wrapper for _PyIOBase_check_seekable
- Hide delegate: Add method wraper for _PyIOBase_check_readable
- Hide delegate: Add method wraper for _PyIOBase_check_writable
- Replace query with parameter in _PyIOBase_check_seekable()
- Replace query with parameter in _PyIOBase_check_readable()
- Replace query with parameter in _PyIOBase_check_writable()
The implementation of __sizeof__() methods using _PyObject_SIZE() now
use an unsigned type (size_t) to compute the size, rather than a signed
type (Py_ssize_t).
Cast explicitly signed (Py_ssize_t) values to unsigned type
(Py_ssize_t).
We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
The core of the change is in:
* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
The following are not changed (yet):
* stop using _Py_IDENTIFIER() in the stdlib modules
* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API
* (maybe) intern the strings during runtime init
https://bugs.python.org/issue46541
* Move _PyObject_CallNoArgs() to pycore_call.h (internal C API).
* _ssl, _sqlite and _testcapi extensions now call the public
PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs().
* _lsprof extension is now built with Py_BUILD_CORE_MODULE macro
defined to get access to internal _PyObject_CallNoArgs().
Fix typo in the private _PyObject_CallNoArg() function name: rename
it to _PyObject_CallNoArgs() to be consistent with the public
function PyObject_CallNoArgs().
The truncate() method of io.BufferedReader() should raise
UnsupportedOperation when it is called on a read-only
io.BufferedReader() instance.
https://bugs.python.org/issue35950
Automerge-Triggered-By: @methane
The bulk of this patch was generated automatically with:
for name in \
PyObject_Vectorcall \
Py_TPFLAGS_HAVE_VECTORCALL \
PyObject_VectorcallMethod \
PyVectorcall_Function \
PyObject_CallOneArg \
PyObject_CallMethodNoArgs \
PyObject_CallMethodOneArg \
;
do
echo $name
git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
done
old=_PyObject_FastCallDict
new=PyObject_VectorcallDict
git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"
and then cleaned up:
- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments
When called on a closed object, readinto() segfaults on account
of a write to a freed buffer:
==220553== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==220553== Access not within mapped region at address 0x2A
==220553== at 0x48408A0: memmove (vg_replace_strmem.c:1272)
==220553== by 0x58DB0C: _buffered_readinto_generic (bufferedio.c:972)
==220553== by 0x58DCBA: _io__Buffered_readinto_impl (bufferedio.c:1053)
==220553== by 0x58DCBA: _io__Buffered_readinto (bufferedio.c.h:253)
Reproducer:
reader = open ("/dev/zero", "rb")
_void = reader.read (42)
reader.close ()
reader.readinto (bytearray (42)) ### BANG!
The problem exists since 2012 when commit dc469454ec added code
to free the read buffer on close().
Signed-off-by: Philipp Gesang <philipp.gesang@intra2net.com>