208 Commits

Author SHA1 Message Date
Victor Stinner 65dd745f1a
gh-99300: Use Py_NewRef() in Modules/ directory (#99473)
Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and
Py_XNewRef() in test C files of the Modules/ directory.
2022-11-14 16:21:40 +01:00
Eric Snow 73679b13ca
gh-90110: Update the C-analyzer Tool (gh-99307) 2022-11-10 09:03:57 -07:00
Benjamin Peterson 0f156c1c56
Remove unused arrange_output_buffer function from zlibmodule.c. (GH-98358) 2022-10-17 09:38:34 -07:00
Ruben Vorderman eae7dad402
gh-95534: Improve gzip reading speed by 10% (#97664)
Change summary:
+ There is now a `gzip.READ_BUFFER_SIZE` constant that is 128 KB. Other programs that read in 128 KB chunks include pigz and cat, so this seems to be best practice among good programs. It is also faster than 8 KB chunks.
+ A `zlib._ZlibDecompressor` was added. This is the `_bz2.BZ2Decompressor` ported to zlib. Since the `zlib.Decompress` object is better for in-memory decompression, the `_ZlibDecompressor` is hidden. It only makes sense in file decompression, which is already implemented in the gzip library, so there is no need to bother users with it.
+ The ZlibDecompressor uses the older CPython arrange_output_buffer functions, as those are faster and more appropriate for the use case.
+ GzipFile.read has been optimized. There is no longer an `unconsumed_tail` member to write back to the padded file. This is instead handled by the ZlibDecompressor itself, which has an internal buffer. `_add_read_data` has been inlined, as it was just two calls.

EDIT: While I am adding improvements anyway, I figured I could add another one-liner optimization now to the python -m gzip application. That read chunks in io.DEFAULT_BUFFER_SIZE previously, but has been updated now to use READ_BUFFER_SIZE chunks.
2022-10-16 19:10:58 -07:00
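
A minimal sketch of chunked gzip reading as described in the change summary above. It assumes `gzip.READ_BUFFER_SIZE` is only present on interpreters that include this commit, so it falls back to the 128 KB figure quoted in the message; `read_gzip_in_chunks` and `CHUNK_SIZE` are hypothetical names used for illustration only.

```python
import gzip
import io

# gzip.READ_BUFFER_SIZE exists only on interpreters that include this change.
# The fallback mirrors the 128 KB value quoted in the commit message.
CHUNK_SIZE = getattr(gzip, "READ_BUFFER_SIZE", 128 * 1024)


def read_gzip_in_chunks(raw: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Decompress gzip data by reading fixed-size chunks from a GzipFile."""
    parts = []
    with gzip.GzipFile(fileobj=io.BytesIO(raw)) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty read signals end of stream
                break
            parts.append(chunk)
    return b"".join(parts)


payload = b"spam and eggs " * 50_000
compressed = gzip.compress(payload)
restored = read_gzip_in_chunks(compressed)
```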
Gregory P. Smith 9d1c4d69db
bpo-38256: Fix binascii.crc32() when inputs are 4+GiB (GH-32000)
When compiled with `USE_ZLIB_CRC32` defined (`configure` sets this on POSIX systems), `binascii.crc32(...)` failed to compute the correct value when the input data was >= 4GiB, because the zlib crc32 API is limited to a 32-bit length.

This lines it up with the `zlib.crc32(...)` implementation that doesn't have that flaw.

**Performance:** This also adopts the same GIL releasing for larger inputs logic that `zlib.crc32` has, and causes the Windows build to always use zlib's crc32 instead of our slow C code as zlib is a required build dependency on Windows.
2022-03-20 12:28:15 -07:00
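
The idea behind working around a 32-bit length limit can be sketched in Python: a CRC-32 can be computed piecewise by passing the running value back in as the second argument. This is an illustration of the principle only, not the C implementation from the commit; `crc32_chunked` and its chunk size are hypothetical.

```python
import binascii
import zlib


def crc32_chunked(data: bytes, chunk_size: int = 64 * 1024) -> int:
    """Compute a CRC-32 by feeding the input in bounded-size pieces."""
    crc = 0
    view = memoryview(data)  # slicing a memoryview avoids copying each chunk
    for start in range(0, len(view), chunk_size):
        # The second argument is the running checksum of the earlier chunks.
        crc = zlib.crc32(view[start:start + chunk_size], crc)
    return crc


data = bytes(range(256)) * 4096  # 1 MiB of sample input
one_shot = zlib.crc32(data)
piecewise = crc32_chunked(data)
```

Both module-level functions are expected to agree on the same input, so `binascii.crc32(data)` can be compared against either value.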
Ma Lin b3f2d4c8ba
bpo-47040: improve document of checksum functions (gh-31955)
Clarifies a versionchanged note on crc32 & adler32 docs that the workaround is only needed for Python 2 and earlier.
Also cleans up an unnecessary intermediate variable in the implementation.

Authored-By: Ma Lin / animalize
Co-authored-by: Gregory P. Smith <greg@krypto.org>
2022-03-19 14:42:04 -07:00
Ma Lin 7edb6270a7
bpo-41735: Fix thread lock in zlib.Decompress.flush() may go wrong (GH-29587)
* Fix thread lock in zlib.Decompress.flush() may go wrong

Getting `.unconsumed_tail` before acquiring the thread lock may mix up decompress state.
2021-11-26 16:18:17 -08:00
Christian Clauss dd02a696e5
Fix typos in the Modules directory (GH-28761) 2021-10-07 01:34:42 -07:00
Mohamad Mansour 8f943ca257
[codemod] Fix non-matching bracket pairs (GH-28473)
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2021-09-22 01:09:00 +02:00
Ruben Vorderman ea23e7820f
bpo-43613: Faster implementation of gzip.compress and gzip.decompress (GH-27941)
Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2021-09-02 17:02:59 +02:00
Ma Lin a9a69bb3ea
bpo-41486: zlib uses an UINT32_MAX sliding window for the output buffer (GH-26143)
* zlib uses an UINT32_MAX sliding window for the output buffer

These functions have an initial output buffer size parameter:
- zlib.decompress(data, /, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)
- zlib.Decompress.flush([length])

If the initial size > UINT32_MAX, use a UINT32_MAX sliding window instead of clamping to UINT32_MAX.
This also speeds up the case where the initial size equals the actual size.

This fixes a memory consumption and copying performance regression in earlier 3.10 beta releases if someone used an output buffer larger than 4GiB with zlib.decompress.

Reviewed-by: Gregory P. Smith
2021-07-04 18:10:44 -07:00
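
A short sketch of the two entry points named above, using small sizes rather than anything near UINT32_MAX. It assumes only the documented behavior that `bufsize` and `length` set the initial output buffer size and that the buffer grows as needed; the variable names are illustrative.

```python
import zlib

original = b"x" * 1_000_000
compressed = zlib.compress(original)

# bufsize is only the initial output buffer size; the result is the same
# whether the first guess is far too small or exactly right.
from_small_buffer = zlib.decompress(compressed, wbits=zlib.MAX_WBITS, bufsize=64)
from_exact_buffer = zlib.decompress(
    compressed, wbits=zlib.MAX_WBITS, bufsize=len(original)
)

# Decompress.flush([length]) takes the same kind of initial-size hint.
d = zlib.decompressobj()
head = d.decompress(compressed, 1024)  # at most 1024 bytes of output
tail = d.flush(len(original))          # processes the remaining pending input
```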
Ma Lin 251ffa9d2b
bpo-41486: Fix initial buffer size can't > UINT32_MAX in zlib module (GH-25738)
* Fix initial buffer size can't > UINT32_MAX in zlib module

After commit f9bedb630e, in 64-bit build,
if the initial buffer size > UINT32_MAX, ValueError will be raised.

These two functions are affected:
1. zlib.decompress(data, /, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)
2. zlib.Decompress.flush([length])

This commit re-allows the size > UINT32_MAX.

* adds curly braces per PEP 7.

* Renames `Buffer_*` to `OutputBuffer_*` for clarity
2021-04-30 16:32:49 -07:00
Erlend Egeberg Aasland 9746cda705
bpo-43916: Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to selected types (GH-25748)
Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to the following types:

* _dbm.dbm
* _gdbm.gdbm
* _multibytecodec.MultibyteCodec
* _sre.SRE_Scanner
* _thread._localdummy
* _thread.lock
* _winapi.Overlapped
* array.arrayiterator
* functools.KeyWrapper
* functools._lru_list_elem
* pyexpat.xmlparser
* re.Match
* re.Pattern
* unicodedata.UCD
* zlib.Compress
* zlib.Decompress
2021-04-30 16:04:57 +02:00
Ma Lin f9bedb630e
bpo-41486: Faster bz2/lzma/zlib via new output buffering (GH-21740)
Faster bz2/lzma/zlib via new output buffering.
Also adds .readall() function to _compression.DecompressReader class
to take best advantage of this in the consume-all-output at once scenario.

Often a 5-20% speedup in common scenarios due to less data copying.

Contributed by Ma Lin.
2021-04-27 23:58:54 -07:00
Ma Lin 93f411838a
Fix thread locks in zlib module may go wrong in rare case. (#22126)
Setting `next_in` before acquiring the thread lock may mix up compress/decompress state in other threads.
2021-04-27 10:37:11 +02:00
Victor Stinner 32bd68c839
bpo-42519: Replace PyObject_MALLOC() with PyObject_Malloc() (GH-23587)
No longer use deprecated aliases to functions:

* Replace PyObject_MALLOC() with PyObject_Malloc()
* Replace PyObject_REALLOC() with PyObject_Realloc()
* Replace PyObject_FREE() with PyObject_Free()
* Replace PyObject_Del() with PyObject_Free()
* Replace PyObject_DEL() with PyObject_Free()
2020-12-01 10:37:39 +01:00
Mohamed Koubaa 1aaa21ff81
bpo-1635741 port zlib module to multi-phase init (GH-21995)
Port the zlib extension module to multi-phase initialization (PEP 489).
2020-09-07 10:27:55 +02:00
Serhiy Storchaka 578c3955e0
bpo-37999: No longer use __int__ in implicit integer conversions. (GH-15636)
Only __index__ should be used to make integer conversions lossless.
2020-05-26 18:43:38 +03:00
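
The distinction between the two protocols can be shown with `operator.index()`, which performs the same lossless conversion and ignores `__int__`. The classes `IndexOnly` and `IntOnly` and the helper `try_index` are made up for this sketch.

```python
import operator


class IndexOnly:
    """Declares itself a true integer via __index__."""

    def __index__(self) -> int:
        return 7


class IntOnly:
    """Only offers a possibly lossy conversion via __int__."""

    def __int__(self) -> int:
        return 7


def try_index(obj):
    """Return operator.index(obj), or None if the object is rejected."""
    try:
        return operator.index(obj)
    except TypeError:
        return None


accepted = try_index(IndexOnly())
rejected = try_index(IntOnly())
explicit = int(IntOnly())  # explicit int() still honors __int__
```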
Victor Stinner 4a21e57fe5
bpo-40268: Remove unused structmember.h includes (GH-19530)
If only offsetof() is needed: include stddef.h instead.

When structmember.h is used, add a comment explaining that
PyMemberDef is used.
2020-04-15 02:35:41 +02:00
Victor Stinner 62183b8d6d
bpo-40268: Remove explicit pythread.h includes (#19529)
Remove explicit pythread.h includes: it is always included
by Python.h.
2020-04-15 02:04:42 +02:00
Hai Shi f707d94af6
bpo-39968: Convert extension modules' macros of get_module_state() to inline functions (GH-19017) 2020-03-16 14:15:01 +01:00
Dino Viehland a1ffad0719 bpo-38074: Make zlib extension module PEP-384 compatible (GH-15792)
Updated zlibmodule.c to be PEP 384 compliant.
2019-09-10 03:27:03 -07:00
Jeroen Demeyer 530f506ac9 bpo-36974: tp_print -> tp_vectorcall_offset and tp_reserved -> tp_as_async (GH-13464)
Automatically replace
tp_print -> tp_vectorcall_offset
tp_compare -> tp_as_async
tp_reserved -> tp_as_async
2019-05-30 19:13:39 -07:00
Serhiy Storchaka 6a44f6eef3
bpo-36048: Use __index__() instead of __int__() for implicit conversion if available. (GH-11952)
Deprecate using the __int__() method in implicit conversions of Python
numbers to C integers.
2019-02-25 17:57:58 +02:00
Alexey Izbyshev 3d4fabb2a4 bpo-35090: Fix potential division by zero in allocator wrappers (GH-10174)
* Fix potential division by zero in BZ2_Malloc()
* Avoid division by zero in PyLzma_Malloc()
* Avoid division by zero and integer overflow in PyZlib_Malloc()

Reported by Svace static analyzer.
2018-10-28 17:45:50 +01:00
Zackery Spytz d2cbfffc84 bpo-25007: Add copy protocol support to zlib compressors and decompressors (GH-7940) 2018-06-27 21:04:51 +03:00
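
A sketch of what copy protocol support allows, assuming an interpreter that includes this change: a compressor that has already consumed a shared prefix can be duplicated with `copy.copy()` and each copy finished independently. The sample data is arbitrary.

```python
import copy
import zlib

prefix_text = b"shared header\n"

compressor = zlib.compressobj()
prefix = compressor.compress(prefix_text)

# Duplicate the compressor state after the shared prefix has been consumed.
branch = copy.copy(compressor)

stream_a = prefix + compressor.compress(b"body A") + compressor.flush()
stream_b = prefix + branch.compress(b"body B") + branch.flush()

text_a = zlib.decompress(stream_a)
text_b = zlib.decompress(stream_b)
```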
Xiang Zhang bc3f2289b9
bpo-32969: Expose some missing constants in zlib and fix the doc (GH-5988) 2018-03-07 13:05:37 +08:00
Tal Einat 4f57409a2f bpo-31926: fix missing *_METHODDEF statements by argument clinic (#4230)
When a single .c file contains several functions and/or methods with
the same name, a safety _METHODDEF #define statement is generated
only for one of them.

This fixes the bug by using the full name of the function to avoid
duplicates rather than just the name.
2017-11-03 11:09:00 +02:00
Antoine Pitrou a6a4dc816d bpo-31370: Remove support for threads-less builds (#3385)
* Remove Setup.config
* Always define WITH_THREAD for compatibility.
2017-09-07 18:56:24 +02:00
Segev Finer 679b566622 bpo-9566: Fix some Windows x64 compiler warnings (#2492)
* bpo-9566: Silence liblzma warnings

* bpo-9566: Silence tcl warnings

* bpo-9566: Silence tk warnings

* bpo-9566: Silence tix warnings

* bpo-9566: Fix some library warnings

* bpo-9566: Fix msvcrtmodule.c warnings

* bpo-9566: Silence _bz2 warnings

* bpo-9566: Fixed some _ssl warnings

* bpo-9566: Fix _msi warnings

* bpo-9566: Silence _ctypes warnings

* Revert "bpo-9566: Fixed some _ssl warnings"

This reverts commit a639001c94.

* bpo-9566: Also consider NULL as a possible error in HANDLE_return_converter

* bpo-9566: whitespace fixes
2017-07-26 15:17:57 -07:00
Christian Heimes f051e43b22 Issue #28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy(). 2016-09-13 20:22:02 +02:00
Serhiy Storchaka 15f3228b7c Issue #16764: Support keyword arguments to zlib.decompress(). Patch by
Xiang Zhang.
2016-08-15 10:06:16 +03:00
Martin Panter 525a949251 Issue #27130: Merge zlib 64-bit fixes from 3.5 2016-07-23 03:39:49 +00:00
Martin Panter 84544c1020 Issue #27130: Fix handling of buffers exceeding UINT_MAX in “zlib” module
Patch by Xiang Zhang.
2016-07-23 03:02:07 +00:00
Serhiy Storchaka 2954f83999 - Issue #27332: Fixed the type of the first argument of module-level functions
generated by Argument Clinic.  Patch by Petr Viktorin.
2016-07-07 18:20:03 +03:00
Serhiy Storchaka 1a2b24f02d Issue #27332: Fixed the type of the first argument of module-level functions
generated by Argument Clinic.  Patch by Petr Viktorin.
2016-07-07 17:35:15 +03:00
Serhiy Storchaka 95657cdd40 Issue #26243: Only the level argument to zlib.compress() is keyword argument
now.  The first argument is positional-only.
2016-06-25 22:43:05 +03:00
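
On interpreters that include this change, the resulting call shapes look roughly like this: `level` may be passed by keyword, while the data argument is rejected as a keyword. The helper `accepts_data_keyword` is hypothetical.

```python
import zlib

data = b"a" * 10_000

fast = zlib.compress(data, level=1)  # level given by keyword
best = zlib.compress(data, 9)        # level given positionally


def accepts_data_keyword() -> bool:
    """Check whether the first argument can be passed as data=..."""
    try:
        zlib.compress(data=data)
    except TypeError:
        return False
    return True


data_is_keyword = accepts_data_keyword()
```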
Martin Panter 1ab2f14281 Issue #27164: Merge raw Deflate zdict support from 3.5 2016-06-05 12:07:48 +00:00
Martin Panter 3f0ee83f14 Issue #27164: Allow decompressing raw Deflate streams with predefined zdict
Based on patch by Xiang Zhang.
2016-06-05 10:48:34 +00:00
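
A small round trip showing the combination this commit enables: a raw Deflate stream (negative `wbits`) together with a predefined dictionary. The dictionary and message contents are arbitrary sample data.

```python
import zlib

zdict = b"the quick brown fox jumps over the lazy dog"
message = b"the lazy dog naps while the quick brown fox jumps"

# Negative wbits selects a raw Deflate stream with no header or trailer.
compressor = zlib.compressobj(
    level=6, method=zlib.DEFLATED, wbits=-zlib.MAX_WBITS, zdict=zdict
)
raw_stream = compressor.compress(message) + compressor.flush()

# The decompressor must be given the same dictionary up front, because a raw
# stream carries no header that could ask for it.
decompressor = zlib.decompressobj(wbits=-zlib.MAX_WBITS, zdict=zdict)
restored = decompressor.decompress(raw_stream) + decompressor.flush()
```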
Martin Panter f4affb71bc Issue #5784: Merge zlib from 3.5 2016-05-27 08:00:24 +00:00
Martin Panter 0fdf41d847 Issue #5784: Expand documentation and tests for zlib wbits parameter
Based on documentation by AM Kuchling.
2016-05-27 07:32:11 +00:00
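
For orientation, the main `wbits` variants covered by the zlib documentation can be exercised as follows. This is a sketch using the documented conventions (positive for a zlib container, negative for raw Deflate, `+16` for gzip, `+32` for automatic header detection); the sample data is arbitrary.

```python
import gzip
import zlib

data = b"wbits selects the container format " * 100

zlib_stream = zlib.compress(data)  # zlib header and trailer
gzip_stream = gzip.compress(data)  # gzip header and trailer

raw_compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw_stream = raw_compressor.compress(data) + raw_compressor.flush()

from_zlib = zlib.decompress(zlib_stream, wbits=zlib.MAX_WBITS)
from_raw = zlib.decompress(raw_stream, wbits=-zlib.MAX_WBITS)
from_gzip = zlib.decompress(gzip_stream, wbits=zlib.MAX_WBITS | 16)

# Adding 32 asks zlib to detect a zlib or gzip header automatically.
auto_zlib = zlib.decompress(zlib_stream, wbits=zlib.MAX_WBITS | 32)
auto_gzip = zlib.decompress(gzip_stream, wbits=zlib.MAX_WBITS | 32)
```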
Serhiy Storchaka f01e408c16 Issue #26200: Added Py_SETREF and replaced Py_XSETREF with Py_SETREF
in places where Py_DECREF was used.
2016-04-10 18:12:01 +03:00
Serhiy Storchaka 57a01d3a0e Issue #26200: Added Py_SETREF and replaced Py_XSETREF with Py_SETREF
in places where Py_DECREF was used.
2016-04-10 18:05:40 +03:00
Serhiy Storchaka ec39756960 Issue #22570: Renamed Py_SETREF to Py_XSETREF. 2016-04-06 09:50:03 +03:00
Serhiy Storchaka 48842714b9 Issue #22570: Renamed Py_SETREF to Py_XSETREF. 2016-04-06 09:45:48 +03:00
Martin Panter b0cb42dfdb Issue #26243: Forgot to update zlib doc strings in Argument Clinic 2016-02-10 10:45:54 +00:00
Martin Panter 1fe0d13d12 Issue #26243: zlib.compress() keyword argument support by Aviv Palivoda 2016-02-10 10:06:36 +00:00
Martin Panter 8254f793c0 Issue #26244: Merge zlib documentation from 3.5 2016-02-03 07:52:06 +00:00
Martin Panter 567d513b9b Issue #26244: Clarify default zlib compression level in documentation
Based on patch by Aviv Palivoda.
2016-02-03 07:06:33 +00:00
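
As a quick check of the default being documented, omitting the level should behave the same as passing `zlib.Z_DEFAULT_COMPRESSION` explicitly. This sketch only compares the two calls and makes no claim about which concrete level the underlying library picks.

```python
import zlib

data = b"default compression level " * 1_000

implicit = zlib.compress(data)                              # level omitted
explicit = zlib.compress(data, zlib.Z_DEFAULT_COMPRESSION)  # level spelled out

default_constant = zlib.Z_DEFAULT_COMPRESSION
```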
Serhiy Storchaka 726fc139a5 Issue #20440: More use of Py_SETREF.
This patch is manually crafted and contains changes that couldn't be handled
automatically.
2015-12-27 15:44:33 +02:00