cpython

Commit Graph

Author	SHA1	Message	Date
Victor Stinner	3d4226a832	bpo-34523: Support surrogatepass in locale codecs (GH-8995) Add support for the "surrogatepass" error handler in PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault() for the UTF-8 encoding. Changes: * _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the surrogatepass error handler (_Py_ERROR_SURROGATEPASS). * _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use the _Py_error_handler enum instead of "int surrogateescape" to pass the error handler. These functions now return -3 if the error handler is unknown. * Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() in test_codecs. * Rename get_error_handler() to _Py_GetErrorHandler() and expose it as a private function. * _freeze_importlib doesn't need config.filesystem_errors="strict" workaround anymore.	2018-08-29 22:21:32 +02:00
Tal Einat	c929df3b96	bpo-20180: complete AC conversion of Objects/stringlib/transmogrify.h (GH-8039) * converted bytes methods: expandtabs, ljust, rjust, center, zfill * updated char_convertor to properly set the C default value	2018-07-06 13:17:38 +03:00
Siddhesh Poyarekar	55edd0c185	bpo-33012: Fix invalid function cast warnings with gcc 8 for METH_NOARGS. (GH-6030) METH_NOARGS functions need only a single argument but they are cast into a PyCFunction, which takes two arguments. This triggers an invalid function cast warning in gcc8 due to the argument mismatch. Fix this by adding a dummy unused argument.	2018-04-29 21:59:33 +03:00
INADA Naoki	a49ac99029	bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342)	2018-01-27 14:06:21 +09:00
Barry Warsaw	b2e5794870	bpo-31338 (#3374) * Add Py_UNREACHABLE() as an alias to abort(). * Use Py_UNREACHABLE() instead of assert(0) * Convert more unreachable code to use Py_UNREACHABLE() * Document Py_UNREACHABLE() and a few other macros.	2017-09-14 18:13:16 -07:00
Stefan Krah	f432a3234f	bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157)	2017-08-21 13:09:59 +02:00
Serhiy Storchaka	5075416b8f	bpo-30978: str.format_map() now passes key lookup exceptions through. (#2790) Previously any exception was replaced with a KeyError exception.	2017-08-03 11:45:23 +03:00
Serhiy Storchaka	0a58f72762	bpo-24821: Fixed the slowing down to 25 times in the searching of some unlucky Unicode characters. (#505)	2017-03-30 09:11:10 +03:00
Serhiy Storchaka	d1302c0154	Issue #28999: Use Py_RETURN_NONE, Py_RETURN_TRUE and Py_RETURN_FALSE wherever possible but Coccinelle couldn't find opportunity.	2017-01-23 10:23:58 +02:00
Xiang Zhang	7a4da324dc	Issue #29145: Merge 3.6.	2017-01-10 10:56:38 +08:00
Serhiy Storchaka	998c9cdd42	Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc. Patch by Xiang Zhang.	2016-10-30 18:25:27 +02:00
Christian Heimes	f051e43b22	Issue #28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().	2016-09-13 20:22:02 +02:00
Benjamin Peterson	621b430a14	remove all usage of Py_LOCAL	2016-09-09 13:54:34 -07:00
Victor Stinner	1a05d6c04d	PEP 7 style for if/else in C Add also a newline for readability in normalize_encoding().	2016-09-02 12:12:23 +02:00
Raymond Hettinger	15f44ab043	Issue #27895: Spelling fixes (Contributed by Ville Skyttä).	2016-08-30 10:47:49 -07:00
Serhiy Storchaka	e09132f2c7	Backed out changeset b0087e17cd5e (issue #26765) For unknown reasons it perhaps caused a crash on 32-bit Windows (issue #).	2016-07-03 13:57:48 +03:00
Serhiy Storchaka	355048970b	Issue #26765: Moved wrappers for bytes and bytearray methods to common header file.	2016-07-01 17:57:30 +03:00
Serhiy Storchaka	bcde10aa7e	Issue #26765: Ensure that bytes- and unicode-specific stringlib files are used with correct type.	2016-05-16 09:42:29 +03:00
Serhiy Storchaka	fb81d3cbe7	Issue #26765: Moved common code for the replace() method of bytes and bytearray to a template file.	2016-05-05 09:26:07 +03:00
Serhiy Storchaka	dd40fc3e57	Issue #26765: Moved common code and docstrings for bytes and bytearray methods to bytes_methods.c.	2016-05-04 22:23:26 +03:00
Serhiy Storchaka	b6a9c9761c	Issue #26778: Fixed "a/an/and" typos in code comment, documentation and error messages.	2016-04-17 09:39:28 +03:00
Serhiy Storchaka	6a7b3a77b4	Issue #26778: Fixed "a/an/and" typos in code comment and documentation.	2016-04-17 08:32:47 +03:00
Serhiy Storchaka	21a663ea28	Issue #26057: Got rid of nonneeded use of PyUnicode_FromObject().	2016-04-13 15:37:23 +03:00
Serhiy Storchaka	413fdcea21	Issue #24821: Refactor STRINGLIB(fastsearch_memchr_1char) and split it on STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independedly without special preconditions.	2015-11-14 15:42:17 +02:00
Victor Stinner	6bd525b656	Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual character. Cleanup unicode_encode_ucs1(): * Rename repunicode to rep * Clear rep object on error * Factorize code between bytes and unicode path	2015-10-09 13:10:05 +02:00
Victor Stinner	ce179bf6ba	Add _PyBytesWriter_WriteBytes() to factorize the code	2015-10-09 12:57:22 +02:00
Victor Stinner	ad7715891e	_PyBytesWriter: simplify code to avoid "prealloc" parameters Substract preallocate bytes from min_size before calling _PyBytesWriter_Prepare().	2015-10-09 12:38:53 +02:00
Victor Stinner	e7bf86cd7d	Optimize backslashreplace error handler Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Optimize also backslashreplace error handler for ASCII and Latin1 encoders. Use the new _PyBytesWriter API to optimize these error handlers for the encoders. It avoids to create an exception and call the slow implementation of the error handler.	2015-10-09 01:39:28 +02:00
Victor Stinner	fdfbf78114	Issue #25318: Add _PyBytesWriter API Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation. Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers. unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.	2015-10-09 00:33:49 +02:00
Victor Stinner	01ada3996b	Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. Patch co-written with Serhiy Storchaka.	2015-10-01 21:54:51 +02:00
Eric V. Smith	ab2aa6dc91	Fixed an incorrect comment.	2015-08-26 14:10:32 -04:00
Serhiy Storchaka	9ce71a6475	Fixed typos in comments.	2015-05-18 22:20:18 +03:00
Serhiy Storchaka	7e29eea926	Fixed typos in comments.	2015-05-18 22:19:42 +03:00
Serhiy Storchaka	0d4df752ac	Issue #15027: The UTF-32 encoder is now 3x to 7x faster.	2015-05-12 23:12:45 +03:00
Serhiy Storchaka	d9d769fcdd	Issue #23573: Increased performance of string search operations (str.find, str.index, str.count, the in operator, str.split, str.partition) with arguments of different kinds (UCS1, UCS2, UCS4).	2015-03-24 21:55:47 +02:00
Serhiy Storchaka	009b811d67	Removed unintentional trailing spaces in non-external and non-generated C files.	2015-03-18 21:53:15 +02:00
Serhiy Storchaka	4fdb68491e	Issue #22896: Avoid to use PyObject_AsCharBuffer(), PyObject_AsReadBuffer() and PyObject_AsWriteBuffer().	2015-02-03 01:21:08 +02:00
Serhiy Storchaka	b757c83ec6	Issue #22581: Use more "bytes-like object" throughout the docs and comments.	2014-12-05 22:25:22 +02:00
Benjamin Peterson	1cc9520327	s/stringobject/bytesobject/ (closes #22036) Patch by Martin Matusiak.	2014-07-23 21:39:37 -07:00
Benjamin Peterson	d455ce4fd4	merge 3.3	2014-03-30 19:52:39 -04:00
Benjamin Peterson	0ad6098b67	merge 3.2	2014-03-30 19:52:22 -04:00
Benjamin Peterson	23cf403ca1	fix expandtabs overflow detection to be consistent and not rely on signed overflow	2014-03-30 19:47:57 -04:00
Serhiy Storchaka	3079328d29	Reverted changeset b72c5573c5e7 (issue #15027).	2014-01-04 22:44:01 +02:00
Serhiy Storchaka	583a93943c	Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster.	2014-01-04 19:25:37 +02:00
Benjamin Peterson	0ee22bf774	fix format spec recursive expansion (closes #19729)	2013-11-26 19:22:36 -06:00
Serhiy Storchaka	dc2fd5101a	Remove dead code committed in issue #12892.	2013-11-19 15:56:05 +02:00
Serhiy Storchaka	58cf607d13	Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates. The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded. The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points. The surrogatepass error handler now works with the utf-16* and utf-32* codecs. Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.	2013-11-19 11:32:41 +02:00
Ezio Melotti	745d54d2fa	#17806: Added keyword-argument support for "tabsize" to str/bytes.expandtabs().	2013-11-16 19:10:57 +02:00
Victor Stinner	cc64eb5b9f	Issue #18408: Fix bytearrayiter.partition()/rpartition(), handle PyByteArray_FromStringAndSize() failure (ex: on memory allocation failure)	2013-10-29 03:15:37 +01:00
Serhiy Storchaka	8fa8ee3970	Issue #18701: Remove support of old CPython versions (<3.0) from C code.	2013-08-17 00:48:02 +03:00

227 Commits