cpython

Commit Graph

Author	SHA1	Message	Date
Victor Stinner	fecc4f2b47	bpo-36356: Release Unicode interned strings on Valgrind (#12431) When Python is compiled with Valgrind support, release Unicode interned strings at exit in _PyUnicode_Fini(). * Rename _Py_ReleaseInternedUnicodeStrings() to unicode_release_interned() and make it private. * unicode_release_interned() is now called from _PyUnicode_Fini(): it must be called with a running Python thread state for TRASHCAN, it cannot be called from pymain_free(). * Don't display statistics on interned strings at exit anymore	2019-03-19 14:20:29 +01:00
Victor Stinner	5f9cf23502	bpo-36301: Error if decoding pybuilddir.txt fails (GH-12422) Python initialization now fails if decoding pybuilddir.txt configuration file fails at startup. _PyPathConfig_Calculate() now reports memory allocation failure and decoding error on decoding pybuilddir.txt content from UTF-8/surrogateescape.	2019-03-19 01:46:25 +01:00
Inada Naoki	6a16b18224	bpo-36297: remove "unicode_internal" codec (GH-12342)	2019-03-18 15:44:11 +09:00
Victor Stinner	6d43f6f081	bpo-35713: Split _Py_InitializeCore into subfunctions (GH-11650) * Split _Py_InitializeCore_impl() into subfunctions: add multiple pycore_init_xxx() functions * Preliminary sys.stderr is now set earlier to get a usable sys.stderr earlier. * Move code into _Py_Initialize_ReconfigureCore() to be able to call it from _Py_InitializeCore(). * Split _PyExc_Init(): create a new _PyBuiltins_AddExceptions() function. * Call _PyExc_Init() earlier in _Py_InitializeCore_impl() and new_interpreter() to get working exceptions earlier. * _Py_ReadyTypes() now returns _PyInitError rather than calling Py_FatalError(). * Misc code cleanup	2019-01-22 21:18:05 +01:00
Victor Stinner	bf4ac2d2fd	bpo-35713: Rework Python initialization (GH-11647) * The PyByteArray_Init() and PyByteArray_Fini() functions have been removed. They did nothing since Python 2.7.4 and Python 3.2.0, were excluded from the limited API (stable ABI), and were not documented. * Move "_PyXXX_Init()" and "_PyXXX_Fini()" declarations from Include/cpython/pylifecycle.h to Include/internal/pycore_pylifecycle.h. Replace "PyAPI_FUNC(TYPE)" with "extern TYPE". * _PyExc_Init() now returns an error on failure rather than calling Py_FatalError(). Move macros inside _PyExc_Init() and undefine them when done. Rewrite macros to make them look more like statement: add ";" when using them, add "do { ... } while (0)". * _PyUnicode_Init() now returns a _PyInitError error rather than call Py_FatalError(). * Move stdin check from _PySys_BeginInit() to init_sys_streams(). * _Py_ReadyTypes() now returns a _PyInitError error rather than calling Py_FatalError().	2019-01-22 17:39:03 +01:00
Serhiy Storchaka	d586ccb04f	bpo-35552: Fix reading past the end in PyUnicode_FromFormat() and PyBytes_FromFormat(). (GH-11276) Format characters "%s" and "%V" in PyUnicode_FromFormat() and "%s" in PyBytes_FromFormat() no longer read memory past the limit if precision is specified.	2019-01-12 10:30:35 +02:00
Xtreak	3f7983a25a	bpo-35560: Remove assertion from format(float, "n") (GH-11288) Fix an assertion error in format() in debug build for floating point formatting with "n" format, zero padding and small width. Release build is not impacted. Patch by Karthikeyan Singaravelan.	2019-01-07 16:09:14 +01:00
animalize	a1d1425306	bpo-35636: Remove redundant check in unicode_hash(). (GH-11402) _Py_HashBytes() does the check for empty string.	2019-01-02 14:16:06 +02:00
Serhiy Storchaka	bb86bf4c4e	bpo-35444: Unify and optimize the helper for getting a builtin object. (GH-11047) This speeds up pickling of some iterators. This fixes also error handling in pickling methods when fail to look up builtin "getattr".	2018-12-11 08:28:18 +02:00
Serhiy Storchaka	eeb719eac6	bpo-35365: Use a wchar_t* buffer in the code page decoder. (GH-10837)	2018-12-04 10:25:50 +02:00
Serhiy Storchaka	4013c17911	bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)	2018-12-03 10:36:45 +02:00
Victor Stinner	bde9d6bbb4	bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759) Fix memory leak in PyUnicode_EncodeLocale() and PyUnicode_EncodeFSDefault() on error handling. Changes: * Fix unicode_encode_locale() error handling * Fix test_codecs.LocaleCodecTest	2018-11-28 10:26:20 +01:00
Victor Stinner	163403a63e	bpo-33954: Fix compiler warning in _PyUnicode_FastFill() (GH-10737) 'data' argument of unicode_fill() is modified, so it must not be constant. Add more assertions to unicode_fill(): check the maximum character value.	2018-11-27 12:41:17 +01:00
Serhiy Storchaka	62be74290a	bpo-33012: Fix invalid function cast warnings with gcc 8. (GH-6749) Fix invalid function cast warnings with gcc 8 for method conventions different from METH_NOARGS, METH_O and METH_VARARGS excluding Argument Clinic generated code.	2018-11-27 13:27:31 +02:00
Victor Stinner	59423e3ddd	bpo-33954: Fix _PyUnicode_InsertThousandsGrouping() (GH-10623) Fix str.format(), float.__format__() and complex.__format__() methods for non-ASCII decimal point when using the "n" formatter. Changes: * Rewrite _PyUnicode_InsertThousandsGrouping(): it now requires a _PyUnicodeWriter object for the buffer and a Python str object for digits. * Rename FILL() macro to unicode_fill(), convert it to static inline function, add "assert(0 <= start);" and rework its code.	2018-11-26 13:40:01 +01:00
Victor Stinner	a42de742e7	bpo-35059: Cast void* to PyObject* (GH-10650) Don't pass void* to Python macros: use _PyObject_CAST().	2018-11-22 10:25:22 +01:00
Victor Stinner	bcda8f1d42	bpo-35081: Add Include/internal/pycore_object.h (GH-10640) Move _PyObject_GC_TRACK() and _PyObject_GC_UNTRACK() from Include/objimpl.h to Include/internal/pycore_object.h.	2018-11-21 22:27:47 +01:00
Gregory P. Smith	746b2d35ea	bpo-35214: Fix OOB memory access in unicode escape parser (GH-10506) Discovered using clang's MemorySanitizer when it ran python3's test_fstring test_misformed_unicode_character_name. An msan build will fail by simply executing: ./python -c 'u"\N"'	2018-11-13 13:16:54 -08:00
Victor Stinner	621cebe81b	bpo-35081: Rename internal headers (GH-10275) Rename Include/internal/ headers: * pycore_hash.h -> pycore_pyhash.h * pycore_lifecycle.h -> pycore_pylifecycle.h * pycore_mem.h -> pycore_pymem.h * pycore_state.h -> pycore_pystate.h Add missing headers to Makefile.pre.in and PCbuild: * pycore_condvar.h. * pycore_hamt.h * pycore_pyhash.h	2018-11-12 16:53:38 +01:00
Victor Stinner	9fc57a3848	bpo-35081: Add pycore_fileutils.h (GH-10371) Move Py_BUILD_CORE code from Include/fileutils.h to a new Include/internal/pycore_fileutils.h file.	2018-11-07 00:44:03 +01:00
Victor Stinner	27e2d1f219	bpo-35081: Add pycore_ prefix to internal header files (GH-10263) * Rename Include/internal/ header files: * pyatomic.h -> pycore_atomic.h * ceval.h -> pycore_ceval.h * condvar.h -> pycore_condvar.h * context.h -> pycore_context.h * pygetopt.h -> pycore_getopt.h * gil.h -> pycore_gil.h * hamt.h -> pycore_hamt.h * hash.h -> pycore_hash.h * mem.h -> pycore_mem.h * pystate.h -> pycore_state.h * warnings.h -> pycore_warnings.h * PCbuild project, Makefile.pre.in, Modules/Setup: add the Include/internal/ directory to the search paths of header files. * Update includes. For example, replace #include "internal/mem.h" with #include "pycore_mem.h".	2018-11-01 00:52:28 +01:00
Victor Stinner	50fe3f8913	bpo-9263: _PyXXX_CheckConsistency() use _PyObject_ASSERT() (GH-10108) Use _PyObject_ASSERT() in: * _PyDict_CheckConsistency() * _PyType_CheckConsistency() * _PyUnicode_CheckConsistency() _PyObject_ASSERT() dumps the faulty object if the assertion fails to help debugging.	2018-10-26 18:47:15 +02:00
Serhiy Storchaka	c46db9232f	bpo-30863: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString(). (GH-2599) They no longer cache the wchar_t* representation of string objects.	2018-10-23 22:58:24 +03:00
Emanuele Gaifas	fc8205cb4b	Add missing closing quote and trailing period in str.isidentifier() docstring (GH-9756) This rectifies commit `ffc5a14d00`.	2018-10-08 16:14:47 +05:30
Sanyam Khurana	ffc5a14d00	bpo-33014: Clarify str.isidentifier docstring (GH-6088) * bpo-33014: Clarify str.isidentifier docstring * bpo-33014: Add code example in isidentifier documentation	2018-10-08 12:23:32 +05:30
Victor Stinner	998b806366	Revert "bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)" (GH-9187) This reverts commit `886483e2b9`.	2018-09-12 00:23:25 +02:00
Victor Stinner	886483e2b9	bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080) * Add %T format to PyUnicode_FromFormatV(), and so to PyUnicode_FromFormat() and PyErr_Format(), to format an object type name: equivalent to "%s" with Py_TYPE(obj)->tp_name. * Replace Py_TYPE(obj)->tp_name with %T format in unicodeobject.c. * Add unit test on %T format. * Rename unicode_fromformat_write_cstr() to unicode_fromformat_write_utf8(), to make the intent more explicit.	2018-09-07 18:00:58 +02:00
Victor Stinner	3d4226a832	bpo-34523: Support surrogatepass in locale codecs (GH-8995) Add support for the "surrogatepass" error handler in PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault() for the UTF-8 encoding. Changes: * _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the surrogatepass error handler (_Py_ERROR_SURROGATEPASS). * _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use the _Py_error_handler enum instead of "int surrogateescape" to pass the error handler. These functions now return -3 if the error handler is unknown. * Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() in test_codecs. * Rename get_error_handler() to _Py_GetErrorHandler() and expose it as a private function. * _freeze_importlib doesn't need config.filesystem_errors="strict" workaround anymore.	2018-08-29 22:21:32 +02:00
Victor Stinner	b2457efc78	bpo-34523: Add _PyCoreConfig.filesystem_encoding (GH-8963) _PyCoreConfig_Read() is now responsible to choose the filesystem encoding and error handler. Using Py_Main(), the encoding is now chosen even before calling Py_Initialize(). _PyCoreConfig.filesystem_encoding is now the reference, instead of Py_FileSystemDefaultEncoding, for the Python filesystem encoding. Changes: * Add filesystem_encoding and filesystem_errors to _PyCoreConfig * _PyCoreConfig_Read() now reads the locale encoding for the file system encoding. * PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize() now use the interpreter configuration rather than Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors global configuration variables. * Add _Py_SetFileSystemEncoding() and _Py_ClearFileSystemEncoding() private functions to only modify Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors in coreconfig.c. * _Py_CoerceLegacyLocale() now takes an int rather than _PyCoreConfig for the warning.	2018-08-29 13:25:36 +02:00
Alexey Izbyshev	74a307d48e	bpo-34435: Add missing NULL check to unicode_encode_ucs1(). (GH-8823) Reported by Svace static analyzer.	2018-08-19 21:52:04 +03:00
Zackery Spytz	e349bf2358	bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (GH-8741) The UTF-7 decoder now raises UnicodeDecodeError for ill-formed sequences starting with "+" (as specified in RFC 2152).	2018-08-19 07:43:38 +03:00
Victor Stinner	caba55b3b7	bpo-34301: Add _PyInterpreterState_Get() helper function (GH-8592) sys_setcheckinterval() now uses a local variable to parse arguments, before writing into interp->check_interval.	2018-08-03 15:33:52 +02:00
INADA Naoki	16dfca4d82	bpo-34087: Fix buffer overflow in int(s) and similar functions (GH-8274) `_PyUnicode_TransformDecimalAndSpaceToASCII()` missed trailing NUL char. It caused buffer overflow in `_Py_string_to_number_with_underscores()`. This bug is introduced in `9b6c60cb`.	2018-07-14 12:06:43 +09:00
Bup	fc93bd467e	Change tp_size to tp_basicsize in comment and realign the comments (GH-6775)	2018-06-19 16:59:55 +08:00
Siddhesh Poyarekar	55edd0c185	bpo-33012: Fix invalid function cast warnings with gcc 8 for METH_NOARGS. (GH-6030) METH_NOARGS functions need only a single argument but they are cast into a PyCFunction, which takes two arguments. This triggers an invalid function cast warning in gcc8 due to the argument mismatch. Fix this by adding a dummy unused argument.	2018-04-29 21:59:33 +03:00
Xiang Zhang	2b77a921e6	bpo-29803: remove a redundant op and fix a comment in unicodeobject.c (#660)	2018-02-13 18:33:32 +08:00
Serhiy Storchaka	b7e2d67f7c	bpo-32827: Fix usage of _PyUnicodeWriter_Prepare() in decoding errors handler. (GH-5636)	2018-02-13 08:27:33 +02:00
oldk	aa0735f597	bpo-32747: Remove trailing spaces in docstrings. (GH-5491)	2018-02-02 10:52:55 +02:00
Xiang Zhang	2c7fd46e11	bpo-32583: Fix possible crashing in builtin Unicode decoders (#5325) When using customized decode error handlers, it is possible for builtin decoders to write out-of-bounds and then crash.	2018-01-31 20:48:05 +08:00
INADA Naoki	7cc95f5069	Fix wrong assert in unicodeobject (GH-5340)	2018-01-28 02:07:09 +09:00
INADA Naoki	a49ac99029	bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342)	2018-01-27 14:06:21 +09:00
Victor Stinner	7ed7aead95	bpo-29240: Fix locale encodings in UTF-8 Mode (#5170) Modify locale.localeconv(), time.tzname, os.strerror() and other functions to ignore the UTF-8 Mode: always use the current locale encoding. Changes: * Add _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx(). On decoding or encoding error, they return the position of the error and an error message which are used to raise Unicode errors in PyUnicode_DecodeLocale() and PyUnicode_EncodeLocale(). * Replace _Py_DecodeCurrentLocale() with _Py_DecodeLocaleEx(). * PyUnicode_DecodeLocale() now uses _Py_DecodeLocaleEx() for all cases, especially for the strict error handler. * Add _Py_DecodeUTF8Ex(): return more information on decoding error and supports the strict error handler. * Rename _Py_EncodeUTF8_surrogateescape() to _Py_EncodeUTF8Ex(). * Replace _Py_EncodeCurrentLocale() with _Py_EncodeLocaleEx(). * Ignore the UTF-8 mode to encode/decode localeconv(), strerror() and time zone name. * PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize() and PyUnicode_EncodeLocale() now ignore the UTF-8 mode: always use the "current" locale. * Remove _PyUnicode_DecodeCurrentLocale(), _PyUnicode_DecodeCurrentLocaleAndSize() and _PyUnicode_EncodeCurrentLocale().	2018-01-15 10:45:49 +01:00
Victor Stinner	cb3ae5588b	bpo-29240: Ignore UTF-8 Mode in time module (#5148) time.strftime() must use the current LC_CTYPE encoding, not UTF-8 if the UTF-8 mode is enabled. Add _PyUnicode_DecodeCurrentLocale() function.	2018-01-11 10:37:59 +01:00
Victor Stinner	2cba6b8579	bpo-29240: readline now ignores the UTF-8 Mode (#5145) Add new functions ignoring the UTF-8 mode: * _Py_DecodeCurrentLocale() * _Py_EncodeCurrentLocale() * _PyUnicode_DecodeCurrentLocaleAndSize() * _PyUnicode_EncodeCurrentLocale() Modify the readline module to use these functions. Re-enable test_readline.test_nonascii().	2018-01-10 22:46:15 +01:00
Victor Stinner	9dd762013f	bpo-32030: Add _Py_EncodeLocaleRaw() (#4961) Replace Py_EncodeLocale() with _Py_EncodeLocaleRaw() in: * _Py_wfopen() * _Py_wreadlink() * _Py_wrealpath() * _Py_wstat() * pymain_open_filename() These functions are called early during Python initialization, only the RAW memory allocator must be used.	2017-12-21 16:20:32 +01:00
Victor Stinner	e47e698da6	bpo-32030: Add _Py_EncodeUTF8_surrogateescape() (#4960) Py_EncodeLocale() now uses _Py_EncodeUTF8_surrogateescape(), instead of using temporary unicode and bytes objects. So Py_EncodeLocale() doesn't use the Python C API anymore.	2017-12-21 15:45:16 +01:00
Serhiy Storchaka	a5552f023e	bpo-32240: Add the const qualifier to declarations of PyObject* array arguments. (#4746)	2017-12-15 13:11:11 +02:00
Victor Stinner	91106cd9ff	bpo-29240: PEP 540: Add a new UTF-8 Mode (#855) * Add -X utf8 command line option, PYTHONUTF8 environment variable and a new sys.flags.utf8_mode flag. * If the LC_CTYPE locale is "C" at startup: enable automatically the UTF-8 mode. * Add _winapi.GetACP(). encodings._alias_mbcs() now calls _winapi.GetACP() to get the ANSI code page * locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8 mode. As a side effect, open() now uses the UTF-8 encoding by default in this mode. * Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding in the UTF-8 Mode. * Update subprocess._args_from_interpreter_flags() to handle -X utf8 * Skip some tests relying on the current locale if the UTF-8 mode is enabled. * Add test_utf8mode.py. * _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to return also the length (number of wide characters). * pymain_get_global_config() and pymain_set_global_config() now always copy flag values, rather than only copying if the new value is greater than the old value.	2017-12-13 12:29:09 +01:00
Victor Stinner	6a54c676e6	bpo-31979: Remove unused align_maxchar() function (#4527)	2017-11-23 19:02:23 +01:00
Serhiy Storchaka	9b6c60cbce	bpo-31979: Simplify transforming decimals to ASCII (#4336) in int(), float() and complex() parsers. This also speeds up parsing non-ASCII numbers by around 20%.	2017-11-13 21:23:48 +02:00
Serhiy Storchaka	e2f92de6a9	Add the const qualifier to "char *" variables that refer to literal strings. (#4370)	2017-11-11 13:06:26 +02:00
stratakis	e8b1965639	bpo-23699: Use a macro to reduce boilerplate code in rich comparison functions (GH-793)	2017-11-02 20:32:54 +10:00
Serhiy Storchaka	a2314283ff	bpo-20047: Make bytearray methods partition() and rpartition() rejecting (#4158) separators that are not bytes-like objects.	2017-10-29 02:11:54 +03:00
Serhiy Storchaka	56cb465cc9	bpo-31825: Fixed OverflowError in the 'unicode-escape' codec (#4058) and in codecs.escape_decode() when decode an escaped non-ascii byte.	2017-10-20 17:08:15 +03:00
Barry Warsaw	b2e5794870	bpo-31338 (#3374) * Add Py_UNREACHABLE() as an alias to abort(). * Use Py_UNREACHABLE() instead of assert(0) * Convert more unreachable code to use Py_UNREACHABLE() * Document Py_UNREACHABLE() and a few other macros.	2017-09-14 18:13:16 -07:00
Serhiy Storchaka	e3b2b4b8d9	bpo-31393: Fix the use of PyUnicode_READY(). (#3451)	2017-09-08 09:58:51 +03:00
Eric Snow	2ebc5ce42a	bpo-30860: Consolidate stateful runtime globals. (#3397) * group the (stateful) runtime globals into various topical structs * consolidate the topical structs under a single top-level _PyRuntimeState struct * add a check-c-globals.py script that helps identify runtime globals Other globals are excluded (see globals.txt and check-c-globals.py).	2017-09-07 23:51:28 -06:00
Stefan Krah	f432a3234f	bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157)	2017-08-21 13:09:59 +02:00
Serhiy Storchaka	64e461be09	bpo-22207: Add checks for possible integer overflows in unicodeobject.c. (#2623) Based on patch by Victor Stinner.	2017-07-11 06:55:25 +03:00
Serhiy Storchaka	f7eae0adfc	[security] bpo-13617: Reject embedded null characters in wchar* strings. (#2302) Based on patch by Victor Stinner. Add private C API function _PyUnicode_AsUnicode() which is similar to PyUnicode_AsUnicode(), but checks for null characters.	2017-06-28 08:30:06 +03:00
Serhiy Storchaka	e613e6add5	bpo-30708: Check for null characters in PyUnicode_AsWideCharString(). (#2285) Raise a ValueError if the second argument is NULL and the wchar_t* string contains null characters.	2017-06-27 16:03:14 +03:00
Serhiy Storchaka	40db90c1ce	bpo-29802: Fix reference counting in module-level struct functions (#1213) when pass arguments of wrong type.	2017-04-20 21:19:31 +03:00
Serhiy Storchaka	b879fe82e7	Expand the PySlice_GetIndicesEx macro. (#1023)	2017-04-08 09:53:51 +03:00
Lisa Roach	43ba8861e0	bpo-29549: Fixes docstring for str.index (#256) * Updates B.index documentation. * Updates str.index documentation, makes it Argument Clinic compatible. * Removes ArgumentClinic code. * Finishes string.index documentation. * Updates string.rindex documentation. * Documents B.rindex.	2017-04-04 22:36:22 -07:00
Serhiy Storchaka	fff9a31a91	bpo-29865: Use PyXXX_GET_SIZE macros rather than Py_SIZE for concrete types. (#748)	2017-03-21 08:53:25 +02:00
Serhiy Storchaka	004e03fb0c	bpo-29116: Improve error message for concatenating str with non-str. (#710)	2017-03-19 19:38:42 +02:00
Serhiy Storchaka	202fda55c2	bpo-24037: Add Argument Clinic converter `bool(accept={int})`. (#485)	2017-03-12 10:10:47 +02:00
Serhiy Storchaka	370fd202f1	Use Py_RETURN_FALSE/Py_RETURN_TRUE rather than PyBool_FromLong(0)/PyBool_FromLong(1). (#567)	2017-03-08 20:47:48 +02:00
Serhiy Storchaka	9f8ad3f39e	bpo-29568: Disable any characters between two percents for escaped percent "%%" in the format string for classic string formatting. (GH-513)	2017-03-08 11:51:19 +08:00
Martin Panter	91a8866dc1	Fix grammar in doc string, RST markup	2017-01-24 00:30:06 +00:00
Serhiy Storchaka	228b12edcc	Issue #28999: Use Py_RETURN_NONE, Py_RETURN_TRUE and Py_RETURN_FALSE wherever possible. Patch is written with Coccinelle.	2017-01-23 09:47:21 +02:00
Serhiy Storchaka	2a404b63d4	Issue #28769: The result of PyUnicode_AsUTF8AndSize() and PyUnicode_AsUTF8() is now of type "const char *" rather than "char *".	2017-01-22 23:07:07 +02:00
Victor Stinner	0c4a828cad	Run Argument Clinic: METH_VARARGS=>METH_FASTCALL Issue #29286. Run Argument Clinic to get the new faster METH_FASTCALL calling convention for functions using "boring" positional arguments. Manually fix _elementtree: _elementtree_XMLParser_doctype() must remain consistent with the clinic code.	2017-01-17 02:21:47 +01:00
INADA Naoki	15f94596b6	Issue #20180: forgot to update AC output.	2017-01-16 21:49:13 +09:00
INADA Naoki	3ae2056512	Issue #20180: convert unicode methods to AC.	2017-01-16 20:41:20 +09:00
Xiang Zhang	7a4da324dc	Issue #29145: Merge 3.6.	2017-01-10 10:56:38 +08:00
Xiang Zhang	95403d74d7	Issue #29145: Merge 3.5.	2017-01-10 10:54:19 +08:00
Xiang Zhang	b0541f4cdf	Issue #29145: Fix overflow checks in str.replace() and str.join(). Based on patch by Martin Panter.	2017-01-10 10:52:00 +08:00
Xiang Zhang	62497d52d9	Issue #29044: Merge 3.6.	2016-12-22 15:31:55 +08:00
Xiang Zhang	437a5d2c25	Issue #29044: Merge 3.5.	2016-12-22 15:31:22 +08:00
Xiang Zhang	ea1cf87030	Issue #29044: Fix a use-after-free in string '%c' formatter.	2016-12-22 15:30:47 +08:00
Xiang Zhang	b211068f5c	Issue #28822: Adjust indices handling of PyUnicode_FindChar().	2016-12-20 22:52:33 +08:00
Xavier de Gaye	31eaf49ed9	Merge 3.6.	2016-12-15 21:01:52 +01:00
Xavier de Gaye	76febd0792	Issue #26919: On Android, operating system data is now always encoded/decoded to/from UTF-8, instead of the locale encoding to avoid inconsistencies with os.fsencode() and os.fsdecode() which are already using UTF-8.	2016-12-15 20:59:58 +01:00
Serhiy Storchaka	fb3134f4d4	Issue #28808: PyUnicode_CompareWithASCIIString() now never raises exceptions.	2016-12-06 00:20:26 +02:00
Serhiy Storchaka	9a953dbb34	Issue #28808: PyUnicode_CompareWithASCIIString() now never raises exceptions.	2016-12-06 00:17:45 +02:00
Serhiy Storchaka	419967b832	Issue #28808: PyUnicode_CompareWithASCIIString() now never raises exceptions.	2016-12-06 00:13:34 +02:00
Victor Stinner	de4ae3d486	Backed out changeset b9c9691c72c5 Issue #28858: The change b9c9691c72c5 introduced a regression. It seems like _PyObject_CallArg1() uses more stack memory than PyObject_CallFunctionObjArgs().	2016-12-04 22:59:09 +01:00
Victor Stinner	27580c1fb5	Replace PyObject_CallFunctionObjArgs() with fastcall * PyObject_CallFunctionObjArgs(func, NULL) => _PyObject_CallNoArg(func) * PyObject_CallFunctionObjArgs(func, arg, NULL) => _PyObject_CallArg1(func, arg) PyObject_CallFunctionObjArgs() allocates 40 bytes on the C stack and requires extra work to "parse" C arguments to build a C array of PyObject*. _PyObject_CallNoArg() and _PyObject_CallArg1() are simpler and don't allocate memory on the C stack. This change is part of the fastcall project. The change on listsort() is related to the issue #23507.	2016-12-01 14:43:22 +01:00
Serhiy Storchaka	99250d5c63	Issue #28774: Simplified encoding a str result of an error handler in ASCII and Latin1 encoders.	2016-11-23 15:13:00 +02:00
Xiang Zhang	d04d8474df	Issue #28774: Fix start/end pos in unicode_encode_ucs1(). Fix error position of the unicode error in ASCII and Latin1 encoders when a string returned by the error handler contains multiple non-encodable characters (non-ASCII for the ASCII codec, characters out of the U+0000-U+00FF range for Latin1).	2016-11-23 19:34:01 +08:00
Serhiy Storchaka	50911476f5	Issue #28760: Clean up and fix comments in PyUnicode_AsUnicodeEscapeString(). Patch by Xiang Zhang.	2016-11-21 11:47:16 +02:00
Serhiy Storchaka	ac0720eaa4	Issue #28760: Clean up and fix comments in PyUnicode_AsUnicodeEscapeString(). Patch by Xiang Zhang.	2016-11-21 11:46:51 +02:00
Serhiy Storchaka	460bd0d284	Issue #19569: Compiler warnings are now emitted if use most of deprecated functions.	2016-11-20 12:16:46 +02:00
Serhiy Storchaka	27b74244fb	Issue #28701: _PyUnicode_EqualToASCIIId and _PyUnicode_EqualToASCIIString now require ASCII right argument and assert this condition in debug build.	2016-11-16 20:03:03 +02:00
Serhiy Storchaka	a83a6a3275	Issue #28701: _PyUnicode_EqualToASCIIId and _PyUnicode_EqualToASCIIString now require ASCII right argument and assert this condition in debug build.	2016-11-16 20:02:44 +02:00
Serhiy Storchaka	e6d6131f78	Fixed an off-by-one error in _PyUnicode_EqualToASCIIString (issue #28701).	2016-11-16 16:13:13 +02:00
Serhiy Storchaka	df66b9c425	Fixed an off-by-one error in _PyUnicode_EqualToASCIIString (issue #28701).	2016-11-16 16:12:56 +02:00
Serhiy Storchaka	292dd1b2ad	Fixed an off-by-one error in _PyUnicode_EqualToASCIIString (issue #28701).	2016-11-16 16:12:34 +02:00
Serhiy Storchaka	503db266a5	Issue #21449: Removed private function _PyUnicode_CompareWithId.	2016-11-16 15:56:50 +02:00
Serhiy Storchaka	dddec81b2d	Issue #21449: Removed private function _PyUnicode_CompareWithId.	2016-11-16 15:56:27 +02:00
Serhiy Storchaka	29a5447360	Issue #28701: Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId. The latter function is more readable, faster and doesn't raise exceptions. Based on patch by Xiang Zhang.	2016-11-16 15:41:31 +02:00
Serhiy Storchaka	fab6acd9f5	Issue #28701: Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId. The latter function is more readable, faster and doesn't raise exceptions. Based on patch by Xiang Zhang.	2016-11-16 15:41:11 +02:00
Serhiy Storchaka	f5894dd646	Issue #28701: Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId. The latter function is more readable, faster and doesn't raise exceptions. Based on patch by Xiang Zhang.	2016-11-16 15:40:39 +02:00
Serhiy Storchaka	1a73bf365e	Issue #28701: Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString. The latter function is more readable, faster and doesn't raise exceptions.	2016-11-16 10:19:57 +02:00
Serhiy Storchaka	3b73ea1278	Issue #28701: Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString. The latter function is more readable, faster and doesn't raise exceptions.	2016-11-16 10:19:20 +02:00
Serhiy Storchaka	f4934ea77d	Issue #28701: Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString. The latter function is more readable, faster and doesn't raise exceptions.	2016-11-16 10:17:58 +02:00
Serhiy Storchaka	616034eb73	Issue #28648: Fixed crash in Py_DecodeLocale() in debug build on Mac OS X when decode astral characters.	2016-11-12 14:37:11 +02:00
Serhiy Storchaka	babe4f8e5e	Issue #28648: Fixed crash in Py_DecodeLocale() in debug build on Mac OS X when decode astral characters.	2016-11-12 14:36:02 +02:00
Serhiy Storchaka	6b4b6e956e	Issue #28648: Fixed crash in Py_DecodeLocale() in debug build on Mac OS X when decode astral characters.	2016-11-12 14:35:46 +02:00
Serhiy Storchaka	84293aff9f	Issue #28648: Fixed crash in Py_DecodeLocale() in debug build on Mac OS X when decode astral characters.	2016-11-12 14:29:48 +02:00
Serhiy Storchaka	b626643734	Issue #28648: Fixed crash in Py_DecodeLocale() in debug build on Mac OS X when decode astral characters.	2016-11-12 14:28:06 +02:00
Steve Dower	257a4c1503	Closes #27781: Removes special cases for the experimental aspect of PEP 529	2016-11-06 19:35:24 -08:00
Steve Dower	78057b4159	Closes #27781: Removes special cases for the experimental aspect of PEP 529	2016-11-06 19:35:08 -08:00
Eric V. Smith	5646648678	Issue 28128: Print out better error/warning messages for invalid string escapes. Backport to 3.6.	2016-10-31 14:46:26 -04:00
Eric V. Smith	42454af094	Issue 28128: Print out better error/warning messages for invalid string escapes.	2016-10-31 09:22:08 -04:00
Serhiy Storchaka	2edcd1cba4	Issue #28426: Deprecated undocumented functions PyUnicode_AsEncodedObject(), PyUnicode_AsDecodedObject(), PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode().	2016-10-27 21:08:00 +03:00
Serhiy Storchaka	0093907f0e	Issue #28426: Deprecated undocumented functions PyUnicode_AsEncodedObject(), PyUnicode_AsDecodedObject(), PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode().	2016-10-27 21:05:49 +03:00
Serhiy Storchaka	a4f8823063	Issue #28408: Fixed a leak and remove redundant code in _PyUnicodeWriter_Finish(). Patch by Xiang Zhang.	2016-10-25 13:25:04 +03:00
Serhiy Storchaka	c8bc3d1c07	Issue #28408: Fixed a leak and remove redundant code in _PyUnicodeWriter_Finish(). Patch by Xiang Zhang.	2016-10-25 13:23:56 +03:00
Serhiy Storchaka	d7e5ff13bb	Issue #28426: Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.	2016-10-25 10:18:16 +03:00
Serhiy Storchaka	c4a3e90aa8	Issue #28426: Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.	2016-10-25 10:17:33 +03:00
Serhiy Storchaka	839023f12c	Issue #28426: Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.	2016-10-25 10:13:43 +03:00
Serhiy Storchaka	77eede35fc	Issue #28426: Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.	2016-10-25 10:07:51 +03:00
Serhiy Storchaka	2fbc019c8c	Issue #28439: Remove redundant checks in PyUnicode_EncodeLocale and PyUnicode_DecodeLocaleAndSize. Patch by Xiang Zhang.	2016-10-23 15:41:36 +03:00
Serhiy Storchaka	f8d7d41507	Issue #28511: Use the "U" format instead of "O!" in PyArg_Parse*.	2016-10-23 15:12:25 +03:00
Serhiy Storchaka	523c449ca0	Issue #28504: Cleanup unicode_decode_call_errorhandler_wchar/writer. Patch by Xiang Zhang.	2016-10-22 23:18:31 +03:00
Serhiy Storchaka	14ab277632	Issue #28410: Added _PyErr_FormatFromCause() -- the helper for raising new exception with setting current exception as __cause__. _PyErr_FormatFromCause(exception, format, args...) is equivalent to Python raise exception(format % args) from sys.exc_info()[1]	2016-10-21 17:10:42 +03:00
Serhiy Storchaka	467ab194fc	Issue #28410: Added _PyErr_FormatFromCause() -- the helper for raising new exception with setting current exception as __cause__. _PyErr_FormatFromCause(exception, format, args...) is equivalent to Python raise exception(format % args) from sys.exc_info()[1]	2016-10-21 17:09:17 +03:00
Benjamin Peterson	d6d49f16f4	merge 3.6 (#28454)	2016-10-16 15:42:33 -07:00
Benjamin Peterson	3aa75528a1	merge 3.5 (#28454)	2016-10-16 15:42:24 -07:00
Benjamin Peterson	8d761ff045	remove extra PyErr_Format arguments (closes #28454) Patch from Xiang Zhang.	2016-10-16 15:41:46 -07:00
Victor Stinner	5a33759fba	Merge 3.6	2016-10-12 13:59:13 +02:00
Victor Stinner	ebe17e0347	Fix _Py_normalize_encoding() command It's not exactly the same as encodings.normalize_encoding(): the C function also converts to lowercase.	2016-10-12 13:57:45 +02:00
Benjamin Peterson	8a3748290a	merge 3.6 (#28417)	2016-10-11 23:01:12 -07:00
Benjamin Peterson	b329e1bb5b	va_end vargs2 once (closes #28417)	2016-10-11 23:00:58 -07:00
Serhiy Storchaka	2e58f1a52a	Issue #28400: Removed unnecessary checks in unicode_char and resize_copy. 1. In resize_copy we don't need to PyUnicode_READY(unicode) since when it's not PyUnicode_WCHAR_KIND it should be ready. 2. In unicode_char, PyUnicode_1BYTE_KIND is handled by get_latin1_char. Patch by Xiang Zhang.	2016-10-09 23:44:48 +03:00
Serhiy Storchaka	21d9f10c94	Merge from 3.5.	2016-10-08 22:46:01 +03:00
Serhiy Storchaka	9c0e1f83af	Issue #28379: Added sanity checks and tests for PyUnicode_CopyCharacters(). Patch by Xiang Zhang.	2016-10-08 22:45:38 +03:00
Victor Stinner	44f4874e68	Merge 3.5	2016-09-21 14:13:53 +02:00
Victor Stinner	1ddf53d496	Fix PyUnicode_FromFormatV() error handling Issue #28233: Fix a memory leak if the format string contains a non-ASCII character, destroy the unicode writer.	2016-09-21 14:13:14 +02:00
Christian Heimes	2f2fee19ec	va_end() all va_copy()ed va_lists.	2016-09-21 11:37:27 +02:00
Benjamin Peterson	0c21214f3e	replace usage of Py_VA_COPY with the (C99) standard va_copy	2016-09-20 20:39:33 -07:00
Christian Heimes	f051e43b22	Issue #28126 : Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().	2016-09-13 20:22:02 +02:00
Benjamin Peterson	621b430a14	remove all usage of Py_LOCAL	2016-09-09 13:54:34 -07:00
Benjamin Peterson	33d2a492d0	promote some shifts to unsigned, so as not to invoke undefined behavior	2016-09-06 20:40:04 -07:00
R David Murray	110b6fecbb	#27364 : Deprecate invalid escape strings in str/bytes. Patch by Emanuel Barry, reviewed by Serhiy Storchaka and Martin Panter.	2016-09-08 15:34:08 -04:00
Steve Dower	cc16be85c0	Issue #27781 : Change file system encoding on Windows to UTF-8 (PEP 529)	2016-09-08 10:35:16 -07:00
Benjamin Peterson	47ff0734b8	more PY_LONG_LONG to long long	2016-09-08 09:15:54 -07:00
Benjamin Peterson	2e7c5e9c11	replace some Py_LOCAL_INLINE with the inline keyword	2016-09-07 15:33:32 -07:00
Benjamin Peterson	4b9abf3a27	merge 3.5	2016-09-06 20:42:17 -07:00
Brett Cannon	a571120410	Issue #27182 : Add support for path-like objects to PyUnicode_FSDecoder().	2016-09-06 19:36:01 -07:00
Victor Stinner	62ec3317d2	Optimize unicode_escape and raw_unicode_escape Issue #16334. Patch written by Serhiy Storchaka.	2016-09-06 17:04:34 -07:00
Victor Stinner	2740e46089	_PyUnicodeWriter: assert that max character <= MAX_UNICODE	2016-09-06 16:58:36 -07:00
Brett Cannon	ec6ce879c7	Issue #26027 : Support path-like objects in PyUnicode_FSConverter(). This is to add support for os.exec() and os.spawn() functions. Part of PEP 519.	2016-09-06 15:50:29 -07:00
Benjamin Peterson	9b3d77052f	replace Python aliases for standard integer types with the standard integer types (#17884 )	2016-09-06 13:24:00 -07:00
Serhiy Storchaka	ea525a2d1a	Issue #27078 : Added BUILD_STRING opcode. Optimized f-strings evaluation.	2016-09-06 22:07:53 +03:00
Benjamin Peterson	af580dff4a	replace PY_LONG_LONG with long long	2016-09-06 10:46:49 -07:00
Benjamin Peterson	ed4aa83ff7	require a long long data type (closes #27961 )	2016-09-05 17:44:18 -07:00
Victor Stinner	942889aae2	Issue #27938 : Add a fast-path for us-ascii encoding Other changes: * Rewrite _Py_normalize_encoding() as a C implementation of encodings.normalize_encoding(). For example, " utf-8 " is now normalized to "utf_8". So the fast path is now used for more name variants of the same encoding. * Avoid strcpy() when encoding is NULL: call directly the UTF-8 codec	2016-09-05 15:40:10 -07:00
Victor Stinner	1a05d6c04d	PEP 7 style for if/else in C Add also a newline for readability in normalize_encoding().	2016-09-02 12:12:23 +02:00
Raymond Hettinger	15f44ab043	Issue #27895 : Spelling fixes (Contributed by Ville Skyttä).	2016-08-30 10:47:49 -07:00
Serhiy Storchaka	febc332056	Issue #26754 : Undocumented support of general bytes-like objects as path in compile() and similar functions is now deprecated.	2016-08-06 23:29:29 +03:00
Berker Peksag	ced8d4c6eb	Issue #27454 : Use PyDict_SetDefault in PyUnicode_InternInPlace Patch by INADA Naoki.	2016-07-25 04:40:39 +03:00
Serhiy Storchaka	f95de0e8cc	Issue #26754 : PyUnicode_FSDecoder() accepted a filename argument encoded as an iterable of integers. Now only strings and byte-like objects are accepted.	2016-06-18 13:56:16 +03:00
Serhiy Storchaka	9305d83425	Issue #26754 : PyUnicode_FSDecoder() accepted a filename argument encoded as an iterable of integers. Now only strings and byte-like objects are accepted.	2016-06-18 13:53:36 +03:00
Martin Panter	0b7d84de6b	Issue #27171 : Merge typo fixes from 3.5	2016-06-02 10:11:18 +00:00
Martin Panter	e26da7c03a	Issue #27171 : Fix typos in documentation, comments, and test function names	2016-06-02 10:07:09 +00:00
Serhiy Storchaka	dd40fc3e57	Issue #26765 : Moved common code and docstrings for bytes and bytearray methods to bytes_methods.c.	2016-05-04 22:23:26 +03:00
Martin Panter	cda80940ed	Issue #15984 : Merge PyUnicode doc from 3.5	2016-04-15 02:27:11 +00:00
Martin Panter	6245cb3c01	Correct “an” → “a” with “Unicode”, “user”, “UTF”, etc. This affects documentation, code comments, and debugging messages.	2016-04-15 02:14:19 +00:00
Serhiy Storchaka	21a663ea28	Issue #26057 : Got rid of unneeded use of PyUnicode_FromObject().	2016-04-13 15:37:23 +03:00
Serhiy Storchaka	f01e408c16	Issue #26200 : Added Py_SETREF and replaced Py_XSETREF with Py_SETREF in places where Py_DECREF was used.	2016-04-10 18:12:01 +03:00
Serhiy Storchaka	57a01d3a0e	Issue #26200 : Added Py_SETREF and replaced Py_XSETREF with Py_SETREF in places where Py_DECREF was used.	2016-04-10 18:05:40 +03:00
Serhiy Storchaka	ec39756960	Issue #22570 : Renamed Py_SETREF to Py_XSETREF.	2016-04-06 09:50:03 +03:00
Serhiy Storchaka	48842714b9	Issue #22570 : Renamed Py_SETREF to Py_XSETREF.	2016-04-06 09:45:48 +03:00
Serhiy Storchaka	ab479c49d3	Issue #26494 : Fixed crash on iterating exhausting iterators. Affected classes are generic sequence iterators, iterators of str, bytes, bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding views and os.scandir() iterator.	2016-03-30 20:41:15 +03:00
Serhiy Storchaka	fbb1c5ee06	Issue #26494 : Fixed crash on iterating exhausting iterators. Affected classes are generic sequence iterators, iterators of str, bytes, bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding views and os.scandir() iterator.	2016-03-30 20:40:02 +03:00
Victor Stinner	f2192855dd	Merge 3.5	2016-03-01 22:07:53 +01:00
Victor Stinner	337986740f	Issue #26464 : Fix unicode_fast_translate() again Initialize i variable if the string is non-ASCII.	2016-03-01 21:59:58 +01:00
Victor Stinner	3d9d77a3dc	Merge 3.5	2016-03-01 21:30:50 +01:00
Victor Stinner	6c9aa8f2bf	Fix str.translate() Issue #26464: Fix str.translate() when string is ASCII and first replacements removes character, but next replacement uses a non-ASCII character or a string longer than 1 character. Regression introduced in Python 3.5.0.	2016-03-01 21:30:30 +01:00
Victor Stinner	5b96f17b1c	Merge 3.5	2016-01-27 17:01:13 +01:00
Victor Stinner	5bc03a6d4d	Fix resize_compact() Issue #26217: resize_compact() must set wstr_length to 0 after freeing the wstr string. Otherwise, an assertion fails in _PyUnicode_CheckConsistency().	2016-01-27 16:56:53 +01:00
Serhiy Storchaka	726fc139a5	Issue #20440 : More use of Py_SETREF. This patch is manually crafted and contains changes that couldn't be handled automatically.	2015-12-27 15:44:33 +02:00
Serhiy Storchaka	191321d11b	Issue #20440 : More use of Py_SETREF. This patch is manually crafted and contains changes that couldn't be handled automatically.	2015-12-27 15:41:34 +02:00
Serhiy Storchaka	ef1585eb9a	Issue #25923 : Added more const qualifiers to signatures of static and private functions.	2015-12-25 20:01:53 +02:00
Serhiy Storchaka	2d06e84455	Issue #25923 : Added the const qualifier to static constant arrays.	2015-12-25 19:53:18 +02:00
Serhiy Storchaka	f006940351	Issue #20440 : Massive replacing unsafe attribute setting code with special macro Py_SETREF.	2015-12-24 10:39:57 +02:00
Serhiy Storchaka	5a57ade58e	Issue #20440 : Massive replacing unsafe attribute setting code with special macro Py_SETREF.	2015-12-24 10:35:59 +02:00
Serhiy Storchaka	9b3a2eec1c	Issues #25890 , #25891 , #25892 : Removed unused variables in Windows code. Reported by Alexander Riccio.	2015-12-18 10:03:13 +02:00
Serhiy Storchaka	7c088a9b5c	Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.	2015-12-03 01:05:52 +02:00
Serhiy Storchaka	6648bf5661	Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.	2015-12-03 01:04:37 +02:00
Serhiy Storchaka	31b9410654	Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.	2015-12-03 01:02:03 +02:00
Serhiy Storchaka	7aa690860e	Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.	2015-12-03 01:02:03 +02:00
Benjamin Peterson	d798dc1034	merge 3.5 (#25630 )	2015-11-15 21:57:50 -08:00
Benjamin Peterson	a4d33b3428	make the PyUnicode_FSConverter cleanup set the decrefed argument to NULL (closes #25630 )	2015-11-15 21:57:39 -08:00
Serhiy Storchaka	413fdcea21	Issue #24821 : Refactor STRINGLIB(fastsearch_memchr_1char) and split it into STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independently without special preconditions.	2015-11-14 15:42:17 +02:00
Serhiy Storchaka	4a7c03aab4	Issue #25523 : Merge a-to-an corrections from 3.5.	2015-11-02 14:44:29 +02:00
Serhiy Storchaka	a84f6c3dd3	Issue #25523 : Merge a-to-an corrections from 3.4.	2015-11-02 14:39:05 +02:00
Serhiy Storchaka	d65c9496da	Issue #25523 : Further a-to-an corrections.	2015-11-02 14:10:23 +02:00
Victor Stinner	358af13526	Issue #25353 : Optimize unicode escape and raw unicode escape encoders to use the new _PyBytesWriter API.	2015-10-12 22:36:57 +02:00
Victor Stinner	6c2cdae9e6	Writer APIs: use empty string singletons Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the empty bytes/Unicode string if the string is empty.	2015-10-12 13:29:43 +02:00
Victor Stinner	6bd525b656	Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual character. Cleanup unicode_encode_ucs1(): * Rename repunicode to rep * Clear rep object on error * Factorize code between bytes and unicode path	2015-10-09 13:10:05 +02:00
Victor Stinner	ce179bf6ba	Add _PyBytesWriter_WriteBytes() to factorize the code	2015-10-09 12:57:22 +02:00
Victor Stinner	ad7715891e	_PyBytesWriter: simplify code to avoid "prealloc" parameters Subtract preallocated bytes from min_size before calling _PyBytesWriter_Prepare().	2015-10-09 12:38:53 +02:00
Victor Stinner	3fa36ff5e4	Issue #25318 : Fix backslashreplace() Fix code to estimate the needed space.	2015-10-09 03:37:11 +02:00
Victor Stinner	797485e101	Issue #25318 : Avoid sprintf() in backslashreplace() Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors(). Add also unit tests for non-BMP characters.	2015-10-09 03:17:30 +02:00
Victor Stinner	0016507c16	Issue #25318 : Move _PyBytesWriter to bytesobject.c Declare also the private API in bytesobject.h.	2015-10-09 01:53:21 +02:00
Victor Stinner	e7bf86cd7d	Optimize backslashreplace error handler Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Also optimize the backslashreplace error handler for ASCII and Latin1 encoders. Use the new _PyBytesWriter API to optimize these error handlers for the encoders. This avoids creating an exception and calling the slow implementation of the error handler.	2015-10-09 01:39:28 +02:00
Victor Stinner	fdfbf78114	Issue #25318 : Add _PyBytesWriter API Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation. Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers. unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.	2015-10-09 00:33:49 +02:00
Victor Stinner	74e8fac3c8	Issue #25301 : Fix compatibility with ISO C90	2015-10-05 13:49:26 +02:00
Victor Stinner	1d65d9192d	Issue #25301 : The UTF-8 decoder is now up to 15 times as fast for error handlers: ``ignore``, ``replace`` and ``surrogateescape``.	2015-10-05 13:43:50 +02:00
Victor Stinner	eb36fdaad8	Fix _PyUnicodeWriter_PrepareKind() Initialize kind to 0 (PyUnicode_WCHAR_KIND) to ensure that _PyUnicodeWriter_PrepareKind() handles correctly read-only buffer: copy the buffer.	2015-10-03 01:55:51 +02:00
Serhiy Storchaka	29e68edbf4	Issue #24848 : Fixed bugs in UTF-7 decoding of malformed data: 1. Non-ASCII bytes were accepted after shift sequence. 2. A low surrogate could be emitted in case of error in high surrogate. 3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).	2015-10-02 13:14:03 +03:00
Serhiy Storchaka	58c8f2bb6d	Issue #24848 : Fixed bugs in UTF-7 decoding of malformed data: 1. Non-ASCII bytes were accepted after shift sequence. 2. A low surrogate could be emitted in case of error in high surrogate. 3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).	2015-10-02 13:13:14 +03:00
Serhiy Storchaka	28b21e50c8	Issue #24848 : Fixed bugs in UTF-7 decoding of malformed data: 1. Non-ASCII bytes were accepted after shift sequence. 2. A low surrogate could be emitted in case of error in high surrogate.	2015-10-02 13:07:28 +03:00
Victor Stinner	3222da26fe	Make _PyUnicode_TranslateCharmap() symbol private unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().	2015-10-01 22:07:32 +02:00
Victor Stinner	01ada3996b	Issue #25267 : The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. Patch co-written with Serhiy Storchaka.	2015-10-01 21:54:51 +02:00
Victor Stinner	c3713e9706	Optimize ascii/latin1+surrogateescape encoders Issue #25227: Optimize ASCII and latin1 encoders with the ``surrogateescape`` error handler: the encoders are now up to 3 times as fast. Initial patch written by Serhiy Storchaka.	2015-09-29 12:32:13 +02:00
Victor Stinner	0030cd52da	Issue #25227 : Cleanup unicode_encode_ucs1() error handler * Change limit type from unsigned int to Py_UCS4, to use the same type as the "ch" variable (a Unicode character). * Reuse ch variable for _Py_ERROR_XMLCHARREFREPLACE * Add some newlines for readability	2015-09-24 14:45:00 +02:00
Victor Stinner	54385b206d	Issue #24870 : revert unwanted change Sorry, I pushed the patch on the UTF-8 decoder by mistake :-(	2015-09-22 10:46:52 +02:00
Victor Stinner	5ebae87628	Issue #25207 , #14626 : Fix my commit. It doesn't work to use "#define XXX defined(YYY)" and then "#ifdef XXX" to check YYY.	2015-09-22 01:29:33 +02:00
Victor Stinner	6174474bea	_PyUnicodeWriter_PrepareInternal(): make the assertion more strict	2015-09-22 01:01:17 +02:00
Victor Stinner	ca9381ea01	Issue #24870 : Add _PyUnicodeWriter_PrepareKind() macro Add a macro which ensures that the writer has at least the requested kind.	2015-09-22 00:58:32 +02:00
Victor Stinner	5014920cb7	Issue #24870 : Reuse the new _Py_error_handler enum Factorize code with the new get_error_handler() function. Add some empty lines for readability.	2015-09-22 00:26:54 +02:00
Victor Stinner	f96418de05	Issue #24870 : Optimize the ASCII decoder for error handlers: surrogateescape, ignore and replace. Initial patch written by Naoki Inada. The decoder is now up to 60 times as fast for these error handlers. Add also unit tests for the ASCII decoder.	2015-09-21 23:06:27 +02:00
Zachary Ware	070bd62cfa	Closes #21279 : Merge with 3.5	2015-08-06 00:05:13 -05:00
Zachary Ware	d987a81d29	Issue #21279 : Merge with 3.4	2015-08-06 00:04:23 -05:00
Zachary Ware	79b98df023	Issue #21279 : Flesh out str.translate docs Initial patch by Kinga Farkas, Martin Panter, and John Posner.	2015-08-05 23:54:15 -05:00
Raymond Hettinger	ac2ef65c32	Make the unicode equality test an external function rather than in-lining it. The real benefit of the unicode specialized function comes from bypassing the overhead of PyObject_RichCompareBool() and not from being in-lined (especially since there was almost no shared data between the caller and callee). Also, the in-lining was having a negative effect on code generation for the callee.	2015-07-04 16:04:44 -07:00
Serhiy Storchaka	d4ea03c785	Issue #24284 : The startswith and endswith methods of the str class no longer return True when finding the empty string and the indexes are completely out of range.	2015-05-31 09:15:51 +03:00
Antoine Pitrou	873e0df946	Fix some compilation warnings when using gcc (-Wmaybe-uninitialized).	2015-05-19 21:06:04 +02:00
Antoine Pitrou	f6d1f1fa8a	Fix some compilation warnings when using gcc (-Wmaybe-uninitialized).	2015-05-19 21:04:33 +02:00
Serhiy Storchaka	0d4df752ac	Issue #15027 : The UTF-32 encoder is now 3x to 7x faster.	2015-05-12 23:12:45 +03:00
Serhiy Storchaka	7e9d1d1a1b	Issue #23908 : os functions now reject paths with embedded null character on Windows instead of silently truncating them. Removed no longer used _PyUnicode_HasNULChars().	2015-04-20 10:12:28 +03:00
Serhiy Storchaka	1009bf18b3	Issue #23501 : Argument Clinic now generates code into separate files by default.	2015-04-03 23:53:51 +03:00
Victor Stinner	1912b39def	_PyUnicodeWriter_WriteStr() now checks that the input string is consistent in debug mode to detect bugs earlier. _PyUnicodeWriter_Finish() doesn't check if the read only string is consistent, whereas it does check consistency for strings built by itself.	2015-03-26 09:37:23 +01:00
Serhiy Storchaka	d9d769fcdd	Issue #23573 : Increased performance of string search operations (str.find, str.index, str.count, the in operator, str.split, str.partition) with arguments of different kinds (UCS1, UCS2, UCS4).	2015-03-24 21:55:47 +02:00
Victor Stinner	f50e187724	Fix compiler warnings: comparison between signed and unsigned numbers	2015-03-20 11:32:24 +01:00
Victor Stinner	0c39b1b970	Initialize variables to prevent GCC warnings	2015-03-18 15:02:06 +01:00
Benjamin Peterson	e5a853c390	use PyMem_NEW to detect overflow (closes #23362 )	2015-03-02 13:23:25 -05:00
Steve Dower	3e96f324dc	Issue #23451 : Update pyconfig.h for Windows to require Vista headers and remove unnecessary version checks.	2015-03-02 08:01:10 -08:00
Serhiy Storchaka	78a8249127	Issue #23490 : Fixed possible crashes related to interoperability between old-style and new API for string with 2**30-1 characters.	2015-02-20 21:34:39 +02:00
Serhiy Storchaka	e55181f517	Issue #23490 : Fixed possible crashes related to interoperability between old-style and new API for string with 2**30-1 characters.	2015-02-20 21:34:06 +02:00
Serhiy Storchaka	4d0d982985	Issue #23446 : Use PyMem_New instead of PyMem_Malloc to avoid possible integer overflows. Added a few missing PyErr_NoMemory() calls.	2015-02-16 13:33:32 +02:00
Serhiy Storchaka	1a1ff29659	Issue #23446 : Use PyMem_New instead of PyMem_Malloc to avoid possible integer overflows. Added a few missing PyErr_NoMemory() calls.	2015-02-16 13:28:22 +02:00
Serhiy Storchaka	4dbc305002	Issue #23055 : Fixed a buffer overflow in PyUnicode_FromFormatV. Analysis and fix by Guido Vranken.	2015-01-27 22:18:46 +02:00
Victor Stinner	29dacf2e97	Issue #15859 : PyUnicode_EncodeFSDefault(), PyUnicode_EncodeMBCS() and PyUnicode_EncodeCodePage() now raise an exception if the object is not a Unicode object. For PyUnicode_EncodeFSDefault(), it was already the case on platforms other than Windows. Patch written by Campbell Barton.	2015-01-26 16:41:32 +01:00
Serhiy Storchaka	bbd3aa8ece	Issue #23321 : Fixed a crash in str.decode() when error handler returned replacement string longer than malformed input data.	2015-01-26 01:24:31 +02:00


1644 Commits