cpython

Commit Graph

Author	SHA1	Message	Date
Nick Coghlan	573b1fd779	Fix str docstring	2012-08-16 14:13:07 +10:00
Antoine Pitrou	b4bbee25b1	Issue #14579: Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling. Patch by Serhiy Storchaka.	2012-07-21 00:45:14 +02:00
Mark Dickinson	01ac8b6ab1	Use correct types for ASCII_CHAR_MASK integer constants.	2012-07-07 14:08:48 +02:00
Antoine Pitrou	aaefac76dd	Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels. Patch by Serhiy Storchaka.	2012-06-16 22:48:21 +02:00
Victor Stinner	f185226244	_copy_characters(): move debug code at the top to avoid noisy #ifdef And don't use assert() anymore if check_maxchar is set: return -1 on error instead.	2012-06-16 16:38:26 +02:00
Victor Stinner	07621338fb	Fix PyUnicode_GetSize(): Don't replace _PyUnicode_Ready() exception	2012-06-16 04:53:46 +02:00
Victor Stinner	8a8b3eaabe	Fix a compiler warning in _copy_characters() and remove debug code	2012-06-16 04:53:25 +02:00
Victor Stinner	24e403bbee	Oops, fix my previous change on _copy_characters()	2012-06-16 04:53:00 +02:00
Victor Stinner	ca439eecea	Fix unicode_adjust_maxchar(): catch PyUnicode_New() failure	2012-06-16 03:17:34 +02:00
Victor Stinner	184252ad3f	Fix "%f" format of str%args if the result is not an ASCII or latin1 string	2012-06-16 02:57:41 +02:00
Victor Stinner	9a77770add	Remove debug code	2012-06-16 02:44:43 +02:00
Victor Stinner	c9d369f1bf	Optimize _PyUnicode_FastCopyCharacters() when maxchar(from) > maxchar(to)	2012-06-16 02:22:37 +02:00
Victor Stinner	f05e17ece9	unicodeobject.c: Remove debug code	2012-06-16 01:53:04 +02:00
Antoine Pitrou	27f6a3b0bf	Issue #15026: utf-16 encoding is now significantly faster (up to 10x). Patch by Serhiy Storchaka.	2012-06-15 22:15:23 +02:00
Kristján Valur Jónsson	55e5dc8371	Rearrange code to beat an optimizer bug affecting Release x64 on windows with VS2010sp1	2012-06-06 21:58:08 +00:00
Victor Stinner	d7b7c7472b	Issue #14993: Use standard "unsigned char" instead of a unsigned char bitfield	2012-06-04 22:52:12 +02:00
Kristjan Valur Jonsson	85634d7a2e	Issue #14909: A number of places were using PyMem_Realloc() apis and PyObject_GC_Resize() with incorrect error handling. In case of errors, the original object would be leaked. This checkin fixes those cases.	2012-05-31 09:37:31 +00:00
Victor Stinner	3a7d096f2f	Issue #14744: Fix compilation on Windows (part 2)	2012-05-29 18:53:56 +02:00
Victor Stinner	d3f0882dfb	Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args) * Formatting string, int, float and complex use the _PyUnicodeWriter API. It avoids a temporary buffer in most cases. * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just keep a reference to the string if the output is only composed of one string * Disable overallocation when formatting the last argument of str%args and str.format(args) * Overallocation allocates at least 100 characters: add min_length attribute to the _PyUnicodeWriter structure * Add new private functions: _PyUnicode_FastCopyCharacters(), _PyUnicode_FastFill() and _PyUnicode_FromASCII() The speed up is around 20% on average.	2012-05-29 12:57:52 +02:00
Antoine Pitrou	63065d761e	Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs. Patch by Serhiy Storchaka.	2012-05-15 23:48:04 +02:00
Martin v. Löwis	b05c0738d8	Silence VS 2010 signed/unsigned warnings.	2012-05-15 13:45:49 +02:00
Antoine Pitrou	758153badb	Fix refleaks introduced by 83da67651687.	2012-05-12 15:51:51 +02:00
Antoine Pitrou	e45c0c5cef	Fix logic error introduced by 83da67651687.	2012-05-12 15:49:07 +02:00
Benjamin Peterson	1ff2e35e84	simplify by shortcutting when the kind of the needle is larger than the haystack	2012-05-11 17:41:20 -05:00
Antoine Pitrou	ca5f91b888	Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka.	2012-05-10 16:36:02 +02:00
Victor Stinner	3b1a74a9c3	Rename unicode_write_t structure and its methods to "_PyUnicodeWriter"	2012-05-09 22:25:00 +02:00
Victor Stinner	ee4544c920	Issue #14744: Inline unicode_writer_write_char() and unicode_write_str() Optimize also PyUnicode_Format(): call unicode_writer_prepare() only once per argument.	2012-05-09 22:24:08 +02:00
Victor Stinner	f59c28c930	unicode_writer_finish() checks string consistency	2012-05-09 03:24:14 +02:00
Victor Stinner	106802547c	Backout ab500b297900: the check for integer overflow is wrong Issue #14716: Change integer overflow check in unicode_writer_prepare() to compute the limit at compile time instead of runtime. Patch written by Serhiy Storchaka.	2012-05-07 23:50:05 +02:00
Victor Stinner	0576f9b4cf	Issue #14716: Change integer overflow check in unicode_writer_prepare() to compute the limit at compile time instead of runtime. Patch written by Serhiy Storchaka.	2012-05-07 13:02:44 +02:00
Victor Stinner	202fdca133	Close #14716: str.format() now uses the new "unicode writer" API instead of the PyAccu API. For example, it makes str.format() from 25% to 30% faster on Linux.	2012-05-07 12:47:02 +02:00
Mark Dickinson	99e2e5552a	Issue #14700: Fix two broken and undefined-behaviour-inducing overflow checks in old-style string formatting. Thanks Serhiy Storchaka for report and original patch.	2012-05-07 11:20:50 +01:00
Victor Stinner	d0dba6eee8	unicode_writer: don't force inline when it is not necessary Keep inline for performance critical functions (functions used in loops)	2012-05-04 01:19:15 +02:00
Benjamin Peterson	b63f49f2b4	if the kind of the string to count is larger than the string to search, shortcut to 0	2012-05-03 18:31:07 -04:00
Victor Stinner	a7b654be30	unicode_writer: add finish() method and assertions to write_str() method * The write_str() method does nothing if the length is zero. * Replace "struct unicode_writer_t" with "unicode_writer_t"	2012-05-03 23:58:55 +02:00
Victor Stinner	bf4e266397	Issue #14687: Remove redundant length attribute of unicode_write_t The length can be read directly from the buffer	2012-05-03 19:27:14 +02:00
Victor Stinner	7989157e49	Issue #14687: Cleanup unicode_writer_prepare() "Inline" PyUnicode_Resize(): call directly resize_compact()	2012-05-03 13:43:07 +02:00
Victor Stinner	f2c76aa6cb	Issue #14687: str%tuple now uses an optimistic "unicode writer" instead of an accumulator. Directly write characters into the output (don't use a temporary list): resize and widen the string on demand.	2012-05-03 13:10:40 +02:00
Victor Stinner	1b487b467b	Issue #14624, #14687: Optimize unicode_widen() Don't convert uninitialized characters. Patch written by Serhiy Storchaka.	2012-05-03 12:29:04 +02:00
Victor Stinner	3a7f7977f1	Remove buggy assertion in PyUnicode_Substring() Use also directly unicode_empty, instead of PyUnicode_New(0,0).	2012-05-03 03:36:40 +02:00
Victor Stinner	684d5fd420	Fix PyUnicode_Substring() for start >= length and start > end Remove the fast-path for 1-character string: unicode_fromascii() and _PyUnicode_FromUCS*() now have their own fast-path for 1-character strings.	2012-05-03 02:32:34 +02:00
Victor Stinner	b6cd014d75	Unicode: optimize creating of 1-character strings	2012-05-03 02:17:04 +02:00
Victor Stinner	bff7c96834	Issue #14687: Optimize str%tuple for the "%(name)s" syntax Avoid a useless and expensive call to PyUnicode_READ().	2012-05-03 01:44:59 +02:00
Victor Stinner	e6abb488c9	unicodeobject.c: Add MAX_MAXCHAR() macro to (micro-)optimize the computation of the second argument of PyUnicode_New(). * Create also align_maxchar() function * Optimize fix_decimal_and_space_to_ascii(): don't compute the maximum character when ch <= 127 (it is ASCII)	2012-05-02 01:15:40 +02:00
Victor Stinner	438106b66e	Issue #14687: Cleanup PyUnicode_Format()	2012-05-02 00:41:57 +02:00
Victor Stinner	b5c3ea3af3	Issue #14687: Optimize str%args * formatfloat() uses unicode_fromascii() instead of PyUnicode_DecodeASCII() to not have to check characters, we know that it is really ASCII * Use PyUnicode_FromOrdinal() instead of _PyUnicode_FromUCS4() to format a character: it avoids a call to ucs4lib_find_max_char() to compute the maximum character (whereas we already know it, it is just the character itself)	2012-05-02 00:29:36 +02:00
Victor Stinner	b80e46eca4	Issue #14687: Avoid a useless duplicated string in PyUnicode_Format()	2012-04-30 05:21:52 +02:00
Victor Stinner	aff3cc659b	Issue #14687: Cleanup PyUnicode_Format()	2012-04-30 05:19:21 +02:00
Victor Stinner	b11d91d969	Fix my previous commit: bool is a long, restore the special case for bool	2012-04-28 00:25:34 +02:00
Victor Stinner	d0880d57b0	Simplify and optimize formatlong() * Remove _PyBytes_FormatLong(): inline it into formatlong() * the input type is always a long, so remove the code for bool * don't duplicate the string if the length does not change * Use PyUnicode_DATA() instead of _PyUnicode_AsString()	2012-04-27 23:40:13 +02:00
Victor Stinner	94d558b063	Optimize _PyUnicode_FindMaxChar() find pure ASCII strings	2012-04-27 22:26:58 +02:00
Victor Stinner	8f825060f1	Check newly created consistency using _PyUnicode_CheckConsistency(str, 1) * In debug mode, fill the string data with invalid characters * Simplify also reference counting in PyCodec_BackslashReplaceErrors() and PyCodec_XMLCharRefReplaceError()	2012-04-27 13:55:39 +02:00
Victor Stinner	718fbf078c	_PyUnicode_CheckConsistency() ensures that the unicode string ends with a null character	2012-04-26 00:39:37 +02:00
Benjamin Peterson	b9f4c9daad	make pointer arith c89	2012-04-23 21:45:40 -04:00
Benjamin Peterson	f3b7d86e25	use correct base ptr	2012-04-23 18:07:01 -04:00
Benjamin Peterson	2844a7a6d3	simplify and reformat	2012-04-23 18:00:25 -04:00
Victor Stinner	ece58deb9f	Close #14648: Compute correctly maxchar in str.format() for substrin	2012-04-23 23:36:38 +02:00
Benjamin Peterson	64ed576de8	merge 3.2 (#14509)	2012-04-09 15:04:39 -04:00
Benjamin Peterson	ca819c3c9d	merge 3.1 (#14509)	2012-04-09 15:01:02 -04:00
Benjamin Peterson	f6622c8a3e	fix build without Py_DEBUG and DNDEBUG (closes #14509)	2012-04-09 14:53:07 -04:00
Victor Stinner	afb5205c48	Close #14249: Use bit shifts instead of a union, it's more efficient. Patch written by Serhiy Storchaka	2012-04-05 22:54:49 +02:00
Victor Stinner	e7eee01f36	Close #14249: Use a union instead of a long to short pointer to avoid aliasing issue. Speed up UTF-16 by 20%.	2012-04-05 13:44:34 +02:00
Antoine Pitrou	a701388de1	Rename _PyIter_GetBuiltin to _PyObject_GetBuiltin, and do not include it in the stable ABI.	2012-04-05 00:04:20 +02:00
Kristján Valur Jónsson	31668b8f7a	Issue #14288: Serialization support for builtin iterators.	2012-04-03 10:49:41 +00:00
Benjamin Peterson	0df542985a	grammar	2012-03-26 14:50:32 -04:00
Benjamin Peterson	c067d6661f	merge 3.2	2012-03-25 22:41:16 -04:00
Benjamin Peterson	a8755c586e	kill this terribly outdated comment	2012-03-25 22:40:54 -04:00
Victor Stinner	0d03478b88	Remove an unused variable	2012-03-06 02:06:01 +01:00
Victor Stinner	c9590ad745	Close #14085: remove assertions from PyUnicode_WRITE macro Add checks in PyUnicode_WriteChar() and convert PyUnicode_New() assertion to a test raising a Python exception.	2012-03-04 01:34:37 +01:00
Ezio Melotti	cda6b6d60d	#14081: The sep and maxsplit parameter to str.split, bytes.split, and bytearray.split may now be passed as keyword arguments.	2012-02-26 09:39:55 +02:00
Victor Stinner	b0800dc53b	Oops, revert unwanted changes	2012-02-25 00:47:08 +01:00
Victor Stinner	abc649ddbe	Issue #14107: fix bigmem tests on str.capitalize(), str.swapcase() and str.title(). Compute correctly how much memory is required for the test (memuse).	2012-02-25 00:43:27 +01:00
Antoine Pitrou	842c0f17eb	Fix compilation error under Windows (and warnings too).	2012-02-24 13:30:46 +01:00
Victor Stinner	90f50d4df9	Issue #13706: Fix format(float, "n") for locale with non-ASCII decimal point (e.g. ps_aF)	2012-02-24 01:44:47 +01:00
Victor Stinner	41a863cb81	Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator * Decode thousands separator and decimal point using PyUnicode_DecodeLocale() (from the locale encoding), instead of decoding them implicitly from latin1 * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum character if unicode is NULL * Replace MIN/MAX macros by Py_MIN/Py_MAX * stringlib/undef.h undefines STRINGLIB_IS_UNICODE * stringlib/localeutil.h only supports Unicode	2012-02-24 00:37:51 +01:00
Victor Stinner	b429d3b09c	Fix doc of an internal function: unicode_write_cstr()	2012-02-22 21:22:20 +01:00
Antoine Pitrou	ba6bafcfbe	Fix compile failure under Windows	2012-02-22 16:41:50 +01:00
Victor Stinner	c516610f0b	Optimize str%arg for number formats: %i, %d, %u, %x, %p Write a specialized function to write an ASCII/latin1 C char* string into a Python Unicode string.	2012-02-22 13:55:02 +01:00
Victor Stinner	99d7ad0bb0	Micro-optimize computation of maxchar in PyUnicode_TransformDecimalToASCII()	2012-02-22 13:37:39 +01:00
Victor Stinner	da79e632c4	Micro-optimize unicode_expandtabs(): use FILL() macro to write N spaces	2012-02-22 13:37:04 +01:00
Victor Stinner	15e9ed299c	PyUnicode_New() and unicode_putchar() check for MAX_UNICODE maximum (U+10FFFF)	2012-02-22 13:36:20 +01:00
Benjamin Peterson	d9a3591ed1	merge 3.2	2012-02-21 11:12:14 -05:00
Benjamin Peterson	e249dcab7a	merge 3.2	2012-02-21 11:09:13 -05:00
Benjamin Peterson	69e9727657	ensure no one tries to hash things before the random seed is found	2012-02-21 11:08:50 -05:00
Georg Brandl	16fa2a1097	Forgot the "empty string -> hash == 0" special case for strings.	2012-02-21 00:50:13 +01:00
Georg Brandl	2fb477c0f0	Merge 3.2: Issue #13703 plus some related test suite fixes.	2012-02-21 00:33:36 +01:00
Georg Brandl	09a7c72cad	Merge from 3.1: Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime) in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated. The environment variable PYTHONHASHSEED and the new command line flag -R control this behavior.	2012-02-20 21:31:46 +01:00
Georg Brandl	2daf6ae249	Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime) in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated. The environment variable PYTHONHASHSEED and the new command line flag -R control this behavior.	2012-02-20 19:54:16 +01:00
Victor Stinner	c3a6b02d70	(Merge 3.2) Issue #13913: normalize utf-8 codec name in UTF-8 decoder	2012-02-14 01:18:10 +01:00
Victor Stinner	cbe01342bc	Issue #13913: normalize utf-8 codec name in UTF-8 decoder	2012-02-14 01:17:45 +01:00
Victor Stinner	d1cd99b533	Backout d2c1521ad0a1: _Py_IDENTIFIER() uses UTF-8 again	2012-02-07 23:05:55 +01:00
Victor Stinner	d446d8e09a	_Py_Identifier are always ASCII strings	2012-02-05 01:45:45 +01:00
Antoine Pitrou	7ab4af0427	Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name. Patch by Hynek Schlawack.	2012-01-29 18:43:36 +01:00
Antoine Pitrou	1334884ff2	Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name. Patch by Hynek Schlawack.	2012-01-29 18:36:34 +01:00
Benjamin Peterson	eea4846d23	don't ready in case_operation, since most callers do it themselves	2012-01-16 14:28:50 -05:00
Gregory P. Smith	f5b62a9b31	Consolidate the occurrences of the prime used as the multiplier when hashing.	2012-01-14 15:45:13 -08:00
Gregory P. Smith	63e6c3222f	Consolidate the occurrences of the prime used as the multiplier when hashing to a single #define instead of having several copies in several files. This excludes the Modules/ tree (datetime and expat both have a copy for their own purposes with no need for it to be the same).	2012-01-14 15:31:34 -08:00
Benjamin Peterson	c8d8b8861e	fix possible refleaks if PyUnicode_READY fails	2012-01-14 13:37:31 -05:00
Benjamin Peterson	bac79498c8	always explicitly check for -1 from PyUnicode_READY	2012-01-14 13:34:47 -05:00
Benjamin Peterson	d5890c8db5	add str.casefold() (closes #13752)	2012-01-14 13:23:30 -05:00

992 Commits