Commit Graph

4593 Commits

Author SHA1 Message Date
Ezio Melotti 5263c13801 Merge removal of trailing whitespace from 3.3. 2013-04-21 04:08:18 +03:00
Ezio Melotti 6b02772c13 Remove trailing whitespace. 2013-04-21 04:07:51 +03:00
Victor Stinner bb4503f61e Partial revert of changeset 9744b2df134c
PyUnicode_Append() cannot call directly resize_compact(): I forgot that a
string can be ready *and* not compact (a legacy string can also be ready).
2013-04-18 09:41:34 +02:00
Victor Stinner fb161b1b6d Split PyUnicode_DecodeCharmap() into subfunction for readability 2013-04-18 01:44:27 +02:00
Victor Stinner 170ca6f84b Fix bug in Unicode decoders related to _PyUnicodeWriter
Bug introduced by changesets 7ed9993d53b4 and edf029fc9591.
2013-04-18 00:25:28 +02:00
Victor Stinner 376cfa122d Fix typo in unicode_decode_call_errorhandler_writer()
Bug introduced by changeset 7ed9993d53b4.
2013-04-17 23:58:16 +02:00
Victor Stinner 8f674ccd64 Close #17694: Add minimum length to _PyUnicodeWriter
* Add also min_char attribute to _PyUnicodeWriter structure (currently unused)
 * _PyUnicodeWriter_Init() has no more argument (except the writer itself):
   min_length and overallocate must be set explicitly
 * In error handlers, only enable overallocation if the replacement string
   is longer than 1 character
 * CJK decoders don't use overallocation anymore
 * Set min_length, instead of preallocating memory using
   _PyUnicodeWriter_Prepare(), in many decoders
 * _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
2013-04-17 23:02:17 +02:00
Victor Stinner 77282cb4f8 Cleanup PyUnicode_Contains()
* No need to double-check that strings are ready: test already done by
   PyUnicode_FromObject()
 * Remove useless kind variable (use kind1 instead)
2013-04-14 19:22:47 +02:00
Victor Stinner d92e078c8d Minor change: fix character in do_strip() for the ASCII case 2013-04-14 19:17:42 +02:00
Victor Stinner f033510fee Cleanup PyUnicode_Append()
* Check also that right is a Unicode object
 * call directly resize_compact() instead of unicode_resize() for a more
   explicit error handling, and to avoid testing some properties twice
   (ex: unicode_modifiable())
2013-04-14 19:13:03 +02:00
Victor Stinner 4560f9c63f PyUnicode_Join(): move use_memcpy test out of the loop to cleanup and optimize the code 2013-04-14 18:56:46 +02:00
Victor Stinner 55c08781e8 Optimize repr(str): use _PyUnicode_FastCopyCharacters() when no character is escaped 2013-04-14 18:45:39 +02:00
Victor Stinner af03757d20 Optimize ascii(str): don't encode/decode repr if repr is already ASCII 2013-04-14 18:44:10 +02:00
Victor Stinner 76b3b2726c stringlib: remove unused STRINGLIB_RESIZE macro 2013-04-14 16:29:09 +02:00
Victor Stinner 8a1a6cffd6 Add _PyUnicodeWriter_WriteCharInline() 2013-04-14 02:35:33 +02:00
Serhiy Storchaka e2cef885a2 Issue #16061: Speed up str.replace() for replacing 1-character strings. 2013-04-13 22:45:04 +03:00
Mark Dickinson 93196eb44f Issue #17715: Merge fix from 3.3. 2013-04-13 17:46:04 +01:00
Mark Dickinson c9734484ca Issue #17715: Add missing NULL Check to PyNumber_Long. 2013-04-13 17:44:44 +01:00
Mark Dickinson 556e94b8fe Issue #17643: Add __callback__ attribute to weakref.ref. 2013-04-13 15:45:44 +01:00
Mark Dickinson 548677bb8c Issue #16447: Merge fix from 3.3. 2013-04-13 15:30:16 +01:00
Mark Dickinson 64aafeb4de Issue #16447: Fix potential segfault when setting __name__ on a class. 2013-04-13 15:26:58 +01:00
Victor Stinner a0dd0213cc Close #17693: Rewrite CJK decoders to use the _PyUnicodeWriter API instead of
the legacy Py_UNICODE API.

Add also a new _PyUnicodeWriter_WriteChar() function.
2013-04-11 22:09:04 +02:00
Antoine Pitrou dc040f099d Fix supernumerary 's' in sys._debugmallocstats() output. 2013-04-11 21:02:20 +02:00
Antoine Pitrou 36b045f4db Fix supernumerary 's' in sys._debugmallocstats() output. 2013-04-11 21:01:40 +02:00
Benjamin Peterson 34ad84d80a merge 3.3 (#17669) 2013-04-10 17:01:38 -04:00
Benjamin Peterson c9314d9e08 don't run frame if it has no stack (closes #17669) 2013-04-10 17:00:56 -04:00
Victor Stinner 247109e74d Issue #17615: On Windows (VS2010), Performances of wmemcmp() to compare Unicode
strings are not convincing. For UCS2 (16-bit wchar_t type), use a dummy loop
instead of wmemcmp(). The dummy loop is as fast, or a little bit faster.

wchar_t is only 16-bit long on Windows. wmemcmp() is still used for 32-bit
wchar_t.
2013-04-09 23:53:26 +02:00
Victor Stinner 0cff4b16d9 replace(): only call PyUnicode_DATA(u) once 2013-04-09 22:52:48 +02:00
Victor Stinner cc7af72192 Write super-fast version of str.strip(), str.lstrip() and str.rstrip() for pure ASCII 2013-04-09 22:39:24 +02:00
Victor Stinner f50a4e9bc9 Don't calls macros in PyUnicode_WRITE() parameters
PyUnicode_WRITE() expands some parameters twice or more.
2013-04-09 22:38:52 +02:00
Victor Stinner 9c79e41fc5 Fix do_strip(): don't call PyUnicode_READ() in Py_UNICODE_ISSPACE() to not call
it twice
2013-04-09 22:21:08 +02:00
Victor Stinner b3a6014504 Fix _PyUnicode_XStrip()
Inline the BLOOM_MEMBER() to only call PyUnicode_READ() only once (per loop
iteration). Store also the length of the seperator in a variable to avoid calls
to PyUnicode_GET_LENGTH().
2013-04-09 22:19:21 +02:00
Victor Stinner 63d5c1a14a Optimize PyUnicode_DecodeCharmap()
Avoid expensive PyUnicode_READ() and PyUnicode_WRITE(), manipulate pointers
instead.
2013-04-09 22:13:33 +02:00
Victor Stinner a85af502a4 Optimize make_bloom_mask(), used by str.strip(), str.lstrip() and str.rstrip()
Write specialized functions per Unicode kind to avoid the expensive
PyUnicode_READ() macro.
2013-04-09 21:53:54 +02:00
Victor Stinner 69ed0f4c86 Use PyUnicode_READ() instead of PyUnicode_READ_CHAR()
"PyUnicode_READ_CHAR() is less efficient than PyUnicode_READ() because it calls
PyUnicode_KIND() and might call it twice." according to its documentation.
2013-04-09 21:48:24 +02:00
Victor Stinner 03c3e35d42 Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:
cp037, cp500 and iso8859_1 codecs
2013-04-09 21:53:09 +02:00
Victor Stinner cd777eaf53 Issue #17615: Comparing two Unicode strings now uses wmemcmp() when possible
wmemcmp() is twice faster than a dummy loop (342 usec vs 744 usec) on Fedora
18/x86_64, GCC 4.7.2.
2013-04-08 22:43:44 +02:00
Victor Stinner c1302bba4c Issue #17615: Expand expensive PyUnicode_READ() macro in unicode_compare():
write specialized functions for each combination of Unicode kinds.
2013-04-08 21:50:54 +02:00
Victor Stinner 7efa3b8242 Close #13126: "Simplify" FASTSEARCH() code to help the compiler to emit more
efficient machine code. Patch written by Antoine Pitrou.

Without this change, str.find() was 10% slower than str.rfind() in the worst
case.
2013-04-08 00:26:43 +02:00
Serhiy Storchaka ee57f159af Revert a premature patch for issue #14010 (changeset 846bd418aee5). 2013-04-06 22:55:12 +03:00
Serhiy Storchaka 278d03bd66 Revert a premature patch for issue #14010 (changeset aaaf36026511). 2013-04-06 22:52:34 +03:00
Serhiy Storchaka aac81e2780 Issue #14010: Fix a crash when iterating or deleting deeply nested filters
(builting and in itertools module, i.e. map(), itertools.chain(), etc).
2013-04-06 21:20:30 +03:00
Serhiy Storchaka e8f706eda7 Issue #14010: Fix a crash when iterating or deleting deeply nested filters
(builting and in itertools module, i.e. map(), itertools.chain(), etc).
2013-04-06 21:14:43 +03:00
Antoine Pitrou 0aaaa62200 Issue #17469: Fix _Py_GetAllocatedBlocks() and sys.getallocatedblocks() when running on valgrind. 2013-04-06 01:15:30 +02:00
Victor Stinner 207dd38726 fix unused variable 2013-04-03 03:14:58 +02:00
Victor Stinner eb4b5ac8af Close #16757: Avoid calling the expensive _PyUnicode_FindMaxChar() function
when possible
2013-04-03 02:02:33 +02:00
Victor Stinner cfc4c13b04 Add _PyUnicodeWriter_WriteSubstring() function
Write a function to enable more optimizations:

 * If the substring is the whole string and overallocation is disabled, just
   keep a reference to the string, don't copy characters
 * Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
   possible
2013-04-03 01:48:39 +02:00
Benjamin Peterson d3f41fe121 merge 3.3 (#17610) 2013-04-01 17:43:30 -04:00
Benjamin Peterson 6395241471 list slotdefs in offset order rather than sorting them (closes #17610)
This means we can remove our usage of qsort() than relied on undefined behavior.
2013-04-01 17:41:41 -04:00
Antoine Pitrou 7faf70512a Issue #17591: Use lowercase filenames when including Windows header files.
Patch by Roumen Petrov.
2013-03-31 22:48:04 +02:00