Commit Graph

5002 Commits

Author SHA1 Message Date
Benjamin Peterson 42124a727d initialize map/filter/zip in _PyBuiltin_Init rather than the catch-all function 2012-10-30 23:41:54 -04:00
Benjamin Peterson 7ff2094bc7 merge 3.3 (#16369) 2012-10-30 23:31:12 -04:00
Benjamin Peterson e8ea97fffb merge 3.2 (#16369) 2012-10-30 23:27:52 -04:00
Benjamin Peterson c43112823b initialize more global type objects (closes #16369) 2012-10-30 23:21:10 -04:00
Victor Stinner 7a6d7cf3db Issue #9566: Use the right type to fix a compiler warnings on Win64 2012-10-31 00:37:41 +01:00
Victor Stinner 4ca1cf35fb Issue #16086: PyTypeObject.tp_flags and PyType_Spec.flags are now unsigned
... (unsigned long and unsigned int) to avoid an undefined behaviour with
Py_TPFLAGS_TYPE_SUBCLASS ((1 << 31). PyType_GetFlags() result type is now
unsigned too (unsigned long, instead of long).
2012-10-30 23:40:45 +01:00
Victor Stinner e64322e034 Close #14625: Rewrite the UTF-32 decoder. It is now 3x to 4x faster
Patch written by Serhiy Storchaka.
2012-10-30 23:12:47 +01:00
Victor Stinner 76df43de30 Issue #16330: Use surrogate-related macros
Patch written by Serhiy Storchaka.
2012-10-30 01:42:39 +01:00
Mark Dickinson fb90c0934c Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting. 2012-10-28 10:18:03 +00:00
Victor Stinner c6cf1ba29e Replace usage of the deprecated Py_UNICODE_COPY() with Py_MEMCPY() in resize_copy() 2012-10-23 02:54:47 +02:00
Victor Stinner fe75fb4b3e Optimize _PyUnicode_HasNULChars(): use findchar() instead of PyUnicode_Contains() 2012-10-23 02:52:18 +02:00
Victor Stinner 6fa627578a Inline raise_translate_exception(): it is only used once 2012-10-23 02:51:50 +02:00
Victor Stinner e5567ad236 Optimize PyUnicode_RichCompare() for Py_EQ and Py_NE: always use memcmp() 2012-10-23 02:48:49 +02:00
Antoine Pitrou 6f7b0da6bc Issue #12805: Make bytes.join and bytearray.join faster when the separator is empty.
Patch by Serhiy Storchaka.
2012-10-20 23:08:34 +02:00
Mark Dickinson e453e4c007 Issue 16280: Drop questionable special-casing of null pointer in PyLong_FromVoidPtr. 2012-10-18 22:18:42 +01:00
Mark Dickinson 5cb65917e1 Issue #16277: merge fix from 3.3 2012-10-18 19:53:45 +01:00
Mark Dickinson 44362a88ad Issue #16277: merge fix from 3.2 2012-10-18 19:53:28 +01:00
Mark Dickinson 91044799f7 Issue #16277: in PyLong_FromVoidPtr, add missing branch for sizeof(void*) <= sizeof(long). 2012-10-18 19:21:43 +01:00
Christian Heimes 743e0cd6b5 Issue #16166: Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified
endianess detection and handling.
2012-10-17 23:52:17 +02:00
Eric Snow 42da889fec merge for issue #16160: Subclass support now works for types.SimpleNamespace. 2012-10-16 22:45:49 -07:00
Eric Snow 547298c94c Close #16160: Subclass support now works for types.SimpleNamespace. Thanks to RDM for noticing. 2012-10-16 22:35:38 -07:00
Antoine Pitrou cfc22b4a9b Issue #15958: bytes.join and bytearray.join now accept arbitrary buffer objects. 2012-10-16 21:07:23 +02:00
Chris Jerdonek 4a7df9aba9 Issue #14783: Merge changes from 3.3. 2012-10-07 15:02:16 -07:00
Chris Jerdonek 042fa653ab Issue #14783: Merge changes from 3.2. 2012-10-07 14:56:27 -07:00
Chris Jerdonek 83fe2e1c22 Issue #14783: Improve int() docstring and also str(), range(), and slice().
This commit rewrites the docstring for int() to incorporate the documentation
changes made in issue #16036.  It also switches the docstrings for int(),
str(), range(), and slice() to use multi-line signatures.
2012-10-07 14:48:36 -07:00
Armin Ronacher 74b38b190f Issue #16148: Small improvements and cleanup. Added version information
to docs.
2012-10-07 10:29:32 +02:00
Victor Stinner 4c63a972d1 Cleanup PyUnicode_FromFormatV() for zero padding
Skip the "0" instead of parsing it twice: detect zero padding and then parsed
as a digit of the width.
2012-10-06 23:55:33 +02:00
Victor Stinner 15a1136547 Issue #16147: PyUnicode_FromFormatV() doesn't need anymore to allocate a buffer
on the heap to format numbers.
2012-10-06 23:48:20 +02:00
Victor Stinner ff5a848db5 Issue #16147: PyUnicode_FromFormatV() now raises an error if the argument of
'%c' is not in the range(0x110000).
2012-10-06 23:05:45 +02:00
Victor Stinner 3921e90c5a Issue #16147: PyUnicode_FromFormatV() now detects integer overflow when parsing
width and precision
2012-10-06 23:05:00 +02:00
Victor Stinner e215d960be Issue #16147: Rewrite PyUnicode_FromFormatV() to use _PyUnicodeWriter API
* Simplify the code: replace 4 steps with one unique step using the
   _PyUnicodeWriter API. PyUnicode_Format() has the same design. It avoids to
   store intermediate results which require to allocate an array of pointers on
   the heap.
 * Use the _PyUnicodeWriter API for speed (and its convinient API):
   overallocate the buffer to reduce the number of "realloc()"
 * Implement "width" and "precision" in Python, don't rely on sprintf(). It
   avoids to need of a temporary buffer allocated on the heap: only use a small
   buffer allocated in the stack.
 * Add _PyUnicodeWriter_WriteCstr() function
 * Split PyUnicode_FromFormatV() into two functions: add
   unicode_fromformat_arg().
 * Inline parse_format_flags(): the format of an argument is now only parsed
   once, it's no more needed to have a subfunction.
 * Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
   search the next "%" and copy the substring in one chunk, instead of copying
   character per character.
2012-10-06 23:03:36 +02:00
Mark Dickinson cf46d62fcb Issue #16096: port fix from 3.3 2012-10-06 18:50:19 +01:00
Mark Dickinson fc9adb62fb Issue #16096: Fix signed overflow in Objects/longobject.c. Thanks Serhiy Storchaka. 2012-10-06 18:50:02 +01:00
Mark Dickinson ff9c54aca2 Issue #16096: Merge fixes from 3.3. 2012-10-06 18:05:14 +01:00
Mark Dickinson c04ddff290 Issue #16096: Fix several occurrences of potential signed integer overflow. Thanks Serhiy Storchaka. 2012-10-06 18:04:49 +01:00
Christian Heimes b70e8a1958 and another one 2012-10-06 17:16:39 +02:00
Christian Heimes 6314d164c9 move var declaration to top of block to fix compilation on Windows, fixes a7ec0a1b0f7c 2012-10-06 17:13:29 +02:00
Armin Ronacher 23c5bb4030 Fixed a missing incref introduced by a7ec0a1b0f7c 2012-10-06 14:30:32 +02:00
Armin Ronacher 226b1db0e2 Added notimplemented_dealloc for better error reporting 2012-10-06 14:28:58 +02:00
Armin Ronacher aa9a79d279 Issue #16148: implemented PEP 424 2012-10-06 14:03:24 +02:00
Victor Stinner 8c6db45d3e In debug mode, unicode_write_cstr() now checks that non-ASCII characters are
not written into an ASCII string
2012-10-06 00:40:45 +02:00
Ezio Melotti 080a2c087e #16127: merge with 3.3. 2012-10-05 03:34:02 +03:00
Ezio Melotti e7f90375b1 #16127: remove outdated references to narrow builds. Patch by Serhiy Storchaka. 2012-10-05 03:33:31 +03:00
Victor Stinner 1929407406 Fix PyUnicode_Format(): return NULL if PyUnicode_READY(uformat) failed
This error cannot occur in practice: PyUnicode_FromObject() always return
a "ready" string.
2012-10-05 00:09:33 +02:00
Victor Stinner 770e19e0cc Optimize unicode_compare(): use memcmp() when comparing two UCS1 strings 2012-10-04 22:59:45 +02:00
Victor Stinner 90db9c47dc Enable also ptr==ptr optimization in PyUnicode_Compare()
It was already implemented in PyUnicode_RichCompare()
2012-10-04 21:53:50 +02:00
Victor Stinner 9cc98c93a7 long_to_decimal_string_internal() doesn't need to write the final NULL character 2012-10-04 02:43:02 +02:00
Victor Stinner aa7712711d unicode_result_wchar(): move the assert() to the "#ifdef Py_DEBUG" block 2012-10-04 02:32:58 +02:00
Victor Stinner a4708231e6 Split the huge PyUnicode_Format() function (+540 lines) into subfunctions 2012-10-04 02:19:54 +02:00
Victor Stinner a049443fab PyUnicode_Format(): disable overallocation when we are writing the last part
of the output string
2012-10-03 23:03:46 +02:00
Victor Stinner afffce489b Unicode: resize_compact() and resize_inplace() fills also the Unicode strings
with invalid bytes in debug mode, as done by PyUnicode_New()
2012-10-03 23:03:17 +02:00
Victor Stinner c89d28fdfc Issue #15609: Fix refleak introduced by my last optimization 2012-10-02 12:54:07 +02:00
Victor Stinner 621ef3d84f Issue #15609: Optimize str%args for integer argument
- Use _PyLong_FormatWriter() instead of formatlong() when possible, to avoid
   a temporary buffer
 - Enable the fast path when width is smaller or equals to the length,
   and when the precision is bigger or equals to the length
 - Add unit tests!
 - formatlong() uses PyUnicode_Resize() instead of _PyUnicode_FromASCII()
   to resize the output string
2012-10-02 00:33:47 +02:00
Benjamin Peterson b8350f1c7d upgrade to UCD 6.2 2012-09-29 13:47:39 -04:00
Ezio Melotti 0e1af282b8 Fix typo. 2012-09-28 16:43:40 +03:00
Mark Dickinson 7c95bb35e4 Issue #16060: Fix a double DECREF in int() implementation. Thanks Serhiy Storchaka. 2012-09-27 19:38:59 +01:00
Antoine Pitrou a1f7655fa7 Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
Patch by Serhiy Storchaka.
2012-09-23 20:00:04 +02:00
Antoine Pitrou 6f80f5d444 Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
Patch by Serhiy Storchaka.
2012-09-23 19:55:21 +02:00
Mark Dickinson 5710c2a3e8 Issue 15959: Merge from 3.2. 2012-09-20 21:30:34 +01:00
Mark Dickinson c286e58044 Issue 15959: Fix type mismatch for quick{_neg}_int_allocs. Thanks Serhiy Storchaka. 2012-09-20 21:29:28 +01:00
Antoine Pitrou ca8aa4acf6 Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t.
Patch by Serhiy Storchaka.
2012-09-20 20:56:47 +02:00
Trent Nelson da064d0745 Silence compiler warnings on Solaris 10 via explicit (void *) casts. 2012-09-18 22:00:25 -04:00
Trent Nelson ab02db23b1 Silence compiler warnings on Solaris 10 via explicit (void *) casts.
(Compiler: Solaris Studio 12.3)
2012-09-18 21:58:03 -04:00
Christian Heimes 7ae251a025 Fix out of bounds read in long_new() for empty bytes with an explicit base. int(b'', somebase) calls PyLong_FromString() with char* of length 1 but the function accesses the first argument at offset 1. CID 715359 2012-09-12 15:32:06 +02:00
Christian Heimes 79b97ee2ab Fix out of bounds read in long_new() for empty bytes with an explicit base. int(b'', somebase) calls PyLong_FromString() with char* of length 1 but the function accesses the first argument at offset 1. CID 715359 2012-09-12 15:31:43 +02:00
Christian Heimes 5f520f4fed Issue #15900: Fixed reference leak in PyUnicode_TranslateCharmap() 2012-09-11 14:03:25 +02:00
Christian Heimes 76c082911b Fixed memory leak in error branch of object_repr which may leak a reference to mod when type_qualname returns NULL. CID 715371 2012-09-10 17:00:30 +02:00
Christian Heimes e81dc296f2 Fixed memory leak in error branch of object_repr which may leak a reference to mod when type_name returns NULL. CID 715371 2012-09-10 16:57:36 +02:00
Christian Heimes f4f9939a96 Fixed memory leak in error branch of formatfloat(). CID 719687 2012-09-10 11:48:41 +02:00
Christian Heimes 455657961e Fixed possible reference leak to mod when type_name() returns NULL 2012-09-10 03:01:16 +02:00
Christian Heimes a0e7e41cba Fixed possible reference leak to mod when type_name() returns NULL 2012-09-10 03:00:14 +02:00
Christian Heimes c4fe3fed6e PyTuple_Pack() was missing va_end() in its error branch which lead to a resource leak. 2012-09-10 02:55:13 +02:00
Christian Heimes d5a88044a3 PyTuple_Pack() was missing va_end() in its error branch which lead to a resource leak. 2012-09-10 02:54:51 +02:00
Christian Heimes 110ac16b9f Fixed resource leak to scratch when _PyUnicodeWriter_Prepare fails 2012-09-10 02:51:27 +02:00
Christian Heimes f03572d040 Py_TYPE() has already dereferenced self before the NULL check. Moved Py_TYPE() after the check for self == NULL 2012-09-10 02:45:56 +02:00
Christian Heimes 949f331731 Py_TYPE() has already dereferenced self before the NULL check. Moved Py_TYPE() after the check for self == NULL 2012-09-10 02:45:31 +02:00
Antoine Pitrou 5b4faae307 Issue #13992: The trashcan mechanism is now thread-safe. This eliminates
sporadic crashes in multi-thread programs when several long deallocator
chains ran concurrently and involved subclasses of built-in container
types.

Note that the trashcan functions are part of the stable ABI, therefore
they have to be kept around for binary compatibility of extensions.
2012-09-06 01:17:42 +02:00
Antoine Pitrou 56cd62c04a Issue #13992: The trashcan mechanism is now thread-safe. This eliminates
sporadic crashes in multi-thread programs when several long deallocator
chains ran concurrently and involved subclasses of built-in container
types.

Because of this change, a couple extension modules compiled for 3.2.4
(those which use the trashcan mechanism, despite it being undocumented)
will not be loadable by 3.2.3 and earlier. However, extension modules
compiled for 3.2.3 and earlier will be loadable by 3.2.4.
2012-09-06 00:59:49 +02:00
Alexander Belopolsky f73c69e06f Issue #15855: added docstrings for memoryview methods and data descriptors new in 3.3. 2012-09-03 16:51:01 -04:00
Alexander Belopolsky e370c38131 Issue #15855: added docstrings for memoryview methods and data descriptors (merge 3.2). 2012-09-03 16:43:55 -04:00
Alexander Belopolsky 397e5c98bc Issue #15855: added docstrings for memoryview methods and data descriptors. 2012-09-03 16:29:11 -04:00
Antoine Pitrou 057119b0b7 Fix C++-style comment (xlc compilation failure) 2012-09-02 17:56:33 +02:00
Benjamin Peterson 6a42bd67d7 Make super() internal errors RuntimeError instead of SystemError (closes #15839) 2012-09-01 23:04:38 -04:00
Benjamin Peterson 4e07a8c9aa merge heads 2012-08-28 18:02:18 -04:00
Benjamin Peterson 59043f96ea merge 3.2 (#15801) 2012-08-28 18:01:45 -04:00
Benjamin Peterson 28a6cfaefc use the stricter PyMapping_Check (closes #15801) 2012-08-28 17:55:35 -04:00
Richard Oudkerk ea62bd50a3 Issue #15784: Modify OSError.__str__() to better distinguish between
errno error numbers and Windows error numbers.
2012-08-28 19:33:26 +01:00
Nick Coghlan 06e1ab0a6b Close #15573: use value-based memoryview comparisons (patch by Stefan Krah) 2012-08-25 17:59:50 +10:00
Brett Cannon 07c6e71689 Issue #15778: Coerce ImportError.args to a string when it isn't
already one.

Patch by Dave Malcolm.
2012-08-24 13:05:09 -04:00
Stefan Krah 5b27c53e36 Merge 3.2. 2012-08-21 08:25:41 +02:00
Stefan Krah 7cacd2eb92 Issue #15736: Fix overflow in _PySequence_BytesToCharpArray(). 2012-08-21 08:16:09 +02:00
Stefan Krah 6adf2433e4 Merge 3.2. 2012-08-20 11:13:58 +02:00
Stefan Krah fd24f9e51e Issue #15732: Fix (constructed) crash in _PySequence_BytesToCharpArray().
Found by Coverity.
2012-08-20 11:04:24 +02:00
Stefan Krah 8528c3145e Issue #15728: Fix leak in PyUnicode_AsWideCharString(). Found by Coverity. 2012-08-19 21:52:43 +02:00
Stefan Krah 7fda33b56a Mereg 3.2. 2012-08-19 11:22:28 +02:00
Stefan Krah 6b962860e2 Check for NULL return value in PyStructSequence_NewType(). Found by Coverity. 2012-08-19 11:20:41 +02:00
Nick Coghlan 0e41628d35 Merge str docstring fix from 3.2 2012-08-16 14:14:30 +10:00
Nick Coghlan 573b1fd779 Fix str docstring 2012-08-16 14:13:07 +10:00
Antoine Pitrou 721738fbee Issue #15604: Update uses of PyObject_IsTrue() to check for and handle errors correctly.
Patch by Serhiy Storchaka.
2012-08-15 23:20:39 +02:00
Antoine Pitrou 6f430e4963 Issue #15604: Update uses of PyObject_IsTrue() to check for and handle errors correctly.
Patch by Serhiy Storchaka.
2012-08-15 23:18:25 +02:00
Victor Stinner b3f5501250 Close #15534: Fix a typo in the fast search function of the string library (_s => s)
Replace _s with ptr to avoid future confusion. Add also non regression tests.
2012-08-02 23:05:01 +02:00
Richard Oudkerk 5562d9dc5d Issue #1692335: Move initial args assignment to BaseException.__new__
to help pickling of naive subclasses.
2012-07-28 17:45:28 +01:00
Stefan Krah e4c0799d9c Add unused parameter to a METH_NOARGS function. 2012-07-28 14:10:02 +02:00
Stefan Krah 7d12d9df13 Issue #12834: Fix PyBuffer_ToContiguous() for non-contiguous arrays. 2012-07-28 12:25:55 +02:00
Martin v. Löwis 3bbd2fad4d Issue #15456: Fix code __sizeof__ after #12399 change.
Patch by Serhiy Storchaka.
2012-07-26 22:23:23 +02:00
Antoine Pitrou b4bbee25b1 Issue #14579: Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling.
Patch by Serhiy Storchaka.
2012-07-21 00:45:14 +02:00
Andrew Svetlov a0364764fd Merge 3.2 2012-07-20 14:52:54 +03:00
Andrew Svetlov ddcb6206bf Issue #15404: Refleak in PyMethodObject repr. 2012-07-20 14:51:45 +03:00
Meador Inge f4cc2161d5 Issue #15394: Fix ref leaks in PyModule_Create.
Patch by Julia Lawall.
2012-07-19 13:51:59 -05:00
Meador Inge 29e49d6394 Issue #15394: Fix ref leaks in PyModule_Create.
Patch by Julia Lawall.
2012-07-19 13:45:43 -05:00
Mark Dickinson 01ac8b6ab1 Use correct types for ASCII_CHAR_MASK integer constants. 2012-07-07 14:08:48 +02:00
Antoine Pitrou f87289bb58 Issue #15229: An OSError subclass whose __init__ doesn't call back
OSError.__init__ could produce incomplete instances, leading to crashes
when calling str() on them.
2012-06-30 23:37:47 +02:00
Antoine Pitrou a504a7a7d1 Issue #15055: update dictnotes.txt. Patch by Mark Shannon. 2012-06-24 21:03:45 +02:00
Antoine Pitrou 66a3a7ed10 Try to fix crash on x86 OpenIndiana buildbot. 2012-06-24 00:42:59 +02:00
Antoine Pitrou 1351ca6e66 Replace assert() with a more informative fatal error. 2012-06-24 00:30:12 +02:00
Antoine Pitrou bb78f57c14 Use struct member (ht_type) instead of casting pointers. 2012-06-24 00:18:27 +02:00
Martin v. Löwis 9c56409d33 Issue #15146: Add PyType_FromSpecWithBases. Patch by Robin Schreiber. 2012-06-23 23:20:45 +02:00
Mark Dickinson 106c4145ff Issue #14923: Optimize continuation-byte check in UTF-8 decoding. Patch by Serhiy Storchaka. 2012-06-23 21:45:14 +01:00
Antoine Pitrou 99cc629969 Issue #15142: Fix reference leak when deallocating instances of types created using PyType_FromSpec(). 2012-06-23 14:42:38 +02:00
Antoine Pitrou a4db02c7a3 Issue #15142: Fix reference leak when deallocating instances of types created using PyType_FromSpec(). 2012-06-23 14:45:21 +02:00
David Malcolm 49526f48fc Issue #14785: Add sys._debugmallocstats() to help debug low-level memory allocation issues 2012-06-22 14:55:41 -04:00
Antoine Pitrou a759d4e9f4 Make private function static (from `make smelly`) 2012-06-21 17:26:28 +02:00
Nick Coghlan 5b0dac12b8 Issue #13783: PEP 380 cleanup part 2, using the new identifier APIs in the generator implementation 2012-06-17 15:45:11 +10:00
Nick Coghlan c40bc09942 Issue #13783: the PEP 380 implementation no longer expands the public C API 2012-06-17 15:15:49 +10:00
Antoine Pitrou aaefac76dd Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels.
Patch by Serhiy Storchaka.
2012-06-16 22:48:21 +02:00
Victor Stinner f185226244 _copy_characters(): move debug code at the top to avoid noisy #ifdef
And don't use assert() anymore if check_maxchar is set: return -1 on error
instead.
2012-06-16 16:38:26 +02:00
Victor Stinner 07621338fb Fix PyUnicode_GetSize(): Don't replace _PyUnicode_Ready() exception 2012-06-16 04:53:46 +02:00
Victor Stinner 8a8b3eaabe Fix a compiler warning in _copy_characters() and remove debug code 2012-06-16 04:53:25 +02:00
Victor Stinner 24e403bbee Oops, fix my previous change on _copy_characters() 2012-06-16 04:53:00 +02:00
Victor Stinner ca439eecea Fix unicode_adjust_maxchar(): catch PyUnicode_New() failure 2012-06-16 03:17:34 +02:00
Victor Stinner 184252ad3f Fix "%f" format of str%args if the result is not an ASCII or latin1 string 2012-06-16 02:57:41 +02:00
Victor Stinner 9a77770add Remove debug code 2012-06-16 02:44:43 +02:00
Victor Stinner c9d369f1bf Optimize _PyUnicode_FastCopyCharacters() when maxchar(from) > maxchar(to) 2012-06-16 02:22:37 +02:00
Victor Stinner f05e17ece9 unicodeobject.c: Remove debug code 2012-06-16 01:53:04 +02:00
Antoine Pitrou 27f6a3b0bf Issue #15026: utf-16 encoding is now significantly faster (up to 10x).
Patch by Serhiy Storchaka.
2012-06-15 22:15:23 +02:00
Kristján Valur Jónsson 55e5dc8371 Rearrange code to beat an optimizer bug affecting Release x64 on windows
with VS2010sp1
2012-06-06 21:58:08 +00:00
Victor Stinner d7b7c7472b Issue #14993: Use standard "unsigned char" instead of a unsigned char bitfield 2012-06-04 22:52:12 +02:00
Barry Warsaw 409da157d7 Eric Snow's implementation of PEP 421.
Issue 14673: Add sys.implementation
2012-06-03 16:18:47 -04:00
Kristjan Valur Jonsson 85634d7a2e Issue #14909: A number of places were using PyMem_Realloc() apis and
PyObject_GC_Resize() with incorrect error handling.  In case of errors,
the original object would be leaked.  This checkin fixes those cases.
2012-05-31 09:37:31 +00:00
Victor Stinner 3a7d096f2f Issue #14744: Fix compilation on Windows (part 2) 2012-05-29 18:53:56 +02:00
Victor Stinner e577ab38ea Issue #14744: Fix compilation on Windows 2012-05-29 18:51:10 +02:00
Victor Stinner d3f0882dfb Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args)
* Formatting string, int, float and complex use the _PyUnicodeWriter API. It
   avoids a temporary buffer in most cases.
 * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just
   keep a reference to the string if the output is only composed of one string
 * Disable overallocation when formatting the last argument of str%args and
   str.format(args)
 * Overallocation allocates at least 100 characters: add min_length attribute
   to the _PyUnicodeWriter structure
 * Add new private functions: _PyUnicode_FastCopyCharacters(),
   _PyUnicode_FastFill() and _PyUnicode_FromASCII()

The speed up is around 20% in average.
2012-05-29 12:57:52 +02:00
Richard Oudkerk 3e0a1eb889 Issue #14930: Make memoryview objects weakrefable. 2012-05-28 21:35:09 +01:00
Nick Coghlan 0b43bcf528 Close #14857: fix regression in references to PEP 3135 implicit __class__ closure variable. Reopens issue #12370, but also updates unittest.mock to workaround that issue 2012-05-27 18:17:07 +10:00
Larry Hastings ca28e99202 Issue #14889: PyBytes_FromObject(bytes) now just increfs and returns.
Previously, if you passed in a bytes object, it would create a whole
new object.
2012-05-24 22:58:30 -07:00
Eric V. Smith 984b11f88f issue 14660: Implement PEP 420, namespace packages. 2012-05-24 20:21:04 -04:00
Antoine Pitrou b7d033db78 Issue #14829: Fix bisect and range() indexing with large indices (>= 2 ** 32) under 64-bit Windows.
(untested, because of Windows build issues under 3.x)
2012-05-16 14:39:36 +02:00
Antoine Pitrou a103b96a80 Issue #14829: Fix bisect and range() indexing with large indices (>= 2 ** 32) under 64-bit Windows. 2012-05-16 14:37:54 +02:00
Antoine Pitrou 32bc80c523 Fix build failure. 2012-05-16 12:51:55 +02:00
Antoine Pitrou 63065d761e Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs.
Patch by Serhiy Storchaka.
2012-05-15 23:48:04 +02:00
Martin v. Löwis b05c0738d8 Silence VS 2010 signed/unsigned warnings. 2012-05-15 13:45:49 +02:00
Benjamin Peterson d5a1c44455 PEP 415: Implement suppression of __context__ display with an exception attribute
This replaces the original PEP 409 implementation. See #14133.
2012-05-14 22:09:31 -07:00
Antoine Pitrou 1b634c266c Use size_t, not ssize_t (issue #14801). 2012-05-14 14:44:37 +02:00
Antoine Pitrou a1433fed8e Remove tab characters 2012-05-14 14:43:25 +02:00
Antoine Pitrou 682d94c11a Use size_t, not ssize_t (issue #14801). 2012-05-14 14:43:03 +02:00
Antoine Pitrou 9a2349030a Issue #14417: Mutating a dict during lookup now restarts the lookup instead of raising a RuntimeError (undoes issue #14205). 2012-05-13 20:48:01 +02:00
Brian Curtin 401f9f3d32 Fix #13210. Port the Windows build from VS2008 to VS2010. 2012-05-13 11:19:23 -05:00
Antoine Pitrou 2d169b268b Make the reference counting of dictkeys objects participate in refleak hunting
(issue #13903).
2012-05-12 23:43:44 +02:00
Antoine Pitrou 758153badb Fix refleaks introduced by 83da67651687. 2012-05-12 15:51:51 +02:00
Antoine Pitrou e45c0c5cef Fix logic error introduced by 83da67651687. 2012-05-12 15:49:07 +02:00
Benjamin Peterson 1ff2e35e84 simplify by shortcutting when the kind of the needle is larger than the haystack 2012-05-11 17:41:20 -05:00
Antoine Pitrou ca5f91b888 Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka. 2012-05-10 16:36:02 +02:00
Victor Stinner 3b1a74a9c3 Rename unicode_write_t structure and its methods to "_PyUnicodeWriter" 2012-05-09 22:25:00 +02:00
Victor Stinner ee4544c920 Issue #14744: Inline unicode_writer_write_char() and unicode_write_str()
Optimize also PyUnicode_Format(): call unicode_writer_prepare() only once
per argument.
2012-05-09 22:24:08 +02:00
Victor Stinner f59c28c930 unicode_writer_finish() checks string consistency 2012-05-09 03:24:14 +02:00
Benjamin Peterson 1cffbac2cb merge 3.2 (#14752) 2012-05-08 09:22:45 -04:00
Benjamin Peterson 89a6e9a27b fix possible refleak (closes #14752) 2012-05-08 09:22:24 -04:00
Victor Stinner 106802547c Backout ab500b297900: the check for integer overflow is wrong
Issue #14716: Change integer overflow check in unicode_writer_prepare()
to compute the limit at compile time instead of runtime. Patch writen by Serhiy
Storchaka.
2012-05-07 23:50:05 +02:00
Victor Stinner 0576f9b4cf Issue #14716: Change integer overflow check in unicode_writer_prepare()
to compute the limit at compile time instead of runtime. Patch writen by Serhiy
Storchaka.
2012-05-07 13:02:44 +02:00
Victor Stinner 202fdca133 Close #14716: str.format() now uses the new "unicode writer" API instead of the
PyAccu API. For example, it makes str.format() from 25% to 30% faster on Linux.
2012-05-07 12:47:02 +02:00
Mark Dickinson 99e2e5552a Issue #14700: Fix two broken and undefined-behaviour-inducing overflow checks in old-style string formatting. Thanks Serhiy Storchaka for report and original patch. 2012-05-07 11:20:50 +01:00
Victor Stinner d0dba6eee8 unicode_writer: don't force inline when it is not necessary
Keep inline for performance critical functions (functions used in loops)
2012-05-04 01:19:15 +02:00
Benjamin Peterson 9cd8853d45 merge 3.2 (#14717) 2012-05-03 18:44:33 -04:00
Benjamin Peterson ab3da290fe close() doesn't take any args (closes #14717) 2012-05-03 18:44:09 -04:00
Benjamin Peterson b63f49f2b4 if the kind of the string to count is larger than the string to search, shortcut to 0 2012-05-03 18:31:07 -04:00
Victor Stinner a7b654be30 unicode_writer: add finish() method and assertions to write_str() method
* The write_str() method does nothing if the length is zero.
 * Replace "struct unicode_writer_t" with "unicode_writer_t"
2012-05-03 23:58:55 +02:00
Victor Stinner bf4e266397 Issue #14687: Remove redundant length attribute of unicode_write_t
The length can be read directly from the buffer
2012-05-03 19:27:14 +02:00
Victor Stinner 7989157e49 Issue #14687: Cleanup unicode_writer_prepare()
"Inline" PyUnicode_Resize(): call directly resize_compact()
2012-05-03 13:43:07 +02:00
Victor Stinner f2c76aa6cb Issue #14687: str%tuple now uses an optimistic "unicode writer" instead of an
accumulator. Directly write characters into the output (don't use a temporary
list): resize and widen the string on demand.
2012-05-03 13:10:40 +02:00
Victor Stinner 1b487b467b Issue #14624, #14687: Optimize unicode_widen()
Don't convert uninitialized characters. Patch written by Serhiy Storchaka.
2012-05-03 12:29:04 +02:00
Victor Stinner 3a7f7977f1 Remove buggy assertion in PyUnicode_Substring()
Use also directly unicode_empty, instead of PyUnicode_New(0,0).
2012-05-03 03:36:40 +02:00
Victor Stinner 684d5fd420 Fix PyUnicode_Substring() for start >= length and start > end
Remove the fast-path for 1-character string: unicode_fromascii() and
_PyUnicode_FromUCS*() now have their own fast-path for 1-character strings.
2012-05-03 02:32:34 +02:00
Victor Stinner b6cd014d75 Unicode: optimize creating of 1-character strings 2012-05-03 02:17:04 +02:00
Victor Stinner bff7c96834 Issue #14687: Optimize str%tuple for the "%(name)s" syntax
Avoid an useless and expensive call to PyUnicode_READ().
2012-05-03 01:44:59 +02:00
Victor Stinner e6abb488c9 unicodeobject.c: Add MAX_MAXCHAR() macro to (micro-)optimize the computation
of the second argument of PyUnicode_New().

 * Create also align_maxchar() function
 * Optimize fix_decimal_and_space_to_ascii(): don't compute the maximum
   character when ch <= 127 (it is ASCII)
2012-05-02 01:15:40 +02:00
Victor Stinner 438106b66e Issue #14687: Cleanup PyUnicode_Format() 2012-05-02 00:41:57 +02:00
Victor Stinner b5c3ea3af3 Issue #14687: Optimize str%args
* formatfloat() uses unicode_fromascii() instead of PyUnicode_DecodeASCII()
   to not have to check characters, we know that it is really ASCII
 * Use PyUnicode_FromOrdinal() instead of _PyUnicode_FromUCS4() to format
   a character: if avoids a call to ucs4lib_find_max_char() to compute
   the maximum character (whereas we already know it, it is just the character
   itself)
2012-05-02 00:29:36 +02:00
Benjamin Peterson 8fbd295458 merge 3.2 (#14699) 2012-05-01 09:51:46 -04:00
Benjamin Peterson 7295c6a871 fix calling the classmethod descriptor directly (closes #14699) 2012-05-01 09:51:09 -04:00
Benjamin Peterson a6f195e48e change insertdict to not steal references (#13903) 2012-04-30 10:23:40 -04:00
Victor Stinner b80e46eca4 Issue #14687: Avoid an useless duplicated string in PyUnicode_Format() 2012-04-30 05:21:52 +02:00
Victor Stinner aff3cc659b Issue #14687: Cleanup PyUnicode_Format() 2012-04-30 05:19:21 +02:00
Brett Cannon 62228dbd6c Issues #13959, 14647: Re-implement imp.reload() in Lib/imp.py.
Thanks to Eric Snow for the patch.
2012-04-29 14:38:11 -04:00
Victor Stinner b11d91d969 Fix my previous commit: bool is a long, restore the specical case for bool 2012-04-28 00:25:34 +02:00
Victor Stinner d0880d57b0 Simplify and optimize formatlong()
* Remove _PyBytes_FormatLong(): inline it into formatlong()
 * the input type is always a long, so remove the code for bool
 * don't duplicate the string if the length does not change
 * Use PyUnicode_DATA() instead of _PyUnicode_AsString()
2012-04-27 23:40:13 +02:00
Victor Stinner 94d558b063 Optimize _PyUnicode_FindMaxChar() find pure ASCII strings 2012-04-27 22:26:58 +02:00
Benjamin Peterson 64acccf46d decref cached keys on type deallocation (#13903) 2012-04-27 15:07:36 -04:00
Victor Stinner 8f825060f1 Check newly created consistency using _PyUnicode_CheckConsistency(str, 1)
* In debug mode, fill the string data with invalid characters
 * Simplify also reference counting in PyCodec_BackslashReplaceErrors()
   and PyCodec_XMLCharRefReplaceError()
2012-04-27 13:55:39 +02:00
Victor Stinner 718fbf078c _PyUnicode_CheckConsistency() ensures that the unicode string ends with a
null character
2012-04-26 00:39:37 +02:00
Victor Stinner 3065093bb3 long_to_decimal_string() and _PyLong_Format() check the consistency of newly
created strings using _PyUnicode_CheckConsistency() in debug mode
2012-04-26 00:37:21 +02:00
Benjamin Peterson 15ee821eb5 distiguish between refusing to creating shared keys and error (#13903) 2012-04-24 14:44:18 -04:00
Martin v. Loewis 4f2f3b6217 Account for shared keys in type's __sizeof__ (#13903). 2012-04-24 19:13:57 +02:00
Benjamin Peterson 42f58818d6 merge 3.2 (#14658) 2012-04-24 11:09:20 -04:00
Benjamin Peterson 7b1668735a don't use a slot wrapper from a different special method (closes #14658)
This also alters the fix to #11603. Specifically, setting __repr__ to
object.__str__ now raises a recursion RuntimeError when str() or repr() is
called instead of silently bypassing the recursion. I believe this behavior is
more correct.
2012-04-24 11:06:25 -04:00
Benjamin Peterson 7ce67e45f8 fix dict gc tracking (#13903) 2012-04-24 10:32:57 -04:00
Benjamin Peterson b9f4c9daad make pointer arith c89 2012-04-23 21:45:40 -04:00
Benjamin Peterson f3b7d86e25 use correct base ptr 2012-04-23 18:07:01 -04:00
Benjamin Peterson 2844a7a6d3 simplify and reformat 2012-04-23 18:00:25 -04:00
Victor Stinner ece58deb9f Close #14648: Compute correctly maxchar in str.format() for substrin 2012-04-23 23:36:38 +02:00
Benjamin Peterson db780d0d13 fix instance dicts with str subclasses (#13903) 2012-04-23 13:44:32 -04:00
Benjamin Peterson 53b977127f don't make shared keys with dict subclasses 2012-04-23 11:50:47 -04:00
Benjamin Peterson 7d95e40721 Implement PEP 412: Key-sharing dictionaries (closes #13903)
Patch from Mark Shannon.
2012-04-23 11:24:50 -04:00
Mark Dickinson 9a359bd97f Issue #14630: Merge fix from 3.2. 2012-04-20 21:44:09 +01:00
Mark Dickinson bcc17eefd2 Issue #14630: Fix an incorrect access of ob_digit[0] for a zero instance of an int subclass. 2012-04-20 21:42:49 +01:00
Mark Dickinson e28465482c Issue #14339: Improve speed of bin, oct and hex builtins. Patch by Serhiy Storchaka (with minor modifications). 2012-04-20 21:21:24 +01:00
Victor Stinner b0b224233e Issue #14385: Support other types than dict for __builtins__
It is now possible to use a custom type for the __builtins__ namespace, instead
of a dict. It can be used for sandboxing for example.  Raise also a NameError
instead of ImportError if __build_class__ name if not found in __builtins__.
2012-04-19 00:57:45 +02:00
Benjamin Peterson 6e3358a1d5 merge 3.2 (#14612) 2012-04-18 11:19:00 -04:00
Benjamin Peterson e42fb307ed SETUP_WITH acts like SETUP_FINALLY for the purposes of setting f_lineno (closes #14612) 2012-04-18 11:14:31 -04:00
Victor Stinner 0db176f8f6 Issue #14386: Expose the dict_proxy internal type as types.MappingProxyType 2012-04-16 00:16:30 +02:00
Brett Cannon fd0741555b Issue #2377: Make importlib the implementation of __import__().
importlib._bootstrap is now frozen into Python/importlib.h and stored
as _frozen_importlib in sys.modules. Py_Initialize() loads the frozen
code along with sys and imp and then uses _frozen_importlib._install()
to set builtins.__import__() w/ _frozen_importlib.__import__().
2012-04-14 14:10:13 -04:00
Brett Cannon 79ec55e980 Issue #1559549: Add 'name' and 'path' attributes to ImportError.
Currently import does not use these attributes as they are planned
for use by importlib (which will be another commit).

Thanks to Filip Gruszczyński for the initial patch and Brian Curtin
for refining it.
2012-04-12 20:24:54 -04:00
Benjamin Peterson 64ed576de8 merge 3.2 (#14509) 2012-04-09 15:04:39 -04:00
Benjamin Peterson ca819c3c9d merge 3.1 (#14509) 2012-04-09 15:01:02 -04:00
Benjamin Peterson f6622c8a3e fix build without Py_DEBUG and DNDEBUG (closes #14509) 2012-04-09 14:53:07 -04:00
Victor Stinner afb5205c48 Close #14249: Use bit shifts instead of an union, it's more efficient.
Patch written by Serhiy Storchaka
2012-04-05 22:54:49 +02:00
Victor Stinner e7eee01f36 Close #14249: Use an union instead of a long to short pointer to avoid aliasing
issue. Speed up UTF-16 by 20%.
2012-04-05 13:44:34 +02:00
Antoine Pitrou a701388de1 Rename _PyIter_GetBuiltin to _PyObject_GetBuiltin, and do not include it in the stable ABI. 2012-04-05 00:04:20 +02:00
Kristján Valur Jónsson 31668b8f7a Issue #14288: Serialization support for builtin iterators. 2012-04-03 10:49:41 +00:00
Benjamin Peterson 9ee601e197 merge 3.2 2012-04-01 18:51:37 -04:00
Benjamin Peterson b6af60c2a9 adjust formatting 2012-04-01 18:49:54 -04:00
Benjamin Peterson 3471bb67e7 remove extraneous condition 2012-04-01 18:48:40 -04:00
Benjamin Peterson 29f843816b merge heads 2012-04-01 18:48:11 -04:00
Benjamin Peterson ab3c1c1994 be consistent with rest of function 2012-04-01 18:48:02 -04:00
Antoine Pitrou 29b964d0dd Issue #13019: Fix potential reference leaks in bytearray.extend().
Patch by Suman Saha.
2012-04-01 16:08:11 +02:00
Antoine Pitrou 58bb82e7b4 Issue #13019: Fix potential reference leaks in bytearray.extend().
Patch by Suman Saha.
2012-04-01 16:05:46 +02:00
Kristján Valur Jónsson daa06544c8 Issue #14435: Remove special block allocation code from floatobject.c
PyFloatObjects are now allocated using PyObject_MALLOC like all other
internal types, but maintain a limited freelist of objects at hand for
performance.  This will result in more consistent memory usage by Python.
2012-03-30 09:18:15 +00:00
Victor Stinner 3c1e48176e Issue #14383: Add _PyDict_GetItemId() and _PyDict_SetItemId() functions
These functions simplify the usage of static constant Unicode strings.
Generalize the usage of _Py_Identifier in ceval.c and typeobject.c.
2012-03-26 22:10:51 +02:00
Benjamin Peterson 0df542985a grammar 2012-03-26 14:50:32 -04:00
Benjamin Peterson c067d6661f merge 3.2 2012-03-25 22:41:16 -04:00
Benjamin Peterson a8755c586e kill this terribly outdated comment 2012-03-25 22:40:54 -04:00
Antoine Pitrou d0acb411ef Issue #14387: Do not include accu.h from Python.h. 2012-03-22 14:42:18 +01:00
Antoine Pitrou 0197ff97d0 Issue #14387: Do not include accu.h from Python.h. 2012-03-22 14:38:16 +01:00
Victor Stinner 59af08f545 Micro-optimize PyObject_GetAttrString()
w cannot be NULL so use Py_DECREF() instead of Py_XDECREF().
2012-03-22 02:09:08 +01:00
Benjamin Peterson 520e8508a0 long() -> int() 2012-03-21 14:51:14 -04:00
Benjamin Peterson b7f1da5a3c make _PyNumber_ConvertIntegralToInt static, since it's only used in abstract.c 2012-03-21 14:44:43 -04:00
Benjamin Peterson d614e707ca rewrite this function, which was still accounting for classic classes 2012-03-21 14:38:11 -04:00
Benjamin Peterson 1b1a8e7cb5 correctly lookup __trunc__ in int() constructor 2012-03-20 23:48:11 -04:00
Benjamin Peterson 9fc9bf465a some more identifier goodness 2012-03-20 23:26:41 -04:00
Benjamin Peterson 96384b93aa make extra arguments to object.__init__/__new__ to errors in most cases (finishes #1683368) 2012-03-17 00:05:44 -05:00
Benjamin Peterson 9a03ecfa50 simply this slightly 2012-03-16 20:15:54 -05:00
Benjamin Peterson de394543b4 merge 3.2 (#14334) 2012-03-16 09:35:38 -05:00
Benjamin Peterson 16d84ac355 check to make sure the attribute is a string (#14334) 2012-03-16 09:32:59 -05:00
Benjamin Peterson f50af113ab space 2012-03-15 15:37:54 -05:00
Benjamin Peterson 2afe6aeae8 perform yield from delegation by repeating YIELD_FROM opcode (closes #14230)
This allows generators that are using yield from to be seen by debuggers. It
also kills the f_yieldfrom field on frame objects.

Patch mostly from Mark Shannon with a few tweaks by me.
2012-03-15 15:37:39 -05:00
Victor Stinner ba108823b6 Close #14232: catch mmap() failure in new_arena() of obmalloc 2012-03-10 00:21:44 +01:00
Benjamin Peterson 74529ad3f4 refactor and avoid warnings 2012-03-09 07:25:32 -08:00
Victor Stinner 2d01dc00bc Issue #14211: _PyObject_GenericSetAttrWithDict() keeps a strong reference to
the descriptor because it may be destroyed before being used, destroyed during
the update of the dict for example.
2012-03-09 00:44:13 +01:00
Victor Stinner d74782b0ac Close #14199: _PyType_Lookup() and super_getattro() keep a strong reference to
the type MRO to avoid a crash if the MRO is changed during the lookup.
2012-03-09 00:39:08 +01:00
Benjamin Peterson 9a6338651e merge 3.2 (#3787e896dbe9) 2012-03-07 18:52:52 -06:00
Benjamin Peterson 52c424343d allow cycles throught the __dict__ slot to be cleared (closes #1469629)
Patch from Armin, test from me.
2012-03-07 18:41:11 -06:00
Benjamin Peterson 657e9ebef5 make gi_running a boolean 2012-03-07 18:17:03 -06:00
Benjamin Peterson 9fc309083a indicate we're not running as we leave this block 2012-03-07 18:11:31 -06:00
Benjamin Peterson 099a78fe6d make delegating generators say they running (closes #14220) 2012-03-07 17:57:04 -06:00
Stefan Krah 4e14174e24 Whitespace. 2012-03-06 15:27:31 +01:00
Victor Stinner 0d03478b88 Remove an unused variable 2012-03-06 02:06:01 +01:00
Victor Stinner 198b291df7 Close #14205: dict lookup raises a RuntimeError if the dict is modified during
a lookup.

"if you want to make a sandbox on top of CPython, you have to fix segfaults"
so let's fix segfaults!
2012-03-06 01:03:13 +01:00
Stefan Krah 1e88f3faa6 Merge. 2012-03-05 17:48:21 +01:00
Stefan Krah 1649c1b33a Issue #14181: Preserve backwards compatibility for getbufferprocs that a) do
not adhere to the new documentation and b) manage to clobber view->obj before
returning failure.
2012-03-05 17:45:17 +01:00
Benjamin Peterson 400a968dfc remove f_yieldfrom access from Python (closes #13970) 2012-03-05 09:03:51 -06:00
Stefan Krah 4e99a315b7 Issue #14181: Allow memoryview construction from an object that uses the
getbuffer redirection scheme.
2012-03-05 09:30:47 +01:00
Victor Stinner c9590ad745 Close #14085: remove assertions from PyUnicode_WRITE macro
Add checks in PyUnicode_WriteChar() and convert PyUnicode_New() assertion to a
test raising a Python exception.
2012-03-04 01:34:37 +01:00
Antoine Pitrou 70d2717f2e Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
Patch by Filip Gruszczyński.
2012-02-27 00:59:34 +01:00
Antoine Pitrou e965d97ed1 Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
Patch by Filip Gruszczyński.
2012-02-27 00:45:12 +01:00
Nick Coghlan ab7bf2143e Close issue #6210: Implement PEP 409 2012-02-26 17:49:52 +10:00
Ezio Melotti cda6b6d60d #14081: The sep and maxsplit parameter to str.split, bytes.split, and bytearray.split may now be passed as keyword arguments. 2012-02-26 09:39:55 +02:00
Stefan Krah 9a2d99e28a - Issue #10181: New memoryview implementation fixes multiple ownership
and lifetime issues of dynamically allocated Py_buffer members (#9990)
  as well as crashes (#8305, #7433). Many new features have been added
  (See whatsnew/3.3), and the documentation has been updated extensively.
  The ndarray test object from _testbuffer.c implements all aspects of
  PEP-3118, so further development towards the complete implementation
  of the PEP can proceed in a test-driven manner.

  Thanks to Nick Coghlan, Antoine Pitrou and Pauli Virtanen for review
  and many ideas.

- Issue #12834: Fix incorrect results of memoryview.tobytes() for
  non-contiguous arrays.

- Issue #5231: Introduce memoryview.cast() method that allows changing
  format and shape without making a copy of the underlying memory.
2012-02-25 12:24:21 +01:00
Victor Stinner 6f73874edd Close #14095: type.__new__() doesn't remove __qualname__ key from the class
dict anymore if the key is present. Reject also non-string qualified names.
And fix reference leaks in type.__new__().
2012-02-25 01:22:36 +01:00
Victor Stinner b0800dc53b Oops, revert unwanted changes 2012-02-25 00:47:08 +01:00
Victor Stinner abc649ddbe Issue #14107: fix bigmem tests on str.capitalize(), str.swapcase() and
str.title(). Compute correctly how much memory is required for the test
(memuse).
2012-02-25 00:43:27 +01:00
Antoine Pitrou 842c0f17eb Fix compilation error under Windows (and warnings too). 2012-02-24 13:30:46 +01:00
Victor Stinner 90f50d4df9 Issue #13706: Fix format(float, "n") for locale with non-ASCII decimal point (e.g. ps_aF) 2012-02-24 01:44:47 +01:00
Victor Stinner 41a863cb81 Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
   (from the locale encoding), instead of decoding them implicitly from latin1
 * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
 * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
   character if unicode is NULL
 * Replace MIN/MAX macros by Py_MIN/Py_MAX
 * stringlib/undef.h undefines STRINGLIB_IS_UNICODE
 * stringlib/localeutil.h only supports Unicode
2012-02-24 00:37:51 +01:00
Victor Stinner b429d3b09c Fix doc of an internal function: unicode_write_cstr() 2012-02-22 21:22:20 +01:00
Antoine Pitrou ba6bafcfbe Fix compile failure under Windows 2012-02-22 16:41:50 +01:00
Victor Stinner c516610f0b Optimize str%arg for number formats: %i, %d, %u, %x, %p
Write a specialized function to write an ASCII/latin1 C char* string into a
Python Unicode string.
2012-02-22 13:55:02 +01:00
Victor Stinner 99d7ad0bb0 Micro-optimize computation of maxchar in PyUnicode_TransformDecimalToASCII() 2012-02-22 13:37:39 +01:00
Victor Stinner da79e632c4 Micro-optimize unicode_expandtabs(): use FILL() macro to write N spaces 2012-02-22 13:37:04 +01:00
Victor Stinner 15e9ed299c PyUnicode_New() and unicode_putchar() check for MAX_UNICODE maximum (U+10FFFF) 2012-02-22 13:36:20 +01:00
Benjamin Peterson d9a3591ed1 merge 3.2 2012-02-21 11:12:14 -05:00
Benjamin Peterson e249dcab7a merge 3.2 2012-02-21 11:09:13 -05:00
Benjamin Peterson 69e9727657 ensure no one tries to hash things before the random seed is found 2012-02-21 11:08:50 -05:00
Benjamin Peterson 71f660e00f update to Unicode 6.1 2012-02-20 22:24:29 -05:00
Georg Brandl 16fa2a1097 Forgot the "empty string -> hash == 0" special case for strings. 2012-02-21 00:50:13 +01:00
Georg Brandl 2fb477c0f0 Merge 3.2: Issue #13703 plus some related test suite fixes. 2012-02-21 00:33:36 +01:00
Georg Brandl 09a7c72cad Merge from 3.1: Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime)
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.

The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 21:31:46 +01:00
Georg Brandl 2daf6ae249 Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime)
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.

The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 19:54:16 +01:00
Benjamin Peterson 006c5a2235 check for NULL to fix segfault 2012-02-19 20:36:12 -05:00
Benjamin Peterson 23d7f12ffb use new generic __dict__ descriptor implementations 2012-02-19 20:02:57 -05:00
Benjamin Peterson 8eb1269c34 add generic implementation of a __dict__ descriptor for C types 2012-02-19 19:59:10 -05:00
Benjamin Peterson b900d6a78c initialize __dict__ if needed 2012-02-19 10:17:30 -05:00
Benjamin Peterson 2cf936fe7a use defaults 2012-02-19 01:16:13 -05:00
Benjamin Peterson 84e821e961 merge 3.2 2012-02-19 01:14:21 -05:00
Benjamin Peterson 496c53d83e use Py_CLEAR 2012-02-19 01:11:56 -05:00
Benjamin Peterson 01d7eba316 allow arbitrary attributes on classmethod and staticmethod (closes #14051) 2012-02-19 01:10:25 -05:00
Antoine Pitrou 552be9b214 Issue #13020: Fix a reference leak when allocating a structsequence object fails.
Patch by Suman Saha.
2012-02-15 02:54:33 +01:00
Antoine Pitrou 4b3c7846c9 Fix indentation 2012-02-15 02:52:58 +01:00
Antoine Pitrou 37784ba5c0 Issue #13020: Fix a reference leak when allocating a structsequence object fails.
Patch by Suman Saha.
2012-02-15 02:51:43 +01:00
Victor Stinner c3a6b02d70 (Merge 3.2) Issue #13913: normalize utf-8 codec name in UTF-8 decoder 2012-02-14 01:18:10 +01:00
Victor Stinner cbe01342bc Issue #13913: normalize utf-8 codec name in UTF-8 decoder 2012-02-14 01:17:45 +01:00
Benjamin Peterson 67e700697e merge 3.2 2012-02-10 08:47:04 -05:00
Benjamin Peterson efe7c9d4d7 this is only a borrowed ref in Brett's branch 2012-02-10 08:46:54 -05:00
Victor Stinner d1cd99b533 Backout d2c1521ad0a1: _Py_IDENTIFIER() uses UTF-8 again 2012-02-07 23:05:55 +01:00
Benjamin Peterson 9878b63c7c merge 3.2 2012-02-06 11:30:05 -05:00
Benjamin Peterson 2f9c71bbba bltinmod is borrowed, so it shouldn't be decrefed 2012-02-06 11:28:45 -05:00
Victor Stinner d446d8e09a _Py_Identifier are always ASCII strings 2012-02-05 01:45:45 +01:00
Benjamin Peterson 951138c795 merge 3.2 2012-02-03 19:25:01 -05:00
Benjamin Peterson 90b13583bc put returns on their own lines 2012-02-03 19:22:31 -05:00
Benjamin Peterson 2372bb0722 merge 3.2 (closes #13908) 2012-01-29 20:17:07 -05:00
Benjamin Peterson 2652d2570e ready types returned from PyType_FromSpec 2012-01-29 20:16:37 -05:00
Benjamin Peterson e28108cbd7 adjust declaration 2012-01-29 20:13:18 -05:00
Antoine Pitrou 7ab4af0427 Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name.
Patch by Hynek Schlawack.
2012-01-29 18:43:36 +01:00
Antoine Pitrou 1334884ff2 Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name.
Patch by Hynek Schlawack.
2012-01-29 18:36:34 +01:00
Mark Dickinson 963816defc Merge 3.2 -> default (issue 13889) 2012-01-27 21:17:04 +00:00
Mark Dickinson 261896b559 Issue #13889: Add missing _Py_SET_53BIT_PRECISION_* calls around uses of dtoa.c functions in float round. 2012-01-27 21:16:01 +00:00
Georg Brandl 20f6bc20bd merge with 3.2 2012-01-22 21:31:39 +01:00
Georg Brandl beca27a394 Fix #13834: strip() strips leading and trailing whitespace. 2012-01-22 21:31:21 +01:00
Benjamin Peterson ce79852077 use the static identifier api for looking up special methods
I had to move the static identifier code from unicodeobject.h to object.h in
order for this to work.
2012-01-22 11:24:29 -05:00
Antoine Pitrou ac456a1839 Fix some of the remaining test_capi leaks 2012-01-18 21:35:21 +01:00
Antoine Pitrou 8b0a74e936 Fix some of the remaining test_capi refleaks 2012-01-18 21:29:05 +01:00
Antoine Pitrou 84091bfa45 Fix some of the refleaks in test_capi (ported from 3.2) 2012-01-18 21:24:18 +01:00
Antoine Pitrou 55f217f22d Fix refleaks in test_capi
(this was easier than I thought!)
2012-01-18 21:23:13 +01:00
Antoine Pitrou bb5b92d324 Merge refleak fixes from 3.2 2012-01-18 16:19:19 +01:00
Antoine Pitrou 1c7ade5284 Fix leaking a RuntimeError objects when creating sub-interpreters 2012-01-18 16:13:31 +01:00
Benjamin Peterson eea4846d23 don't ready in case_operation, since most callers do it themselves 2012-01-16 14:28:50 -05:00
Benjamin Peterson c6630b9291 fix old titlecase function for extended case chars 2012-01-15 21:33:32 -05:00
Benjamin Peterson 9487c4db82 comment about how flags could be expanded 2012-01-15 21:26:23 -05:00
Benjamin Peterson ad9c569825 delta encoding of upper/lower/title makes a glorious return (#12736) 2012-01-15 21:19:20 -05:00
Gregory P. Smith f5b62a9b31 Consolidate the occurrances of the prime used as the multiplier when hashing. 2012-01-14 15:45:13 -08:00
Gregory P. Smith 63e6c3222f Consolidate the occurrances of the prime used as the multiplier when hashing
to a single #define instead of having several copies in several files.

This excludes the Modules/ tree (datetime and expat both have a copy
for their own purposes with no need for it to be the same).
2012-01-14 15:31:34 -08:00
Benjamin Peterson c8d8b8861e fix possible refleaks if PyUnicode_READY fails 2012-01-14 13:37:31 -05:00
Benjamin Peterson bac79498c8 always explicitly check for -1 from PyUnicode_READY 2012-01-14 13:34:47 -05:00
Benjamin Peterson d5890c8db5 add str.casefold() (closes #13752) 2012-01-14 13:23:30 -05:00
Nick Coghlan 138f4656e3 Add a separate NEWS entry for a change to PyObject_CallMethod in the PEP 380 patch, and make the private CallMethod variants consistent with the public one 2012-01-14 16:45:48 +10:00
Amaury Forgeot d'Arc e557da804a Fix a crash when the return value of a subgenerator is a temporary
object (with a refcount of 1)
2012-01-13 21:06:12 +01:00
Benjamin Peterson 53aa1d7c57 fix possible if unlikely leak 2011-12-20 13:29:45 -06:00
Georg Brandl ac0675cc01 Small clarification in docstring of dict.update(): the positional argument is not required. 2011-12-18 19:30:55 +01:00
Victor Stinner bb2e9c477d Issue #11231: Fix bytes and bytearray docstrings
Patch written by Brice Berna.
2011-12-17 23:18:07 +01:00
Nick Coghlan 1f7ce62bd6 Implement PEP 380 - 'yield from' (closes #11682) 2012-01-13 21:43:40 +10:00
Benjamin Peterson e51757f6de move do_title to a better place 2012-01-12 21:10:29 -05:00
Benjamin Peterson 821e4cfd01 make fix_decimal_and_space_to_ascii check if it modifies the string 2012-01-12 15:40:18 -05:00
Benjamin Peterson 0c91392fe6 kill capwords implementation which has been disabled since the begining 2012-01-12 15:25:41 -05:00
Benjamin Peterson 21e0da228d remove some usage of Py_UNICODE_TOUPPER/LOWER 2012-01-11 21:00:42 -05:00
Benjamin Peterson b2bf01d824 use full unicode mappings for upper/lower/title case (#12736)
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
Antoine Pitrou 94f6fa62bf Issue #13738: Simplify implementation of bytes.lower() and bytes.upper(). 2012-01-08 16:22:46 +01:00
Victor Stinner 3fe553160c Add a new PyUnicode_Fill() function
It is faster than the unicode_fill() function which was implemented in
formatter_unicode.c.
2012-01-04 00:33:50 +01:00
Benjamin Peterson 5e458f520c also decref the right thing 2012-01-02 10:12:13 -06:00
Benjamin Peterson 4c13a4a352 ready the correct string 2012-01-02 09:07:38 -06:00
Benjamin Peterson 22a29708fd fix some possible refleaks from PyUnicode_READY error conditions 2012-01-02 09:00:30 -06:00
Benjamin Peterson 9ca3ffac94 == -1 is convention 2012-01-01 16:04:29 -06:00
Benjamin Peterson e157cf1012 make switch more robust 2012-01-01 15:56:20 -06:00
Benjamin Peterson 2199227be4 fix weird indentation 2011-12-28 12:01:31 -06:00
Antoine Pitrou 5b62942074 Issue #13577: Built-in methods and functions now have a __qualname__.
Patch by sbt.
2011-12-23 12:40:16 +01:00
Benjamin Peterson c0b95d18fa 4 space indentation 2011-12-20 17:24:05 -06:00
Benjamin Peterson ead6b53659 fix spacing around switch statements 2011-12-20 17:23:42 -06:00
Benjamin Peterson 822c790527 merge 3.2 2011-12-20 13:32:50 -06:00
Georg Brandl f928b5d27e Merge with 3.2. 2011-12-18 19:32:37 +01:00
Victor Stinner 6099a03202 Issue #13624: Write a specialized UTF-8 encoder to allow more optimization
The main bottleneck was the PyUnicode_READ() macro.
2011-12-18 14:22:26 +01:00
Victor Stinner 73f53b57d1 Optimize str * n for len(str)==1 and UCS-2 or UCS-4 2011-12-18 03:26:31 +01:00
Victor Stinner f644110816 Issue #13621: Optimize str.replace(char1, char2)
Use findchar() which is more optimized than a dummy loop using
PyUnicode_READ().  PyUnicode_READ() is a complex and slow macro.
2011-12-18 02:43:08 +01:00
Victor Stinner f8eac00779 Issue #13623: Fix a performance regression introduced by issue #12170 in
bytes.find() and handle correctly OverflowError (raise the same ValueError than
the error for -1).
2011-12-18 01:17:41 +01:00
Victor Stinner e010fc029d Issue #11231: Fix bytes and bytearray docstrings
Patch written by Brice Berna.
2011-12-17 23:18:43 +01:00
Victor Stinner ab870218e3 Issue #10951: Fix compiler warnings in timemodule.c and unicodeobject.c
Thanks Jérémy Anger for the fix.
2011-12-17 22:39:43 +01:00
Benjamin Peterson f2fe7f0881 fix possible NULL dereference 2011-12-17 08:02:20 -05:00
Victor Stinner 2f197078fb The locale decoder raises a UnicodeDecodeError instead of an OSError
Search the invalid character using mbrtowc().
2011-12-17 07:08:30 +01:00
Victor Stinner 1b57967b96 Issue #13560: Locale codec functions use the classic "errors" parameter,
instead of surrogateescape

So it would be possible to support more error handlers later.
2011-12-17 05:47:23 +01:00
Victor Stinner ab59594326 What's New in Python 3.3: complete the deprecation list
Add also FIXMEs in unicodeobject.c
2011-12-17 04:59:06 +01:00
Victor Stinner 1f33f2b0c3 Issue #13560: os.strerror() now uses the current locale encoding instead of UTF-8 2011-12-17 04:45:09 +01:00
Victor Stinner f2ea71fcc8 Issue #13560: Add PyUnicode_EncodeLocale()
* Use PyUnicode_EncodeLocale() in time.strftime() if wcsftime() is not
   available
 * Document my last changes in Misc/NEWS
2011-12-17 04:13:41 +01:00
Victor Stinner af02e1c85a Add PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale()
* PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale() decode a string
   from the current locale encoding
 * _Py_char2wchar() writes an "error code" in the size argument to indicate
   if the function failed because of memory allocation failure or because of a
   decoding error. The function doesn't write the error message directly to
   stderr.
 * Fix time.strftime() (if wcsftime() is missing): decode strftime() result
   from the current locale encoding, not from the filesystem encoding.
2011-12-16 23:56:01 +01:00
Antoine Pitrou 093ce9cd8c Issue #6695: Full garbage collection runs now clear the freelist of set objects.
Initial patch by Matthias Troffaes.
2011-12-16 11:24:27 +01:00
Benjamin Peterson bfebb7b54a improve abstract property support (closes #11610)
Thanks to Darren Dale for patch.
2011-12-15 15:34:02 -05:00
Antoine Pitrou e0e2735f41 Fix OSError.__init__ and OSError.__new__ so that each of them can be
overriden and take additional arguments (followup to issue #12555).
2011-12-15 14:31:28 +01:00
Antoine Pitrou d73a9acb63 Fix the fix for issue #12149: it was incorrect, although it had the side
effect of appearing to resolve the issue.  Thanks to Mark Shannon for
noticing.
2011-12-15 14:17:36 +01:00
Antoine Pitrou 2e872082f6 Fix the fix for issue #12149: it was incorrect, although it had the side
effect of appearing to resolve the issue.  Thanks to Mark Shannon for
noticing.
2011-12-15 14:15:31 +01:00
Florent Xicluna aa6c1d240f Issue #13575: there is only one class type. 2011-12-12 18:54:29 +01:00
Antoine Pitrou 9d57481f04 Issue #13577: various kinds of descriptors now have a __qualname__ attribute.
Patch by sbt.
2011-12-12 13:47:25 +01:00
Victor Stinner 16e6a80923 PyUnicode_Resize(): warn about canonical representation
Call also directly unicode_resize() in unicodeobject.c
2011-12-12 13:24:15 +01:00
Victor Stinner b0a82a6a7f Fix PyUnicode_Resize() for compact string: leave the string unchanged on error
Fix also PyUnicode_Resize() doc
2011-12-12 13:08:33 +01:00
Victor Stinner bf6e560d0c Make PyUnicode_Copy() private => _PyUnicode_Copy()
Undocument the function.

Make also decode_utf8_errors() as private (static).
2011-12-12 01:53:47 +01:00
Victor Stinner 7a9105a380 resize_copy() now supports legacy ready strings 2011-12-12 00:13:42 +01:00
Victor Stinner 488fa49acf Rewrite PyUnicode_Append(); unicode_modifiable() is more strict
* Rename unicode_resizable() to unicode_modifiable()
 * Rename _PyUnicode_Dirty() to unicode_check_modifiable() to make it clear
   that the function is private
 * Inline PyUnicode_Concat() and unicode_append_inplace() in PyUnicode_Append()
   to simplify the code
 * unicode_modifiable() return 0 if the hash has been computed or if the string
   is not an exact unicode string
 * Remove _PyUnicode_DIRTY(): no need to reset the hash anymore, because if the
   hash has already been computed, you cannot modify a string inplace anymore
 * PyUnicode_Concat() checks for integer overflow
2011-12-12 00:01:39 +01:00
Victor Stinner c4b495497a Create unicode_result_unchanged() subfunction 2011-12-11 22:44:26 +01:00
Victor Stinner eaab604829 Fix fixup() for unchanged unicode subtype
If maxchar_new == 0 and self is a unicode subtype, return u instead of duplicating u.
2011-12-11 22:22:39 +01:00
Victor Stinner e6b2d4407a unicode_fromascii() doesn't check string content twice in debug mode
_PyUnicode_CheckConsistency() also checks string content.
2011-12-11 21:54:30 +01:00
Victor Stinner a1d12bb119 Call directly PyUnicode_DecodeUTF8Stateful() instead of PyUnicode_DecodeUTF8()
* Remove micro-optimization from PyUnicode_FromStringAndSize():
   PyUnicode_DecodeUTF8Stateful() has already these optimizations (for size=0
   and one ascii char).
 * Rename utf8_max_char_size_and_char_count() to utf8_scanner(), and remove an
   useless variable
2011-12-11 21:53:09 +01:00
Victor Stinner 382955ff4e Use directly unicode_empty instead of PyUnicode_New(0, 0) 2011-12-11 21:44:00 +01:00
Victor Stinner 785938eebd Move the slowest UTF-8 decoder to its own subfunction
* Create decode_utf8_errors()
 * Reuse unicode_fromascii()
 * decode_utf8_errors() doesn't refit at the beginning
 * Remove refit_partial_string(), use unicode_adjust_maxchar() instead
2011-12-11 20:09:03 +01:00
Victor Stinner 84def3774d Fix error handling in resize_compact() 2011-12-11 20:04:56 +01:00
Victor Stinner 8faf8216e4 PyUnicode_FromWideChar() and PyUnicode_FromUnicode() raise a ValueError if a
character in not in range [U+0000; U+10ffff].
2011-12-08 22:14:11 +01:00
Antoine Pitrou b0e1f8b38b Issue #13503: Use a more efficient reduction format for bytearrays with
pickle protocol >= 3.  The old reduction format is kept with older
protocols in order to allow unpickling under Python 2.

Patch by Irmen de Jong.
2011-12-05 20:40:08 +01:00
Victor Stinner 0a54cf12a0 Fix PyObject_Repr(): don't call PyUnicode_READY() if res is NULL 2011-12-01 03:22:44 +01:00
Victor Stinner b37b17423b Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
Create an empty string with the new Unicode API.
2011-12-01 03:18:59 +01:00
Victor Stinner db88ae5d66 PyObject_Repr() ensures that the result is a ready Unicode string
And PyObject_Str() and PyObject_Repr() don't make strings ready in debug
mode to ensure that the caller makes the string ready before using it.
2011-12-01 02:15:00 +01:00
Victor Stinner 551ac95733 Py_UNICODE_HIGH_SURROGATE() and Py_UNICODE_LOW_SURROGATE() macros
And use surrogates macros everywhere in unicodeobject.c
2011-11-29 22:58:13 +01:00
Antoine Pitrou c366117820 Merge heads 2011-11-26 01:13:12 +01:00
Antoine Pitrou f0effe6379 Better resolution for issue #11849: Ensure that free()d memory arenas are really released
on POSIX systems supporting anonymous memory mappings.  Patch by Charles-François Natali.
2011-11-26 01:11:02 +01:00
Victor Stinner 6345be9a14 Close #13093: PyUnicode_EncodeDecimal() doesn't support error handlers
different than "strict" anymore. The caller was unable to compute the
size of the output buffer: it depends on the error handler.
2011-11-25 20:09:01 +01:00
Antoine Pitrou 86a36b500a PEP 3155 / issue #13448: Qualified name for classes and functions. 2011-11-25 18:56:07 +01:00
Benjamin Peterson 1518e8713d and back to the "magic" formula (with a comment) it is 2011-11-23 10:44:52 -06:00
Benjamin Peterson 5944c36931 cave to those who like readable code 2011-11-22 19:05:49 -06:00
Benjamin Peterson 0268675193 fix compiler warning by implementing this more cleverly 2011-11-22 15:29:32 -05:00
Victor Stinner ca4f20782e find_maxchar_surrogates() reuses surrogate macros 2011-11-22 03:38:40 +01:00
Victor Stinner 0d3721d986 Issue #13441: Disable temporary the check on the maximum character until
the Solaris issue is solved.

But add assertion on the maximum character in various encoders: UTF-7, UTF-8,
wide character (wchar_t*, Py_UNICODE*), unicode-escape, raw-unicode-escape.

Fix also unicode_encode_ucs1() for backslashreplace error handler: Python is
now always "wide".
2011-11-22 03:27:53 +01:00
Victor Stinner f8facacf30 Fix compiler warnings 2011-11-22 02:30:47 +01:00
Victor Stinner 9d3b93ba30 Use the new Unicode API
* Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
 * Replce PyUnicode_FromUnicode(str, len) by PyUnicode_FromWideChar(str, len)
 * Replace Py_UNICODE by wchar_t
 * posix_putenv() uses PyUnicode_FromFormat() to create the string, instead
   of PyUnicode_FromUnicode() + _snwprintf()
2011-11-22 02:27:30 +01:00
Victor Stinner b84d723509 (Merge 3.2) Issue #13093: Fix error handling on PyUnicode_EncodeDecimal() 2011-11-22 01:50:07 +01:00
Victor Stinner cfed46e00a PyUnicode_FromKindAndData() fails with a ValueError if size < 0 2011-11-22 01:29:14 +01:00
Victor Stinner 42885206ec UTF-8 decoder: set consumed value in the latin1 fast-path 2011-11-22 01:23:02 +01:00
Victor Stinner d3df8ab377 Replace _PyUnicode_READY_REPLACE() and _PyUnicode_ReadyReplace() with unicode_ready()
* unicode_ready() has a simpler API
 * try to reuse unicode_empty and latin1_char singleton everywhere
 * Fix a reference leak in _PyUnicode_TranslateCharmap()
 * PyUnicode_InternInPlace() doesn't try to get a singleton anymore, to avoid
   having to handle a failure
2011-11-22 01:22:34 +01:00
Victor Stinner f01245067a Rewrite PyUnicode_TransformDecimalToASCII() to use the new Unicode API 2011-11-21 23:12:56 +01:00
Victor Stinner 2d718f39a5 Remove an unused variable from PyUnicode_Copy() 2011-11-21 23:11:52 +01:00
Victor Stinner 87af4f2f3a Simplify PyUnicode_Copy()
USe PyUnicode_Copy() in fixup()
2011-11-21 23:03:47 +01:00
Victor Stinner 5bbe5e7c85 Fix a compiler warning in _PyUnicode_CheckConsistency() 2011-11-21 22:54:05 +01:00
Victor Stinner 42bf77537e Rewrite PyUnicode_EncodeDecimal() to use the new Unicode API
Add tests for PyUnicode_EncodeDecimal() and
PyUnicode_TransformDecimalToASCII().
2011-11-21 22:52:58 +01:00
Antoine Pitrou ce4a9da705 Issue #13411: memoryview objects are now hashable when the underlying object is hashable. 2011-11-21 20:46:33 +01:00
Antoine Pitrou 0a3229de6b Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case.
This almost catches up with pre-PEP 393 performance, when decoding needed
only one pass.
2011-11-21 20:39:13 +01:00
Victor Stinner da29cc36aa Issue #13441: _PyUnicode_CheckConsistency() dumps the string if the maximum
character is bigger than U+10FFFF and locale.localeconv() dumps the string
before decoding it.

Temporary hack to debug the issue #13441.
2011-11-21 14:31:41 +01:00
Victor Stinner 9e30aa52fd Fix misuse of PyUnicode_GET_SIZE() => PyUnicode_GET_LENGTH()
And PyUnicode_GetSize() => PyUnicode_GetLength()
2011-11-21 02:49:52 +01:00
Victor Stinner 53b33e767d UnicodeTranslateError uses the new Unicode API
The index is a character index, not a index in a Py_UNICODE* string.
2011-11-21 01:17:27 +01:00
Victor Stinner da1ddf37c6 UnicodeEncodeError uses the new Unicode API
The index is a character index, not a index in a Py_UNICODE* string.
2011-11-20 22:50:23 +01:00
Victor Stinner 4ead7c7be8 PyObject_Str() ensures that the result string is ready
and check the string consistency.

_PyUnicode_CheckConsistency() doesn't check the hash anymore. It should be
possible to call this function even if hash(str) was already called.
2011-11-20 19:48:36 +01:00
Victor Stinner 0fc35196bb stringlib: remove unused STRINGLIB_FILL 2011-11-20 19:30:15 +01:00
Victor Stinner b960b34577 PyUnicode_AsUTF32String() calls directly _PyUnicode_EncodeUTF32(),
instead of calling the deprecated PyUnicode_EncodeUTF32() function
2011-11-20 19:12:52 +01:00
Victor Stinner 77faf69ca1 _PyUnicode_CheckConsistency() also checks maxchar maximum value,
not only its minimum value
2011-11-20 18:56:05 +01:00
Victor Stinner d5c4022d2a Remove the two ugly and unused WRITE_ASCII_OR_WSTR and WRITE_WSTR macros 2011-11-20 18:41:31 +01:00
Victor Stinner 2e9cfadd7c Reuse surrogate macros in UTF-16 decoder 2011-11-20 18:40:27 +01:00
Victor Stinner ae4f7c8e59 charmap_encoding_error() uses the new Unicode API 2011-11-20 18:28:55 +01:00
Victor Stinner ac931b1e5b Use PyUnicode_EncodeCodePage() instead of PyUnicode_EncodeMBCS() with
PyUnicode_AsUnicodeAndSize()
2011-11-20 18:27:03 +01:00
Victor Stinner 22168998f5 charmap encoders uses Py_UCS4, not Py_UNICODE 2011-11-20 17:09:18 +01:00
Antoine Pitrou f34a0cdc6c Issue #10227: Add an allocation cache for a single slice object.
Patch by Stefan Behnel.
2011-11-18 20:14:34 +01:00
Victor Stinner 1f7951711c Catch PyUnicode_AS_UNICODE() errors 2011-11-17 00:45:54 +01:00
Ezio Melotti 11060a4a48 #13406: silence deprecation warnings in test_codecs. 2011-11-16 09:39:10 +02:00
Antoine Pitrou 78edf7576e Issue #13333: The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:44:16 +01:00
Antoine Pitrou 5418ee0b9a Issue #13333: The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:42:21 +01:00
Antoine Pitrou 9a812cbc89 Issue #13389: Full garbage collection passes now clear the freelists for
list and dict objects.  They already cleared other freelists in the
interpreter.
2011-11-15 00:00:12 +01:00
Antoine Pitrou 39aba4f563 Use the small object allocator for small bytearrays 2011-11-12 21:15:28 +01:00
Antoine Pitrou 31b92a534f Sanitize reference management in the utf-8 encoder 2011-11-12 18:35:19 +01:00
Eli Bendersky e92ff0503c Issue #13161: fix doc strings of __i*__ operators. Closes #13161 2011-11-11 17:02:16 +02:00
Eli Bendersky d3baae73be Issue #13161: fix doc strings of __i*__ operators 2011-11-11 16:57:05 +02:00
Antoine Pitrou 0290c7a811 Fix regression on 2-byte wchar_t systems (Windows) 2011-11-11 13:29:12 +01:00
Antoine Pitrou 44c6affc79 Avoid crashing because of an unaligned word access 2011-11-11 02:59:42 +01:00
Antoine Pitrou de20b0b50e Issue #13149: Speed up append-only StringIO objects.
This is very similar to the "lazy strings" idea.
2011-11-10 21:47:38 +01:00
Victor Stinner 9f4b1e9c50 Fix and deprecated the unicode_internal codec
unicode_internal codec uses Py_UNICODE instead of the real internal
representation (PEP 393: Py_UCS1, Py_UCS2 or Py_UCS4) for backward
compatibility.
2011-11-10 20:56:30 +01:00
Victor Stinner 24729f36bf Prefer Py_UCS4 or wchar_t over Py_UNICODE 2011-11-10 20:31:37 +01:00
Victor Stinner ebf3ba808e PyUnicode_DecodeCharmap() uses the new Unicode API 2011-11-10 20:30:22 +01:00
Victor Stinner a98b28c1bf Avoid PyUnicode_AS_UNICODE in the UTF-8 encoder 2011-11-10 20:21:49 +01:00
Victor Stinner 3326cb6a36 Fix "unicode_escape" encoder 2011-11-10 20:15:25 +01:00
Victor Stinner 0e36826a04 Fix UTF-7 encoder on Windows 2011-11-10 20:12:49 +01:00
Martin v. Löwis 1db7c13be1 Port encoders from Py_UNICODE API to unicode object API. 2011-11-10 18:24:32 +01:00
Victor Stinner 62aa4d086a Strip trailing spaces 2011-11-09 00:03:45 +01:00
Victor Stinner 0a045efb49 Fix a compiler warning: use unsiged for maxchar in unicode_widen() 2011-11-09 00:02:42 +01:00
Victor Stinner 596a6c4ffc Fix the code page decoder
* unicode_decode_call_errorhandler() now supports the PyUnicode_WCHAR_KIND
   kind
 * unicode_decode_call_errorhandler() calls copy_characters() instead of
   PyUnicode_CopyCharacters()
2011-11-09 00:02:18 +01:00
Antoine Pitrou a8f63c02ef Fix missing goto 2011-11-08 18:37:16 +01:00
Martin v. Löwis d10759f6ed Make _PyUnicode_FromId return borrowed references.
http://mail.python.org/pipermail/python-dev/2011-November/114347.html
2011-11-07 13:00:05 +01:00
Martin v. Löwis e9b11c1cd8 Change decoders to use Unicode API instead of Py_UNICODE. 2011-11-08 17:35:34 +01:00
Petri Lehtinen 9589ab1745 Revert "Accept None as start and stop parameters for list.index() and tuple.index()"
Issue #13340.
2011-11-06 21:06:10 +02:00
Petri Lehtinen ebfaabd663 Revert "Accept None as start and stop parameters for list.index() and tuple.index()"
Issue #13340.
2011-11-06 21:02:39 +02:00
Amaury Forgeot d'Arc 864741b2c7 Issue #13350: Replace most usages of PyUnicode_Format by PyUnicode_FromFormat. 2011-11-06 15:10:48 +01:00
Petri Lehtinen 8e9f6c4251 Accept None as start and stop parameters for list.index() and tuple.index().
Closes #13340.
2011-11-05 23:25:34 +02:00
Petri Lehtinen c2f0a46111 Accept None as start and stop parameters for list.index() and tuple.index()
Closes #13340.
2011-11-05 23:24:31 +02:00
Benjamin Peterson 878ce389a0 add introspection to range objects (closes #9896)
Patch by Daniel Urban.
2011-11-05 15:17:52 -04:00
Victor Stinner e30c0a1014 Fix gdb/libpython.py for not ready Unicode strings
_PyUnicode_CheckConsistency() checks also hash and length value for not ready
Unicode strings.
2011-11-04 20:54:05 +01:00
Victor Stinner 2fc507fe45 Replace tabs by spaces 2011-11-04 20:06:39 +01:00
Martin v. Löwis 12be46ca84 Drop Py_UNICODE based encode exceptions. 2011-11-04 19:04:15 +01:00
Martin v. Löwis 3d325191bf Port code page codec to Unicode API. 2011-11-04 18:23:06 +01:00
Martin v. Löwis b09af03b8a Port error handlers from Py_UNICODE indexing to code point indexing. 2011-11-04 11:16:41 +01:00
Victor Stinner fcd9653667 Fix a compiler warning in unicode_encode_ucs1() 2011-11-04 00:28:50 +01:00
Victor Stinner fc026c98d8 Fix PyUnicode_EncodeCharmap() 2011-11-04 00:24:51 +01:00
Victor Stinner 7931d9a951 Replace PyUnicodeObject type by PyObject
* _PyUnicode_CheckConsistency() now takes a PyObject* instead of void*
 * Remove now useless casts to PyObject*
2011-11-04 00:22:48 +01:00
Victor Stinner 76a31a6bff Cleanup decode_code_page_stateful() and encode_code_page()
* Fix decode_code_page_errors() result
 * Inline decode_code_page() and encode_code_page_chunk()
 * Replace the PyUnicodeObject type by PyObject
2011-11-04 00:05:13 +01:00
Victor Stinner 7581cef699 Adapt the code page encoder to the new unicode_encode_call_errorhandler()
The code is not correct, but at least it doesn't crash anymore.
2011-11-03 22:32:33 +01:00
Brian Curtin 2787ea41fd Fix a compile error (apparently Windows only) introduced in 295fdfd4f422 2011-11-02 15:09:37 -05:00
Martin v. Löwis 23e275b3ad Port UCS1 and charmap codecs to new API. 2011-11-02 18:02:51 +01:00
Martin v. Löwis 9e8166843c Introduce PyObject* API for raising encode errors. 2011-11-02 12:45:42 +01:00
Benjamin Peterson 2b50a01d11 remove unused variable 2011-10-30 14:24:44 -04:00
Petri Lehtinen e0aa803714 Fix the return value of set_discard (issue #10519) 2011-10-30 14:35:12 +02:00
Petri Lehtinen 5acc27ebe4 Avoid unnecessary recursive function calls (closes #10519) 2011-10-30 13:56:41 +02:00
Petri Lehtinen a94200e6ce Issue #13018: Fix reference leaks in error paths in dictobject.c.
Patch by Suman Saha.
2011-10-24 21:12:58 +03:00
Nick Coghlan de31b191e5 Issue 1294232: Fix errors in metaclass calculation affecting some cases of metaclass inheritance. Patch by Daniel Urban. 2011-10-23 22:04:16 +10:00
Benjamin Peterson 9d9141f5db adjust braces a bit 2011-10-19 16:57:40 -04:00
Antoine Pitrou 551ba20e8e Issue #13188: When called without an explicit traceback argument,
generator.throw() now gets the traceback from the passed exception's
`__traceback__` attribute.  Patch by Petri Lehtinen.
2011-10-18 16:40:50 +02:00
Benjamin Peterson 2963fe0711 plug possible refleak (closes #13199) 2011-10-17 13:09:27 -04:00
Martin v. Löwis 0d3072e98d Drop Py_UCS4_ functions. Closes #13246. 2011-10-31 08:40:56 +01:00
Benjamin Peterson 1cebc207ea merge 3.2 2011-10-30 14:24:59 -04:00
Petri Lehtinen c34f5c256a Fix the return value of set_discard (issue #10519) 2011-10-30 14:35:39 +02:00
Petri Lehtinen 7c5e34d8a3 Avoid unnecessary recursive function calls (#closes #10519) 2011-10-30 13:57:45 +02:00
Victor Stinner 57ffa9d4ff PyUnicode_AsUnicodeCopy() uses PyUnicode_AsUnicodeAndSize() to get directly the length 2011-10-23 20:10:08 +02:00
Victor Stinner af9e4b8c29 Fix PyUnicode_InternImmortal(): PyUnicode_InternInPlace() may changes *p 2011-10-23 20:07:00 +02:00
Victor Stinner 9faa384bed Cast directly to unsigned char, instead of using Py_CHARMASK
We don't need "& 0xff" on an unsigned char.
2011-10-23 20:06:00 +02:00
Victor Stinner 9db1a8b69f Replace PyUnicodeObject* by PyObject* where it was irrevelant
A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or
PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to
PyUnicodeObject* is wrong
2011-10-23 20:04:37 +02:00
Victor Stinner 0d60e87ad6 Fix data variable in _PyUnicode_Dump() for compact ASCII 2011-10-23 19:47:19 +02:00
Victor Stinner d8e61c348e Remove last references to the removed Unicode free list 2011-10-23 19:43:33 +02:00
Victor Stinner 065836ec9c PyUnicode_FSDecoder() ensures that the decoded string is ready 2011-10-27 01:56:33 +02:00
Petri Lehtinen 08a95cabe3 merge heads 2011-10-24 21:22:39 +03:00
Petri Lehtinen 24bd5adcff Merge 3.2 2011-10-24 21:17:52 +03:00
Mark Dickinson 8d48b43ea9 Issue #12965: Fix some inaccurate comments in Objects/longobject.c. Thanks Stefan Krah. 2011-10-23 20:47:14 +01:00
Mark Dickinson 36645681c8 Issue #13201: equality for range objects is now based on equality of the underlying sequences. Thanks Sven Marnach for the patch. 2011-10-23 19:53:01 +01:00
Nick Coghlan 9715d26305 Merge issue 1294232 patch from 3.2 2011-10-23 22:36:42 +10:00
Victor Stinner dd18d3ad9e Fix unicode_subtype_new() on debug build
Patch written by Stefan Behnel.
2011-10-22 11:08:10 +02:00
Ezio Melotti f881751ded Remove unused variable. 2011-10-22 01:01:32 +03:00
Ezio Melotti 931b8aac80 #12753: Add support for Unicode name aliases and named sequences. 2011-10-21 21:57:36 +03:00
Antoine Pitrou ac65d96777 Issue #12170: The count(), find(), rfind(), index() and rindex() methods
of bytes and bytearray objects now accept an integer between 0 and 255
as their first argument.  Patch by Petri Lehtinen.
2011-10-20 23:54:17 +02:00
Benjamin Peterson dc37ce95e8 merge 3.2 2011-10-19 16:58:15 -04:00
Victor Stinner 6707293e75 Add consistency check to _PyUnicode_New() 2011-10-18 22:10:14 +02:00
Victor Stinner 3a50e7056e Issue #12281: Rewrite the MBCS codec to handle correctly replace and ignore
error handlers on all Windows versions. The MBCS codec is now supporting all
error handlers, instead of only replace to encode and ignore to decode.
2011-10-18 21:21:00 +02:00
Antoine Pitrou cf28eacafe Issue #13188: When called without an explicit traceback argument,
generator.throw() now gets the traceback from the passed exception's
``__traceback__`` attribute.  Patch by Petri Lehtinen.
2011-10-18 16:42:55 +02:00
Antoine Pitrou 5b9f4c1539 Fix typo 2011-10-17 19:21:04 +02:00
Benjamin Peterson 897d059221 merge 3.2 (#13199) 2011-10-17 13:10:24 -04:00
Benjamin Peterson 7a6debe79c remove some duplication 2011-10-15 09:25:28 -04:00
Martin v. Löwis 1c67dd9b15 Port SetAttrString/HasAttrString to SetAttrId/GetAttrId. 2011-10-14 15:16:45 +02:00
Martin v. Löwis bd928fef42 Rename _Py_identifier to _Py_IDENTIFIER. 2011-10-14 10:20:37 +02:00
Victor Stinner f5cff56a1b Issue #13088: Add shared Py_hexdigits constant to format a number into base 16 2011-10-14 02:13:11 +02:00
Victor Stinner d1a9cc29b9 dictviews_or() uses _Py_identifier 2011-10-13 22:51:17 +02:00
Martin v. Löwis bfc6d74b25 Use GetAttrId directly. Proposed by Amaury. 2011-10-13 20:03:57 +02:00
Antoine Pitrou f0b934b01a Reuse the stringlib in findchar(), and make its signature more convenient 2011-10-13 18:55:09 +02:00
Antoine Pitrou c198d0599b Add a comment explaining this heuristic. 2011-10-13 18:07:37 +02:00
Antoine Pitrou dda339e6d2 Simplify heuristic for when to use memchr 2011-10-13 17:58:11 +02:00
Victor Stinner 55c991197b Optimize unicode_subscript() for step != 1 and ascii strings 2011-10-13 01:17:06 +02:00
Victor Stinner 127226ba69 Don't use PyUnicode_MAX_CHAR_VALUE() macro in Py_MAX() 2011-10-13 01:12:34 +02:00
Victor Stinner 9e7a1bcfd6 Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr 2011-10-13 00:18:12 +02:00
Antoine Pitrou dd4e2f0153 Issue #13155: Optimize finding the optimal character width of an unicode string 2011-10-13 00:02:27 +02:00
Victor Stinner 49a0a21f37 Unicode replace() avoids calling unicode_adjust_maxchar() when it's useless
Add also a special case if the result is an empty string.
2011-10-12 23:46:10 +02:00
Antoine Pitrou 6b4883dec0 PEP 3151 / issue #12555: reworking the OS and IO exception hierarchy. 2011-10-12 02:54:14 +02:00
Victor Stinner 983b1434bd Backed out changeset 952d91a7d376
If maxchar == PyUnicode_MAX_CHAR_VALUE(unicode), we do an useless copy.
2011-10-12 00:54:35 +02:00
Antoine Pitrou e55ad2dff0 Relax condition 2011-10-12 00:36:51 +02:00
Victor Stinner d218bf14cc stringlib: Fix STRINGLIB_STR for UCS2/UCS4 2011-10-12 00:14:32 +02:00
Victor Stinner 4e10100dee Fix compiler warning in _PyUnicode_FromUCS2() 2011-10-11 23:27:52 +02:00
Victor Stinner 8cc70dcf70 Fix fastsearch for UCS2 and UCS4
* If needle is 0, try (p[0] >> 16) & 0xff for UCS4
 * Disable fastsearch_memchr_1char() if needle is zero for UCS2 and UCS4
2011-10-11 23:22:22 +02:00
Antoine Pitrou 950468e553 Use _PyUnicode_CONVERT_BYTES() where applicable. 2011-10-11 22:45:48 +02:00
Victor Stinner 577db2c9f0 PyUnicode_AsUnicodeCopy() now checks if PyUnicode_AsUnicode() failed 2011-10-11 22:12:48 +02:00
Victor Stinner c4f281eba3 Fix misuse of PyUnicode_GET_SIZE, use PyUnicode_GET_LENGTH instead 2011-10-11 22:11:42 +02:00
Victor Stinner ed2682be2f Reuse PyUnicode_Copy() in validate_and_copy_tuple() 2011-10-11 21:53:24 +02:00
Antoine Pitrou e459a0877e Issue #13136: speed up conversion between different character widths. 2011-10-11 20:58:41 +02:00
Antoine Pitrou 2c3b2302ad Issue #13134: optimize finding single-character strings using memchr 2011-10-11 20:29:21 +02:00
Antoine Pitrou 2871698546 /* Remove unused code. It has been committed out since 2000 (!). */ 2011-10-11 03:17:47 +02:00
Antoine Pitrou 53bb548f22 Avoid exporting private helpers
(thanks "make smelly")
2011-10-10 23:49:24 +02:00
Martin v. Löwis 1ee1b6fe0d Use identifier API for PyObject_GetAttrString. 2011-10-10 18:11:30 +02:00
Victor Stinner 794d567b17 any_find_slice() doesn't use callbacks anymore
* Call directly the right find/rfind method: allow inlining functions
 * Remove Py_LOCAL_CALLBACK (added for any_find_slice)
2011-10-10 03:21:36 +02:00
Martin v. Löwis afe55bba33 Add API for static strings, primarily good for identifiers.
Thanks to Konrad Schöbel and Jasper Schulz for helping with the mass-editing.
2011-10-09 10:38:36 +02:00
Antoine Pitrou eaf139b3fc Fix typo in the PyUnicode_Find() implementation 2011-10-09 00:33:09 +02:00
Georg Brandl 388349add2 Closes #12192: Document that mutating list methods do not return the instance (original patch by Mike Hoy). 2011-10-08 18:32:40 +02:00
Martin v. Löwis c47adb04b3 Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE. 2011-10-07 20:55:35 +02:00
Victor Stinner dd07732af5 PyUnicode_Join() calls directly memcpy() if all strings are of the same kind 2011-10-07 17:02:31 +02:00
Antoine Pitrou 978b9d2a27 Fix formatting memory consumption with very large padding specifications 2011-10-07 12:35:48 +02:00
Victor Stinner 59de0ee9e0 str.replace(a, a) is now returning str unchanged if a is a 2011-10-07 10:01:28 +02:00
Antoine Pitrou 4574e62c6e Fix massive slowdown in string formatting with str.format.
Example:
./python -m timeit -s "f='{}' + '-' * 1024 + '{}'; s='abcd' * 16384" "f.format(s, s)"

-> before: 547 usec per loop
-> after: 13 usec per loop
-> 3.2: 22.5 usec per loop
-> 2.7: 12.6 usec per loop
2011-10-07 02:26:47 +02:00
Antoine Pitrou 5c0ba36d5f Fix massive slowdown in string formatting with the % operator 2011-10-07 01:54:09 +02:00
Antoine Pitrou 7c46da7993 Ensure that 1-char singletons get used 2011-10-06 22:07:51 +02:00
Antoine Pitrou c61c8d7a5e Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists.
This introduces a small private API for this common pattern.
The issue has been discovered thanks to Martin's huge-mem buildbot.
2011-10-06 19:04:12 +02:00
Antoine Pitrou eeb7eea1f9 Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists.
This introduces a small private API for this common pattern.
The issue has been discovered thanks to Martin's huge-mem buildbot.
2011-10-06 18:57:27 +02:00
Victor Stinner c6f0df7b20 Fix PyUnicode_Join() for len==1 and non-exact string 2011-10-06 15:58:54 +02:00
Antoine Pitrou dbf697ae5c Fix compilation warnings under 64-bit Windows 2011-10-06 15:34:41 +02:00
Antoine Pitrou 15a66cf134 Fix compilation under Windows 2011-10-06 15:25:32 +02:00
Victor Stinner 200f21340d Fix assertion in unicode_adjust_maxchar() 2011-10-06 13:27:56 +02:00
Victor Stinner acf47b807f Fix my last change on PyUnicode_Join(): don't process separator if len==1 2011-10-06 12:32:37 +02:00
Victor Stinner 25a4b29c95 str.replace() avoids memory when it's possible 2011-10-06 12:31:55 +02:00
Victor Stinner 56c161ab00 _copy_characters() fails more quickly in debug mode on inconsistent state 2011-10-06 02:47:11 +02:00
Victor Stinner c729b8e92f Fix a compiler warning: don't define unicode_is_singleton() in release mode 2011-10-06 02:36:59 +02:00
Victor Stinner fb9ea8c57e Don't check for the maximum character when copying from unicodeobject.c
* Create copy_characters() function which doesn't check for the maximum
   character in release mode
 * _PyUnicode_CheckConsistency() is no more static to be able to use it
   in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
 * _PyUnicode_CheckConsistency() checks the string hash
2011-10-06 01:45:57 +02:00
Victor Stinner 05d1189566 Fix post-condition in unicode_repr(): check the result, not the input 2011-10-06 01:13:58 +02:00
Victor Stinner f48323e3b3 replace() uses unicode_fromascii() if the input and replace string is ASCII 2011-10-05 23:27:08 +02:00
Victor Stinner 0617b6e18b unicode_fromascii() checks that the input is ASCII in debug mode 2011-10-05 23:26:01 +02:00
Victor Stinner c3cec7868b Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII
ucs1, ucs2 and ucs4 libraries have to scan created substring to find the
maximum character, whereas it is not need to ASCII strings. Because ASCII
strings are common, it is useful to optimize ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner 14f8f02826 Fix PyUnicode_Partition(): str_in->str_obj 2011-10-05 20:58:25 +02:00
Victor Stinner 31392e741d Fix my_basename(): make the string ready 2011-10-05 20:14:23 +02:00
Victor Stinner bb10a1f759 Ensure that newly created strings use the most efficient store in debug mode 2011-10-05 01:34:17 +02:00
Victor Stinner 9310abbf40 Replace PyUnicodeObject* with PyObject* where it was inappropriate 2011-10-05 00:59:23 +02:00
Victor Stinner ce5faf673e unicodeobject.c doesn't make output strings ready in debug mode
Try to only create non ready strings in debug mode to ensure that all functions
(not only in unicodeobject.c, everywhere) make input strings ready.
2011-10-05 00:42:43 +02:00
Georg Brandl 7597addbd4 More typoes. 2011-10-05 16:36:47 +02:00
Victor Stinner c80d6d20d5 Speedup str[a🅱️step] for step != 1
Try to stop the scanner of the maximum character before the end using a limit
depending on the kind (e.g. 256 for PyUnicode_2BYTE_KIND).
2011-10-05 14:13:28 +02:00
Victor Stinner ae86485517 Speedup find_maxchar_surrogates() for 32-bit wchar_t
If we have at least one character in U+10000-U+10FFFF, we know that we must use
PyUnicode_4BYTE_KIND kind.
2011-10-05 14:02:44 +02:00
Victor Stinner b9275c104e Speedup str[a:b] and PyUnicode_FromKindAndData
* str[a:b] doesn't scan the string for the maximum character if the string
   is ascii only
 * PyUnicode_FromKindAndData() stops if we are sure that we cannot use a
   shorter character type. For example, _PyUnicode_FromUCS1() stops if we
   have at least one character in range U+0080-U+00FF
2011-10-05 14:01:42 +02:00
Victor Stinner 702c734395 Speedup the ASCII decoder
It is faster for long string and a little bit faster for short strings,
benchmark on Linux 32 bits, Intel Core i5 @ 3.33GHz:

./python -m timeit 'x=b"a"' 'x.decode("ascii")'
./python -m timeit 'x=b"x"*80' 'x.decode("ascii")'
./python -m timeit 'x=b"abc"*4096' 'x.decode("ascii")'

length |   before   | after
-------+------------+-----------
     1 | 0.234 usec | 0.229 usec
    80 | 0.381 usec | 0.357 usec
12,288 |  11.2 usec |  3.01 usec
2011-10-05 13:50:52 +02:00
Victor Stinner e1335c711c Fix usage og PyUnicode_READY() 2011-10-04 20:53:03 +02:00
Victor Stinner e06e145943 _PyUnicode_READY_REPLACE() cannot be used in unicode_subtype_new() 2011-10-04 20:52:31 +02:00
Victor Stinner 17efeed284 Add DONT_MAKE_RESULT_READY to unicodeobject.c to help detecting bugs
Use also _PyUnicode_READY_REPLACE() when it's applicable.
2011-10-04 20:05:46 +02:00
Victor Stinner 6b56a7fd3d Add assertion to _Py_ReleaseInternedUnicodeStrings() if READY fails 2011-10-04 20:04:52 +02:00
Antoine Pitrou 875f29bb95 Fix naïve heuristic in unicode slicing (followup to 1b4f886dc9e2) 2011-10-04 20:00:49 +02:00
Antoine Pitrou 2242522fde Add a necessary call to PyUnicode_READY() (followup to ab5086539ab9) 2011-10-04 19:10:51 +02:00
Antoine Pitrou 7aec401966 Optimize string slicing to use the new API 2011-10-04 19:08:01 +02:00
Antoine Pitrou e19aa388e8 When expandtabs() would be a no-op, don't create a duplicate string 2011-10-04 16:04:01 +02:00
Antoine Pitrou e71d574a39 Migrate str.expandtabs to the new API 2011-10-04 15:55:09 +02:00
Benjamin Peterson 7f3140ef80 fix parens 2011-10-03 19:37:29 -04:00
Benjamin Peterson 4bfce8f81f fix formatting 2011-10-03 19:35:07 -04:00
Benjamin Peterson ccc51c1fc6 fix compiler warnings 2011-10-03 19:34:12 -04:00
Victor Stinner b092365cc6 Move in-place Unicode append to its own subfunction 2011-10-04 01:17:31 +02:00
Victor Stinner a5f9163501 Reindent internal Unicode macros 2011-10-04 01:07:11 +02:00
Victor Stinner a41463c203 Document utf8_length and wstr_length states
Ensure these states with assertions in _PyUnicode_CheckConsistency().
2011-10-04 01:05:08 +02:00
Victor Stinner 9566311014 resize_inplace() sets utf8_length to zero if the utf8 is not shared8
Cleanup also the code.
2011-10-04 01:03:50 +02:00
Victor Stinner 9e9d689d85 PyUnicode_New() sets utf8_length to zero for latin1 2011-10-04 01:02:02 +02:00
Victor Stinner 016980454e Unicode: raise SystemError instead of ValueError or RuntimeError on invalid
state
2011-10-04 00:04:26 +02:00
Victor Stinner 7f11ad4594 Unicode: document when the wstr pointer is shared with data
Add also related assertions to _PyUnicode_CheckConsistency().
2011-10-04 00:00:20 +02:00
Victor Stinner 03490918b7 Add _PyUnicode_HAS_WSTR_MEMORY() macro 2011-10-03 23:45:12 +02:00
Victor Stinner 9ce5a835bb PyUnicode_Join() checks output length in debug mode
PyUnicode_CopyCharacters() may copies less character than requested size, if
the input string is smaller than the argument. (This is very unlikely, but who
knows!?)

Avoid also calling PyUnicode_CopyCharacters() if the string is empty.
2011-10-03 23:36:02 +02:00
Victor Stinner b803895355 Fix a compiler warning in PyUnicode_Append()
Don't check PyUnicode_CopyCharacters() in release mode. Rename also some
variables.
2011-10-03 23:27:56 +02:00
Victor Stinner 8cfcbed4e3 Improve string forms and PyUnicode_Resize() documentation
Remove also the FIXME for resize_copy(): as discussed with Martin, copy the
string on resize if the string is not resizable is just fine.
2011-10-03 23:19:21 +02:00
Victor Stinner 77bb47b312 Simplify unicode_resizable(): singletons reference count is at least 2 2011-10-03 20:06:05 +02:00
Victor Stinner 85041a54bd _PyUnicode_CheckConsistency() checks utf8 field consistency 2011-10-03 14:42:39 +02:00
Victor Stinner 3cf4637e4e unicode_subtype_new() copies also the ascii flag 2011-10-03 14:42:15 +02:00
Victor Stinner 42dfd71333 unicode_kind_name() doesn't check consistency anymore
It is is called from _PyUnicode_Dump() and so must not fail.
2011-10-03 14:41:45 +02:00
Victor Stinner a3b334da6d PyUnicode_Ready() now sets ascii=1 if maxchar < 128
ascii=1 is no more reserved to PyASCIIObject. Use
PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before).
2011-10-03 13:53:37 +02:00
Victor Stinner 1b4f9ceca7 Create _PyUnicode_READY_REPLACE() to reuse singleton
Only use _PyUnicode_READY_REPLACE() on just created strings.
2011-10-03 13:28:14 +02:00
Victor Stinner c379ead9af Fix resize_compact() and resize_inplace(); reenable full resize optimizations
* resize_compact() updates also wstr_len for non-ascii strings sharing wstr
 * resize_inplace() updates also utf8_len/wstr_len for strings sharing
   utf8/wstr
2011-10-03 12:52:27 +02:00
Victor Stinner 34411e17b0 resize_inplace() has been fixed: reenable this optimization 2011-10-03 12:21:33 +02:00
Victor Stinner a849a4b6b4 _PyUnicode_Dump() indicates if wstr and/or utf8 are shared 2011-10-03 12:12:11 +02:00
Victor Stinner 1c8d0c76a1 Fix resize_inplace(): update shared utf8 pointer 2011-10-03 12:11:00 +02:00
Victor Stinner ca4f7a4298 Disable unicode_resize() optimization on Windows (16-bit wchar_t) 2011-10-03 04:18:04 +02:00
Victor Stinner 126c559d05 _PyUnicode_Ready() for 16-bit wchar_t 2011-10-03 04:17:10 +02:00
Victor Stinner 2fd82278cb Fix compilation error on Windows
Fix also a compiler warning.
2011-10-03 04:06:05 +02:00
Victor Stinner a3be613a56 Use PyUnicode_WCHAR_KIND to check if a string is a wstr string
Simplify the test in wstr pointer in unicode_sizeof().
2011-10-03 02:16:37 +02:00
Victor Stinner 910337b42e Add _PyUnicode_CheckConsistency() macro to help debugging
* Document Unicode string states
 * Use _PyUnicode_CheckConsistency() to ensure that objects are always
   consistent.
2011-10-03 03:20:16 +02:00
Victor Stinner 4fae54cb0e In release mode, PyUnicode_InternInPlace() does nothing if the input is NULL or
not a unicode, instead of failing with a fatal error.

Use assertions in debug mode (provide better error messages).
2011-10-03 02:01:52 +02:00
Victor Stinner 23e5668214 PyUnicode_Append() now works in-place when it's possible 2011-10-03 03:54:37 +02:00
Victor Stinner fe226c0d37 Rewrite PyUnicode_Resize()
* Rename _PyUnicode_Resize() to unicode_resize()
 * unicode_resize() creates a copy if the string cannot be resized instead
   of failing
 * Optimize resize_copy() for wstr strings
 * Disable temporary resize_inplace()
2011-10-03 03:52:20 +02:00
Victor Stinner 829c0adca9 Add _PyUnicode_HAS_UTF8_MEMORY() macro 2011-10-03 01:08:02 +02:00
Victor Stinner fe0c155c4f Write _PyUnicode_Dump() to help debugging 2011-10-03 02:59:31 +02:00
Victor Stinner f42dc448e0 PyUnicode_CopyCharacters() fails when copying latin1 into ascii 2011-10-02 23:33:16 +02:00
Victor Stinner c53be96c54 unicode_convert_wchar_to_ucs4() cannot fail 2011-10-02 21:33:54 +02:00
Victor Stinner c3c7415639 Add _PyUnicode_DATA_ANY(op) private macro 2011-10-02 20:39:55 +02:00
Victor Stinner a464fc141d unicode_empty and unicode_latin1 are PyObject* objects, not PyUnicodeObject* 2011-10-02 20:39:30 +02:00
Victor Stinner 267aa24365 PyUnicode_FindChar() raises a IndexError on invalid index 2011-10-02 01:08:37 +02:00
Victor Stinner bc603d12b7 Optimize _PyUnicode_AsKind() for UCS1->UCS4 and UCS2->UCS4
* Ensure that the input string is ready
 * Raise a ValueError instead of of a fatal error
2011-10-02 01:00:40 +02:00
Victor Stinner 5a706cf8c0 Fix usage of PyUnicode_READY() in PyUnicode_GetLength() 2011-10-02 00:36:53 +02:00
Victor Stinner cd9950fd09 PyUnicode_WriteChar() raises IndexError on invalid index
PyUnicode_WriteChar() raises also a ValueError if the string has more than 1
reference.
2011-10-02 00:34:53 +02:00
Victor Stinner 2fe5ced752 PyUnicode_ReadChar() raises a IndexError if the index in invalid
unicode_getitem() reuses PyUnicode_ReadChar()
2011-10-02 00:25:40 +02:00
Victor Stinner 202b62bd90 PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown 2011-10-01 23:48:37 +02:00
Victor Stinner 07ac3ebd7b Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t
Rewrite unicode_subtype_new(): allocate directly the right type.
2011-10-01 16:16:43 +02:00
Victor Stinner e90fe6a8f4 Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
* Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()
 * Rename existing _PyUnicode_UTF8_LENGTH() macro to PyUnicode_UTF8_LENGTH()
 * PyUnicode_UTF8() and PyUnicode_UTF8_LENGTH() are more strict
2011-10-01 16:48:13 +02:00
Martin v. Löwis 0b1d348990 Issue 13085: Fix some memory leaks. Patch by Stefan Krah. 2011-10-01 16:35:40 +02:00
Benjamin Peterson 5c0fb00ad8 merge heads 2011-10-01 00:12:20 -04:00
Benjamin Peterson 31616ea2ff remove reference to non-existent file 2011-10-01 00:11:09 -04:00
Victor Stinner de636f3c34 PyUnicode_Substring() now accepts end bigger than string length
Fix also a bug: call PyUnicode_READY() before reading string length.
2011-10-01 03:55:54 +02:00
Victor Stinner c759f3e7ec Ooops, avoid a division by zero in unicode_repeat() 2011-10-01 03:09:58 +02:00
Victor Stinner d3a83d5eb3 PyUnicode_FromObject() ensures that its output is a ready string 2011-10-01 03:09:33 +02:00
Victor Stinner 67ca64ce54 I want a super fast 'a' * n!
* Optimize unicode_repeat() for a special case with memset()
 * Simplify integer overflow checking; remove the second check because
   PyUnicode_New() already does it and uses a smaller limit (Py_ssize_t vs
   size_t)
2011-10-01 02:47:29 +02:00
Victor Stinner e9a2935c1f Fix usage of PyUnicode_READY in unicodeobject.c 2011-10-01 02:14:59 +02:00
Victor Stinner 12bab6dace Remove private substring() function, reuse public PyUnicode_Substring()
* PyUnicode_Substring() now fails if start or end is invalid
 * PyUnicode_Substring() reuses PyUnicode_Copy() for non-exact strings
2011-10-01 01:53:49 +02:00
Victor Stinner c841e7db1f Optimize PyUnicode_Copy(): don't recompute maximum character 2011-10-01 01:34:32 +02:00
Victor Stinner 2219e0a37e PyUnicode_FromObject() reuses PyUnicode_Copy()
* PyUnicode_Copy() is faster than substring()
 * Fix also a compiler warning
2011-10-01 01:16:59 +02:00
Victor Stinner 034f6cf10c Add PyUnicode_Copy() function, include it to the public API 2011-09-30 02:26:44 +02:00