Commit Graph

4289 Commits

Author SHA1 Message Date
Stefan Krah 4e14174e24 Whitespace. 2012-03-06 15:27:31 +01:00
Victor Stinner 0d03478b88 Remove an unused variable 2012-03-06 02:06:01 +01:00
Victor Stinner 198b291df7 Close #14205: dict lookup raises a RuntimeError if the dict is modified during
a lookup.

"if you want to make a sandbox on top of CPython, you have to fix segfaults"
so let's fix segfaults!
2012-03-06 01:03:13 +01:00
Stefan Krah 1e88f3faa6 Merge. 2012-03-05 17:48:21 +01:00
Stefan Krah 1649c1b33a Issue #14181: Preserve backwards compatibility for getbufferprocs that a) do
not adhere to the new documentation and b) manage to clobber view->obj before
returning failure.
2012-03-05 17:45:17 +01:00
Benjamin Peterson 400a968dfc remove f_yieldfrom access from Python (closes #13970) 2012-03-05 09:03:51 -06:00
Stefan Krah 4e99a315b7 Issue #14181: Allow memoryview construction from an object that uses the
getbuffer redirection scheme.
2012-03-05 09:30:47 +01:00
Victor Stinner c9590ad745 Close #14085: remove assertions from PyUnicode_WRITE macro
Add checks in PyUnicode_WriteChar() and convert PyUnicode_New() assertion to a
test raising a Python exception.
2012-03-04 01:34:37 +01:00
Antoine Pitrou 70d2717f2e Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
Patch by Filip Gruszczyński.
2012-02-27 00:59:34 +01:00
Antoine Pitrou e965d97ed1 Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
Patch by Filip Gruszczyński.
2012-02-27 00:45:12 +01:00
Nick Coghlan ab7bf2143e Close issue #6210: Implement PEP 409 2012-02-26 17:49:52 +10:00
Ezio Melotti cda6b6d60d #14081: The sep and maxsplit parameter to str.split, bytes.split, and bytearray.split may now be passed as keyword arguments. 2012-02-26 09:39:55 +02:00
Stefan Krah 9a2d99e28a - Issue #10181: New memoryview implementation fixes multiple ownership
and lifetime issues of dynamically allocated Py_buffer members (#9990)
  as well as crashes (#8305, #7433). Many new features have been added
  (See whatsnew/3.3), and the documentation has been updated extensively.
  The ndarray test object from _testbuffer.c implements all aspects of
  PEP-3118, so further development towards the complete implementation
  of the PEP can proceed in a test-driven manner.

  Thanks to Nick Coghlan, Antoine Pitrou and Pauli Virtanen for review
  and many ideas.

- Issue #12834: Fix incorrect results of memoryview.tobytes() for
  non-contiguous arrays.

- Issue #5231: Introduce memoryview.cast() method that allows changing
  format and shape without making a copy of the underlying memory.
2012-02-25 12:24:21 +01:00
Victor Stinner 6f73874edd Close #14095: type.__new__() doesn't remove __qualname__ key from the class
dict anymore if the key is present. Reject also non-string qualified names.
And fix reference leaks in type.__new__().
2012-02-25 01:22:36 +01:00
Victor Stinner b0800dc53b Oops, revert unwanted changes 2012-02-25 00:47:08 +01:00
Victor Stinner abc649ddbe Issue #14107: fix bigmem tests on str.capitalize(), str.swapcase() and
str.title(). Compute correctly how much memory is required for the test
(memuse).
2012-02-25 00:43:27 +01:00
Antoine Pitrou 842c0f17eb Fix compilation error under Windows (and warnings too). 2012-02-24 13:30:46 +01:00
Victor Stinner 90f50d4df9 Issue #13706: Fix format(float, "n") for locale with non-ASCII decimal point (e.g. ps_aF) 2012-02-24 01:44:47 +01:00
Victor Stinner 41a863cb81 Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
   (from the locale encoding), instead of decoding them implicitly from latin1
 * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
 * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
   character if unicode is NULL
 * Replace MIN/MAX macros by Py_MIN/Py_MAX
 * stringlib/undef.h undefines STRINGLIB_IS_UNICODE
 * stringlib/localeutil.h only supports Unicode
2012-02-24 00:37:51 +01:00
Victor Stinner b429d3b09c Fix doc of an internal function: unicode_write_cstr() 2012-02-22 21:22:20 +01:00
Antoine Pitrou ba6bafcfbe Fix compile failure under Windows 2012-02-22 16:41:50 +01:00
Victor Stinner c516610f0b Optimize str%arg for number formats: %i, %d, %u, %x, %p
Write a specialized function to write an ASCII/latin1 C char* string into a
Python Unicode string.
2012-02-22 13:55:02 +01:00
Victor Stinner 99d7ad0bb0 Micro-optimize computation of maxchar in PyUnicode_TransformDecimalToASCII() 2012-02-22 13:37:39 +01:00
Victor Stinner da79e632c4 Micro-optimize unicode_expandtabs(): use FILL() macro to write N spaces 2012-02-22 13:37:04 +01:00
Victor Stinner 15e9ed299c PyUnicode_New() and unicode_putchar() check for MAX_UNICODE maximum (U+10FFFF) 2012-02-22 13:36:20 +01:00
Benjamin Peterson d9a3591ed1 merge 3.2 2012-02-21 11:12:14 -05:00
Benjamin Peterson e249dcab7a merge 3.2 2012-02-21 11:09:13 -05:00
Benjamin Peterson 69e9727657 ensure no one tries to hash things before the random seed is found 2012-02-21 11:08:50 -05:00
Benjamin Peterson 71f660e00f update to Unicode 6.1 2012-02-20 22:24:29 -05:00
Georg Brandl 16fa2a1097 Forgot the "empty string -> hash == 0" special case for strings. 2012-02-21 00:50:13 +01:00
Georg Brandl 2fb477c0f0 Merge 3.2: Issue #13703 plus some related test suite fixes. 2012-02-21 00:33:36 +01:00
Georg Brandl 09a7c72cad Merge from 3.1: Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime)
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.

The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 21:31:46 +01:00
Georg Brandl 2daf6ae249 Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime)
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.

The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 19:54:16 +01:00
Benjamin Peterson 006c5a2235 check for NULL to fix segfault 2012-02-19 20:36:12 -05:00
Benjamin Peterson 23d7f12ffb use new generic __dict__ descriptor implementations 2012-02-19 20:02:57 -05:00
Benjamin Peterson 8eb1269c34 add generic implementation of a __dict__ descriptor for C types 2012-02-19 19:59:10 -05:00
Benjamin Peterson b900d6a78c initialize __dict__ if needed 2012-02-19 10:17:30 -05:00
Benjamin Peterson 2cf936fe7a use defaults 2012-02-19 01:16:13 -05:00
Benjamin Peterson 84e821e961 merge 3.2 2012-02-19 01:14:21 -05:00
Benjamin Peterson 496c53d83e use Py_CLEAR 2012-02-19 01:11:56 -05:00
Benjamin Peterson 01d7eba316 allow arbitrary attributes on classmethod and staticmethod (closes #14051) 2012-02-19 01:10:25 -05:00
Antoine Pitrou 552be9b214 Issue #13020: Fix a reference leak when allocating a structsequence object fails.
Patch by Suman Saha.
2012-02-15 02:54:33 +01:00
Antoine Pitrou 4b3c7846c9 Fix indentation 2012-02-15 02:52:58 +01:00
Antoine Pitrou 37784ba5c0 Issue #13020: Fix a reference leak when allocating a structsequence object fails.
Patch by Suman Saha.
2012-02-15 02:51:43 +01:00
Victor Stinner c3a6b02d70 (Merge 3.2) Issue #13913: normalize utf-8 codec name in UTF-8 decoder 2012-02-14 01:18:10 +01:00
Victor Stinner cbe01342bc Issue #13913: normalize utf-8 codec name in UTF-8 decoder 2012-02-14 01:17:45 +01:00
Benjamin Peterson 67e700697e merge 3.2 2012-02-10 08:47:04 -05:00
Benjamin Peterson efe7c9d4d7 this is only a borrowed ref in Brett's branch 2012-02-10 08:46:54 -05:00
Victor Stinner d1cd99b533 Backout d2c1521ad0a1: _Py_IDENTIFIER() uses UTF-8 again 2012-02-07 23:05:55 +01:00
Benjamin Peterson 9878b63c7c merge 3.2 2012-02-06 11:30:05 -05:00
Benjamin Peterson 2f9c71bbba bltinmod is borrowed, so it shouldn't be decrefed 2012-02-06 11:28:45 -05:00
Victor Stinner d446d8e09a _Py_Identifier are always ASCII strings 2012-02-05 01:45:45 +01:00
Benjamin Peterson 951138c795 merge 3.2 2012-02-03 19:25:01 -05:00
Benjamin Peterson 90b13583bc put returns on their own lines 2012-02-03 19:22:31 -05:00
Benjamin Peterson 2372bb0722 merge 3.2 (closes #13908) 2012-01-29 20:17:07 -05:00
Benjamin Peterson 2652d2570e ready types returned from PyType_FromSpec 2012-01-29 20:16:37 -05:00
Benjamin Peterson e28108cbd7 adjust declaration 2012-01-29 20:13:18 -05:00
Antoine Pitrou 7ab4af0427 Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name.
Patch by Hynek Schlawack.
2012-01-29 18:43:36 +01:00
Antoine Pitrou 1334884ff2 Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name.
Patch by Hynek Schlawack.
2012-01-29 18:36:34 +01:00
Mark Dickinson 963816defc Merge 3.2 -> default (issue 13889) 2012-01-27 21:17:04 +00:00
Mark Dickinson 261896b559 Issue #13889: Add missing _Py_SET_53BIT_PRECISION_* calls around uses of dtoa.c functions in float round. 2012-01-27 21:16:01 +00:00
Georg Brandl 20f6bc20bd merge with 3.2 2012-01-22 21:31:39 +01:00
Georg Brandl beca27a394 Fix #13834: strip() strips leading and trailing whitespace. 2012-01-22 21:31:21 +01:00
Benjamin Peterson ce79852077 use the static identifier api for looking up special methods
I had to move the static identifier code from unicodeobject.h to object.h in
order for this to work.
2012-01-22 11:24:29 -05:00
Antoine Pitrou ac456a1839 Fix some of the remaining test_capi leaks 2012-01-18 21:35:21 +01:00
Antoine Pitrou 8b0a74e936 Fix some of the remaining test_capi refleaks 2012-01-18 21:29:05 +01:00
Antoine Pitrou 84091bfa45 Fix some of the refleaks in test_capi (ported from 3.2) 2012-01-18 21:24:18 +01:00
Antoine Pitrou 55f217f22d Fix refleaks in test_capi
(this was easier than I thought!)
2012-01-18 21:23:13 +01:00
Antoine Pitrou bb5b92d324 Merge refleak fixes from 3.2 2012-01-18 16:19:19 +01:00
Antoine Pitrou 1c7ade5284 Fix leaking a RuntimeError objects when creating sub-interpreters 2012-01-18 16:13:31 +01:00
Benjamin Peterson eea4846d23 don't ready in case_operation, since most callers do it themselves 2012-01-16 14:28:50 -05:00
Benjamin Peterson c6630b9291 fix old titlecase function for extended case chars 2012-01-15 21:33:32 -05:00
Benjamin Peterson 9487c4db82 comment about how flags could be expanded 2012-01-15 21:26:23 -05:00
Benjamin Peterson ad9c569825 delta encoding of upper/lower/title makes a glorious return (#12736) 2012-01-15 21:19:20 -05:00
Gregory P. Smith f5b62a9b31 Consolidate the occurrances of the prime used as the multiplier when hashing. 2012-01-14 15:45:13 -08:00
Gregory P. Smith 63e6c3222f Consolidate the occurrances of the prime used as the multiplier when hashing
to a single #define instead of having several copies in several files.

This excludes the Modules/ tree (datetime and expat both have a copy
for their own purposes with no need for it to be the same).
2012-01-14 15:31:34 -08:00
Benjamin Peterson c8d8b8861e fix possible refleaks if PyUnicode_READY fails 2012-01-14 13:37:31 -05:00
Benjamin Peterson bac79498c8 always explicitly check for -1 from PyUnicode_READY 2012-01-14 13:34:47 -05:00
Benjamin Peterson d5890c8db5 add str.casefold() (closes #13752) 2012-01-14 13:23:30 -05:00
Nick Coghlan 138f4656e3 Add a separate NEWS entry for a change to PyObject_CallMethod in the PEP 380 patch, and make the private CallMethod variants consistent with the public one 2012-01-14 16:45:48 +10:00
Amaury Forgeot d'Arc e557da804a Fix a crash when the return value of a subgenerator is a temporary
object (with a refcount of 1)
2012-01-13 21:06:12 +01:00
Benjamin Peterson 53aa1d7c57 fix possible if unlikely leak 2011-12-20 13:29:45 -06:00
Georg Brandl ac0675cc01 Small clarification in docstring of dict.update(): the positional argument is not required. 2011-12-18 19:30:55 +01:00
Victor Stinner bb2e9c477d Issue #11231: Fix bytes and bytearray docstrings
Patch written by Brice Berna.
2011-12-17 23:18:07 +01:00
Nick Coghlan 1f7ce62bd6 Implement PEP 380 - 'yield from' (closes #11682) 2012-01-13 21:43:40 +10:00
Benjamin Peterson e51757f6de move do_title to a better place 2012-01-12 21:10:29 -05:00
Benjamin Peterson 821e4cfd01 make fix_decimal_and_space_to_ascii check if it modifies the string 2012-01-12 15:40:18 -05:00
Benjamin Peterson 0c91392fe6 kill capwords implementation which has been disabled since the begining 2012-01-12 15:25:41 -05:00
Benjamin Peterson 21e0da228d remove some usage of Py_UNICODE_TOUPPER/LOWER 2012-01-11 21:00:42 -05:00
Benjamin Peterson b2bf01d824 use full unicode mappings for upper/lower/title case (#12736)
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
Antoine Pitrou 94f6fa62bf Issue #13738: Simplify implementation of bytes.lower() and bytes.upper(). 2012-01-08 16:22:46 +01:00
Victor Stinner 3fe553160c Add a new PyUnicode_Fill() function
It is faster than the unicode_fill() function which was implemented in
formatter_unicode.c.
2012-01-04 00:33:50 +01:00
Benjamin Peterson 5e458f520c also decref the right thing 2012-01-02 10:12:13 -06:00
Benjamin Peterson 4c13a4a352 ready the correct string 2012-01-02 09:07:38 -06:00
Benjamin Peterson 22a29708fd fix some possible refleaks from PyUnicode_READY error conditions 2012-01-02 09:00:30 -06:00
Benjamin Peterson 9ca3ffac94 == -1 is convention 2012-01-01 16:04:29 -06:00
Benjamin Peterson e157cf1012 make switch more robust 2012-01-01 15:56:20 -06:00
Benjamin Peterson 2199227be4 fix weird indentation 2011-12-28 12:01:31 -06:00
Antoine Pitrou 5b62942074 Issue #13577: Built-in methods and functions now have a __qualname__.
Patch by sbt.
2011-12-23 12:40:16 +01:00
Benjamin Peterson c0b95d18fa 4 space indentation 2011-12-20 17:24:05 -06:00
Benjamin Peterson ead6b53659 fix spacing around switch statements 2011-12-20 17:23:42 -06:00
Benjamin Peterson 822c790527 merge 3.2 2011-12-20 13:32:50 -06:00
Georg Brandl f928b5d27e Merge with 3.2. 2011-12-18 19:32:37 +01:00
Victor Stinner 6099a03202 Issue #13624: Write a specialized UTF-8 encoder to allow more optimization
The main bottleneck was the PyUnicode_READ() macro.
2011-12-18 14:22:26 +01:00
Victor Stinner 73f53b57d1 Optimize str * n for len(str)==1 and UCS-2 or UCS-4 2011-12-18 03:26:31 +01:00
Victor Stinner f644110816 Issue #13621: Optimize str.replace(char1, char2)
Use findchar() which is more optimized than a dummy loop using
PyUnicode_READ().  PyUnicode_READ() is a complex and slow macro.
2011-12-18 02:43:08 +01:00
Victor Stinner f8eac00779 Issue #13623: Fix a performance regression introduced by issue #12170 in
bytes.find() and handle correctly OverflowError (raise the same ValueError than
the error for -1).
2011-12-18 01:17:41 +01:00
Victor Stinner e010fc029d Issue #11231: Fix bytes and bytearray docstrings
Patch written by Brice Berna.
2011-12-17 23:18:43 +01:00
Victor Stinner ab870218e3 Issue #10951: Fix compiler warnings in timemodule.c and unicodeobject.c
Thanks Jérémy Anger for the fix.
2011-12-17 22:39:43 +01:00
Benjamin Peterson f2fe7f0881 fix possible NULL dereference 2011-12-17 08:02:20 -05:00
Victor Stinner 2f197078fb The locale decoder raises a UnicodeDecodeError instead of an OSError
Search the invalid character using mbrtowc().
2011-12-17 07:08:30 +01:00
Victor Stinner 1b57967b96 Issue #13560: Locale codec functions use the classic "errors" parameter,
instead of surrogateescape

So it would be possible to support more error handlers later.
2011-12-17 05:47:23 +01:00
Victor Stinner ab59594326 What's New in Python 3.3: complete the deprecation list
Add also FIXMEs in unicodeobject.c
2011-12-17 04:59:06 +01:00
Victor Stinner 1f33f2b0c3 Issue #13560: os.strerror() now uses the current locale encoding instead of UTF-8 2011-12-17 04:45:09 +01:00
Victor Stinner f2ea71fcc8 Issue #13560: Add PyUnicode_EncodeLocale()
* Use PyUnicode_EncodeLocale() in time.strftime() if wcsftime() is not
   available
 * Document my last changes in Misc/NEWS
2011-12-17 04:13:41 +01:00
Victor Stinner af02e1c85a Add PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale()
* PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale() decode a string
   from the current locale encoding
 * _Py_char2wchar() writes an "error code" in the size argument to indicate
   if the function failed because of memory allocation failure or because of a
   decoding error. The function doesn't write the error message directly to
   stderr.
 * Fix time.strftime() (if wcsftime() is missing): decode strftime() result
   from the current locale encoding, not from the filesystem encoding.
2011-12-16 23:56:01 +01:00
Antoine Pitrou 093ce9cd8c Issue #6695: Full garbage collection runs now clear the freelist of set objects.
Initial patch by Matthias Troffaes.
2011-12-16 11:24:27 +01:00
Benjamin Peterson bfebb7b54a improve abstract property support (closes #11610)
Thanks to Darren Dale for patch.
2011-12-15 15:34:02 -05:00
Antoine Pitrou e0e2735f41 Fix OSError.__init__ and OSError.__new__ so that each of them can be
overriden and take additional arguments (followup to issue #12555).
2011-12-15 14:31:28 +01:00
Antoine Pitrou d73a9acb63 Fix the fix for issue #12149: it was incorrect, although it had the side
effect of appearing to resolve the issue.  Thanks to Mark Shannon for
noticing.
2011-12-15 14:17:36 +01:00
Antoine Pitrou 2e872082f6 Fix the fix for issue #12149: it was incorrect, although it had the side
effect of appearing to resolve the issue.  Thanks to Mark Shannon for
noticing.
2011-12-15 14:15:31 +01:00
Florent Xicluna aa6c1d240f Issue #13575: there is only one class type. 2011-12-12 18:54:29 +01:00
Antoine Pitrou 9d57481f04 Issue #13577: various kinds of descriptors now have a __qualname__ attribute.
Patch by sbt.
2011-12-12 13:47:25 +01:00
Victor Stinner 16e6a80923 PyUnicode_Resize(): warn about canonical representation
Call also directly unicode_resize() in unicodeobject.c
2011-12-12 13:24:15 +01:00
Victor Stinner b0a82a6a7f Fix PyUnicode_Resize() for compact string: leave the string unchanged on error
Fix also PyUnicode_Resize() doc
2011-12-12 13:08:33 +01:00
Victor Stinner bf6e560d0c Make PyUnicode_Copy() private => _PyUnicode_Copy()
Undocument the function.

Make also decode_utf8_errors() as private (static).
2011-12-12 01:53:47 +01:00
Victor Stinner 7a9105a380 resize_copy() now supports legacy ready strings 2011-12-12 00:13:42 +01:00
Victor Stinner 488fa49acf Rewrite PyUnicode_Append(); unicode_modifiable() is more strict
* Rename unicode_resizable() to unicode_modifiable()
 * Rename _PyUnicode_Dirty() to unicode_check_modifiable() to make it clear
   that the function is private
 * Inline PyUnicode_Concat() and unicode_append_inplace() in PyUnicode_Append()
   to simplify the code
 * unicode_modifiable() return 0 if the hash has been computed or if the string
   is not an exact unicode string
 * Remove _PyUnicode_DIRTY(): no need to reset the hash anymore, because if the
   hash has already been computed, you cannot modify a string inplace anymore
 * PyUnicode_Concat() checks for integer overflow
2011-12-12 00:01:39 +01:00
Victor Stinner c4b495497a Create unicode_result_unchanged() subfunction 2011-12-11 22:44:26 +01:00
Victor Stinner eaab604829 Fix fixup() for unchanged unicode subtype
If maxchar_new == 0 and self is a unicode subtype, return u instead of duplicating u.
2011-12-11 22:22:39 +01:00
Victor Stinner e6b2d4407a unicode_fromascii() doesn't check string content twice in debug mode
_PyUnicode_CheckConsistency() also checks string content.
2011-12-11 21:54:30 +01:00
Victor Stinner a1d12bb119 Call directly PyUnicode_DecodeUTF8Stateful() instead of PyUnicode_DecodeUTF8()
* Remove micro-optimization from PyUnicode_FromStringAndSize():
   PyUnicode_DecodeUTF8Stateful() has already these optimizations (for size=0
   and one ascii char).
 * Rename utf8_max_char_size_and_char_count() to utf8_scanner(), and remove an
   useless variable
2011-12-11 21:53:09 +01:00
Victor Stinner 382955ff4e Use directly unicode_empty instead of PyUnicode_New(0, 0) 2011-12-11 21:44:00 +01:00
Victor Stinner 785938eebd Move the slowest UTF-8 decoder to its own subfunction
* Create decode_utf8_errors()
 * Reuse unicode_fromascii()
 * decode_utf8_errors() doesn't refit at the beginning
 * Remove refit_partial_string(), use unicode_adjust_maxchar() instead
2011-12-11 20:09:03 +01:00
Victor Stinner 84def3774d Fix error handling in resize_compact() 2011-12-11 20:04:56 +01:00
Victor Stinner 8faf8216e4 PyUnicode_FromWideChar() and PyUnicode_FromUnicode() raise a ValueError if a
character in not in range [U+0000; U+10ffff].
2011-12-08 22:14:11 +01:00
Antoine Pitrou b0e1f8b38b Issue #13503: Use a more efficient reduction format for bytearrays with
pickle protocol >= 3.  The old reduction format is kept with older
protocols in order to allow unpickling under Python 2.

Patch by Irmen de Jong.
2011-12-05 20:40:08 +01:00
Victor Stinner 0a54cf12a0 Fix PyObject_Repr(): don't call PyUnicode_READY() if res is NULL 2011-12-01 03:22:44 +01:00
Victor Stinner b37b17423b Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
Create an empty string with the new Unicode API.
2011-12-01 03:18:59 +01:00
Victor Stinner db88ae5d66 PyObject_Repr() ensures that the result is a ready Unicode string
And PyObject_Str() and PyObject_Repr() don't make strings ready in debug
mode to ensure that the caller makes the string ready before using it.
2011-12-01 02:15:00 +01:00
Victor Stinner 551ac95733 Py_UNICODE_HIGH_SURROGATE() and Py_UNICODE_LOW_SURROGATE() macros
And use surrogates macros everywhere in unicodeobject.c
2011-11-29 22:58:13 +01:00
Antoine Pitrou c366117820 Merge heads 2011-11-26 01:13:12 +01:00
Antoine Pitrou f0effe6379 Better resolution for issue #11849: Ensure that free()d memory arenas are really released
on POSIX systems supporting anonymous memory mappings.  Patch by Charles-François Natali.
2011-11-26 01:11:02 +01:00
Victor Stinner 6345be9a14 Close #13093: PyUnicode_EncodeDecimal() doesn't support error handlers
different than "strict" anymore. The caller was unable to compute the
size of the output buffer: it depends on the error handler.
2011-11-25 20:09:01 +01:00
Antoine Pitrou 86a36b500a PEP 3155 / issue #13448: Qualified name for classes and functions. 2011-11-25 18:56:07 +01:00
Benjamin Peterson 1518e8713d and back to the "magic" formula (with a comment) it is 2011-11-23 10:44:52 -06:00
Benjamin Peterson 5944c36931 cave to those who like readable code 2011-11-22 19:05:49 -06:00
Benjamin Peterson 0268675193 fix compiler warning by implementing this more cleverly 2011-11-22 15:29:32 -05:00
Victor Stinner ca4f20782e find_maxchar_surrogates() reuses surrogate macros 2011-11-22 03:38:40 +01:00
Victor Stinner 0d3721d986 Issue #13441: Disable temporary the check on the maximum character until
the Solaris issue is solved.

But add assertion on the maximum character in various encoders: UTF-7, UTF-8,
wide character (wchar_t*, Py_UNICODE*), unicode-escape, raw-unicode-escape.

Fix also unicode_encode_ucs1() for backslashreplace error handler: Python is
now always "wide".
2011-11-22 03:27:53 +01:00
Victor Stinner f8facacf30 Fix compiler warnings 2011-11-22 02:30:47 +01:00
Victor Stinner 9d3b93ba30 Use the new Unicode API
* Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
 * Replce PyUnicode_FromUnicode(str, len) by PyUnicode_FromWideChar(str, len)
 * Replace Py_UNICODE by wchar_t
 * posix_putenv() uses PyUnicode_FromFormat() to create the string, instead
   of PyUnicode_FromUnicode() + _snwprintf()
2011-11-22 02:27:30 +01:00
Victor Stinner b84d723509 (Merge 3.2) Issue #13093: Fix error handling on PyUnicode_EncodeDecimal() 2011-11-22 01:50:07 +01:00
Victor Stinner cfed46e00a PyUnicode_FromKindAndData() fails with a ValueError if size < 0 2011-11-22 01:29:14 +01:00
Victor Stinner 42885206ec UTF-8 decoder: set consumed value in the latin1 fast-path 2011-11-22 01:23:02 +01:00
Victor Stinner d3df8ab377 Replace _PyUnicode_READY_REPLACE() and _PyUnicode_ReadyReplace() with unicode_ready()
* unicode_ready() has a simpler API
 * try to reuse unicode_empty and latin1_char singleton everywhere
 * Fix a reference leak in _PyUnicode_TranslateCharmap()
 * PyUnicode_InternInPlace() doesn't try to get a singleton anymore, to avoid
   having to handle a failure
2011-11-22 01:22:34 +01:00
Victor Stinner f01245067a Rewrite PyUnicode_TransformDecimalToASCII() to use the new Unicode API 2011-11-21 23:12:56 +01:00
Victor Stinner 2d718f39a5 Remove an unused variable from PyUnicode_Copy() 2011-11-21 23:11:52 +01:00
Victor Stinner 87af4f2f3a Simplify PyUnicode_Copy()
USe PyUnicode_Copy() in fixup()
2011-11-21 23:03:47 +01:00
Victor Stinner 5bbe5e7c85 Fix a compiler warning in _PyUnicode_CheckConsistency() 2011-11-21 22:54:05 +01:00
Victor Stinner 42bf77537e Rewrite PyUnicode_EncodeDecimal() to use the new Unicode API
Add tests for PyUnicode_EncodeDecimal() and
PyUnicode_TransformDecimalToASCII().
2011-11-21 22:52:58 +01:00
Antoine Pitrou ce4a9da705 Issue #13411: memoryview objects are now hashable when the underlying object is hashable. 2011-11-21 20:46:33 +01:00
Antoine Pitrou 0a3229de6b Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case.
This almost catches up with pre-PEP 393 performance, when decoding needed
only one pass.
2011-11-21 20:39:13 +01:00
Victor Stinner da29cc36aa Issue #13441: _PyUnicode_CheckConsistency() dumps the string if the maximum
character is bigger than U+10FFFF and locale.localeconv() dumps the string
before decoding it.

Temporary hack to debug the issue #13441.
2011-11-21 14:31:41 +01:00
Victor Stinner 9e30aa52fd Fix misuse of PyUnicode_GET_SIZE() => PyUnicode_GET_LENGTH()
And PyUnicode_GetSize() => PyUnicode_GetLength()
2011-11-21 02:49:52 +01:00
Victor Stinner 53b33e767d UnicodeTranslateError uses the new Unicode API
The index is a character index, not a index in a Py_UNICODE* string.
2011-11-21 01:17:27 +01:00
Victor Stinner da1ddf37c6 UnicodeEncodeError uses the new Unicode API
The index is a character index, not a index in a Py_UNICODE* string.
2011-11-20 22:50:23 +01:00
Victor Stinner 4ead7c7be8 PyObject_Str() ensures that the result string is ready
and check the string consistency.

_PyUnicode_CheckConsistency() doesn't check the hash anymore. It should be
possible to call this function even if hash(str) was already called.
2011-11-20 19:48:36 +01:00
Victor Stinner 0fc35196bb stringlib: remove unused STRINGLIB_FILL 2011-11-20 19:30:15 +01:00
Victor Stinner b960b34577 PyUnicode_AsUTF32String() calls directly _PyUnicode_EncodeUTF32(),
instead of calling the deprecated PyUnicode_EncodeUTF32() function
2011-11-20 19:12:52 +01:00
Victor Stinner 77faf69ca1 _PyUnicode_CheckConsistency() also checks maxchar maximum value,
not only its minimum value
2011-11-20 18:56:05 +01:00
Victor Stinner d5c4022d2a Remove the two ugly and unused WRITE_ASCII_OR_WSTR and WRITE_WSTR macros 2011-11-20 18:41:31 +01:00
Victor Stinner 2e9cfadd7c Reuse surrogate macros in UTF-16 decoder 2011-11-20 18:40:27 +01:00
Victor Stinner ae4f7c8e59 charmap_encoding_error() uses the new Unicode API 2011-11-20 18:28:55 +01:00
Victor Stinner ac931b1e5b Use PyUnicode_EncodeCodePage() instead of PyUnicode_EncodeMBCS() with
PyUnicode_AsUnicodeAndSize()
2011-11-20 18:27:03 +01:00
Victor Stinner 22168998f5 charmap encoders uses Py_UCS4, not Py_UNICODE 2011-11-20 17:09:18 +01:00
Antoine Pitrou f34a0cdc6c Issue #10227: Add an allocation cache for a single slice object.
Patch by Stefan Behnel.
2011-11-18 20:14:34 +01:00
Victor Stinner 1f7951711c Catch PyUnicode_AS_UNICODE() errors 2011-11-17 00:45:54 +01:00
Ezio Melotti 11060a4a48 #13406: silence deprecation warnings in test_codecs. 2011-11-16 09:39:10 +02:00
Antoine Pitrou 78edf7576e Issue #13333: The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:44:16 +01:00
Antoine Pitrou 5418ee0b9a Issue #13333: The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:42:21 +01:00
Antoine Pitrou 9a812cbc89 Issue #13389: Full garbage collection passes now clear the freelists for
list and dict objects.  They already cleared other freelists in the
interpreter.
2011-11-15 00:00:12 +01:00
Antoine Pitrou 39aba4f563 Use the small object allocator for small bytearrays 2011-11-12 21:15:28 +01:00
Antoine Pitrou 31b92a534f Sanitize reference management in the utf-8 encoder 2011-11-12 18:35:19 +01:00
Eli Bendersky e92ff0503c Issue #13161: fix doc strings of __i*__ operators. Closes #13161 2011-11-11 17:02:16 +02:00
Eli Bendersky d3baae73be Issue #13161: fix doc strings of __i*__ operators 2011-11-11 16:57:05 +02:00
Antoine Pitrou 0290c7a811 Fix regression on 2-byte wchar_t systems (Windows) 2011-11-11 13:29:12 +01:00
Antoine Pitrou 44c6affc79 Avoid crashing because of an unaligned word access 2011-11-11 02:59:42 +01:00
Antoine Pitrou de20b0b50e Issue #13149: Speed up append-only StringIO objects.
This is very similar to the "lazy strings" idea.
2011-11-10 21:47:38 +01:00
Victor Stinner 9f4b1e9c50 Fix and deprecated the unicode_internal codec
unicode_internal codec uses Py_UNICODE instead of the real internal
representation (PEP 393: Py_UCS1, Py_UCS2 or Py_UCS4) for backward
compatibility.
2011-11-10 20:56:30 +01:00
Victor Stinner 24729f36bf Prefer Py_UCS4 or wchar_t over Py_UNICODE 2011-11-10 20:31:37 +01:00
Victor Stinner ebf3ba808e PyUnicode_DecodeCharmap() uses the new Unicode API 2011-11-10 20:30:22 +01:00
Victor Stinner a98b28c1bf Avoid PyUnicode_AS_UNICODE in the UTF-8 encoder 2011-11-10 20:21:49 +01:00
Victor Stinner 3326cb6a36 Fix "unicode_escape" encoder 2011-11-10 20:15:25 +01:00
Victor Stinner 0e36826a04 Fix UTF-7 encoder on Windows 2011-11-10 20:12:49 +01:00
Martin v. Löwis 1db7c13be1 Port encoders from Py_UNICODE API to unicode object API. 2011-11-10 18:24:32 +01:00
Victor Stinner 62aa4d086a Strip trailing spaces 2011-11-09 00:03:45 +01:00
Victor Stinner 0a045efb49 Fix a compiler warning: use unsiged for maxchar in unicode_widen() 2011-11-09 00:02:42 +01:00
Victor Stinner 596a6c4ffc Fix the code page decoder
* unicode_decode_call_errorhandler() now supports the PyUnicode_WCHAR_KIND
   kind
 * unicode_decode_call_errorhandler() calls copy_characters() instead of
   PyUnicode_CopyCharacters()
2011-11-09 00:02:18 +01:00
Antoine Pitrou a8f63c02ef Fix missing goto 2011-11-08 18:37:16 +01:00