Stefan Krah
4e99a315b7
Issue #14181 : Allow memoryview construction from an object that uses the
...
getbuffer redirection scheme.
2012-03-05 09:30:47 +01:00
Victor Stinner
c9590ad745
Close #14085 : remove assertions from PyUnicode_WRITE macro
...
Add checks in PyUnicode_WriteChar() and convert PyUnicode_New() assertion to a
test raising a Python exception.
2012-03-04 01:34:37 +01:00
Antoine Pitrou
70d2717f2e
Issue #13521 : dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
...
Patch by Filip Gruszczyński.
2012-02-27 00:59:34 +01:00
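A minimal sketch of what the single lookup means in practice, assuming a current CPython that includes this change. The `CountingKey` class is hypothetical and exists only to count how often the key is hashed.

```python
class CountingKey:
    """Hypothetical key type that records how often it is hashed."""

    def __init__(self, name):
        self.name = name
        self.hash_calls = 0

    def __hash__(self):
        self.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


key = CountingKey("a")
cache = {}

# One call both looks up the key and inserts the default if it is missing.
bucket = cache.setdefault(key, [])
bucket.append(1)

print(key.hash_calls)  # number of hash calls made by setdefault()
```

Because the key is hashed and looked up once, user code in `__hash__` or `__eq__` has no second lookup in which to interfere between the check and the insert.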
Antoine Pitrou
e965d97ed1
Issue #13521 : dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
...
Patch by Filip Gruszczyński.
2012-02-27 00:45:12 +01:00
Nick Coghlan
ab7bf2143e
Close issue #6210 : Implement PEP 409
2012-02-26 17:49:52 +10:00
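PEP 409 adds the `raise ... from None` form for suppressing the implicit exception context. A minimal sketch follows; the `lookup` function is hypothetical, and the `__suppress_context__` attribute it inspects comes from the later PEP 415 refinement found in released Python 3.3, not necessarily from this commit.

```python
def lookup(mapping, key):
    """Hypothetical helper that hides the internal KeyError from callers."""
    try:
        return mapping[key]
    except KeyError:
        # "from None" keeps the KeyError out of the displayed traceback.
        raise ValueError(f"unknown key: {key!r}") from None


try:
    lookup({}, "missing")
except ValueError as err:
    caught = err

print(caught.__cause__)             # None
print(caught.__suppress_context__)  # True
```

The original `KeyError` is still reachable through `caught.__context__`; only its display is suppressed.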
Ezio Melotti
cda6b6d60d
#14081 : The sep and maxsplit parameters to str.split, bytes.split, and bytearray.split may now be passed as keyword arguments.
2012-02-26 09:39:55 +02:00
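A short illustration of the keyword form, assuming an interpreter that includes this change. The sample strings are made up.

```python
line = "key=value=with=equals"

# Split only on the first separator, naming both parameters.
head, tail = line.split(sep="=", maxsplit=1)

# maxsplit can be given by name while keeping whitespace splitting.
parts = b"a b c d".split(maxsplit=2)

# bytearray accepts the same keywords.
fields = bytearray(b"x,y,z").split(sep=b",")

print(head, tail)
print(parts)
print(fields)
```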
Stefan Krah
9a2d99e28a
- Issue #10181 : New memoryview implementation fixes multiple ownership
...
and lifetime issues of dynamically allocated Py_buffer members (#9990 )
as well as crashes (#8305 , #7433 ). Many new features have been added
(See whatsnew/3.3), and the documentation has been updated extensively.
The ndarray test object from _testbuffer.c implements all aspects of
PEP-3118, so further development towards the complete implementation
of the PEP can proceed in a test-driven manner.
Thanks to Nick Coghlan, Antoine Pitrou and Pauli Virtanen for review
and many ideas.
- Issue #12834 : Fix incorrect results of memoryview.tobytes() for
non-contiguous arrays.
- Issue #5231 : Introduce memoryview.cast() method that allows changing
format and shape without making a copy of the underlying memory.
2012-02-25 12:24:21 +01:00
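A minimal sketch of two behaviours named in this entry, assuming Python 3.3 or later: `memoryview.cast()` reshaping without a copy, and `tobytes()` on a non-contiguous view. The buffer contents are arbitrary.

```python
data = bytearray(range(6))
flat = memoryview(data)

# Reinterpret the six bytes as a 2x3 grid; no memory is copied.
grid = flat.cast("B", shape=[2, 3])
rows = grid.tolist()

# A write to the underlying buffer is visible through the reshaped view.
data[0] = 255
first_cell = grid.tolist()[0][0]

# A strided slice is non-contiguous; tobytes() gathers its elements.
every_other = memoryview(b"abcdef")[::2]
gathered = every_other.tobytes()

print(rows, first_cell, gathered)
```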
Victor Stinner
6f73874edd
Close #14095 : type.__new__() doesn't remove __qualname__ key from the class
...
dict anymore if the key is present. Reject also non-string qualified names.
And fix reference leaks in type.__new__().
2012-02-25 01:22:36 +01:00
Victor Stinner
b0800dc53b
Oops, revert unwanted changes
2012-02-25 00:47:08 +01:00
Victor Stinner
abc649ddbe
Issue #14107 : fix bigmem tests on str.capitalize(), str.swapcase() and
...
str.title(). Compute correctly how much memory is required for the test
(memuse).
2012-02-25 00:43:27 +01:00
Antoine Pitrou
842c0f17eb
Fix compilation error under Windows (and warnings too).
2012-02-24 13:30:46 +01:00
Victor Stinner
90f50d4df9
Issue #13706 : Fix format(float, "n") for locale with non-ASCII decimal point (e.g. ps_AF)
2012-02-24 01:44:47 +01:00
Victor Stinner
41a863cb81
Issue #13706 : Fix format(int, "n") for locale with non-ASCII thousands separator
...
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
(from the locale encoding), instead of decoding them implicitly from latin1
* Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
* Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
character if unicode is NULL
* Replace MIN/MAX macros by Py_MIN/Py_MAX
* stringlib/undef.h undefines STRINGLIB_IS_UNICODE
* stringlib/localeutil.h only supports Unicode
2012-02-24 00:37:51 +01:00
Victor Stinner
b429d3b09c
Fix doc of an internal function: unicode_write_cstr()
2012-02-22 21:22:20 +01:00
Antoine Pitrou
ba6bafcfbe
Fix compile failure under Windows
2012-02-22 16:41:50 +01:00
Victor Stinner
c516610f0b
Optimize str%arg for number formats: %i, %d, %u, %x, %p
...
Write a specialized function to write an ASCII/latin1 C char* string into a
Python Unicode string.
2012-02-22 13:55:02 +01:00
Victor Stinner
99d7ad0bb0
Micro-optimize computation of maxchar in PyUnicode_TransformDecimalToASCII()
2012-02-22 13:37:39 +01:00
Victor Stinner
da79e632c4
Micro-optimize unicode_expandtabs(): use FILL() macro to write N spaces
2012-02-22 13:37:04 +01:00
Victor Stinner
15e9ed299c
PyUnicode_New() and unicode_putchar() check for MAX_UNICODE maximum (U+10FFFF)
2012-02-22 13:36:20 +01:00
Benjamin Peterson
d9a3591ed1
merge 3.2
2012-02-21 11:12:14 -05:00
Benjamin Peterson
e249dcab7a
merge 3.2
2012-02-21 11:09:13 -05:00
Benjamin Peterson
69e9727657
ensure no one tries to hash things before the random seed is found
2012-02-21 11:08:50 -05:00
Benjamin Peterson
71f660e00f
update to Unicode 6.1
2012-02-20 22:24:29 -05:00
Georg Brandl
16fa2a1097
Forgot the "empty string -> hash == 0" special case for strings.
2012-02-21 00:50:13 +01:00
Georg Brandl
2fb477c0f0
Merge 3.2: Issue #13703 plus some related test suite fixes.
2012-02-21 00:33:36 +01:00
Georg Brandl
09a7c72cad
Merge from 3.1: Issue #13703 : add a way to randomize the hash values of basic types (str, bytes, datetime)
...
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.
The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 21:31:46 +01:00
Georg Brandl
2daf6ae249
Issue #13703 : add a way to randomize the hash values of basic types (str, bytes, datetime)
...
in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated.
The environment variable PYTHONHASHSEED and the new command line flag -R control this
behavior.
2012-02-20 19:54:16 +01:00
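A rough sketch of how `PYTHONHASHSEED` can be observed from the outside, assuming a Python where it is honoured (randomization became the default in 3.3). The `string_hash` helper is hypothetical; it runs a child interpreter with a chosen seed.

```python
import os
import subprocess
import sys


def string_hash(seed):
    """Return hash('abc') as computed by a child interpreter with this seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('abc'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# The same seed gives reproducible hashes across separate processes.
first = string_hash(123)
second = string_hash(123)
print(first == second)
```

Fixing the seed is mainly useful for reproducing an ordering-dependent failure; leaving it random is what provides the protection described in the commit message.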
Benjamin Peterson
006c5a2235
check for NULL to fix segfault
2012-02-19 20:36:12 -05:00
Benjamin Peterson
23d7f12ffb
use new generic __dict__ descriptor implementations
2012-02-19 20:02:57 -05:00
Benjamin Peterson
8eb1269c34
add generic implementation of a __dict__ descriptor for C types
2012-02-19 19:59:10 -05:00
Benjamin Peterson
b900d6a78c
initialize __dict__ if needed
2012-02-19 10:17:30 -05:00
Benjamin Peterson
2cf936fe7a
use defaults
2012-02-19 01:16:13 -05:00
Benjamin Peterson
84e821e961
merge 3.2
2012-02-19 01:14:21 -05:00
Benjamin Peterson
496c53d83e
use Py_CLEAR
2012-02-19 01:11:56 -05:00
Benjamin Peterson
01d7eba316
allow arbitrary attributes on classmethod and staticmethod ( closes #14051 )
2012-02-19 01:10:25 -05:00
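A small illustration of the change, assuming an interpreter that includes it. The function names and attribute names are invented for the example.

```python
def helper():
    return 42


def build(cls):
    return cls()


# The wrapper objects themselves now accept arbitrary attributes.
static_wrapper = staticmethod(helper)
static_wrapper.note = "added after construction"

class_wrapper = classmethod(build)
class_wrapper.tags = ("factory",)

print(static_wrapper.note, class_wrapper.tags)
```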
Antoine Pitrou
552be9b214
Issue #13020 : Fix a reference leak when allocating a structsequence object fails.
...
Patch by Suman Saha.
2012-02-15 02:54:33 +01:00
Antoine Pitrou
4b3c7846c9
Fix indentation
2012-02-15 02:52:58 +01:00
Antoine Pitrou
37784ba5c0
Issue #13020 : Fix a reference leak when allocating a structsequence object fails.
...
Patch by Suman Saha.
2012-02-15 02:51:43 +01:00
Victor Stinner
c3a6b02d70
(Merge 3.2) Issue #13913 : normalize utf-8 codec name in UTF-8 decoder
2012-02-14 01:18:10 +01:00
Victor Stinner
cbe01342bc
Issue #13913 : normalize utf-8 codec name in UTF-8 decoder
2012-02-14 01:17:45 +01:00
Benjamin Peterson
67e700697e
merge 3.2
2012-02-10 08:47:04 -05:00
Benjamin Peterson
efe7c9d4d7
this is only a borrowed ref in Brett's branch
2012-02-10 08:46:54 -05:00
Victor Stinner
d1cd99b533
Backout d2c1521ad0a1: _Py_IDENTIFIER() uses UTF-8 again
2012-02-07 23:05:55 +01:00
Benjamin Peterson
9878b63c7c
merge 3.2
2012-02-06 11:30:05 -05:00
Benjamin Peterson
2f9c71bbba
bltinmod is borrowed, so it shouldn't be decrefed
2012-02-06 11:28:45 -05:00
Victor Stinner
d446d8e09a
_Py_Identifier are always ASCII strings
2012-02-05 01:45:45 +01:00
Benjamin Peterson
951138c795
merge 3.2
2012-02-03 19:25:01 -05:00
Benjamin Peterson
90b13583bc
put returns on their own lines
2012-02-03 19:22:31 -05:00
Benjamin Peterson
2372bb0722
merge 3.2 ( closes #13908 )
2012-01-29 20:17:07 -05:00
Benjamin Peterson
2652d2570e
ready types returned from PyType_FromSpec
2012-01-29 20:16:37 -05:00
Benjamin Peterson
e28108cbd7
adjust declaration
2012-01-29 20:13:18 -05:00
Antoine Pitrou
7ab4af0427
Issue #13848 : open() and the FileIO constructor now check for NUL characters in the file name.
...
Patch by Hynek Schlawack.
2012-01-29 18:43:36 +01:00
Antoine Pitrou
1334884ff2
Issue #13848 : open() and the FileIO constructor now check for NUL characters in the file name.
...
Patch by Hynek Schlawack.
2012-01-29 18:36:34 +01:00
Mark Dickinson
963816defc
Merge 3.2 -> default (issue 13889)
2012-01-27 21:17:04 +00:00
Mark Dickinson
261896b559
Issue #13889 : Add missing _Py_SET_53BIT_PRECISION_* calls around uses of dtoa.c functions in float round.
2012-01-27 21:16:01 +00:00
Georg Brandl
20f6bc20bd
merge with 3.2
2012-01-22 21:31:39 +01:00
Georg Brandl
beca27a394
Fix #13834 : strip() strips leading and trailing whitespace.
2012-01-22 21:31:21 +01:00
Benjamin Peterson
ce79852077
use the static identifier api for looking up special methods
...
I had to move the static identifier code from unicodeobject.h to object.h in
order for this to work.
2012-01-22 11:24:29 -05:00
Antoine Pitrou
ac456a1839
Fix some of the remaining test_capi leaks
2012-01-18 21:35:21 +01:00
Antoine Pitrou
8b0a74e936
Fix some of the remaining test_capi refleaks
2012-01-18 21:29:05 +01:00
Antoine Pitrou
84091bfa45
Fix some of the refleaks in test_capi (ported from 3.2)
2012-01-18 21:24:18 +01:00
Antoine Pitrou
55f217f22d
Fix refleaks in test_capi
...
(this was easier than I thought!)
2012-01-18 21:23:13 +01:00
Antoine Pitrou
bb5b92d324
Merge refleak fixes from 3.2
2012-01-18 16:19:19 +01:00
Antoine Pitrou
1c7ade5284
Fix leaking a RuntimeError object when creating sub-interpreters
2012-01-18 16:13:31 +01:00
Benjamin Peterson
eea4846d23
don't ready in case_operation, since most callers do it themselves
2012-01-16 14:28:50 -05:00
Benjamin Peterson
c6630b9291
fix old titlecase function for extended case chars
2012-01-15 21:33:32 -05:00
Benjamin Peterson
9487c4db82
comment about how flags could be expanded
2012-01-15 21:26:23 -05:00
Benjamin Peterson
ad9c569825
delta encoding of upper/lower/title makes a glorious return ( #12736 )
2012-01-15 21:19:20 -05:00
Gregory P. Smith
f5b62a9b31
Consolidate the occurrences of the prime used as the multiplier when hashing.
2012-01-14 15:45:13 -08:00
Gregory P. Smith
63e6c3222f
Consolidate the occurrences of the prime used as the multiplier when hashing
...
to a single #define instead of having several copies in several files.
This excludes the Modules/ tree (datetime and expat both have a copy
for their own purposes with no need for it to be the same).
2012-01-14 15:31:34 -08:00
Benjamin Peterson
c8d8b8861e
fix possible refleaks if PyUnicode_READY fails
2012-01-14 13:37:31 -05:00
Benjamin Peterson
bac79498c8
always explicitly check for -1 from PyUnicode_READY
2012-01-14 13:34:47 -05:00
Benjamin Peterson
d5890c8db5
add str.casefold() ( closes #13752 )
2012-01-14 13:23:30 -05:00
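A brief example of how `str.casefold()` differs from `str.lower()` for caseless comparison, assuming Python 3.3 or later. The sample words are arbitrary.

```python
left = "Straße"
right = "STRASSE"

# lower() leaves the German sharp s untouched, so the strings differ.
lower_equal = left.lower() == right.lower()

# casefold() expands it to "ss", which makes the comparison caseless.
folded_equal = left.casefold() == right.casefold()

print(lower_equal, folded_equal)
```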
Nick Coghlan
138f4656e3
Add a separate NEWS entry for a change to PyObject_CallMethod in the PEP 380 patch, and make the private CallMethod variants consistent with the public one
2012-01-14 16:45:48 +10:00
Amaury Forgeot d'Arc
e557da804a
Fix a crash when the return value of a subgenerator is a temporary
...
object (with a refcount of 1)
2012-01-13 21:06:12 +01:00
Benjamin Peterson
53aa1d7c57
fix possible if unlikely leak
2011-12-20 13:29:45 -06:00
Georg Brandl
ac0675cc01
Small clarification in docstring of dict.update(): the positional argument is not required.
2011-12-18 19:30:55 +01:00
Victor Stinner
bb2e9c477d
Issue #11231 : Fix bytes and bytearray docstrings
...
Patch written by Brice Berna.
2011-12-17 23:18:07 +01:00
Nick Coghlan
1f7ce62bd6
Implement PEP 380 - 'yield from' ( closes #11682 )
2012-01-13 21:43:40 +10:00
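A minimal sketch of `yield from` delegation as described by PEP 380, including the subgenerator's return value becoming the value of the `yield from` expression. The generator names are illustrative.

```python
def inner():
    yield 1
    yield 2
    return "done"  # becomes the value of the yield from expression


def outer():
    # Delegate to inner(); its yielded values pass straight through.
    result = yield from inner()
    yield result


values = list(outer())
print(values)
```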
Benjamin Peterson
e51757f6de
move do_title to a better place
2012-01-12 21:10:29 -05:00
Benjamin Peterson
821e4cfd01
make fix_decimal_and_space_to_ascii check if it modifies the string
2012-01-12 15:40:18 -05:00
Benjamin Peterson
0c91392fe6
kill capwords implementation which has been disabled since the beginning
2012-01-12 15:25:41 -05:00
Benjamin Peterson
21e0da228d
remove some usage of Py_UNICODE_TOUPPER/LOWER
2012-01-11 21:00:42 -05:00
Benjamin Peterson
b2bf01d824
use full unicode mappings for upper/lower/title case ( #12736 )
...
Also broaden the category of characters that count as lowercase/uppercase.
2012-01-11 18:17:06 -05:00
Antoine Pitrou
94f6fa62bf
Issue #13738 : Simplify implementation of bytes.lower() and bytes.upper().
2012-01-08 16:22:46 +01:00
Victor Stinner
3fe553160c
Add a new PyUnicode_Fill() function
...
It is faster than the unicode_fill() function which was implemented in
formatter_unicode.c.
2012-01-04 00:33:50 +01:00
Benjamin Peterson
5e458f520c
also decref the right thing
2012-01-02 10:12:13 -06:00
Benjamin Peterson
4c13a4a352
ready the correct string
2012-01-02 09:07:38 -06:00
Benjamin Peterson
22a29708fd
fix some possible refleaks from PyUnicode_READY error conditions
2012-01-02 09:00:30 -06:00
Benjamin Peterson
9ca3ffac94
== -1 is convention
2012-01-01 16:04:29 -06:00
Benjamin Peterson
e157cf1012
make switch more robust
2012-01-01 15:56:20 -06:00
Benjamin Peterson
2199227be4
fix weird indentation
2011-12-28 12:01:31 -06:00
Antoine Pitrou
5b62942074
Issue #13577 : Built-in methods and functions now have a __qualname__.
...
Patch by sbt.
2011-12-23 12:40:16 +01:00
Benjamin Peterson
c0b95d18fa
4 space indentation
2011-12-20 17:24:05 -06:00
Benjamin Peterson
ead6b53659
fix spacing around switch statements
2011-12-20 17:23:42 -06:00
Benjamin Peterson
822c790527
merge 3.2
2011-12-20 13:32:50 -06:00
Georg Brandl
f928b5d27e
Merge with 3.2.
2011-12-18 19:32:37 +01:00
Victor Stinner
6099a03202
Issue #13624 : Write a specialized UTF-8 encoder to allow more optimization
...
The main bottleneck was the PyUnicode_READ() macro.
2011-12-18 14:22:26 +01:00
Victor Stinner
73f53b57d1
Optimize str * n for len(str)==1 and UCS-2 or UCS-4
2011-12-18 03:26:31 +01:00
Victor Stinner
f644110816
Issue #13621 : Optimize str.replace(char1, char2)
...
Use findchar() which is more optimized than a dummy loop using
PyUnicode_READ(). PyUnicode_READ() is a complex and slow macro.
2011-12-18 02:43:08 +01:00
Victor Stinner
f8eac00779
Issue #13623 : Fix a performance regression introduced by issue #12170 in
...
bytes.find() and correctly handle OverflowError (raise the same ValueError as
the error for -1).
2011-12-18 01:17:41 +01:00
Victor Stinner
e010fc029d
Issue #11231 : Fix bytes and bytearray docstrings
...
Patch written by Brice Berna.
2011-12-17 23:18:43 +01:00
Victor Stinner
ab870218e3
Issue #10951 : Fix compiler warnings in timemodule.c and unicodeobject.c
...
Thanks Jérémy Anger for the fix.
2011-12-17 22:39:43 +01:00
Benjamin Peterson
f2fe7f0881
fix possible NULL dereference
2011-12-17 08:02:20 -05:00
Victor Stinner
2f197078fb
The locale decoder raises a UnicodeDecodeError instead of an OSError
...
Search the invalid character using mbrtowc().
2011-12-17 07:08:30 +01:00
Victor Stinner
1b57967b96
Issue #13560 : Locale codec functions use the classic "errors" parameter,
...
instead of surrogateescape
So it would be possible to support more error handlers later.
2011-12-17 05:47:23 +01:00
Victor Stinner
ab59594326
What's New in Python 3.3: complete the deprecation list
...
Add also FIXMEs in unicodeobject.c
2011-12-17 04:59:06 +01:00
Victor Stinner
1f33f2b0c3
Issue #13560 : os.strerror() now uses the current locale encoding instead of UTF-8
2011-12-17 04:45:09 +01:00
Victor Stinner
f2ea71fcc8
Issue #13560 : Add PyUnicode_EncodeLocale()
...
* Use PyUnicode_EncodeLocale() in time.strftime() if wcsftime() is not
available
* Document my last changes in Misc/NEWS
2011-12-17 04:13:41 +01:00
Victor Stinner
af02e1c85a
Add PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale()
...
* PyUnicode_DecodeLocaleAndSize() and PyUnicode_DecodeLocale() decode a string
from the current locale encoding
* _Py_char2wchar() writes an "error code" in the size argument to indicate
if the function failed because of memory allocation failure or because of a
decoding error. The function doesn't write the error message directly to
stderr.
* Fix time.strftime() (if wcsftime() is missing): decode strftime() result
from the current locale encoding, not from the filesystem encoding.
2011-12-16 23:56:01 +01:00
Antoine Pitrou
093ce9cd8c
Issue #6695 : Full garbage collection runs now clear the freelist of set objects.
...
Initial patch by Matthias Troffaes.
2011-12-16 11:24:27 +01:00
Benjamin Peterson
bfebb7b54a
improve abstract property support ( closes #11610 )
...
Thanks to Darren Dale for patch.
2011-12-15 15:34:02 -05:00
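A short sketch of an abstract property built by stacking `property` on `abstractmethod`, which this change makes work. The `Shape` and `Square` classes are hypothetical, and `abc.ABC` is a later convenience base class, not part of this commit.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @property
    @abstractmethod
    def area(self):
        """Subclasses must provide the area."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    @property
    def area(self):
        return self.side ** 2


try:
    Shape()
    abstract_instantiable = True
except TypeError:
    # The abstract property blocks instantiation of the base class.
    abstract_instantiable = False

square_area = Square(3).area
print(abstract_instantiable, square_area)
```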
Antoine Pitrou
e0e2735f41
Fix OSError.__init__ and OSError.__new__ so that each of them can be
...
overridden and take additional arguments (followup to issue #12555 ).
2011-12-15 14:31:28 +01:00
Antoine Pitrou
d73a9acb63
Fix the fix for issue #12149 : it was incorrect, although it had the side
...
effect of appearing to resolve the issue. Thanks to Mark Shannon for
noticing.
2011-12-15 14:17:36 +01:00
Antoine Pitrou
2e872082f6
Fix the fix for issue #12149 : it was incorrect, although it had the side
...
effect of appearing to resolve the issue. Thanks to Mark Shannon for
noticing.
2011-12-15 14:15:31 +01:00
Florent Xicluna
aa6c1d240f
Issue #13575 : there is only one class type.
2011-12-12 18:54:29 +01:00
Antoine Pitrou
9d57481f04
Issue #13577 : various kinds of descriptors now have a __qualname__ attribute.
...
Patch by sbt.
2011-12-12 13:47:25 +01:00
Victor Stinner
16e6a80923
PyUnicode_Resize(): warn about canonical representation
...
Call also directly unicode_resize() in unicodeobject.c
2011-12-12 13:24:15 +01:00
Victor Stinner
b0a82a6a7f
Fix PyUnicode_Resize() for compact string: leave the string unchanged on error
...
Fix also PyUnicode_Resize() doc
2011-12-12 13:08:33 +01:00
Victor Stinner
bf6e560d0c
Make PyUnicode_Copy() private => _PyUnicode_Copy()
...
Undocument the function.
Also make decode_utf8_errors() private (static).
2011-12-12 01:53:47 +01:00
Victor Stinner
7a9105a380
resize_copy() now supports legacy ready strings
2011-12-12 00:13:42 +01:00
Victor Stinner
488fa49acf
Rewrite PyUnicode_Append(); unicode_modifiable() is more strict
...
* Rename unicode_resizable() to unicode_modifiable()
* Rename _PyUnicode_Dirty() to unicode_check_modifiable() to make it clear
that the function is private
* Inline PyUnicode_Concat() and unicode_append_inplace() in PyUnicode_Append()
to simplify the code
* unicode_modifiable() return 0 if the hash has been computed or if the string
is not an exact unicode string
* Remove _PyUnicode_DIRTY(): no need to reset the hash anymore, because if the
hash has already been computed, you cannot modify a string inplace anymore
* PyUnicode_Concat() checks for integer overflow
2011-12-12 00:01:39 +01:00
Victor Stinner
c4b495497a
Create unicode_result_unchanged() subfunction
2011-12-11 22:44:26 +01:00
Victor Stinner
eaab604829
Fix fixup() for unchanged unicode subtype
...
If maxchar_new == 0 and self is a unicode subtype, return u instead of duplicating u.
2011-12-11 22:22:39 +01:00
Victor Stinner
e6b2d4407a
unicode_fromascii() doesn't check string content twice in debug mode
...
_PyUnicode_CheckConsistency() also checks string content.
2011-12-11 21:54:30 +01:00
Victor Stinner
a1d12bb119
Call directly PyUnicode_DecodeUTF8Stateful() instead of PyUnicode_DecodeUTF8()
...
* Remove micro-optimization from PyUnicode_FromStringAndSize():
PyUnicode_DecodeUTF8Stateful() has already these optimizations (for size=0
and one ascii char).
* Rename utf8_max_char_size_and_char_count() to utf8_scanner(), and remove an
unneeded variable
2011-12-11 21:53:09 +01:00
Victor Stinner
382955ff4e
Use directly unicode_empty instead of PyUnicode_New(0, 0)
2011-12-11 21:44:00 +01:00
Victor Stinner
785938eebd
Move the slowest UTF-8 decoder to its own subfunction
...
* Create decode_utf8_errors()
* Reuse unicode_fromascii()
* decode_utf8_errors() doesn't refit at the beginning
* Remove refit_partial_string(), use unicode_adjust_maxchar() instead
2011-12-11 20:09:03 +01:00
Victor Stinner
84def3774d
Fix error handling in resize_compact()
2011-12-11 20:04:56 +01:00
Victor Stinner
8faf8216e4
PyUnicode_FromWideChar() and PyUnicode_FromUnicode() raise a ValueError if a
...
character is not in range [U+0000; U+10ffff].
2011-12-08 22:14:11 +01:00
Antoine Pitrou
b0e1f8b38b
Issue #13503 : Use a more efficient reduction format for bytearrays with
...
pickle protocol >= 3. The old reduction format is kept with older
protocols in order to allow unpickling under Python 2.
Patch by Irmen de Jong.
2011-12-05 20:40:08 +01:00
Victor Stinner
0a54cf12a0
Fix PyObject_Repr(): don't call PyUnicode_READY() if res is NULL
2011-12-01 03:22:44 +01:00
Victor Stinner
b37b17423b
Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
...
Create an empty string with the new Unicode API.
2011-12-01 03:18:59 +01:00
Victor Stinner
db88ae5d66
PyObject_Repr() ensures that the result is a ready Unicode string
...
And PyObject_Str() and PyObject_Repr() don't make strings ready in debug
mode to ensure that the caller makes the string ready before using it.
2011-12-01 02:15:00 +01:00
Victor Stinner
551ac95733
Py_UNICODE_HIGH_SURROGATE() and Py_UNICODE_LOW_SURROGATE() macros
...
And use surrogates macros everywhere in unicodeobject.c
2011-11-29 22:58:13 +01:00
Antoine Pitrou
c366117820
Merge heads
2011-11-26 01:13:12 +01:00
Antoine Pitrou
f0effe6379
Better resolution for issue #11849 : Ensure that free()d memory arenas are really released
...
on POSIX systems supporting anonymous memory mappings. Patch by Charles-François Natali.
2011-11-26 01:11:02 +01:00
Victor Stinner
6345be9a14
Close #13093 : PyUnicode_EncodeDecimal() doesn't support error handlers
...
different than "strict" anymore. The caller was unable to compute the
size of the output buffer: it depends on the error handler.
2011-11-25 20:09:01 +01:00
Antoine Pitrou
86a36b500a
PEP 3155 / issue #13448 : Qualified name for classes and functions.
2011-11-25 18:56:07 +01:00
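A small example of the qualified names introduced by PEP 3155, assuming Python 3.3 or later. The class and function names are made up.

```python
class Outer:
    class Inner:
        def method(self):
            pass


def make():
    def nested():
        pass

    return nested


# __name__ gives only the last component; __qualname__ gives the dotted path.
method_name = Outer.Inner.method.__name__
method_qualname = Outer.Inner.method.__qualname__
nested_qualname = make().__qualname__

print(method_name, method_qualname, nested_qualname)
```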
Benjamin Peterson
1518e8713d
and back to the "magic" formula (with a comment) it is
2011-11-23 10:44:52 -06:00
Benjamin Peterson
5944c36931
cave to those who like readable code
2011-11-22 19:05:49 -06:00
Benjamin Peterson
0268675193
fix compiler warning by implementing this more cleverly
2011-11-22 15:29:32 -05:00
Victor Stinner
ca4f20782e
find_maxchar_surrogates() reuses surrogate macros
2011-11-22 03:38:40 +01:00
Victor Stinner
0d3721d986
Issue #13441 : Temporarily disable the check on the maximum character until
...
the Solaris issue is solved.
But add assertion on the maximum character in various encoders: UTF-7, UTF-8,
wide character (wchar_t*, Py_UNICODE*), unicode-escape, raw-unicode-escape.
Fix also unicode_encode_ucs1() for backslashreplace error handler: Python is
now always "wide".
2011-11-22 03:27:53 +01:00
Victor Stinner
f8facacf30
Fix compiler warnings
2011-11-22 02:30:47 +01:00
Victor Stinner
9d3b93ba30
Use the new Unicode API
...
* Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0)
* Replace PyUnicode_FromUnicode(str, len) by PyUnicode_FromWideChar(str, len)
* Replace Py_UNICODE by wchar_t
* posix_putenv() uses PyUnicode_FromFormat() to create the string, instead
of PyUnicode_FromUnicode() + _snwprintf()
2011-11-22 02:27:30 +01:00
Victor Stinner
b84d723509
(Merge 3.2) Issue #13093 : Fix error handling on PyUnicode_EncodeDecimal()
2011-11-22 01:50:07 +01:00
Victor Stinner
cfed46e00a
PyUnicode_FromKindAndData() fails with a ValueError if size < 0
2011-11-22 01:29:14 +01:00
Victor Stinner
42885206ec
UTF-8 decoder: set consumed value in the latin1 fast-path
2011-11-22 01:23:02 +01:00
Victor Stinner
d3df8ab377
Replace _PyUnicode_READY_REPLACE() and _PyUnicode_ReadyReplace() with unicode_ready()
...
* unicode_ready() has a simpler API
* try to reuse unicode_empty and latin1_char singleton everywhere
* Fix a reference leak in _PyUnicode_TranslateCharmap()
* PyUnicode_InternInPlace() doesn't try to get a singleton anymore, to avoid
having to handle a failure
2011-11-22 01:22:34 +01:00
Victor Stinner
f01245067a
Rewrite PyUnicode_TransformDecimalToASCII() to use the new Unicode API
2011-11-21 23:12:56 +01:00
Victor Stinner
2d718f39a5
Remove an unused variable from PyUnicode_Copy()
2011-11-21 23:11:52 +01:00
Victor Stinner
87af4f2f3a
Simplify PyUnicode_Copy()
...
Use PyUnicode_Copy() in fixup()
2011-11-21 23:03:47 +01:00
Victor Stinner
5bbe5e7c85
Fix a compiler warning in _PyUnicode_CheckConsistency()
2011-11-21 22:54:05 +01:00
Victor Stinner
42bf77537e
Rewrite PyUnicode_EncodeDecimal() to use the new Unicode API
...
Add tests for PyUnicode_EncodeDecimal() and
PyUnicode_TransformDecimalToASCII().
2011-11-21 22:52:58 +01:00
Antoine Pitrou
ce4a9da705
Issue #13411 : memoryview objects are now hashable when the underlying object is hashable.
2011-11-21 20:46:33 +01:00
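A minimal sketch of memoryview hashing, assuming a current CPython: a view over an immutable bytes object hashes like the bytes, while a view over a writable buffer is rejected. The payload is arbitrary.

```python
payload = b"abc"
view = memoryview(payload)

# A read-only view hashes the same as the bytes it exposes.
same_hash = hash(view) == hash(payload)

try:
    hash(memoryview(bytearray(payload)))
    writable_hashable = True
except (TypeError, ValueError):
    # A view over a mutable buffer cannot be hashed.
    writable_hashable = False

print(same_hash, writable_hashable)
```

This makes read-only views usable as dictionary keys or set members without first copying them to bytes.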
Antoine Pitrou
0a3229de6b
Issue #13417 : speed up utf-8 decoding by around 2x for the non-fully-ASCII case.
...
This almost catches up with pre-PEP 393 performance, when decoding needed
only one pass.
2011-11-21 20:39:13 +01:00
Victor Stinner
da29cc36aa
Issue #13441 : _PyUnicode_CheckConsistency() dumps the string if the maximum
...
character is bigger than U+10FFFF and locale.localeconv() dumps the string
before decoding it.
Temporary hack to debug the issue #13441 .
2011-11-21 14:31:41 +01:00
Victor Stinner
9e30aa52fd
Fix misuse of PyUnicode_GET_SIZE() => PyUnicode_GET_LENGTH()
...
And PyUnicode_GetSize() => PyUnicode_GetLength()
2011-11-21 02:49:52 +01:00
Victor Stinner
53b33e767d
UnicodeTranslateError uses the new Unicode API
...
The index is a character index, not an index in a Py_UNICODE* string.
2011-11-21 01:17:27 +01:00
Victor Stinner
da1ddf37c6
UnicodeEncodeError uses the new Unicode API
...
The index is a character index, not an index in a Py_UNICODE* string.
2011-11-20 22:50:23 +01:00
Victor Stinner
4ead7c7be8
PyObject_Str() ensures that the result string is ready
...
and check the string consistency.
_PyUnicode_CheckConsistency() doesn't check the hash anymore. It should be
possible to call this function even if hash(str) was already called.
2011-11-20 19:48:36 +01:00
Victor Stinner
0fc35196bb
stringlib: remove unused STRINGLIB_FILL
2011-11-20 19:30:15 +01:00
Victor Stinner
b960b34577
PyUnicode_AsUTF32String() calls directly _PyUnicode_EncodeUTF32(),
...
instead of calling the deprecated PyUnicode_EncodeUTF32() function
2011-11-20 19:12:52 +01:00
Victor Stinner
77faf69ca1
_PyUnicode_CheckConsistency() also checks maxchar maximum value,
...
not only its minimum value
2011-11-20 18:56:05 +01:00
Victor Stinner
d5c4022d2a
Remove the two ugly and unused WRITE_ASCII_OR_WSTR and WRITE_WSTR macros
2011-11-20 18:41:31 +01:00
Victor Stinner
2e9cfadd7c
Reuse surrogate macros in UTF-16 decoder
2011-11-20 18:40:27 +01:00
Victor Stinner
ae4f7c8e59
charmap_encoding_error() uses the new Unicode API
2011-11-20 18:28:55 +01:00
Victor Stinner
ac931b1e5b
Use PyUnicode_EncodeCodePage() instead of PyUnicode_EncodeMBCS() with
...
PyUnicode_AsUnicodeAndSize()
2011-11-20 18:27:03 +01:00
Victor Stinner
22168998f5
charmap encoders use Py_UCS4, not Py_UNICODE
2011-11-20 17:09:18 +01:00
Antoine Pitrou
f34a0cdc6c
Issue #10227 : Add an allocation cache for a single slice object.
...
Patch by Stefan Behnel.
2011-11-18 20:14:34 +01:00
Victor Stinner
1f7951711c
Catch PyUnicode_AS_UNICODE() errors
2011-11-17 00:45:54 +01:00
Ezio Melotti
11060a4a48
#13406 : silence deprecation warnings in test_codecs.
2011-11-16 09:39:10 +02:00
Antoine Pitrou
78edf7576e
Issue #13333 : The UTF-7 decoder now accepts lone surrogates
...
(the encoder already accepts them).
2011-11-15 01:44:16 +01:00
Antoine Pitrou
5418ee0b9a
Issue #13333 : The UTF-7 decoder now accepts lone surrogates
...
(the encoder already accepts them).
2011-11-15 01:42:21 +01:00
Antoine Pitrou
9a812cbc89
Issue #13389 : Full garbage collection passes now clear the freelists for
...
list and dict objects. They already cleared other freelists in the
interpreter.
2011-11-15 00:00:12 +01:00
Antoine Pitrou
39aba4f563
Use the small object allocator for small bytearrays
2011-11-12 21:15:28 +01:00
Antoine Pitrou
31b92a534f
Sanitize reference management in the utf-8 encoder
2011-11-12 18:35:19 +01:00
Eli Bendersky
e92ff0503c
Issue #13161 : fix doc strings of __i*__ operators. Closes #13161
2011-11-11 17:02:16 +02:00
Eli Bendersky
d3baae73be
Issue #13161 : fix doc strings of __i*__ operators
2011-11-11 16:57:05 +02:00
Antoine Pitrou
0290c7a811
Fix regression on 2-byte wchar_t systems (Windows)
2011-11-11 13:29:12 +01:00
Antoine Pitrou
44c6affc79
Avoid crashing because of an unaligned word access
2011-11-11 02:59:42 +01:00
Antoine Pitrou
de20b0b50e
Issue #13149 : Speed up append-only StringIO objects.
...
This is very similar to the "lazy strings" idea.
2011-11-10 21:47:38 +01:00
Victor Stinner
9f4b1e9c50
Fix and deprecate the unicode_internal codec
...
unicode_internal codec uses Py_UNICODE instead of the real internal
representation (PEP 393: Py_UCS1, Py_UCS2 or Py_UCS4) for backward
compatibility.
2011-11-10 20:56:30 +01:00
Victor Stinner
24729f36bf
Prefer Py_UCS4 or wchar_t over Py_UNICODE
2011-11-10 20:31:37 +01:00
Victor Stinner
ebf3ba808e
PyUnicode_DecodeCharmap() uses the new Unicode API
2011-11-10 20:30:22 +01:00
Victor Stinner
a98b28c1bf
Avoid PyUnicode_AS_UNICODE in the UTF-8 encoder
2011-11-10 20:21:49 +01:00
Victor Stinner
3326cb6a36
Fix "unicode_escape" encoder
2011-11-10 20:15:25 +01:00
Victor Stinner
0e36826a04
Fix UTF-7 encoder on Windows
2011-11-10 20:12:49 +01:00
Martin v. Löwis
1db7c13be1
Port encoders from Py_UNICODE API to unicode object API.
2011-11-10 18:24:32 +01:00
Victor Stinner
62aa4d086a
Strip trailing spaces
2011-11-09 00:03:45 +01:00
Victor Stinner
0a045efb49
Fix a compiler warning: use unsigned for maxchar in unicode_widen()
2011-11-09 00:02:42 +01:00
Victor Stinner
596a6c4ffc
Fix the code page decoder
...
* unicode_decode_call_errorhandler() now supports the PyUnicode_WCHAR_KIND
kind
* unicode_decode_call_errorhandler() calls copy_characters() instead of
PyUnicode_CopyCharacters()
2011-11-09 00:02:18 +01:00
Antoine Pitrou
a8f63c02ef
Fix missing goto
2011-11-08 18:37:16 +01:00
Martin v. Löwis
d10759f6ed
Make _PyUnicode_FromId return borrowed references.
...
http://mail.python.org/pipermail/python-dev/2011-November/114347.html
2011-11-07 13:00:05 +01:00
Martin v. Löwis
e9b11c1cd8
Change decoders to use Unicode API instead of Py_UNICODE.
2011-11-08 17:35:34 +01:00
Petri Lehtinen
9589ab1745
Revert "Accept None as start and stop parameters for list.index() and tuple.index()"
...
Issue #13340 .
2011-11-06 21:06:10 +02:00
Petri Lehtinen
ebfaabd663
Revert "Accept None as start and stop parameters for list.index() and tuple.index()"
...
Issue #13340 .
2011-11-06 21:02:39 +02:00
Amaury Forgeot d'Arc
864741b2c7
Issue #13350 : Replace most usages of PyUnicode_Format by PyUnicode_FromFormat.
2011-11-06 15:10:48 +01:00
Petri Lehtinen
8e9f6c4251
Accept None as start and stop parameters for list.index() and tuple.index().
...
Closes #13340 .
2011-11-05 23:25:34 +02:00
Petri Lehtinen
c2f0a46111
Accept None as start and stop parameters for list.index() and tuple.index()
...
Closes #13340 .
2011-11-05 23:24:31 +02:00
Benjamin Peterson
878ce389a0
add introspection to range objects ( closes #9896 )
...
Patch by Daniel Urban.
2011-11-05 15:17:52 -04:00
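A short illustration of the introspection attributes on range objects, assuming Python 3.3 or later. The particular range is arbitrary.

```python
evens = range(0, 10, 2)

# The constructor arguments can be read back from the object.
bounds = (evens.start, evens.stop, evens.step)

# A single-argument range fills in the defaults.
simple = range(5)
defaults = (simple.start, simple.stop, simple.step)

print(bounds, defaults)
```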
Victor Stinner
e30c0a1014
Fix gdb/libpython.py for not ready Unicode strings
...
_PyUnicode_CheckConsistency() checks also hash and length value for not ready
Unicode strings.
2011-11-04 20:54:05 +01:00
Victor Stinner
2fc507fe45
Replace tabs by spaces
2011-11-04 20:06:39 +01:00
Martin v. Löwis
12be46ca84
Drop Py_UNICODE based encode exceptions.
2011-11-04 19:04:15 +01:00
Martin v. Löwis
3d325191bf
Port code page codec to Unicode API.
2011-11-04 18:23:06 +01:00
Martin v. Löwis
b09af03b8a
Port error handlers from Py_UNICODE indexing to code point indexing.
2011-11-04 11:16:41 +01:00
Victor Stinner
fcd9653667
Fix a compiler warning in unicode_encode_ucs1()
2011-11-04 00:28:50 +01:00
Victor Stinner
fc026c98d8
Fix PyUnicode_EncodeCharmap()
2011-11-04 00:24:51 +01:00
Victor Stinner
7931d9a951
Replace PyUnicodeObject type by PyObject
...
* _PyUnicode_CheckConsistency() now takes a PyObject* instead of void*
* Remove now useless casts to PyObject*
2011-11-04 00:22:48 +01:00
Victor Stinner
76a31a6bff
Cleanup decode_code_page_stateful() and encode_code_page()
...
* Fix decode_code_page_errors() result
* Inline decode_code_page() and encode_code_page_chunk()
* Replace the PyUnicodeObject type by PyObject
2011-11-04 00:05:13 +01:00
Victor Stinner
7581cef699
Adapt the code page encoder to the new unicode_encode_call_errorhandler()
...
The code is not correct, but at least it doesn't crash anymore.
2011-11-03 22:32:33 +01:00
Brian Curtin
2787ea41fd
Fix a compile error (apparently Windows only) introduced in 295fdfd4f422
2011-11-02 15:09:37 -05:00
Martin v. Löwis
23e275b3ad
Port UCS1 and charmap codecs to new API.
2011-11-02 18:02:51 +01:00
Martin v. Löwis
9e8166843c
Introduce PyObject* API for raising encode errors.
2011-11-02 12:45:42 +01:00
Benjamin Peterson
2b50a01d11
remove unused variable
2011-10-30 14:24:44 -04:00
Petri Lehtinen
e0aa803714
Fix the return value of set_discard (issue #10519 )
2011-10-30 14:35:12 +02:00
Petri Lehtinen
5acc27ebe4
Avoid unnecessary recursive function calls ( closes #10519 )
2011-10-30 13:56:41 +02:00
Petri Lehtinen
a94200e6ce
Issue #13018 : Fix reference leaks in error paths in dictobject.c.
...
Patch by Suman Saha.
2011-10-24 21:12:58 +03:00
Nick Coghlan
de31b191e5
Issue 1294232: Fix errors in metaclass calculation affecting some cases of metaclass inheritance. Patch by Daniel Urban.
2011-10-23 22:04:16 +10:00
Benjamin Peterson
9d9141f5db
adjust braces a bit
2011-10-19 16:57:40 -04:00
Antoine Pitrou
551ba20e8e
Issue #13188 : When called without an explicit traceback argument,
...
generator.throw() now gets the traceback from the passed exception's
`__traceback__` attribute. Patch by Petri Lehtinen.
2011-10-18 16:40:50 +02:00
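Illustration for the generator.throw() entry above (not part of the patch itself). This sketch assumes a Python version that includes the fix; the helper names make_exc and listener are hypothetical. It shows that when only an exception instance is passed to throw(), the frames already recorded in its __traceback__ attribute are kept.

```python
import traceback


def make_exc():
    # Raise and catch an exception so it carries a traceback
    # that mentions this function.
    try:
        raise ValueError("boom")
    except ValueError as exc:
        return exc


def listener():
    # Catch whatever is thrown in and hand it back to the caller.
    try:
        yield
    except ValueError as exc:
        yield exc


gen = listener()
next(gen)                       # advance to the first yield
caught = gen.throw(make_exc())  # no explicit traceback argument

# Names of the functions recorded in the exception's traceback.
frame_names = [f.name for f in traceback.extract_tb(caught.__traceback__)]
print(frame_names)
```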
Benjamin Peterson
2963fe0711
plug possible refleak ( closes #13199 )
2011-10-17 13:09:27 -04:00
Martin v. Löwis
0d3072e98d
Drop Py_UCS4_ functions. Closes #13246 .
2011-10-31 08:40:56 +01:00
Benjamin Peterson
1cebc207ea
merge 3.2
2011-10-30 14:24:59 -04:00
Petri Lehtinen
c34f5c256a
Fix the return value of set_discard (issue #10519 )
2011-10-30 14:35:39 +02:00
Petri Lehtinen
7c5e34d8a3
Avoid unnecessary recursive function calls (closes #10519 )
2011-10-30 13:57:45 +02:00
Victor Stinner
57ffa9d4ff
PyUnicode_AsUnicodeCopy() uses PyUnicode_AsUnicodeAndSize() to get the length directly
2011-10-23 20:10:08 +02:00
Victor Stinner
af9e4b8c29
Fix PyUnicode_InternImmortal(): PyUnicode_InternInPlace() may change *p
2011-10-23 20:07:00 +02:00
Victor Stinner
9faa384bed
Cast directly to unsigned char, instead of using Py_CHARMASK
...
We don't need "& 0xff" on an unsigned char.
2011-10-23 20:06:00 +02:00
Victor Stinner
9db1a8b69f
Replace PyUnicodeObject* by PyObject* where it was irrelevant
...
A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or
PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to
PyUnicodeObject* is wrong
2011-10-23 20:04:37 +02:00
Victor Stinner
0d60e87ad6
Fix data variable in _PyUnicode_Dump() for compact ASCII
2011-10-23 19:47:19 +02:00
Victor Stinner
d8e61c348e
Remove last references to the removed Unicode free list
2011-10-23 19:43:33 +02:00
Victor Stinner
065836ec9c
PyUnicode_FSDecoder() ensures that the decoded string is ready
2011-10-27 01:56:33 +02:00
Petri Lehtinen
08a95cabe3
merge heads
2011-10-24 21:22:39 +03:00
Petri Lehtinen
24bd5adcff
Merge 3.2
2011-10-24 21:17:52 +03:00
Mark Dickinson
8d48b43ea9
Issue #12965 : Fix some inaccurate comments in Objects/longobject.c. Thanks Stefan Krah.
2011-10-23 20:47:14 +01:00
Mark Dickinson
36645681c8
Issue #13201 : equality for range objects is now based on equality of the underlying sequences. Thanks Sven Marnach for the patch.
2011-10-23 19:53:01 +01:00
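Illustration for the range equality entry above (not part of the patch itself). A minimal sketch, assuming Python 3.3 or later: two range objects compare equal when they produce the same sequence of values, even if they were built from different arguments.

```python
# Both ranges are empty, so they compare equal.
empty_equal = range(0) == range(2, 1)

# Both ranges yield 0, 2 although their stop values differ.
same_items = range(0, 3, 2) == range(0, 4, 2)

# This one yields 0, 2, 4 and therefore differs from range(0, 4, 2).
different_items = range(0, 5, 2) == range(0, 4, 2)

print(empty_equal, same_items, different_items)
```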
Nick Coghlan
9715d26305
Merge issue 1294232 patch from 3.2
2011-10-23 22:36:42 +10:00
Victor Stinner
dd18d3ad9e
Fix unicode_subtype_new() on debug build
...
Patch written by Stefan Behnel.
2011-10-22 11:08:10 +02:00
Ezio Melotti
f881751ded
Remove unused variable.
2011-10-22 01:01:32 +03:00
Ezio Melotti
931b8aac80
#12753 : Add support for Unicode name aliases and named sequences.
2011-10-21 21:57:36 +03:00
Antoine Pitrou
ac65d96777
Issue #12170 : The count(), find(), rfind(), index() and rindex() methods
...
of bytes and bytearray objects now accept an integer between 0 and 255
as their first argument. Patch by Petri Lehtinen.
2011-10-20 23:54:17 +02:00
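Illustration for the bytes/bytearray entry above (not part of the patch itself). A minimal sketch, assuming Python 3.3 or later: the search methods accept an integer byte value as well as a bytes-like object, and reject integers outside 0 to 255.

```python
data = b"hello"
byte_l = ord("l")  # integer value of the byte to search for

results = {
    "count": data.count(byte_l),
    "find": data.find(byte_l),
    "rfind": data.rfind(byte_l),
    "index": bytearray(data).index(byte_l),
    "rindex": bytearray(data).rindex(byte_l),
}
print(results)

# Integers outside range(0, 256) are rejected.
try:
    data.find(256)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
```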
Benjamin Peterson
dc37ce95e8
merge 3.2
2011-10-19 16:58:15 -04:00
Victor Stinner
6707293e75
Add consistency check to _PyUnicode_New()
2011-10-18 22:10:14 +02:00
Victor Stinner
3a50e7056e
Issue #12281 : Rewrite the MBCS codec to handle correctly replace and ignore
...
error handlers on all Windows versions. The MBCS codec is now supporting all
error handlers, instead of only replace to encode and ignore to decode.
2011-10-18 21:21:00 +02:00
Antoine Pitrou
cf28eacafe
Issue #13188 : When called without an explicit traceback argument,
...
generator.throw() now gets the traceback from the passed exception's
``__traceback__`` attribute. Patch by Petri Lehtinen.
2011-10-18 16:42:55 +02:00
Antoine Pitrou
5b9f4c1539
Fix typo
2011-10-17 19:21:04 +02:00
Benjamin Peterson
897d059221
merge 3.2 ( #13199 )
2011-10-17 13:10:24 -04:00
Benjamin Peterson
7a6debe79c
remove some duplication
2011-10-15 09:25:28 -04:00
Martin v. Löwis
1c67dd9b15
Port SetAttrString/HasAttrString to SetAttrId/GetAttrId.
2011-10-14 15:16:45 +02:00
Martin v. Löwis
bd928fef42
Rename _Py_identifier to _Py_IDENTIFIER.
2011-10-14 10:20:37 +02:00
Victor Stinner
f5cff56a1b
Issue #13088 : Add shared Py_hexdigits constant to format a number into base 16
2011-10-14 02:13:11 +02:00
Victor Stinner
d1a9cc29b9
dictviews_or() uses _Py_identifier
2011-10-13 22:51:17 +02:00
Martin v. Löwis
bfc6d74b25
Use GetAttrId directly. Proposed by Amaury.
2011-10-13 20:03:57 +02:00
Antoine Pitrou
f0b934b01a
Reuse the stringlib in findchar(), and make its signature more convenient
2011-10-13 18:55:09 +02:00
Antoine Pitrou
c198d0599b
Add a comment explaining this heuristic.
2011-10-13 18:07:37 +02:00
Antoine Pitrou
dda339e6d2
Simplify heuristic for when to use memchr
2011-10-13 17:58:11 +02:00
Victor Stinner
55c991197b
Optimize unicode_subscript() for step != 1 and ascii strings
2011-10-13 01:17:06 +02:00
Victor Stinner
127226ba69
Don't use PyUnicode_MAX_CHAR_VALUE() macro in Py_MAX()
2011-10-13 01:12:34 +02:00
Victor Stinner
9e7a1bcfd6
Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr
2011-10-13 00:18:12 +02:00
Antoine Pitrou
dd4e2f0153
Issue #13155 : Optimize finding the optimal character width of an unicode string
2011-10-13 00:02:27 +02:00
Victor Stinner
49a0a21f37
Unicode replace() avoids calling unicode_adjust_maxchar() when it's useless
...
Add also a special case if the result is an empty string.
2011-10-12 23:46:10 +02:00
Antoine Pitrou
6b4883dec0
PEP 3151 / issue #12555 : reworking the OS and IO exception hierarchy.
2011-10-12 02:54:14 +02:00
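Illustration for the PEP 3151 entry above (not part of the patch itself). A minimal sketch, assuming Python 3.3 or later: the older exception names are aliases of OSError, and constructing OSError with a known errno value yields the matching subclass.

```python
import errno

# The legacy names now refer to the same class as OSError.
aliases_merged = IOError is OSError and EnvironmentError is OSError

# OSError maps well-known errno codes to dedicated subclasses.
exc = OSError(errno.ENOENT, "No such file or directory")
mapped_type = type(exc)

print(aliases_merged, mapped_type.__name__)
```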
Victor Stinner
983b1434bd
Backed out changeset 952d91a7d376
...
If maxchar == PyUnicode_MAX_CHAR_VALUE(unicode), we do a useless copy.
2011-10-12 00:54:35 +02:00
Antoine Pitrou
e55ad2dff0
Relax condition
2011-10-12 00:36:51 +02:00
Victor Stinner
d218bf14cc
stringlib: Fix STRINGLIB_STR for UCS2/UCS4
2011-10-12 00:14:32 +02:00
Victor Stinner
4e10100dee
Fix compiler warning in _PyUnicode_FromUCS2()
2011-10-11 23:27:52 +02:00
Victor Stinner
8cc70dcf70
Fix fastsearch for UCS2 and UCS4
...
* If needle is 0, try (p[0] >> 16) & 0xff for UCS4
* Disable fastsearch_memchr_1char() if needle is zero for UCS2 and UCS4
2011-10-11 23:22:22 +02:00
Antoine Pitrou
950468e553
Use _PyUnicode_CONVERT_BYTES() where applicable.
2011-10-11 22:45:48 +02:00
Victor Stinner
577db2c9f0
PyUnicode_AsUnicodeCopy() now checks if PyUnicode_AsUnicode() failed
2011-10-11 22:12:48 +02:00
Victor Stinner
c4f281eba3
Fix misuse of PyUnicode_GET_SIZE, use PyUnicode_GET_LENGTH instead
2011-10-11 22:11:42 +02:00
Victor Stinner
ed2682be2f
Reuse PyUnicode_Copy() in validate_and_copy_tuple()
2011-10-11 21:53:24 +02:00
Antoine Pitrou
e459a0877e
Issue #13136 : speed up conversion between different character widths.
2011-10-11 20:58:41 +02:00
Antoine Pitrou
2c3b2302ad
Issue #13134 : optimize finding single-character strings using memchr
2011-10-11 20:29:21 +02:00
Antoine Pitrou
2871698546
/* Remove unused code. It has been commented out since 2000 (!). */
2011-10-11 03:17:47 +02:00
Antoine Pitrou
53bb548f22
Avoid exporting private helpers
...
(thanks "make smelly")
2011-10-10 23:49:24 +02:00
Martin v. Löwis
1ee1b6fe0d
Use identifier API for PyObject_GetAttrString.
2011-10-10 18:11:30 +02:00
Victor Stinner
794d567b17
any_find_slice() doesn't use callbacks anymore
...
* Call directly the right find/rfind method: allow inlining functions
* Remove Py_LOCAL_CALLBACK (added for any_find_slice)
2011-10-10 03:21:36 +02:00
Martin v. Löwis
afe55bba33
Add API for static strings, primarily good for identifiers.
...
Thanks to Konrad Schöbel and Jasper Schulz for helping with the mass-editing.
2011-10-09 10:38:36 +02:00
Antoine Pitrou
eaf139b3fc
Fix typo in the PyUnicode_Find() implementation
2011-10-09 00:33:09 +02:00
Georg Brandl
388349add2
Closes #12192 : Document that mutating list methods do not return the instance (original patch by Mike Hoy).
2011-10-08 18:32:40 +02:00
Martin v. Löwis
c47adb04b3
Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE.
2011-10-07 20:55:35 +02:00
Victor Stinner
dd07732af5
PyUnicode_Join() calls directly memcpy() if all strings are of the same kind
2011-10-07 17:02:31 +02:00
Antoine Pitrou
978b9d2a27
Fix formatting memory consumption with very large padding specifications
2011-10-07 12:35:48 +02:00
Victor Stinner
59de0ee9e0
str.replace(a, a) is now returning str unchanged if a is a
2011-10-07 10:01:28 +02:00
Antoine Pitrou
4574e62c6e
Fix massive slowdown in string formatting with str.format.
...
Example:
./python -m timeit -s "f='{}' + '-' * 1024 + '{}'; s='abcd' * 16384" "f.format(s, s)"
-> before: 547 usec per loop
-> after: 13 usec per loop
-> 3.2: 22.5 usec per loop
-> 2.7: 12.6 usec per loop
2011-10-07 02:26:47 +02:00
Antoine Pitrou
5c0ba36d5f
Fix massive slowdown in string formatting with the % operator
2011-10-07 01:54:09 +02:00
Antoine Pitrou
7c46da7993
Ensure that 1-char singletons get used
2011-10-06 22:07:51 +02:00
Antoine Pitrou
c61c8d7a5e
Issue #12911 : Fix memory consumption when calculating the repr() of huge tuples or lists.
...
This introduces a small private API for this common pattern.
The issue has been discovered thanks to Martin's huge-mem buildbot.
2011-10-06 19:04:12 +02:00
Antoine Pitrou
eeb7eea1f9
Issue #12911 : Fix memory consumption when calculating the repr() of huge tuples or lists.
...
This introduces a small private API for this common pattern.
The issue has been discovered thanks to Martin's huge-mem buildbot.
2011-10-06 18:57:27 +02:00
Victor Stinner
c6f0df7b20
Fix PyUnicode_Join() for len==1 and non-exact string
2011-10-06 15:58:54 +02:00
Antoine Pitrou
dbf697ae5c
Fix compilation warnings under 64-bit Windows
2011-10-06 15:34:41 +02:00
Antoine Pitrou
15a66cf134
Fix compilation under Windows
2011-10-06 15:25:32 +02:00
Victor Stinner
200f21340d
Fix assertion in unicode_adjust_maxchar()
2011-10-06 13:27:56 +02:00
Victor Stinner
acf47b807f
Fix my last change on PyUnicode_Join(): don't process separator if len==1
2011-10-06 12:32:37 +02:00
Victor Stinner
25a4b29c95
str.replace() avoids memory when it's possible
2011-10-06 12:31:55 +02:00
Victor Stinner
56c161ab00
_copy_characters() fails more quickly in debug mode on inconsistent state
2011-10-06 02:47:11 +02:00
Victor Stinner
c729b8e92f
Fix a compiler warning: don't define unicode_is_singleton() in release mode
2011-10-06 02:36:59 +02:00
Victor Stinner
fb9ea8c57e
Don't check for the maximum character when copying from unicodeobject.c
...
* Create copy_characters() function which doesn't check for the maximum
character in release mode
* _PyUnicode_CheckConsistency() is no more static to be able to use it
in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
* _PyUnicode_CheckConsistency() checks the string hash
2011-10-06 01:45:57 +02:00
Victor Stinner
05d1189566
Fix post-condition in unicode_repr(): check the result, not the input
2011-10-06 01:13:58 +02:00
Victor Stinner
f48323e3b3
replace() uses unicode_fromascii() if the input and replace string is ASCII
2011-10-05 23:27:08 +02:00
Victor Stinner
0617b6e18b
unicode_fromascii() checks that the input is ASCII in debug mode
2011-10-05 23:26:01 +02:00
Victor Stinner
c3cec7868b
Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII
...
The ucs1, ucs2 and ucs4 libraries have to scan the created substring to find the
maximum character, whereas this is not needed for ASCII strings. Because ASCII
strings are common, it is useful to optimize for ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner
14f8f02826
Fix PyUnicode_Partition(): str_in->str_obj
2011-10-05 20:58:25 +02:00
Victor Stinner
31392e741d
Fix my_basename(): make the string ready
2011-10-05 20:14:23 +02:00
Victor Stinner
bb10a1f759
Ensure that newly created strings use the most efficient store in debug mode
2011-10-05 01:34:17 +02:00
Victor Stinner
9310abbf40
Replace PyUnicodeObject* with PyObject* where it was inappropriate
2011-10-05 00:59:23 +02:00
Victor Stinner
ce5faf673e
unicodeobject.c doesn't make output strings ready in debug mode
...
Try to only create non ready strings in debug mode to ensure that all functions
(not only in unicodeobject.c, everywhere) make input strings ready.
2011-10-05 00:42:43 +02:00
Georg Brandl
7597addbd4
More typoes.
2011-10-05 16:36:47 +02:00
Victor Stinner
c80d6d20d5
Speedup str[a:b:step] for step != 1
...
Try to stop the scanner of the maximum character before the end using a limit
depending on the kind (e.g. 256 for PyUnicode_2BYTE_KIND).
2011-10-05 14:13:28 +02:00
Victor Stinner
ae86485517
Speedup find_maxchar_surrogates() for 32-bit wchar_t
...
If we have at least one character in U+10000-U+10FFFF, we know that we must use
PyUnicode_4BYTE_KIND kind.
2011-10-05 14:02:44 +02:00
Victor Stinner
b9275c104e
Speedup str[a:b] and PyUnicode_FromKindAndData
...
* str[a:b] doesn't scan the string for the maximum character if the string
is ascii only
* PyUnicode_FromKindAndData() stops if we are sure that we cannot use a
shorter character type. For example, _PyUnicode_FromUCS1() stops if we
have at least one character in range U+0080-U+00FF
2011-10-05 14:01:42 +02:00
Victor Stinner
702c734395
Speedup the ASCII decoder
...
It is faster for long string and a little bit faster for short strings,
benchmark on Linux 32 bits, Intel Core i5 @ 3.33GHz:
./python -m timeit 'x=b"a"' 'x.decode("ascii")'
./python -m timeit 'x=b"x"*80' 'x.decode("ascii")'
./python -m timeit 'x=b"abc"*4096' 'x.decode("ascii")'
length | before | after
-------+------------+-----------
1 | 0.234 usec | 0.229 usec
80 | 0.381 usec | 0.357 usec
12,288 | 11.2 usec | 3.01 usec
2011-10-05 13:50:52 +02:00
Victor Stinner
e1335c711c
Fix usage of PyUnicode_READY()
2011-10-04 20:53:03 +02:00
Victor Stinner
e06e145943
_PyUnicode_READY_REPLACE() cannot be used in unicode_subtype_new()
2011-10-04 20:52:31 +02:00
Victor Stinner
17efeed284
Add DONT_MAKE_RESULT_READY to unicodeobject.c to help detecting bugs
...
Use also _PyUnicode_READY_REPLACE() when it's applicable.
2011-10-04 20:05:46 +02:00
Victor Stinner
6b56a7fd3d
Add assertion to _Py_ReleaseInternedUnicodeStrings() if READY fails
2011-10-04 20:04:52 +02:00
Antoine Pitrou
875f29bb95
Fix naïve heuristic in unicode slicing (followup to 1b4f886dc9e2)
2011-10-04 20:00:49 +02:00
Antoine Pitrou
2242522fde
Add a necessary call to PyUnicode_READY() (followup to ab5086539ab9)
2011-10-04 19:10:51 +02:00
Antoine Pitrou
7aec401966
Optimize string slicing to use the new API
2011-10-04 19:08:01 +02:00
Antoine Pitrou
e19aa388e8
When expandtabs() would be a no-op, don't create a duplicate string
2011-10-04 16:04:01 +02:00
Antoine Pitrou
e71d574a39
Migrate str.expandtabs to the new API
2011-10-04 15:55:09 +02:00
Benjamin Peterson
7f3140ef80
fix parens
2011-10-03 19:37:29 -04:00
Benjamin Peterson
4bfce8f81f
fix formatting
2011-10-03 19:35:07 -04:00
Benjamin Peterson
ccc51c1fc6
fix compiler warnings
2011-10-03 19:34:12 -04:00
Victor Stinner
b092365cc6
Move in-place Unicode append to its own subfunction
2011-10-04 01:17:31 +02:00
Victor Stinner
a5f9163501
Reindent internal Unicode macros
2011-10-04 01:07:11 +02:00
Victor Stinner
a41463c203
Document utf8_length and wstr_length states
...
Ensure these states with assertions in _PyUnicode_CheckConsistency().
2011-10-04 01:05:08 +02:00
Victor Stinner
9566311014
resize_inplace() sets utf8_length to zero if the utf8 is not shared
...
Cleanup also the code.
2011-10-04 01:03:50 +02:00
Victor Stinner
9e9d689d85
PyUnicode_New() sets utf8_length to zero for latin1
2011-10-04 01:02:02 +02:00
Victor Stinner
016980454e
Unicode: raise SystemError instead of ValueError or RuntimeError on invalid
...
state
2011-10-04 00:04:26 +02:00
Victor Stinner
7f11ad4594
Unicode: document when the wstr pointer is shared with data
...
Add also related assertions to _PyUnicode_CheckConsistency().
2011-10-04 00:00:20 +02:00
Victor Stinner
03490918b7
Add _PyUnicode_HAS_WSTR_MEMORY() macro
2011-10-03 23:45:12 +02:00
Victor Stinner
9ce5a835bb
PyUnicode_Join() checks output length in debug mode
...
PyUnicode_CopyCharacters() may copy fewer characters than the requested size if
the input string is smaller than the argument. (This is very unlikely, but who
knows!?)
Avoid also calling PyUnicode_CopyCharacters() if the string is empty.
2011-10-03 23:36:02 +02:00
Victor Stinner
b803895355
Fix a compiler warning in PyUnicode_Append()
...
Don't check PyUnicode_CopyCharacters() in release mode. Rename also some
variables.
2011-10-03 23:27:56 +02:00
Victor Stinner
8cfcbed4e3
Improve string forms and PyUnicode_Resize() documentation
...
Remove also the FIXME for resize_copy(): as discussed with Martin, copying the
string on resize if the string is not resizable is just fine.
2011-10-03 23:19:21 +02:00
Victor Stinner
77bb47b312
Simplify unicode_resizable(): singletons reference count is at least 2
2011-10-03 20:06:05 +02:00
Victor Stinner
85041a54bd
_PyUnicode_CheckConsistency() checks utf8 field consistency
2011-10-03 14:42:39 +02:00
Victor Stinner
3cf4637e4e
unicode_subtype_new() copies also the ascii flag
2011-10-03 14:42:15 +02:00
Victor Stinner
42dfd71333
unicode_kind_name() doesn't check consistency anymore
...
It is called from _PyUnicode_Dump() and so must not fail.
2011-10-03 14:41:45 +02:00
Victor Stinner
a3b334da6d
PyUnicode_Ready() now sets ascii=1 if maxchar < 128
...
ascii=1 is no longer reserved for PyASCIIObject. Use
PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before).
2011-10-03 13:53:37 +02:00
Victor Stinner
1b4f9ceca7
Create _PyUnicode_READY_REPLACE() to reuse singleton
...
Only use _PyUnicode_READY_REPLACE() on just created strings.
2011-10-03 13:28:14 +02:00
Victor Stinner
c379ead9af
Fix resize_compact() and resize_inplace(); reenable full resize optimizations
...
* resize_compact() updates also wstr_len for non-ascii strings sharing wstr
* resize_inplace() updates also utf8_len/wstr_len for strings sharing
utf8/wstr
2011-10-03 12:52:27 +02:00
Victor Stinner
34411e17b0
resize_inplace() has been fixed: reenable this optimization
2011-10-03 12:21:33 +02:00
Victor Stinner
a849a4b6b4
_PyUnicode_Dump() indicates if wstr and/or utf8 are shared
2011-10-03 12:12:11 +02:00
Victor Stinner
1c8d0c76a1
Fix resize_inplace(): update shared utf8 pointer
2011-10-03 12:11:00 +02:00
Victor Stinner
ca4f7a4298
Disable unicode_resize() optimization on Windows (16-bit wchar_t)
2011-10-03 04:18:04 +02:00
Victor Stinner
126c559d05
_PyUnicode_Ready() for 16-bit wchar_t
2011-10-03 04:17:10 +02:00
Victor Stinner
2fd82278cb
Fix compilation error on Windows
...
Fix also a compiler warning.
2011-10-03 04:06:05 +02:00