Commit Graph

52765 Commits

Author SHA1 Message Date
Victor Stinner 488fa49acf Rewrite PyUnicode_Append(); unicode_modifiable() is more strict
* Rename unicode_resizable() to unicode_modifiable()
* Rename _PyUnicode_Dirty() to unicode_check_modifiable() to make it clear
  that the function is private
* Inline PyUnicode_Concat() and unicode_append_inplace() into PyUnicode_Append()
  to simplify the code
* unicode_modifiable() returns 0 if the hash has been computed or if the string
  is not an exact unicode string
* Remove _PyUnicode_DIRTY(): there is no need to reset the hash anymore, because
  once the hash has been computed, the string can no longer be modified in place
* PyUnicode_Concat() checks for integer overflow
2011-12-12 00:01:39 +01:00
Victor Stinner 24c74be9a3 PyUnicode_IS_ASCII() macro ensures that the string is ready
It makes no sense to check whether a string that is not ready is ASCII.
2011-12-12 01:24:20 +01:00
Victor Stinner c4b495497a Create unicode_result_unchanged() subfunction 2011-12-11 22:44:26 +01:00
Victor Stinner eaab604829 Fix fixup() for unchanged unicode subtype
If maxchar_new == 0 and self is a unicode subtype, return u instead of duplicating u.
2011-12-11 22:22:39 +01:00
Victor Stinner e6b2d4407a unicode_fromascii() doesn't check string content twice in debug mode
_PyUnicode_CheckConsistency() also checks string content.
2011-12-11 21:54:30 +01:00
Victor Stinner a1d12bb119 Call PyUnicode_DecodeUTF8Stateful() directly instead of PyUnicode_DecodeUTF8()
* Remove micro-optimization from PyUnicode_FromStringAndSize():
  PyUnicode_DecodeUTF8Stateful() already has these optimizations (for size=0
  and one ASCII char).
* Rename utf8_max_char_size_and_char_count() to utf8_scanner(), and remove a
  useless variable
2011-12-11 21:53:09 +01:00
Victor Stinner 382955ff4e Use unicode_empty directly instead of PyUnicode_New(0, 0) 2011-12-11 21:44:00 +01:00
Victor Stinner 785938eebd Move the slowest UTF-8 decoder to its own subfunction
* Create decode_utf8_errors()
* Reuse unicode_fromascii()
* decode_utf8_errors() doesn't refit at the beginning
* Remove refit_partial_string(), use unicode_adjust_maxchar() instead
2011-12-11 20:09:03 +01:00
Victor Stinner 84def3774d Fix error handling in resize_compact() 2011-12-11 20:04:56 +01:00
Benjamin Peterson 8bbe788deb merge heads 2011-12-10 17:55:31 -05:00
Benjamin Peterson 2122cf717f alias resource.error to OSError 2011-12-10 17:50:22 -05:00
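
Note: a minimal sketch of what the alias in 2122cf717f means for calling code, assuming a Python version that includes this change and a platform where the `resource` module can be imported. The helper name `resource_error_is_oserror` is hypothetical and exists only for this illustration.

```python
# The resource module is Unix-only, so guard the import.
try:
    import resource
except ImportError:
    resource = None


def resource_error_is_oserror():
    """Return True if resource.error is the same object as OSError.

    Returns None when the resource module is unavailable on this platform.
    """
    if resource is None:
        return None
    return resource.error is OSError
```

With the alias in place, an `except OSError:` clause also catches errors that older code caught as `resource.error`.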
Giampaolo Rodola' 836e9aab2f fix #13563: make use of with statement in ftplib.py where needed 2011-12-10 21:25:04 +01:00
Florent Xicluna 313b2ad1a8 Fix imports in xml.dom. 2011-12-10 21:14:53 +01:00
Lars Gustäbel 0a9dd2f11d Issue #5689: Add support for lzma compression to the tarfile module. 2011-12-10 20:38:14 +01:00
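
Note: a short sketch of how the lzma support from 0a9dd2f11d is used from Python, assuming an interpreter built with the `lzma` module. The `"w:xz"` and `"r:xz"` mode strings are part of `tarfile`; the helper name `roundtrip_xz` and the sample member name are made up for this example.

```python
import io
import tarfile


def roundtrip_xz(name, payload):
    """Write one member to an in-memory .tar.xz archive and read it back."""
    buffer = io.BytesIO()

    # "w:xz" selects lzma (xz) compression for writing.
    with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

    compressed = buffer.getvalue()

    # "r:xz" reads the archive back with lzma decompression.
    buffer.seek(0)
    with tarfile.open(fileobj=buffer, mode="r:xz") as archive:
        restored = archive.extractfile(name).read()

    return compressed, restored
```

Calling `roundtrip_xz("hello.txt", b"hello")` returns the compressed archive bytes together with the restored payload.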
Benjamin Peterson ce2af33562 merge 3.2 2011-12-10 12:44:37 -05:00
Benjamin Peterson b870aa1255 we're always going to have gc 2011-12-10 12:44:25 -05:00
Benjamin Peterson d3a345a21f merge 3.2 2011-12-10 12:38:52 -05:00
Benjamin Peterson 964561bb7c you can't get resource.error if you can't import resource 2011-12-10 12:31:42 -05:00
Victor Stinner 10a6ddb062 Issue #11886: Also fix test_time for the non-DST timezone name (EST/AEST) 2011-12-10 14:37:53 +01:00
Charles-François Natali 1635e9cc59 Issue #13453: Catch EAI_FAIL in support.transient_internet. 2011-12-10 13:17:46 +01:00
Charles-François Natali 13859bfedc Issue #13453: Catch EAI_FAIL in support.transient_internet. 2011-12-10 13:16:44 +01:00
Florent Xicluna 7f1c15b854 Fix comment in difflib. 2011-12-10 13:02:17 +01:00
Lars Gustäbel c67c0b0db1 Merge with 3.2: Fix doc typo. 2011-12-10 12:48:03 +01:00
Lars Gustäbel 0c6cbbd632 Fix doc typo. 2011-12-10 12:45:45 +01:00
Florent Xicluna 67317750af Issue #13248: turn 3.2's PendingDeprecationWarning into 3.3's DeprecationWarning (cgi, importlib, nntplib, smtpd). 2011-12-10 11:07:42 +01:00
Florent Xicluna 720682efd1 Merge 3.2 2011-12-09 23:42:29 +01:00
Florent Xicluna 5126df602c Remove obsolete py3k comment. 2011-12-09 23:41:21 +01:00
Florent Xicluna 0e686cbb7d Fix docstring typo. 2011-12-09 23:41:19 +01:00
Antoine Pitrou a9e9abb8ef Issue #13528: rework the performance question in the programming FAQ 2011-12-09 23:11:16 +01:00
Antoine Pitrou 432259feea Issue #13528: rework the performance question in the programming FAQ 2011-12-09 23:10:31 +01:00
Florent Xicluna 1b7458b2a1 Closes #2979: add parameter 'use_builtin_types' to the SimpleXMLRPCServer. 2011-12-09 22:35:06 +01:00
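
Note: a rough sketch of the effect of `use_builtin_types` on the server side, assuming a Python version that includes 1b7458b2a1. It drives `SimpleXMLRPCDispatcher` (the base class used by `SimpleXMLRPCServer`) in memory through its internal `_marshaled_dispatch()` method, so no socket is opened. The functions `type_name` and `call_type_name` are hypothetical names for this illustration.

```python
import datetime
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher


def type_name(value):
    """Remote function: report the type the server received."""
    return type(value).__name__


def call_type_name(use_builtin_types):
    """Send a datetime to type_name() and return the server's answer."""
    dispatcher = SimpleXMLRPCDispatcher(use_builtin_types=use_builtin_types)
    dispatcher.register_function(type_name, "type_name")

    # Build an XML-RPC request carrying a dateTime.iso8601 value.
    when = datetime.datetime(2011, 12, 9, 22, 35, 6)
    request = xmlrpc.client.dumps((when,), methodname="type_name")

    # Dispatch in memory; the response is an encoded XML document.
    response = dispatcher._marshaled_dispatch(request.encode("utf-8"))
    (result,), _ = xmlrpc.client.loads(response)
    return result
```

With `use_builtin_types=True` the registered function sees a `datetime.datetime`; with the default it sees the `xmlrpc.client.DateTime` wrapper.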
Victor Stinner e3b47152a4 Write tests for invalid characters (U+00110000)
Test the following functions:

 * codecs.raw_unicode_escape_decode()
 * PyUnicode_FromWideChar()
 * PyUnicode_FromUnicode()
 * "unicode_internal" and "unicode_escape" decoders
2011-12-09 20:49:49 +01:00
Victor Stinner db6238964d (Merge 3.2) Issue #5905: time.strftime() is now using the locale encoding,
instead of UTF-8, if the wcsftime() function is not available.
2011-12-09 20:21:17 +01:00
Victor Stinner 720f34a3e8 Issue #5905: time.strftime() is now using the locale encoding, instead of
UTF-8, if the wcsftime() function is not available.
2011-12-09 20:19:24 +01:00
Victor Stinner 7f54f75900 Issue #13441: Enable the workaround for Solaris locale bug
Skip locales triggering the mbstowcs() bug. I collected the locale list thanks
to my previous commit:

 * hu_HU (ISO8859-2): character U+30000020
 * de_AT (ISO8859-1): character U+30000076
 * cs_CZ (ISO8859-2): character U+30000020
 * sk_SK (ISO8859-2): character U+30000020
 * pl_PL (ISO8859-2): character U+30000020
 * fr_CA (ISO8859-1): character U+30000020
2011-12-09 11:29:44 +01:00
Victor Stinner 69291c4af0 Issue #13441: Skip some locales (e.g. cs_CZ and hu_HU) on Solaris to work around
a mbstowcs() bug. For example, on Solaris, the hu_HU locale uses the locale
encoding ISO-8859-2, the thousands separator is b'\xA0' and it is decoded as
U+30000020 (an invalid character) by mbstowcs().

The workaround is not enabled yet (commented out): I would first like to get
more information about the failing locales.
2011-12-09 10:28:45 +01:00
Victor Stinner 5446bba269 Issue #13441: Don't test the hu_HU locale on Solaris to work around a mbstowcs()
bug. On Solaris, if the locale is hu_HU (and if the locale encoding is not
UTF-8), the thousands separator is b'\xA0' which is decoded as U+30000020
instead of U+0020 by mbstowcs().
2011-12-09 01:20:03 +01:00
Nadeem Vawda 3459922c1b What's New in Python 3.3: Add entry for lzma module (issue #6715). 2011-12-09 01:32:46 +02:00
Victor Stinner b6821013df Document PyUnicode_Copy() and PyUnicode_EncodeCodePage() 2011-12-09 00:18:11 +01:00
Victor Stinner d1be878d7b What's New in Python 3.3: Add a Deprecated section 2011-12-09 00:10:41 +01:00
Victor Stinner 706141316a Issue #13441: Log the locale when localeconv() fails 2011-12-08 23:42:52 +01:00
Stefan Krah 221ea5d931 Merge fix for issue #13547. 2011-12-08 23:31:40 +01:00
Stefan Krah 383dd58533 Issue #13547: clean Lib/_sysconfigdata.py and Modules/_testembed 2011-12-08 23:25:15 +01:00
Stefan Krah 2ac5fac268 Merge. 2011-12-08 22:30:18 +01:00
Stefan Krah 9a17cc3c53 Merge second fix for issue #11149. 2011-12-08 22:22:58 +01:00
Stefan Krah af04ff2b97 Issue #11149: Also enable -fwrapv if $CC is a full path
or has a trailing version number.
2011-12-08 22:20:31 +01:00
Victor Stinner 8faf8216e4 PyUnicode_FromWideChar() and PyUnicode_FromUnicode() raise a ValueError if a
character is not in the range [U+0000; U+10ffff].
2011-12-08 22:14:11 +01:00
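
Note: the range check in 8faf8216e4 lives in the C API, but the same bound can be observed from Python through the built-in `chr()`, which also rejects values above U+10FFFF with a ValueError. This is an analogue for illustration, not the C functions themselves; `MAX_CODE_POINT` and `try_chr` are names introduced here.

```python
import sys

# Highest valid Unicode code point; equal to sys.maxunicode since Python 3.3.
MAX_CODE_POINT = 0x10FFFF


def try_chr(value):
    """Return chr(value), or None if value is outside the valid range."""
    try:
        return chr(value)
    except ValueError:
        return None
```

For example, `try_chr(0x110000)` and `try_chr(0x30000020)` (the value reported for the Solaris mbstowcs() bug in this log) both return None, while `try_chr(0x10FFFF)` returns a one-character string.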
Victor Stinner bc9f0c68f5 (Merge 3.2) Issue #11886: work around an OS bug (time zone data) in test_time
Australian Eastern Standard Time (UTC+10) is called "EST" (as Eastern Standard
Time, UTC-5) instead of "AEST" on some operating systems (e.g. FreeBSD), which
is wrong. See for example this bug:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=93810
2011-12-08 00:33:14 +01:00
Victor Stinner 0cd479074d Issue #11886: work around an OS bug (time zone data) in test_time
Australian Eastern Standard Time (UTC+10) is called "EST" (as Eastern Standard
Time, UTC-5) instead of "AEST" on some operating systems (e.g. FreeBSD), which
is wrong. See for example this bug:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=93810
2011-12-08 00:32:51 +01:00
Victor Stinner 0c4fbff6a7 libpython.py: defer call to gdb.lookup_type('PyUnicodeObject')
The lookup fails at startup if Python is linked to a shared library.
2011-12-08 00:08:22 +01:00