Antoine Pitrou
f0b934b01a
Reuse the stringlib in findchar(), and make its signature more convenient
2011-10-13 18:55:09 +02:00
Victor Stinner
55c991197b
Optimize unicode_subscript() for step != 1 and ascii strings
2011-10-13 01:17:06 +02:00
Victor Stinner
127226ba69
Don't use PyUnicode_MAX_CHAR_VALUE() macro in Py_MAX()
2011-10-13 01:12:34 +02:00
Victor Stinner
9e7a1bcfd6
Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr
2011-10-13 00:18:12 +02:00
Antoine Pitrou
dd4e2f0153
Issue #13155 : Optimize finding the optimal character width of an unicode string
2011-10-13 00:02:27 +02:00
Victor Stinner
49a0a21f37
Unicode replace() avoids calling unicode_adjust_maxchar() when it's useless
...
Add also a special case if the result is an empty string.
2011-10-12 23:46:10 +02:00
Victor Stinner
983b1434bd
Backed out changeset 952d91a7d376
...
If maxchar == PyUnicode_MAX_CHAR_VALUE(unicode), we do an useless copy.
2011-10-12 00:54:35 +02:00
Antoine Pitrou
e55ad2dff0
Relax condition
2011-10-12 00:36:51 +02:00
Victor Stinner
4e10100dee
Fix compiler warning in _PyUnicode_FromUCS2()
2011-10-11 23:27:52 +02:00
Antoine Pitrou
950468e553
Use _PyUnicode_CONVERT_BYTES() where applicable.
2011-10-11 22:45:48 +02:00
Victor Stinner
577db2c9f0
PyUnicode_AsUnicodeCopy() now checks if PyUnicode_AsUnicode() failed
2011-10-11 22:12:48 +02:00
Victor Stinner
c4f281eba3
Fix misuse of PyUnicode_GET_SIZE, use PyUnicode_GET_LENGTH instead
2011-10-11 22:11:42 +02:00
Antoine Pitrou
e459a0877e
Issue #13136 : speed up conversion between different character widths.
2011-10-11 20:58:41 +02:00
Antoine Pitrou
2871698546
/* Remove unused code. It has been committed out since 2000 (!). */
2011-10-11 03:17:47 +02:00
Antoine Pitrou
53bb548f22
Avoid exporting private helpers
...
(thanks "make smelly")
2011-10-10 23:49:24 +02:00
Victor Stinner
794d567b17
any_find_slice() doesn't use callbacks anymore
...
* Call directly the right find/rfind method: allow inlining functions
* Remove Py_LOCAL_CALLBACK (added for any_find_slice)
2011-10-10 03:21:36 +02:00
Martin v. Löwis
afe55bba33
Add API for static strings, primarily good for identifiers.
...
Thanks to Konrad Schöbel and Jasper Schulz for helping with the mass-editing.
2011-10-09 10:38:36 +02:00
Antoine Pitrou
eaf139b3fc
Fix typo in the PyUnicode_Find() implementation
2011-10-09 00:33:09 +02:00
Martin v. Löwis
c47adb04b3
Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE.
2011-10-07 20:55:35 +02:00
Victor Stinner
dd07732af5
PyUnicode_Join() calls directly memcpy() if all strings are of the same kind
2011-10-07 17:02:31 +02:00
Antoine Pitrou
978b9d2a27
Fix formatting memory consumption with very large padding specifications
2011-10-07 12:35:48 +02:00
Victor Stinner
59de0ee9e0
str.replace(a, a) is now returning str unchanged if a is a
2011-10-07 10:01:28 +02:00
Antoine Pitrou
5c0ba36d5f
Fix massive slowdown in string formatting with the % operator
2011-10-07 01:54:09 +02:00
Antoine Pitrou
7c46da7993
Ensure that 1-char singletons get used
2011-10-06 22:07:51 +02:00
Victor Stinner
c6f0df7b20
Fix PyUnicode_Join() for len==1 and non-exact string
2011-10-06 15:58:54 +02:00
Antoine Pitrou
15a66cf134
Fix compilation under Windows
2011-10-06 15:25:32 +02:00
Victor Stinner
200f21340d
Fix assertion in unicode_adjust_maxchar()
2011-10-06 13:27:56 +02:00
Victor Stinner
acf47b807f
Fix my last change on PyUnicode_Join(): don't process separator if len==1
2011-10-06 12:32:37 +02:00
Victor Stinner
25a4b29c95
str.replace() avoids memory when it's possible
2011-10-06 12:31:55 +02:00
Victor Stinner
56c161ab00
_copy_characters() fails more quickly in debug mode on inconsistent state
2011-10-06 02:47:11 +02:00
Victor Stinner
c729b8e92f
Fix a compiler warning: don't define unicode_is_singleton() in release mode
2011-10-06 02:36:59 +02:00
Victor Stinner
fb9ea8c57e
Don't check for the maximum character when copying from unicodeobject.c
...
* Create copy_characters() function which doesn't check for the maximum
character in release mode
* _PyUnicode_CheckConsistency() is no more static to be able to use it
in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
* _PyUnicode_CheckConsistency() checks the string hash
2011-10-06 01:45:57 +02:00
Victor Stinner
05d1189566
Fix post-condition in unicode_repr(): check the result, not the input
2011-10-06 01:13:58 +02:00
Victor Stinner
f48323e3b3
replace() uses unicode_fromascii() if the input and replace string is ASCII
2011-10-05 23:27:08 +02:00
Victor Stinner
0617b6e18b
unicode_fromascii() checks that the input is ASCII in debug mode
2011-10-05 23:26:01 +02:00
Victor Stinner
c3cec7868b
Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII
...
ucs1, ucs2 and ucs4 libraries have to scan created substring to find the
maximum character, whereas it is not need to ASCII strings. Because ASCII
strings are common, it is useful to optimize ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner
14f8f02826
Fix PyUnicode_Partition(): str_in->str_obj
2011-10-05 20:58:25 +02:00
Victor Stinner
bb10a1f759
Ensure that newly created strings use the most efficient store in debug mode
2011-10-05 01:34:17 +02:00
Victor Stinner
9310abbf40
Replace PyUnicodeObject* with PyObject* where it was inappropriate
2011-10-05 00:59:23 +02:00
Victor Stinner
ce5faf673e
unicodeobject.c doesn't make output strings ready in debug mode
...
Try to only create non ready strings in debug mode to ensure that all functions
(not only in unicodeobject.c, everywhere) make input strings ready.
2011-10-05 00:42:43 +02:00
Georg Brandl
7597addbd4
More typoes.
2011-10-05 16:36:47 +02:00
Victor Stinner
c80d6d20d5
Speedup str[a 🅱️ step] for step != 1
...
Try to stop the scanner of the maximum character before the end using a limit
depending on the kind (e.g. 256 for PyUnicode_2BYTE_KIND).
2011-10-05 14:13:28 +02:00
Victor Stinner
ae86485517
Speedup find_maxchar_surrogates() for 32-bit wchar_t
...
If we have at least one character in U+10000-U+10FFFF, we know that we must use
PyUnicode_4BYTE_KIND kind.
2011-10-05 14:02:44 +02:00
Victor Stinner
b9275c104e
Speedup str[a:b] and PyUnicode_FromKindAndData
...
* str[a:b] doesn't scan the string for the maximum character if the string
is ascii only
* PyUnicode_FromKindAndData() stops if we are sure that we cannot use a
shorter character type. For example, _PyUnicode_FromUCS1() stops if we
have at least one character in range U+0080-U+00FF
2011-10-05 14:01:42 +02:00
Victor Stinner
702c734395
Speedup the ASCII decoder
...
It is faster for long string and a little bit faster for short strings,
benchmark on Linux 32 bits, Intel Core i5 @ 3.33GHz:
./python -m timeit 'x=b"a"' 'x.decode("ascii")'
./python -m timeit 'x=b"x"*80' 'x.decode("ascii")'
./python -m timeit 'x=b"abc"*4096' 'x.decode("ascii")'
length | before | after
-------+------------+-----------
1 | 0.234 usec | 0.229 usec
80 | 0.381 usec | 0.357 usec
12,288 | 11.2 usec | 3.01 usec
2011-10-05 13:50:52 +02:00
Victor Stinner
e1335c711c
Fix usage og PyUnicode_READY()
2011-10-04 20:53:03 +02:00
Victor Stinner
e06e145943
_PyUnicode_READY_REPLACE() cannot be used in unicode_subtype_new()
2011-10-04 20:52:31 +02:00
Victor Stinner
17efeed284
Add DONT_MAKE_RESULT_READY to unicodeobject.c to help detecting bugs
...
Use also _PyUnicode_READY_REPLACE() when it's applicable.
2011-10-04 20:05:46 +02:00
Victor Stinner
6b56a7fd3d
Add assertion to _Py_ReleaseInternedUnicodeStrings() if READY fails
2011-10-04 20:04:52 +02:00
Antoine Pitrou
875f29bb95
Fix naïve heuristic in unicode slicing (followup to 1b4f886dc9e2)
2011-10-04 20:00:49 +02:00
Antoine Pitrou
2242522fde
Add a necessary call to PyUnicode_READY() (followup to ab5086539ab9)
2011-10-04 19:10:51 +02:00
Antoine Pitrou
7aec401966
Optimize string slicing to use the new API
2011-10-04 19:08:01 +02:00
Antoine Pitrou
e19aa388e8
When expandtabs() would be a no-op, don't create a duplicate string
2011-10-04 16:04:01 +02:00
Antoine Pitrou
e71d574a39
Migrate str.expandtabs to the new API
2011-10-04 15:55:09 +02:00
Benjamin Peterson
7f3140ef80
fix parens
2011-10-03 19:37:29 -04:00
Benjamin Peterson
4bfce8f81f
fix formatting
2011-10-03 19:35:07 -04:00
Benjamin Peterson
ccc51c1fc6
fix compiler warnings
2011-10-03 19:34:12 -04:00
Victor Stinner
b092365cc6
Move in-place Unicode append to its own subfunction
2011-10-04 01:17:31 +02:00
Victor Stinner
a5f9163501
Reindent internal Unicode macros
2011-10-04 01:07:11 +02:00
Victor Stinner
a41463c203
Document utf8_length and wstr_length states
...
Ensure these states with assertions in _PyUnicode_CheckConsistency().
2011-10-04 01:05:08 +02:00
Victor Stinner
9566311014
resize_inplace() sets utf8_length to zero if the utf8 is not shared8
...
Cleanup also the code.
2011-10-04 01:03:50 +02:00
Victor Stinner
9e9d689d85
PyUnicode_New() sets utf8_length to zero for latin1
2011-10-04 01:02:02 +02:00
Victor Stinner
016980454e
Unicode: raise SystemError instead of ValueError or RuntimeError on invalid
...
state
2011-10-04 00:04:26 +02:00
Victor Stinner
7f11ad4594
Unicode: document when the wstr pointer is shared with data
...
Add also related assertions to _PyUnicode_CheckConsistency().
2011-10-04 00:00:20 +02:00
Victor Stinner
03490918b7
Add _PyUnicode_HAS_WSTR_MEMORY() macro
2011-10-03 23:45:12 +02:00
Victor Stinner
9ce5a835bb
PyUnicode_Join() checks output length in debug mode
...
PyUnicode_CopyCharacters() may copies less character than requested size, if
the input string is smaller than the argument. (This is very unlikely, but who
knows!?)
Avoid also calling PyUnicode_CopyCharacters() if the string is empty.
2011-10-03 23:36:02 +02:00
Victor Stinner
b803895355
Fix a compiler warning in PyUnicode_Append()
...
Don't check PyUnicode_CopyCharacters() in release mode. Rename also some
variables.
2011-10-03 23:27:56 +02:00
Victor Stinner
8cfcbed4e3
Improve string forms and PyUnicode_Resize() documentation
...
Remove also the FIXME for resize_copy(): as discussed with Martin, copy the
string on resize if the string is not resizable is just fine.
2011-10-03 23:19:21 +02:00
Victor Stinner
77bb47b312
Simplify unicode_resizable(): singletons reference count is at least 2
2011-10-03 20:06:05 +02:00
Victor Stinner
85041a54bd
_PyUnicode_CheckConsistency() checks utf8 field consistency
2011-10-03 14:42:39 +02:00
Victor Stinner
3cf4637e4e
unicode_subtype_new() copies also the ascii flag
2011-10-03 14:42:15 +02:00
Victor Stinner
42dfd71333
unicode_kind_name() doesn't check consistency anymore
...
It is is called from _PyUnicode_Dump() and so must not fail.
2011-10-03 14:41:45 +02:00
Victor Stinner
a3b334da6d
PyUnicode_Ready() now sets ascii=1 if maxchar < 128
...
ascii=1 is no more reserved to PyASCIIObject. Use
PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before).
2011-10-03 13:53:37 +02:00
Victor Stinner
1b4f9ceca7
Create _PyUnicode_READY_REPLACE() to reuse singleton
...
Only use _PyUnicode_READY_REPLACE() on just created strings.
2011-10-03 13:28:14 +02:00
Victor Stinner
c379ead9af
Fix resize_compact() and resize_inplace(); reenable full resize optimizations
...
* resize_compact() updates also wstr_len for non-ascii strings sharing wstr
* resize_inplace() updates also utf8_len/wstr_len for strings sharing
utf8/wstr
2011-10-03 12:52:27 +02:00
Victor Stinner
34411e17b0
resize_inplace() has been fixed: reenable this optimization
2011-10-03 12:21:33 +02:00
Victor Stinner
a849a4b6b4
_PyUnicode_Dump() indicates if wstr and/or utf8 are shared
2011-10-03 12:12:11 +02:00
Victor Stinner
1c8d0c76a1
Fix resize_inplace(): update shared utf8 pointer
2011-10-03 12:11:00 +02:00
Victor Stinner
ca4f7a4298
Disable unicode_resize() optimization on Windows (16-bit wchar_t)
2011-10-03 04:18:04 +02:00
Victor Stinner
126c559d05
_PyUnicode_Ready() for 16-bit wchar_t
2011-10-03 04:17:10 +02:00
Victor Stinner
2fd82278cb
Fix compilation error on Windows
...
Fix also a compiler warning.
2011-10-03 04:06:05 +02:00
Victor Stinner
a3be613a56
Use PyUnicode_WCHAR_KIND to check if a string is a wstr string
...
Simplify the test in wstr pointer in unicode_sizeof().
2011-10-03 02:16:37 +02:00
Victor Stinner
910337b42e
Add _PyUnicode_CheckConsistency() macro to help debugging
...
* Document Unicode string states
* Use _PyUnicode_CheckConsistency() to ensure that objects are always
consistent.
2011-10-03 03:20:16 +02:00
Victor Stinner
4fae54cb0e
In release mode, PyUnicode_InternInPlace() does nothing if the input is NULL or
...
not a unicode, instead of failing with a fatal error.
Use assertions in debug mode (provide better error messages).
2011-10-03 02:01:52 +02:00
Victor Stinner
23e5668214
PyUnicode_Append() now works in-place when it's possible
2011-10-03 03:54:37 +02:00
Victor Stinner
fe226c0d37
Rewrite PyUnicode_Resize()
...
* Rename _PyUnicode_Resize() to unicode_resize()
* unicode_resize() creates a copy if the string cannot be resized instead
of failing
* Optimize resize_copy() for wstr strings
* Disable temporary resize_inplace()
2011-10-03 03:52:20 +02:00
Victor Stinner
829c0adca9
Add _PyUnicode_HAS_UTF8_MEMORY() macro
2011-10-03 01:08:02 +02:00
Victor Stinner
fe0c155c4f
Write _PyUnicode_Dump() to help debugging
2011-10-03 02:59:31 +02:00
Victor Stinner
f42dc448e0
PyUnicode_CopyCharacters() fails when copying latin1 into ascii
2011-10-02 23:33:16 +02:00
Victor Stinner
c53be96c54
unicode_convert_wchar_to_ucs4() cannot fail
2011-10-02 21:33:54 +02:00
Victor Stinner
c3c7415639
Add _PyUnicode_DATA_ANY(op) private macro
2011-10-02 20:39:55 +02:00
Victor Stinner
a464fc141d
unicode_empty and unicode_latin1 are PyObject* objects, not PyUnicodeObject*
2011-10-02 20:39:30 +02:00
Victor Stinner
267aa24365
PyUnicode_FindChar() raises a IndexError on invalid index
2011-10-02 01:08:37 +02:00
Victor Stinner
bc603d12b7
Optimize _PyUnicode_AsKind() for UCS1->UCS4 and UCS2->UCS4
...
* Ensure that the input string is ready
* Raise a ValueError instead of of a fatal error
2011-10-02 01:00:40 +02:00
Victor Stinner
5a706cf8c0
Fix usage of PyUnicode_READY() in PyUnicode_GetLength()
2011-10-02 00:36:53 +02:00
Victor Stinner
cd9950fd09
PyUnicode_WriteChar() raises IndexError on invalid index
...
PyUnicode_WriteChar() raises also a ValueError if the string has more than 1
reference.
2011-10-02 00:34:53 +02:00
Victor Stinner
2fe5ced752
PyUnicode_ReadChar() raises a IndexError if the index in invalid
...
unicode_getitem() reuses PyUnicode_ReadChar()
2011-10-02 00:25:40 +02:00
Victor Stinner
202b62bd90
PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown
2011-10-01 23:48:37 +02:00
Victor Stinner
07ac3ebd7b
Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t
...
Rewrite unicode_subtype_new(): allocate directly the right type.
2011-10-01 16:16:43 +02:00
Victor Stinner
e90fe6a8f4
Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
...
* Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()
* Rename existing _PyUnicode_UTF8_LENGTH() macro to PyUnicode_UTF8_LENGTH()
* PyUnicode_UTF8() and PyUnicode_UTF8_LENGTH() are more strict
2011-10-01 16:48:13 +02:00
Martin v. Löwis
0b1d348990
Issue 13085: Fix some memory leaks. Patch by Stefan Krah.
2011-10-01 16:35:40 +02:00
Benjamin Peterson
5c0fb00ad8
merge heads
2011-10-01 00:12:20 -04:00
Benjamin Peterson
31616ea2ff
remove reference to non-existent file
2011-10-01 00:11:09 -04:00
Victor Stinner
de636f3c34
PyUnicode_Substring() now accepts end bigger than string length
...
Fix also a bug: call PyUnicode_READY() before reading string length.
2011-10-01 03:55:54 +02:00
Victor Stinner
c759f3e7ec
Ooops, avoid a division by zero in unicode_repeat()
2011-10-01 03:09:58 +02:00
Victor Stinner
d3a83d5eb3
PyUnicode_FromObject() ensures that its output is a ready string
2011-10-01 03:09:33 +02:00
Victor Stinner
67ca64ce54
I want a super fast 'a' * n!
...
* Optimize unicode_repeat() for a special case with memset()
* Simplify integer overflow checking; remove the second check because
PyUnicode_New() already does it and uses a smaller limit (Py_ssize_t vs
size_t)
2011-10-01 02:47:29 +02:00
Victor Stinner
e9a2935c1f
Fix usage of PyUnicode_READY in unicodeobject.c
2011-10-01 02:14:59 +02:00
Victor Stinner
12bab6dace
Remove private substring() function, reuse public PyUnicode_Substring()
...
* PyUnicode_Substring() now fails if start or end is invalid
* PyUnicode_Substring() reuses PyUnicode_Copy() for non-exact strings
2011-10-01 01:53:49 +02:00
Victor Stinner
c841e7db1f
Optimize PyUnicode_Copy(): don't recompute maximum character
2011-10-01 01:34:32 +02:00
Victor Stinner
2219e0a37e
PyUnicode_FromObject() reuses PyUnicode_Copy()
...
* PyUnicode_Copy() is faster than substring()
* Fix also a compiler warning
2011-10-01 01:16:59 +02:00
Victor Stinner
034f6cf10c
Add PyUnicode_Copy() function, include it to the public API
2011-09-30 02:26:44 +02:00
Victor Stinner
b153615008
PyUnicode_CopyCharacters() uses exceptions instead of assertions
...
Call PyErr_BadInternalCall() if inputs are not unicode strings.
2011-09-30 02:26:10 +02:00
Victor Stinner
d8f6510acc
_PyUnicode_Ready() cannot be used on ready strings anymore
...
* Change its prototype: PyObject* instead of PyUnicodeoObject*.
* Remove an old assertion, the result of PyUnicode_READY (_PyUnicode_Ready)
must be checked instead
2011-09-29 19:43:17 +02:00
Victor Stinner
bc8b81bc4e
Move _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() outside unicodeobject.h
...
Move these macros to unicodeobject.c
2011-09-29 19:31:34 +02:00
Victor Stinner
a0702ab1fe
Add a note in PyUnicode_CopyCharacters() doc: it doesn't write null character
...
Cleanup also the code (avoid the goto).
2011-09-29 14:14:38 +02:00
Victor Stinner
639418812f
Use the new Py_ARRAY_LENGTH macro
2011-09-29 00:42:28 +02:00
Victor Stinner
b9dcffb51e
Fix 'c' format of PyUnicode_Format()
...
formatbuf is now an array of Py_UCS4, not of Py_UNICODE
2011-09-29 00:39:24 +02:00
Victor Stinner
c17f540b7a
Oops, fix my previous commit: unicode => to
2011-09-29 00:16:58 +02:00
Victor Stinner
b15d4d899c
PyUnicode_CopyCharacters() marks the string as dirty (reset the hash)
2011-09-28 23:59:20 +02:00
Victor Stinner
f5ca1a21a5
PyUnicode_CopyCharacters() fails if 'to' has more than 1 reference
2011-09-28 23:54:59 +02:00
Ezio Melotti
2aa2b3b4d5
Clean up a few tabs that went in with PEP393.
2011-09-29 00:58:57 +03:00
Ezio Melotti
48a2f8fd97
#13054 : sys.maxunicode is now always 0x10FFFF.
2011-09-29 00:18:19 +03:00
Victor Stinner
506f592769
Check size of wchar_t using the preprocessor
2011-09-28 22:34:18 +02:00
Victor Stinner
73f01c65c8
PyUnicode_CopyCharacters() initializes overflow
2011-09-28 22:28:04 +02:00
Victor Stinner
e57b1c0da1
Mark PyUnicode_FromUCS[124] as private
2011-09-28 22:20:48 +02:00
Victor Stinner
ff9e50fd04
Oops, fix Py_MIN/Py_MAX case
2011-09-28 22:17:19 +02:00
Victor Stinner
17222160e7
Mark _PyUnicode_FindMaxCharAndNumSurrogatePairs() as private
2011-09-28 22:15:37 +02:00
Victor Stinner
157f83fcfc
Strip trailing spaces in unicodeobject.[ch]
2011-09-28 21:41:31 +02:00
Victor Stinner
6c7a52a46f
Check for PyUnicode_CopyCharacters() failure
2011-09-28 21:39:17 +02:00
Victor Stinner
be78eaf2de
PyUnicode_CopyCharacters() checks for buffer and character overflow
...
It now returns the number of written characters on success.
2011-09-28 21:37:03 +02:00
Victor Stinner
fb5f5f2420
Mark PyUnicode_CONVERT_BYTES as private
2011-09-28 21:39:49 +02:00
Georg Brandl
4cb0de246c
Rename new macros to conform to naming rules (function macros have "Py" prefix, not "PY").
2011-09-28 21:49:49 +02:00
Benjamin Peterson
9c6e6a0c7f
don't check that the first character is XID_Continue
...
Current, XID_Continue is a superset of XID_Start, but that may sometime change.
2011-09-28 08:09:05 -04:00
Martin v. Löwis
d63a3b8beb
Implement PEP 393.
2011-09-28 07:41:54 +02:00
Mark Dickinson
57e683e53e
Issue #1621 : Fix undefined behaviour in bytes.__hash__, str.__hash__, tuple.__hash__, frozenset.__hash__ and set indexing operations.
2011-09-24 18:18:40 +01:00
Mark Dickinson
0d5f6adbb3
Issue #13012 : Allow 'keepends' to be passed as a keyword argument in str.splitlines, bytes.splitlines and bytearray.splitlines.
2011-09-24 09:14:39 +01:00
Victor Stinner
f955eb210f
Merge 3.2: Fix PyUnicode_AsWideCharString() doc
...
- Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null
character
- Fix spelling of the null character
2011-09-06 02:01:29 +02:00
Victor Stinner
d88d9836c5
Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character
...
Fix also spelling of the null character.
2011-09-06 02:00:05 +02:00
Ezio Melotti
6f2a683a0c
#9200 : merge with 3.2.
2011-08-22 20:31:11 +03:00
Ezio Melotti
93e7afc5d9
#9200 : The str.is* methods now work with strings that contain non-BMP characters even in narrow Unicode builds.
2011-08-22 14:08:38 +03:00
Benjamin Peterson
e518d4c18a
merge 3.2
2011-08-18 13:52:19 -05:00
Benjamin Peterson
7a6b44ab62
the named of the character is actually NUL
2011-08-18 13:51:47 -05:00
Benjamin Peterson
020340f284
merge 3.2
2011-08-18 10:49:16 -05:00
Benjamin Peterson
5ad517a7d9
NUL -> NULL
2011-08-18 10:48:50 -05:00
Ezio Melotti
269e3ee3db
#12266 : merge with 3.2.
2011-08-15 09:26:28 +03:00
Ezio Melotti
ee8d998ecf
#12266 : Fix str.capitalize() to correctly uppercase/lowercase titlecased and cased non-letter characters.
2011-08-15 09:09:57 +03:00
Benjamin Peterson
f8e7543df9
merge 3.2 ( #12732 )
2011-08-12 22:18:19 -05:00
Benjamin Peterson
f413b80806
in narrow builds, make sure to test codepoints as identifier characters ( closes #12732 )
...
This fixes the use of Unicode identifiers outside the BMP in narrow builds.
2011-08-12 22:17:18 -05:00
Brian Curtin
dfc80e3d97
Replace Py_NotImplemented returns with the macro form Py_RETURN_NOTIMPLEMENTED.
...
The macro was introduced in #12724 .
2011-08-10 20:28:54 -05:00
Victor Stinner
ab1d16b456
Issue #13093 : Fix error handling on PyUnicode_EncodeDecimal()
...
* Add tests for PyUnicode_EncodeDecimal() and PyUnicode_TransformDecimalToASCII()
* Remove the unused "e" variable in replace()
2011-11-22 01:45:37 +01:00
Senthil Kumaran
fcdaaa9011
merge from 3.2 - Fix closes Issue12621 - Fix docstrings of find and rfind methods of bytes/bytearry/unicodeobject.
2011-07-27 23:34:29 +08:00
Senthil Kumaran
53516a82df
Fix closes Issue12621 - Fix docstrings of find and rfind methods of bytes/bytearry/unicodeobject.
2011-07-27 23:33:54 +08:00
Victor Stinner
99b9538636
Issue #9642 : Uniformize the tests on the availability of the mbcs codec
...
Add a new HAVE_MBCS define.
2011-07-04 14:23:54 +02:00
Senthil Kumaran
bc9d8f838b
merge from 3.2
2011-07-03 21:05:25 -07:00
Senthil Kumaran
9ebe08d2f6
Fix closes issue12471 - wrong TypeError message when '%i' format spec was used.
2011-07-03 21:03:16 -07:00
Victor Stinner
3cbf14bfb1
Issue #10914 : Initialize correctly the filesystem codec when creating a new
...
subinterpreter to fix a bootstrap issue with codecs implemented in Python, as
the ISO-8859-15 codec.
Add fscodec_initialized attribute to the PyInterpreterState structure.
2011-04-27 00:24:21 +02:00
Victor Stinner
793b531756
Issue #10914 : Initialize correctly the filesystem codec when creating a new
...
subinterpreter to fix a bootstrap issue with codecs implemented in Python, as
the ISO-8859-15 codec.
Add fscodec_initialized attribute to the PyInterpreterState structure.
2011-04-27 00:24:21 +02:00
Ezio Melotti
bf1253b25a
#6780 : merge with 3.2.
2011-04-26 06:45:24 +03:00
Ezio Melotti
f2b3f780a1
#6780 : merge with 3.1.
2011-04-26 06:40:59 +03:00
Ezio Melotti
ba42fd5801
#6780 : fix starts/endswith error message to mention that tuples are accepted too.
2011-04-26 06:09:45 +03:00
Jesus Cea
c1ceb64e41
MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. ( closes #11828 )
2011-04-20 17:59:29 +02:00
Jesus Cea
6159ee3cf5
MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. ( closes #11828 )
2011-04-20 17:42:50 +02:00
Jesus Cea
ac4515063c
startswith and endswith don't accept None as slice index. Patch by Torsten Becker. ( closes #11828 )
2011-04-20 17:09:23 +02:00
Benjamin Peterson
5fd4bd3796
avoid casting with this nice macro
2011-03-06 09:06:34 -06:00
Victor Stinner
2f283c2c19
Fix my previous commit (r88709) for str.encode(errors=...)
2011-03-02 01:21:46 +00:00
Victor Stinner
a5c68c3cb7
Issue #8923 : cache str.encode() result
...
When a string is encoded to UTF-8 in strict mode, the result is cached into the
object. Examples: str.encode(), str.encode('utf-8'), PyUnicode_AsUTF8String()
and PyUnicode_AsEncodedString(unicode, "utf-8", NULL).
2011-03-02 01:03:14 +00:00
Victor Stinner
f3fd733f92
Remove useless argument of _PyUnicode_AsDefaultEncodedString()
2011-03-02 01:03:11 +00:00
Victor Stinner
6d970f4713
Issue #10831 : PyUnicode_FromFormat() supports %li, %lli and %zi formats
2011-03-02 00:04:25 +00:00
Victor Stinner
e7faec1aa9
Fix my previous commit (r88702): initialize size_tflag in parse_format_flags()
2011-03-02 00:01:53 +00:00
Victor Stinner
968654515f
Issue #10829 : Refactor PyUnicode_FromFormat()
...
* Use the same function to parse the format string in the 3 steps
* Fix crashs on invalid format strings
2011-03-01 23:44:09 +00:00
Victor Stinner
2b574a2332
Merged revisions 88697 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r88697 | victor.stinner | 2011-03-01 23:46:52 +0100 (mar., 01 mars 2011) | 4 lines
Issue #11246 : Fix PyUnicode_FromFormat("%V")
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
........
2011-03-01 22:48:49 +00:00
Victor Stinner
2512a8b62e
Issue #11246 : Fix PyUnicode_FromFormat("%V")
...
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
2011-03-01 22:46:52 +00:00
Alexander Belopolsky
4001847a98
PEP 7 conformance changes (whitespace only).
2011-02-26 01:02:56 +00:00
Alexander Belopolsky
1d52146a25
Issue #11303 : Added shortcuts for utf8 and latin1 encodings.
...
Documented the list of optimized encodings as CPython implementation
detail.
2011-02-25 19:19:57 +00:00
Victor Stinner
659eb84457
Merged revisions 88481 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r88481 | victor.stinner | 2011-02-21 22:13:44 +0100 (lun., 21 févr. 2011) | 4 lines
Fix PyUnicode_FromFormatV("%c") for non-BMP char
Issue #10830 : Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
narrow build.
........
2011-02-23 12:14:22 +00:00
Brett Cannon
b94767ff44
Issue #8914 : fix various warnings from the Clang static analyzer v254.
2011-02-22 20:15:44 +00:00
Victor Stinner
5ed8b2c737
Fix PyUnicode_FromFormatV("%c") for non-BMP char
...
Issue #10830 : Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
narrow build.
2011-02-21 21:13:44 +00:00
Victor Stinner
fd34b3788f
Remove bootstrap code of PyUnicode_AsEncodedString()
...
Issue #11187 : Remove bootstrap code (use ASCII) of
PyUnicode_AsEncodedString(), it was replaced by a better fallback (use
the locale encoding) in PyUnicode_EncodeFSDefault().
Prepare also empty sections in NEWS.
2011-02-21 20:51:28 +00:00
Alexander Belopolsky
b9cc00caab
Removed unneeded #include
2010-12-22 02:35:20 +00:00
Benjamin Peterson
28a4dce6a8
remove (un)transform methods
2010-12-12 01:33:04 +00:00
Alexander Belopolsky
942af5a9a4
Issue #10557 : Fixed error messages from float() and other numeric
...
types. Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
Martin v. Löwis
4d0d471a80
Merge branches/pep-0384.
2010-12-03 20:14:31 +00:00
Georg Brandl
3b9406b08a
Remove redundant check for PyBytes in unicode_encode.
2010-12-03 07:54:09 +00:00
Georg Brandl
02524629f3
#7475 : add (un)transform method to bytes/bytearray and str, add back codecs that can be used with them from Python 2.
2010-12-02 18:06:51 +00:00
Georg Brandl
e5b99f0fb3
Remove redundant includes of headers that are already included by Python.h.
2010-11-30 09:41:01 +00:00
Victor Stinner
d5af0a5df0
PyUnicode_DecodeFSDefaultAndSize() raises MemoryError if _Py_char2wchar() fails
2010-11-08 23:34:29 +00:00
Victor Stinner
2f02a51135
PyUnicode_EncodeFS() raises an exception if _Py_wchar2char() fails
...
* Add error_pos optional argument to _Py_wchar2char()
* PyUnicode_EncodeFS() raises a UnicodeEncodeError or MemoryError if
_Py_wchar2char() fails
2010-11-08 22:43:46 +00:00
Victor Stinner
c911bbfd5d
str, bytes, bytearray docstring: remove unnecessary [...]
2010-11-07 19:04:46 +00:00
Victor Stinner
e14e212221
Fix encode/decode method doc of str, bytes, bytearray types
...
* Specify the default encoding: write 'utf-8' instead of
sys.getdefaultencoding(), because the default encoding is now constant
* Specify the default errors value
2010-11-07 18:41:46 +00:00
Eric Smith
16562f41b0
Merged revisions 86277 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r86277 | eric.smith | 2010-11-06 15:27:37 -0400 (Sat, 06 Nov 2010) | 1 line
Added more to docstrings for str.format, format_map, and __format__.
........
2010-11-06 19:29:45 +00:00
Eric Smith
51d2fd983b
Added more to docstrings for str.format, format_map, and __format__.
2010-11-06 19:27:37 +00:00
David Malcolm
9696088b6d
Issue #10288 : The deprecated family of "char"-handling macros
...
(ISLOWER()/ISUPPER()/etc) have now been removed: use Py_ISLOWER() etc
instead.
2010-11-05 17:23:41 +00:00
Eric Smith
27bbca6f79
Issue #6081 : Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict.
2010-11-04 17:06:58 +00:00
Victor Stinner
ad15872854
Simplify PyUnicode_Encode/DecodeFSDefault on Windows/Mac OS X
...
* Windows always uses mbcs
* Mac OS X always uses utf-8
2010-10-27 00:25:46 +00:00
Victor Stinner
f933e1ab6f
Issue #4388 : On Mac OS X, decode command line arguments from UTF-8, instead of
...
the locale encoding. If the LANG (and LC_ALL and LC_CTYPE) environment variable
is not set, the locale encoding is ISO-8859-1, whereas most programs (including
Python) expect UTF-8. Python already uses UTF-8 for the filesystem encoding and
to encode command line arguments on this OS.
2010-10-20 22:58:25 +00:00
Victor Stinner
9a90900da5
PyUnicode_FromFormatV(): Fix %A format
...
It was not completly implemented. Add a test.
2010-10-18 20:59:24 +00:00
Benjamin Peterson
8f67d0893f
make hashes always the size of pointers; introduce Py_hash_t #9778
2010-10-17 20:54:53 +00:00
Georg Brandl
ded5acf34a
Merged revisions 81936 via svnmerge from
...
svn+ssh://svn.python.org/python/branches/py3k
........
r81936 | mark.dickinson | 2010-06-12 11:10:14 +0200 (Sa, 12 Jun 2010) | 2 lines
Silence 'unused variable' gcc warning. Patch by Éric Araujo.
........
2010-10-17 11:48:07 +00:00
Victor Stinner
168e117e0a
Add an optional size argument to _Py_char2wchar()
...
_Py_char2wchar() callers usually need the result size in characters. Since it's
trivial to compute it in _Py_char2wchar() (O(1) whereas wcslen() is O(n)), add
an option to get it.
2010-10-16 23:16:16 +00:00
Victor Stinner
f3170ccef8
Use locale encoding if Py_FileSystemDefaultEncoding is not set
...
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
Py_FileSystemDefaultEncoding is NULL
* redecode_filenames() functions and _Py_code_object_list (issue #9630 )
are no more needed: remove them
2010-10-15 12:04:23 +00:00
Georg Brandl
66c221e993
#9418 : first step of moving private string methods to _string module.
2010-10-14 07:04:07 +00:00
Victor Stinner
beb4135b8c
PyUnicode_AsWideCharString() takes a PyObject*, not a PyUnicodeObject*
...
All unicode functions uses PyObject* except PyUnicode_AsWideChar(). Fix the
prototype for the new function PyUnicode_AsWideCharString().
2010-10-07 01:02:42 +00:00
Victor Stinner
5593d8aeb4
Issue #8670 : PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() replace
...
UTF-16 surrogate pairs by single non-BMP characters for 16 bits Py_UNICODE
and 32 bits wchar_t (eg. Linux in narrow build).
2010-10-02 11:11:27 +00:00
Victor Stinner
1c24bd0252
Issue #8870 : PyUnicode_AsWideCharString() doesn't count the trailing nul character
...
And write unit tests for PyUnicode_AsWideChar() and PyUnicode_AsWideCharString().
2010-10-02 11:03:13 +00:00
Victor Stinner
71e91a358b
Fix PyUnicode_AsWideCharString(): set *size if size is not NULL
2010-09-29 17:55:12 +00:00
Victor Stinner
c39211f51e
Issue #9630 : Redecode filenames when setting the filesystem encoding
...
Redecode the filenames of:
- all modules: __file__ and __path__ attributes
- all code objects: co_filename attribute
- sys.path
- sys.meta_path
- sys.executable
- sys.path_importer_cache (keys)
Keep weak references to all code objects until initfsencoding() is called, to
be able to redecode co_filename attribute of all code objects.
2010-09-29 16:35:47 +00:00
Victor Stinner
137c34c027
Issue #9979 : Create function PyUnicode_AsWideCharString().
2010-09-29 10:25:54 +00:00
Benjamin Peterson
d4ac96a336
use return NULL; it's just as correct
2010-09-12 16:40:53 +00:00
Victor Stinner
4c7db315df
Issue #9738 , #9836 : Fix refleak introduced by r84704
2010-09-12 07:51:18 +00:00
Benjamin Peterson
9be0b2e312
detect non-ascii characters much earlier (plugs ref leak)
2010-09-12 03:40:54 +00:00
Victor Stinner
1205f2774e
Issue #9738 : PyUnicode_FromFormat() and PyErr_Format() raise an error on
...
a non-ASCII byte in the format string.
Document also the encoding.
2010-09-11 00:54:47 +00:00
Victor Stinner
46408606d8
Rename PyUnicode_strdup() to PyUnicode_AsUnicodeCopy()
2010-09-03 16:18:00 +00:00
Victor Stinner
71133ff368
Create PyUnicode_strdup() function
2010-09-01 23:43:53 +00:00
Victor Stinner
c4eb765fc1
Create Py_UNICODE_strcat() function
2010-09-01 23:43:50 +00:00
Victor Stinner
42cb462682
Remove unicode_default_encoding constant
...
Inline its value in PyUnicode_GetDefaultEncoding(). The comment is now outdated
(we will not change its value anymore).
2010-09-01 19:39:01 +00:00
Antoine Pitrou
fce7fd6426
Issue #9549 : sys.setdefaultencoding() and PyUnicode_SetDefaultEncoding()
...
are now removed, since their effect was inexistent in 3.x (the default
encoding is hardcoded to utf-8 and cannot be changed).
2010-09-01 18:54:56 +00:00
Antoine Pitrou
a2983c6734
Merged revisions 84394 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r84394 | antoine.pitrou | 2010-09-01 17:10:12 +0200 (mer., 01 sept. 2010) | 4 lines
Issue #7415 : PyUnicode_FromEncodedObject() now uses the new buffer API
properly. Patch by Stefan Behnel.
........
2010-09-01 15:16:41 +00:00
Antoine Pitrou
b0fa831d1e
Issue #7415 : PyUnicode_FromEncodedObject() now uses the new buffer API
...
properly. Patch by Stefan Behnel.
2010-09-01 15:10:12 +00:00
Daniel Stutzbach
8515eaefda
Issue 8781: On systems a signed 4-byte wchar_t and a 4-byte Py_UNICODE, use memcpy to convert between the two (as already done when wchar_t is unsigned)
2010-08-24 21:57:33 +00:00
Victor Stinner
3119ed73aa
Fix PyUnicode_EncodeFSDefault() indentation
2010-08-18 22:26:50 +00:00
Victor Stinner
ef8d95c498
Issue #9425 : Create Py_UNICODE_strncmp() function
...
The code is based on strncmp() of the libiberty library,
function in the public domain.
2010-08-16 22:03:11 +00:00
Victor Stinner
47fcb5b4c3
Issue #9542 : Create PyUnicode_FSDecoder() function
...
It's a ParseTuple converter: decode bytes objects to unicode using
PyUnicode_DecodeFSDefaultAndSize(); str objects are output as-is.
* Don't specify surrogateescape error handler in the comments nor the
documentation, but PyUnicode_DecodeFSDefaultAndSize() and
PyUnicode_EncodeFSDefault() because these functions use strict error handler
for the mbcs encoding (on Windows).
* Remove PyUnicode_FSConverter() comment in unicodeobject.c to avoid
inconsistency with unicodeobject.h.
2010-08-13 23:59:58 +00:00
Victor Stinner
4a2b7a1b14
Issue #9425 : Create PyErr_WarnFormat() function
...
Similar to PyErr_WarnEx() but use PyUnicode_FromFormatV() to format the warning
message.
Strip also some trailing spaces.
2010-08-13 14:03:48 +00:00
Alexander Belopolsky
f0f45142d5
Issue #2443 : Added a new macro, Py_VA_COPY, which is equivalent to C99
...
va_copy, but available on all python platforms. Untabified a few
unrelated files.
2010-08-11 17:31:17 +00:00
Victor Stinner
331ea92ade
Issue #9425 : create Py_UNICODE_strrchr() function
2010-08-10 16:37:20 +00:00
Georg Brandl
1fa11af7aa
Merged revisions 83226-83227,83229-83232 via svnmerge from
...
svn+ssh://svn.python.org/python/branches/py3k
........
r83226 | georg.brandl | 2010-07-29 16:17:12 +0200 (Do, 29 Jul 2010) | 1 line
#1090076 : explain the behavior of *vars* in get() better.
........
r83227 | georg.brandl | 2010-07-29 16:23:06 +0200 (Do, 29 Jul 2010) | 1 line
Use Py_CLEAR().
........
r83229 | georg.brandl | 2010-07-29 16:32:22 +0200 (Do, 29 Jul 2010) | 1 line
#9407 : document configparser.Error.
........
r83230 | georg.brandl | 2010-07-29 16:36:11 +0200 (Do, 29 Jul 2010) | 1 line
Use correct directive and name.
........
r83231 | georg.brandl | 2010-07-29 16:46:07 +0200 (Do, 29 Jul 2010) | 1 line
#9397 : remove mention of dbm.bsd which does not exist anymore.
........
r83232 | georg.brandl | 2010-07-29 16:49:08 +0200 (Do, 29 Jul 2010) | 1 line
#9388 : remove ERA_YEAR which is never defined in the source code.
........
2010-08-01 21:03:01 +00:00
Georg Brandl
0f1470960c
Recorded merge of revisions 83444 via svnmerge from
...
svn+ssh://svn.python.org/python/branches/py3k
........
r83444 | georg.brandl | 2010-08-01 22:51:02 +0200 (So, 01 Aug 2010) | 1 line
Revert r83395, it introduces test failures and is not necessary anyway since we now have to nul-terminate the string anyway.
........
2010-08-01 20:54:22 +00:00
Georg Brandl
78eef3de88
Revert r83395, it introduces test failures and is not necessary anyway since we now have to nul-terminate the string anyway.
2010-08-01 20:51:02 +00:00
Georg Brandl
a70070c9e5
Merged revisions 83395,83417 via svnmerge from
...
svn+ssh://svn.python.org/python/branches/py3k
........
r83395 | georg.brandl | 2010-08-01 10:49:18 +0200 (So, 01 Aug 2010) | 1 line
#8821 : do not rely on Unicode strings being terminated with a \u0000, rather explicitly check range before looking for a second surrogate character.
........
r83417 | georg.brandl | 2010-08-01 20:38:26 +0200 (So, 01 Aug 2010) | 1 line
#5776 : fix mistakes in python specfile. (Nobody probably uses it anyway.)
........
2010-08-01 18:59:44 +00:00
Georg Brandl
bd534f0349
#8821 : do not rely on Unicode strings being terminated with a \u0000, rather explicitly check range before looking for a second surrogate character.
2010-08-01 08:49:18 +00:00
Georg Brandl
8ee604b989
Use Py_CLEAR().
2010-07-29 14:23:06 +00:00
Stefan Krah
aebd6f4c29
Merged revisions 82978 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r82978 | stefan.krah | 2010-07-19 19:58:26 +0200 (Mon, 19 Jul 2010) | 3 lines
Sub-issue of #9036 : Fix incorrect use of Py_CHARMASK.
........
2010-07-19 18:01:13 +00:00
Stefan Krah
99212f61db
Sub-issue of #9036 : Fix incorrect use of Py_CHARMASK.
2010-07-19 17:58:26 +00:00
Senthil Kumaran
74ceac2306
Merged revisions 82573 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r82573 | senthil.kumaran | 2010-07-05 17:30:56 +0530 (Mon, 05 Jul 2010) | 3 lines
Fix the docstrings of the capitalize method.
........
2010-07-05 12:04:23 +00:00
Senthil Kumaran
e51ee8a5bc
Fix the docstrings of the capitalize method.
2010-07-05 12:00:56 +00:00
Ezio Melotti
25bc019d46
Merged revisions 82413,82468 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r82413 | ezio.melotti | 2010-07-01 10:32:02 +0300 (Thu, 01 Jul 2010) | 13 lines
Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.
1) #8271 : when a byte sequence is invalid, only the start byte and all the
valid continuation bytes are now replaced by U+FFFD, instead of replacing
the number of bytes specified by the start byte.
See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-bytes-long UTF-8 sequences are now considered invalid (no changes
in behavior);
3) Change the error messages "unexpected code byte" to "invalid start byte"
and "invalid data" to "invalid continuation byte";
4) Add an extensive set of tests in test_unicode;
5) Fix test_codeccallbacks because it was failing after this change.
........
r82468 | ezio.melotti | 2010-07-03 07:52:19 +0300 (Sat, 03 Jul 2010) | 1 line
Update comment about surrogates.
........
2010-07-03 05:18:50 +00:00
Ezio Melotti
9bf2b3ae6a
Update comment about surrogates.
2010-07-03 04:52:19 +00:00
Ezio Melotti
57221d02ba
Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.
...
1) #8271 : when a byte sequence is invalid, only the start byte and all the
valid continuation bytes are now replaced by U+FFFD, instead of replacing
the number of bytes specified by the start byte.
See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-bytes-long UTF-8 sequences are now considered invalid (no changes
in behavior);
3) Change the error messages "unexpected code byte" to "invalid start byte"
and "invalid data" to "invalid continuation byte";
4) Add an extensive set of tests in test_unicode;
5) Fix test_codeccallbacks because it was failing after this change.
2010-07-01 07:32:02 +00:00
Georg Brandl
952867aa30
#9078 : fix some Unicode C API descriptions, in comments and docs.
2010-06-27 10:17:12 +00:00
Ezio Melotti
415f340a0c
Merged revisions 82252 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r82252 | ezio.melotti | 2010-06-26 21:50:39 +0300 (Sat, 26 Jun 2010) | 9 lines
Merged revisions 82248 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r82248 | ezio.melotti | 2010-06-26 21:44:42 +0300 (Sat, 26 Jun 2010) | 1 line
Fix extra space.
........
................
2010-06-26 18:52:26 +00:00
Ezio Melotti
c1897e716d
Merged revisions 82248 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r82248 | ezio.melotti | 2010-06-26 21:44:42 +0300 (Sat, 26 Jun 2010) | 1 line
Fix extra space.
........
2010-06-26 18:50:39 +00:00
Victor Stinner
554f3f0081
Issue #850997 : mbcs encoding (Windows only) handles errors argument: strict
...
mode raises unicode errors. The encoder only supports "strict" and "replace"
error handlers, the decoder only supports "strict" and "ignore" error handlers.
2010-06-16 23:33:54 +00:00
Mark Dickinson
7db923cc99
Silence 'unused variable' gcc warning. Patch by Éric Araujo.
2010-06-12 09:10:14 +00:00
Victor Stinner
313a120ab6
Issue #8969 : On Windows, use mbcs codec in strict mode to encode and decode
...
filenames and enable os.fsencode().
2010-06-11 23:56:51 +00:00
Antoine Pitrou
6107a688ee
Merged revisions 81908 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r81908 | antoine.pitrou | 2010-06-11 23:46:32 +0200 (ven., 11 juin 2010) | 11 lines
Merged revisions 81907 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (ven., 11 juin 2010) | 5 lines
Issue #8941 : decoding big endian UTF-32 data in UCS-2 builds could crash
the interpreter with characters outside the Basic Multilingual Plane
(higher than 0x10000).
........
................
2010-06-11 21:48:34 +00:00
Antoine Pitrou
cc0cfd3576
Merged revisions 81907 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (ven., 11 juin 2010) | 5 lines
Issue #8941 : decoding big endian UTF-32 data in UCS-2 builds could crash
the interpreter with characters outside the Basic Multilingual Plane
(higher than 0x10000).
........
2010-06-11 21:46:32 +00:00
Victor Stinner
37296e89a5
Fix r81869: ISO-8859-15 was seen as an alias to ISO-8859-1
...
Don't use normalize_encoding() result if it is truncated.
2010-06-10 13:36:23 +00:00
Victor Stinner
600d3bed6c
Issue #8922 : Normalize the encoding name in PyUnicode_AsEncodedString() to
...
enable shortcuts for upper case encoding name. Add also a shortcut for
"iso-8859-1" in PyUnicode_AsEncodedString() and PyUnicode_Decode().
2010-06-10 12:00:55 +00:00
Victor Stinner
ae6265f8d0
Issue #8715 : Create PyUnicode_EncodeFSDefault() function: Encode a Unicode
...
object to Py_FileSystemDefaultEncoding with the "surrogateescape" error
handler, return a bytes object. If Py_FileSystemDefaultEncoding is not set,
fall back to UTF-8.
2010-05-15 16:27:27 +00:00
Victor Stinner
59e62db0a3
Enable shortcuts for common encodings in PyUnicode_AsEncodedString() for any
...
error handler, not only the default error handler (strict)
2010-05-15 13:14:32 +00:00
Victor Stinner
b9a20ad036
PyUnicode_DecodeFSDefaultAndSize() uses surrogateescape error handler
...
This function is only used to decode Python module filenames, but Python
doesn't support surrogates in modules filenames yet. So nobody noticed this
minor bug.
2010-04-30 16:37:52 +00:00
Victor Stinner
0ea2a468e3
Simplify PyUnicode_FSConverter(): remove reference to PyByteArray
...
PyByteArray is no more supported
2010-04-30 00:22:08 +00:00
Benjamin Peterson
a23831ff44
condense condition
2010-04-25 21:54:00 +00:00
Victor Stinner
0b79b76c2b
Merged revisions 80384 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r80384 | victor.stinner | 2010-04-22 22:01:57 +0200 (jeu., 22 avril 2010) | 2 lines
Fix my previous commit (r80382) for wide build (unicodeobject.c)
........
2010-04-22 20:07:28 +00:00
Victor Stinner
445a623226
Fix my previous commit (r80382) for wide build (unicodeobject.c)
2010-04-22 20:01:57 +00:00
Victor Stinner
158701d886
Merged revisions 80382 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r80382 | victor.stinner | 2010-04-22 21:38:16 +0200 (jeu., 22 avril 2010) | 3 lines
Issue #8092 : Fix PyUnicode_EncodeUTF8() to support error handler producing
unicode string (eg. backslashreplace)
........
2010-04-22 19:41:01 +00:00
Victor Stinner
31be90b0c7
Issue #8092 : Fix PyUnicode_EncodeUTF8() to support error handler producing
...
unicode string (eg. backslashreplace)
2010-04-22 19:38:16 +00:00
Victor Stinner
dcb2403022
Issue #8485 : PyUnicode_FSConverter() doesn't accept bytearray object anymore,
...
you have to convert your bytearray filenames to bytes
2010-04-22 12:08:36 +00:00
Florent Xicluna
806d8cf0e8
Merged revisions 79494,79496 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79494 | florent.xicluna | 2010-03-30 10:24:06 +0200 (mar, 30 mar 2010) | 2 lines
#7643 : Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14 .
........
r79496 | florent.xicluna | 2010-03-30 18:29:03 +0200 (mar, 30 mar 2010) | 2 lines
Highlight the change of behavior related to r79494. Now VT and FF are linebreaks.
........
2010-03-30 19:34:18 +00:00
Victor Stinner
d526c7cc1c
Merged revisions 76197 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r76197 | benjamin.peterson | 2009-11-10 22:23:15 +0100 (mar., 10 nov. 2009) | 1 line
death to compiler warning
........
2010-03-23 11:43:20 +00:00
Victor Stinner
abdb21a3a8
Merged revisions 79281 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r79281 | victor.stinner | 2010-03-22 13:50:40 +0100 (lun., 22 mars 2010) | 16 lines
Merged revisions 79278,79280 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79278 | victor.stinner | 2010-03-22 13:24:37 +0100 (lun., 22 mars 2010) | 2 lines
Issue #1583863 : An unicode subclass can now override the __str__ method
........
r79280 | victor.stinner | 2010-03-22 13:36:28 +0100 (lun., 22 mars 2010) | 5 lines
Fix the NEWS about my last commit: an unicode subclass can now override the
__unicode__ method (and not the __str__ method).
Simplify also the testcase.
........
................
2010-03-22 12:53:14 +00:00
Victor Stinner
808fc0a0ee
Merged revisions 79278,79280 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r79278 | victor.stinner | 2010-03-22 13:24:37 +0100 (lun., 22 mars 2010) | 2 lines
Issue #1583863 : An unicode subclass can now override the __str__ method
........
r79280 | victor.stinner | 2010-03-22 13:36:28 +0100 (lun., 22 mars 2010) | 5 lines
Fix the NEWS about my last commit: an unicode subclass can now override the
__unicode__ method (and not the __str__ method).
Simplify also the testcase.
........
2010-03-22 12:50:40 +00:00
Gregory P. Smith
cc47d8c8d4
Update a comment with more details.
2010-02-27 08:33:11 +00:00
Ezio Melotti
4c81fbb118
Merged revisions 77745 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r77745 | ezio.melotti | 2010-01-25 13:58:28 +0200 (Mon, 25 Jan 2010) | 9 lines
Merged revisions 77743 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r77743 | ezio.melotti | 2010-01-25 13:24:37 +0200 (Mon, 25 Jan 2010) | 1 line
#7775 : fixed docstring for rpartition
........
................
2010-01-25 12:02:24 +00:00
Ezio Melotti
5b2b242f07
Merged revisions 77743 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r77743 | ezio.melotti | 2010-01-25 13:24:37 +0200 (Mon, 25 Jan 2010) | 1 line
#7775 : fixed docstring for rpartition
........
2010-01-25 11:58:28 +00:00
Antoine Pitrou
f068f94e82
Merged revisions 77469-77470 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r77469 | antoine.pitrou | 2010-01-13 14:43:37 +0100 (mer., 13 janv. 2010) | 3 lines
Test commit to try to diagnose failures of the IA-64 buildbot
........
r77470 | antoine.pitrou | 2010-01-13 15:01:26 +0100 (mer., 13 janv. 2010) | 3 lines
Sanitize bloom filter macros
........
2010-01-13 14:19:12 +00:00
Antoine Pitrou
cbfdee3e54
Merged revisions 77463 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r77463 | antoine.pitrou | 2010-01-13 09:55:20 +0100 (mer., 13 janv. 2010) | 3 lines
Fix Windows build (re r77461)
........
2010-01-13 08:58:08 +00:00
Antoine Pitrou
f2c5484f9e
Merged revisions 77461 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r77461 | antoine.pitrou | 2010-01-13 08:55:48 +0100 (mer., 13 janv. 2010) | 5 lines
Issue #7622 : Improve the split(), rsplit(), splitlines() and replace()
methods of bytes, bytearray and unicode objects by using a common
implementation based on stringlib's fast search. Patch by Florent Xicluna.
........
2010-01-13 08:07:53 +00:00
Benjamin Peterson
bb81c8ce3f
Merged revisions 77395 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r77395 | benjamin.peterson | 2010-01-09 15:45:28 -0600 (Sat, 09 Jan 2010) | 2 lines
Python strings ending with '\0' should not be equivalent to their C counterparts in PyUnicode_CompareWithASCIIString
........
2010-01-09 21:54:39 +00:00
Benjamin Peterson
8667a9b6ea
Python strings ending with '\0' should not be equivalent to their C counterparts in PyUnicode_CompareWithASCIIString
2010-01-09 21:45:28 +00:00
Mark Dickinson
6ce4a9a9f2
Merged revisions 76308 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r76308 | mark.dickinson | 2009-11-15 16:18:58 +0000 (Sun, 15 Nov 2009) | 3 lines
Issue #7228 : Add '%lld' and '%llu' support to PyFormat_FromString,
PyFormat_FromStringV and PyErr_Format.
........
2009-11-16 17:00:11 +00:00
Benjamin Peterson
adf6a6c842
death to compiler warning
2009-11-10 21:23:15 +00:00
Georg Brandl
628e6f908e
Merged revisions 75797 via svnmerge from
...
svn+ssh://svn.python.org/python/branches/py3k
................
r75797 | georg.brandl | 2009-10-27 16:28:25 +0100 (Di, 27 Okt 2009) | 129 lines
Merged revisions 75365,75394,75402-75403,75418,75459,75484,75592-75596,75600,75602-75607,75610-75613,75616-75617,75623,75627,75640,75647,75696,75795 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r75365 | georg.brandl | 2009-10-11 22:16:16 +0200 (So, 11 Okt 2009) | 1 line
Fix broken links found by "make linkcheck". scipy.org seems to be done right now, so I could not verify links going there.
........
r75394 | georg.brandl | 2009-10-13 20:10:59 +0200 (Di, 13 Okt 2009) | 1 line
Fix markup.
........
r75402 | georg.brandl | 2009-10-14 17:51:48 +0200 (Mi, 14 Okt 2009) | 1 line
#7125 : fix typo.
........
r75403 | georg.brandl | 2009-10-14 17:57:46 +0200 (Mi, 14 Okt 2009) | 1 line
#7126 : os.environ changes *do* take effect in subprocesses started with os.system().
........
r75418 | georg.brandl | 2009-10-14 20:48:32 +0200 (Mi, 14 Okt 2009) | 1 line
#7116 : str.join() takes an iterable.
........
r75459 | georg.brandl | 2009-10-17 10:57:43 +0200 (Sa, 17 Okt 2009) | 1 line
Fix refleaks in _ctypes PyCSimpleType_New, which fixes the refleak seen in test___all__.
........
r75484 | georg.brandl | 2009-10-18 09:58:12 +0200 (So, 18 Okt 2009) | 1 line
Fix missing word.
........
r75592 | georg.brandl | 2009-10-22 09:05:48 +0200 (Do, 22 Okt 2009) | 1 line
Fix punctuation.
........
r75593 | georg.brandl | 2009-10-22 09:06:49 +0200 (Do, 22 Okt 2009) | 1 line
Revert unintended change.
........
r75594 | georg.brandl | 2009-10-22 09:56:02 +0200 (Do, 22 Okt 2009) | 1 line
Fix markup.
........
r75595 | georg.brandl | 2009-10-22 09:56:56 +0200 (Do, 22 Okt 2009) | 1 line
Fix duplicate target.
........
r75596 | georg.brandl | 2009-10-22 10:05:04 +0200 (Do, 22 Okt 2009) | 1 line
Add a new directive marking up implementation details and start using it.
........
r75600 | georg.brandl | 2009-10-22 13:01:46 +0200 (Do, 22 Okt 2009) | 1 line
Make it more robust.
........
r75602 | georg.brandl | 2009-10-22 13:28:06 +0200 (Do, 22 Okt 2009) | 1 line
Document new directive.
........
r75603 | georg.brandl | 2009-10-22 13:28:23 +0200 (Do, 22 Okt 2009) | 1 line
Allow short form with text as argument.
........
r75604 | georg.brandl | 2009-10-22 13:36:50 +0200 (Do, 22 Okt 2009) | 1 line
Fix stylesheet for multi-paragraph impl-details.
........
r75605 | georg.brandl | 2009-10-22 13:48:10 +0200 (Do, 22 Okt 2009) | 1 line
Use "impl-detail" directive where applicable.
........
r75606 | georg.brandl | 2009-10-22 17:00:06 +0200 (Do, 22 Okt 2009) | 1 line
#6324 : membership test tries iteration via __iter__.
........
r75607 | georg.brandl | 2009-10-22 17:04:09 +0200 (Do, 22 Okt 2009) | 1 line
#7088 : document new functions in signal as Unix-only.
........
r75610 | georg.brandl | 2009-10-22 17:27:24 +0200 (Do, 22 Okt 2009) | 1 line
Reorder __slots__ fine print and add a clarification.
........
r75611 | georg.brandl | 2009-10-22 17:42:32 +0200 (Do, 22 Okt 2009) | 1 line
#7035 : improve docs of the various <method>_errors() functions, and give them docstrings.
........
r75612 | georg.brandl | 2009-10-22 17:52:15 +0200 (Do, 22 Okt 2009) | 1 line
#7156 : document curses as Unix-only.
........
r75613 | georg.brandl | 2009-10-22 17:54:35 +0200 (Do, 22 Okt 2009) | 1 line
#6977 : getopt does not support optional option arguments.
........
r75616 | georg.brandl | 2009-10-22 18:17:05 +0200 (Do, 22 Okt 2009) | 1 line
Add proper references.
........
r75617 | georg.brandl | 2009-10-22 18:20:55 +0200 (Do, 22 Okt 2009) | 1 line
Make printout margin important.
........
r75623 | georg.brandl | 2009-10-23 10:14:44 +0200 (Fr, 23 Okt 2009) | 1 line
#7188 : fix optionxform() docs.
........
r75627 | fred.drake | 2009-10-23 15:04:51 +0200 (Fr, 23 Okt 2009) | 2 lines
add further note about what's passed to optionxform
........
r75640 | neil.schemenauer | 2009-10-23 21:58:17 +0200 (Fr, 23 Okt 2009) | 2 lines
Improve some docstrings in the 'warnings' module.
........
r75647 | georg.brandl | 2009-10-24 12:04:19 +0200 (Sa, 24 Okt 2009) | 1 line
Fix markup.
........
r75696 | georg.brandl | 2009-10-25 21:25:43 +0100 (So, 25 Okt 2009) | 1 line
Fix a demo.
........
r75795 | georg.brandl | 2009-10-27 16:10:22 +0100 (Di, 27 Okt 2009) | 1 line
Fix a strange mis-edit.
........
................
2009-10-27 20:24:45 +00:00
Georg Brandl
495f7b5adb
Merged revisions 75365,75394,75402-75403,75418,75459,75484,75592-75596,75600,75602-75607,75610-75613,75616-75617,75623,75627,75640,75647,75696,75795 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r75365 | georg.brandl | 2009-10-11 22:16:16 +0200 (So, 11 Okt 2009) | 1 line
Fix broken links found by "make linkcheck". scipy.org seems to be done right now, so I could not verify links going there.
........
r75394 | georg.brandl | 2009-10-13 20:10:59 +0200 (Di, 13 Okt 2009) | 1 line
Fix markup.
........
r75402 | georg.brandl | 2009-10-14 17:51:48 +0200 (Mi, 14 Okt 2009) | 1 line
#7125 : fix typo.
........
r75403 | georg.brandl | 2009-10-14 17:57:46 +0200 (Mi, 14 Okt 2009) | 1 line
#7126 : os.environ changes *do* take effect in subprocesses started with os.system().
........
r75418 | georg.brandl | 2009-10-14 20:48:32 +0200 (Mi, 14 Okt 2009) | 1 line
#7116 : str.join() takes an iterable.
........
r75459 | georg.brandl | 2009-10-17 10:57:43 +0200 (Sa, 17 Okt 2009) | 1 line
Fix refleaks in _ctypes PyCSimpleType_New, which fixes the refleak seen in test___all__.
........
r75484 | georg.brandl | 2009-10-18 09:58:12 +0200 (So, 18 Okt 2009) | 1 line
Fix missing word.
........
r75592 | georg.brandl | 2009-10-22 09:05:48 +0200 (Do, 22 Okt 2009) | 1 line
Fix punctuation.
........
r75593 | georg.brandl | 2009-10-22 09:06:49 +0200 (Do, 22 Okt 2009) | 1 line
Revert unintended change.
........
r75594 | georg.brandl | 2009-10-22 09:56:02 +0200 (Do, 22 Okt 2009) | 1 line
Fix markup.
........
r75595 | georg.brandl | 2009-10-22 09:56:56 +0200 (Do, 22 Okt 2009) | 1 line
Fix duplicate target.
........
r75596 | georg.brandl | 2009-10-22 10:05:04 +0200 (Do, 22 Okt 2009) | 1 line
Add a new directive marking up implementation details and start using it.
........
r75600 | georg.brandl | 2009-10-22 13:01:46 +0200 (Do, 22 Okt 2009) | 1 line
Make it more robust.
........
r75602 | georg.brandl | 2009-10-22 13:28:06 +0200 (Do, 22 Okt 2009) | 1 line
Document new directive.
........
r75603 | georg.brandl | 2009-10-22 13:28:23 +0200 (Do, 22 Okt 2009) | 1 line
Allow short form with text as argument.
........
r75604 | georg.brandl | 2009-10-22 13:36:50 +0200 (Do, 22 Okt 2009) | 1 line
Fix stylesheet for multi-paragraph impl-details.
........
r75605 | georg.brandl | 2009-10-22 13:48:10 +0200 (Do, 22 Okt 2009) | 1 line
Use "impl-detail" directive where applicable.
........
r75606 | georg.brandl | 2009-10-22 17:00:06 +0200 (Do, 22 Okt 2009) | 1 line
#6324 : membership test tries iteration via __iter__.
........
r75607 | georg.brandl | 2009-10-22 17:04:09 +0200 (Do, 22 Okt 2009) | 1 line
#7088 : document new functions in signal as Unix-only.
........
r75610 | georg.brandl | 2009-10-22 17:27:24 +0200 (Do, 22 Okt 2009) | 1 line
Reorder __slots__ fine print and add a clarification.
........
r75611 | georg.brandl | 2009-10-22 17:42:32 +0200 (Do, 22 Okt 2009) | 1 line
#7035 : improve docs of the various <method>_errors() functions, and give them docstrings.
........
r75612 | georg.brandl | 2009-10-22 17:52:15 +0200 (Do, 22 Okt 2009) | 1 line
#7156 : document curses as Unix-only.
........
r75613 | georg.brandl | 2009-10-22 17:54:35 +0200 (Do, 22 Okt 2009) | 1 line
#6977 : getopt does not support optional option arguments.
........
r75616 | georg.brandl | 2009-10-22 18:17:05 +0200 (Do, 22 Okt 2009) | 1 line
Add proper references.
........
r75617 | georg.brandl | 2009-10-22 18:20:55 +0200 (Do, 22 Okt 2009) | 1 line
Make printout margin important.
........
r75623 | georg.brandl | 2009-10-23 10:14:44 +0200 (Fr, 23 Okt 2009) | 1 line
#7188 : fix optionxform() docs.
........
r75627 | fred.drake | 2009-10-23 15:04:51 +0200 (Fr, 23 Okt 2009) | 2 lines
add further note about what's passed to optionxform
........
r75640 | neil.schemenauer | 2009-10-23 21:58:17 +0200 (Fr, 23 Okt 2009) | 2 lines
Improve some docstrings in the 'warnings' module.
........
r75647 | georg.brandl | 2009-10-24 12:04:19 +0200 (Sa, 24 Okt 2009) | 1 line
Fix markup.
........
r75696 | georg.brandl | 2009-10-25 21:25:43 +0100 (So, 25 Okt 2009) | 1 line
Fix a demo.
........
r75795 | georg.brandl | 2009-10-27 16:10:22 +0100 (Di, 27 Okt 2009) | 1 line
Fix a strange mis-edit.
........
2009-10-27 15:28:25 +00:00
Benjamin Peterson
f38a69f979
kill merged line
2009-09-18 21:49:06 +00:00
Benjamin Peterson
308d637c94
Merged revisions 74929 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r74929 | benjamin.peterson | 2009-09-18 16:14:55 -0500 (Fri, 18 Sep 2009) | 1 line
add keyword arguments support to str/unicode encode and decode #6300
........
2009-09-18 21:42:35 +00:00
Georg Brandl
194da4a7da
Merged revisions 74126,74130-74131,74149,74155,74157,74180-74183,74398 via svnmerge from
...
svn+ssh://svn.python.org/python/branches/py3k
................
r74126 | alexandre.vassalotti | 2009-07-21 02:39:03 +0200 (Di, 21 Jul 2009) | 14 lines
Merged revisions 73871 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r73871 | alexandre.vassalotti | 2009-07-06 22:17:30 -0400 (Mon, 06 Jul 2009) | 7 lines
Grow the allocated buffer in PyUnicode_EncodeUTF7 to avoid buffer overrun.
Without this change, test_unicode.UnicodeTest.test_codecs_utf7 crashes in
debug mode. What happens is the unicode string u'\U000abcde' with a length
of 1 encodes to the string '+2m/c3g-' of length 8. Since only 5 bytes is
reserved in the buffer, a buffer overrun occurs.
........
................
r74130 | alexandre.vassalotti | 2009-07-21 02:57:50 +0200 (Di, 21 Jul 2009) | 2 lines
Add ignore rule for the Doc/tools/jinga2/ directory.
................
r74131 | alexandre.vassalotti | 2009-07-21 04:51:58 +0200 (Di, 21 Jul 2009) | 13 lines
Merged revisions 73683,73786 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r73683 | georg.brandl | 2009-06-29 10:44:49 -0400 (Mon, 29 Jun 2009) | 1 line
Fix error handling in PyCode_Optimize, by Alexander Schremmer at EuroPython sprint.
........
r73786 | benjamin.peterson | 2009-07-02 18:56:16 -0400 (Thu, 02 Jul 2009) | 1 line
condense with assertRaises
........
................
r74149 | ezio.melotti | 2009-07-21 22:37:52 +0200 (Di, 21 Jul 2009) | 9 lines
Merged revisions 74148 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r74148 | ezio.melotti | 2009-07-21 23:18:27 +0300 (Tue, 21 Jul 2009) | 1 line
#6536 fixed typo
........
................
r74155 | alexandre.vassalotti | 2009-07-22 04:24:49 +0200 (Mi, 22 Jul 2009) | 2 lines
Issue #6242 : Fix deallocator of io.StringIO and io.BytesIO.
................
r74157 | alexandre.vassalotti | 2009-07-22 05:07:33 +0200 (Mi, 22 Jul 2009) | 2 lines
Issue #6241 : Better type checking for the arguments of io.StringIO.
................
r74180 | ezio.melotti | 2009-07-22 23:17:14 +0200 (Mi, 22 Jul 2009) | 9 lines
Merged revisions 74179 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r74179 | ezio.melotti | 2009-07-23 00:08:49 +0300 (Thu, 23 Jul 2009) | 1 line
#6423 has_key -> in
........
................
r74181 | alexandre.vassalotti | 2009-07-22 23:27:53 +0200 (Mi, 22 Jul 2009) | 6 lines
Clean up test_curses.
By using __stdout__ directly, test_curses caused regrtest.py
to duplicate the output of some test results.
................
r74182 | alexandre.vassalotti | 2009-07-22 23:29:01 +0200 (Mi, 22 Jul 2009) | 2 lines
Use assertGreater instead of assertTrue(x > y).
................
r74183 | alexandre.vassalotti | 2009-07-23 01:27:17 +0200 (Do, 23 Jul 2009) | 4 lines
Specialize assertTrue checks when possible.
We should get slightly more helpful failure messages with this change.
................
r74398 | georg.brandl | 2009-08-13 11:16:39 +0200 (Do, 13 Aug 2009) | 1 line
#6694 : fix old function names.
................
2009-08-13 09:34:05 +00:00
Alexandre Vassalotti
e85bd987c4
Merged revisions 73871 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r73871 | alexandre.vassalotti | 2009-07-06 22:17:30 -0400 (Mon, 06 Jul 2009) | 7 lines
Grow the allocated buffer in PyUnicode_EncodeUTF7 to avoid buffer overrun.
Without this change, test_unicode.UnicodeTest.test_codecs_utf7 crashes in
debug mode. What happens is the unicode string u'\U000abcde' with a length
of 1 encodes to the string '+2m/c3g-' of length 8. Since only 5 bytes is
reserved in the buffer, a buffer overrun occurs.
........
2009-07-21 00:39:03 +00:00
Amaury Forgeot d'Arc
e5344d6c45
Merged revisions 73698 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r73698 | amaury.forgeotdarc | 2009-06-30 00:36:49 +0200 (mar., 30 juin 2009) | 7 lines
#6373 : SystemError in str.encode('latin1', 'surrogateescape')
if the string contains unpaired surrogates.
(In debug build, crash in assert())
This can happen with normal processing, if python starts with utf-8,
then calls sys.setfilesystemencoding('latin-1')
........
2009-06-29 22:38:54 +00:00
Amaury Forgeot d'Arc
84ec8d9314
#6373 : SystemError in str.encode('latin1', 'surrogateescape')
...
if the string contains unpaired surrogates.
(In debug build, crash in assert())
This can happen with normal processing, if python starts with utf-8,
then calls sys.setfilesystemencoding('latin-1')
2009-06-29 22:36:49 +00:00
Georg Brandl
c6c3178942
Merged revisions 73190,73213,73257-73258,73260,73275,73294 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r73190 | georg.brandl | 2009-06-04 01:23:45 +0200 (Do, 04 Jun 2009) | 2 lines
Avoid PendingDeprecationWarnings emitted by deprecated unittest methods.
........
r73213 | georg.brandl | 2009-06-04 12:15:57 +0200 (Do, 04 Jun 2009) | 1 line
#5967 : note that the C slicing APIs do not support negative indices.
........
r73257 | georg.brandl | 2009-06-06 19:50:05 +0200 (Sa, 06 Jun 2009) | 1 line
#6211 : elaborate a bit on ways to call the function.
........
r73258 | georg.brandl | 2009-06-06 19:51:31 +0200 (Sa, 06 Jun 2009) | 1 line
#6204 : use a real reference instead of "see later".
........
r73260 | georg.brandl | 2009-06-06 20:21:58 +0200 (Sa, 06 Jun 2009) | 1 line
#6224 : s/JPython/Jython/, and remove one link to a module nine years old.
........
r73275 | georg.brandl | 2009-06-07 22:37:52 +0200 (So, 07 Jun 2009) | 1 line
Add Ezio.
........
r73294 | georg.brandl | 2009-06-08 15:34:52 +0200 (Mo, 08 Jun 2009) | 1 line
#6194 : O_SHLOCK/O_EXLOCK are not really more platform independent than lockf().
........
2009-06-08 13:41:29 +00:00
Raymond Hettinger
3ad05763a6
Strengthen the guard. The code doesn't work well with subclasses.
2009-05-29 22:11:22 +00:00
Martin v. Löwis
c15bdef819
Issue #6012 : Add cleanup support to O& argument parsing.
2009-05-29 14:47:46 +00:00
Martin v. Löwis
43c57785d3
Rename utf8b error handler to surrogateescape.
2009-05-10 08:15:24 +00:00
Benjamin Peterson
b173f7853e
add a replacement API for PyCObject, PyCapsule #5630
...
All stdlib modules with C-APIs now use this.
Patch by Larry Hastings
2009-05-05 22:31:58 +00:00
Georg Brandl
780b2a6153
Merged revisions 72326 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72326 | georg.brandl | 2009-05-05 11:19:43 +0200 (Di, 05 Mai 2009) | 1 line
#5929 : fix signedness warning.
........
2009-05-05 09:19:59 +00:00
Martin v. Löwis
011e842033
Issue #5915 : Implement PEP 383, Non-decodable Bytes in
...
System Character Interfaces.
2009-05-05 04:43:17 +00:00
Antoine Pitrou
244651aa2f
Merged revisions 72283-72284 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72283 | antoine.pitrou | 2009-05-04 20:32:32 +0200 (lun., 04 mai 2009) | 4 lines
Issue #4426 : The UTF-7 decoder was too strict and didn't accept some legal sequences.
Patch by Nick Barnes and Victor Stinner.
........
r72284 | antoine.pitrou | 2009-05-04 20:32:50 +0200 (lun., 04 mai 2009) | 3 lines
Add Nick Barnes to ACKS.
........
2009-05-04 18:56:13 +00:00
Walter Dörwald
c1651a0b96
Merged revisions 72260 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72260 | walter.doerwald | 2009-05-04 00:36:33 +0200 (Mo, 04 Mai 2009) | 5 lines
Issue #5108 : Handle %s like %S and %R in PyUnicode_FromFormatV(): Call
PyUnicode_DecodeUTF8() once, remember the result and output it in a second
step. This avoids problems with counting UTF-8 bytes that ignores the effect
of using the replace error handler in PyUnicode_DecodeUTF8().
........
2009-05-03 22:55:55 +00:00
Martin v. Löwis
db12d454e6
Issue #3672 : Reject surrogates in utf-8 codec; add surrogates error
...
handler.
2009-05-02 18:52:14 +00:00
Mark Dickinson
33841c3489
Issue #5859 : Remove '%f' to '%g' formatting switch for large floats.
2009-05-01 15:37:04 +00:00
Mark Dickinson
f489caf5da
Issue #5859 : Remove use of fixed-length buffers for float formatting
...
in unicodeobject.c and the fallback version of PyOS_double_to_string.
As a result, operations like '%.120e' % 12.34 no longer raise an
exception.
2009-05-01 11:42:00 +00:00
Eric Smith
0923d1d8d7
The other half of Issue #1580 : use short float repr where possible.
...
Addresses the float -> string conversion, using David Gay's code which
was added in Mark Dickinson's checkin r71663.
Also addresses these, which are intertwined with the short repr
changes:
- Issue #5772 : format(1e100, '<') produces '1e+100', not '1.0e+100'
- Issue #5515 : 'n' formatting with commas no longer works poorly
with leading zeros.
- PEP 378 Format Specifier for Thousands Separator: implemented
for floats.
2009-04-16 20:16:10 +00:00
Georg Brandl
222de0f713
#5708 : a bit of streamlining in unicode_repeat().
2009-04-12 12:01:50 +00:00
Eric Smith
a3b1ac8dca
Added ',' thousands grouping to int.__format__. See PEP 378.
...
This is incomplete, but I want to get some version into the next alpha. I am still working on:
Documentation.
More tests.
Implement for floats.
In addition, there's an existing bug with 'n' formatting that carries forward to thousands grouping (issue 5515).
2009-04-03 14:45:06 +00:00
Mark Dickinson
4feda2abc2
Merged revisions 70682,70684 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r70682 | mark.dickinson | 2009-03-29 17:17:16 +0100 (Sun, 29 Mar 2009) | 3 lines
Issue #532631 : Add paranoid check to avoid potential buffer overflow
on systems with sizeof(int) > 4.
........
r70684 | mark.dickinson | 2009-03-29 17:24:29 +0100 (Sun, 29 Mar 2009) | 3 lines
Issue #532631 : Apply floatformat changes to unicodeobject.c
as well as stringobject.c.
........
2009-03-29 16:34:21 +00:00
Mark Dickinson
c8a608c666
Merged revisions 70678 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r70678 | mark.dickinson | 2009-03-29 15:37:51 +0100 (Sun, 29 Mar 2009) | 3 lines
Issue #532631 : Replace confusing fabs(x)/1e25 >= 1e25 test
with fabs(x) >= 1e50, and fix documentation.
........
2009-03-29 15:19:47 +00:00
Hirokazu Yamamoto
3530246621
Merged revisions 70499 via svnmerge from
...
svn+ssh://pythondev@svn.python.org/python/trunk
........
r70499 | hirokazu.yamamoto | 2009-03-21 19:32:52 +0900 | 1 line
There is no macro named SIZEOF_SSIZE_T. Should use SIZEOF_SIZE_T instead.
........
2009-03-21 13:23:27 +00:00
Mark Dickinson
081dfee4f1
Issue 4474: On platforms with sizeof(wchar_t) == 4 and
...
sizeof(Py_UNICODE) == 2, PyUnicode_FromWideChar now converts
each character outside the BMP to the appropriate surrogate pair.
Thanks Victor Stinner for the patch.
2009-03-18 14:47:41 +00:00