Victor Stinner
16e6a80923
PyUnicode_Resize(): warn about canonical representation
...
Call also directly unicode_resize() in unicodeobject.c
2011-12-12 13:24:15 +01:00
Victor Stinner
b0a82a6a7f
Fix PyUnicode_Resize() for compact string: leave the string unchanged on error
...
Fix also PyUnicode_Resize() doc
2011-12-12 13:08:33 +01:00
Victor Stinner
bf6e560d0c
Make PyUnicode_Copy() private => _PyUnicode_Copy()
...
Undocument the function.
Make also decode_utf8_errors() as private (static).
2011-12-12 01:53:47 +01:00
Victor Stinner
7a9105a380
resize_copy() now supports legacy ready strings
2011-12-12 00:13:42 +01:00
Victor Stinner
24c74be9a3
PyUnicode_IS_ASCII() macro ensures that the string is ready
...
It has no sense to check if a not ready string is ASCII or not.
2011-12-12 01:24:20 +01:00
Victor Stinner
551ac95733
Py_UNICODE_HIGH_SURROGATE() and Py_UNICODE_LOW_SURROGATE() macros
...
And use surrogates macros everywhere in unicodeobject.c
2011-11-29 22:58:13 +01:00
Victor Stinner
f3ae6208c7
PyUnicode_GET_SIZE() checks that PyUnicode_AsUnicode() succeed
...
using an assertion
2011-11-21 02:24:49 +01:00
Victor Stinner
77faf69ca1
_PyUnicode_CheckConsistency() also checks maxchar maximum value,
...
not only its minimum value
2011-11-20 18:56:05 +01:00
Victor Stinner
9343999597
Fix PyUnicode_CopyCharacters() doc
2011-11-20 18:29:14 +01:00
Victor Stinner
7c8bbbbb0c
Ensure that Py_UCS4 is 32 bits and Py_UCS2 is 16 bits
2011-11-20 18:28:29 +01:00
Victor Stinner
6f9568bb1f
Fix misused of "PyUnicodeObject" structure name in unicodeobject.h
2011-11-17 00:12:44 +01:00
Martin v. Löwis
1db7c13be1
Port encoders from Py_UNICODE API to unicode object API.
2011-11-10 18:24:32 +01:00
Martin v. Löwis
d10759f6ed
Make _PyUnicode_FromId return borrowed references.
...
http://mail.python.org/pipermail/python-dev/2011-November/114347.html
2011-11-07 13:00:05 +01:00
Victor Stinner
e30c0a1014
Fix gdb/libpython.py for not ready Unicode strings
...
_PyUnicode_CheckConsistency() checks also hash and length value for not ready
Unicode strings.
2011-11-04 20:54:05 +01:00
Victor Stinner
7931d9a951
Replace PyUnicodeObject type by PyObject
...
* _PyUnicode_CheckConsistency() now takes a PyObject* instead of void*
* Remove now useless casts to PyObject*
2011-11-04 00:22:48 +01:00
Martin v. Löwis
23e275b3ad
Port UCS1 and charmap codecs to new API.
2011-11-02 18:02:51 +01:00
Martin v. Löwis
0d3072e98d
Drop Py_UCS4_ functions. Closes #13246 .
2011-10-31 08:40:56 +01:00
Victor Stinner
9db1a8b69f
Replace PyUnicodeObject* by PyObject* where it was irrevelant
...
A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or
PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to
PyUnicodeObject* is wrong
2011-10-23 20:04:37 +02:00
Victor Stinner
55c7e00fc0
Simplify _PyUnicode_COMPACT_DATA() macro
2011-10-18 23:32:53 +02:00
Victor Stinner
3a50e7056e
Issue #12281 : Rewrite the MBCS codec to handle correctly replace and ignore
...
error handlers on all Windows versions. The MBCS codec is now supporting all
error handlers, instead of only replace to encode and ignore to decode.
2011-10-18 21:21:00 +02:00
Martin v. Löwis
bd928fef42
Rename _Py_identifier to _Py_IDENTIFIER.
2011-10-14 10:20:37 +02:00
Victor Stinner
8813104e53
Simplify PyUnicode_MAX_CHAR_VALUE
...
Use PyUnicode_IS_ASCII instead of PyUnicode_IS_COMPACT_ASCII, so the following
test can be removed:
PyUnicode_DATA(op) == (((PyCompactUnicodeObject *)(op))->utf8)
2011-10-13 01:12:01 +02:00
Martin v. Löwis
87da872c69
Drop extra semicolon.
2011-10-09 11:54:42 +02:00
Martin v. Löwis
afe55bba33
Add API for static strings, primarily good for identifiers.
...
Thanks to Konrad Schöbel and Jasper Schulz for helping with the mass-editing.
2011-10-09 10:38:36 +02:00
Martin v. Löwis
c47adb04b3
Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE.
2011-10-07 20:55:35 +02:00
Georg Brandl
db6c7f5c33
Update C API docs for PEP 393.
2011-10-07 11:19:11 +02:00
Victor Stinner
b066cc6aba
Fix PyUnicode_CHARACTER_SIZE and PyUnicode_KIND_SIZE
2011-10-06 15:54:53 +02:00
Antoine Pitrou
dbf697ae5c
Fix compilation warnings under 64-bit Windows
2011-10-06 15:34:41 +02:00
Éric Araujo
0f4ee93b06
Branch merge
2011-10-06 13:22:21 +02:00
Victor Stinner
1d4b35f4e5
rephrase PyUnicode_1BYTE_KIND documentation
2011-10-06 01:51:19 +02:00
Victor Stinner
fb9ea8c57e
Don't check for the maximum character when copying from unicodeobject.c
...
* Create copy_characters() function which doesn't check for the maximum
character in release mode
* _PyUnicode_CheckConsistency() is no more static to be able to use it
in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
* _PyUnicode_CheckConsistency() checks the string hash
2011-10-06 01:45:57 +02:00
Éric Araujo
80a348c0a0
Fix typo
2011-10-05 01:11:12 +02:00
Victor Stinner
30134f53fc
Complete documentation of compact ASCII strings
2011-10-04 01:32:45 +02:00
Victor Stinner
a41463c203
Document utf8_length and wstr_length states
...
Ensure these states with assertions in _PyUnicode_CheckConsistency().
2011-10-04 01:05:08 +02:00
Victor Stinner
7f11ad4594
Unicode: document when the wstr pointer is shared with data
...
Add also related assertions to _PyUnicode_CheckConsistency().
2011-10-04 00:00:20 +02:00
Victor Stinner
8cfcbed4e3
Improve string forms and PyUnicode_Resize() documentation
...
Remove also the FIXME for resize_copy(): as discussed with Martin, copy the
string on resize if the string is not resizable is just fine.
2011-10-03 23:19:21 +02:00
Victor Stinner
c3cec7868b
Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII
...
ucs1, ucs2 and ucs4 libraries have to scan created substring to find the
maximum character, whereas it is not need to ASCII strings. Because ASCII
strings are common, it is useful to optimize ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner
4d0d54bcba
Document requierements of Unicode kinds
2011-10-05 01:31:05 +02:00
Georg Brandl
07de325672
More fixes.
2011-10-05 16:47:38 +02:00
Georg Brandl
c6bc4c6897
Fix a few typos in the unicode header.
2011-10-05 16:23:09 +02:00
Georg Brandl
4975a9b44d
Fix grammar.
2011-10-05 16:12:21 +02:00
Victor Stinner
b9275c104e
Speedup str[a:b] and PyUnicode_FromKindAndData
...
* str[a:b] doesn't scan the string for the maximum character if the string
is ascii only
* PyUnicode_FromKindAndData() stops if we are sure that we cannot use a
shorter character type. For example, _PyUnicode_FromUCS1() stops if we
have at least one character in range U+0080-U+00FF
2011-10-05 14:01:42 +02:00
Victor Stinner
85041a54bd
_PyUnicode_CheckConsistency() checks utf8 field consistency
2011-10-03 14:42:39 +02:00
Victor Stinner
a3b334da6d
PyUnicode_Ready() now sets ascii=1 if maxchar < 128
...
ascii=1 is no more reserved to PyASCIIObject. Use
PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before).
2011-10-03 13:53:37 +02:00
Victor Stinner
910337b42e
Add _PyUnicode_CheckConsistency() macro to help debugging
...
* Document Unicode string states
* Use _PyUnicode_CheckConsistency() to ensure that objects are always
consistent.
2011-10-03 03:20:16 +02:00
Victor Stinner
37943769ef
PyUnicode_READ_CHAR() ensures that the string is ready
2011-10-02 20:33:18 +02:00
Victor Stinner
7a48ff7e06
Use Py_UCS1 instead of unsigned char in unicodeobject.h
2011-10-02 00:55:25 +02:00
Victor Stinner
cd9950fd09
PyUnicode_WriteChar() raises IndexError on invalid index
...
PyUnicode_WriteChar() raises also a ValueError if the string has more than 1
reference.
2011-10-02 00:34:53 +02:00
Victor Stinner
9f789e7f63
_PyUnicode_AsKind() is *not* part of the stable ABI
2011-10-01 03:57:28 +02:00
Victor Stinner
4584a5ba1a
PyUnicode_CHARACTER_SIZE(): add a reference to PyUnicode_KIND_SIZE()
2011-10-01 02:39:37 +02:00