Commit Graph

772 Commits

Author SHA1 Message Date
Antoine Pitrou 31b92a534f Sanitize reference management in the utf-8 encoder 2011-11-12 18:35:19 +01:00
Antoine Pitrou 0290c7a811 Fix regression on 2-byte wchar_t systems (Windows) 2011-11-11 13:29:12 +01:00
Antoine Pitrou 44c6affc79 Avoid crashing because of an unaligned word access 2011-11-11 02:59:42 +01:00
Antoine Pitrou de20b0b50e Issue #13149: Speed up append-only StringIO objects.
This is very similar to the "lazy strings" idea.
2011-11-10 21:47:38 +01:00
Victor Stinner 9f4b1e9c50 Fix and deprecate the unicode_internal codec
unicode_internal codec uses Py_UNICODE instead of the real internal
representation (PEP 393: Py_UCS1, Py_UCS2 or Py_UCS4) for backward
compatibility.
2011-11-10 20:56:30 +01:00
Victor Stinner 24729f36bf Prefer Py_UCS4 or wchar_t over Py_UNICODE 2011-11-10 20:31:37 +01:00
Victor Stinner ebf3ba808e PyUnicode_DecodeCharmap() uses the new Unicode API 2011-11-10 20:30:22 +01:00
Victor Stinner a98b28c1bf Avoid PyUnicode_AS_UNICODE in the UTF-8 encoder 2011-11-10 20:21:49 +01:00
Victor Stinner 3326cb6a36 Fix "unicode_escape" encoder 2011-11-10 20:15:25 +01:00
Victor Stinner 0e36826a04 Fix UTF-7 encoder on Windows 2011-11-10 20:12:49 +01:00
Martin v. Löwis 1db7c13be1 Port encoders from Py_UNICODE API to unicode object API. 2011-11-10 18:24:32 +01:00
Victor Stinner 62aa4d086a Strip trailing spaces 2011-11-09 00:03:45 +01:00
Victor Stinner 0a045efb49 Fix a compiler warning: use unsigned for maxchar in unicode_widen() 2011-11-09 00:02:42 +01:00
Victor Stinner 596a6c4ffc Fix the code page decoder
 * unicode_decode_call_errorhandler() now supports the PyUnicode_WCHAR_KIND
   kind
 * unicode_decode_call_errorhandler() calls copy_characters() instead of
   PyUnicode_CopyCharacters()
2011-11-09 00:02:18 +01:00
Antoine Pitrou a8f63c02ef Fix missing goto 2011-11-08 18:37:16 +01:00
Martin v. Löwis d10759f6ed Make _PyUnicode_FromId return borrowed references.
http://mail.python.org/pipermail/python-dev/2011-November/114347.html
2011-11-07 13:00:05 +01:00
Martin v. Löwis e9b11c1cd8 Change decoders to use Unicode API instead of Py_UNICODE. 2011-11-08 17:35:34 +01:00
Victor Stinner e30c0a1014 Fix gdb/libpython.py for not ready Unicode strings
_PyUnicode_CheckConsistency() checks also hash and length value for not ready
Unicode strings.
2011-11-04 20:54:05 +01:00
Victor Stinner 2fc507fe45 Replace tabs by spaces 2011-11-04 20:06:39 +01:00
Martin v. Löwis 12be46ca84 Drop Py_UNICODE based encode exceptions. 2011-11-04 19:04:15 +01:00
Martin v. Löwis 3d325191bf Port code page codec to Unicode API. 2011-11-04 18:23:06 +01:00
Victor Stinner fcd9653667 Fix a compiler warning in unicode_encode_ucs1() 2011-11-04 00:28:50 +01:00
Victor Stinner fc026c98d8 Fix PyUnicode_EncodeCharmap() 2011-11-04 00:24:51 +01:00
Victor Stinner 7931d9a951 Replace PyUnicodeObject type by PyObject
 * _PyUnicode_CheckConsistency() now takes a PyObject* instead of void*
 * Remove now useless casts to PyObject*
2011-11-04 00:22:48 +01:00
Victor Stinner 76a31a6bff Cleanup decode_code_page_stateful() and encode_code_page()
 * Fix decode_code_page_errors() result
 * Inline decode_code_page() and encode_code_page_chunk()
 * Replace the PyUnicodeObject type by PyObject
2011-11-04 00:05:13 +01:00
Victor Stinner 7581cef699 Adapt the code page encoder to the new unicode_encode_call_errorhandler()
The code is not correct, but at least it doesn't crash anymore.
2011-11-03 22:32:33 +01:00
Brian Curtin 2787ea41fd Fix a compile error (apparently Windows only) introduced in 295fdfd4f422 2011-11-02 15:09:37 -05:00
Martin v. Löwis 23e275b3ad Port UCS1 and charmap codecs to new API. 2011-11-02 18:02:51 +01:00
Martin v. Löwis 9e8166843c Introduce PyObject* API for raising encode errors. 2011-11-02 12:45:42 +01:00
Martin v. Löwis 0d3072e98d Drop Py_UCS4_ functions. Closes #13246. 2011-10-31 08:40:56 +01:00
Victor Stinner 57ffa9d4ff PyUnicode_AsUnicodeCopy() uses PyUnicode_AsUnicodeAndSize() to get directly the length 2011-10-23 20:10:08 +02:00
Victor Stinner af9e4b8c29 Fix PyUnicode_InternImmortal(): PyUnicode_InternInPlace() may change *p 2011-10-23 20:07:00 +02:00
Victor Stinner 9faa384bed Cast directly to unsigned char, instead of using Py_CHARMASK
We don't need "& 0xff" on an unsigned char.
2011-10-23 20:06:00 +02:00
Victor Stinner 9db1a8b69f Replace PyUnicodeObject* by PyObject* where it was irrelevant
A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or
PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to
PyUnicodeObject* is wrong
2011-10-23 20:04:37 +02:00
Victor Stinner 0d60e87ad6 Fix data variable in _PyUnicode_Dump() for compact ASCII 2011-10-23 19:47:19 +02:00
Victor Stinner d8e61c348e Remove last references to the removed Unicode free list 2011-10-23 19:43:33 +02:00
Victor Stinner 065836ec9c PyUnicode_FSDecoder() ensures that the decoded string is ready 2011-10-27 01:56:33 +02:00
Victor Stinner dd18d3ad9e Fix unicode_subtype_new() on debug build
Patch written by Stefan Behnel.
2011-10-22 11:08:10 +02:00
Ezio Melotti f881751ded Remove unused variable. 2011-10-22 01:01:32 +03:00
Ezio Melotti 931b8aac80 #12753: Add support for Unicode name aliases and named sequences. 2011-10-21 21:57:36 +03:00
Victor Stinner 6707293e75 Add consistency check to _PyUnicode_New() 2011-10-18 22:10:14 +02:00
Victor Stinner 3a50e7056e Issue #12281: Rewrite the MBCS codec to handle correctly replace and ignore
error handlers on all Windows versions. The MBCS codec is now supporting all
error handlers, instead of only replace to encode and ignore to decode.
2011-10-18 21:21:00 +02:00
Benjamin Peterson 7a6debe79c remove some duplication 2011-10-15 09:25:28 -04:00
Victor Stinner f5cff56a1b Issue #13088: Add shared Py_hexdigits constant to format a number into base 16 2011-10-14 02:13:11 +02:00
Antoine Pitrou f0b934b01a Reuse the stringlib in findchar(), and make its signature more convenient 2011-10-13 18:55:09 +02:00
Victor Stinner 55c991197b Optimize unicode_subscript() for step != 1 and ascii strings 2011-10-13 01:17:06 +02:00
Victor Stinner 127226ba69 Don't use PyUnicode_MAX_CHAR_VALUE() macro in Py_MAX() 2011-10-13 01:12:34 +02:00
Victor Stinner 9e7a1bcfd6 Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr 2011-10-13 00:18:12 +02:00
Antoine Pitrou dd4e2f0153 Issue #13155: Optimize finding the optimal character width of an unicode string 2011-10-13 00:02:27 +02:00
Victor Stinner 49a0a21f37 Unicode replace() avoids calling unicode_adjust_maxchar() when it's useless
Add also a special case if the result is an empty string.
2011-10-12 23:46:10 +02:00