Nick Coghlan
8b097b4ed7
Close #17828 : better handling of codec errors
...
- output type errors now redirect users to the type-neutral
convenience functions in the codecs module
- stateless errors that occur during encoding and decoding
will now be automatically wrapped in exceptions that give
the name of the codec involved
2013-11-13 23:49:21 +10:00
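As a rough illustration of the first bullet above, the sketch below uses the type-neutral `codecs.encode()` and `codecs.decode()` functions for codecs that do not map text to bytes. The helper name `try_str_encode` is hypothetical, and the exact exception type raised by `str.encode()` is assumed to vary between Python versions, so both candidates are caught.

```python
import codecs


def try_str_encode(text, codec_name):
    """Return the name of the exception str.encode() raises, or None.

    Hypothetical helper for illustration only. The exception type is
    assumed to depend on the Python version, so both are accepted.
    """
    try:
        text.encode(codec_name)
    except (LookupError, TypeError) as exc:
        return type(exc).__name__
    return None


# str -> str codec: works through the type-neutral function.
rot13_text = codecs.encode("hello", "rot_13")

# bytes -> bytes codec: also works through the type-neutral functions.
hex_bytes = codecs.encode(b"hi", "hex")
round_trip = codecs.decode(hex_bytes, "hex")

# The str method refuses a codec that does not produce bytes.
rejected = try_str_encode("hello", "rot_13")
```
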
Victor Stinner
66b3270975
_Py_normalize_encoding(): explain how the value 6 was computed
2013-11-07 23:12:23 +01:00
Victor Stinner
df23e30bea
Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8"
...
if the input string is NULL
2013-11-07 13:33:36 +01:00
Victor Stinner
ad14ccd047
Issue #19512 : add _PyUnicode_CompareWithId() function
...
_PyUnicode_CompareWithId() is faster than PyUnicode_CompareWithASCIIString()
when both strings are equal and interned.
Also add the _PyId_builtins identifier for the common string "builtins".
2013-11-07 00:46:04 +01:00
Victor Stinner
21ea21ef6d
Issue #19424 : PyUnicode_CompareWithASCIIString() normalizes memcmp() result
...
to -1, 0, 1
2013-11-04 11:28:26 +01:00
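The normalized result can be observed from Python by calling the C function through `ctypes`. This is only a sketch: it assumes a CPython interpreter where `ctypes.pythonapi` is available, and the wrapper name `compare_with_ascii` is made up for the example.

```python
import ctypes

# Bind the C API function; the argument and return types are declared
# explicitly so ctypes converts the Python objects correctly.
_compare = ctypes.pythonapi.PyUnicode_CompareWithASCIIString
_compare.argtypes = [ctypes.py_object, ctypes.c_char_p]
_compare.restype = ctypes.c_int


def compare_with_ascii(text, ascii_bytes):
    """Compare a str with a NUL-terminated ASCII byte string (hypothetical wrapper)."""
    return _compare(text, ascii_bytes)


less = compare_with_ascii("a", b"z")        # far apart, still reported as -1
equal = compare_with_ascii("abc", b"abc")
greater = compare_with_ascii("z", b"a")
# A str with an embedded null character is longer than the C string "a".
embedded_null = compare_with_ascii("a\0b", b"a")
```
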
Victor Stinner
f0c7b2af05
Issue #16286 : remove duplicated identity check from unicode_compare()
...
Move the test to PyUnicode_Compare()
2013-11-04 11:27:14 +01:00
Victor Stinner
fd9e44db37
Issue #16286 : optimize PyUnicode_RichCompare() for identical strings (same
...
pointer) for any operator, not only Py_EQ and Py_NE.
Code of bytes_richcompare() and PyUnicode_RichCompare() is now closer.
2013-11-04 11:23:05 +01:00
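The identity fast path described above can be sketched in Python as follows. This is a model of the idea, not the C implementation: the function `rich_compare` and the set `_TRUE_FOR_IDENTICAL` are names invented for the example.

```python
import operator

# Operators whose result is True when both operands are the same object.
_TRUE_FOR_IDENTICAL = {operator.eq, operator.le, operator.ge}


def rich_compare(left, right, op):
    """Compare two strings, short-circuiting when they are the same object."""
    if left is right:
        # Same pointer: the answer depends only on the operator,
        # so no character data needs to be read.
        return op in _TRUE_FOR_IDENTICAL
    return op(left, right)


s = "abc"
same_le = rich_compare(s, s, operator.le)
same_lt = rich_compare(s, s, operator.lt)
same_ne = rich_compare(s, s, operator.ne)
different_lt = rich_compare("a", "b", operator.lt)
```
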
Victor Stinner
c8bc5377ac
Issue #16286 : write a new subfunction bytes_compare_eq()
...
* cleanup bytes_richcompare()
* PyUnicode_RichCompare(): replace a test with a XOR
2013-11-04 11:08:10 +01:00
Victor Stinner
e1b1592fd4
Issue #19424 : Fix a compiler warning on comparing signed/unsigned size_t
...
Patch written by Zachary Ware.
2013-11-03 13:53:12 +01:00
Victor Stinner
a6b9b071a3
Issue #19424 : Fix a compiler warning
...
memcmp() just takes raw pointers
2013-10-30 18:27:13 +01:00
Victor Stinner
602f7cf0b9
Issue #19424 : Optimize PyUnicode_CompareWithASCIIString()
...
Use fast memcmp() instead of a loop using the slow PyUnicode_READ() macro.
strlen() is still necessary to handle Unicode strings containing null characters.
2013-10-29 23:31:50 +01:00
Victor Stinner
68b674c9d4
Issue #19437 : Fix _PyUnicode_New() (constructor of legacy string), set all
...
attributes before checking for error. The destructor expects all attributes to
be set. It is now safe to call Py_DECREF(unicode) in the constructor.
2013-10-29 19:31:43 +01:00
Victor Stinner
fa3ba4c3bc
Issue #18609 : Add a fast-path for "iso8859-1" encoding
...
On AIX, the locale encoding may be "iso8859-1", which was not a known syntax of
the legacy ISO 8859-1 encoding.
Using a C codec instead of a Python codec is faster but also avoids tricky
issues during Python startup or complex code.
2013-10-29 11:34:05 +01:00
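From Python, the "iso8859-1" spelling behaves as another name for Latin-1. The short check below only shows that the name is accepted and round-trips; it does not demonstrate which code path (C or Python codec) the interpreter takes.

```python
# Encode with the AIX-style spelling and decode with the usual alias.
encoded = "café".encode("iso8859-1")
decoded = encoded.decode("latin-1")
```
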
Victor Stinner
a5afb58986
Issue #18408 : Fix PyUnicode_AsUTF8AndSize(), raise MemoryError exception on
...
memory allocation failure
2013-10-29 01:28:23 +01:00
Serhiy Storchaka
c679227e31
Issue #1772673 : The type of `char*` arguments has now changed to `const char*`.
2013-10-19 21:03:34 +03:00
Serhiy Storchaka
55e092f545
Issue #19279 : UTF-7 decoder no longer produces illegal strings.
2013-10-19 20:39:28 +03:00
Serhiy Storchaka
35804e4c63
Issue #19279 : UTF-7 decoder no longer produces illegal strings.
2013-10-19 20:38:19 +03:00
Larry Hastings
3182680210
Issue #16612 : Add "Argument Clinic", a compile-time preprocessor
...
for C files to generate argument parsing code. (See PEP 436.)
2013-10-19 00:09:25 -07:00
Ethan Furman
fb13721b1b
Close #18780 : %-formatting now prints value for int subclasses with %d, %i, and %u codes.
2013-08-31 10:18:55 -07:00
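A small example of the behaviour described above, assuming a Python version that includes this change. The class `Verbose` is hypothetical: it overrides `__str__` and `__repr__`, yet the numeric format codes still print the integer value.

```python
class Verbose(int):
    """Hypothetical int subclass with a non-numeric string form."""

    def __str__(self):
        return "verbose"

    __repr__ = __str__


value = Verbose(7)
# Numeric codes use the integer value, not __str__ or __repr__.
numeric = "%d|%i|%u" % (value, value, value)
# %s still goes through __str__.
textual = "%s" % value
```
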
Antoine Pitrou
9ed5f27266
Issue #18722 : Remove uses of the "register" keyword in C code.
2013-08-13 20:18:52 +02:00
Raymond Hettinger
e56666d17f
Silence compiler warning about an uninitialized variable
2013-08-04 11:51:03 -07:00
Raymond Hettinger
5ed1b38a7d
merge
2013-08-04 11:51:35 -07:00
Christian Heimes
b578735dff
Check return value of PyType_Ready(&EncodingMapType)
...
CID 486654
2013-07-20 14:57:28 +02:00
Christian Heimes
26532f7519
Check return value of PyType_Ready(&EncodingMapType)
...
CID 486654
2013-07-20 14:57:16 +02:00
Victor Stinner
e699e5a218
Issue #18408 : Don't check unicode consistency in _PyUnicode_HAS_UTF8_MEMORY()
...
and _PyUnicode_HAS_WSTR_MEMORY() macros
These macros are called in unicode_dealloc(), whereas the unicode object can be
"inconsistent" if the creation of the object failed.
For example, when unicode_subtype_new() fails on a memory allocation,
_PyUnicode_CheckConsistency() fails with an assertion error because data is
NULL.
2013-07-15 18:22:47 +02:00
Victor Stinner
9e6b4d715c
Issue #18408 : _PyUnicodeWriter_Finish() now clears its buffer attribute in all
...
cases, so _PyUnicodeWriter_Dealloc() can be called after finish.
2013-07-09 00:37:24 +02:00
Victor Stinner
15a0bd3965
Issue #18408 : Fix _PyUnicodeWriter_Finish(): clear writer->buffer,
...
so _PyUnicodeWriter_Dealloc() can be called on the writer after finish.
2013-07-08 22:29:55 +02:00
Victor Stinner
6f8eeee7b9
Issue #18203 : Fix _Py_DecodeUTF8_surrogateescape(), use PyMem_RawMalloc() as _Py_char2wchar()
2013-07-07 22:57:45 +02:00
Victor Stinner
1a7425f67a
Issue #18203 : Replace malloc() with PyMem_RawMalloc() at Python initialization
...
* Replace malloc() with PyMem_RawMalloc()
* Replace PyMem_Malloc() with PyMem_RawMalloc() where the GIL is not held.
* _Py_char2wchar() now returns a buffer allocated by PyMem_RawMalloc(), instead
of PyMem_Malloc()
2013-07-07 16:25:15 +02:00
Christian Heimes
d47802eef7
Fix ref leak in error case of unicode find, count, formatlong
...
CID 983315: Resource leak (RESOURCE_LEAK)
CID 983316: Resource leak (RESOURCE_LEAK)
CID 983317: Resource leak (RESOURCE_LEAK)
2013-06-29 21:33:36 +02:00
Christian Heimes
d47a0456b1
Fix ref leak in error case of unicode index
...
CID 983319 (#1 of 2): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable substring going out of scope leaks the storage it points to.
2013-06-29 21:21:37 +02:00
Christian Heimes
ea71a525c3
Fix ref leak in error case of unicode rindex and rfind
...
CID 983320: Resource leak (RESOURCE_LEAK)
CID 983321: Resource leak (RESOURCE_LEAK)
leaked_storage: Variable substring going out of scope leaks the storage it points to.
2013-06-29 21:17:34 +02:00
Christian Heimes
305e49e17e
Fix memory leak in endswith
...
CID 1040368 (#1 of 1): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable substring going out of scope leaks the storage it points to.
2013-06-29 20:41:06 +02:00
Serhiy Storchaka
c89533f72f
Issue #18184 : PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise
...
OverflowError when an argument of %c format is out of range.
2013-06-23 20:21:16 +03:00
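PyUnicode_FromFormat() is a C function, so the sketch below uses Python's own `%c` formatting as an analogy only: it applies a similar range check and raises OverflowError for values outside the Unicode range. The helper `format_char` is a made-up name.

```python
def format_char(code_point):
    """Format an integer with %c, returning None if it is out of range."""
    try:
        return "%c" % code_point
    except OverflowError:
        # Raised for values outside 0 .. 0x10FFFF.
        return None


in_range = format_char(0x41)
highest = format_char(0x10FFFF)
too_large = format_char(0x110000)
```
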
Serhiy Storchaka
8eeae2126c
Issue #18184 : PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise
...
OverflowError when an argument of %c format is out of range.
2013-06-23 20:12:14 +03:00
Benjamin Peterson
3164f5d565
merge 3.3 ( #18183 )
2013-06-10 09:24:01 -07:00
Benjamin Peterson
7e30373126
remove MAX_MAXCHAR because it's unsafe for computing maximum code point value (see #18183 )
2013-06-10 09:19:46 -07:00
Victor Stinner
9f067f490f
Issue #9566 : Fix compiler warning on Windows 64-bit
2013-06-05 00:21:31 +02:00
Antoine Pitrou
7ce35a1816
Issue #17237 : Fix crash in the ASCII decoder on m68k.
2013-05-11 15:59:37 +02:00
Antoine Pitrou
8b0e98426d
Issue #17237 : Fix crash in the ASCII decoder on m68k.
2013-05-11 15:58:34 +02:00
Victor Stinner
f4f24248dc
Fix uninitialized value in charmap_decode_mapping()
2013-05-07 01:01:31 +02:00
Victor Stinner
8cecc8c262
Issue #7330 : Implement width and precision (ex: "%5.3s") for the format string
...
of PyUnicode_FromFormat() function, original patch written by Ysj Ray.
2013-05-06 23:11:54 +02:00
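The meaning of a width and precision such as "%5.3s" can be illustrated with Python's `%` operator, which follows the same printf-style convention. This is an analogy, not a call to PyUnicode_FromFormat(): the precision truncates the string to 3 characters and the width pads the result to 5.

```python
# Precision 3 keeps "abc"; width 5 right-aligns it with spaces.
padded = "%5.3s" % "abcdef"
# A leading "-" left-aligns within the same width.
left_aligned = "%-5.3s|" % "abcdef"
# A string shorter than the precision is only padded.
short = "%5.3s" % "ab"
```
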
Victor Stinner
bb4503f61e
Partial revert of changeset 9744b2df134c
...
PyUnicode_Append() cannot call resize_compact() directly: I forgot that a
string can be ready *and* not compact (a legacy string can also be ready).
2013-04-18 09:41:34 +02:00
Victor Stinner
fb161b1b6d
Split PyUnicode_DecodeCharmap() into subfunction for readability
2013-04-18 01:44:27 +02:00
Victor Stinner
170ca6f84b
Fix bug in Unicode decoders related to _PyUnicodeWriter
...
Bug introduced by changesets 7ed9993d53b4 and edf029fc9591.
2013-04-18 00:25:28 +02:00
Victor Stinner
376cfa122d
Fix typo in unicode_decode_call_errorhandler_writer()
...
Bug introduced by changeset 7ed9993d53b4.
2013-04-17 23:58:16 +02:00
Victor Stinner
8f674ccd64
Close #17694 : Add minimum length to _PyUnicodeWriter
...
* Also add a min_char attribute to the _PyUnicodeWriter structure (currently unused)
* _PyUnicodeWriter_Init() no longer takes any argument except the writer itself:
min_length and overallocate must be set explicitly
* In error handlers, only enable overallocation if the replacement string
is longer than 1 character
* CJK decoders don't use overallocation anymore
* Set min_length, instead of preallocating memory using
_PyUnicodeWriter_Prepare(), in many decoders
* _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
2013-04-17 23:02:17 +02:00
Victor Stinner
77282cb4f8
Cleanup PyUnicode_Contains()
...
* No need to double-check that strings are ready: test already done by
PyUnicode_FromObject()
* Remove useless kind variable (use kind1 instead)
2013-04-14 19:22:47 +02:00
Victor Stinner
d92e078c8d
Minor change: fix character in do_strip() for the ASCII case
2013-04-14 19:17:42 +02:00
Victor Stinner
f033510fee
Cleanup PyUnicode_Append()
...
* Also check that right is a Unicode object
* Call resize_compact() directly instead of unicode_resize() for more
explicit error handling, and to avoid testing some properties twice
(e.g. unicode_modifiable())
2013-04-14 19:13:03 +02:00