Victor Stinner
3222da26fe
Make _PyUnicode_TranslateCharmap() symbol private
...
unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().
2015-10-01 22:07:32 +02:00
Victor Stinner
01ada3996b
Issue #25267 : The UTF-8 encoder is now up to 75 times as fast for error
...
handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
Patch co-written with Serhiy Storchaka.
2015-10-01 21:54:51 +02:00
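
As a rough illustration of the four error handlers named in this commit, the Python sketch below encodes a string containing a lone surrogate to UTF-8. The sample string is invented for the example, and the sketch shows behavior only, not the reported speedup.

```python
# A lone surrogate (U+DCE9) is rejected by the strict UTF-8 encoder,
# so each call below relies on an error handler instead.
text = "caf\udce9"

ignored = text.encode("utf-8", "ignore")           # drops the surrogate
replaced = text.encode("utf-8", "replace")         # substitutes "?"
escaped = text.encode("utf-8", "surrogateescape")  # restores the raw byte 0xE9
passed = text.encode("utf-8", "surrogatepass")     # encodes the surrogate itself

print(ignored, replaced, escaped, passed)
```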
Victor Stinner
c3713e9706
Optimize ascii/latin1+surrogateescape encoders
...
Issue #25227 : Optimize ASCII and latin1 encoders with the ``surrogateescape``
error handler: the encoders are now up to 3 times as fast.
Initial patch written by Serhiy Storchaka.
2015-09-29 12:32:13 +02:00
Victor Stinner
0030cd52da
Issue #25227 : Cleanup unicode_encode_ucs1() error handler
...
* Change limit type from unsigned int to Py_UCS4, to use the same type as the
"ch" variable (a Unicode character).
* Reuse ch variable for _Py_ERROR_XMLCHARREFREPLACE
* Add some newlines for readability
2015-09-24 14:45:00 +02:00
Victor Stinner
54385b206d
Issue #24870 : revert unwanted change
...
Sorry, I pushed the patch on the UTF-8 decoder by mistake :-(
2015-09-22 10:46:52 +02:00
Victor Stinner
5ebae87628
Issue #25207 , #14626 : Fix my commit.
...
It doesn't work to use "#define XXX defined(YYY)" and then "#ifdef XXX"
to check YYY.
2015-09-22 01:29:33 +02:00
Victor Stinner
6174474bea
_PyUnicodeWriter_PrepareInternal(): make the assertion more strict
2015-09-22 01:01:17 +02:00
Victor Stinner
ca9381ea01
Issue #24870 : Add _PyUnicodeWriter_PrepareKind() macro
...
Add a macro which ensures that the writer has at least the requested kind.
2015-09-22 00:58:32 +02:00
Victor Stinner
5014920cb7
Issue #24870 : Reuse the new _Py_error_handler enum
...
Factorize code with the new get_error_handler() function.
Add some empty lines for readability.
2015-09-22 00:26:54 +02:00
Victor Stinner
f96418de05
Issue #24870 : Optimize the ASCII decoder for error handlers: surrogateescape,
...
ignore and replace. Initial patch written by Naoki Inada.
The decoder is now up to 60 times as fast for these error handlers.
Also add unit tests for the ASCII decoder.
2015-09-21 23:06:27 +02:00
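
A minimal Python sketch of what the three optimized handlers do when the ASCII decoder meets a non-ASCII byte. The input bytes are made up for the example; only behavior is shown, not timing.

```python
# 0xFF is not valid ASCII, so strict decoding would raise UnicodeDecodeError.
data = b"abc\xff"

ignored = data.decode("ascii", "ignore")           # the bad byte is dropped
replaced = data.decode("ascii", "replace")         # U+FFFD takes its place
escaped = data.decode("ascii", "surrogateescape")  # mapped to the surrogate U+DCFF

# surrogateescape is reversible: encoding with the same handler restores the bytes.
roundtrip = escaped.encode("ascii", "surrogateescape")

print(repr(ignored), repr(replaced), repr(escaped), roundtrip)
```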
Zachary Ware
070bd62cfa
Closes #21279 : Merge with 3.5
2015-08-06 00:05:13 -05:00
Zachary Ware
d987a81d29
Issue #21279 : Merge with 3.4
2015-08-06 00:04:23 -05:00
Zachary Ware
79b98df023
Issue #21279 : Flesh out str.translate docs
...
Initial patch by Kinga Farkas, Martin Panter, and John Posner.
2015-08-05 23:54:15 -05:00
Raymond Hettinger
ac2ef65c32
Make the unicode equality test an external function rather than in-lining it.
...
The real benefit of the unicode specialized function comes from
bypassing the overhead of PyObject_RichCompareBool() and not
from being in-lined (especially since there was almost no shared
data between the caller and callee). Also, the in-lining was
having a negative effect on code generation for the callee.
2015-07-04 16:04:44 -07:00
Serhiy Storchaka
d4ea03c785
Issue #24284 : The startswith and endswith methods of the str class no longer
...
return True when finding the empty string and the indexes are completely out
of range.
2015-05-31 09:15:51 +03:00
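
The change can be sketched in Python as follows, assuming an interpreter that includes this fix. The sample string is arbitrary.

```python
s = "abc"

# A start index equal to len(s) is still in range: the empty string matches there.
at_end = s.startswith("", 3)

# A start index past the end is completely out of range: no match is reported.
past_end_start = s.startswith("", 4)
past_end_end = s.endswith("", 4)

print(at_end, past_end_start, past_end_end)
```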
Antoine Pitrou
873e0df946
Fix some compilation warnings when using gcc (-Wmaybe-uninitialized).
2015-05-19 21:06:04 +02:00
Antoine Pitrou
f6d1f1fa8a
Fix some compilation warnings when using gcc (-Wmaybe-uninitialized).
2015-05-19 21:04:33 +02:00
Serhiy Storchaka
0d4df752ac
Issue #15027 : The UTF-32 encoder is now 3x to 7x faster.
2015-05-12 23:12:45 +03:00
Serhiy Storchaka
7e9d1d1a1b
Issue #23908 : os functions now reject paths with embedded null character
...
on Windows instead of silently truncating them.
Removed the no longer used _PyUnicode_HasNULChars().
2015-04-20 10:12:28 +03:00
Serhiy Storchaka
1009bf18b3
Issue #23501 : Argument Clinic now generates code into separate files by default.
2015-04-03 23:53:51 +03:00
Victor Stinner
1912b39def
_PyUnicodeWriter_WriteStr() now checks that the input string is consistent
...
in debug mode to detect bugs earlier.
_PyUnicodeWriter_Finish() doesn't check if the read only string is consistent,
whereas it does check consistency for strings built by itself.
2015-03-26 09:37:23 +01:00
Serhiy Storchaka
d9d769fcdd
Issue #23573 : Increased performance of string search operations (str.find,
...
str.index, str.count, the in operator, str.split, str.partition) with
arguments of different kinds (UCS1, UCS2, UCS4).
2015-03-24 21:55:47 +02:00
Victor Stinner
f50e187724
Fix compiler warnings: comparison between signed and unsigned numbers
2015-03-20 11:32:24 +01:00
Victor Stinner
0c39b1b970
Initialize variables to prevent GCC warnings
2015-03-18 15:02:06 +01:00
Steve Dower
3e96f324dc
Issue #23451 : Update pyconfig.h for Windows to require Vista headers and remove unnecessary version checks.
2015-03-02 08:01:10 -08:00
Serhiy Storchaka
78a8249127
Issue #23490 : Fixed possible crashes related to interoperability between
...
old-style and new API for strings with 2**30-1 characters.
2015-02-20 21:34:39 +02:00
Serhiy Storchaka
e55181f517
Issue #23490 : Fixed possible crashes related to interoperability between
...
old-style and new API for strings with 2**30-1 characters.
2015-02-20 21:34:06 +02:00
Serhiy Storchaka
4d0d982985
Issue #23446 : Use PyMem_New instead of PyMem_Malloc to avoid possible integer
...
overflows. Added a few missing PyErr_NoMemory() calls.
2015-02-16 13:33:32 +02:00
Serhiy Storchaka
1a1ff29659
Issue #23446 : Use PyMem_New instead of PyMem_Malloc to avoid possible integer
...
overflows. Added a few missing PyErr_NoMemory() calls.
2015-02-16 13:28:22 +02:00
Victor Stinner
29dacf2e97
Issue #15859 : PyUnicode_EncodeFSDefault(), PyUnicode_EncodeMBCS() and
...
PyUnicode_EncodeCodePage() now raise an exception if the object is not a
Unicode object. For PyUnicode_EncodeFSDefault(), it was already the case on
platforms other than Windows. Patch written by Campbell Barton.
2015-01-26 16:41:32 +01:00
Serhiy Storchaka
bbd3aa8ece
Issue #23321 : Fixed a crash in str.decode() when error handler returned
...
replacement string longer than malformed input data.
2015-01-26 01:24:31 +02:00
Serhiy Storchaka
7e4b9057b3
Issue #23321 : Fixed a crash in str.decode() when error handler returned
...
replacement string longer than malformed input data.
2015-01-26 01:22:54 +02:00
Ethan Furman
b95b56150f
Issue #20284: Implement PEP 461
2015-01-23 20:05:18 -08:00
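
PEP 461 adds %-formatting to bytes and bytearray. A short sketch, assuming Python 3.5 or later; the values are invented for the example.

```python
# %-interpolation on bytes: %s takes a bytes-like object, %d and %x take ints.
line = b"%s=%d" % (b"x", 5)

# The same operator works on bytearray and returns a bytearray.
hexed = bytearray(b"%x") % (255,)

print(line, hexed)
```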
Serhiy Storchaka
82e07b92b3
Issue #23181 : More "codepoint" -> "code point".
2015-01-18 11:33:31 +02:00
Serhiy Storchaka
d3faf43f9b
Issue #23181 : More "codepoint" -> "code point".
2015-01-18 11:28:37 +02:00
Serhiy Storchaka
b757c83ec6
Issue #22581 : Use more "bytes-like object" throughout the docs and comments.
2014-12-05 22:25:22 +02:00
Serhiy Storchaka
133b11b566
Issue #22975 : Close block at right place.
2014-12-01 18:56:28 +02:00
Serhiy Storchaka
92bf919ed0
Issue #22581 : Use more "bytes-like object" throughout the docs and comments.
2014-12-05 22:26:10 +02:00
Serhiy Storchaka
407249c62b
Issue #22975 : Close block at right place.
2014-12-01 18:56:54 +02:00
Victor Stinner
3aa979e0cd
Issue #20948 : Inline makefmt() in unicode_fromformat_arg()
2014-11-18 21:40:51 +01:00
Antoine Pitrou
4e334241b7
Fixed signed/unsigned comparison warning
2014-10-15 23:14:53 +02:00
Benjamin Peterson
736982d36d
merge 3.4 ( closes #22643 )
2014-10-15 12:17:47 -04:00
Benjamin Peterson
9c422f3c3d
merge 3.3
2014-10-15 12:17:33 -04:00
Benjamin Peterson
1e211ff10d
it suffices to check for PY_SSIZE_T_MAX overflow ( #22643 )
2014-10-15 12:17:21 -04:00
Benjamin Peterson
315aa40403
Merge 3.4
2014-10-15 11:51:17 -04:00
Benjamin Peterson
60d7a73194
Merge 3.3
2014-10-15 11:51:12 -04:00
Benjamin Peterson
c0e64f5027
make sure length is unsigned
2014-10-15 11:51:05 -04:00
Benjamin Peterson
6925264334
merge 3.4 ( #22643 )
2014-10-15 11:49:15 -04:00
Benjamin Peterson
1cbb3fe775
merge 3.3 ( #22643 )
2014-10-15 11:48:41 -04:00
Benjamin Peterson
e1bd38c03c
fix integer overflow in unicode case operations ( closes #22643 )
2014-10-15 11:47:36 -04:00
Gregory P. Smith
8486f9b134
Fix "warning: comparison between signed and unsigned integer expressions"
...
-Wsign-compare warnings in unicodeobject.c. These were all a result
of sizeof() being unsigned and being compared to a Py_ssize_t.
Not actual problems.
2014-09-30 00:33:24 -07:00
Benjamin Peterson
fd97a6fb2d
merge 3.4 ( #22520 )
2014-09-29 23:02:56 -04:00
Benjamin Peterson
43030ee780
merge 3.3 ( #22520 )
2014-09-29 23:02:35 -04:00
Benjamin Peterson
736b8012b4
prevent overflow in unicode_repr ( closes #22520 )
2014-09-29 23:02:15 -04:00
Benjamin Peterson
10e4b2545e
merge 3.4 ( closes #22518 )
2014-09-29 18:53:58 -04:00
Benjamin Peterson
2b76ce6d27
merge 3.3 ( closes #22518 )
2014-09-29 18:50:06 -04:00
Benjamin Peterson
a1c1be4e03
cleanup overflowing handling in unicode_decode_call_errorhandler and unicode_encode_ucs1 ( closes #22518 )
2014-09-29 18:18:57 -04:00
Serhiy Storchaka
20b39b27d9
Removed redundant casts to `char *`.
...
Corresponding functions now accept `const char *` (issue #1772673 ).
2014-09-28 11:27:24 +03:00
Benjamin Peterson
fa5021699a
Merge 3.3
2014-10-15 23:58:32 -04:00
Antoine Pitrou
b6dc9b7554
Fixed signed/unsigned comparison warning
2014-10-15 23:14:53 +02:00
Serhiy Storchaka
d8a1447c99
Issue #22215 : Now ValueError is raised instead of TypeError when str or bytes
...
argument contains a null character or byte that is not permitted.
2014-09-06 20:07:17 +03:00
Victor Stinner
12174a5dca
Issue #22156 : Fix "comparison between signed and unsigned integers" compiler
...
warnings in the Objects/ subdirectory.
PyType_FromSpecWithBases() and PyType_FromSpec() now reject explicitly negative
slot identifiers.
2014-08-15 23:17:38 +02:00
Victor Stinner
f6a271ae98
Issue #18395 : Rename ``_Py_char2wchar()`` to :c:func:`Py_DecodeLocale`, rename
...
``_Py_wchar2char()`` to :c:func:`Py_EncodeLocale`, and document these
functions.
2014-08-01 12:28:48 +02:00
Victor Stinner
e1f17c6c0b
unicodeobject.c: fix a compiler warning on Windows 64 bits
2014-07-25 14:03:03 +02:00
Victor Stinner
c68b7fba86
(Merge 3.4) Issue #21892 , #21893 : Partial revert of changeset 4f55e802baf0,
...
PyErr_Format() uses "%zd" for Py_ssize_t, not PY_FORMAT_SIZE_T
2014-07-04 22:50:13 +02:00
Victor Stinner
a33bce0945
Issue #21892 , #21893 : Partial revert of changeset 4f55e802baf0, PyErr_Format()
...
uses "%zd" for Py_ssize_t, not PY_FORMAT_SIZE_T
2014-07-04 22:47:46 +02:00
Victor Stinner
9f43505f3d
(Merge 3.4) Closes #21892 , #21893 : Use PY_FORMAT_SIZE_T instead of %zi or %zu
...
to format C size_t, because %zi/%zu is not supported on all platforms.
2014-07-01 08:57:54 +02:00
Victor Stinner
293f3f526d
Closes #21892 , #21893 : Use PY_FORMAT_SIZE_T instead of %zi or %zu to format C
...
size_t, because %zi/%zu is not supported on all platforms.
2014-07-01 08:57:10 +02:00
Serhiy Storchaka
48070c1248
Issue #23803 : Fixed str.partition() and str.rpartition() when a separator
...
is wider than the partitioned string.
2015-03-29 19:21:02 +03:00
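
A small Python sketch of the case this fix covers: an ASCII-only string partitioned by a separator that needs a wider internal representation. The strings are arbitrary examples, and the expected results are simply those of a separator that is not found.

```python
s = "abc"                # fits the 1-byte (UCS1) representation
sep = "\U0001F600"       # needs the 4-byte (UCS4) representation

# The separator cannot occur in s, so both calls report "not found".
head = s.partition(sep)
tail = s.rpartition(sep)

print(head, tail)
```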
Benjamin Peterson
92ce1b4392
merge 3.3 ( #23362 )
2015-03-02 13:23:41 -05:00
Benjamin Peterson
e5a853c390
use PyMem_NEW to detect overflow ( closes #23362 )
2015-03-02 13:23:25 -05:00
Serhiy Storchaka
4dbc305002
Issue #23055 : Fixed a buffer overflow in PyUnicode_FromFormatV. Analysis
...
and fix by Guido Vranken.
2015-01-27 22:18:46 +02:00
Victor Stinner
4dd25256e2
Issue #21118 : PyLong_AS_LONG() result type is long
...
Even if PyLong_AS_LONG() cannot fail, I prefer to use the right type.
2014-04-08 09:14:21 +02:00
Benjamin Peterson
1365de764e
fix reference leaks in the translate fast path ( closes #21175 )
...
Patch by Josh Rosenberg.
2014-04-07 20:15:41 -04:00
Victor Stinner
872b291b96
Issue #21118 : Optimize also str.translate() for ASCII => ASCII deletion
2014-04-05 14:27:07 +02:00
Victor Stinner
4ff33af257
Issue #21118 : Add unit test for invalid character replacement (code point higher than U+10ffff)
2014-04-05 11:56:37 +02:00
Victor Stinner
89a76abf20
Issue #21118 : Optimize str.translate() for ASCII => ASCII translation
2014-04-05 11:44:04 +02:00
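
The ASCII => ASCII cases targeted by the str.translate() optimizations look like this from Python. The tables are built with str.maketrans(), and the sample strings are invented for the example.

```python
# ASCII => ASCII translation: every key and value is an ASCII character.
swap = str.maketrans("el", "ip")
translated = "hello".translate(swap)

# ASCII => ASCII deletion: the third argument lists characters to remove.
drop_l = str.maketrans("", "", "l")
deleted = "hello".translate(drop_l)

print(translated, deleted)
```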
Victor Stinner
8a4422e78d
Issue #21118 : Remove unused variable
2014-04-05 00:15:52 +02:00
Victor Stinner
1194ea020c
Issue #21118 : Use _PyUnicodeWriter API in str.translate() to simplify and
...
factorize the code
2014-04-04 19:37:40 +02:00
Ethan Furman
9ab748013b
Issue #19995: more informative error message; spelling corrections; use operator.mod instead of __mod__
2014-03-21 06:38:46 -07:00
Ethan Furman
38d872ee5d
Issue #19995: passing a non-int to %o, %c, %x, or %X now raises an exception
2014-03-19 08:38:52 -07:00
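
A hedged Python sketch of the resulting behavior, assuming Python 3.5 or later, where the exception is a TypeError. The helper `try_format` is hypothetical and exists only for this example.

```python
def try_format(spec, value):
    """Return the formatted string, or the exception class name on failure."""
    try:
        return spec % value
    except TypeError as exc:
        return type(exc).__name__

ok_hex = try_format("%x", 255)    # ints are accepted
bad_hex = try_format("%x", 3.5)   # floats are rejected
bad_oct = try_format("%o", 3.5)
bad_chr = try_format("%c", 3.5)

print(ok_hex, bad_hex, bad_oct, bad_chr)
```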
Victor Stinner
7d00cc1a64
Issue #20574 : Implement incremental decoder for cp65001 code
...
(Windows code page 65001, Microsoft UTF-8).
2014-03-17 23:08:06 +01:00
Kristján Valur Jónsson
25dded041f
Make the various iterators' "setstate" silently and consistently clip the
...
index. This avoids the possibility of setting an iterator to an invalid
state.
2014-03-05 13:47:57 +00:00
Kristján Valur Jónsson
c5cc5011ac
Make the various iterators' "setstate" silently and consistently clip the
...
index. This avoids the possibility of setting an iterator to an invalid
state.
2014-03-05 15:23:07 +00:00
Serhiy Storchaka
94ee389308
Issue #19619 : Blacklist non-text codecs in method API
...
str.encode, bytes.decode and bytearray.decode now use an
internal API to throw LookupError for known non-text encodings,
rather than attempting the encoding or decoding operation and
then throwing a TypeError for an unexpected output type.
The latter mechanism remains in place for third party non-text
encodings.
Backported changeset d68df99d7a57.
2014-02-24 14:43:03 +02:00
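
A brief Python sketch of the behavior described, using the standard rot13 and hex codecs as examples of known non-text encodings. The helper `method_error` is hypothetical.

```python
import codecs

def method_error(call):
    """Run call() and return the name of the exception it raises, if any."""
    try:
        call()
    except Exception as exc:
        return type(exc).__name__
    return None

# The str/bytes methods refuse known non-text codecs up front.
str_err = method_error(lambda: "abc".encode("rot13"))
bytes_err = method_error(lambda: b"616263".decode("hex"))

# The codecs module functions still accept them.
rotated = codecs.encode("abc", "rot13")
unhexed = codecs.decode(b"616263", "hex")

print(str_err, bytes_err, rotated, unhexed)
```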
Benjamin Peterson
4267869ad8
merge 3.3 ( #20507 )
2014-02-15 13:03:20 -05:00
Benjamin Peterson
9743b2c2b5
give non-iterable TypeError a message ( closes #20507 )
2014-02-15 13:02:52 -05:00
Serhiy Storchaka
dfe98a102e
Issue #20437 : Fixed 22 potential bugs when deleting object references.
2014-02-09 13:46:20 +02:00
Serhiy Storchaka
505ff755d7
Issue #20437 : Fixed 21 potential bugs when deleting object references.
2014-02-09 13:33:53 +02:00
Larry Hastings
2623c8c23c
Issue #20530 : Argument Clinic's signature format has been revised again.
...
The new syntax is highly human readable while still preventing false
positives. The syntax also extends Python syntax to denote "self" and
positional-only parameters, allowing inspect.Signature objects to be
totally accurate for all supported builtins in Python 3.4.
2014-02-08 22:15:29 -08:00
Serhiy Storchaka
6cbf151032
Issue #20538 : UTF-7 incremental decoder produced inconsistent string when
...
input was truncated in BASE64 section.
2014-02-08 14:06:33 +02:00
Serhiy Storchaka
016a3f33a5
Issue #20538 : UTF-7 incremental decoder produced inconsistent string when
...
input was truncated in BASE64 section.
2014-02-08 14:01:29 +02:00
Larry Hastings
581ee3618c
Issue #20326 : Argument Clinic now uses a simple, unique signature to
...
annotate text signatures in docstrings, resulting in fewer false
positives. "self" parameters are also explicitly marked, allowing
inspect.Signature() to authoritatively detect (and skip) said parameters.
Issue #20326 : Argument Clinic now generates separate checksums for the
input and output sections of the block, allowing external tools to verify
that the input has not changed (and thus the output is not out-of-date).
2014-01-28 05:00:08 -08:00
Larry Hastings
c20472640c
Issue #20390 : Small fixes and improvements for Argument Clinic.
2014-01-25 20:43:29 -08:00
Larry Hastings
5c66189e88
Issue #20189 : Four additional builtin types (PyTypeObject,
...
PyMethodDescr_Type, _PyMethodWrapper_Type, and PyWrapperDescr_Type)
have been modified to provide introspection information for builtins.
Also: many additional Lib, test suite, and Argument Clinic fixes.
2014-01-24 06:17:25 -08:00
Ethan Furman
a70805e1fa
Issue #19995: fixed typo; switched from test.support.check_warnings to assertWarns
2014-01-12 08:42:35 -08:00
Ethan Furman
f9bba9c67f
Issue #19995: issue deprecation warning for non-integer values to %c, %o, %x, %X
2014-01-11 23:20:58 -08:00
Larry Hastings
61272b77b0
Issue #19273 : The marker comments Argument Clinic uses have been changed
...
to improve readability.
2014-01-07 12:41:53 -08:00
Ethan Furman
df3ed242c0
Issue #19995: %o, %x, %X now only accept ints
2014-01-05 06:50:30 -08:00
Serhiy Storchaka
3079328d29
Reverted changeset b72c5573c5e7 (issue #15027 ).
2014-01-04 22:44:01 +02:00