Serhiy Storchaka
e3b2b4b8d9
bpo-31393: Fix the use of PyUnicode_READY(). ( #3451 )
2017-09-08 09:58:51 +03:00
Eric Snow
2ebc5ce42a
bpo-30860: Consolidate stateful runtime globals. ( #3397 )
...
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals
Other globals are excluded (see globals.txt and check-c-globals.py).
2017-09-07 23:51:28 -06:00
Stefan Krah
f432a3234f
bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. ( #3157 )
2017-08-21 13:09:59 +02:00
Serhiy Storchaka
64e461be09
bpo-22207: Add checks for possible integer overflows in unicodeobject.c. ( #2623 )
...
Based on patch by Victor Stinner.
2017-07-11 06:55:25 +03:00
Serhiy Storchaka
f7eae0adfc
[security] bpo-13617: Reject embedded null characters in wchar* strings. ( #2302 )
...
Based on patch by Victor Stinner.
Add private C API function _PyUnicode_AsUnicode() which is similar to
PyUnicode_AsUnicode(), but checks for null characters.
2017-06-28 08:30:06 +03:00
Serhiy Storchaka
e613e6add5
bpo-30708: Check for null characters in PyUnicode_AsWideCharString(). ( #2285 )
...
Raise a ValueError if the second argument is NULL and the wchar_t\*
string contains null characters.
2017-06-27 16:03:14 +03:00
Serhiy Storchaka
40db90c1ce
bpo-29802: Fix reference counting in module-level struct functions ( #1213 )
...
when pass arguments of wrong type.
2017-04-20 21:19:31 +03:00
Serhiy Storchaka
b879fe82e7
Expand the PySlice_GetIndicesEx macro. ( #1023 )
2017-04-08 09:53:51 +03:00
Lisa Roach
43ba8861e0
bpo-29549: Fixes docstring for str.index ( #256 )
...
* Updates B.index documentation.
* Updates str.index documentation, makes it Argument Clinic compatible.
* Removes ArgumentClinic code.
* Finishes string.index documentation.
* Updates string.rindex documentation.
* Documents B.rindex.
2017-04-04 22:36:22 -07:00
Serhiy Storchaka
fff9a31a91
bpo-29865: Use PyXXX_GET_SIZE macros rather than Py_SIZE for concrete types. ( #748 )
2017-03-21 08:53:25 +02:00
Serhiy Storchaka
004e03fb0c
bpo-29116: Improve error message for concatenating str with non-str. ( #710 )
2017-03-19 19:38:42 +02:00
Serhiy Storchaka
202fda55c2
bpo-24037: Add Argument Clinic converter `bool(accept={int})`. ( #485 )
2017-03-12 10:10:47 +02:00
Serhiy Storchaka
370fd202f1
Use Py_RETURN_FALSE/Py_RETURN_TRUE rather than PyBool_FromLong(0)/PyBool_FromLong(1). ( #567 )
2017-03-08 20:47:48 +02:00
Serhiy Storchaka
9f8ad3f39e
bpo-29568: Disable any characters between two percents for escaped percent "%%" in the format string for classic string formatting. (GH-513)
2017-03-08 11:51:19 +08:00
Martin Panter
91a8866dc1
Fix grammar in doc string, RST markup
2017-01-24 00:30:06 +00:00
Serhiy Storchaka
228b12edcc
Issue #28999 : Use Py_RETURN_NONE, Py_RETURN_TRUE and Py_RETURN_FALSE wherever
...
possible. Patch is writen with Coccinelle.
2017-01-23 09:47:21 +02:00
Serhiy Storchaka
2a404b63d4
Issue #28769 : The result of PyUnicode_AsUTF8AndSize() and PyUnicode_AsUTF8()
...
is now of type "const char *" rather of "char *".
2017-01-22 23:07:07 +02:00
Victor Stinner
0c4a828cad
Run Argument Clinic: METH_VARARGS=>METH_FASTCALL
...
Issue #29286 . Run Argument Clinic to get the new faster METH_FASTCALL calling
convention for functions using "boring" positional arguments.
Manually fix _elementtree: _elementtree_XMLParser_doctype() must remain
consistent with the clinic code.
2017-01-17 02:21:47 +01:00
INADA Naoki
15f94596b6
Issue #20180 : forgot to update AC output.
2017-01-16 21:49:13 +09:00
INADA Naoki
3ae2056512
Issue #20180 : convert unicode methods to AC.
2017-01-16 20:41:20 +09:00
Xiang Zhang
7a4da324dc
Issue #29145 : Merge 3.6.
2017-01-10 10:56:38 +08:00
Xiang Zhang
95403d74d7
Issue #29145 : Merge 3.5.
2017-01-10 10:54:19 +08:00
Xiang Zhang
b0541f4cdf
Issue #29145 : Fix overflow checks in str.replace() and str.join().
...
Based on patch by Martin Panter.
2017-01-10 10:52:00 +08:00
Xiang Zhang
62497d52d9
Issue #29044 : Merge 3.6.
2016-12-22 15:31:55 +08:00
Xiang Zhang
437a5d2c25
Issue #29044 : Merge 3.5.
2016-12-22 15:31:22 +08:00
Xiang Zhang
ea1cf87030
Issue #29044 : Fix a use-after-free in string '%c' formatter.
2016-12-22 15:30:47 +08:00
Xiang Zhang
b211068f5c
Issue #28822 : Adjust indices handling of PyUnicode_FindChar().
2016-12-20 22:52:33 +08:00
Xavier de Gaye
31eaf49ed9
Merge 3.6.
2016-12-15 21:01:52 +01:00
Xavier de Gaye
76febd0792
Issue #26919 : On Android, operating system data is now always encoded/decoded
...
to/from UTF-8, instead of the locale encoding to avoid inconsistencies with
os.fsencode() and os.fsdecode() which are already using UTF-8.
2016-12-15 20:59:58 +01:00
Serhiy Storchaka
fb3134f4d4
Issue #28808 : PyUnicode_CompareWithASCIIString() now never raises exceptions.
2016-12-06 00:20:26 +02:00
Serhiy Storchaka
9a953dbb34
Issue #28808 : PyUnicode_CompareWithASCIIString() now never raises exceptions.
2016-12-06 00:17:45 +02:00
Serhiy Storchaka
419967b832
Issue #28808 : PyUnicode_CompareWithASCIIString() now never raises exceptions.
2016-12-06 00:13:34 +02:00
Victor Stinner
de4ae3d486
Backed out changeset b9c9691c72c5
...
Issue #28858 : The change b9c9691c72c5 introduced a regression. It seems like
_PyObject_CallArg1() uses more stack memory than
PyObject_CallFunctionObjArgs().
2016-12-04 22:59:09 +01:00
Victor Stinner
27580c1fb5
Replace PyObject_CallFunctionObjArgs() with fastcall
...
* PyObject_CallFunctionObjArgs(func, NULL) => _PyObject_CallNoArg(func)
* PyObject_CallFunctionObjArgs(func, arg, NULL) => _PyObject_CallArg1(func, arg)
PyObject_CallFunctionObjArgs() allocates 40 bytes on the C stack and requires
extra work to "parse" C arguments to build a C array of PyObject*.
_PyObject_CallNoArg() and _PyObject_CallArg1() are simpler and don't allocate
memory on the C stack.
This change is part of the fastcall project. The change on listsort() is
related to the issue #23507 .
2016-12-01 14:43:22 +01:00
Serhiy Storchaka
99250d5c63
Issue #28774 : Simplified encoding a str result of an error handler in ASCII
...
and Latin1 encoders.
2016-11-23 15:13:00 +02:00
Xiang Zhang
d04d8474df
Issue #28774 : Fix start/end pos in unicode_encode_ucs1().
...
Fix error position of the unicode error in ASCII and Latin1
encoders when a string returned by the error handler contains multiple
non-encodable characters (non-ASCII for the ASCII codec, characters out
of the U+0000-U+00FF range for Latin1).
2016-11-23 19:34:01 +08:00
Serhiy Storchaka
50911476f5
Issue #28760 : Clean up and fix comments in PyUnicode_AsUnicodeEscapeString().
...
Patch by Xiang Zhang.
2016-11-21 11:47:16 +02:00
Serhiy Storchaka
ac0720eaa4
Issue #28760 : Clean up and fix comments in PyUnicode_AsUnicodeEscapeString().
...
Patch by Xiang Zhang.
2016-11-21 11:46:51 +02:00
Serhiy Storchaka
460bd0d284
Issue #19569 : Compiler warnings are now emitted if use most of deprecated
...
functions.
2016-11-20 12:16:46 +02:00
Serhiy Storchaka
27b74244fb
Issue #28701 : _PyUnicode_EqualToASCIIId and _PyUnicode_EqualToASCIIString now
...
require ASCII right argument and assert this condition in debug build.
2016-11-16 20:03:03 +02:00
Serhiy Storchaka
a83a6a3275
Issue #28701 : _PyUnicode_EqualToASCIIId and _PyUnicode_EqualToASCIIString now
...
require ASCII right argument and assert this condition in debug build.
2016-11-16 20:02:44 +02:00
Serhiy Storchaka
e6d6131f78
Fixed an off-by-one error in _PyUnicode_EqualToASCIIString (issue #28701 ).
2016-11-16 16:13:13 +02:00
Serhiy Storchaka
df66b9c425
Fixed an off-by-one error in _PyUnicode_EqualToASCIIString (issue #28701 ).
2016-11-16 16:12:56 +02:00
Serhiy Storchaka
292dd1b2ad
Fixed an off-by-one error in _PyUnicode_EqualToASCIIString (issue #28701 ).
2016-11-16 16:12:34 +02:00
Serhiy Storchaka
503db266a5
Issue #21449 : Removed private function _PyUnicode_CompareWithId.
2016-11-16 15:56:50 +02:00
Serhiy Storchaka
dddec81b2d
Issue #21449 : Removed private function _PyUnicode_CompareWithId.
2016-11-16 15:56:27 +02:00
Serhiy Storchaka
29a5447360
Issue #28701 : Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId.
...
The latter function is more readable, faster and doesn't raise exceptions.
Based on patch by Xiang Zhang.
2016-11-16 15:41:31 +02:00
Serhiy Storchaka
fab6acd9f5
Issue #28701 : Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId.
...
The latter function is more readable, faster and doesn't raise exceptions.
Based on patch by Xiang Zhang.
2016-11-16 15:41:11 +02:00
Serhiy Storchaka
f5894dd646
Issue #28701 : Replace _PyUnicode_CompareWithId with _PyUnicode_EqualToASCIIId.
...
The latter function is more readable, faster and doesn't raise exceptions.
Based on patch by Xiang Zhang.
2016-11-16 15:40:39 +02:00
Serhiy Storchaka
1a73bf365e
Issue #28701 : Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString.
...
The latter function is more readable, faster and doesn't raise exceptions.
2016-11-16 10:19:57 +02:00
Serhiy Storchaka
3b73ea1278
Issue #28701 : Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString.
...
The latter function is more readable, faster and doesn't raise exceptions.
2016-11-16 10:19:20 +02:00
Serhiy Storchaka
f4934ea77d
Issue #28701 : Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString.
...
The latter function is more readable, faster and doesn't raise exceptions.
2016-11-16 10:17:58 +02:00
Serhiy Storchaka
616034eb73
Issue #28648 : Fixed crash in Py_DecodeLocale() in debug build on Mac OS X
...
when decode astral characters.
2016-11-12 14:37:11 +02:00
Serhiy Storchaka
babe4f8e5e
Issue #28648 : Fixed crash in Py_DecodeLocale() in debug build on Mac OS X
...
when decode astral characters.
2016-11-12 14:36:02 +02:00
Serhiy Storchaka
6b4b6e956e
Issue #28648 : Fixed crash in Py_DecodeLocale() in debug build on Mac OS X
...
when decode astral characters.
2016-11-12 14:35:46 +02:00
Serhiy Storchaka
84293aff9f
Issue #28648 : Fixed crash in Py_DecodeLocale() in debug build on Mac OS X
...
when decode astral characters.
2016-11-12 14:29:48 +02:00
Serhiy Storchaka
b626643734
Issue #28648 : Fixed crash in Py_DecodeLocale() in debug build on Mac OS X
...
when decode astral characters.
2016-11-12 14:28:06 +02:00
Steve Dower
257a4c1503
Closes #27781 : Removes special cases for the experimental aspect of PEP 529
2016-11-06 19:35:24 -08:00
Steve Dower
78057b4159
Closes #27781 : Removes special cases for the experimental aspect of PEP 529
2016-11-06 19:35:08 -08:00
Eric V. Smith
5646648678
Issue 28128: Print out better error/warning messages for invalid string escapes. Backport to 3.6.
2016-10-31 14:46:26 -04:00
Eric V. Smith
42454af094
Issue 28128: Print out better error/warning messages for invalid string escapes.
2016-10-31 09:22:08 -04:00
Serhiy Storchaka
2edcd1cba4
Issue #28426 : Deprecated undocumented functions PyUnicode_AsEncodedObject(),
...
PyUnicode_AsDecodedObject(), PyUnicode_AsDecodedUnicode() and
PyUnicode_AsEncodedUnicode().
2016-10-27 21:08:00 +03:00
Serhiy Storchaka
0093907f0e
Issue #28426 : Deprecated undocumented functions PyUnicode_AsEncodedObject(),
...
PyUnicode_AsDecodedObject(), PyUnicode_AsDecodedUnicode() and
PyUnicode_AsEncodedUnicode().
2016-10-27 21:05:49 +03:00
Serhiy Storchaka
a4f8823063
Issue #28408 : Fixed a leak and remove redundant code in _PyUnicodeWriter_Finish().
...
Patch by Xiang Zhang.
2016-10-25 13:25:04 +03:00
Serhiy Storchaka
c8bc3d1c07
Issue #28408 : Fixed a leak and remove redundant code in _PyUnicodeWriter_Finish().
...
Patch by Xiang Zhang.
2016-10-25 13:23:56 +03:00
Serhiy Storchaka
d7e5ff13bb
Issue #28426 : Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.
2016-10-25 10:18:16 +03:00
Serhiy Storchaka
c4a3e90aa8
Issue #28426 : Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.
2016-10-25 10:17:33 +03:00
Serhiy Storchaka
839023f12c
Issue #28426 : Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.
2016-10-25 10:13:43 +03:00
Serhiy Storchaka
77eede35fc
Issue #28426 : Fixed potential crash in PyUnicode_AsDecodedObject() in debug build.
2016-10-25 10:07:51 +03:00
Serhiy Storchaka
2fbc019c8c
Issue #28439 : Remove redundant checks in PyUnicode_EncodeLocale and
...
PyUnicode_DecodeLocaleAndSize. Patch by Xiang Zhang.
2016-10-23 15:41:36 +03:00
Serhiy Storchaka
f8d7d41507
Issue #28511 : Use the "U" format instead of "O!" in PyArg_Parse*.
2016-10-23 15:12:25 +03:00
Serhiy Storchaka
523c449ca0
Issue #28504 : Cleanup unicode_decode_call_errorhandler_wchar/writer.
...
Patch by Xiang Zhang.
2016-10-22 23:18:31 +03:00
Serhiy Storchaka
14ab277632
Issue #28410 : Added _PyErr_FormatFromCause() -- the helper for raising
...
new exception with setting current exception as __cause__.
_PyErr_FormatFromCause(exception, format, args...) is equivalent to Python
raise exception(format % args) from sys.exc_info()[1]
2016-10-21 17:10:42 +03:00
Serhiy Storchaka
467ab194fc
Issue #28410 : Added _PyErr_FormatFromCause() -- the helper for raising
...
new exception with setting current exception as __cause__.
_PyErr_FormatFromCause(exception, format, args...) is equivalent to Python
raise exception(format % args) from sys.exc_info()[1]
2016-10-21 17:09:17 +03:00
Benjamin Peterson
d6d49f16f4
merge 3.6 ( #28454 )
2016-10-16 15:42:33 -07:00
Benjamin Peterson
3aa75528a1
merge 3.5 ( #28454 )
2016-10-16 15:42:24 -07:00
Benjamin Peterson
8d761ff045
remove extra PyErr_Format arguments ( closes #28454 )
...
Patch from Xiang Zhang.
2016-10-16 15:41:46 -07:00
Victor Stinner
5a33759fba
Merge 3.6
2016-10-12 13:59:13 +02:00
Victor Stinner
ebe17e0347
Fix _Py_normalize_encoding() command
...
It's not exactly the same than encodings.normalize_encoding(): the C function
also converts to lowercase.
2016-10-12 13:57:45 +02:00
Benjamin Peterson
8a3748290a
merge 3.6 ( #28417 )
2016-10-11 23:01:12 -07:00
Benjamin Peterson
b329e1bb5b
va_end vargs2 once ( closes #28417 )
2016-10-11 23:00:58 -07:00
Serhiy Storchaka
2e58f1a52a
Issue #28400 : Removed uncessary checks in unicode_char and resize_copy.
...
1. In resize_copy we don't need to PyUnicode_READY(unicode) since when
it's not PyUnicode_WCHAR_KIND it should be ready.
2. In unicode_char, PyUnicode_1BYTE_KIND is handled by get_latin1_char.
Patch by Xiang Zhang.
2016-10-09 23:44:48 +03:00
Serhiy Storchaka
21d9f10c94
Merge from 3.5.
2016-10-08 22:46:01 +03:00
Serhiy Storchaka
9c0e1f83af
Issue #28379 : Added sanity checks and tests for PyUnicode_CopyCharacters().
...
Patch by Xiang Zhang.
2016-10-08 22:45:38 +03:00
Victor Stinner
44f4874e68
Merge 3.5
2016-09-21 14:13:53 +02:00
Victor Stinner
1ddf53d496
Fix PyUnicode_FromFormatV() error handling
...
Issue #28233 : Fix a memory leak if the format string contains a non-ASCII
character, destroy the unicode writer.
2016-09-21 14:13:14 +02:00
Christian Heimes
2f2fee19ec
va_end() all va_copy()ed va_lists.
2016-09-21 11:37:27 +02:00
Benjamin Peterson
0c21214f3e
replace usage of Py_VA_COPY with the (C99) standard va_copy
2016-09-20 20:39:33 -07:00
Christian Heimes
f051e43b22
Issue #28126 : Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().
2016-09-13 20:22:02 +02:00
Benjamin Peterson
621b430a14
remove all usage of Py_LOCAL
2016-09-09 13:54:34 -07:00
Benjamin Peterson
33d2a492d0
promote some shifts to unsigned, so as not to invoke undefined behavior
2016-09-06 20:40:04 -07:00
R David Murray
110b6fecbb
#27364 : Deprecate invalid escape strings in str/byutes.
...
Patch by Emanuel Barry, reviewed by Serhiy Storchaka and Martin Panter.
2016-09-08 15:34:08 -04:00
Steve Dower
cc16be85c0
Issue #27781 : Change file system encoding on Windows to UTF-8 (PEP 529)
2016-09-08 10:35:16 -07:00
Benjamin Peterson
47ff0734b8
more PY_LONG_LONG to long long
2016-09-08 09:15:54 -07:00
Benjamin Peterson
2e7c5e9c11
replace some Py_LOCAL_INLINE with the inline keyword
2016-09-07 15:33:32 -07:00
Benjamin Peterson
4b9abf3a27
merge 3.5
2016-09-06 20:42:17 -07:00
Brett Cannon
a571120410
Issue #27182 : Add support for path-like objects to PyUnicode_FSDecoder().
2016-09-06 19:36:01 -07:00
Victor Stinner
62ec3317d2
Optimize unicode_escape and raw_unicode_escape
...
Issue #16334 . Patch written by Serhiy Storchaka.
2016-09-06 17:04:34 -07:00
Victor Stinner
2740e46089
_PyUnicodeWriter: assert that max character <= MAX_UNICODE
2016-09-06 16:58:36 -07:00
Brett Cannon
ec6ce879c7
Issue #26027 : Support path-like objects in PyUnicode-FSConverter().
...
This is to add support for os.exec*() and os.spawn*() functions. Part
of PEP 519.
2016-09-06 15:50:29 -07:00
Benjamin Peterson
9b3d77052f
replace Python aliases for standard integer types with the standard integer types ( #17884 )
2016-09-06 13:24:00 -07:00
Serhiy Storchaka
ea525a2d1a
Issue #27078 : Added BUILD_STRING opcode. Optimized f-strings evaluation.
2016-09-06 22:07:53 +03:00
Benjamin Peterson
af580dff4a
replace PY_LONG_LONG with long long
2016-09-06 10:46:49 -07:00
Benjamin Peterson
ed4aa83ff7
require a long long data type ( closes #27961 )
2016-09-05 17:44:18 -07:00
Victor Stinner
942889aae2
Issue #27938 : Add a fast-path for us-ascii encoding
...
Other changes:
* Rewrite _Py_normalize_encoding() as a C implementation of
encodings.normalize_encoding(). For example, " utf-8 " is now normalized to
"utf_8". So the fast path is now used for more name variants of the same
encoding.
* Avoid strcpy() when encoding is NULL: call directly the UTF-8 codec
2016-09-05 15:40:10 -07:00
Victor Stinner
1a05d6c04d
PEP 7 style for if/else in C
...
Add also a newline for readability in normalize_encoding().
2016-09-02 12:12:23 +02:00
Raymond Hettinger
15f44ab043
Issue #27895 : Spelling fixes (Contributed by Ville Skyttä).
2016-08-30 10:47:49 -07:00
Serhiy Storchaka
febc332056
Issue #26754 : Undocumented support of general bytes-like objects
...
as path in compile() and similar functions is now deprecated.
2016-08-06 23:29:29 +03:00
Berker Peksag
ced8d4c6eb
Issue #27454 : Use PyDict_SetDefault in PyUnicode_InternInPlace
...
Patch by INADA Naoki.
2016-07-25 04:40:39 +03:00
Serhiy Storchaka
f95de0e8cc
Issue #26754 : PyUnicode_FSDecoder() accepted a filename argument encoded as
...
an iterable of integers. Now only strings and byte-like objects are accepted.
2016-06-18 13:56:16 +03:00
Serhiy Storchaka
9305d83425
Issue #26754 : PyUnicode_FSDecoder() accepted a filename argument encoded as
...
an iterable of integers. Now only strings and byte-like objects are accepted.
2016-06-18 13:53:36 +03:00
Martin Panter
0b7d84de6b
Issue #27171 : Merge typo fixes from 3.5
2016-06-02 10:11:18 +00:00
Martin Panter
e26da7c03a
Issue #27171 : Fix typos in documentation, comments, and test function names
2016-06-02 10:07:09 +00:00
Serhiy Storchaka
dd40fc3e57
Issue #26765 : Moved common code and docstrings for bytes and bytearray methods
...
to bytes_methods.c.
2016-05-04 22:23:26 +03:00
Martin Panter
cda80940ed
Issue #15984 : Merge PyUnicode doc from 3.5
2016-04-15 02:27:11 +00:00
Martin Panter
6245cb3c01
Correct “an” → “a” with “Unicode”, “user”, “UTF”, etc
...
This affects documentation, code comments, and a debugging messages.
2016-04-15 02:14:19 +00:00
Serhiy Storchaka
21a663ea28
Issue #26057 : Got rid of nonneeded use of PyUnicode_FromObject().
2016-04-13 15:37:23 +03:00
Serhiy Storchaka
f01e408c16
Issue #26200 : Added Py_SETREF and replaced Py_XSETREF with Py_SETREF
...
in places where Py_DECREF was used.
2016-04-10 18:12:01 +03:00
Serhiy Storchaka
57a01d3a0e
Issue #26200 : Added Py_SETREF and replaced Py_XSETREF with Py_SETREF
...
in places where Py_DECREF was used.
2016-04-10 18:05:40 +03:00
Serhiy Storchaka
ec39756960
Issue #22570 : Renamed Py_SETREF to Py_XSETREF.
2016-04-06 09:50:03 +03:00
Serhiy Storchaka
48842714b9
Issue #22570 : Renamed Py_SETREF to Py_XSETREF.
2016-04-06 09:45:48 +03:00
Serhiy Storchaka
ab479c49d3
Issue #26494 : Fixed crash on iterating exhausting iterators.
...
Affected classes are generic sequence iterators, iterators of str, bytes,
bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding
views and os.scandir() iterator.
2016-03-30 20:41:15 +03:00
Serhiy Storchaka
fbb1c5ee06
Issue #26494 : Fixed crash on iterating exhausting iterators.
...
Affected classes are generic sequence iterators, iterators of str, bytes,
bytearray, list, tuple, set, frozenset, dict, OrderedDict, corresponding
views and os.scandir() iterator.
2016-03-30 20:40:02 +03:00
Victor Stinner
f2192855dd
Merge 3.5
2016-03-01 22:07:53 +01:00
Victor Stinner
337986740f
Issue #26464 : Fix unicode_fast_translate() again
...
Initialize i variable if the string is non-ASCII.
2016-03-01 21:59:58 +01:00
Victor Stinner
3d9d77a3dc
Merge 3.5
2016-03-01 21:30:50 +01:00
Victor Stinner
6c9aa8f2bf
Fix str.translate()
...
Issue #26464 : Fix str.translate() when string is ASCII and first replacements
removes character, but next replacement uses a non-ASCII character or a string
longer than 1 character. Regression introduced in Python 3.5.0.
2016-03-01 21:30:30 +01:00
Victor Stinner
5b96f17b1c
Merge 3.5
2016-01-27 17:01:13 +01:00
Victor Stinner
5bc03a6d4d
Fix resize_compact()
...
Issue #26217 : resize_compact() must set wstr_length to 0 after freeing the wstr
string. Otherwise, an assertion fails in _PyUnicode_CheckConsistency().
2016-01-27 16:56:53 +01:00
Serhiy Storchaka
726fc139a5
Issue #20440 : More use of Py_SETREF.
...
This patch is manually crafted and contains changes that couldn't be handled
automatically.
2015-12-27 15:44:33 +02:00
Serhiy Storchaka
191321d11b
Issue #20440 : More use of Py_SETREF.
...
This patch is manually crafted and contains changes that couldn't be handled
automatically.
2015-12-27 15:41:34 +02:00
Serhiy Storchaka
ef1585eb9a
Issue #25923 : Added more const qualifiers to signatures of static and private functions.
2015-12-25 20:01:53 +02:00
Serhiy Storchaka
2d06e84455
Issue #25923 : Added the const qualifier to static constant arrays.
2015-12-25 19:53:18 +02:00
Serhiy Storchaka
f006940351
Issue #20440 : Massive replacing unsafe attribute setting code with special
...
macro Py_SETREF.
2015-12-24 10:39:57 +02:00
Serhiy Storchaka
5a57ade58e
Issue #20440 : Massive replacing unsafe attribute setting code with special
...
macro Py_SETREF.
2015-12-24 10:35:59 +02:00
Serhiy Storchaka
9b3a2eec1c
Issues #25890 , #25891 , #25892 : Removed unused variables in Windows code.
...
Reported by Alexander Riccio.
2015-12-18 10:03:13 +02:00
Serhiy Storchaka
7c088a9b5c
Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.
2015-12-03 01:05:52 +02:00
Serhiy Storchaka
6648bf5661
Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.
2015-12-03 01:04:37 +02:00
Serhiy Storchaka
31b9410654
Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.
2015-12-03 01:02:03 +02:00
Serhiy Storchaka
7aa690860e
Issue #25709 : Fixed problem with in-place string concatenation and utf-8 cache.
2015-12-03 01:02:03 +02:00
Benjamin Peterson
d798dc1034
merge 3.5 ( #25630 )
2015-11-15 21:57:50 -08:00
Benjamin Peterson
a4d33b3428
make the PyUnicode_FSConverter cleanup set the decrefed argument to NULL ( closes #25630 )
2015-11-15 21:57:39 -08:00
Serhiy Storchaka
413fdcea21
Issue #24821 : Refactor STRINGLIB(fastsearch_memchr_1char) and split it on
...
STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independedly
without special preconditions.
2015-11-14 15:42:17 +02:00
Serhiy Storchaka
4a7c03aab4
Issue #25523 : Merge a-to-an corrections from 3.5.
2015-11-02 14:44:29 +02:00
Serhiy Storchaka
a84f6c3dd3
Issue #25523 : Merge a-to-an corrections from 3.4.
2015-11-02 14:39:05 +02:00
Serhiy Storchaka
d65c9496da
Issue #25523 : Further a-to-an corrections.
2015-11-02 14:10:23 +02:00
Victor Stinner
358af13526
Issue #25353 : Optimize unicode escape and raw unicode escape encoders to use
...
the new _PyBytesWriter API.
2015-10-12 22:36:57 +02:00
Victor Stinner
6c2cdae9e6
Writer APIs: use empty string singletons
...
Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the
empty bytes/Unicode string if the string is empty.
2015-10-12 13:29:43 +02:00
Victor Stinner
6bd525b656
Optimize error handlers of ASCII and Latin1 encoders when the replacement
...
string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual
character.
Cleanup unicode_encode_ucs1():
* Rename repunicode to rep
* Clear rep object on error
* Factorize code between bytes and unicode path
2015-10-09 13:10:05 +02:00
Victor Stinner
ce179bf6ba
Add _PyBytesWriter_WriteBytes() to factorize the code
2015-10-09 12:57:22 +02:00
Victor Stinner
ad7715891e
_PyBytesWriter: simplify code to avoid "prealloc" parameters
...
Substract preallocate bytes from min_size before calling
_PyBytesWriter_Prepare().
2015-10-09 12:38:53 +02:00
Victor Stinner
3fa36ff5e4
Issue #25318 : Fix backslashreplace()
...
Fix code to estimate the needed space.
2015-10-09 03:37:11 +02:00
Victor Stinner
797485e101
Issue #25318 : Avoid sprintf() in backslashreplace()
...
Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors().
Add also unit tests for non-BMP characters.
2015-10-09 03:17:30 +02:00
Victor Stinner
0016507c16
Issue #25318 : Move _PyBytesWriter to bytesobject.c
...
Declare also the private API in bytesobject.h.
2015-10-09 01:53:21 +02:00
Victor Stinner
e7bf86cd7d
Optimize backslashreplace error handler
...
Issue #25318 : Optimize backslashreplace and xmlcharrefreplace error handlers in
UTF-8 encoder. Optimize also backslashreplace error handler for ASCII and
Latin1 encoders.
Use the new _PyBytesWriter API to optimize these error handlers for the
encoders. It avoids to create an exception and call the slow implementation of
the error handler.
2015-10-09 01:39:28 +02:00
Victor Stinner
fdfbf78114
Issue #25318 : Add _PyBytesWriter API
...
Add a new private API to optimize Unicode encoders. It uses a small buffer
allocated on the stack and supports overallocation.
Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable
overallocation for the UTF-8 encoder with error handlers.
unicode_encode_ucs1(): initialize collend to collstart+1 to not check the
current character twice, we already know that it is not ASCII.
2015-10-09 00:33:49 +02:00
Victor Stinner
74e8fac3c8
Issue #25301 : Fix compatibility with ISO C90
2015-10-05 13:49:26 +02:00
Victor Stinner
1d65d9192d
Issue #25301 : The UTF-8 decoder is now up to 15 times as fast for error
...
handlers: ``ignore``, ``replace`` and ``surrogateescape``.
2015-10-05 13:43:50 +02:00
Victor Stinner
eb36fdaad8
Fix _PyUnicodeWriter_PrepareKind()
...
Initialize kind to 0 (PyUnicode_WCHAR_KIND) to ensure that
_PyUnicodeWriter_PrepareKind() handles correctly read-only buffer: copy the
buffer.
2015-10-03 01:55:51 +02:00
Serhiy Storchaka
29e68edbf4
Issue #24848 : Fixed bugs in UTF-7 decoding of misformed data:
...
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
2015-10-02 13:14:03 +03:00
Serhiy Storchaka
58c8f2bb6d
Issue #24848 : Fixed bugs in UTF-7 decoding of misformed data:
...
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
2015-10-02 13:13:14 +03:00
Serhiy Storchaka
28b21e50c8
Issue #24848 : Fixed bugs in UTF-7 decoding of misformed data:
...
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
2015-10-02 13:07:28 +03:00
Victor Stinner
3222da26fe
Make _PyUnicode_TranslateCharmap() symbol private
...
unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().
2015-10-01 22:07:32 +02:00
Victor Stinner
01ada3996b
Issue #25267 : The UTF-8 encoder is now up to 75 times as fast for error
...
handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
Patch co-written with Serhiy Storchaka.
2015-10-01 21:54:51 +02:00
Victor Stinner
c3713e9706
Optimize ascii/latin1+surrogateescape encoders
...
Issue #25227 : Optimize ASCII and latin1 encoders with the ``surrogateescape``
error handler: the encoders are now up to 3 times as fast.
Initial patch written by Serhiy Storchaka.
2015-09-29 12:32:13 +02:00
Victor Stinner
0030cd52da
Issue #25227 : Cleanup unicode_encode_ucs1() error handler
...
* Change limit type from unsigned int to Py_UCS4, to use the same type than the
"ch" variable (an Unicode character).
* Reuse ch variable for _Py_ERROR_XMLCHARREFREPLACE
* Add some newlines for readability
2015-09-24 14:45:00 +02:00
Victor Stinner
54385b206d
Issue #24870 : revert unwanted change
...
Sorry, I pushed the patch on the UTF-8 decoder by mistake :-(
2015-09-22 10:46:52 +02:00
Victor Stinner
5ebae87628
Issue #25207 , #14626 : Fix my commit.
...
It doesn't work to use #define XXX defined(YYY)" and then "#ifdef XXX"
to check YYY.
2015-09-22 01:29:33 +02:00
Victor Stinner
6174474bea
_PyUnicodeWriter_PrepareInternal(): make the assertion more strict
2015-09-22 01:01:17 +02:00
Victor Stinner
ca9381ea01
Issue #24870 : Add _PyUnicodeWriter_PrepareKind() macro
...
Add a macro which ensures that the writer has at least the requested kind.
2015-09-22 00:58:32 +02:00
Victor Stinner
5014920cb7
Issue #24870 : Reuse the new _Py_error_handler enum
...
Factorize code with the new get_error_handler() function.
Add some empty lines for readability.
2015-09-22 00:26:54 +02:00
Victor Stinner
f96418de05
Issue #24870 : Optimize the ASCII decoder for error handlers: surrogateescape,
...
ignore and replace. Initial patch written by Naoki Inada.
The decoder is now up to 60 times as fast for these error handlers.
Add also unit tests for the ASCII decoder.
2015-09-21 23:06:27 +02:00
Zachary Ware
070bd62cfa
Closes #21279 : Merge with 3.5
2015-08-06 00:05:13 -05:00
Zachary Ware
d987a81d29
Issue #21279 : Merge with 3.4
2015-08-06 00:04:23 -05:00
Zachary Ware
79b98df023
Issue #21279 : Flesh out str.translate docs
...
Initial patch by Kinga Farkas, Martin Panter, and John Posner.
2015-08-05 23:54:15 -05:00
Raymond Hettinger
ac2ef65c32
Make the unicode equality test an external function rather than in-lining it.
...
The real benefit of the unicode specialized function comes from
bypassing the overhead of PyObject_RichCompareBool() and not
from being in-lined (especially since there was almost no shared
data between the caller and callee). Also, the in-lining was
having a negative effect on code generation for the callee.
2015-07-04 16:04:44 -07:00
Serhiy Storchaka
d4ea03c785
Issue #24284 : The startswith and endswith methods of the str class no longer
...
return True when finding the empty string and the indexes are completely out
of range.
2015-05-31 09:15:51 +03:00
Antoine Pitrou
873e0df946
Fix some compilation warnings when using gcc (-Wmaybe-uninitialized).
2015-05-19 21:06:04 +02:00
Antoine Pitrou
f6d1f1fa8a
Fix some compilation warnings when using gcc (-Wmaybe-uninitialized).
2015-05-19 21:04:33 +02:00
Serhiy Storchaka
0d4df752ac
Issue #15027 : The UTF-32 encoder is now 3x to 7x faster.
2015-05-12 23:12:45 +03:00
Serhiy Storchaka
7e9d1d1a1b
Issue #23908 : os functions now reject paths with embedded null character
...
on Windows instead of silently truncate them.
Removed no longer used _PyUnicode_HasNULChars().
2015-04-20 10:12:28 +03:00
Serhiy Storchaka
1009bf18b3
Issue #23501 : Argumen Clinic now generates code into separate files by default.
2015-04-03 23:53:51 +03:00
Victor Stinner
1912b39def
_PyUnicodeWriter_WriteStr() now checks that the input string is consistent
...
in debug mode to detect bugs earlier.
_PyUnicodeWriter_Finish() doesn't check if the read only string is consistent,
whereas it does check consistency for strings built by itself.
2015-03-26 09:37:23 +01:00
Serhiy Storchaka
d9d769fcdd
Issue #23573 : Increased performance of string search operations (str.find,
...
str.index, str.count, the in operator, str.split, str.partition) with
arguments of different kinds (UCS1, UCS2, UCS4).
2015-03-24 21:55:47 +02:00
Victor Stinner
f50e187724
Fix compiler warnings: comparison between signed and unsigned numbers
2015-03-20 11:32:24 +01:00
Victor Stinner
0c39b1b970
Initialize variables to prevent GCC warnings
2015-03-18 15:02:06 +01:00
Benjamin Peterson
e5a853c390
use PyMem_NEW to detect overflow ( closes #23362 )
2015-03-02 13:23:25 -05:00
Steve Dower
3e96f324dc
Issue #23451 : Update pyconfig.h for Windows to require Vista headers and remove unnecessary version checks.
2015-03-02 08:01:10 -08:00
Serhiy Storchaka
78a8249127
Issue #23490 : Fixed possible crashes related to interoperability between
...
old-style and new API for string with 2**30-1 characters.
2015-02-20 21:34:39 +02:00
Serhiy Storchaka
e55181f517
Issue #23490 : Fixed possible crashes related to interoperability between
...
old-style and new API for string with 2**30-1 characters.
2015-02-20 21:34:06 +02:00
Serhiy Storchaka
4d0d982985
Issue #23446 : Use PyMem_New instead of PyMem_Malloc to avoid possible integer
...
overflows. Added few missed PyErr_NoMemory().
2015-02-16 13:33:32 +02:00
Serhiy Storchaka
1a1ff29659
Issue #23446 : Use PyMem_New instead of PyMem_Malloc to avoid possible integer
...
overflows. Added few missed PyErr_NoMemory().
2015-02-16 13:28:22 +02:00
Serhiy Storchaka
4dbc305002
Issue #23055 : Fixed a buffer overflow in PyUnicode_FromFormatV. Analysis
...
and fix by Guido Vranken.
2015-01-27 22:18:46 +02:00
Victor Stinner
29dacf2e97
Issue #15859 : PyUnicode_EncodeFSDefault(), PyUnicode_EncodeMBCS() and
...
PyUnicode_EncodeCodePage() now raise an exception if the object is not an
Unicode object. For PyUnicode_EncodeFSDefault(), it was already the case on
platforms other than Windows. Patch written by Campbell Barton.
2015-01-26 16:41:32 +01:00
Serhiy Storchaka
bbd3aa8ece
Issue #23321 : Fixed a crash in str.decode() when error handler returned
...
replacment string longer than mailformed input data.
2015-01-26 01:24:31 +02:00
Serhiy Storchaka
7e4b9057b3
Issue #23321 : Fixed a crash in str.decode() when error handler returned
...
replacment string longer than mailformed input data.
2015-01-26 01:22:54 +02:00
Ethan Furman
b95b56150f
Issue20284: Implement PEP461
2015-01-23 20:05:18 -08:00
Serhiy Storchaka
82e07b92b3
Issue #23181 : More "codepoint" -> "code point".
2015-01-18 11:33:31 +02:00
Serhiy Storchaka
d3faf43f9b
Issue #23181 : More "codepoint" -> "code point".
2015-01-18 11:28:37 +02:00
Serhiy Storchaka
b757c83ec6
Issue #22581 : Use more "bytes-like object" throughout the docs and comments.
2014-12-05 22:25:22 +02:00
Serhiy Storchaka
133b11b566
Issue #22975 : Close block at right place.
2014-12-01 18:56:28 +02:00
Serhiy Storchaka
92bf919ed0
Issue #22581 : Use more "bytes-like object" throughout the docs and comments.
2014-12-05 22:26:10 +02:00
Serhiy Storchaka
407249c62b
Issue #22975 : Close block at right place.
2014-12-01 18:56:54 +02:00
Victor Stinner
3aa979e0cd
Issue #20948 : Inline makefmt() in unicode_fromformat_arg()
2014-11-18 21:40:51 +01:00
Antoine Pitrou
b6dc9b7554
Fixed signed/unsigned comparison warning
2014-10-15 23:14:53 +02:00
Antoine Pitrou
4e334241b7
Fixed signed/unsigned comparison warning
2014-10-15 23:14:53 +02:00
Benjamin Peterson
736982d36d
merge 3.4 ( closes #22643 )
2014-10-15 12:17:47 -04:00
Benjamin Peterson
9c422f3c3d
merge 3.3
2014-10-15 12:17:33 -04:00
Benjamin Peterson
1e211ff10d
it suffices to check for PY_SSIZE_T_MAX overflow ( #22643 )
2014-10-15 12:17:21 -04:00
Benjamin Peterson
315aa40403
Merge 3.4
2014-10-15 11:51:17 -04:00
Benjamin Peterson
60d7a73194
Merge 3.3
2014-10-15 11:51:12 -04:00
Benjamin Peterson
c0e64f5027
make sure length is unsigned
2014-10-15 11:51:05 -04:00
Benjamin Peterson
6925264334
merge 3.4 ( #22643 )
2014-10-15 11:49:15 -04:00
Benjamin Peterson
1cbb3fe775
merge 3.3 ( #22643 )
2014-10-15 11:48:41 -04:00
Benjamin Peterson
e1bd38c03c
fix integer overflow in unicode case operations ( closes #22643 )
2014-10-15 11:47:36 -04:00
Gregory P. Smith
8486f9b134
Fix "warning: comparison between signed and unsigned integer expressions"
...
-Wsign-compare warnings in unicodeobject.c. These were all a result
of sizeof() being unsigned and being compared to a Py_ssize_t.
Not actual problems.
2014-09-30 00:33:24 -07:00
Benjamin Peterson
fd97a6fb2d
merge 3.4 ( #22520 )
2014-09-29 23:02:56 -04:00
Benjamin Peterson
43030ee780
merge 3.3 ( #22520 )
2014-09-29 23:02:35 -04:00
Benjamin Peterson
736b8012b4
prevent overflow in unicode_repr ( closes #22520 )
2014-09-29 23:02:15 -04:00
Benjamin Peterson
10e4b2545e
merge 3.4 ( closes #22518 )
2014-09-29 18:53:58 -04:00
Benjamin Peterson
2b76ce6d27
merge 3.3 ( closes #22518 )
2014-09-29 18:50:06 -04:00
Benjamin Peterson
a1c1be4e03
cleanup overflowing handling in unicode_decode_call_errorhandler and unicode_encode_ucs1 ( closes #22518 )
2014-09-29 18:18:57 -04:00
Serhiy Storchaka
20b39b27d9
Removed redundant casts to `char *`.
...
Corresponding functions now accept `const char *` (issue #1772673 ).
2014-09-28 11:27:24 +03:00
Benjamin Peterson
fa5021699a
Merge 3.3
2014-10-15 23:58:32 -04:00
Serhiy Storchaka
d8a1447c99
Issue #22215 : Now ValueError is raised instead of TypeError when str or bytes
...
argument contains not permitted null character or byte.
2014-09-06 20:07:17 +03:00
Victor Stinner
12174a5dca
Issue #22156 : Fix "comparison between signed and unsigned integers" compiler
...
warnings in the Objects/ subdirectory.
PyType_FromSpecWithBases() and PyType_FromSpec() now reject explicitly negative
slot identifiers.
2014-08-15 23:17:38 +02:00
Victor Stinner
f6a271ae98
Issue #18395 : Rename ``_Py_char2wchar()`` to :c:func:`Py_DecodeLocale`, rename
...
``_Py_wchar2char()`` to :c:func:`Py_EncodeLocale`, and document these
functions.
2014-08-01 12:28:48 +02:00
Victor Stinner
e1f17c6c0b
unicodeobject.c: fix a compiler warning on Windows 64 bits
2014-07-25 14:03:03 +02:00
Victor Stinner
c68b7fba86
(Merge 3.4) Issue #21892 , #21893 : Partial revert of changeset 4f55e802baf0,
...
PyErr_Format() uses "%zd" for Py_ssize_t, not PY_FORMAT_SIZE_T
2014-07-04 22:50:13 +02:00
Victor Stinner
a33bce0945
Issue #21892 , #21893 : Partial revert of changeset 4f55e802baf0, PyErr_Format()
...
uses "%zd" for Py_ssize_t, not PY_FORMAT_SIZE_T
2014-07-04 22:47:46 +02:00
Victor Stinner
9f43505f3d
(Merge 3.4) Closes #21892 , #21893 : Use PY_FORMAT_SIZE_T instead of %zi or %zu
...
to format C size_t, because %zi/%u is not supported on all platforms.
2014-07-01 08:57:54 +02:00
Victor Stinner
293f3f526d
Closes #21892 , #21893 : Use PY_FORMAT_SIZE_T instead of %zi or %zu to format C
...
size_t, because %zi/%u is not supported on all platforms.
2014-07-01 08:57:10 +02:00
Serhiy Storchaka
48070c1248
Issue #23803 : Fixed str.partition() and str.rpartition() when a separator
...
is wider then partitioned string.
2015-03-29 19:21:02 +03:00
Benjamin Peterson
92ce1b4392
merge 3.3 ( #23362 )
2015-03-02 13:23:41 -05:00
Victor Stinner
4dd25256e2
Issue #21118 : PyLong_AS_LONG() result type is long
...
Even if PyLong_AS_LONG() cannot fail, I prefer to use the right type.
2014-04-08 09:14:21 +02:00
Benjamin Peterson
1365de764e
fix reference leaks in the translate fast path ( closes #21175 )
...
Patch by Josh Rosenberg.
2014-04-07 20:15:41 -04:00
Victor Stinner
872b291b96
Issue #21118 : Optimize also str.translate() for ASCII => ASCII deletion
2014-04-05 14:27:07 +02:00
Victor Stinner
4ff33af257
Issue #21118 : Add unit test for invalid character replacement (code point higher than U+10ffff)
2014-04-05 11:56:37 +02:00
Victor Stinner
89a76abf20
Issue #21118 : Optimize str.translate() for ASCII => ASCII translation
2014-04-05 11:44:04 +02:00
Victor Stinner
8a4422e78d
Issue #21118 : Remove unused variable
2014-04-05 00:15:52 +02:00
Victor Stinner
1194ea020c
Issue #21118 : Use _PyUnicodeWriter API in str.translate() to simplify and
...
factorize the code
2014-04-04 19:37:40 +02:00
Ethan Furman
9ab748013b
Issue19995: more informative error message; spelling corrections; use operator.mod instead of __mod__
2014-03-21 06:38:46 -07:00
Ethan Furman
38d872ee5d
Issue19995: passing a non-int to %o, %c, %x, or %X now raises an exception
2014-03-19 08:38:52 -07:00
Victor Stinner
7d00cc1a64
Issue #20574 : Implement incremental decoder for cp65001 code
...
(Windows code page 65001, Microsoft UTF-8).
2014-03-17 23:08:06 +01:00
Kristján Valur Jónsson
25dded041f
Make the various iterators' "setstate" sliently and consistently clip the
...
index. This avoids the possibility of setting an iterator to an invalid
state.
2014-03-05 13:47:57 +00:00
Kristján Valur Jónsson
c5cc5011ac
Make the various iterators' "setstate" sliently and consistently clip the
...
index. This avoids the possibility of setting an iterator to an invalid
state.
2014-03-05 15:23:07 +00:00
Serhiy Storchaka
94ee389308
Issue #19619 : Blacklist non-text codecs in method API
...
str.encode, bytes.decode and bytearray.decode now use an
internal API to throw LookupError for known non-text encodings,
rather than attempting the encoding or decoding operation and
then throwing a TypeError for an unexpected output type.
The latter mechanism remains in place for third party non-text
encodings.
Backported changeset d68df99d7a57.
2014-02-24 14:43:03 +02:00
Benjamin Peterson
4267869ad8
merge 3.3 ( #20507 )
2014-02-15 13:03:20 -05:00
Benjamin Peterson
9743b2c2b5
give non-iterable TypeError a message ( closes #20507 )
2014-02-15 13:02:52 -05:00
Serhiy Storchaka
dfe98a102e
Issue #20437 : Fixed 22 potential bugs when deleting objects references.
2014-02-09 13:46:20 +02:00
Serhiy Storchaka
505ff755d7
Issue #20437 : Fixed 21 potential bugs when deleting objects references.
2014-02-09 13:33:53 +02:00
Larry Hastings
2623c8c23c
Issue #20530 : Argument Clinic's signature format has been revised again.
...
The new syntax is highly human readable while still preventing false
positives. The syntax also extends Python syntax to denote "self" and
positional-only parameters, allowing inspect.Signature objects to be
totally accurate for all supported builtins in Python 3.4.
2014-02-08 22:15:29 -08:00
Serhiy Storchaka
6cbf151032
Issue #20538 : UTF-7 incremental decoder produced inconsistant string when
...
input was truncated in BASE64 section.
2014-02-08 14:06:33 +02:00
Serhiy Storchaka
016a3f33a5
Issue #20538 : UTF-7 incremental decoder produced inconsistant string when
...
input was truncated in BASE64 section.
2014-02-08 14:01:29 +02:00
Larry Hastings
581ee3618c
Issue #20326 : Argument Clinic now uses a simple, unique signature to
...
annotate text signatures in docstrings, resulting in fewer false
positives. "self" parameters are also explicitly marked, allowing
inspect.Signature() to authoritatively detect (and skip) said parameters.
Issue #20326 : Argument Clinic now generates separate checksums for the
input and output sections of the block, allowing external tools to verify
that the input has not changed (and thus the output is not out-of-date).
2014-01-28 05:00:08 -08:00
Larry Hastings
c20472640c
Issue #20390 : Small fixes and improvements for Argument Clinic.
2014-01-25 20:43:29 -08:00
Larry Hastings
5c66189e88
Issue #20189 : Four additional builtin types (PyTypeObject,
...
PyMethodDescr_Type, _PyMethodWrapper_Type, and PyWrapperDescr_Type)
have been modified to provide introspection information for builtins.
Also: many additional Lib, test suite, and Argument Clinic fixes.
2014-01-24 06:17:25 -08:00
Ethan Furman
a70805e1fa
Issue19995: fixed typo; switched from test.support.check_warnings to assertWarns
2014-01-12 08:42:35 -08:00
Ethan Furman
f9bba9c67f
Issue19995: issue deprecation warning for non-integer values to %c, %o, %x, %X
2014-01-11 23:20:58 -08:00
Larry Hastings
61272b77b0
Issue #19273 : The marker comments Argument Clinic uses have been changed
...
to improve readability.
2014-01-07 12:41:53 -08:00
Ethan Furman
df3ed242c0
Issue19995: %o, %x, %X now only accept ints
2014-01-05 06:50:30 -08:00
Serhiy Storchaka
3079328d29
Reverted changeset b72c5573c5e7 (issue #15027 ).
2014-01-04 22:44:01 +02:00
Serhiy Storchaka
583a93943c
Issue #15027 : Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster.
2014-01-04 19:25:37 +02:00
Victor Stinner
fa4e68d425
Remove deadcode (HASH macro is no more defined)
2014-01-03 17:42:18 +01:00
Victor Stinner
92a419eea4
Remove now unused variables
2014-01-03 17:39:40 +01:00
Victor Stinner
f3b46b4a66
unicode_char() uses get_latin1_char() to get latin1 singleton characters
2014-01-03 13:16:00 +01:00
Victor Stinner
985a82a6d2
add unicode_char() in unicodeobject.c to factorize code
2014-01-03 12:53:47 +01:00
Larry Hastings
44e2eaab54
Issue #19674 : inspect.signature() now produces a correct signature
...
for some builtins.
2013-11-23 15:37:55 -08:00
Larry Hastings
ebdcb50b8a
Issue #19730 : Argument Clinic now supports all the existing PyArg
...
"format units" as legacy converters, as well as two new features:
"self converters" and the "version" directive.
2013-11-23 14:54:00 -08:00
Nick Coghlan
c72e4e6dcc
Issue #19619 : Blacklist non-text codecs in method API
...
str.encode, bytes.decode and bytearray.decode now use an
internal API to throw LookupError for known non-text encodings,
rather than attempting the encoding or decoding operation and
then throwing a TypeError for an unexpected output type.
The latter mechanism remains in place for third party non-text
encodings.
2013-11-22 22:39:36 +10:00
Christian Heimes
985ecdcfc2
ssue #19183 : Implement PEP 456 'secure and interchangeable hash algorithm'.
...
Python now uses SipHash24 on all major platforms.
2013-11-20 11:46:18 +01:00
Victor Stinner
4a58707a34
Add _PyUnicodeWriter_WriteASCIIString() function
2013-11-19 12:54:53 +01:00
Serhiy Storchaka
58cf607d13
Issue #12892 : The utf-16* and utf-32* codecs now reject (lone) surrogates.
...
The utf-16* and utf-32* encoders no longer allow surrogate code points
(U+D800-U+DFFF) to be encoded.
The utf-32* decoders no longer decode byte sequences that correspond to
surrogate code points.
The surrogatepass error handler now works with the utf-16* and utf-32* codecs.
Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
2013-11-19 11:32:41 +02:00
Victor Stinner
6989ba0174
Issue #19581 : Change the overallocation factor of _PyUnicodeWriter on Windows
...
On Windows, a factor of 50% gives best performances.
2013-11-18 21:08:39 +01:00
Larry Hastings
ed4a1c5703
Argument Clinic: rename "self" to "module" for module-level functions.
2013-11-18 09:32:13 -08:00
Ezio Melotti
745d54d2fa
#17806 : Added keyword-argument support for "tabsize" to str/bytes.expandtabs().
2013-11-16 19:10:57 +02:00
Nick Coghlan
8b097b4ed7
Close #17828 : better handling of codec errors
...
- output type errors now redirect users to the type-neutral
convenience functions in the codecs module
- stateless errors that occur during encoding and decoding
will now be automatically wrapped in exceptions that give
the name of the codec involved
2013-11-13 23:49:21 +10:00
Victor Stinner
66b3270975
_Py_normalize_encoding(): explain how the value 6 was computed
2013-11-07 23:12:23 +01:00
Victor Stinner
df23e30bea
Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8"
...
if the input string is NULL
2013-11-07 13:33:36 +01:00
Victor Stinner
ad14ccd047
Issue #19512 : add _PyUnicode_CompareWithId() function
...
_PyUnicode_CompareWithId() is faster than PyUnicode_CompareWithASCIIString()
when both strings are equal and interned.
Add also _PyId_builtins identifier for "builtins" common string.
2013-11-07 00:46:04 +01:00
Victor Stinner
21ea21ef6d
Issue #19424 : PyUnicode_CompareWithASCIIString() normalizes memcmp() result
...
to -1, 0, 1
2013-11-04 11:28:26 +01:00
Victor Stinner
f0c7b2af05
Issue #16286 : remove duplicated identity check from unicode_compare()
...
Move the test to PyUnicode_Compare()
2013-11-04 11:27:14 +01:00
Victor Stinner
fd9e44db37
Issue #16286 : optimize PyUnicode_RichCompare() for identical strings (same
...
pointer) for any operator, not only Py_EQ and Py_NE.
Code of bytes_richcompare() and PyUnicode_RichCompare() is now closer.
2013-11-04 11:23:05 +01:00
Victor Stinner
c8bc5377ac
Issue #16286 : write a new subfunction bytes_compare_eq()
...
* cleanup bytes_richcompare()
* PyUnicode_RichCompare(): replace a test with a XOR
2013-11-04 11:08:10 +01:00
Victor Stinner
e1b1592fd4
Issue #19424 : Fix a compiler warning on comparing signed/unsigned size_t
...
Patch written by Zachary Ware.
2013-11-03 13:53:12 +01:00
Victor Stinner
a6b9b071a3
Issue #19424 : Fix a compiler warning
...
memcmp() just takes raw pointers
2013-10-30 18:27:13 +01:00
Victor Stinner
602f7cf0b9
Issue #19424 : Optimize PyUnicode_CompareWithASCIIString()
...
Use fast memcmp() instead of a loop using the slow PyUnicode_READ() macro.
strlen() is still necessary to check Unicode string containing null bytes.
2013-10-29 23:31:50 +01:00
Victor Stinner
68b674c9d4
Issue #19437 : Fix _PyUnicode_New() (constructor of legacy string), set all
...
attributes before checking for error. The destructor expects all attributes to
be set. It is now safe to call Py_DECREF(unicode) in the constructor.
2013-10-29 19:31:43 +01:00
Victor Stinner
fa3ba4c3bc
Issue #18609 : Add a fast-path for "iso8859-1" encoding
...
On AIX, the locale encoding may be "iso8859-1", which was not a known syntax of
the legacy ISO 8859-1 encoding.
Using a C codec instead of a Python codec is faster but also avoids tricky
issues during Python startup or complex code.
2013-10-29 11:34:05 +01:00
Victor Stinner
a5afb58986
Issue #18408 : Fix PyUnicode_AsUTF8AndSize(), raise MemoryError exception on
...
memory allocation failure
2013-10-29 01:28:23 +01:00
Serhiy Storchaka
c679227e31
Issue #1772673 : The type of `char*` arguments now changed to `const char*`.
2013-10-19 21:03:34 +03:00
Serhiy Storchaka
55e092f545
Issue #19279 : UTF-7 decoder no more produces illegal strings.
2013-10-19 20:39:28 +03:00
Serhiy Storchaka
35804e4c63
Issue #19279 : UTF-7 decoder no more produces illegal strings.
2013-10-19 20:38:19 +03:00
Larry Hastings
3182680210
Issue #16612 : Add "Argument Clinic", a compile-time preprocessor
...
for C files to generate argument parsing code. (See PEP 436.)
2013-10-19 00:09:25 -07:00
Ethan Furman
fb13721b1b
Close #18780 : %-formatting now prints value for int subclasses with %d, %i, and %u codes.
2013-08-31 10:18:55 -07:00
Antoine Pitrou
9ed5f27266
Issue #18722 : Remove uses of the "register" keyword in C code.
2013-08-13 20:18:52 +02:00
Raymond Hettinger
e56666d17f
Silence compiler warning about an uninitialized variable
2013-08-04 11:51:03 -07:00
Raymond Hettinger
5ed1b38a7d
merge
2013-08-04 11:51:35 -07:00
Christian Heimes
b578735dff
Check return value of PyType_Ready(&EncodingMapType)
...
CID 486654
2013-07-20 14:57:28 +02:00
Christian Heimes
26532f7519
Check return value of PyType_Ready(&EncodingMapType)
...
CID 486654
2013-07-20 14:57:16 +02:00
Victor Stinner
e699e5a218
Issue #18408 : Don't check unicode consistency in _PyUnicode_HAS_UTF8_MEMORY()
...
and _PyUnicode_HAS_WSTR_MEMORY() macros
These macros are called in unicode_dealloc(), whereas the unicode object can be
"inconsistent" if the creation of the object failed.
For example, when unicode_subtype_new() fails on a memory allocation,
_PyUnicode_CheckConsistency() fails with an assertion error because data is
NULL.
2013-07-15 18:22:47 +02:00
Victor Stinner
9e6b4d715c
Issue #18408 : _PyUnicodeWriter_Finish() now clears its buffer attribute in all
...
cases, so _PyUnicodeWriter_Dealloc() can be called after finish.
2013-07-09 00:37:24 +02:00
Victor Stinner
15a0bd3965
Issue #18408 : Fix _PyUnicodeWriter_Finish(): clear writer->buffer,
...
so _PyUnicodeWriter_Dealloc() can be called on the writer after finish.
2013-07-08 22:29:55 +02:00
Victor Stinner
6f8eeee7b9
Issue #18203 : Fix _Py_DecodeUTF8_surrogateescape(), use PyMem_RawMalloc() as _Py_char2wchar()
2013-07-07 22:57:45 +02:00
Victor Stinner
1a7425f67a
Issue #18203 : Replace malloc() with PyMem_RawMalloc() at Python initialization
...
* Replace malloc() with PyMem_RawMalloc()
* Replace PyMem_Malloc() with PyMem_RawMalloc() where the GIL is not held.
* _Py_char2wchar() now returns a buffer allocated by PyMem_RawMalloc(), instead
of PyMem_Malloc()
2013-07-07 16:25:15 +02:00
Christian Heimes
d47802eef7
Fix ref leak in error case of unicode find, count, formatlong
...
CID 983315: Resource leak (RESOURCE_LEAK)
CID 983316: Resource leak (RESOURCE_LEAK)
CID 983317: Resource leak (RESOURCE_LEAK)
2013-06-29 21:33:36 +02:00
Christian Heimes
d47a0456b1
Fix ref leak in error case of unicode index
...
CID 983319 (#1 of 2): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable substring going out of scope leaks the storage it points to.
2013-06-29 21:21:37 +02:00
Christian Heimes
ea71a525c3
Fix ref leak in error case of unicode rindex and rfind
...
CID 983320: Resource leak (RESOURCE_LEAK)
CID 983321: Resource leak (RESOURCE_LEAK)
leaked_storage: Variable substring going out of scope leaks the storage it points to.
2013-06-29 21:17:34 +02:00
Christian Heimes
305e49e17e
Fix memory leak in endswith
...
CID 1040368 (#1 of 1): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable substring going out of scope leaks the storage it points to.
2013-06-29 20:41:06 +02:00
Serhiy Storchaka
c89533f72f
Issue #18184 : PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise
...
OverflowError when an argument of %c format is out of range.
2013-06-23 20:21:16 +03:00
Serhiy Storchaka
8eeae2126c
Issue #18184 : PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise
...
OverflowError when an argument of %c format is out of range.
2013-06-23 20:12:14 +03:00
Benjamin Peterson
3164f5d565
merge 3.3 ( #18183 )
2013-06-10 09:24:01 -07:00
Benjamin Peterson
7e30373126
remove MAX_MAXCHAR because it's unsafe for computing maximum codepoitn value (see #18183 )
2013-06-10 09:19:46 -07:00
Victor Stinner
9f067f490f
Issue #9566 : Fix compiler warning on Windows 64-bit
2013-06-05 00:21:31 +02:00
Antoine Pitrou
7ce35a1816
Issue #17237 : Fix crash in the ASCII decoder on m68k.
2013-05-11 15:59:37 +02:00
Antoine Pitrou
8b0e98426d
Issue #17237 : Fix crash in the ASCII decoder on m68k.
2013-05-11 15:58:34 +02:00
Victor Stinner
f4f24248dc
Fix uninitialized value in charmap_decode_mapping()
2013-05-07 01:01:31 +02:00
Victor Stinner
8cecc8c262
Issue #7330 : Implement width and precision (ex: "%5.3s") for the format string
...
of PyUnicode_FromFormat() function, original patch written by Ysj Ray.
2013-05-06 23:11:54 +02:00
Victor Stinner
bb4503f61e
Partial revert of changeset 9744b2df134c
...
PyUnicode_Append() cannot call directly resize_compact(): I forgot that a
string can be ready *and* not compact (a legacy string can also be ready).
2013-04-18 09:41:34 +02:00
Victor Stinner
fb161b1b6d
Split PyUnicode_DecodeCharmap() into subfunction for readability
2013-04-18 01:44:27 +02:00
Victor Stinner
170ca6f84b
Fix bug in Unicode decoders related to _PyUnicodeWriter
...
Bug introduced by changesets 7ed9993d53b4 and edf029fc9591.
2013-04-18 00:25:28 +02:00
Victor Stinner
376cfa122d
Fix typo in unicode_decode_call_errorhandler_writer()
...
Bug introduced by changeset 7ed9993d53b4.
2013-04-17 23:58:16 +02:00
Victor Stinner
8f674ccd64
Close #17694 : Add minimum length to _PyUnicodeWriter
...
* Add also min_char attribute to _PyUnicodeWriter structure (currently unused)
* _PyUnicodeWriter_Init() has no more argument (except the writer itself):
min_length and overallocate must be set explicitly
* In error handlers, only enable overallocation if the replacement string
is longer than 1 character
* CJK decoders don't use overallocation anymore
* Set min_length, instead of preallocating memory using
_PyUnicodeWriter_Prepare(), in many decoders
* _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
2013-04-17 23:02:17 +02:00
Victor Stinner
77282cb4f8
Cleanup PyUnicode_Contains()
...
* No need to double-check that strings are ready: test already done by
PyUnicode_FromObject()
* Remove useless kind variable (use kind1 instead)
2013-04-14 19:22:47 +02:00
Victor Stinner
d92e078c8d
Minor change: fix character in do_strip() for the ASCII case
2013-04-14 19:17:42 +02:00
Victor Stinner
f033510fee
Cleanup PyUnicode_Append()
...
* Check also that right is a Unicode object
* call directly resize_compact() instead of unicode_resize() for a more
explicit error handling, and to avoid testing some properties twice
(ex: unicode_modifiable())
2013-04-14 19:13:03 +02:00
Victor Stinner
4560f9c63f
PyUnicode_Join(): move use_memcpy test out of the loop to cleanup and optimize the code
2013-04-14 18:56:46 +02:00
Victor Stinner
55c08781e8
Optimize repr(str): use _PyUnicode_FastCopyCharacters() when no character is escaped
2013-04-14 18:45:39 +02:00
Victor Stinner
af03757d20
Optimize ascii(str): don't encode/decode repr if repr is already ASCII
2013-04-14 18:44:10 +02:00
Victor Stinner
8a1a6cffd6
Add _PyUnicodeWriter_WriteCharInline()
2013-04-14 02:35:33 +02:00
Serhiy Storchaka
e2cef885a2
Issue #16061 : Speed up str.replace() for replacing 1-character strings.
2013-04-13 22:45:04 +03:00
Victor Stinner
a0dd0213cc
Close #17693 : Rewrite CJK decoders to use the _PyUnicodeWriter API instead of
...
the legacy Py_UNICODE API.
Add also a new _PyUnicodeWriter_WriteChar() function.
2013-04-11 22:09:04 +02:00
Victor Stinner
247109e74d
Issue #17615 : On Windows (VS2010), Performances of wmemcmp() to compare Unicode
...
strings are not convincing. For UCS2 (16-bit wchar_t type), use a dummy loop
instead of wmemcmp(). The dummy loop is as fast, or a little bit faster.
wchar_t is only 16-bit long on Windows. wmemcmp() is still used for 32-bit
wchar_t.
2013-04-09 23:53:26 +02:00
Victor Stinner
0cff4b16d9
replace(): only call PyUnicode_DATA(u) once
2013-04-09 22:52:48 +02:00
Victor Stinner
cc7af72192
Write super-fast version of str.strip(), str.lstrip() and str.rstrip() for pure ASCII
2013-04-09 22:39:24 +02:00
Victor Stinner
f50a4e9bc9
Don't calls macros in PyUnicode_WRITE() parameters
...
PyUnicode_WRITE() expands some parameters twice or more.
2013-04-09 22:38:52 +02:00
Victor Stinner
9c79e41fc5
Fix do_strip(): don't call PyUnicode_READ() in Py_UNICODE_ISSPACE() to not call
...
it twice
2013-04-09 22:21:08 +02:00
Victor Stinner
b3a6014504
Fix _PyUnicode_XStrip()
...
Inline the BLOOM_MEMBER() to only call PyUnicode_READ() only once (per loop
iteration). Store also the length of the seperator in a variable to avoid calls
to PyUnicode_GET_LENGTH().
2013-04-09 22:19:21 +02:00
Victor Stinner
63d5c1a14a
Optimize PyUnicode_DecodeCharmap()
...
Avoid expensive PyUnicode_READ() and PyUnicode_WRITE(), manipulate pointers
instead.
2013-04-09 22:13:33 +02:00
Victor Stinner
a85af502a4
Optimize make_bloom_mask(), used by str.strip(), str.lstrip() and str.rstrip()
...
Write specialized functions per Unicode kind to avoid the expensive
PyUnicode_READ() macro.
2013-04-09 21:53:54 +02:00
Victor Stinner
69ed0f4c86
Use PyUnicode_READ() instead of PyUnicode_READ_CHAR()
...
"PyUnicode_READ_CHAR() is less efficient than PyUnicode_READ() because it calls
PyUnicode_KIND() and might call it twice." according to its documentation.
2013-04-09 21:48:24 +02:00
Victor Stinner
03c3e35d42
Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:
...
cp037, cp500 and iso8859_1 codecs
2013-04-09 21:53:09 +02:00
Victor Stinner
cd777eaf53
Issue #17615 : Comparing two Unicode strings now uses wmemcmp() when possible
...
wmemcmp() is twice faster than a dummy loop (342 usec vs 744 usec) on Fedora
18/x86_64, GCC 4.7.2.
2013-04-08 22:43:44 +02:00
Victor Stinner
c1302bba4c
Issue #17615 : Expand expensive PyUnicode_READ() macro in unicode_compare():
...
write specialized functions for each combination of Unicode kinds.
2013-04-08 21:50:54 +02:00
Victor Stinner
207dd38726
fix unused variable
2013-04-03 03:14:58 +02:00
Victor Stinner
eb4b5ac8af
Close #16757 : Avoid calling the expensive _PyUnicode_FindMaxChar() function
...
when possible
2013-04-03 02:02:33 +02:00
Victor Stinner
cfc4c13b04
Add _PyUnicodeWriter_WriteSubstring() function
...
Write a function to enable more optimizations:
* If the substring is the whole string and overallocation is disabled, just
keep a reference to the string, don't copy characters
* Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
possible
2013-04-03 01:48:39 +02:00
Raymond Hettinger
51612fd803
merge
2013-03-23 08:21:52 -07:00
Raymond Hettinger
378170d5d9
Issue 17447: Clarify that str.isidentifier doesn't check for reserved keywords.
2013-03-23 08:21:12 -07:00
Victor Stinner
fb84b5d48d
(Merge 3.3) _PyUnicode_Writer() now also reuses Unicode singletons:
...
empty string and latin1 single character
2013-03-06 19:29:09 +01:00