Commit Graph

1419 Commits

Author SHA1 Message Date
Miss Islington (bot) f02d1a43c6
bpo-36946: Fix possible signed integer overflow when handling slices. (GH-13375)
The final addition (cur += step) may overflow, so use size_t for "cur".
"cur" is always positive (even for negative steps), so it is safe to use
size_t here.

Co-Authored-By: Martin Panter <vadmium+py@gmail.com>
(cherry picked from commit 14514d9084)

Co-authored-by: Zackery Spytz <zspytz@gmail.com>
2019-05-17 00:33:10 -07:00
Miss Islington (bot) bd48280cb6 bpo-24214: Fixed the UTF-8 incremental decoder. (GH-12603) (GH-12627)
The bug occurred when the encoded surrogate character is passed
to the incremental decoder in two chunks.
(cherry picked from commit 7a465cb5ee)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2019-03-30 15:52:41 +02:00
Miss Islington (bot) 74829b7323
bpo-36312: Fix decoders for some code pages. (GH-12369)
(cherry picked from commit c1e2c288f4)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2019-03-20 21:31:57 -07:00
Miss Islington (bot) cbc7c2c791
bpo-35552: Fix reading past the end in PyUnicode_FromFormat() and PyBytes_FromFormat(). (GH-11276)
Format characters "%s" and "%V" in PyUnicode_FromFormat() and "%s" in PyBytes_FromFormat()
no longer read memory past the limit if precision is specified.
(cherry picked from commit d586ccb04f)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2019-01-12 00:52:55 -08:00
Miss Islington (bot) 9a413faa87
bpo-35560: Remove assertion from format(float, "n") (GH-11288)
Fix an assertion error in format() in debug build for floating point
formatting with "n" format, zero padding and small width. Release build is
not impacted. Patch by Karthikeyan Singaravelan.
(cherry picked from commit 3f7983a25a)

Co-authored-by: Xtreak <tir.karthi@gmail.com>
2019-01-07 07:26:20 -08:00
Miss Islington (bot) bdeb56cd21
bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)
(cherry picked from commit 4013c17911)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2018-12-03 01:09:11 -08:00
Victor Stinner 85ab974f78
bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759) (GH-10761)
Fix memory leak in PyUnicode_EncodeLocale() and
PyUnicode_EncodeFSDefault() on error handling.

Fix unicode_encode_locale() error handling.

(cherry picked from commit bde9d6bbb4)
2018-11-28 12:42:40 +01:00
Victor Stinner 7f9fb0f345
bpo-33954: Rewrite FILL() macro of unicodeobject.c (GH-10738)
Copy code from master: add assertions on start and value, replace 'i'
iterator with 'end' pointer for the loop stop condition.

_PyUnicode_FastFill(): fix type of 'data', it must not be constant,
since data is modified by FILL().
2018-11-27 12:42:04 +01:00
Victor Stinner 6f5fa1b4be
bpo-33954: Fix _PyUnicode_InsertThousandsGrouping() (GH-10623) (GH-10718)
Fix str.format(), float.__format__() and complex.__format__() methods
for non-ASCII decimal point when using the "n" formatter.

Rewrite _PyUnicode_InsertThousandsGrouping(): it now requires
a _PyUnicodeWriter object for the buffer and a Python str object
for digits.

(cherry picked from commit 59423e3ddd)
2018-11-26 14:17:01 +01:00
Miss Islington (bot) 9fbcb1402e
[3.7] bpo-35214: Fix OOB memory access in unicode escape parser (GH-10506) (GH-10522)
Discovered using clang's MemorySanitizer when it ran python3's
test_fstring test_misformed_unicode_character_name.

An msan build will fail by simply executing: ./python -c 'u"\N"'
(cherry picked from commit 746b2d35ea)


Co-authored-by: Gregory P. Smith <greg@krypto.org>


https://bugs.python.org/issue35214
2018-11-13 16:39:36 -08:00
Miss Islington (bot) 1e596d3a20
bpo-34435: Add missing NULL check to unicode_encode_ucs1(). (GH-8823)
Reported by Svace static analyzer.
(cherry picked from commit 74a307d48e)

Co-authored-by: Alexey Izbyshev <izbyshev@users.noreply.github.com>
2018-08-19 16:17:53 -04:00
Miss Islington (bot) c721472fb8
bpo-34087: Fix buffer overflow in int(s) and similar functions (GH-8274)
`_PyUnicode_TransformDecimalAndSpaceToASCII()` missed trailing NUL char.
It caused buffer overflow in `_Py_string_to_number_with_underscores()`.

This bug is introduced in 9b6c60cb.
(cherry picked from commit 16dfca4d82)

Co-authored-by: INADA Naoki <methane@users.noreply.github.com>
2018-07-13 20:58:12 -07:00
Miss Islington (bot) 09819ef05a bpo-32827: Fix usage of _PyUnicodeWriter_Prepare() in decoding errors handler. (GH-5636) (GH-5650)
(cherry picked from commit b7e2d67f7c)


Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2018-02-13 09:15:57 +02:00
Xiang Zhang 86fdad093b bpo-32583: Fix possible crashing in builtin Unicode decoders (#5325)
When using customized decode error handlers, it is possible for builtin decoders
to write out-of-bounds and then crash.
2018-01-31 17:02:12 -05:00
INADA Naoki 7cc95f5069
Fix wrong assert in unicodeobject (GH-5340) 2018-01-28 02:07:09 +09:00
INADA Naoki a49ac99029
bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342) 2018-01-27 14:06:21 +09:00
Victor Stinner 7ed7aead95
bpo-29240: Fix locale encodings in UTF-8 Mode (#5170)
Modify locale.localeconv(), time.tzname, os.strerror() and other
functions to ignore the UTF-8 Mode: always use the current locale
encoding.

Changes:

* Add _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx(). On decoding or
  encoding error, they return the position of the error and an error
  message which are used to raise Unicode errors in
  PyUnicode_DecodeLocale() and PyUnicode_EncodeLocale().
* Replace _Py_DecodeCurrentLocale() with _Py_DecodeLocaleEx().
* PyUnicode_DecodeLocale() now uses _Py_DecodeLocaleEx() for all
  cases, especially for the strict error handler.
* Add _Py_DecodeUTF8Ex(): return more information on decoding error
  and supports the strict error handler.
* Rename _Py_EncodeUTF8_surrogateescape() to _Py_EncodeUTF8Ex().
* Replace _Py_EncodeCurrentLocale() with _Py_EncodeLocaleEx().
* Ignore the UTF-8 mode to encode/decode localeconv(), strerror()
  and time zone name.
* Remove PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize()
  and PyUnicode_EncodeLocale() now ignore the UTF-8 mode: always use
  the "current" locale.
* Remove _PyUnicode_DecodeCurrentLocale(),
  _PyUnicode_DecodeCurrentLocaleAndSize() and
  _PyUnicode_EncodeCurrentLocale().
2018-01-15 10:45:49 +01:00
Victor Stinner cb3ae5588b
bpo-29240: Ignore UTF-8 Mode in time module (#5148)
time.strftime() must use the current LC_CTYPE encoding, not UTF-8
if the UTF-8 mode is enabled.

Add _PyUnicode_DecodeCurrentLocale() function.
2018-01-11 10:37:59 +01:00
Victor Stinner 2cba6b8579
bpo-29240: readline now ignores the UTF-8 Mode (#5145)
Add new fuctions ignoring the UTF-8 mode:

* _Py_DecodeCurrentLocale()
* _Py_EncodeCurrentLocale()
* _PyUnicode_DecodeCurrentLocaleAndSize()
* _PyUnicode_EncodeCurrentLocale()

Modify the readline module to use these functions.

Re-enable test_readline.test_nonascii().
2018-01-10 22:46:15 +01:00
Victor Stinner 9dd762013f
bpo-32030: Add _Py_EncodeLocaleRaw() (#4961)
Replace Py_EncodeLocale() with _Py_EncodeLocaleRaw() in:

* _Py_wfopen()
* _Py_wreadlink()
* _Py_wrealpath()
* _Py_wstat()
* pymain_open_filename()

These functions are called early during Python intialization, only
the RAW memory allocator must be used.
2017-12-21 16:20:32 +01:00
Victor Stinner e47e698da6
bpo-32030: Add _Py_EncodeUTF8_surrogateescape() (#4960)
Py_EncodeLocale() now uses _Py_EncodeUTF8_surrogateescape(), instead
of using temporary unicode and bytes objects. So Py_EncodeLocale()
doesn't use the Python C API anymore.
2017-12-21 15:45:16 +01:00
Serhiy Storchaka a5552f023e
bpo-32240: Add the const qualifier to declarations of PyObject* array arguments. (#4746) 2017-12-15 13:11:11 +02:00
Victor Stinner 91106cd9ff
bpo-29240: PEP 540: Add a new UTF-8 Mode (#855)
* Add -X utf8 command line option, PYTHONUTF8 environment variable
  and a new sys.flags.utf8_mode flag.
* If the LC_CTYPE locale is "C" at startup: enable automatically the
  UTF-8 mode.
* Add _winapi.GetACP(). encodings._alias_mbcs() now calls
  _winapi.GetACP() to get the ANSI code page
* locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8
  mode. As a side effect, open() now uses the UTF-8 encoding by
  default in this mode.
* Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding
  in the UTF-8 Mode.
* Update subprocess._args_from_interpreter_flags() to handle -X utf8
* Skip some tests relying on the current locale if the UTF-8 mode is
  enabled.
* Add test_utf8mode.py.
* _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to
  return also the length (number of wide characters).
* pymain_get_global_config() and pymain_set_global_config() now
  always copy flag values, rather than only copying if the new value
  is greater than the old value.
2017-12-13 12:29:09 +01:00
Victor Stinner 6a54c676e6
bpo-31979: Remove unused align_maxchar() function (#4527) 2017-11-23 19:02:23 +01:00
Serhiy Storchaka 9b6c60cbce
bpo-31979: Simplify transforming decimals to ASCII (#4336)
in int(), float() and complex() parsers.

This also speeds up parsing non-ASCII numbers by around 20%.
2017-11-13 21:23:48 +02:00
Serhiy Storchaka e2f92de6a9
Add the const qualifier to "char *" variables that refer to literal strings. (#4370) 2017-11-11 13:06:26 +02:00
stratakis e8b1965639 bpo-23699: Use a macro to reduce boilerplate code in rich comparison functions (GH-793) 2017-11-02 20:32:54 +10:00
Serhiy Storchaka a2314283ff
bpo-20047: Make bytearray methods partition() and rpartition() rejecting (#4158)
separators that are not bytes-like objects.
2017-10-29 02:11:54 +03:00
Serhiy Storchaka 56cb465cc9 bpo-31825: Fixed OverflowError in the 'unicode-escape' codec (#4058)
and in codecs.escape_decode() when decode an escaped non-ascii byte.
2017-10-20 17:08:15 +03:00
Barry Warsaw b2e5794870 bpo-31338 (#3374)
* Add Py_UNREACHABLE() as an alias to abort().
* Use Py_UNREACHABLE() instead of assert(0)
* Convert more unreachable code to use Py_UNREACHABLE()
* Document Py_UNREACHABLE() and a few other macros.
2017-09-14 18:13:16 -07:00
Serhiy Storchaka e3b2b4b8d9 bpo-31393: Fix the use of PyUnicode_READY(). (#3451) 2017-09-08 09:58:51 +03:00
Eric Snow 2ebc5ce42a bpo-30860: Consolidate stateful runtime globals. (#3397)
* group the (stateful) runtime globals into various topical structs
* consolidate the topical structs under a single top-level _PyRuntimeState struct
* add a check-c-globals.py script that helps identify runtime globals

Other globals are excluded (see globals.txt and check-c-globals.py).
2017-09-07 23:51:28 -06:00
Stefan Krah f432a3234f bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157) 2017-08-21 13:09:59 +02:00
Serhiy Storchaka 64e461be09 bpo-22207: Add checks for possible integer overflows in unicodeobject.c. (#2623)
Based on patch by Victor Stinner.
2017-07-11 06:55:25 +03:00
Serhiy Storchaka f7eae0adfc [security] bpo-13617: Reject embedded null characters in wchar* strings. (#2302)
Based on patch by Victor Stinner.

Add private C API function _PyUnicode_AsUnicode() which is similar to
PyUnicode_AsUnicode(), but checks for null characters.
2017-06-28 08:30:06 +03:00
Serhiy Storchaka e613e6add5 bpo-30708: Check for null characters in PyUnicode_AsWideCharString(). (#2285)
Raise a ValueError if the second argument is NULL and the wchar_t\*
string contains null characters.
2017-06-27 16:03:14 +03:00
Serhiy Storchaka 40db90c1ce bpo-29802: Fix reference counting in module-level struct functions (#1213)
when pass arguments of wrong type.
2017-04-20 21:19:31 +03:00
Serhiy Storchaka b879fe82e7 Expand the PySlice_GetIndicesEx macro. (#1023) 2017-04-08 09:53:51 +03:00
Lisa Roach 43ba8861e0 bpo-29549: Fixes docstring for str.index (#256)
* Updates B.index documentation.

* Updates str.index documentation, makes it Argument Clinic compatible.

* Removes ArgumentClinic code.

* Finishes string.index documentation.

* Updates string.rindex documentation.

* Documents B.rindex.
2017-04-04 22:36:22 -07:00
Serhiy Storchaka fff9a31a91 bpo-29865: Use PyXXX_GET_SIZE macros rather than Py_SIZE for concrete types. (#748) 2017-03-21 08:53:25 +02:00
Serhiy Storchaka 004e03fb0c bpo-29116: Improve error message for concatenating str with non-str. (#710) 2017-03-19 19:38:42 +02:00
Serhiy Storchaka 202fda55c2 bpo-24037: Add Argument Clinic converter `bool(accept={int})`. (#485) 2017-03-12 10:10:47 +02:00
Serhiy Storchaka 370fd202f1 Use Py_RETURN_FALSE/Py_RETURN_TRUE rather than PyBool_FromLong(0)/PyBool_FromLong(1). (#567) 2017-03-08 20:47:48 +02:00
Serhiy Storchaka 9f8ad3f39e bpo-29568: Disable any characters between two percents for escaped percent "%%" in the format string for classic string formatting. (GH-513) 2017-03-08 11:51:19 +08:00
Martin Panter 91a8866dc1 Fix grammar in doc string, RST markup 2017-01-24 00:30:06 +00:00
Serhiy Storchaka 228b12edcc Issue #28999: Use Py_RETURN_NONE, Py_RETURN_TRUE and Py_RETURN_FALSE wherever
possible.  Patch is writen with Coccinelle.
2017-01-23 09:47:21 +02:00
Serhiy Storchaka 2a404b63d4 Issue #28769: The result of PyUnicode_AsUTF8AndSize() and PyUnicode_AsUTF8()
is now of type "const char *" rather of "char *".
2017-01-22 23:07:07 +02:00
Victor Stinner 0c4a828cad Run Argument Clinic: METH_VARARGS=>METH_FASTCALL
Issue #29286. Run Argument Clinic to get the new faster METH_FASTCALL calling
convention for functions using "boring" positional arguments.

Manually fix _elementtree: _elementtree_XMLParser_doctype() must remain
consistent with the clinic code.
2017-01-17 02:21:47 +01:00
INADA Naoki 15f94596b6 Issue #20180: forgot to update AC output. 2017-01-16 21:49:13 +09:00
INADA Naoki 3ae2056512 Issue #20180: convert unicode methods to AC. 2017-01-16 20:41:20 +09:00