cpython

Commit Graph

Author	SHA1	Message	Date
Victor Stinner	a557478987	gh-116417: Move limited C API unicode.c tests to _testlimitedcapi (#116993) Split unicode.c tests of _testcapi into two parts: limited C API tests in _testlimitedcapi and non-limited C API tests in _testcapi. Update test_codecs.	2024-03-19 12:30:39 +00:00
John Sloboda	649857a157	gh-85287: Change codecs to raise precise UnicodeEncodeError and UnicodeDecodeError (#113674) Co-authored-by: Inada Naoki <songofacandy@gmail.com>	2024-03-17 04:58:42 +00:00
Zackery Spytz	d180b507c4	gh-63283: IDNA prefix should be case insensitive (GH-17726) Any capitalization of "xn--" should be acceptable for the ACE prefix (see https://tools.ietf.org/html/rfc3490#section-5). Co-authored-by: Pepijn de Vos <pepijndevos@gmail.com> Co-authored-by: Erlend E. Aasland <erlend@python.org> Co-authored-by: Petr Viktorin <encukou@gmail.com>	2024-03-15 15:38:13 +01:00
Serhiy Storchaka	b987fdb19b	gh-109848: Make test_rot13_func in test_codecs independent (GH-109850)	2023-10-07 16:01:39 +03:00
Furkan Onder	3439cb0049	gh-66143: Allow copying and pickling of CodecInfo object (GH-109235) Co-authored-by: Robert Lehmann <mail@robertlehmann.de>	2023-09-29 20:07:09 +03:00
Serhiy Storchaka	d6892c2b92	gh-50644: Forbid pickling of codecs streams (GH-109180) Attempts to pickle or create a shallow or deep copy of codecs streams now raise a TypeError. Previously, copying failed with a RecursionError, while pickling produced wrong results that eventually caused unpickling to fail with a RecursionError.	2023-09-10 20:06:09 +03:00
Nikita Sobolev	6e6a4cd523	gh-106300: Improve `assertRaises(Exception)` usages in tests (GH-106302)	2023-07-07 13:42:40 -07:00
Irit Katriel	76350e85eb	gh-102406: replace exception chaining by PEP-678 notes in codecs (#102407)	2023-03-21 21:36:31 +00:00
Gregory P. Smith	d315722564	gh-98433: Fix quadratic time idna decoding. (#99092) There was an unnecessary quadratic loop in idna decoding. This restores the behavior to linear. This also adds an early length check in IDNA decoding to outright reject huge inputs early on given the ultimate result is defined to be 63 or fewer characters.	2022-11-07 16:54:41 -08:00
Stanley	d9407b174c	gh-51511: Note that codecs.open()'s encoding parameter affects automatic conversion to binary mode (#94370)	2022-10-21 16:01:05 -07:00
Victor Stinner	3ceb4b8d3a	gh-84623: Remove unused imports in tests (#93772)	2022-06-13 16:56:03 +02:00
Serhiy Storchaka	3483299a24	gh-81548: Deprecate octal escape sequences with value larger than 0o377 (GH-91668)	2022-04-30 13:16:27 +03:00
Nikita Sobolev	6c83c8e6b5	bpo-46198: rename duplicate tests and remove unused code (GH-30297)	2022-03-10 08:20:11 -08:00
Victor Stinner	ccbe8045fa	bpo-46659: Fix the MBCS codec alias on Windows (GH-31218)	2022-02-22 22:04:07 +01:00
Victor Stinner	04dd60e50c	bpo-46659: Update the test on the mbcs codec alias (GH-31168) encodings registers the _alias_mbcs() codec search function before the search_function() codec search function. Previously, the _alias_mbcs() was never used. Fix the test_codecs.test_mbcs_alias() test: use the current ANSI code page, not a fake ANSI code page number. Remove the test_site.test_aliasing_mbcs() test: the alias is now implemented in the encodings module, no longer in the site module.	2022-02-06 21:50:09 +01:00
Victor Stinner	ea1a54506b	bpo-46303: Move fileutils.h private functions to internal C API (GH-30484) Move almost all private functions of Include/cpython/fileutils.h to the internal C API Include/internal/pycore_fileutils.h. Only keep _Py_fopen_obj() in Include/cpython/fileutils.h, since it's used by _testcapi which must not use the internal C API. Move EncodeLocaleEx() and DecodeLocaleEx() functions from _testcapi to _testinternalcapi, since the C API moved to the internal C API.	2022-01-11 11:56:16 +01:00
Christian Heimes	e73283a20f	bpo-45668: Fix PGO tests without test extensions (GH-29315)	2021-11-01 11:14:53 +01:00
Serhiy Storchaka	39aa98346d	bpo-45467: Fix IncrementalDecoder and StreamReader in the "raw-unicode-escape" codec (GH-28944) They now support splitting escape sequences between input chunks. Add the third parameter "final" in codecs.raw_unicode_escape_decode(). It is True by default to match the former behavior.	2021-10-14 20:04:19 +03:00
Serhiy Storchaka	c96d1546b1	bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) They now support splitting escape sequences between input chunks. Add the third parameter "final" in codecs.unicode_escape_decode(). It is True by default to match the former behavior.	2021-10-14 13:17:00 +03:00
Victor Stinner	19ba2122ac	bpo-37330: open() no longer accept 'U' in file mode (GH-28118) open(), io.open(), codecs.open() and fileinput.FileInput no longer accept "U" ("universal newline") in the file mode. This flag was deprecated since Python 3.3.	2021-09-02 12:58:00 +02:00
Max Bernstein	3635388f52	bpo-42065: Fix incorrectly formatted _codecs.charmap_decode error message (GH-19940)	2020-10-17 23:38:21 +03:00
Hai Shi	c9f696cb96	bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513) * Move the codecs' (un)register operation to testcases. * Remove _codecs._forget_codec() and _PyCodec_Forget()	2020-10-16 10:34:15 +02:00
Hai Shi	c5b049b91c	bpo-39337: encodings.normalize_encoding() now ignores non-ASCII characters (GH-22219)	2020-10-14 17:43:31 +02:00
Hai Shi	3f342376ab	bpo-39337: Add a test case for normalizing of codec names (GH-19069)	2020-10-08 21:20:57 +02:00
Hai Shi	d332e7b816	bpo-41842: Add codecs.unregister() function (GH-22360) Add codecs.unregister() and PyCodec_Unregister() functions to unregister a codec search function.	2020-09-28 23:41:11 +02:00
Victor Stinner	0ee0b2938c	bpo-41521: Replace whitelist/blacklist with allowlist/denylist (GH-21823) Rename 5 test method names in test_codecs and test_typing.	2020-08-11 15:28:43 +02:00
Hai Shi	4660597b51	bpo-40275: Use new test.support helper submodules in tests (GH-21448)	2020-08-03 18:49:18 +02:00
Victor Stinner	942f7a2dea	bpo-39674: Revert "bpo-37330: open() no longer accept 'U' in file mode (GH-16959)" (GH-18767) This reverts commit `e471e72977`. The mode will be removed from Python 3.10.	2020-03-04 18:50:22 +01:00
Chris A	2565edec2c	bpo-38971: Open file in codecs.open() closes if exception raised. (GH-17666) Open issue in the BPO indicated a desire to make the implementation of codecs.open() at parity with io.open(), which implements a try/except to assure file stream gets closed before an exception is raised.	2020-03-02 08:39:50 +02:00
Berker Peksag	ba22e8f174	bpo-30566: Fix IndexError when using punycode codec (GH-18632) Trying to decode an invalid string with the punycode codec should raise UnicodeError.	2020-02-25 06:19:03 +03:00
Pablo Galindo	293dd23477	Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) Capturing exceptions into names can lead to reference cycles through the __traceback__ attribute of the exceptions in some obscure cases that have been reported previously and fixed individually. As these variables are not used anyway, we can remove the binding to reduce the chances of creating reference cycles. See for example GH-13135	2019-11-19 21:34:03 +00:00
Victor Stinner	e471e72977	bpo-37330: open() no longer accept 'U' in file mode (GH-16959) open(), io.open(), codecs.open() and fileinput.FileInput no longer accept "U" ("universal newline") in the file mode. This flag was deprecated since Python 3.3.	2019-10-28 15:40:08 +01:00
Zeth	b3b48c81f0	bpo-37876: Tests for ROT-13 codec (GH-15314) The Rot-13 codec is for educational use but does not have unit tests, dragging down test coverage. This adds a few very simple tests.	2019-09-09 07:50:36 -07:00
Steve Dower	7ebdda0dbe	bpo-36311: Fixes decoding multibyte characters around chunk boundaries and improves decoding performance (GH-15083)	2019-08-21 16:22:33 -07:00
Victor Stinner	8f4ef3b019	Remove unused imports in tests (GH-14518)	2019-07-01 18:28:25 +02:00
Serhiy Storchaka	894263ba80	bpo-24214: Fixed the UTF-8 and UTF-16 incremental decoders. (GH-14304) * The UTF-8 incremental decoder now fails fast if it encounters a sequence that can't be handled by the error handler. * The UTF-16 incremental decoder with the surrogatepass error handler now decodes a lone low surrogate with final=False.	2019-06-25 11:54:18 +03:00
Victor Stinner	ca612a9728	bpo-36778: Remove outdated comment from CodePageTest (GH-13807) CP65001Test has been removed.	2019-06-04 17:09:10 +02:00
Ammar Askar	a6ec1ce1ac	bpo-33361: Fix bug with seeking in StreamRecoders (GH-8278)	2019-05-31 22:44:00 +03:00
Jelle Zijlstra	b3be407288	bpo-33482: fix codecs.StreamRecoder.writelines (GH-6779) A very simple fix. I found this while writing typeshed stubs for StreamRecoder. https://bugs.python.org/issue33482	2019-05-22 08:18:26 -07:00
Victor Stinner	d267ac20c3	bpo-36778: cp65001 encoding becomes an alias to utf_8 (GH-13230)	2019-05-10 03:19:54 +02:00
Paul Monson	62dfd7d6fe	bpo-35920: Windows 10 ARM32 platform support (GH-11774)	2019-04-25 18:36:45 +00:00
Serhiy Storchaka	7a465cb5ee	bpo-24214: Fixed the UTF-8 incremental decoder. (GH-12603) The bug occurred when the encoded surrogate character is passed to the incremental decoder in two chunks.	2019-03-30 08:23:38 +02:00
Serhiy Storchaka	c1e2c288f4	bpo-36312: Fix decoders for some code pages. (GH-12369)	2019-03-20 21:45:18 +02:00
Inada Naoki	6a16b18224	bpo-36297: remove "unicode_internal" codec (GH-12342)	2019-03-18 15:44:11 +09:00
Serhiy Storchaka	5b10b98247	bpo-22831: Use "with" to avoid possible fd leaks in tests (part 2). (GH-10929)	2019-03-05 10:06:26 +02:00
Serhiy Storchaka	4013c17911	bpo-35372: Fix the code page decoder for input > 2 GiB. (GH-10848)	2018-12-03 10:36:45 +02:00
Victor Stinner	bde9d6bbb4	bpo-34523, bpo-35322: Fix unicode_encode_locale() (GH-10759) Fix memory leak in PyUnicode_EncodeLocale() and PyUnicode_EncodeFSDefault() on error handling. Changes: * Fix unicode_encode_locale() error handling * Fix test_codecs.LocaleCodecTest	2018-11-28 10:26:20 +01:00
Victor Stinner	3d4226a832	bpo-34523: Support surrogatepass in locale codecs (GH-8995) Add support for the "surrogatepass" error handler in PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault() for the UTF-8 encoding. Changes: * _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the surrogatepass error handler (_Py_ERROR_SURROGATEPASS). * _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use the _Py_error_handler enum instead of "int surrogateescape" to pass the error handler. These functions now return -3 if the error handler is unknown. * Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() in test_codecs. * Rename get_error_handler() to _Py_GetErrorHandler() and expose it as a private function. * _freeze_importlib doesn't need config.filesystem_errors="strict" workaround anymore.	2018-08-29 22:21:32 +02:00
Zackery Spytz	e349bf2358	bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (GH-8741) The UTF-7 decoder now raises UnicodeDecodeError for ill-formed sequences starting with "+" (as specified in RFC 2152).	2018-08-19 07:43:38 +03:00
Victor Stinner	91106cd9ff	bpo-29240: PEP 540: Add a new UTF-8 Mode (#855) * Add -X utf8 command line option, PYTHONUTF8 environment variable and a new sys.flags.utf8_mode flag. * If the LC_CTYPE locale is "C" at startup: enable automatically the UTF-8 mode. * Add _winapi.GetACP(). encodings._alias_mbcs() now calls _winapi.GetACP() to get the ANSI code page * locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8 mode. As a side effect, open() now uses the UTF-8 encoding by default in this mode. * Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding in the UTF-8 Mode. * Update subprocess._args_from_interpreter_flags() to handle -X utf8 * Skip some tests relying on the current locale if the UTF-8 mode is enabled. * Add test_utf8mode.py. * _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to return also the length (number of wide characters). * pymain_get_global_config() and pymain_set_global_config() now always copy flag values, rather than only copying if the new value is greater than the old value.	2017-12-13 12:29:09 +01:00

269 Commits