Serhiy Storchaka
219c2de5ad
bpo-32110: codecs.StreamReader.read(n) now returns not more than n ( #4499 )
...
characters/bytes for non-negative n. This makes it compatible with
read() methods of other file-like objects.
2017-11-29 01:30:00 +02:00
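A minimal sketch of the behavior described in the commit above, assuming Python 3.7 or later and an in-memory byte stream chosen purely for illustration:

```python
import codecs
import io

# Wrap a byte stream in a UTF-8 StreamReader (the sample data is illustrative).
reader = codecs.getreader("utf-8")(io.BytesIO(b"abcdef"))

first = reader.read(3)   # returns no more than 3 characters
second = reader.read(3)  # continues where the previous call stopped

print(repr(first), repr(second))
```

With a single positional argument, the call now behaves like read(n) on other file-like objects.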
Serhiy Storchaka
56cb465cc9
bpo-31825: Fixed OverflowError in the 'unicode-escape' codec ( #4058 )
...
and in codecs.escape_decode() when decoding an escaped non-ASCII byte.
2017-10-20 17:08:15 +03:00
Berker Peksag
7b4bcd2004
Issue #25270 : Merge from 3.5
2016-09-16 17:32:06 +03:00
Berker Peksag
4a72a7b6c4
Issue #25270 : Prevent codecs.escape_encode() from raising SystemError when an empty bytestring is passed
2016-09-16 17:31:06 +03:00
R David Murray
110b6fecbb
#27364 : Deprecate invalid escape strings in str/bytes.
...
Patch by Emanuel Barry, reviewed by Serhiy Storchaka and Martin Panter.
2016-09-08 15:34:08 -04:00
R David Murray
44b548dda8
#27364 : fix "incorrect" uses of escape character in the stdlib.
...
And most of the tools.
Patch by Emanuel Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
2016-09-08 13:59:53 -04:00
Steve Dower
f5aba58480
Issue #27959 : Adds oem encoding, alias ansi to mbcs, move aliasmbcs to codec lookup
2016-09-06 19:42:27 -07:00
Serhiy Storchaka
e437a10d15
Issue #23277 : Remove unused imports in tests.
2016-04-24 21:41:02 +03:00
Martin Panter
8b04a945ef
Merge typo fixes from 3.5
2016-04-16 09:29:17 +00:00
Martin Panter
119e502277
Fix typos in code comments and documentation
2016-04-16 09:28:57 +00:00
Martin Panter
cda80940ed
Issue #15984 : Merge PyUnicode doc from 3.5
2016-04-15 02:27:11 +00:00
Martin Panter
6245cb3c01
Correct “an” → “a” with “Unicode”, “user”, “UTF”, etc
...
This affects documentation, code comments, and debugging messages.
2016-04-15 02:14:19 +00:00
Martin Panter
e56a919100
Issue #25523 : Merge a-to-an corrections from 3.5
2015-11-02 04:27:17 +00:00
Martin Panter
2eb819f7a8
Issue #25523 : Merge "a" to "an" fixes from 3.4 into 3.5
2015-11-02 04:04:57 +00:00
Martin Panter
7462b64911
Issue #25523 : Correct "a" article to "an" article
...
This changes the main documentation, doc strings, source code comments, and a
couple error messages in the test suite. In some cases the word was removed
or edited some other way to fix the grammar.
2015-11-02 03:37:02 +00:00
Victor Stinner
797485e101
Issue #25318 : Avoid sprintf() in backslashreplace()
...
Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors().
Add also unit tests for non-BMP characters.
2015-10-09 03:17:30 +02:00
Victor Stinner
1d65d9192d
Issue #25301 : The UTF-8 decoder is now up to 15 times as fast for error
...
handlers: ``ignore``, ``replace`` and ``surrogateescape``.
2015-10-05 13:43:50 +02:00
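For reference, a short sketch of what the three error handlers named in the commit above do when the UTF-8 decoder meets an invalid byte. The sample input is an assumption made for illustration; the commit itself concerns only speed, not behavior:

```python
# 0xFF can never appear in well-formed UTF-8, so it triggers the error handler.
data = b"a\xffb"

ignored = data.decode("utf-8", "ignore")           # drops the bad byte: 'ab'
replaced = data.decode("utf-8", "replace")         # inserts U+FFFD: 'a\ufffdb'
escaped = data.decode("utf-8", "surrogateescape")  # maps 0xFF to U+DCFF: 'a\udcffb'

print(repr(ignored), repr(replaced), repr(escaped))
```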
Serhiy Storchaka
29e68edbf4
Issue #24848 : Fixed bugs in UTF-7 decoding of malformed data:
...
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
2015-10-02 13:14:03 +03:00
Serhiy Storchaka
58c8f2bb6d
Issue #24848 : Fixed bugs in UTF-7 decoding of malformed data:
...
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
2015-10-02 13:13:14 +03:00
Serhiy Storchaka
28b21e50c8
Issue #24848 : Fixed bugs in UTF-7 decoding of malformed data:
...
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
2015-10-02 13:07:28 +03:00
Victor Stinner
01ada3996b
Issue #25267 : The UTF-8 encoder is now up to 75 times as fast for error
...
handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
Patch co-written with Serhiy Storchaka.
2015-10-01 21:54:51 +02:00
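The encoder-side handlers listed in the commit above can be illustrated as follows. This is only a sketch with made-up input strings; a lone surrogate is used because it is the usual reason the UTF-8 encoder invokes an error handler:

```python
# A lone surrogate cannot be encoded as strict UTF-8.
text = "a\ud800b"

ignored = text.encode("utf-8", "ignore")         # b'ab'
replaced = text.encode("utf-8", "replace")       # b'a?b'
passed = text.encode("utf-8", "surrogatepass")   # b'a\xed\xa0\x80b'

# surrogateescape turns U+DC80..U+DCFF back into the original raw bytes.
escaped = "a\udcffb".encode("utf-8", "surrogateescape")  # b'a\xffb'

print(ignored, replaced, passed, escaped)
```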
Victor Stinner
c3713e9706
Optimize ascii/latin1+surrogateescape encoders
...
Issue #25227 : Optimize ASCII and latin1 encoders with the ``surrogateescape``
error handler: the encoders are now up to 3 times as fast.
Initial patch written by Serhiy Storchaka.
2015-09-29 12:32:13 +02:00
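As an illustration of the code path optimized in the commit above, here is a small round trip through the ASCII and latin1 encoders with ``surrogateescape``. The byte strings are arbitrary examples, not taken from the test suite:

```python
raw = b"caf\xe9"  # not valid ASCII because of the 0xE9 byte

# Decoding smuggles the undecodable byte into the string as U+DCE9.
text = raw.decode("ascii", "surrogateescape")

# Encoding with the same handler restores the original bytes.
restored = text.encode("ascii", "surrogateescape")

# The latin1 encoder accepts the same handler.
latin = "abc\udc80".encode("latin-1", "surrogateescape")

print(repr(text), restored, latin)
```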
Victor Stinner
f96418de05
Issue #24870 : Optimize the ASCII decoder for error handlers: surrogateescape,
...
ignore and replace. Initial patch written by Naoki Inada.
The decoder is now up to 60 times as fast for these error handlers.
Add also unit tests for the ASCII decoder.
2015-09-21 23:06:27 +02:00
Martin Panter
9ab96946ee
Issue #16473 : Merge codecs doc and test from 3.4 into 3.5
2015-09-12 01:22:17 +00:00
Martin Panter
06171bd52a
Issue #16473 : Fix byte transform codec documentation; test quotetabs=True
...
This changes the equivalent functions listed for the Base-64, hex and Quoted-
Printable codecs to reflect the functions actually used. Also mention and
test the "quotetabs" setting for Quoted-Printable encoding.
2015-09-12 00:34:28 +00:00
Serhiy Storchaka
f0eeedf0d8
Issue #22681 : Added support for the koi8_t encoding.
2015-05-12 23:24:19 +03:00
Serhiy Storchaka
ad8a1c3fb2
Issue #22682 : Added support for the kz1048 encoding.
2015-05-12 23:16:55 +03:00
Serhiy Storchaka
8490f5acfe
Issue #23001 : A few functions in modules mmap, ossaudiodev, socket, ssl, and
...
codecs, that accepted only read-only bytes-like objects now accept writable
bytes-like objects too.
2015-03-20 09:00:36 +02:00
Victor Stinner
f2be23d329
Issue #22286 , #23321 : Fix failing test on Windows code page 932
...
There was a bug which was fixed. The unit test was also wrong.
2015-01-26 23:26:11 +01:00
Serhiy Storchaka
07985ef387
Issue #22286 : The "backslashreplace" error handler now works with
...
decoding and translating.
2015-01-25 22:56:57 +02:00
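A brief sketch of the decoding case mentioned in the commit above, assuming Python 3.5 or later (earlier versions accept "backslashreplace" only when encoding):

```python
# The undecodable byte is replaced by a backslash escape in the result string.
decoded = b"caf\xe9".decode("ascii", "backslashreplace")

print(decoded)  # caf\xe9
```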
Nick Coghlan
582acb75e9
Merge issue 19548 changes from 3.4
2015-01-07 00:37:01 +10:00
Nick Coghlan
b9fdb7a452
Issue 19548: update codecs module documentation
...
- clarified the distinction between text encodings and other codecs
- clarified relationship with builtin open and the io module
- consolidated documentation of error handlers into one section
- clarified type constraints of some behaviours
- added tests for some of the new statements in the docs
2015-01-07 00:22:00 +10:00
Serhiy Storchaka
f65d1d3b02
Issue #23071 : "namereplace_errors" was added only in 3.5.
2014-12-20 18:53:01 +02:00
Serhiy Storchaka
4d33ff6183
Issue #23071 : Added missing names to codecs.__all__. Patch by Martin Panter.
2014-12-20 17:46:05 +02:00
Serhiy Storchaka
de3ee5b94f
Issue #23071 : Added missing names to codecs.__all__. Patch by Martin Panter.
2014-12-20 17:42:38 +02:00
Serhiy Storchaka
166ebc4e5d
Issue #19676 : Added the "namereplace" error handler.
2014-11-25 13:57:17 +02:00
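The handler added in the commit above can be tried as follows; this is a sketch that assumes Python 3.5 or later and an arbitrary sample string:

```python
# Characters the target encoding cannot represent become \N{...} escapes.
encoded = "\u20ac 10".encode("ascii", "namereplace")

print(encoded)  # b'\\N{EURO SIGN} 10'
```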
Serhiy Storchaka
85e7066278
Issue #22406 : Fixed the uu_codec codec incorrectly ported to 3.x.
...
Based on patch by Martin Panter.
2014-11-07 14:06:19 +02:00
Serhiy Storchaka
519114df42
Issue #22406 : Fixed the uu_codec codec incorrectly ported to 3.x.
...
Based on patch by Martin Panter.
2014-11-07 14:04:37 +02:00
Nick Coghlan
a0f33759fa
Merge fix for issue #22166 from 3.4
2014-09-15 23:55:16 +12:00
Nick Coghlan
8fad1676a2
Issue #22166 : clear codec caches in test_codecs
2014-09-15 23:50:44 +12:00
Victor Stinner
0d4e01ca07
Issue #13916 : Fix surrogatepass error handler on Windows
2014-05-16 14:46:20 +02:00
Serhiy Storchaka
88d8fb6af6
Issue #13916 : Disallowed the surrogatepass error handler for non UTF-*
...
encodings.
2014-05-15 14:37:42 +03:00
Victor Stinner
a57dfd033c
Issue #21488 : Add support of keyword arguments for codecs.encode and codecs.decode
2014-05-14 17:13:14 +02:00
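A minimal example of the keyword form introduced by the commit above, assuming Python 3.4 or later; the sample values are illustrative only:

```python
import codecs

# Both functions accept "encoding" and "errors" as keyword arguments.
encoded = codecs.encode("caf\xe9", encoding="latin-1", errors="strict")
decoded = codecs.decode(b"caf\xe9", encoding="ascii", errors="replace")

print(encoded, repr(decoded))
```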
Victor Stinner
07beb375b7
Issue #20574 : Remove duplicated test failing on Windows XP
2014-03-18 01:40:22 +01:00
Victor Stinner
f8cbf78bbd
Issue #20574 : Add more tests for cp65001
2014-03-17 23:16:02 +01:00
Victor Stinner
7d00cc1a64
Issue #20574 : Implement incremental decoder for cp65001 code
...
(Windows code page 65001, Microsoft UTF-8).
2014-03-17 23:08:06 +01:00
Victor Stinner
3633ce3301
Issue #20571 : skip test_readline() of test_codecs for Windows code page 65001.
...
The decoder does not support partial decoding yet for this code page.
2014-02-09 13:11:53 +01:00
Serhiy Storchaka
6cbf151032
Issue #20538 : UTF-7 incremental decoder produced inconsistent string when
...
input was truncated in BASE64 section.
2014-02-08 14:06:33 +02:00
Serhiy Storchaka
016a3f33a5
Issue #20538 : UTF-7 incremental decoder produced inconsistent string when
...
input was truncated in BASE64 section.
2014-02-08 14:01:29 +02:00
Nick Coghlan
96252cd724
Issue 20542: Temporarily skip failing test
2014-02-07 23:34:41 +10:00
Serhiy Storchaka
f28ba369dd
Issue #20532 : Tests which use _testcapi now are marked as CPython only.
2014-02-07 10:10:55 +02:00
Serhiy Storchaka
5cfc79deae
Issue #20532 : Tests which use _testcapi now are marked as CPython only.
2014-02-07 10:06:39 +02:00
Serhiy Storchaka
3dcb0cf9b1
Issue #20520 : Fixed readline test in test_codecs.
2014-02-06 09:27:28 +02:00
Serhiy Storchaka
5b4fab1ad7
Issue #20520 : Fixed readline test in test_codecs.
2014-02-06 09:26:56 +02:00
Serhiy Storchaka
dbe0982bc5
Issue #8260 : The read(), readline() and readlines() methods of
...
codecs.StreamReader returned incomplete data when called after
readline() or read(size). Based on patch by Amaury Forgeot d'Arc.
2014-01-26 19:27:56 +02:00
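The scenario fixed by the commit above can be reproduced in a few lines. This sketch assumes a Python version that includes the fix and uses an in-memory stream with made-up content:

```python
import codecs
import io

stream = io.BytesIO(b"line1\nline2\nline3\n")
reader = codecs.getreader("utf-8")(stream)

# readline() buffers more than one line internally...
first_line = reader.readline()

# ...and a following read() must return everything that is still buffered.
rest = reader.read()

print(repr(first_line), repr(rest))
```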
Serhiy Storchaka
8003850e22
Issue #8260 : The read(), readline() and readlines() methods of
...
codecs.StreamReader returned incomplete data when called after
readline() or read(size). Based on patch by Amaury Forgeot d'Arc.
2014-01-26 19:21:00 +02:00
Nick Coghlan
77b286b2cc
Close #20105 : set __traceback__ when chaining exceptions in C
2014-01-27 00:53:38 +10:00
Zachary Ware
efa2e04033
Issue #19619 : skip zlib error test when zlib not available
2013-12-30 14:54:11 -06:00
Serhiy Storchaka
2480c2ed59
Issue #15204 : Silence and check the 'U' mode deprecation warnings in tests.
...
Changed deprecation message in the fileinput module.
2013-11-24 23:13:26 +02:00
Serhiy Storchaka
be0c3250b1
Issue #19668 : Added support for the cp1125 encoding.
2013-11-23 18:52:23 +02:00
Nick Coghlan
9c1aed8f94
Close #7475 : Restore binary & text transform codecs
...
The codecs themselves were restored in Python 3.2; this
completes the restoration by adding back the convenience
aliases.
These aliases were originally left out due to confusing
errors when attempting to use them with the text encoding
specific convenience methods. Python 3.4 includes several
improvements to those errors, thus permitting the aliases
to be restored as well.
2013-11-23 11:13:36 +10:00
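A short sketch of the restored convenience aliases described in the commit above, assuming Python 3.4 or later. The aliases are used through the type-neutral functions in the codecs module:

```python
import codecs

# Bytes-to-bytes transforms via their short aliases.
b64 = codecs.encode(b"abc", "base64")   # b'YWJj\n'
hexed = codecs.encode(b"abc", "hex")    # b'616263'

# A str-to-str transform.
rot = codecs.encode("abc", "rot13")     # 'nop'

# Decoding reverses the transform.
roundtrip = codecs.decode(b64, "base64")

print(b64, hexed, rot, roundtrip)
```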
Nick Coghlan
c72e4e6dcc
Issue #19619 : Blacklist non-text codecs in method API
...
str.encode, bytes.decode and bytearray.decode now use an
internal API to throw LookupError for known non-text encodings,
rather than attempting the encoding or decoding operation and
then throwing a TypeError for an unexpected output type.
The latter mechanism remains in place for third party non-text
encodings.
2013-11-22 22:39:36 +10:00
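The effect of the commit above can be observed as follows. This is a sketch assuming Python 3.4 or later; the exact wording of the error message may differ between versions, so only the exception type is relied on:

```python
import codecs

# str.encode() refuses known non-text codecs up front.
try:
    "abc".encode("hex")
    raised = False
except LookupError as exc:
    raised = True
    message = str(exc)

# The type-neutral function still performs the transform.
hexed = codecs.encode(b"abc", "hex")

print(raised, hexed)
```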
Nick Coghlan
f1de55fb33
Also chain codec exceptions that allow weakrefs
...
The zlib and hex codecs throw custom exception types with
weakref support if the input type is valid, but the data
fails validation. Make sure the exception chaining in the
codec infrastructure can wrap those as well.
2013-11-19 22:33:10 +10:00
Serhiy Storchaka
58cf607d13
Issue #12892 : The utf-16* and utf-32* codecs now reject (lone) surrogates.
...
The utf-16* and utf-32* encoders no longer allow surrogate code points
(U+D800-U+DFFF) to be encoded.
The utf-32* decoders no longer decode byte sequences that correspond to
surrogate code points.
The surrogatepass error handler now works with the utf-16* and utf-32* codecs.
Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
2013-11-19 11:32:41 +02:00
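A minimal sketch of the encoder behavior described in the commit above, assuming Python 3.4 or later and using utf-16-le so that no byte order mark appears in the output:

```python
lone = "\ud800"  # a lone high surrogate

# Strict encoding now rejects surrogate code points.
try:
    lone.encode("utf-16-le")
    rejected = False
except UnicodeEncodeError:
    rejected = True

# The surrogatepass handler lets the code unit through unchanged.
passed = lone.encode("utf-16-le", "surrogatepass")

print(rejected, passed)
```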
Nick Coghlan
4e553e2e52
Avoid triggering the refleak detector
2013-11-16 00:35:34 +10:00
Nick Coghlan
c4c2580d43
Close 19609: narrow scope of codec exc chaining
2013-11-15 21:47:37 +10:00
Nick Coghlan
8b097b4ed7
Close #17828 : better handling of codec errors
...
- output type errors now redirect users to the type-neutral
convenience functions in the codecs module
- stateless errors that occur during encoding and decoding
will now be automatically wrapped in exceptions that give
the name of the codec involved
2013-11-13 23:49:21 +10:00
Serhiy Storchaka
0e071c967c
Fixed tests for issue #19279 .
2013-10-19 21:14:57 +03:00
Serhiy Storchaka
55e092f545
Issue #19279 : UTF-7 decoder no longer produces illegal strings.
2013-10-19 20:39:28 +03:00
Serhiy Storchaka
35804e4c63
Issue #19279 : UTF-7 decoder no longer produces illegal strings.
2013-10-19 20:38:19 +03:00
Nick Coghlan
fdf239a855
Close #17839 : support bytes-like objects in base64 module
...
This mostly affected the encodebytes and decodebytes functions
(which are used by base64_codec)
Also added a test to ensure all bytes-bytes codecs can handle
memoryview input and tests for handling of multidimensional
and non-bytes format input in the modern base64 API.
2013-10-03 00:43:22 +10:00
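To illustrate the commit above, here is a sketch that passes non-bytes buffer objects to the base64 functions and to the codec that wraps them. It assumes Python 3.4 or later, and the payload is arbitrary:

```python
import base64
import codecs

payload = b"abc"

# Any bytes-like object is accepted, not just bytes.
from_bytearray = base64.encodebytes(bytearray(payload))
from_view = base64.encodebytes(memoryview(payload))

# The base64 codec goes through the same functions.
via_codec = codecs.encode(memoryview(payload), "base64")

print(from_bytearray, from_view, via_codec)
```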
Serhiy Storchaka
7b07873b93
Add tests for raw-unicode-escape codec.
2013-01-29 11:41:34 +02:00
Serhiy Storchaka
799fd9c877
Add tests for raw-unicode-escape codec.
2013-01-29 11:41:01 +02:00
Serhiy Storchaka
c9c4338e2b
Add tests for raw-unicode-escape codec.
2013-01-29 11:40:00 +02:00
Serhiy Storchaka
d8f07cd374
Clean up escape-decode decoder tests.
2013-01-29 11:08:06 +02:00
Serhiy Storchaka
db6add7d71
Clean up escape-decode decoder tests.
2013-01-29 11:07:27 +02:00
Serhiy Storchaka
077cb347a9
Clean up escape-decode decoder tests.
2013-01-29 11:06:53 +02:00
Serhiy Storchaka
8fe5a9f9c3
Issue #16979 : Fix error handling bugs in the unicode-escape-decode decoder.
2013-01-29 10:37:39 +02:00
Serhiy Storchaka
24193debd4
Issue #16979 : Fix error handling bugs in the unicode-escape-decode decoder.
2013-01-29 10:28:07 +02:00
Serhiy Storchaka
d679377be7
Issue #16979 : Fix error handling bugs in the unicode-escape-decode decoder.
2013-01-29 10:20:44 +02:00
Serhiy Storchaka
f584aba3a5
Issue #16975 : Fix error handling bug in the escape-decode bytes decoder.
2013-01-25 23:33:22 +02:00
Serhiy Storchaka
e58785b200
Issue #16975 : Fix error handling bug in the escape-decode bytes decoder.
2013-01-25 23:32:41 +02:00
Serhiy Storchaka
ace3ad3bf7
Issue #16975 : Fix error handling bug in the escape-decode bytes decoder.
2013-01-25 23:31:43 +02:00
Serhiy Storchaka
55e2cb497b
Issue #14850 : Now a charmap decoder treats U+FFFE as "undefined mapping"
...
in any mapping, not only in a unicode string.
2013-01-15 15:30:04 +02:00
Serhiy Storchaka
45d16d9924
Issue #14850 : Now a charmap decoder treats U+FFFE as "undefined mapping"
...
in any mapping, not only in a unicode string.
2013-01-15 15:01:20 +02:00
Serhiy Storchaka
4fb8caee87
Issue #14850 : Now a charmap decoder treats U+FFFE as "undefined mapping"
...
in any mapping, not only in a unicode string.
2013-01-15 14:43:21 +02:00
Ezio Melotti
aabd0b0312
#16918 : merge with 3.3.
2013-01-11 06:05:51 +02:00
Ezio Melotti
5d3dba0d27
#16918 : test_codecs now works with unittest test discovery. Patch by Zachary Ware.
2013-01-11 06:02:07 +02:00
Ezio Melotti
e0b87edd7f
Merge fix for broken/disabled test.
2013-01-11 05:57:58 +02:00
Ezio Melotti
26ed234052
Enable a broken test and fix it.
2013-01-11 05:54:57 +02:00
Serhiy Storchaka
24a3ef6999
Issue #11461 : Fix the incremental UTF-16 decoder. Original patch by
...
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:41:55 +02:00
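The kind of partial decoding covered by the tests mentioned in the commit above can be sketched like this. The example feeds a non-BMP character to the incremental decoder one byte at a time; the chosen character is arbitrary:

```python
import codecs

decoder = codecs.getincrementaldecoder("utf-16-le")()

# U+10000 needs a surrogate pair, i.e. four bytes in UTF-16.
data = "\U00010000".encode("utf-16-le")

# Feed one byte per call; the decoder must hold back incomplete input.
pieces = [decoder.decode(data[i:i + 1]) for i in range(len(data))]
pieces.append(decoder.decode(b"", final=True))

result = "".join(pieces)
print(len(data), repr(result))
```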
Serhiy Storchaka
ae3b32ad6b
Issue #11461 : Fix the incremental UTF-16 decoder. Original patch by
...
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:40:52 +02:00
Serhiy Storchaka
48e188e573
Issue #11461 : Fix the incremental UTF-16 decoder. Original patch by
...
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:14:24 +02:00
Andrew Svetlov
2606a6f197
Issue #16719 : Get rid of WindowsError. Use OSError instead
...
Patch by Serhiy Storchaka.
2012-12-19 14:33:35 +02:00
Ezio Melotti
a0b5c46fa2
#16336 : merge with 3.2.
2012-11-03 23:04:41 +02:00
Ezio Melotti
540da76115
#16336 : fix input checking in the surrogatepass error handler. Patch by Serhiy Storchaka.
2012-11-03 23:03:39 +02:00
Philip Jenvey
5f9459fbed
merge with 3.2
2012-10-26 17:05:09 -07:00
Philip Jenvey
45c41494bf
bounds check for bad data (thanks amaury)
2012-10-26 17:01:53 -07:00
Antoine Pitrou
a1f7655fa7
Issue #15379 : Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
...
Patch by Serhiy Storchaka.
2012-09-23 20:00:04 +02:00
Antoine Pitrou
6f80f5d444
Issue #15379 : Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
...
Patch by Serhiy Storchaka.
2012-09-23 19:55:21 +02:00