Commit Graph

158 Commits

Author SHA1 Message Date
Serhiy Storchaka 519114df42 Issue #22406: Fixed the uu_codec codec incorrectly ported to 3.x.
Based on patch by Martin Panter.
2014-11-07 14:04:37 +02:00
Serhiy Storchaka a39938ff44 Issue #21171: Fixed undocumented filter API of the rot13 codec.
Patch by Berker Peksag.
2014-04-13 17:07:04 +03:00
R David Murray fb2c2db0fb Merge #7475: Remove references to '.transform' from transform codec docstrings. 2014-03-13 20:55:09 -04:00
R David Murray e5cb836d4c #7475: Remove references to '.transform' from transform codec docstrings. 2014-03-13 20:54:30 -04:00
R David Murray 47d083cf1a whatsnew: cp273 codec (#1097797)
Also updated the docs and added the aliases mentioned by the
references.
2014-03-07 21:00:34 -05:00
Serhiy Storchaka 94ee389308 Issue #19619: Blacklist non-text codecs in method API
str.encode, bytes.decode and bytearray.decode now use an
internal API to throw LookupError for known non-text encodings,
rather than attempting the encoding or decoding operation and
then throwing a TypeError for an unexpected output type.

The latter mechanism remains in place for third party non-text
encodings.

Backported changeset d68df99d7a57.
2014-02-24 14:43:03 +02:00
Serhiy Storchaka e7f87e1262 Fixed incorrectly applying a patch for issue19668. 2013-11-23 19:50:47 +02:00
Serhiy Storchaka be0c3250b1 Issue #19668: Added support for the cp1125 encoding. 2013-11-23 18:52:23 +02:00
Nick Coghlan 9c1aed8f94 Close #7475: Restore binary & text transform codecs
The codecs themselves were restored in Python 3.2, this
completes the restoration by adding back the convenience
aliases.

These aliases were originally left out due to confusing
errors when attempting to use them with the text encoding
specific convenience methods. Python 3.4 includes several
improvements to those errors, thus permitting the aliases
to be restored as well.
2013-11-23 11:13:36 +10:00
Nick Coghlan c72e4e6dcc Issue #19619: Blacklist non-text codecs in method API
str.encode, bytes.decode and bytearray.decode now use an
internal API to throw LookupError for known non-text encodings,
rather than attempting the encoding or decoding operation and
then throwing a TypeError for an unexpected output type.

The latter mechanism remains in place for third party non-text
encodings.
2013-11-22 22:39:36 +10:00
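
A minimal sketch of the behaviour this commit describes, assuming Python 3.4 or later. The helper name `encode_with_method` is hypothetical and exists only for illustration; `rot13` and `hex_codec` are standard-library codecs that are not text encodings.

```python
import codecs


def encode_with_method(text, encoding):
    """Return the result of str.encode, or the LookupError it raises."""
    try:
        return text.encode(encoding)
    except LookupError as exc:
        return exc


# rot13 maps str to str, so the str.encode method API rejects it up front.
method_result = encode_with_method("hello", "rot13")

# The codecs module functions still accept arbitrary codecs.
function_result = codecs.encode("hello", "rot13")
hex_result = codecs.decode(b"616263", "hex_codec")

print(type(method_result).__name__, function_result, hex_result)
```

The method API is reserved for text encodings, while `codecs.encode()` and `codecs.decode()` remain the general entry points for binary and text transforms.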
Andrew Kuchling ad8156e9b2 #1097797: Add CP273 codec, and exercise it in the test suite 2013-11-10 13:44:30 -05:00
Brett Cannon cd171c8e92 Issue #18200: Back out usage of ModuleNotFoundError (8d28d44f3a9a) 2013-07-04 17:43:24 -04:00
Brett Cannon 0a140668fa Issue #18200: Update the stdlib (except tests) to use
ModuleNotFoundError.
2013-06-13 20:57:26 -04:00
Victor Stinner 03c3e35d42 Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:
cp037, cp500 and iso8859_1 codecs
2013-04-09 21:53:09 +02:00
Antoine Pitrou 7e19337ebc Normalize whitespace 2012-06-16 22:50:54 +02:00
Antoine Pitrou aaefac76dd Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels.
Patch by Serhiy Storchaka.
2012-06-16 22:48:21 +02:00
Antoine Pitrou 9768676f6f Speed up IDNA for the common case 2011-11-10 22:49:20 +01:00
Florent Xicluna aabbda5354 Merge 3.2 2011-10-28 14:52:29 +02:00
Florent Xicluna 5d1155c08e Closes #13258: Use callable() built-in in the standard library. 2011-10-28 14:45:05 +02:00
Victor Stinner 2f3ca9f20e Close #13247: Add cp65001 codec, the Windows UTF-8 (CP_UTF8) 2011-10-27 01:38:56 +02:00
Victor Stinner b6f424043d Issue #10807: Remove base64, bz2, hex, quopri, rot13, uu and zlib codecs from
the codec aliases. They are still accessible via codecs.lookup().
2011-01-02 19:50:36 +00:00
Georg Brandl 7c23ea2e88 Don't use deprecated aliases. 2010-12-06 22:25:25 +00:00
Georg Brandl 02524629f3 #7475: add (un)transform method to bytes/bytearray and str, add back codecs that can be used with them from Python 2. 2010-12-02 18:06:51 +00:00
Florent Xicluna e01de8f2f3 remove pointless coding cookies 2010-08-30 14:05:50 +00:00
Marc-André Lemburg ff562506d4 Fix a typo in the alias target name for 'macintosh'. 2010-08-21 10:58:31 +00:00
Benjamin Peterson 23110e7361 alias macintosh to mac_roman #843590 2010-08-21 02:54:44 +00:00
Benjamin Peterson 5a6214afe2 Merged revisions 81499,81506 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81499 | georg.brandl | 2010-05-24 16:29:07 -0500 (Mon, 24 May 2010) | 1 line

  #8016: add the CP858 codec (approved by Benjamin).  (Also add CP720 to the tests, it was missing there.)
........
  r81506 | benjamin.peterson | 2010-05-24 17:04:53 -0500 (Mon, 24 May 2010) | 1 line

  set svn:eol-style
........
2010-06-27 22:41:29 +00:00
Victor Stinner a92ad7ee2c Merged revisions 81471-81472 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81471 | victor.stinner | 2010-05-22 15:37:56 +0200 (sam., 22 mai 2010) | 7 lines

  Issue #6268: More bugfixes about BOM, UTF-16 and UTF-32

   * Fix seek() method of codecs.open(), don't write the BOM twice after seek(0)
   * Fix reset() method of codecs, UTF-16, UTF-32 and StreamWriter classes
   * test_codecs: use "w+" mode instead of "wt+". "t" mode is not supported by
     Solaris or Windows, but does it really exist? I found it in the issue.
........
  r81472 | victor.stinner | 2010-05-22 15:44:25 +0200 (sam., 22 mai 2010) | 4 lines

  Fix my last commit (r81471) about codecs

  Remember: don't touch the code just before a commit
........
2010-05-22 16:59:09 +00:00
Benjamin Peterson 75ad1fc089 Merged revisions 78806 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78806 | benjamin.peterson | 2010-03-08 16:15:11 -0600 (Mon, 08 Mar 2010) | 1 line

  set svn:eol-style on various files
........
2010-03-08 22:17:58 +00:00
Brett Cannon 5f4ec0451c Fix a minor grammatical error. 2009-12-13 21:25:28 +00:00
Philip Jenvey 1309adb06a Merged revisions 76337 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r76337 | philip.jenvey | 2009-11-16 18:42:26 -0800 (Mon, 16 Nov 2009) | 2 lines

  #1757126: fix typo with the cyrillic_asian alias
........
2009-11-17 03:43:14 +00:00
Amaury Forgeot d'Arc d8840860df Oops, really pass a bytes string to the ctypes function. 2009-07-13 20:48:07 +00:00
Amaury Forgeot d'Arc 8b84ea0aa4 Merged revisions 74000-74001 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r74000 | amaury.forgeotdarc | 2009-07-13 22:01:11 +0200 (lun., 13 juil. 2009) | 4 lines

  #1616979: Add the cp720 (Arabic DOS) encoding.
  Since there is no official mapping file from unicode.org,
  the codec file is generated on Windows with the new genwincodec.py script.
........
  r74001 | amaury.forgeotdarc | 2009-07-13 22:03:21 +0200 (lun., 13 juil. 2009) | 2 lines

  NEWS entry for r74000.
........
2009-07-13 20:38:21 +00:00
Hye-Shik Chang 50d1f7935d #1276: Add temporary encoding aliases for non-supported Mac CJK
encodings that are detected as system defaults in MacOS with CJK
locales.  Will be replaced by properly-implemented codecs in 3.1.
2008-08-23 08:03:03 +00:00
Antoine Pitrou fd036451bf #2834: Change re module semantics, so that str and bytes mixing is forbidden,
and str (unicode) patterns get full unicode matching by default. The re.ASCII
flag is also introduced to ask for ASCII matching instead.
2008-08-19 17:56:33 +00:00
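
The semantics described in this commit can be sketched as follows, assuming a current Python 3. The sample string and the helper name `mixing_error` are illustrative choices, not part of the commit.

```python
import re

# A str pattern gets full Unicode matching by default, so \w matches "é".
unicode_match = re.fullmatch(r"\w+", "café")

# re.ASCII restricts \w to ASCII word characters, so the same match fails.
ascii_match = re.fullmatch(r"\w+", "café", re.ASCII)


def mixing_error():
    """Return the TypeError raised when a bytes pattern meets a str."""
    try:
        re.search(b"a", "abc")
    except TypeError as exc:
        return exc
    return None


mix = mixing_error()
print(unicode_match is not None, ascii_match is None, type(mix).__name__)
```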
Marc-André Lemburg b2750b5d33 Move the codec decode type checks to bytes/bytearray.decode().
Use faster PyUnicode_FromEncodedObject() for bytes/bytearray.decode().

Add new PyCodec_KnownEncoding() API.

Add new PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode() APIs.

Add missing PyUnicode_AsDecodedObject() to unicodeobject.h

Fix punycode codec to also work on memoryviews.
2008-06-06 12:18:17 +00:00
Christian Heimes b9819954aa The bz2 codec isn't supported any more. I've also commented out several codecs which were removed in the past. 2007-12-02 15:27:38 +00:00
Guido van Rossum 254348e201 Rename buffer -> bytearray. 2007-11-21 19:29:53 +00:00
Christian Heimes 5d14c2b8f8 Merged revisions 59056-59076 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r59064 | christian.heimes | 2007-11-20 02:48:48 +0100 (Tue, 20 Nov 2007) | 1 line

  Fixed bug #1470
........
  r59066 | martin.v.loewis | 2007-11-20 03:46:02 +0100 (Tue, 20 Nov 2007) | 2 lines

  Patch #1468: Package Lib/test/*.pem.
........
  r59068 | christian.heimes | 2007-11-20 04:21:02 +0100 (Tue, 20 Nov 2007) | 1 line

  Another fix for test_shutil. Martin pointed out that it breaks some build bots
........
  r59073 | nick.coghlan | 2007-11-20 15:55:57 +0100 (Tue, 20 Nov 2007) | 1 line

  Backport some main.c cleanup from the py3k branch
........
  r59076 | amaury.forgeotdarc | 2007-11-21 00:31:27 +0100 (Wed, 21 Nov 2007) | 6 lines

  The incremental decoder for utf-7 must preserve its state between calls.
  Solves issue1460.

  Might not be a backport candidate: a new API function was added,
  and some code may rely on details in utf-7.py.
........
2007-11-20 23:38:09 +00:00
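
To illustrate why the UTF-7 incremental decoder has to keep state between calls (r59076 above), here is a small sketch assuming a current Python 3. The input is split in the middle of a base64-encoded run; the chunk boundaries are an arbitrary choice for the example.

```python
import codecs

# "Hällo" in UTF-7 is b"H+AOQ-llo"; split it inside the "+AOQ-" run.
chunks = [b"H+AO", b"Q-llo"]

decoder = codecs.getincrementaldecoder("utf-7")()
pieces = [decoder.decode(chunk) for chunk in chunks]
# Signal end of input so any buffered state is flushed.
pieces.append(decoder.decode(b"", final=True))

text = "".join(pieces)
print(text)
```

A decoder that forgot the partial base64 data after the first chunk could not produce the same text as decoding the whole byte string at once.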
Guido van Rossum 87c0f1d1c9 Merged revisions 59041-59055 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r59044 | neal.norwitz | 2007-11-18 17:46:20 -0800 (Sun, 18 Nov 2007) | 1 line

  Use a slightly more recent version than 1.5.2b2.
........
  r59047 | walter.doerwald | 2007-11-19 04:14:05 -0800 (Mon, 19 Nov 2007) | 2 lines

  Fix typo in comment.
........
  r59049 | walter.doerwald | 2007-11-19 04:41:10 -0800 (Mon, 19 Nov 2007) | 4 lines

  Fix for #1444: utf_8_sig.StreamReader was (indirectly through decode())
  calling codecs.utf_8_decode() with final==True, which failed with incomplete
  byte sequences. Fix and test by James G. Sack.
........
  r59051 | nick.coghlan | 2007-11-19 05:56:27 -0800 (Mon, 19 Nov 2007) | 1 line

  Enable some test_cmd_line_script debugging output to investigate failure on Mac OSX buildbot
........
  r59053 | facundo.batista | 2007-11-19 08:30:24 -0800 (Mon, 19 Nov 2007) | 3 lines


  Fixed detail in add_type() explanation (issue 1463).
........
  r59054 | guido.van.rossum | 2007-11-19 09:35:24 -0800 (Mon, 19 Nov 2007) | 2 lines

  Make this work stand-alone, too.
........
  r59055 | guido.van.rossum | 2007-11-19 09:50:22 -0800 (Mon, 19 Nov 2007) | 3 lines

  Fix the OSX failures in this test -- they were due to /tmp being a symlink
  to /private/tmp.  Adding a call to os.path.realpath() to temp_dir() fixed it.
........
2007-11-19 18:03:44 +00:00
Guido van Rossum 98297ee781 Merging the py3k-pep3137 branch back into the py3k branch.
No detailed change log; just check out the change log for the py3k-pep3137
branch.  The most obvious changes:

  - str8 renamed to bytes (PyString at the C level);
  - bytes renamed to buffer (PyBytes at the C level);
  - PyString and PyUnicode are no longer compatible.

I.e. we now have an immutable bytes type and a mutable bytes type.

The behavior of PyString was modified quite a bit, to make it more
bytes-like.  Some changes are still on the to-do list.
2007-11-06 21:34:58 +00:00
Guido van Rossum 75a902db78 Patch 1280, by Alexandre Vassalotti.
Make PyString's indexing and iteration return integers.
(I changed a few of Alexandre's decisions -- GvR.)
2007-10-19 22:06:24 +00:00
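
A short sketch of the integer indexing and iteration behaviour, written against a current Python 3, where the types from these commits are named `bytes` (immutable) and `bytearray` (mutable). The variable names are illustrative.

```python
data = b"abc"

first = data[0]            # indexing yields an int, not a length-1 bytes object
values = list(data)        # iteration yields ints as well
one_byte_slice = data[0:1]  # slicing still yields bytes

# The mutable counterpart accepts integer assignment.
buf = bytearray(data)
buf[0] = ord("x")

print(first, values, one_byte_slice, bytes(buf))
```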
Collin Winter 4902e69e40 More raise statement normalization. 2007-08-30 18:18:27 +00:00
Collin Winter ce36ad8a46 Raise statement normalization in Lib/. 2007-08-30 01:19:48 +00:00
Walter Dörwald 19e62387b9 Fix stupid typo in Lib/encodings/utf_32.py which led to failing tests
on big endian machines.

Update documentation: UTF-32 codecs will be in 2.6.
2007-08-17 16:23:21 +00:00
Walter Dörwald 41980caf64 Apply SF patch #1775604: This adds three new codecs (utf-32, utf-32-le and
utf-32-be). On narrow builds the codecs combine surrogate pairs in the unicode
object into one codepoint on encoding and create surrogate pairs for
codepoints outside the BMP on decoding. Lone surrogates are passed through
unchanged in all cases.

Backport to the trunk will follow.
2007-08-16 21:55:45 +00:00
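
The three codecs can be exercised as in the sketch below, assuming a current Python 3. Narrow builds no longer exist since Python 3.3, so the surrogate-pair handling mentioned in the commit message is not shown here. The sample character is an arbitrary code point outside the BMP.

```python
import codecs

ch = "\U0001F600"  # a single code point outside the BMP

big_endian = ch.encode("utf-32-be")
little_endian = ch.encode("utf-32-le")

# The plain "utf-32" codec prepends a BOM and uses the native byte order.
with_bom = ch.encode("utf-32")

print(big_endian, little_endian, len(with_bom))
print(with_bom.decode("utf-32") == ch)
```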
Guido van Rossum d77d6992a5 Change a bunch of file encodings from Latin-1 to UTF-8.
Remove the encoding from Tix.py (it doesn't seem to need one).
Note: we still have to keep the "coding: utf-8" declaration
for files that aren't pure ASCII, as the default per PEP 3120
hasn't been implemented yet.
2007-07-16 23:10:57 +00:00
Guido van Rossum c1f779cb01 Merged revisions 56125-56153 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/p3yk

........
  r56127 | georg.brandl | 2007-06-30 09:32:49 +0200 (Sat, 30 Jun 2007) | 2 lines

  Fix a place where floor division would be in order.
........
  r56135 | guido.van.rossum | 2007-07-01 06:13:54 +0200 (Sun, 01 Jul 2007) | 28 lines

  Make map() and filter() identical to itertools.imap() and .ifilter(),
  respectively.

  I fixed two bootstrap issues, due to the dynamic import of itertools:

  1. Starting python requires that map() and filter() are not used until
     site.py has added build/lib.<arch> to sys.path.
  2. Building python requires that setup.py and distutils and everything
     they use is free of map() and filter() calls.

  Beyond this, I only fixed the tests in test_builtin.py.
  Others, please help fixing the remaining tests that are now broken!
  The fixes are usually simple:
  a. map(None, X) -> list(X)
  b. map(F, X) -> list(map(F, X))
  c. map(lambda x: F(x), X) -> [F(x) for x in X]
  d. filter(F, X) -> list(filter(F, X))
  e. filter(lambda x: P(x), X) -> [x for x in X if P(x)]

  Someone, please also contribute a fixer for 2to3 to do this.
  It can leave map()/filter() calls alone that are already
  inside a list() or sorted() call or for-loop.

  Only in rare cases have I seen code that depends on map() of lists
  of different lengths going to the end of the longest, or on filter()
  of a string or tuple returning an object of the same type; these
  will need more thought to fix.
........
  r56136 | guido.van.rossum | 2007-07-01 06:22:01 +0200 (Sun, 01 Jul 2007) | 3 lines

  Make it so that test_decimal fails instead of hangs, to help automated
  test runners.
........
  r56139 | georg.brandl | 2007-07-01 18:20:58 +0200 (Sun, 01 Jul 2007) | 2 lines

  Fix a few test cases after the map->imap change.
........
  r56142 | neal.norwitz | 2007-07-02 06:38:12 +0200 (Mon, 02 Jul 2007) | 1 line

  Get a bunch more tests passing after converting map/filter to return iterators.
........
  r56147 | guido.van.rossum | 2007-07-02 15:32:02 +0200 (Mon, 02 Jul 2007) | 4 lines

  Fix the remaining failing unit tests (at least on OSX).
  Also tweaked urllib2 so it doesn't raise socket.gaierror when
  all network interfaces are turned off.
........
2007-07-03 08:25:58 +00:00
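
The conversion patterns listed in r56135 above can be sketched like this on a current Python 3. The function `double` and the sample list are hypothetical and only serve the illustration.

```python
def double(x):
    return x * 2


nums = [1, 2, 3, 4]

lazy = map(double, nums)                 # an iterator, not a list
as_list = list(map(double, nums))        # pattern (b): wrap in list()
as_comprehension = [double(x) for x in nums]  # pattern (c)

evens = list(filter(lambda x: x % 2 == 0, nums))  # pattern (d)
evens_comprehension = [x for x in nums if x % 2 == 0]  # pattern (e)

print(as_list, evens, isinstance(lazy, list))
```

Calls that are already consumed by `list()`, `sorted()` or a for-loop need no change, which is why the commit message suggests a 2to3 fixer can leave them alone.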
Walter Dörwald 42748a8d6d Rip out all codecs that can't work in a unicode/bytes world:
base64, uu, zlib, rot_13, hex, quopri, bz2, string_escape.

However codecs.escape_encode() and codecs.escape_decode()
still exist, as they are used for pickling str8 objects
(so those two functions can go, when the str8 type is removed).
2007-06-12 16:40:17 +00:00
Guido van Rossum ad5b9de288 Change normalize_encoding() to avoid using .translate() or depending on
the string type.  It will always return a Unicode string.  The algorithm's
specification is unchanged.
2007-06-07 21:43:46 +00:00
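
As a rough illustration of name normalization without `.translate()`, the sketch below collapses runs of punctuation into single underscores using only per-character tests. `normalize_name` is a hypothetical helper, not the standard library's implementation, and its treatment of leading and trailing punctuation is an assumption made for the example.

```python
def normalize_name(encoding):
    """Collapse runs of non-alphanumeric characters (except '.') into '_'.

    Hypothetical sketch: works character by character, so it does not
    rely on str.translate() or on a particular string type.
    """
    parts = []
    current = []
    for ch in encoding:
        if ch.isalnum() or ch == ".":
            current.append(ch)
        elif current:
            # End of an alphanumeric run; start a new part.
            parts.append("".join(current))
            current = []
    if current:
        parts.append("".join(current))
    return "_".join(parts)


print(normalize_name("utf-8"))
print(normalize_name("ISO  8859--1"))
```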