Commit Graph

122 Commits

Author SHA1 Message Date
Stefan Krah ae7dd8fab0 Merged revisions 82980 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/release27-maint

........
  r82980 | stefan.krah | 2010-07-19 20:06:46 +0200 (Mon, 19 Jul 2010) | 3 lines

  Sub-issue of #9036: Fix incorrect use of Py_CHARMASK.
........
2010-07-19 18:24:18 +00:00
Ezio Melotti 86e5e17bda Merged revisions 81758-81759 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81758 | ezio.melotti | 2010-06-05 20:51:07 +0300 (Sat, 05 Jun 2010) | 15 lines

  Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.

  1) #8271: when a byte sequence is invalid, only the start byte and all the
     valid continuation bytes are now replaced by U+FFFD, instead of replacing
     the number of bytes specified by the start byte.
     See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
  2) 5- and 6-bytes-long UTF-8 sequences are now considered invalid (no changes
     in behavior);
  3) Add code and tests to reject surrogates (U+D800-U+DFFF) as defined in
     RFC 3629, but leave it commented out since it's not backward compatible;
  4) Change the error messages "unexpected code byte" to "invalid start byte"
     and "invalid data" to "invalid continuation byte";
  5) Add an extensive set of tests in test_unicode;
  6) Fix test_codeccallbacks because it was failing after this change.
........
  r81759 | ezio.melotti | 2010-06-05 22:21:32 +0300 (Sat, 05 Jun 2010) | 1 line

  Add a NEWS entry for r81758 and clarify a comment.
........
2010-07-03 05:34:39 +00:00
Benjamin Peterson eacc8737be Merged revisions 81825 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81825 | benjamin.peterson | 2010-06-07 17:33:09 -0500 (Mon, 07 Jun 2010) | 1 line

  use unicode literals
........
2010-06-07 22:38:19 +00:00
Benjamin Peterson c971913f84 Merged revisions 81820 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81820 | benjamin.peterson | 2010-06-07 17:23:23 -0500 (Mon, 07 Jun 2010) | 1 line

  correctly overflow when indexes are too large
........
2010-06-07 22:27:32 +00:00
Victor Stinner 4fd2ff90a4 Merged revisions 79278,79280 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79278 | victor.stinner | 2010-03-22 13:24:37 +0100 (lun., 22 mars 2010) | 2 lines

  Issue #1583863: An unicode subclass can now override the __str__ method
........
  r79280 | victor.stinner | 2010-03-22 13:36:28 +0100 (lun., 22 mars 2010) | 5 lines

  Fix the NEWS about my last commit: an unicode subclass can now override the
  __unicode__ method (and not the __str__ method).

  Simplify also the testcase.
........
2010-03-22 12:56:39 +00:00
Victor Stinner f7270ba58f Merged revisions 78392 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78392 | victor.stinner | 2010-02-24 00:16:07 +0100 (mer., 24 févr. 2010) | 4 lines

  Issue #7649: Fix u'%c' % char for character in range 0x80..0xFF

  => raise an UnicodeDecodeError. Patch written by Ezio Melotti.
........
2010-02-23 23:20:14 +00:00
Eric Smith f73758f012 Merged revisions 72848 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r72848 | eric.smith | 2009-05-23 09:56:13 -0400 (Sat, 23 May 2009) | 1 line

  Issue 6089: str.format raises SystemError.
........
2009-05-23 14:04:31 +00:00
Eric Smith 0047511191 Merged revisions 70368 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r70368 | eric.smith | 2009-03-14 10:37:38 -0400 (Sat, 14 Mar 2009) | 1 line

  Unicode format tests weren't actually testing unicode. This was probably due to the original backport from py3k.
........
2009-03-14 14:43:27 +00:00
Antoine Pitrou 187ac1bda4 #3601: test_unicode.test_raiseMemError fails in UCS4
Reviewed by Benjamin Peterson on IRC.
2008-09-05 22:04:54 +00:00
Antoine Pitrou fd7c43e7be #3556: test_raiseMemError consumes an insane amount of memory 2008-08-17 17:01:49 +00:00
Amaury Forgeot d'Arc 06847b13ca Correct a crash when two successive unicode allocations fail with a MemoryError:
the freelist contained half-initialized objects with freed pointers.

The comment
/* XXX UNREF/NEWREF interface should be more symmetrical */
was copied from tupleobject.c, and appears in some other places.
I sign the petition.
2008-07-31 23:39:05 +00:00
Antoine Pitrou 4982d5d04a #2242: utf7 decoding crashes on bogus input on some Windows/MSVC versions 2008-07-25 17:45:59 +00:00
Amaury Forgeot d'Arc 9a0d3462fc #1477: ur'\U0010FFFF' raised in narrow unicode builds.
Corrected the raw-unicode-escape codec to use UTF-16 surrogates in
this case, just like the unicode-escape codec.
2008-03-23 09:55:29 +00:00
Christian Heimes c5f05e45cf Patch #2167 from calvin: Remove unused imports 2008-02-23 17:40:11 +00:00
Eric Smith bc32fee029 Added code to correct combining str and unicode in ''.format(). Added test case. 2008-02-18 18:02:34 +00:00
Eric Smith a9f7d62480 Backport of PEP 3101, Advanced String Formatting, from py3k.
Highlights:
 - Adding PyObject_Format.
 - Adding string.Format class.
 - Adding __format__ for str, unicode, int, long, float, datetime.
 - Adding builtin format.
 - Adding ''.format and u''.format.
 - str/unicode fixups for formatters.

The files in Objects/stringlib that implement PEP 3101 (stringdefs.h,
unicodedefs.h, formatter.h, string_format.h) are identical in trunk
and py3k.  Any changes from here on should be made to trunk, and
changes will propogate to py3k).
2008-02-17 19:46:49 +00:00
Kurt B. Kaiser db98f3632a Fix failing unicode test caused by change to ast.c at r56441 2007-07-18 19:58:42 +00:00
Neal Norwitz ba965deea8 Prevent these tests from running on Win64 since they don\'t apply there either 2007-06-11 02:14:39 +00:00
Neal Norwitz 7dbd2a3720 Prevent expandtabs() on string and unicode objects from causing a segfault when
a large width is passed on 32-bit platforms.  Found by Google.

It would be good for people to review this especially carefully and verify
I don't have an off by one error and there is no other way to cause overflow.
2007-06-09 03:36:34 +00:00
Collin Winter c2898c5a67 Standardize on test.test_support.run_unittest() (as opposed to a mix of run_unittest() and run_suite()). Also, add functionality to run_unittest() that admits usage of unittest.TestLoader.loadTestsFromModule(). 2007-04-25 17:29:52 +00:00
Neal Norwitz 17753ecbfa Patch #1541585: fix buffer overrun when performing repr() on
a unicode string in a build with wide unicode (UCS-4) support.

This code could be improved, so add an XXX comment.
2006-08-21 22:21:19 +00:00
Tim Peters 4511a713d5 Whitespace normalization. 2006-05-03 04:46:14 +00:00
Georg Brandl de9b624fb9 Bug #1473625: stop cPickle making float dumps locale dependent in protocol 0.
On the way, add a decorator to test_support to facilitate running single
test functions in different locales with automatic cleanup.
2006-04-30 11:13:56 +00:00
Anthony Baxter 67b6d516ce Fixed bug #1459029 - unicode reprs were double-escaped. 2006-03-30 10:54:07 +00:00
Georg Brandl da6b107745 Checkin the test of patch #1400181. 2006-01-20 17:48:54 +00:00
Hye-Shik Chang 835b243c71 Bug #1379994: Fix *unicode_escape codecs to encode r'\' as r'\\'
just like string codecs.
2005-12-17 04:38:31 +00:00
Neal Norwitz 430f68b447 Move registration of the codec search function to the module scope
so it is only executed once.  Otherwise the same search function is
repeated added to the codec search path when regrtest is run with -R
and leaks are reported.
2005-11-24 22:00:56 +00:00
Neil Schemenauer cf52c07843 Change the %s format specifier for str objects so that it returns a
unicode instance if the argument is not an instance of basestring and
calling __str__ on the argument returns a unicode instance.
2005-08-12 17:34:58 +00:00
Brett Cannon c3647ac93e Make subclasses of int, long, complex, float, and unicode perform type
conversion using the proper magic slot (e.g., __int__()).  Also move conversion
code out of PyNumber_*() functions in the C API into the nb_* function.

Applied patch #1109424.  Thanks Walter Doewald.
2005-04-26 03:45:26 +00:00
Walter Dörwald 57d88e5abd Move test_bug1001011() to string_tests.MixinStrUnicodeTest so that
it can be used for str and unicode. Drop the test for
   "".join([s]) is s
because this is an implementation detail (and doesn't work for unicode)
2004-08-26 16:53:04 +00:00
Hye-Shik Chang e9ddfbb412 SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width().  You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
Marc-André Lemburg d25c650461 Let u'%s' % obj try obj.__unicode__() first and fallback to obj.__str__(). 2004-07-23 16:13:25 +00:00
Hye-Shik Chang 3c145449da Reuse width/iswide tests from strings_test. (Suggested by Walter Dörwald) 2004-06-04 04:24:54 +00:00
Hye-Shik Chang 7bd860655f Fix typo. 2004-06-04 03:19:17 +00:00
Hye-Shik Chang 974ed7cfa5 - SF #962502: Add two more methods for unicode type; width() and
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
2004-06-02 16:49:17 +00:00
Walter Dörwald cd736e71a3 Fix reallocation bug in unicode.translate(): The code was comparing
characters instead of character pointers to determine space requirements.
2004-02-05 17:36:00 +00:00
Jeremy Hylton 504de6bd2c Fix for SF bug [ 817156 ] invalid \U escape gives 0=length unistr. 2003-10-06 05:08:26 +00:00
Martin v. Löwis 0d8e16c7ad Support trailing dots in DNS names. Fixes #782510. Will backport to 2.3. 2003-08-05 06:19:47 +00:00
Martin v. Löwis 9a3a9f7791 Consider \U-escapes in raw-unicode-escape. Fixes #444514. 2003-05-18 12:31:09 +00:00
Walter Dörwald 21d3a32b99 Combine the functionality of test_support.run_unittest()
and test_support.run_classtests() into run_unittest()
and use it wherever possible.

Also don't use "from test.test_support import ...", but
"from test import test_support" in a few spots.

From SF patch #662807.
2003-05-01 17:45:56 +00:00
Walter Dörwald 44f527fea4 Change formatchar(), so that u"%c" % 0xffffffff now raises
an OverflowError instead of a TypeError to be consistent
with "%c" % 256. See SF patch #710127.
2003-04-02 16:37:24 +00:00
Walter Dörwald 56fbcb525b Remove duplicate test. 2003-03-31 18:18:41 +00:00
Walter Dörwald 43440a621e Fix PyString_Format() so that '%c' % u'a' returns u'a'
instead of raising a TypeError. (From SF patch #710127)

Add tests to verify this is fixed.

Add various tests for '%c' % int.
2003-03-31 18:07:50 +00:00
Walter Dörwald 0fd583ce4d Port all string tests to PyUnit and share as much tests
between str, unicode, UserString and the string module
as possible. This increases code coverage in stringobject.c
from 83% to 86% and should help keep the string classes
in sync in the future. From SF patch #662807
2003-02-21 12:53:50 +00:00
Walter Dörwald 4f046e2e21 Add a few tests to test_count() to increase coverage in
Object/unicodeobject.c::unicode_count().
2003-02-10 17:51:03 +00:00
Walter Dörwald 74640247d4 Fix copy&paste error: call title instead of count 2003-02-10 17:44:16 +00:00
Walter Dörwald 28256f276e Port test_unicode.py to PyUnit and add tests for error
cases and a few methods. This increases code coverage
in Objects/unicodeobject.c from 81% to 85%.
(From SF patch #662807)
2003-01-19 16:59:20 +00:00
Walter Dörwald 395bb49555 Add a test that exercises the error handling part of
PyUnicode_EncodeDecimal().
2003-01-08 23:02:34 +00:00
Marc-André Lemburg 79f57833f3 Patch for bug #659709: bogus computation of float length
Python 2.2.x backport candidate. (This bug has been around since
Python 1.6.)
2002-12-29 19:44:06 +00:00
Neil Schemenauer ab9e4b76c2 check for unicode.__mod__ 2002-11-18 16:11:34 +00:00