153 Commits

Author SHA1 Message Date
Serhiy Storchaka 0e0282eb14 Issue #23055: Fixed a buffer overflow in PyUnicode_FromFormatV. Analysis
and fix by Guido Vranken.
2015-01-27 22:17:56 +02:00
Serhiy Storchaka e8c9e14af9 Issue #23181: More "codepoint" -> "code point". 2015-01-18 11:42:50 +02:00
Berker Peksag dfdae021b9 Issue #16056: Rename test methods to avoid conflict. 2014-11-24 23:57:00 +02:00
Victor Stinner 2af8d2f698 Issue #22023: Fix %S, %R and %V formats of PyUnicode_FromFormat(). 2014-07-30 00:39:05 +02:00
Eric V. Smith 9a55cd8857 Issue #12546: Allow \x00 as a fill character for builtin type __format__ methods. 2014-04-14 11:22:33 -04:00
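
A minimal sketch of the behaviour described in Issue #12546, assuming a Python 3.4+ interpreter where the same fix applies (this log is for the 2.x branch); the variable names are illustrative only:

```python
# Use NUL ('\x00') as the fill character in a format spec.
padded_int = format(3, '\x00>5')     # right-align the number in a field of width 5
padded_str = format('ab', '\x00<4')  # left-align the string in a field of width 4

print(repr(padded_int), repr(padded_str))
```
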
Serhiy Storchaka 76249ea4a7 Issue #20532: Tests which use _testcapi now are marked as CPython only. 2014-02-07 10:06:05 +02:00
Zachary Ware 1f70221b86 Issue #19572: More silently skipped tests explicitly skipped. 2013-12-10 14:09:20 -06:00
Serhiy Storchaka 1fdc702861 Issue #19457: Fixed xmlcharrefreplace tests on wide build when tests are
loaded from .py[co] files.
2013-10-31 17:06:03 +02:00
Serhiy Storchaka e822b034e7 Issue #15866: The xmlcharrefreplace error handler no longer produces two XML
entities for a non-BMP character on narrow build.
2013-08-06 16:56:26 +03:00
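
As an illustration of the intended result of Issue #15866, a short sketch assuming Python 3 (which has no narrow builds, so it always shows the fixed behaviour); the sample character is an arbitrary choice:

```python
# U+1D120 lies outside the Basic Multilingual Plane (BMP).
non_bmp = '\U0001D120'

# The handler should emit one numeric character reference for the
# whole code point, not one per UTF-16 surrogate.
encoded = non_bmp.encode('ascii', 'xmlcharrefreplace')

print(encoded)
```
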
Mark Dickinson 75d3600466 Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting. 2012-10-28 10:00:46 +00:00
Victor Stinner 975134e2a2 Issue #13093: Fix error handling on PyUnicode_EncodeDecimal()
Add tests for PyUnicode_EncodeDecimal()
2011-11-22 01:54:19 +01:00
Antoine Pitrou 30402549de Issue #13333: The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:49:40 +01:00
Ezio Melotti 12682b10a7 #9200: backport tests but run them on wide builds only. 2011-08-22 23:46:30 +03:00
Ezio Melotti ea7b6f6e2a #12266: move the tests in test_unicode. 2011-08-15 10:04:28 +03:00
Ezio Melotti e3685f6b1b #6780: fix starts/endswith error message to mention that tuples are accepted too. 2011-04-26 05:12:51 +03:00
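
A brief sketch of the tuple form that the #6780 error message refers to, assuming Python 3; the file name and prefixes are made up for illustration:

```python
name = 'test_unicode.py'

# startswith/endswith accept a tuple of candidates as well as a single string.
matches_prefix = name.startswith(('test_', 'check_'))
matches_suffix = name.endswith(('.pyc', '.pyo'))

# Any other argument type raises TypeError; the message mentions tuples.
error_text = None
try:
    name.startswith(42)
except TypeError as exc:
    error_text = str(exc)

print(matches_prefix, matches_suffix, error_text)
```
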
Ezio Melotti 370d85cee4 Python 2 can encode/decode surrogates to utf-8. Add a test for this. 2011-02-28 01:42:29 +00:00
Antoine Pitrou b27ddc72ea Merged revisions 85861 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r85861 | antoine.pitrou | 2010-10-27 20:52:48 +0200 (Wed, 27 Oct 2010) | 3 lines

  Recode modules from latin-1 to utf-8
........
2010-10-27 18:58:04 +00:00
Florent Xicluna c0c0b14671 Strengthen test_unicode with explicit type checking for assertEqual tests. 2010-09-13 08:53:00 +00:00
Florent Xicluna 60d512c3b0 Check PendingDeprecationWarning after issue #7994. 2010-09-13 08:21:43 +00:00
Florent Xicluna 9b90cd1f7b Merged revisions 84470-84471,84566-84567,84759 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r84470 | florent.xicluna | 2010-09-03 22:00:37 +0200 (Fri, 03 Sep 2010) | 1 line

  Strengthen BytesWarning tests.
........
  r84471 | florent.xicluna | 2010-09-03 22:23:40 +0200 (Fri, 03 Sep 2010) | 1 line

  Typo
........
  r84566 | florent.xicluna | 2010-09-06 22:27:15 +0200 (Mon, 06 Sep 2010) | 1 line

  typo
........
  r84567 | florent.xicluna | 2010-09-06 22:27:55 +0200 (Mon, 06 Sep 2010) | 1 line

  typo
........
  r84759 | florent.xicluna | 2010-09-13 04:28:18 +0200 (Mon, 13 Sep 2010) | 1 line

  Reenable test_ucs4 and remove some duplicated lines.
........
2010-09-13 07:46:37 +00:00
Stefan Krah 0b9201fa1c Sub-issue of #9036: Fix incorrect use of Py_CHARMASK. 2010-07-19 18:06:46 +00:00
Benjamin Peterson eabdeba25e use unicode literals 2010-06-07 22:33:09 +00:00
Benjamin Peterson 13e934acc0 correctly overflow when indexes are too large 2010-06-07 22:23:23 +00:00
Ezio Melotti ab2eb0ee84 Add a NEWS entry for r81758 and clarify a comment. 2010-06-05 19:21:32 +00:00
Ezio Melotti e57e50c8e7 Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.
1) #8271: when a byte sequence is invalid, only the start byte and all the
   valid continuation bytes are now replaced by U+FFFD, instead of replacing
   the number of bytes specified by the start byte.
   See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-byte-long UTF-8 sequences are now considered invalid (no changes
   in behavior);
3) Add code and tests to reject surrogates (U+D800-U+DFFF) as defined in
   RFC 3629, but leave it commented out since it's not backward compatible;
4) Change the error messages "unexpected code byte" to "invalid start byte"
   and "invalid data" to "invalid continuation byte";
5) Add an extensive set of tests in test_unicode;
6) Fix test_codeccallbacks because it was failing after this change.
2010-06-05 17:51:07 +00:00
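
Points 1, 2 and 4 of the commit above can be sketched as follows, assuming a Python 3.3+ interpreter where the same decoder behaviour applies; `decode_error_reason` is a hypothetical helper written for this example, not part of CPython:

```python
def decode_error_reason(data):
    """Return the UnicodeDecodeError reason for invalid UTF-8, or None if valid."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError as exc:
        return exc.reason
    return None

# Point 1: 0xF1 announces a 4-byte sequence, but only 0x80 is a valid
# continuation byte, so just those two bytes become one U+FFFD.
replaced = b'\xf1\x80AB'.decode('utf-8', 'replace')

# Point 2: 0xF8 would start a 5-byte sequence, which is rejected.
five_byte_reason = decode_error_reason(b'\xf8\x88\x80\x80\x80')

# Point 4: the reworded error messages.
start_reason = decode_error_reason(b'\xff')
continuation_reason = decode_error_reason(b'\xe1AB')

print(repr(replaced), five_byte_reason, start_reason, continuation_reason)
```
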
Georg Brandl f0757a2937 #8016: add the CP858 codec (approved by Benjamin). (Also add CP720 to the tests, it was missing there.) 2010-05-24 21:29:07 +00:00
Victor Stinner c7790ed163 Fix the NEWS about my last commit: a unicode subclass can now override the
__unicode__ method (and not the __str__ method).

Also simplify the test case.
2010-03-22 12:36:28 +00:00
Victor Stinner 95affc4449 Issue #1583863: A unicode subclass can now override the __str__ method 2010-03-22 12:24:37 +00:00
Florent Xicluna 6de9e938a5 Issue #7849: Now the utility ``check_warnings`` verifies that the warnings are
actually raised.  A new utility ``check_py3k_warnings`` deals with py3k warnings.
2010-03-07 12:18:33 +00:00
Victor Stinner f20f9c299e Issue #7649: Fix u'%c' % char for character in range 0x80..0xFF
=> raise a UnicodeDecodeError. Patch written by Ezio Melotti.
2010-02-23 23:16:07 +00:00
Ezio Melotti aa98058cc4 use assert[Not]In where appropriate 2010-01-23 23:04:36 +00:00
Antoine Pitrou 5b7139aab4 Issue #7462: Implement the stringlib fast search algorithm for the `rfind`,
`rindex`, `rsplit` and `rpartition` methods.  Patch by Florent Xicluna.
2010-01-02 21:12:58 +00:00
R. David Murray 0a0a1a842c Issue #1680159: unicode coercion during an 'in' operation was masking
any errors that might occur during coercion of the left operand and
turning them into a TypeError with a message text that was confusing in
the given context.  This patch lets any errors through, as was already
done during coercion of the right hand side.
2009-12-14 16:28:26 +00:00
Benjamin Peterson 332d721750 add keyword arguments support to str/unicode encode and decode #6300 2009-09-18 21:14:55 +00:00
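
A small sketch of the keyword-argument form added for #6300, assuming Python 3 (where `str.encode` and `bytes.decode` accept the same `encoding` and `errors` keywords); the sample strings are arbitrary:

```python
text = 'na\xefve'  # contains U+00EF, which ASCII cannot represent

# Pass the codec name and error handler by keyword instead of by position.
encoded = text.encode(encoding='ascii', errors='replace')
decoded = b'caf\xc3\xa9'.decode(encoding='utf-8', errors='strict')

print(encoded, decoded)
```
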
Benjamin Peterson 5c8da86f3a convert usage of fail* to assert* 2009-06-30 22:57:08 +00:00
Eric Smith 4b94b192ff Issue 6089: str.format raises SystemError. 2009-05-23 13:56:13 +00:00
Antoine Pitrou 653dece278 Issue #4426: The UTF-7 decoder was too strict and didn't accept some legal sequences.
Patch by Nick Barnes and Victor Stinner.
2009-05-04 18:32:32 +00:00
Eric Smith 2ace4cf813 Unicode format tests weren't actually testing unicode. This was probably due to the original backport from py3k. 2009-03-14 14:37:38 +00:00
Eric Smith 6f42edb682 Issue 5237, Allow auto-numbered replacement fields in str.format() strings.
For simple uses of str.format(), this makes the typing easier. Hopefully this
will help in the adoption of str.format().

For example:
'The {} is {}'.format('sky', 'blue')

You can mix and match auto-numbering and named replacement fields:
'The {} is {color}'.format('sky', color='blue')

But you can't mix and match auto-numbering and specified numbering:
'The {0} is {}'.format('sky', 'blue')
ValueError: cannot switch from manual field specification to automatic field numbering

Will port to 3.1.
2009-03-14 11:57:26 +00:00
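
The three cases in the commit message above, written out as a runnable sketch; it assumes Python 3 (or 2.7), and the variable names are illustrative:

```python
# Auto-numbered fields are filled from the positional arguments in order.
auto = 'The {} is {}'.format('sky', 'blue')

# Auto-numbered and named fields can be combined.
mixed = 'The {} is {color}'.format('sky', color='blue')

# Explicit numbering and auto-numbering cannot be combined.
switch_error = None
try:
    'The {0} is {}'.format('sky', 'blue')
except ValueError as exc:
    switch_error = str(exc)

print(auto)
print(mixed)
print(switch_error)
```
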
Antoine Pitrou 187ac1bda4 #3601: test_unicode.test_raiseMemError fails in UCS4
Reviewed by Benjamin Peterson on IRC.
2008-09-05 22:04:54 +00:00
Antoine Pitrou fd7c43e7be #3556: test_raiseMemError consumes an insane amount of memory 2008-08-17 17:01:49 +00:00
Amaury Forgeot d'Arc 06847b13ca Correct a crash when two successive unicode allocations fail with a MemoryError:
the freelist contained half-initialized objects with freed pointers.

The comment
/* XXX UNREF/NEWREF interface should be more symmetrical */
was copied from tupleobject.c, and appears in some other places.
I sign the petition.
2008-07-31 23:39:05 +00:00
Antoine Pitrou 4982d5d04a #2242: utf7 decoding crashes on bogus input on some Windows/MSVC versions 2008-07-25 17:45:59 +00:00
Amaury Forgeot d'Arc 9a0d3462fc #1477: ur'\U0010FFFF' raised in narrow unicode builds.
Corrected the raw-unicode-escape codec to use UTF-16 surrogates in
this case, just like the unicode-escape codec.
2008-03-23 09:55:29 +00:00
Christian Heimes c5f05e45cf Patch #2167 from calvin: Remove unused imports 2008-02-23 17:40:11 +00:00
Eric Smith bc32fee029 Added code to correct combining str and unicode in ''.format(). Added test case. 2008-02-18 18:02:34 +00:00
Eric Smith a9f7d62480 Backport of PEP 3101, Advanced String Formatting, from py3k.
Highlights:
 - Adding PyObject_Format.
 - Adding string.Format class.
 - Adding __format__ for str, unicode, int, long, float, datetime.
 - Adding builtin format.
 - Adding ''.format and u''.format.
 - str/unicode fixups for formatters.

The files in Objects/stringlib that implement PEP 3101 (stringdefs.h,
unicodedefs.h, formatter.h, string_format.h) are identical in trunk
and py3k.  Any changes from here on should be made to trunk, and
changes will propagate to py3k.
2008-02-17 19:46:49 +00:00
Kurt B. Kaiser db98f3632a Fix failing unicode test caused by change to ast.c at r56441 2007-07-18 19:58:42 +00:00
Neal Norwitz ba965deea8 Prevent these tests from running on Win64 since they don't apply there either 2007-06-11 02:14:39 +00:00
Neal Norwitz 7dbd2a3720 Prevent expandtabs() on string and unicode objects from causing a segfault when
a large width is passed on 32-bit platforms.  Found by Google.

It would be good for people to review this especially carefully and verify
I don't have an off by one error and there is no other way to cause overflow.
2007-06-09 03:36:34 +00:00