Victor Stinner
975134e2a2
Issue #13093: Fix error handling on PyUnicode_EncodeDecimal()
Add tests for PyUnicode_EncodeDecimal()
2011-11-22 01:54:19 +01:00
Antoine Pitrou
30402549de
Issue #13333: The UTF-7 decoder now accepts lone surrogates
(the encoder already accepts them).
2011-11-15 01:49:40 +01:00
Ezio Melotti
15d6b65ead
#12266: Fix str.capitalize() to correctly uppercase/lowercase titlecased and cased non-letter characters.
2011-08-15 09:22:24 +03:00
Senthil Kumaran
5e3a19d806
merge from 3.2 - Fix closes Issue12621 - Fix docstrings of find and rfind methods of bytes/bytearray/unicodeobject.
2011-07-27 23:36:51 +08:00
Ezio Melotti
e3685f6b1b
#6780: fix starts/endswith error message to mention that tuples are accepted too.
2011-04-26 05:12:51 +03:00
Jesus Cea
44e81687a2
startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828)
2011-04-20 16:39:15 +02:00
Eric Smith
6c84085cfb
Improved docstrings for str and unicode methods format and __format__.
2010-11-06 19:43:44 +00:00
Georg Brandl
d070cc5350
Merged revisions 83226-83227,83229-83230,83232 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r83226 | georg.brandl | 2010-07-29 16:17:12 +0200 (Do, 29 Jul 2010) | 1 line
#1090076: explain the behavior of *vars* in get() better.
........
r83227 | georg.brandl | 2010-07-29 16:23:06 +0200 (Do, 29 Jul 2010) | 1 line
Use Py_CLEAR().
........
r83229 | georg.brandl | 2010-07-29 16:32:22 +0200 (Do, 29 Jul 2010) | 1 line
#9407: document configparser.Error.
........
r83230 | georg.brandl | 2010-07-29 16:36:11 +0200 (Do, 29 Jul 2010) | 1 line
Use correct directive and name.
........
r83232 | georg.brandl | 2010-07-29 16:49:08 +0200 (Do, 29 Jul 2010) | 1 line
#9388: remove ERA_YEAR which is never defined in the source code.
........
2010-08-01 21:06:46 +00:00
Georg Brandl
e27d044769
Recorded merge of revisions 83444 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r83444 | georg.brandl | 2010-08-01 22:51:02 +0200 (So, 01 Aug 2010) | 1 line
Revert r83395, it introduces test failures and is not necessary anyway since we now have to nul-terminate the string anyway.
........
2010-08-01 20:54:30 +00:00
Georg Brandl
09f0d60f7c
Merged revisions 83395 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r83395 | georg.brandl | 2010-08-01 10:49:18 +0200 (So, 01 Aug 2010) | 1 line
#8821: do not rely on Unicode strings being terminated with a \u0000, rather explicitly check range before looking for a second surrogate character.
........
2010-08-01 18:41:59 +00:00
Stefan Krah
0b9201fa1c
Sub-issue of #9036: Fix incorrect use of Py_CHARMASK.
2010-07-19 18:06:46 +00:00
Senthil Kumaran
5261b10556
Merged revisions 82573 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r82573 | senthil.kumaran | 2010-07-05 17:30:56 +0530 (Mon, 05 Jul 2010) | 3 lines
Fix the docstrings of the capitalize method.
........
2010-07-05 12:04:07 +00:00
Ezio Melotti
2f06b78d61
Fix extra space.
2010-06-26 18:44:42 +00:00
Benjamin Peterson
8e5effaaa4
fix warning with ucs4
2010-06-12 17:47:06 +00:00
Antoine Pitrou
cca3a3f396
Issue #8941: decoding big endian UTF-32 data in UCS-2 builds could crash
the interpreter with characters outside the Basic Multilingual Plane
(higher than 0x10000).
2010-06-11 21:42:26 +00:00
Ezio Melotti
e57e50c8e7
Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.
1) #8271: when a byte sequence is invalid, only the start byte and all the
valid continuation bytes are now replaced by U+FFFD, instead of replacing
the number of bytes specified by the start byte.
See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-bytes-long UTF-8 sequences are now considered invalid (no changes
in behavior);
3) Add code and tests to reject surrogates (U+D800-U+DFFF) as defined in
RFC 3629, but leave it commented out since it's not backward compatible;
4) Change the error messages "unexpected code byte" to "invalid start byte"
and "invalid data" to "invalid continuation byte";
5) Add an extensive set of tests in test_unicode;
6) Fix test_codeccallbacks because it was failing after this change.
2010-06-05 17:51:07 +00:00
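Points 1) and 4) of the entry above can be illustrated with a short Python 3 sketch. It shows the behaviour as it appears in current CPython 3 releases rather than the 2.7 code touched by the commit, and the sample bytes are chosen for illustration only; they are not taken from the commit's tests.

```python
# Illustrative input: 0xE4 announces a 3-byte sequence, but "a" is not a
# continuation byte, so only the start byte itself is invalid.
data = b"\xe4ab"

# With the "replace" handler, only the offending start byte becomes U+FFFD;
# the following ASCII bytes are decoded normally.
replaced = data.decode("utf-8", "replace")

# In strict mode, the error reason carries the reworded message.
try:
    data.decode("utf-8")
    continuation_reason = None
except UnicodeDecodeError as exc:
    continuation_reason = exc.reason

# 0xFF can never start a UTF-8 sequence, which triggers the other message.
try:
    b"\xff".decode("utf-8")
    start_reason = None
except UnicodeDecodeError as exc:
    start_reason = exc.reason

print(repr(replaced), continuation_reason, start_reason)
```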
Brett Cannon
a7f13ee3f5
Remove an unneeded variable and assignment.
Found using Clang's static analyzer.
2010-05-04 01:16:51 +00:00
Benjamin Peterson
bea424af98
more _PyString_Resize error checking
2010-04-03 00:57:33 +00:00
Florent Xicluna
22b243809e
#7643: Unicode codepoints VT (0x0B) and FF (0x0C) are linebreaks according to Unicode Standard Annex #14.
2010-03-30 08:24:06 +00:00
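A minimal sketch of the line-break behaviour described in the entry above, assuming a Python 3 interpreter (where str plays the role of the 2.x unicode type); the sample string is made up for illustration.

```python
# VT (0x0B) and FF (0x0C) are treated as line boundaries by splitlines().
text = "one\x0btwo\x0cthree"
parts = text.splitlines()
print(parts)
```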
Larry Hastings
402b73fb8d
Backported PyCapsule from 3.1, and converted most uses of
CObject to PyCapsule.
2010-03-25 00:54:54 +00:00
Victor Stinner
95affc4449
Issue #1583863: An unicode subclass can now override the __str__ method
2010-03-22 12:24:37 +00:00
Ezio Melotti
321251567e
#7649: "u'%c' % char" now behaves like "u'%s' % char" and raises a UnicodeDecodeError if 'char' is a byte string that can't be decoded using the default encoding.
2010-02-25 17:36:04 +00:00
Victor Stinner
f20f9c299e
Issue #7649: Fix u'%c' % char for character in range 0x80..0xFF
=> raise an UnicodeDecodeError. Patch written by Ezio Melotti.
2010-02-23 23:16:07 +00:00
Ezio Melotti
1fafaab5e5
#7775: fixed docstring for rpartition
2010-01-25 11:24:37 +00:00
Antoine Pitrou
10042922d9
Sanitize bloom filter macros
2010-01-13 14:01:26 +00:00
Antoine Pitrou
5c767c2f87
Fix Windows build (re r77461)
2010-01-13 08:55:20 +00:00
Antoine Pitrou
6467213bfd
Issue #7622: Improve the split(), rsplit(), splitlines() and replace()
methods of bytes, bytearray and unicode objects by using a common
implementation based on stringlib's fast search. Patch by Florent Xicluna.
2010-01-13 07:55:48 +00:00
R. David Murray
0a0a1a842c
Issue #1680159: unicode coercion during an 'in' operation was masking
any errors that might occur during coercion of the left operand and
turning them into a TypeError with a message text that was confusing in
the given context. This patch lets any errors through, as was already
done during coercion of the right hand side.
2009-12-14 16:28:26 +00:00
Eric Smith
c4ab8339e9
Issue #3382: Make '%F' and float.__format__('F') convert results to upper case. Much of the patch came from Mark Dickinson.
2009-11-29 17:40:57 +00:00
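The effect of the '%F' change above is easiest to see on non-finite floats, where the output contains letters. A small sketch, assuming a Python 3 interpreter; the variable names are illustrative.

```python
inf = float("inf")

# Lower-case and upper-case fixed-point format codes.
lower = "%f" % inf
upper = "%F" % inf

# The same conversion through the format() protocol.
via_format = format(float("nan"), "F")

print(lower, upper, via_format)
```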
Mark Dickinson
9dd5e16c5d
Issue #7117, continued: Remove substitution of %g-style formatting for
%f-style formatting, which used to occur at high precision. Float formatting
should now be consistent between 2.7 and 3.1.
2009-11-23 20:54:09 +00:00
Mark Dickinson
18cfada1ea
Remove restriction on precision when formatting floats. This is the
first step towards removing the %f -> %g switch (see issues 7117,
5859).
2009-11-23 18:46:41 +00:00
Eric Smith
c1bdf89145
Finished removing _PyOS_double_to_string, as mentioned in issue 7117.
2009-10-26 17:46:17 +00:00
Georg Brandl
9b4e5820cb
#7116: str.join() takes an iterable.
2009-10-14 18:48:32 +00:00
Benjamin Peterson
332d721750
add keyword arguments support to str/unicode encode and decode #6300
2009-09-18 21:14:55 +00:00
Georg Brandl
e9741f3ed8
Issue #6922: Fix an infinite loop when trying to decode an invalid
UTF-32 stream with a non-raising error handler like "replace" or "ignore".
2009-09-17 11:28:09 +00:00
Mark Dickinson
2fdd58ad18
Silence gcc 'comparison always false' warning
2009-08-28 20:46:24 +00:00
Alexandre Vassalotti
fd00916c2e
Grow the allocated buffer in PyUnicode_EncodeUTF7 to avoid buffer overrun.
Without this change, test_unicode.UnicodeTest.test_codecs_utf7 crashes in
debug mode. What happens is the unicode string u'\U000abcde' with a length
of 1 encodes to the string '+2m/c3g-' of length 8. Since only 5 bytes is
reserved in the buffer, a buffer overrun occurs.
2009-07-07 02:17:30 +00:00
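The size arithmetic in the entry above can be checked from Python. This is a sketch assuming Python 3.3 or later, where a non-BMP character always has length 1; it exercises the encoder's output only, not the C buffer allocation that the commit fixed.

```python
# One non-BMP character, as in the commit message.
text = "\U000abcde"

# UTF-7 encodes it as a base64 run over the UTF-16 surrogate pair,
# wrapped in "+" and "-".
encoded = text.encode("utf-7")

print(len(text), encoded, len(encoded))
```

A single input character expands to eight output bytes here, which is why a buffer sized for five bytes per character was too small.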
Georg Brandl
18187e2167
#6224: s/JPython/Jython/, and remove one link to a module nine years old.
2009-06-06 18:21:58 +00:00
Georg Brandl
ba68a99656
#5929: fix signedness warning.
2009-05-05 09:19:43 +00:00
Antoine Pitrou
653dece278
Issue #4426: The UTF-7 decoder was too strict and didn't accept some legal sequences.
Patch by Nick Barnes and Victor Stinner.
2009-05-04 18:32:32 +00:00
Walter Dörwald
342c8db859
There's no %A in Python 2.x!
2009-05-03 22:46:07 +00:00
Walter Dörwald
ed960ac404
Issue #5108: Handle %s like %S and %R in PyUnicode_FromFormatV(): Call
PyUnicode_DecodeUTF8() once, remember the result and output it in a second
step. This avoids problems with counting UTF-8 bytes that ignores the effect
of using the replace error handler in PyUnicode_DecodeUTF8().
2009-05-03 22:36:33 +00:00
Eric Smith
068f06568b
Issue #5835, deprecate PyOS_ascii_formatd.
If anyone wants to clean up the documentation, feel free. It's my first documentation foray, and it's not that great.
Will port to py3k with a different strategy.
2009-04-25 21:40:15 +00:00
Mark Dickinson
d4814bfa23
Issue #532631: Apply floatformat changes to unicodeobject.c
as well as stringobject.c.
2009-03-29 16:24:29 +00:00
Mark Dickinson
2e648ecc7d
Issue #532631: Replace confusing fabs(x)/1e25 >= 1e25 test
with fabs(x) >= 1e50, and fix documentation.
2009-03-29 14:37:51 +00:00
Hirokazu Yamamoto
52a3492efb
There is no macro named SIZEOF_SSIZE_T. Should use SIZEOF_SIZE_T instead.
2009-03-21 10:32:52 +00:00
Mark Dickinson
6b265f1bf8
Issue 4474: On platforms with sizeof(wchar_t) == 4 and
sizeof(Py_UNICODE) == 2, PyUnicode_FromWideChar now converts
each character outside the BMP to the appropriate surrogate pair.
Thanks Victor Stinner for the patch.
(backport of r70452 from py3k to trunk)
2009-03-18 16:07:26 +00:00
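The surrogate-pair conversion mentioned in the entry above follows the standard UTF-16 rule. The sketch below is written in Python rather than C, and the helper name to_surrogate_pair is hypothetical; it is not the code from the patch, only an illustration of the arithmetic.

```python
def to_surrogate_pair(code_point):
    """Split a code point outside the BMP into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


# Example code point chosen for illustration.
high, low = to_surrogate_pair(0x1F600)

# Cross-check against the interpreter's own UTF-16 encoder.
expected = "\U0001F600".encode("utf-16-be")
packed = high.to_bytes(2, "big") + low.to_bytes(2, "big")
print(hex(high), hex(low), packed == expected)
```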
Mark Dickinson
3e4caeb3bf
Issue #5341: Fix a variety of spelling errors.
2009-02-21 20:27:01 +00:00
Georg Brandl
cbb4958cd8
Fix warnings GCC emits where the argument of PyErr_Format is a single variable.
2009-02-13 11:06:59 +00:00
Benjamin Peterson
1c5d21d644
fix indentation in comment
2009-01-31 22:33:02 +00:00