Commit Graph

1042 Commits

Author SHA1 Message Date
Raymond Hettinger 51612fd803 merge 2013-03-23 08:21:52 -07:00
Raymond Hettinger 378170d5d9 Issue 17447: Clarify that str.isidentifier doesn't check for reserved keywords. 2013-03-23 08:21:12 -07:00
Victor Stinner fb84b5d48d (Merge 3.3) _PyUnicode_Writer() now also reuses Unicode singletons:
empty string and latin1 single character
2013-03-06 19:29:09 +01:00
Victor Stinner 2cb16aa3cb _PyUnicode_Writer() now also reuses Unicode singletons:
empty string and latin1 single character
2013-03-06 19:28:37 +01:00
Victor Stinner cf77da9fb5 Backed out changeset b9f7b1bf36aa 2013-03-06 01:09:24 +01:00
Victor Stinner 313cac88c5 Issue #17223: Fix PyUnicode_FromUnicode() on Windows (16-bit wchar_t type)
to reject invalid UTF-16 surrogate.
2013-03-06 00:41:50 +01:00
Victor Stinner 36025478bf (Merge 3.3) Issue #17223: Fix PyUnicode_FromUnicode() for string of 1 character
outside the range U+0000-U+10ffff.
2013-02-26 00:16:57 +01:00
Victor Stinner d21b58c05d Issue #17223: Fix PyUnicode_FromUnicode() for string of 1 character outside
the range U+0000-U+10ffff.
2013-02-26 00:15:54 +01:00
Victor Stinner cfd2c1b4cc (Merge 3.3) Issue #17137: When an Unicode string is resized, the internal wide
character string (wstr) format is now cleared.
2013-02-07 23:17:34 +01:00
Victor Stinner bbbac2ec34 Issue #17137: When an Unicode string is resized, the internal wide character
string (wstr) format is now cleared.
2013-02-07 23:12:46 +01:00
Serhiy Storchaka d0c79dcda5 Issue #17043: The unicode-internal decoder no longer read past the end of
input buffer.
2013-02-07 16:26:55 +02:00
Serhiy Storchaka 03ee12ed72 Issue #17043: The unicode-internal decoder no longer read past the end of
input buffer.
2013-02-07 16:25:25 +02:00
Serhiy Storchaka 3fd4ab356d Issue #17043: The unicode-internal decoder no longer read past the end of
input buffer.
2013-02-07 16:23:21 +02:00
Serhiy Storchaka 2aee6a6460 Issue #16971: Fix a refleak in the charmap decoder. 2013-01-29 12:16:57 +02:00
Serhiy Storchaka afb1cb5579 Issue #16971: Fix a refleak in the charmap decoder. 2013-01-29 12:13:22 +02:00
Serhiy Storchaka 8fe5a9f9c3 Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. 2013-01-29 10:37:39 +02:00
Serhiy Storchaka 24193debd4 Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. 2013-01-29 10:28:07 +02:00
Serhiy Storchaka d679377be7 Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. 2013-01-29 10:20:44 +02:00
Serhiy Storchaka ed3c4128c0 Issue #10156: In the interpreter's initialization phase, unicode globals
are now initialized dynamically as needed.
2013-01-26 12:18:17 +02:00
Serhiy Storchaka 678db84b37 Issue #10156: In the interpreter's initialization phase, unicode globals
are now initialized dynamically as needed.
2013-01-26 12:16:36 +02:00
Serhiy Storchaka 059972535f Issue #10156: In the interpreter's initialization phase, unicode globals
are now initialized dynamically as needed.
2013-01-26 12:14:02 +02:00
Serhiy Storchaka 570c5b2354 Issue #16980: Fix processing of escaped non-ascii bytes in the
unicode-escape-decode decoder.
2013-01-25 23:53:29 +02:00
Serhiy Storchaka 73e38809e0 Issue #16980: Fix processing of escaped non-ascii bytes in the
unicode-escape-decode decoder.
2013-01-25 23:52:21 +02:00
Serhiy Storchaka 6481bfb2b5 Issue #16335: Fix integer overflow in unicode-escape decoder. 2013-01-21 11:44:40 +02:00
Serhiy Storchaka c35f3a9f61 Issue #16335: Fix integer overflow in unicode-escape decoder. 2013-01-21 11:42:57 +02:00
Serhiy Storchaka 4f5f0e54e0 Issue #16335: Fix integer overflow in unicode-escape decoder. 2013-01-21 11:38:00 +02:00
Serhiy Storchaka 441d30fac7 Issue #15989: Fix several occurrences of integer overflow
when result of PyLong_AsLong() narrowed to int without checks.

This is a backport of changesets 13e2e44db99d and 525407d89277.
2013-01-19 12:26:26 +02:00
Serhiy Storchaka 9101e23ff6 Issue #15989: Fix several occurrences of integer overflow
when result of PyLong_AsLong() narrowed to int without checks.

This is a backport of changesets 13e2e44db99d and 525407d89277.
2013-01-19 12:41:45 +02:00
Serhiy Storchaka 55e2cb497b Issue #14850: Now a chamap decoder treates U+FFFE as "undefined mapping"
in any mapping, not only in an unicode string.
2013-01-15 15:30:04 +02:00
Serhiy Storchaka 45d16d9924 Issue #14850: Now a chamap decoder treates U+FFFE as "undefined mapping"
in any mapping, not only in an unicode string.
2013-01-15 15:01:20 +02:00
Serhiy Storchaka 4fb8caee87 Issue #14850: Now a chamap decoder treates U+FFFE as "undefined mapping"
in any mapping, not only in an unicode string.
2013-01-15 14:43:21 +02:00
Serhiy Storchaka 7898043868 Issue #15989: Fix several occurrences of integer overflow
when result of PyLong_AsLong() narrowed to int without checks.
2013-01-15 01:12:17 +02:00
Benjamin Peterson 0b32a480bd merge 3.3 (#16906) 2013-01-09 09:52:22 -06:00
Benjamin Peterson 0c270a8bb7 correct static string clearing loop (closes #16906) 2013-01-09 09:52:01 -06:00
Serhiy Storchaka 24a3ef6999 Issue #11461: Fix the incremental UTF-16 decoder. Original patch by
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:41:55 +02:00
Serhiy Storchaka ae3b32ad6b Issue #11461: Fix the incremental UTF-16 decoder. Original patch by
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:40:52 +02:00
Serhiy Storchaka 48e188e573 Issue #11461: Fix the incremental UTF-16 decoder. Original patch by
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:14:24 +02:00
Serhiy Storchaka dec798eb46 Fix out of bound read in UTF-32 decoder on "narrow Unicode" builds. 2013-01-08 22:45:42 +02:00
Serhiy Storchaka 4e02538bf3 Issue #16856: Fix a segmentation fault from calling repr() on a dict with
a key whose repr raise an exception.
2013-01-04 12:40:35 +02:00
Serhiy Storchaka 6c83e739d7 Issue #16856: Fix a segmentation fault from calling repr() on a dict with
a key whose repr raise an exception.
2013-01-04 12:39:34 +02:00
Victor Stinner 18aa4477d3 Close #16281: handle tailmatch() failure and remove useless comment
"honor direction and do a forward or backwards search": the runtime speed may
be different, but I consider that it doesn't really matter in practice. The
direction was never honored before: Python 2.7 uses memcmp() for the str type
for example.
2013-01-03 03:18:09 +01:00
Victor Stinner 7ae320d667 (Merge 3.2) Issue #16455: On FreeBSD and Solaris, if the locale is C, the
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
2013-01-03 01:21:07 +01:00
Victor Stinner 20b654acb5 Issue #16455: On FreeBSD and Solaris, if the locale is C, the
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
2013-01-03 01:08:58 +01:00
Andrew Svetlov 2606a6f197 Issue #16719: Get rid of WindowsError. Use OSError instead
Patch by Serhiy Storchaka.
2012-12-19 14:33:35 +02:00
Gregory P. Smith 27dc02e8c5 Fix the internals of our hash functions to used unsigned values during hash
computation as the overflow behavior of signed integers is undefined.

NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done.  I added the comment that my change in 3.2 added so that the
code would match up.  Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.

In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.  We could work to get rid of the -fwrapv requirement
in 3.4 but that requires more planning.

Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).

Cleanup only - no functionality or hash values change.
2012-12-10 19:51:29 -08:00
Gregory P. Smith c2176e46d7 Fix the internals of our hash functions to used unsigned values during hash
computation as the overflow behavior of signed integers is undefined.

NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done.  I added the comment that my change in 3.2 added so that the
code would match up.  Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.

In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.

Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).

Cleanup only - no functionality or hash values change.
2012-12-10 18:32:53 -08:00
Gregory P. Smith 27cbcd6241 Fix the internals of our hash functions to used unsigned values during hash
computation as the overflow behavior of signed integers is undefined.

In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.

Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).

Cleanup only - no functionality or hash values change.
2012-12-10 18:15:46 -08:00
Victor Stinner 8dbd421b4d Cleanup unicodeobject.c
* Remove micro-optization:
   (errors == "surrogateescape" || strcmp(errors, "surrogateescape") == 0).
   Only use strcmp()
 * Initialize 'arg' members in unicode_format_arg() to help the compiler to
   diagnose real bugs and also make the code simpler to read
2012-12-04 09:30:24 +01:00
Victor Stinner d45c7f8d74 Issue #16455: On FreeBSD and Solaris, if the locale is C, the
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
2012-12-04 01:34:47 +01:00
Victor Stinner 2660e427d1 (Merge 3.2) Issue #16416: On Mac OS X, operating system data are now always
encoded/decoded to/from UTF-8/surrogateescape, instead of the locale encoding
(which may be ASCII if no locale environment variable is set), to avoid
inconsistencies with os.fsencode() and os.fsdecode() functions which are
already using UTF-8/surrogateescape.
2012-12-03 12:48:53 +01:00