Commit Graph

384 Commits

Author SHA1 Message Date
Irit Katriel 4aeee0b47b
bpo-28146: Fix a confusing error message in str.format() (GH-24213)
Automerge-Triggered-By: GH:pitrou
2021-05-13 13:55:55 -07:00
Inada Naoki 9ad8f109ac
bpo-44029: Remove Py_UNICODE APIs (GH-25881)
Remove deprecated `Py_UNICODE` APIs: `PyUnicode_Encode`,
`PyUnicode_EncodeUTF7`, `PyUnicode_EncodeUTF8`,
`PyUnicode_EncodeUTF16`, `PyUnicode_EncodeUTF32`,
`PyUnicode_EncodeLatin1`, `PyUnicode_EncodeMBCS`,
`PyUnicode_EncodeDecimal`, `PyUnicode_EncodeRawUnicodeEscape`,
`PyUnicode_EncodeCharmap`, `PyUnicode_EncodeUnicodeEscape`,
`PyUnicode_TransformDecimalToASCII`, `PyUnicode_TranslateCharmap`,
`PyUnicodeEncodeError_Create`, `PyUnicodeTranslateError_Create`.

See :pep:`393` and :pep:`624` for reference.
2021-05-07 15:58:29 +09:00
Ethan Furman a02cb474f9
bpo-38659: [Enum] add _simple_enum decorator (GH-25497)
add:

* `_simple_enum` decorator to transform a normal class into an enum
* `_test_simple_enum` function to compare
* `_old_convert_` to enable checking `_convert_` generated enums

`_simple_enum` takes a normal class and converts it into an enum:

    @simple_enum(Enum)
    class Color:
        RED = 1
        GREEN = 2
        BLUE = 3

`_old_convert_` works much like` _convert_` does, using the original logic:

    # in a test file
    import socket, enum
    CheckedAddressFamily = enum._old_convert_(
            enum.IntEnum, 'AddressFamily', 'socket',
            lambda C: C.isupper() and C.startswith('AF_'),
            source=_socket,
            )

`_test_simple_enum` takes a traditional enum and a simple enum and
compares the two:

    # in the REPL or the same module as Color
    class CheckedColor(Enum):
        RED = 1
        GREEN = 2
        BLUE = 3

    _test_simple_enum(CheckedColor, Color)

    _test_simple_enum(CheckedAddressFamily, socket.AddressFamily)

Any important differences will raise a TypeError
2021-04-21 10:20:44 -07:00
Ethan Furman 503cdc7c12
Revert "bpo-38659: [Enum] add _simple_enum decorator (GH-25285)" (GH-25476)
This reverts commit dbac8f40e8.
2021-04-19 19:12:24 -07:00
Ethan Furman dbac8f40e8
bpo-38659: [Enum] add _simple_enum decorator (GH-25285)
add:

_simple_enum decorator to transform a normal class into an enum
_test_simple_enum function to compare
_old_convert_ to enable checking _convert_ generated enums
_simple_enum takes a normal class and converts it into an enum:

@simple_enum(Enum)
class Color:
    RED = 1
    GREEN = 2
    BLUE = 3

_old_convert_ works much like _convert_ does, using the original logic:

# in a test file
import socket, enum
CheckedAddressFamily = enum._old_convert_(
        enum.IntEnum, 'AddressFamily', 'socket',
        lambda C: C.isupper() and C.startswith('AF_'),
        source=_socket,
        )

test_simple_enum takes a traditional enum and a simple enum and
compares the two:

# in the REPL or the same module as Color
class CheckedColor(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

_test_simple_enum(CheckedColor, Color)

_test_simple_enum(CheckedAddressFamily, socket.AddressFamily)

Any important differences will raise a TypeError
2021-04-19 18:04:53 -07:00
Ethan Furman b775106d94
bpo-40066: Enum: modify `repr()` and `str()` (GH-22392)
* Enum: streamline repr() and str(); improve docs

- repr() is now ``enum_class.member_name``
- stdlib global enums are ``module_name.member_name``
- str() is now ``member_name``
- add HOW-TO section for ``Enum``
- change main documentation to be an API reference
2021-03-30 21:17:26 -07:00
Zackery Spytz 8aabfa8550
bpo-43405: Fix DeprecationWarnings in test_unicode (GH-24754)
DeprecationWarnings were being raised in the test_encode_decimal()
and test_transform_decimal() methods after 91a639a094.
2021-03-07 15:12:35 +09:00
Inada Naoki 91a639a094
bpo-36346: Emit DeprecationWarning for PyArg_Parse() with 'u' or 'Z'. (GH-20927)
Emit DeprecationWarning when PyArg_Parse*() is called with 'u', 'Z' format.

See PEP 623.
2021-02-22 22:11:48 +09:00
Serhiy Storchaka cf19cc3b92
bpo-27772: Make preceding width with 0 valid in string format. (GH-11270)
Previously it was an error with confusing error message.
2021-01-25 11:56:33 +02:00
Ronald Oussoren 41761933c1
bpo-41100: Support macOS 11 and Apple Silicon (GH-22855)
Co-authored-by:  Lawrence D’Anna <lawrence_danna@apple.com>

* Add support for macOS 11 and Apple Silicon (aka arm64)
   
  As a side effect of this work use the system copy of libffi on macOS, and remove the vendored copy

* Support building on recent versions of macOS while deploying to older versions

  This allows building installers on macOS 11 while still supporting macOS 10.9.
2020-11-08 10:05:27 +01:00
Hai Shi c9f696cb96
bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513)
* Move the codecs' (un)register operation to testcases.
* Remove _codecs._forget_codec() and _PyCodec_Forget()
2020-10-16 10:34:15 +02:00
Serhiy Storchaka 4c8f09d7ce
bpo-36346: Make using the legacy Unicode C API optional (GH-21437)
Add compile time option USE_UNICODE_WCHAR_CACHE. Setting it to 0
makes the interpreter not using the wchar_t cache and the legacy Unicode C API.
2020-07-10 23:26:06 +03:00
Hai Shi deb016224c
bpo-40275: Use new test.support helper submodules in tests (GH-21317) 2020-07-06 14:29:49 +02:00
Inada Naoki 038dd0f79d
bpo-36346: Raise DeprecationWarning when creating legacy Unicode (GH-20933) 2020-06-30 15:26:56 +09:00
Serhiy Storchaka f9bab74d5b
bpo-41055: Remove outdated tests for the tp_print slot. (GH-21006) 2020-06-21 11:11:17 +03:00
Serhiy Storchaka 5650e76f63
bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053) 2020-05-12 16:18:00 +03:00
Inada Naoki 3a8c56295d
Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985)
* Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)"

This reverts commit c7ad974d34.

* Update unicodeobject.h
2020-03-14 15:59:27 +09:00
Inada Naoki c7ad974d34
bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)
Co-authored-by: Victor Stinner <vstinner@python.org>
2020-03-14 12:43:18 +09:00
Benjamin Peterson 51796e5d26
Update some www.unicode.org URLs to use HTTPS. (GH-18912) 2020-03-10 21:10:59 -07:00
Serhiy Storchaka 1f21eaa15e
bpo-15999: Clean up of handling boolean arguments. (GH-15610)
* Use the 'p' format unit instead of manually called PyObject_IsTrue().
* Pass boolean value instead 0/1 integers to functions that needs boolean.
* Convert some arguments to boolean only once.
2019-09-01 12:16:51 +03:00
Greg Price 6bccbe7dfb bpo-36502: Correct documentation of str.isspace() (GH-15019)
The documented definition was much broader than the real one:
there are tons of characters with general category "Other",
and we don't (and shouldn't) treat most of them as whitespace.

Rewrite the definition to agree with the comment on
_PyUnicode_IsWhitespace, and with the logic in makeunicodedata.py,
which is what generates that function and so ultimately governs.

Add suitable breadcrumbs so that a reader who wants to pin down
exactly what this definition means (what's a "bidirectional class"
of "B"?) can do so.  The `unicodedata` module documentation is an
appropriate central place for our references to Unicode's own copious
documentation, so point there.

Also add to the isspace() test a thorough check that the
implementation agrees with the intended definition.
2019-08-14 13:05:19 +02:00
Hai Shi 5623ac87bb bpo-37476: Adding tests for asutf8 and asutf8andsize (GH-14531) 2019-07-20 15:56:23 +08:00
Victor Stinner 22eb689cf3
bpo-37388: Development mode check encoding and errors (GH-14341)
In development mode and in debug build, encoding and errors arguments
are now checked on string encoding and decoding operations. Examples:
open(), str.encode() and bytes.decode().

By default, for best performances, the errors argument is only
checked at the first encoding/decoding error, and the encoding
argument is sometimes ignored for empty strings.
2019-06-26 00:51:05 +02:00
Kingsley M b015fc86f7 bpo-36549: str.capitalize now titlecases the first character instead of uppercasing it (GH-12804) 2019-04-12 08:35:39 -07:00
Inada Naoki 6a16b18224
bpo-36297: remove "unicode_internal" codec (GH-12342) 2019-03-18 15:44:11 +09:00
Serhiy Storchaka 44cc4822bb
bpo-33817: Fix _PyBytes_Resize() for empty bytes object. (GH-11516)
Add also tests for PyUnicode_FromFormat() and PyBytes_FromFormat()
with empty result.
2019-01-12 09:22:29 +02:00
Victor Stinner 998b806366
Revert "bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)" (GH-9187)
This reverts commit 886483e2b9.
2018-09-12 00:23:25 +02:00
Victor Stinner 886483e2b9
bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)
* Add %T format to PyUnicode_FromFormatV(), and so to
  PyUnicode_FromFormat() and PyErr_Format(), to format an object type
  name: equivalent to "%s" with Py_TYPE(obj)->tp_name.
* Replace Py_TYPE(obj)->tp_name with %T format in unicodeobject.c.
* Add unit test on %T format.
* Rename unicode_fromformat_write_cstr() to
  unicode_fromformat_write_utf8(), to make the intent more explicit.
2018-09-07 18:00:58 +02:00
Zackery Spytz e349bf2358 bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (GH-8741)
The UTF-7 decoder now raises UnicodeDecodeError for ill-formed
sequences starting with "+" (as specified in RFC 2152).
2018-08-19 07:43:38 +03:00
INADA Naoki a49ac99029
bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342) 2018-01-27 14:06:21 +09:00
Serhiy Storchaka 9b6c60cbce
bpo-31979: Simplify transforming decimals to ASCII (#4336)
in int(), float() and complex() parsers.

This also speeds up parsing non-ASCII numbers by around 20%.
2017-11-13 21:23:48 +02:00
Serhiy Storchaka 5075416b8f bpo-30978: str.format_map() now passes key lookup exceptions through. (#2790)
Previously any exception was replaced with a KeyError exception.
2017-08-03 11:45:23 +03:00
Victor Stinner d6debb24e0 bpo-29919: Remove unused imports found by pyflakes (#137)
Make also minor PEP8 coding style fixes on modified imports.
2017-03-27 16:05:26 +02:00
Martijn Pieters d7e64337ef bpo-28598: Support __rmod__ for RHS subclasses of str in % string formatting operations (#51)
When you use `'%s' % SubClassOfStr()`, where `SubClassOfStr.__rmod__` exists, the reverse operation is ignored as normally such string formatting operations use the `PyUnicode_Format()` fast path. This patch tests for subclasses of `str` first and picks the slow path in that case.

Patch by Martijn Pieters.
2017-02-23 15:38:04 +02:00
Martin Panter 5644729aa6 Issue #29145: Merge test from 3.6 2017-01-14 06:29:32 +00:00
Martin Panter 758c7d044b Merge tests from 3.5 2017-01-14 06:26:51 +00:00
Martin Panter b71c0956d0 Issues #1621, #29145: Test for str.join() overflow 2017-01-12 11:54:59 +00:00
Serhiy Storchaka 8cbd3df3ce Issue #28992: Use bytes.fromhex(). 2016-12-21 12:59:28 +02:00
Xiang Zhang b211068f5c Issue #28822: Adjust indices handling of PyUnicode_FindChar(). 2016-12-20 22:52:33 +08:00
Martin Panter fff07e34fa Merge spelling and grammar from 3.5 2016-12-18 05:37:21 +00:00
Martin Panter 2f9171d900 Fix spelling and grammar in code comments and documentation 2016-12-18 01:23:09 +00:00
Eric V. Smith 5646648678 Issue 28128: Print out better error/warning messages for invalid string escapes. Backport to 3.6. 2016-10-31 14:46:26 -04:00
Serhiy Storchaka 21d9f10c94 Merge from 3.5. 2016-10-08 22:46:01 +03:00
Serhiy Storchaka 9c0e1f83af Issue #28379: Added sanity checks and tests for PyUnicode_CopyCharacters().
Patch by Xiang Zhang.
2016-10-08 22:45:38 +03:00
Serhiy Storchaka 0a6ef790e4 test_invalid_sequences seems don't have to stay in CAPITest.
Reported by Xiang Zhang.
2016-10-02 21:59:44 +03:00
Serhiy Storchaka b3648576cd Issue #28295: Fixed the documentation and added tests for PyUnicode_AsUCS4().
Original patch by Xiang Zhang.
2016-10-02 21:30:35 +03:00
Serhiy Storchaka cc164232aa Issue #28295: Fixed the documentation and added tests for PyUnicode_AsUCS4().
Original patch by Xiang Zhang.
2016-10-02 21:29:26 +03:00
Serhiy Storchaka 1edebef724 Moved Unicode C API related tests to separate test class. 2016-10-02 21:18:14 +03:00
Serhiy Storchaka 63b5b6fd45 Moved Unicode C API related tests to separate test class. 2016-10-02 21:16:38 +03:00
R David Murray 110b6fecbb #27364: Deprecate invalid escape strings in str/byutes.
Patch by Emanuel Barry, reviewed by Serhiy Storchaka and Martin Panter.
2016-09-08 15:34:08 -04:00