Commit Graph

1525 Commits

Author SHA1 Message Date
Serhiy Storchaka 12f433411b
bpo-41334: Convert constructors of str, bytes and bytearray to Argument Clinic (GH-21535) 2020-07-20 15:53:55 +03:00
Serhiy Storchaka 4c8f09d7ce
bpo-36346: Make using the legacy Unicode C API optional (GH-21437)
Add compile time option USE_UNICODE_WCHAR_CACHE. Setting it to 0
makes the interpreter not using the wchar_t cache and the legacy Unicode C API.
2020-07-10 23:26:06 +03:00
Victor Stinner 8182cc2e68
bpo-39573: Use the Py_TYPE() macro (GH-21433)
Replace obj->ob_type with Py_TYPE(obj).
2020-07-10 12:40:38 +02:00
Serhiy Storchaka b3dd5cd4a3
bpo-36346: Undeprecate private function _PyUnicode_AsUnicode(). (GH-21336) 2020-07-05 18:53:45 +03:00
Victor Stinner 3549ca313a
bpo-1635741: Fix unicode_dealloc() for mortal interned string (GH-21270)
When unicode_dealloc() is called on a mortal interned string, the
string reference counter is now reset at zero.
2020-07-03 16:59:12 +02:00
Victor Stinner 666ecfb095
bpo-1635741: Release Unicode interned strings at exit (GH-21269)
* PyUnicode_InternInPlace() now ensures that interned strings are
  ready.
* Add _PyUnicode_ClearInterned().
* Py_Finalize() now releases Unicode interned strings:
  call _PyUnicode_ClearInterned().
2020-07-02 01:19:57 +02:00
Inada Naoki 038dd0f79d
bpo-36346: Raise DeprecationWarning when creating legacy Unicode (GH-20933) 2020-06-30 15:26:56 +09:00
Serhiy Storchaka 349f76c6aa
bpo-36346: Prepare for removing the legacy Unicode C API (AC only). (GH-21223) 2020-06-30 09:03:15 +03:00
Inada Naoki b3332660ad
bpo-41123: Remove PyUnicode_AsUnicodeCopy (GH-21209) 2020-06-30 12:23:07 +09:00
Serhiy Storchaka e67f7db3c3
bpo-37999: Simplify the conversion code for %c, %d, %x, etc. (GH-20437)
Since PyLong_AsLong() no longer use __int__, explicit call
of PyNumber_Index() before it is no longer needed.
2020-06-29 22:36:41 +03:00
Inada Naoki d9f2a13106
bpo-41123: Remove PyUnicode_GetMax() (GH-21192) 2020-06-29 10:46:51 +09:00
Inada Naoki 20a7902175
bpo-41123: Remove Py_UNICODE_str* functions (GH-21164)
They are undocumented and deprecated since Python 3.3.
2020-06-27 18:22:09 +09:00
Victor Stinner 91698d8caa
bpo-40521: Optimize PyBytes_FromStringAndSize(str, 0) (GH-21142)
Always create the empty bytes string singleton.

Optimize PyBytes_FromStringAndSize(str, 0): it no longer has to check
if the empty string singleton was created or not, it is always
available.

Add functions:

* _PyBytes_Init()
* bytes_get_empty(), bytes_new_empty()
* bytes_create_empty_string_singleton()
* unicode_create_empty_string_singleton()

_Py_unicode_state: rename empty structure member to empty_string.
2020-06-25 14:07:40 +02:00
Victor Stinner 2f9ada96e0
bpo-40521: Make Unicode latin1 singletons per interpreter (GH-21101)
Each interpreter now has its own Unicode latin1 singletons.

Remove "ifdef EXPERIMENTAL_ISOLATED_SUBINTERPRETERS"
and "ifdef LATIN1_SINGLETONS": always enable latin1 singletons.

Optimize unicode_result_ready(): only attempt to get a latin1
singleton for PyUnicode_1BYTE_KIND.
2020-06-24 02:22:21 +02:00
Victor Stinner 90ed8a6d71
bpo-40521: Optimize PyUnicode_New(0, maxchar) (GH-21099)
Functions of unicodeobject.c, like PyUnicode_New(), no longer check
if the empty Unicode singleton has been initialized or not. Consider
that it is always initialized. The Unicode API must not be used
before _PyUnicode_Init() or after _PyUnicode_Fini().
2020-06-24 00:34:07 +02:00
Victor Stinner f363d0a6e9
bpo-40521: Make empty Unicode string per interpreter (GH-21096)
Each interpreter now has its own empty Unicode string singleton.
2020-06-24 00:10:40 +02:00
Inada Naoki 2c4928d37e
bpo-36346: Add Py_DEPRECATED to deprecated unicode APIs (GH-20878)
Co-authored-by: Kyle Stanley <aeros167@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2020-06-17 20:09:44 +09:00
Victor Stinner 04fc4f2a46
bpo-40989: PyObject_INIT() becomes an alias to PyObject_Init() (GH-20901)
The PyObject_INIT() and PyObject_INIT_VAR() macros become aliases to,
respectively, PyObject_Init() and PyObject_InitVar() functions.

Rename _PyObject_INIT() and _PyObject_INIT_VAR() static inline
functions to, respectively, _PyObject_Init() and _PyObject_InitVar(),
and move them to pycore_object.h. Remove their return value:
their return type becomes void.

The _datetime module is now built with the Py_BUILD_CORE_MODULE macro
defined.

Remove an outdated comment on _Py_tracemalloc_config.
2020-06-16 01:28:07 +02:00
Victor Stinner d36cf5f1d2
bpo-40943: Replace PY_FORMAT_SIZE_T with "z" (GH-20781)
The PEP 353, written in 2005, introduced PY_FORMAT_SIZE_T. Python no
longer supports macOS 10.4 and Visual Studio 2010, but requires more
recent macOS and Visual Studio versions. In 2020 with Python 3.10, it
is now safe to use directly "%zu" to format size_t and "%zi" to
format Py_ssize_t.
2020-06-10 18:38:05 +02:00
Victor Stinner c96a61e816
bpo-40881: Fix unicode_release_interned() (GH-20699)
Use Py_SET_REFCNT() in unicode_release_interned().
2020-06-08 01:39:47 +02:00
Victor Stinner 297257f7bc
bpo-39465: Cleanup _PyUnicode_FromId() code (GH-20595)
Work on a local variable before filling _Py_Identifier members.
2020-06-02 14:39:45 +02:00
Serhiy Storchaka 5f4b229df7
bpo-40792: Make the result of PyNumber_Index() always having exact type int. (GH-20443)
Previously, the result could have been an instance of a subclass of int.

Also revert bpo-26202 and make attributes start, stop and step of the range
object having exact type int.

Add private function _PyNumber_Index() which preserves the old behavior
of PyNumber_Index() for performance to use it in the conversion functions
like PyLong_AsLong().
2020-05-28 10:33:45 +03:00
Victor Stinner 3d17c045b4
bpo-40521: Add PyInterpreterState.unicode (GH-20081)
Move PyInterpreterState.fs_codec into a new
PyInterpreterState.unicode structure.

Give a name to the fs_codec structure and use this structure in
unicodeobject.c.
2020-05-14 01:48:38 +02:00
Victor Stinner d6fb53fe42
bpo-39465: Remove _PyUnicode_ClearStaticStrings() from C API (GH-20078)
Remove the _PyUnicode_ClearStaticStrings() function from the C API.
Make the function fully private (declare it with "static").
2020-05-14 01:11:54 +02:00
Serhiy Storchaka 5650e76f63
bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053) 2020-05-12 16:18:00 +03:00
Serhiy Storchaka 74ea6b5a75
bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033) 2020-05-12 12:42:04 +03:00
Victor Stinner 607b1027fe
bpo-40521: Disable Unicode caches in isolated subinterpreters (GH-19933)
When Python is built in the experimental isolated subinterpreters
mode, disable Unicode singletons and Unicode interned strings since
they are shared by all interpreters.

Temporary workaround until these caches are made per-interpreter.
2020-05-05 18:50:30 +02:00
sweeneyde a81849b031
bpo-39939: Add str.removeprefix and str.removesuffix (GH-18939)
Added str.removeprefix and str.removesuffix methods and corresponding
bytes, bytearray, and collections.UserString methods to remove affixes
from a string if present. See PEP 616 for a full description.
2020-04-22 23:05:48 +02:00
Victor Stinner e5014be049
bpo-40268: Remove a few pycore_pystate.h includes (GH-19510) 2020-04-14 17:52:15 +02:00
Victor Stinner 81a7be3fa2
bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509)
Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET()
for consistency with _PyThreadState_GET() and to have a shorter name
(help to fit into 80 columns).

Add also "assert(tstate != NULL);" to the function.
2020-04-14 15:14:01 +02:00
Victor Stinner da7933ecc3
bpo-40268: Add _PyInterpreterState_GetConfig() (GH-19492)
Don't access PyInterpreterState.config member directly anymore, but
use new functions:

* _PyInterpreterState_GetConfig()
* _PyInterpreterState_SetConfig()
* _Py_GetConfig()
2020-04-13 03:04:28 +02:00
Serhiy Storchaka 8f87eefe7f
bpo-39943: Add the const qualifier to pointers on non-mutable PyBytes data. (GH-19472) 2020-04-12 14:58:27 +03:00
Serhiy Storchaka cd8295ff75
bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345) 2020-04-11 10:48:40 +03:00
Victor Stinner a15e260b70
bpo-40170: Add _PyIndex_Check() internal function (GH-19426)
Add _PyIndex_Check() function to the internal C API: fast inlined
verson of PyIndex_Check().

Add Include/internal/pycore_abstract.h header file.

Replace PyIndex_Check() with _PyIndex_Check() in C files of Objects
and Python subdirectories.
2020-04-08 02:01:56 +02:00
Victor Stinner d8acf0d9aa
bpo-37388: Don't check encoding/errors during finalization (GH-19409)
str.encode() and str.decode() no longer check the encoding and errors
in development mode or in debug mode during Python finalization. The
codecs machinery can no longer work on very late calls to
str.encode() and str.decode().

This change should help to call _PyObject_Dump() to debug during late
Python finalization.
2020-04-07 16:07:42 +02:00
Serhiy Storchaka 17b4733f2f
bpo-40130: _PyUnicode_AsKind() should not be exported. (GH-19265)
Make it a static function, and pass known attributes
(kind, data, length) instead of the PyUnicode object.
2020-04-01 15:41:49 +03:00
Inada Naoki 3a8c56295d
Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985)
* Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)"

This reverts commit c7ad974d34.

* Update unicodeobject.h
2020-03-14 15:59:27 +09:00
Inada Naoki c7ad974d34
bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)
Co-authored-by: Victor Stinner <vstinner@python.org>
2020-03-14 12:43:18 +09:00
Andy Lester dffe4c0709
bpo-39573: Finish converting to new Py_IS_TYPE() macro (GH-18601) 2020-03-04 14:15:20 +01:00
Inada Naoki 02a4d57263
bpo-39087: Optimize PyUnicode_AsUTF8AndSize() (GH-18327)
Avoid using temporary bytes object.
2020-02-27 13:48:59 +09:00
Andy Lester 933fc53f3f
closes bpo-39684: Combine two if/thens and squash uninit var warning. (GH-18565) 2020-02-20 20:51:47 -08:00
Hai Shi 3d235f5c5c
bpo-39500: Fix compile warnings in unicodeobject.c (GH-18519) 2020-02-17 14:41:15 +01:00
Victor Stinner 45876a90e2
bpo-35081: Move bytes_methods.h to the internal C API (GH-18492)
Move the bytes_methods.h header file to the internal C API as
pycore_bytes_methods.h: it only contains private symbols (prefixed by
"_Py"), except of the PyDoc_STRVAR_shared() macro.
2020-02-12 22:32:34 +01:00
Benjamin Peterson 95905ce0f4
bpo-39605: Remove a cast that causes a warning. (GH-18473) 2020-02-11 19:36:14 -08:00
Andy Lester e6be9b59a9
closes bpo-39605: Fix some casts to not cast away const. (GH-18453)
gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either:

Adding the const to the type cast, as in:

-    return _PyUnicode_FromUCS1((unsigned char*)s, size);
+    return _PyUnicode_FromUCS1((const unsigned char*)s, size);

or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in:

-    PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno);
+    PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno);

These changes will not change code, but they will make it much easier to check for errors in consts
2020-02-11 18:28:35 -08:00
Petr Viktorin ffd9753a94
bpo-39245: Switch to public API for Vectorcall (GH-18460)
The bulk of this patch was generated automatically with:

    for name in \
        PyObject_Vectorcall \
        Py_TPFLAGS_HAVE_VECTORCALL \
        PyObject_VectorcallMethod \
        PyVectorcall_Function \
        PyObject_CallOneArg \
        PyObject_CallMethodNoArgs \
        PyObject_CallMethodOneArg \
    ;
    do
        echo $name
        git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
    done

    old=_PyObject_FastCallDict
    new=PyObject_VectorcallDict
    git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"

and then cleaned up:

- Revert changes to in docs & news
- Revert changes to backcompat defines in headers
- Nudge misaligned comments
2020-02-11 17:46:57 +01:00
Victor Stinner f3e7ea5b8c
bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397)
PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the
string is not ready.
2020-02-11 14:29:33 +01:00
Victor Stinner 58ac700fb0
bpo-39573: Use Py_TYPE() macro in Objects directory (GH-18392)
Replace direct access to PyObject.ob_type with Py_TYPE().
2020-02-07 03:04:21 +01:00
Victor Stinner c86a11221d
bpo-39573: Add Py_SET_REFCNT() function (GH-18389)
Add a Py_SET_REFCNT() function to set the reference counter of an
object.
2020-02-07 01:24:29 +01:00
Victor Stinner bf305cc6f0
Add PyInterpreterState.fs_codec.utf8 (GH-18367)
Add a fast-path for UTF-8 encoding in PyUnicode_EncodeFSDefault()
and PyUnicode_DecodeFSDefaultAndSize().

Add _PyUnicode_FiniEncodings() helper function for _PyUnicode_Fini().
2020-02-05 17:39:57 +01:00