Commit Graph

3641 Commits

Author SHA1 Message Date
Victor Stinner 4b9aad4999
bpo-42236: Enhance init and encoding documentation (GH-23109)
Enhance the documentation of the Python startup, filesystem encoding
and error handling, locale encoding. Add a new "Python UTF-8 Mode"
section.

* Add "locale encoding" and "filesystem encoding and error handler"
  to the glossary
* Remove documentation from Include/cpython/initconfig.h: move it to
  Doc/c-api/init_config.rst.
* Doc/c-api/init_config.rst:

  * Document command line options and environment variables
  * Document default values.

* Add a new "Python UTF-8 Mode" section in Doc/library/os.rst.
* Add warnings to Py_DecodeLocale() and Py_EncodeLocale() docs.
* Document how Python selects the filesystem encoding and error
  handler at a single place: PyConfig.filesystem_encoding and
  PyConfig.filesystem_errors.
* PyConfig: move orig_argv member at the right place.
2020-11-02 16:49:54 +01:00
Julien Danjou 64366fa9b3
bpo-41435: Add sys._current_exceptions() function (GH-21689)
This adds a new function named sys._current_exceptions() which is equivalent ot
sys._current_frames() except that it returns the exceptions currently handled
by other threads. It is equivalent to calling sys.exc_info() for each running
thread.
2020-11-02 16:16:25 +02:00
Victor Stinner e662c398d8
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.

In May 2010 (commit b744ba1d14), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.

In August 2020 (commit 94908bbc15), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.

In 10 years (2010 to 2020), I saw zero user report about the error
message related to nl_langinfo(CODESET) returning an empty string.

Today, UTF-8 became the defacto standard and it's safe to make the
assumption that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.

While this change is likely to not affect anyone in practice, it
should make UTF-8 lover happy ;-)

Rewrite also the documentation explaining how Python selects the
filesystem encoding and error handler.
2020-11-01 23:07:23 +01:00
Victor Stinner 82458b6cdb
bpo-42236: Enhance _locale._get_locale_encoding() (GH-23083)
* Rename _Py_GetLocaleEncoding() to _Py_GetLocaleEncodingObject()
* Add _Py_GetLocaleEncoding() which returns a wchar_t* string to
  share code between _Py_GetLocaleEncodingObject()
  and config_get_locale_encoding().
* _Py_GetLocaleEncodingObject() now decodes nl_langinfo(CODESET)
  from the current locale encoding with surrogateescape,
  rather than using UTF-8.
2020-11-01 20:59:35 +01:00
Victor Stinner 710e826307
bpo-42208: Add _Py_GetLocaleEncoding() (GH-23050)
_io.TextIOWrapper no longer calls getpreferredencoding(False) of
_bootlocale to get the locale encoding, but calls
_Py_GetLocaleEncoding() instead.

Add config_get_fs_encoding() sub-function. Reorganize also
config_get_locale_encoding() code.
2020-10-31 01:02:09 +01:00
Victor Stinner eba5bf2f56
bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044)
The last GC collection is now done before clearing builtins and sys
dictionaries. Add also assertions to ensure that gc.collect() is no
longer called after _PyGC_Fini().

Pass also the tstate to PyInterpreterState_Clear() to pass the
correct tstate to _PyGC_CollectNoFail() and _PyGC_Fini().
2020-10-30 22:51:02 +01:00
Victor Stinner 8b3414818f
bpo-42208: Pass tstate to _PyGC_CollectNoFail() (GH-23038)
Move private _PyGC_CollectNoFail() to the internal C API.

Remove the private _PyGC_CollectIfEnabled() which was just an alias
to the public PyGC_Collect() function since Python 3.8.

Rename functions:

* collect() => gc_collect_main()
* collect_with_callback() => gc_collect_with_callback()
* collect_generations() => gc_collect_generations()
2020-10-30 17:00:00 +01:00
Victor Stinner c310185c08
bpo-42161: Remove private _PyLong_Zero and _PyLong_One (GH-23003)
Use PyLong_FromLong(0) and PyLong_FromLong(1) of the public C API
instead. For Python internals, _PyLong_GetZero() and _PyLong_GetOne()
of pycore_long.h can be used.
2020-10-27 21:34:33 +01:00
Victor Stinner 84f7382215
bpo-42157: Rename unicodedata.ucnhash_CAPI (GH-22994)
Removed the unicodedata.ucnhash_CAPI attribute which was an internal
PyCapsule object. The related private _PyUnicode_Name_CAPI structure
was moved to the internal C API.

Rename unicodedata.ucnhash_CAPI as unicodedata._ucnhash_CAPI.
2020-10-27 04:36:22 +01:00
Victor Stinner 8e3b9f9283
bpo-42161: Add _PyLong_GetZero() and _PyLong_GetOne() (GH-22993)
Add _PyLong_GetZero() and _PyLong_GetOne() functions and a new
internal pycore_long.h header file.

Python cannot be built without small integer singletons anymore.
2020-10-27 00:00:03 +01:00
Victor Stinner 920cb647ba
bpo-42157: unicodedata avoids references to UCD_Type (GH-22990)
* UCD_Check() uses PyModule_Check()
* Simplify the internal _PyUnicode_Name_CAPI structure:

  * Remove size and state members
  * Remove state and self parameters of getcode() and getname()
    functions

* Remove global_module_state
2020-10-26 19:19:36 +01:00
Victor Stinner 47e1afd2a1
bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713)
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.

* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
2020-10-26 16:43:47 +01:00
Serhiy Storchaka fb5db7ec58
bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648)
These functions are considered not safe because they suppress all internal errors
and can return wrong result.  PyDict_GetItemString and _PyDict_GetItemId can
also silence current exception in rare cases.

Remove no longer used _PyDict_GetItemId.
Add _PyDict_ContainsId and rename _PyDict_Contains into
_PyDict_Contains_KnownHash.
2020-10-26 08:43:39 +02:00
Pablo Galindo 109826c850
bpo-42093: Add opcode cache for LOAD_ATTR (GH-22803) 2020-10-20 06:22:44 +01:00
Zackery Spytz 1438c2ac77
bpo-41845: Move PyObject_GenericGetDict() back into the limited API (GH22646)
It was moved out of the limited API in 7d95e40721.
This change re-enables it from 3.10, to avoid generating invalid extension modules for earlier versions.
2020-10-19 23:47:37 +01:00
Alex Gaynor 3a8fdb2879
bpo-41784: make PyUnicode_AsUTF8AndSize part of the limited API (GH-22252) 2020-10-19 23:17:50 +01:00
Kyle Evans 7992579cd2
bpo-40422: Move _Py_closerange to fileutils.c (GH-22680)
This API is relatively lightweight and organizationally, given that it's
used by multiple modules, it makes sense to move it to fileutils.

Requires making sure that _posixsubprocess is compiled with the appropriate
Py_BUIILD_CORE_BUILTIN macro.
2020-10-13 22:04:44 +02:00
Vladimir Matveev cfb0f57ff8
bpo-41756: Export PyGen_Send and wrap it in if-defs (#22677) 2020-10-13 10:26:51 -07:00
Vladimir Matveev 24a54c0bd4
Delete PyGen_Send (#22663) 2020-10-12 12:10:42 -07:00
chilaxan 10c98db7f5
Fix typo in listobject.h (GH-22588) 2020-10-11 19:21:51 +01:00
Serhiy Storchaka 98c4433a81
bpo-41991: Remove _PyObject_HasAttrId (GH-22629)
It can silence arbitrary exceptions.
2020-10-10 22:23:42 +03:00
Serhiy Storchaka 637a09b0d6
bpo-41986: Add Py_FileSystemDefaultEncodeErrors and Py_UTF8Mode back to limited API (GH-22621) 2020-10-10 17:09:45 +03:00
Vladimir Matveev 037245c5ac
bpo-41756: Add PyIter_Send function (#22443) 2020-10-09 17:15:15 -07:00
Serhiy Storchaka 9975cc5008
bpo-41985: Add _PyLong_FileDescriptor_Converter and AC converter for "fildes". (GH-22620) 2020-10-09 23:00:45 +03:00
Pablo Galindo 91e3339066
Post 3.10.0a1 2020-10-05 21:16:35 +01:00
Pablo Galindo 8e9afaf822
Python 3.10.0a1 2020-10-05 18:30:18 +01:00
Serhiy Storchaka dcc54215ac
bpo-41936. Remove macros Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION (GH-22552) 2020-10-05 12:32:00 +03:00
Victor Stinner 583ee5a5b1
bpo-41692: Deprecate PyUnicode_InternImmortal() (GH-22486)
The PyUnicode_InternImmortal() function is now deprecated and will be
removed in Python 3.12: use PyUnicode_InternInPlace() instead.
2020-10-02 14:49:00 +02:00
Hai Shi d332e7b816
bpo-41842: Add codecs.unregister() function (GH-22360)
Add codecs.unregister() and PyCodec_Unregister() functions
to unregister a codec search function.
2020-09-28 23:41:11 +02:00
Dong-hee Na 24ba3b0df5
bpo-41875: Use __builtin_unreachable when possible (GH-22433) 2020-09-29 05:41:23 +09:00
Zackery Spytz 2e4dd336e5
bpo-30155: Add macros to get tzinfo from datetime instances (GH-21633)
Add PyDateTime_DATE_GET_TZINFO() and PyDateTime_TIME_GET_TZINFO()
macros.
2020-09-23 14:43:45 -04:00
Victor Stinner 19c3ac92bf
bpo-41834: Remove _Py_CheckRecursionLimit variable (GH-22359)
Remove the global _Py_CheckRecursionLimit variable: it has been
replaced by ceval.recursion_limit of the PyInterpreterState
structure.

There is no need to keep the variable for the stable ABI, since
Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() were not usable
in Python 3.8 and older: these macros accessed PyThreadState members,
whereas the PyThreadState structure is opaque in the limited C API.
2020-09-23 14:04:57 +02:00
Victor Stinner c5cb077ab3
Py_IS_TYPE() macro uses Py_TYPE() (GH-22341) 2020-09-22 12:42:28 +02:00
Vladimir Matveev 2b05361bf7
bpo-41756: Introduce PyGen_Send C API (GH-22196)
The new API allows to efficiently send values into native generators
and coroutines avoiding use of StopIteration exceptions to signal 
returns.

ceval loop now uses this method instead of the old "private"
_PyGen_Send C API. This translates to 1.6x increased performance
of 'await' calls in micro-benchmarks.

Aside from CPython core improvements, this new API will also allow 
Cython to generate more efficient code, benefiting high-performance
IO libraries like uvloop.
2020-09-18 18:38:38 -07:00
Pablo Galindo a5634c4067
bpo-41746: Add type information to asdl_seq objects (GH-22223)
* Add new capability to the PEG parser to type variable assignments. For instance:
```
       | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```

* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`ast_seq_SET` macros now are typed.
* New `asdl_seq_GET_UNTYPED`/`ast_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Changes all possible `asdl_seq` types to use specific versions everywhere.
2020-09-16 19:42:00 +01:00
Victor Stinner e5fbe0cbd4
bpo-41631: _ast module uses again a global state (#21961)
Partially revert commit ac46eb4ad6662cf6d771b20d8963658b2186c48c:
"bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions.
2020-09-15 18:03:34 +02:00
Maggie Moss 1b4552c5e8
bpo-41428: Implementation for PEP 604 (GH-21515)
See https://www.python.org/dev/peps/pep-0604/ for more information.

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-09-09 21:23:24 +01:00
Victor Stinner e6905e4c82
bpo-41617: Fix pycore_bitutils.h to support clang 3.0 (GH-22042)
__builtin_bswap16() is not available in LLVM clang 3.0.
2020-09-01 18:25:14 +02:00
Stefan Krah 39042e00ab
bpo-41324 Add a minimal decimal capsule API (#21519) 2020-08-10 16:32:21 +02:00
Inada Naoki 46e19b61d3
bpo-41098: Doc: Add missing deprecated directives (GH-21162)
PyUnicodeEncodeError_Create has been deprecated with
`Py_DEPRECATED` macro. But it was not documented.
2020-08-07 16:31:53 +09:00
Mark Shannon 582aaf19e8
bpo-41463: Generate information about jumps from 'opcode.py' rather than duplicating it in 'compile.c' (GH-21714)
Generate information about jumps from 'opcode.py' rather than duplicate it in 'compile.c'
2020-08-04 17:30:11 +01:00
Lysandros Nikolaou b3fbff7289
bpo-40939: Remove even more references to the old parser (GH-21642)
Automerge-Triggered-By: @lysnikolaou
2020-07-27 12:52:59 -07:00
Henry Schreiner 680254a8dc
bpo-41366: Fix clang warning for sign conversion (GH-21592) 2020-07-23 17:39:03 +09:00
Mark Shannon cb9879b948
bpo-40941: Unify implicit and explicit state in the frame and generator objects into a single value. (GH-20803)
* Merge gen and frame state variables into one.

* Replace stack pointer with depth in PyFrameObject. Makes code easier to read and saves a word of memory.
2020-07-17 11:44:23 +01:00
Serhiy Storchaka 4c8f09d7ce
bpo-36346: Make using the legacy Unicode C API optional (GH-21437)
Add compile time option USE_UNICODE_WCHAR_CACHE. Setting it to 0
makes the interpreter not using the wchar_t cache and the legacy Unicode C API.
2020-07-10 23:26:06 +03:00
Victor Stinner b26a0db8ea
Revert "bpo-40170: PyType_HasFeature() now always calls PyType_GetFlags() (GH-19378)" (GH-21390)
This partially reverts commit 45ec5b99ae.
2020-07-08 11:02:23 +02:00
Victor Stinner 8f42748ded
bpo-29778: test_embed tests the path configuration (GH-21306) 2020-07-08 00:20:37 +02:00
Inada Naoki 9ce8132e1f
bpo-41165: Deprecate PyEval_ReleaseLock() (GH-21309) 2020-07-06 11:48:30 +09:00
Serhiy Storchaka b3dd5cd4a3
bpo-36346: Undeprecate private function _PyUnicode_AsUnicode(). (GH-21336) 2020-07-05 18:53:45 +03:00
Inada Naoki 13c90e82b6
Uncomment Py_DEPRECATED for Py_UNICODE APIs (GH-21318)
PyUnicode_EncodeDecimal and PyUnicode_TransformDecimalToASCII
are deprecated since Python 3.3.
But Py_DEPRECATED(3.3) was commented out.
2020-07-05 11:01:54 +09:00