Commit Graph

1238 Commits

Author SHA1 Message Date
Pablo Galindo Salgado ac61d58db0
gh-119521: Rename IncompleteInputError to _IncompleteInputError and remove from public API/ABI (GH-119680)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2024-06-24 14:08:12 +02:00
Petr Viktorin 6f1d448bc1
gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520)
* Add an InternalDocs file describing how interning should work and how to use it.

* Add internal functions to *explicitly* request what kind of interning is done:
  - `_PyUnicode_InternMortal`
  - `_PyUnicode_InternImmortal`
  - `_PyUnicode_InternStatic`

* Switch uses of `PyUnicode_InternInPlace` to those.

* Disallow using `_Py_SetImmortal` on strings directly.
  You should use `_PyUnicode_InternImmortal` instead:
  - Strings should be interned before immortalization, otherwise you're possibly
    immortalizing a copy rather than the interned string.
  - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
    `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
    backports, as they are now part of public API and version-specific ABI.

* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
  - `_Py_ID`
  - `_Py_STR` (including the empty string)
  - one-character latin-1 singletons

  Now, when you intern a singleton, that exact singleton will be interned.

* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

* Intern `_Py_STR` singletons at startup.

* For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup.

* Beef up the tests. Cover internal details (marked with `@cpython_only`).

* Add lots of assertions

Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>
2024-06-21 17:19:31 +02:00
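The functions named in the gh-113993 entry above are internal C APIs and cannot be called from Python. As a loose, Python-level illustration of what interning means in general, the sketch below uses only the public `sys.intern`. It does not show the mortal/immortal distinction, and the variable names are made up for the example.

```python
import sys

# Build two equal strings at run time so that each call produces its own object.
parts = ["parser", "_", "token"]
first = "".join(parts)
second = "".join(parts)

# Interning maps every equal string to one canonical object.
canonical_first = sys.intern(first)
canonical_second = sys.intern(second)

# Both calls hand back the same canonical string object.
same_object = canonical_first is canonical_second
print(same_object)
```

Comparing interned strings by identity is what makes lookups of frequently used names cheap; whether the canonical object can later be freed is the mortal/immortal question the commit addresses.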
Mark Jason Dominus (陶敏修) bd8c1f97e1
gh-94808: Reorganize _make_posargs and mark unused code (GH-119227)
* Reorganize four-way if-elsif-elsif-elsif as nested if-elses
* Mark unused branch in _make_posargs

`names_with_default` is never `NULL`, even if there are no names with
defaults.  In that case it points to a structure with `size` zero.

Rather than eliminating the branch, we leave it behind with an `assert(0)`
in case a future change to the grammar exercises the branch.
2024-06-04 14:59:56 +02:00
Petr Viktorin 31a4fb3c74
gh-119724: Revert "bpo-45759: Better error messages for non-matching 'elif'/'else' statements (#29513)" (#119974)
This reverts commit 1c8f912ebd.
2024-06-03 18:10:15 -07:00
Petr Viktorin 48f21b3631
gh-118235: Move RAISE_SYNTAX_ERROR actions to invalid rules and make sure they stay there (GH-119731)
The Full Grammar specification in the docs omits rule actions, so grammar rules that raise a syntax error looked like valid syntax.
This was solved in ef940de by hiding those rules in the custom syntax highlighter.

This moves all syntax-error alternatives to invalid rules, adds a validator that ensures that actions containing RAISE_SYNTAX_ERROR are in invalid rules, and reverts the syntax highlighter hack.
2024-05-30 09:27:32 +02:00
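The validator described in the gh-118235 entry above can be pictured roughly as follows. This is a minimal sketch, assuming a simplified grammar model (a dict from rule name to a list of action strings) and assuming that invalid rules are the ones whose names start with `invalid_`; the function name and the toy grammar are hypothetical, and the real check operates on the PEG generator's own grammar objects.

```python
from typing import Dict, List


def find_misplaced_syntax_errors(rules: Dict[str, List[str]]) -> List[str]:
    """Return names of non-invalid rules whose actions raise a syntax error."""
    offenders = []
    for rule_name, actions in rules.items():
        # Rules named invalid_* are allowed to raise syntax errors.
        if rule_name.startswith("invalid_"):
            continue
        if any("RAISE_SYNTAX_ERROR" in action for action in actions):
            offenders.append(rule_name)
    return offenders


# Hypothetical toy grammar: rule name -> action strings (illustration only).
toy_grammar = {
    "import_stmt": ["_PyAST_Import(a, EXTRA)"],
    "invalid_import": ["RAISE_SYNTAX_ERROR('expected names after import')"],
    "bad_rule": ["RAISE_SYNTAX_ERROR('should live in an invalid rule')"],
}

print(find_misplaced_syntax_errors(toy_grammar))  # ['bad_rule']
```

Keeping the error-raising alternatives in one class of rules lets the documentation drop them wholesale instead of relying on a highlighter hack.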
Lysandros Nikolaou d87b015106
gh-119118: Fix performance regression in tokenize module (#119615)
- Cache line object to avoid creating a Unicode object
  for all of the tokens in the same line.
- Speed up byte offset to column offset conversion by using the
  smallest buffer possible to measure the difference.

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2024-05-28 19:17:49 +00:00
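The second bullet of the gh-119118 entry above (measuring only the difference when converting byte offsets to column offsets) can be sketched in Python. The actual change is in C; the function names here are hypothetical, and the sketch assumes UTF-8 input and offsets that fall on character boundaries.

```python
def byte_offset_to_col(line: bytes, byte_offset: int) -> int:
    """Column (in characters) of a byte offset, decoding from the line start."""
    return len(line[:byte_offset].decode("utf-8"))


def incremental_col(line: bytes, prev_byte: int, prev_col: int, byte_offset: int) -> int:
    """Same column, but decoding only the bytes since the previous known position."""
    return prev_col + len(line[prev_byte:byte_offset].decode("utf-8"))


line = "naïve = 'é'".encode("utf-8")

# "naïve" is 5 characters but 6 bytes, because "ï" takes two bytes in UTF-8.
col_after_name = byte_offset_to_col(line, 6)

# Continue from the known (byte 6, column 5) position instead of restarting.
col_after_equals = incremental_col(line, 6, col_after_name, 9)

print(col_after_name, col_after_equals)  # 5 8
```

For a line with many tokens, the incremental form decodes each byte once overall rather than re-decoding the whole prefix for every token.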
Jelle Zijlstra 68fbc00dc8
gh-118851: Default ctx arguments to AST constructors to Load() (#118854)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-05-09 15:30:14 -07:00
Nikita Sobolev b60d4c0d53
gh-118090: Improve error message for empty type param brackets (GH-118091) 2024-05-07 14:01:06 +02:00
Jelle Zijlstra e0422198fb
gh-117486: Improve behavior for user-defined AST subclasses (#118212)
Now, such classes will no longer require changes in Python 3.13 in the normal case.
The test suite for robotframework passes with no DeprecationWarnings under this PR.

I also added a new DeprecationWarning for the case where `_field_types` exists
but is incomplete, since that seems likely to indicate a user mistake.
2024-05-06 15:57:27 -07:00
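As an illustration of the kind of user-defined AST subclass the gh-117486 entry above refers to, here is a minimal sketch. The `Comment` class and its `text` field are hypothetical; the sketch declares `_fields` only and supplies every field, which is the simple case that is expected to keep working without changes.

```python
import ast


class Comment(ast.AST):
    """Hypothetical custom node type, used only for illustration."""

    _fields = ("text",)


# Fields can be supplied by keyword or positionally, in _fields order.
by_keyword = Comment(text="hello")
by_position = Comment("world")

print(by_keyword.text, by_position.text)  # hello world
```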
Brett Simmers c2627d6eea
gh-116322: Add Py_mod_gil module slot (#116882)
This PR adds the ability to enable the GIL if it was disabled at
interpreter startup, and modifies the multi-phase module initialization
path to enable the GIL when loading a module, unless that module's spec
includes a slot indicating it can run safely without the GIL.

PEP 703 called the constant for the slot `Py_mod_gil_not_used`; I went
with `Py_MOD_GIL_NOT_USED` for consistency with gh-104148.

A warning will be issued up to once per interpreter for the first
GIL-using module that is loaded. If `-v` is given, a shorter message
will be printed to stderr every time a GIL-using module is loaded
(including the first one that issues a warning).
2024-05-03 11:30:55 -04:00
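The slot in the gh-116322 entry above is declared in C, in an extension module's slot table. From Python, the observable effect is whether the GIL is currently enabled. The sketch below is a version-tolerant probe: it assumes `sys._is_gil_enabled()` is available (Python 3.13 and later) and otherwise falls back to reporting that the GIL is enabled. The helper name `gil_enabled` is made up for the example.

```python
import sys


def gil_enabled() -> bool:
    """Report whether the GIL is enabled, tolerating older interpreters."""
    probe = getattr(sys, "_is_gil_enabled", None)
    if probe is None:
        # Interpreters without the probe always run with the GIL.
        return True
    return bool(probe())


print(gil_enabled())
```

On a free-threaded build, calling this before and after importing an extension module is one way to see whether that import caused the GIL to be turned back on.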
Jelle Zijlstra ca269e58c2
gh-116126: Implement PEP 696 (#116129)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2024-05-03 06:17:32 -07:00
David Rubin 9b280ab0ab
gh-116988: Remove duplicates of `annotated_rhs` in the Grammar (#117004) 2024-04-24 18:16:06 +01:00
Nikita Sobolev de1f686827
gh-118082: Improve `import` without names syntax error message (#118083) 2024-04-23 13:00:52 +01:00
Grigoriev Semyon c97d3af239
gh-109120: Fix syntax error in handling of incorrect star expressions (#117444) 2024-04-02 11:42:58 +01:00
Jelle Zijlstra 4c71d51a4b
gh-117266: Fix crashes on user-created AST subclasses (GH-117276)
2024-03-28 11:30:31 +01:00
Pablo Galindo Salgado 61599a48f5
bpo-24612: Improve syntax error for 'not' after an operator (GH-28170)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2024-03-26 10:30:46 +01:00
Serhiy Storchaka 72d3cc94cd
gh-116437: Use new C API PyDict_Pop() to simplify the code (GH-116438) 2024-03-07 11:21:08 +02:00
Jelle Zijlstra ed4dfd8825
gh-105858: Improve AST node constructors (#105880)
Demonstration:

>>> ast.FunctionDef.__annotations__
{'name': <class 'str'>, 'args': <class 'ast.arguments'>, 'body': list[ast.stmt], 'decorator_list': list[ast.expr], 'returns': ast.expr | None, 'type_comment': str | None, 'type_params': list[ast.type_param]}
>>> ast.FunctionDef()
<stdin>:1: DeprecationWarning: FunctionDef.__init__ missing 1 required positional argument: 'name'. This will become an error in Python 3.15.
<stdin>:1: DeprecationWarning: FunctionDef.__init__ missing 1 required positional argument: 'args'. This will become an error in Python 3.15.
<ast.FunctionDef object at 0x101959460>
>>> node = ast.FunctionDef(name="foo", args=ast.arguments())
>>> node.decorator_list
[]
>>> ast.FunctionDef(whatever="you want", name="x", args=ast.arguments())
<stdin>:1: DeprecationWarning: FunctionDef.__init__ got an unexpected keyword argument 'whatever'. Support for arbitrary keyword arguments is deprecated and will be removed in Python 3.15.
<ast.FunctionDef object at 0x1019581f0>
2024-02-27 18:13:03 -08:00
Pablo Galindo Salgado 015b97d19a
gh-115823: Correctly calculate error locations when dealing with implicit encodings (#115824) 2024-02-26 12:57:09 +00:00
Alex Waygood 7a3518e43a
gh-115881: Ensure `ast.parse()` parses conditional context managers even with low `feature_version` passed (#115920) 2024-02-26 09:22:09 +00:00
Pablo Galindo Salgado 39d102c2ee
gh-113744: Add a new IncompleteInputError exception to improve incomplete input detection in the codeop module (#113745)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
2024-01-30 16:21:30 +00:00
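For context on the gh-113744 entry above, this is how incomplete input looks from the public `codeop` API. The sketch relies only on the documented behaviour of `codeop.compile_command` (a code object for complete input, `None` for incomplete input, `SyntaxError` for invalid input) and does not touch the internal exception the commit introduces.

```python
import codeop

# Complete statement: a code object is returned.
complete = codeop.compile_command("x = 1")

# Incomplete statement: None signals that more input is needed.
incomplete = codeop.compile_command("def f():")

# Invalid statement: a SyntaxError is raised instead.
try:
    codeop.compile_command("x = )")
    invalid_raised = False
except SyntaxError:
    invalid_raised = True

print(complete is not None, incomplete is None, invalid_raised)  # True True True
```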
Erlend E. Aasland dcd28b5c35
gh-114569: Use PyMem_* APIs for most non-PyObject uses (#114574)
Fix usage in Modules, Objects, and Parser subdirectories.
2024-01-26 10:11:35 +00:00
Mark Shannon 17b73ab99e
GH-113655: Lower the C recursion limit on various platforms (GH-113944) 2024-01-16 09:32:01 +00:00
Grigoriev Semyon bb4c167060
gh-111488: Changed error message in case of no 'in' keyword after 'for' in cmp (#113656) 2024-01-06 10:27:49 +00:00
Pablo Galindo Salgado 3003fbbf00
gh-113703: Correctly identify incomplete f-strings in the codeop module (#113709) 2024-01-05 12:16:46 +00:00
Pablo Galindo Salgado 9ed36d533a
gh-113602: Bail out when the parser tries to override existing errors (#113607)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
2024-01-02 13:00:52 +00:00
Yilei Yang 48c49739f5
gh-106905: Use separate structs to track recursion depth in each PyAST_mod2obj call. (GH-113035)
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
2023-12-25 19:36:59 +02:00
Pablo Galindo Salgado a135a6d2c6
gh-112943: Correctly compute end offsets for multiline tokens in the tokenize module (#112949) 2023-12-11 11:44:22 +00:00
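To make "multiline tokens" in the gh-112943 entry above concrete, the sketch below tokenizes a triple-quoted string that spans two lines and prints the start and end positions that the `tokenize` module reports. The sample source is invented for the example.

```python
import io
import tokenize

# A string literal that starts on line 1 and ends on line 2.
source = 'x = """a\nb"""\n'

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
string_token = next(tok for tok in tokens if tok.type == tokenize.STRING)

# start and end are (row, column) pairs; rows are 1-based, columns 0-based.
print(string_token.start, string_token.end)
```

The end column is relative to the last line of the token, not the line where the token starts, which is the detail that is easy to get wrong for tokens spanning several lines.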
Yang Hau 707c37e373
Fix typos in variable names, function names, and comments (GH-101868) 2023-12-01 09:37:40 +00:00
Pablo Galindo Salgado 45d648597b
gh-112387: Fix error positions for decoded strings with backwards tokenize errors (#112409)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
2023-11-27 18:37:48 +00:00
Pablo Galindo Salgado 2c8b191742
gh-112388: Fix an error that was causing the parser to try to overwrite tokenizer errors (#112410)
Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
2023-11-27 18:36:11 +00:00
Pablo Galindo Salgado d59feb5dbe
gh-112243: Don't include comments in f-string debug expressions (#112284) 2023-11-20 15:18:24 +00:00
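For readers unfamiliar with the term in the gh-112243 entry above, an f-string "debug expression" is the `=` form, which echoes the expression text followed by its value. The sketch below shows only that basic behaviour, which works on any Python that supports the `=` specifier; it does not exercise comments inside f-strings, which need a newer parser.

```python
value = 3

# The text before "=" is reproduced verbatim, including surrounding spaces.
compact = f"{value=}"
spaced = f"{value = }"

print(compact)  # value=3
print(spaced)   # value = 3
```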
Crowthebird 1c8f912ebd
bpo-45759: Better error messages for non-matching 'elif'/'else' statements (#29513) 2023-11-20 13:27:53 +00:00
Brett Cannon 56e59a49ae
GH-111807: Lower the parser stack depth under WASI debug builds (#112225) 2023-11-20 13:27:33 +00:00
Sam Gross 446f18a911
gh-111956: Add thread-safe one-time initialization. (gh-111960) 2023-11-16 12:19:54 -07:00
Markus Mohrhard 1447af7970
gh-106905: avoid incorrect SystemError about recursion depth mismatch (#106906)
* Update Misc/NEWS.d/next/Core and Builtins/2023-07-20-11-41-16.gh-issue-106905.AyZpuB.rst

Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-11-13 13:05:17 +00:00
Serhiy Storchaka 937872e8ea
Simplify _PyPegen_join_names_with_dot() (GH-111602) 2023-11-01 16:25:36 +02:00
Tomas R 453e96e302
gh-111420: Allow type comments in parenthesized `with` statements (#111468) 2023-10-31 21:02:42 +00:00
Pablo Galindo Salgado 3d2f1f0b83
gh-111380: Show SyntaxWarnings only once when parsing if invalid syntax is encountered (#111381) 2023-10-27 12:19:34 +09:00
Shantanu 3156d193b8
gh-100445: Improve error message for unterminated strings with escapes (#100446) 2023-10-18 13:58:51 +01:00
Pablo Galindo Salgado 24e4ec7766
gh-110938: Fix error messages for indented blocks with functions and classes with generic type parameters (#110973) 2023-10-17 13:45:13 +01:00
Lysandros Nikolaou f46333b9f5
gh-107450: Remove unnecessary overflow check in parser error handler (#110940) 2023-10-16 22:41:01 +02:00
Lysandros Nikolaou a1ac5590e0
gh-107450: Check for overflow in the tokenizer and fix overflow test (#110832)
Co-authored-by: Filipe Laíns <lains@riseup.net>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-10-16 16:42:49 +02:00
Pablo Galindo Salgado e1d8c65e1d
gh-110805: Allow the repl to show source code and complete tracebacks (#110775) 2023-10-13 09:25:37 +00:00
Lysandros Nikolaou fb7843ee89
gh-107450: Raise OverflowError when parser column offset overflows (#110754) 2023-10-12 09:34:12 +00:00
Pablo Galindo Salgado 3d180347ae
gh-110696: Fix incorrect syntax error message for incorrect argument unpacking (#110706) 2023-10-12 09:02:02 +00:00
Lysandros Nikolaou 17d65547df
gh-104169: Fix test_peg_generator after tokenizer refactoring (#110727)
* Fix test_peg_generator after tokenizer refactoring
* Remove references to tokenizer.c in comments etc.
2023-10-12 09:34:35 +02:00
Filipe Laíns 23645420dc
GH-110749: fix unistd.h import in file_tokenizer.c (#110750) 2023-10-12 07:52:13 +02:00
Lysandros Nikolaou 01481f2dc1
gh-104169: Refactor tokenizer into lexer and wrappers (#110684)
* The lexer, which includes the actual lexeme-producing logic, goes into
  the `lexer` directory.
* The wrappers, one wrapper per input mode (file, string, utf-8, and
  readline), go into the `tokenizer` directory and include logic for
  creating a lexer instance and managing the buffer for different modes.

Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2023-10-11 15:14:44 +00:00
sunmy2019 2cb62c6437
gh-110309: Prune empty constant in format specs (#110320)
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2023-10-05 14:08:42 +00:00