118 Commits

Author SHA1 Message Date
Pablo Galindo 05073036dc
bpo-44368: Improve syntax errors with invalid as pattern targets (GH-26632) 2021-06-10 23:50:32 +01:00
Lysandros Nikolaou e7b4644607
bpo-44385: Remove unused grammar rules (GH-26655)
Automerge-Triggered-By: GH:lysnikolaou
2021-06-10 15:05:06 -07:00
Pablo Galindo b250f89bb7
bpo-44305: Improve syntax error for try blocks without except or finally (GH-26523) 2021-06-03 23:52:12 +01:00
Pablo Galindo c878a97968
bpo-44180: Fix edge cases in invalid assignment rules in the parser (GH-26283)
The invalid assignment rules are very delicate since the parser can
easily raise an invalid assignment when a keyword argument is provided.
As they are very deep in the grammar tree, it is very difficult to
specify in which contexts these rules can be used and in which they cannot.
For that, we need to use a different version of the rule that doesn't do
error checking in those situations where we don't want the rule to raise
(keyword arguments and generator expressions).

We also need to check if we are in a left-recursive rule, as those can try
to eagerly advance the parser even if the parse will fail at the end of
the expression. Failing to do this allows the parser to start parsing a
call as a tuple and incorrectly identify a keyword argument as an
invalid assignment, before it realizes that it was not a tuple after all.
2021-05-21 18:34:54 +01:00
Pablo Galindo 33c0c90dea
bpo-44168: Fix error message in the parser for keyword arguments for invalid expressions (GH-26210) 2021-05-19 19:03:04 +01:00
Pablo Galindo 6692dc1ca9
bpo-43149: Correct the syntax error message for multiple exception types (GH-25996)
Automerge-Triggered-By: GH:pablogsal
2021-05-08 11:24:41 -07:00
Brandt Bucher dbe60ee09d
bpo-43892: Validate the first term of complex literal value patterns (GH-25735) 2021-04-29 17:19:28 -07:00
Nick Coghlan 1e7b858575
bpo-43892: Make match patterns explicit in the AST (GH-25585)
Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
2021-04-28 22:58:44 -07:00
Pablo Galindo a77aac4fca
bpo-43914: Highlight invalid ranges in SyntaxErrors (#25525)
To help users understand which part of the code a SyntaxError refers to, we can highlight the whole error range instead of only placing the caret at the first character. In this way:

>>> foo(x, z for z in range(10), t, w)
  File "<stdin>", line 1
    foo(x, z for z in range(10), t, w)
           ^
SyntaxError: Generator expression must be parenthesized

becomes

>>> foo(x, z for z in range(10), t, w)
  File "<stdin>", line 1
    foo(x, z for z in range(10), t, w)
           ^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized
2021-04-23 14:27:05 +01:00
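The range highlighting described in a77aac4fca can be observed from Python code. The following is a minimal sketch, assuming an interpreter where `SyntaxError` carries an `end_offset` attribute (Python 3.10 or later); the helper name `error_span` is hypothetical and the `getattr` fallback covers older interpreters.

```python
def error_span(source):
    """Return (offset, end_offset) of the SyntaxError raised by compiling source.

    Returns None if the source compiles without error.
    """
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        # end_offset may be missing on interpreters that predate range
        # highlighting, so fall back to None instead of failing.
        return exc.offset, getattr(exc, "end_offset", None)
    return None


span = error_span("foo(x, z for z in range(10), t, w)\n")
print(span)
```

When both values are present, the difference between them is the width of the caret run shown under the offending expression.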
Pablo Galindo 56c95dfe27
bpo-43859: Improve the error message for IndentationError exceptions (GH-25431) 2021-04-21 15:28:21 +01:00
Pablo Galindo b5b98bd8f8
bpo-43823: Fix location of one of the errors for invalid dictionary literals (GH-25427) 2021-04-16 00:45:42 +01:00
Pablo Galindo b280248be8
bpo-43822: Improve syntax errors for missing commas (GH-25377) 2021-04-15 21:38:45 +01:00
Pablo Galindo da74350174
bpo-43823: Improve syntax errors for invalid dictionary literals (GH-25378) 2021-04-15 14:06:39 +01:00
Pablo Galindo 30ed93bfec
bpo-43797: Handle correctly invalid assignments inside function calls and generators (GH-25390) 2021-04-13 17:51:21 +01:00
Pablo Galindo d9151cb453
Ensure that early = are not matched by the parser as invalid comparisons (GH-25375) 2021-04-13 02:32:33 +01:00
Pablo Galindo b86ed8e3bb
bpo-43797: Improve syntax error for invalid comparisons (#25317)
* bpo-43797: Improve syntax error for invalid comparisons

* Update Lib/test/test_fstring.py

Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>

* Apply review comments

* can't -> cannot

Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
2021-04-12 16:59:30 +01:00
Matthew Suozzo 75a06f067b
bpo-43798: Add source location attributes to alias (GH-25324)
* Add source location attributes to alias.
* Move alias star construction to pegen helper.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2021-04-10 22:56:28 +02:00
Victor Stinner d27f8d2e07
bpo-43244: Rename pycore_ast.h functions to _PyAST_xxx() (GH-25252)
Rename AST functions of pycore_ast.h to use the "_PyAST_" prefix.
Remove macros creating aliases without prefix. For example, Module()
becomes _PyAST_Module(). Update Grammar/python.gram to use
_PyAST_xxx() functions.
2021-04-07 21:34:22 +02:00
Pablo Galindo 8efad61963
bpo-41064: Improve syntax error for invalid usage of '**' in f-strings (GH-25006) 2021-03-24 19:34:17 +00:00
Pablo Galindo 08fb8ac99a
bpo-42128: Add 'missing :' syntax error message to match statements (GH-24733) 2021-03-18 01:03:11 +00:00
Brandt Bucher 145bf269df
bpo-42128: Structural Pattern Matching (PEP 634) (GH-22917)
Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Talin <viridia@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2021-02-26 14:51:55 -08:00
Pablo Galindo 206cbdab16
bpo-43149: Improve error message for exception group without parentheses (GH-24467) 2021-02-07 18:42:21 +00:00
Pablo Galindo d4e6ed7e5f
bpo-43121: Fix incorrect SyntaxError message for missing comma (GH-24436) 2021-02-03 23:29:26 +00:00
Pablo Galindo 58fb156edd
bpo-42997: Improve error message for missing : before suites (GH-24292)
* Add to the peg generator a new directive ('&&') that expects a token
  and hard-fails the parse if the token is not found. This makes it
  possible to quickly emit syntax errors for missing tokens.

* Use the new grammar element to hard-fail if the ':' is missing before
  suites.
2021-02-02 19:54:22 +00:00
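The difference between an ordinary token expectation and the forced '&&' expectation from 58fb156edd can be illustrated with a toy parser. This is only a sketch of the idea, not the pegen implementation; `ToyParser`, `expect`, `expect_forced` and `parse_if_header` are hypothetical names.

```python
class ToyParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expect(self, tok):
        """Soft expectation: advance on a match, otherwise report failure
        so the caller can backtrack and try another alternative."""
        if self.pos < len(self.tokens) and self.tokens[self.pos] == tok:
            self.pos += 1
            return True
        return False

    def expect_forced(self, tok):
        """Forced expectation (the '&&' idea): a missing token aborts the
        whole parse with an error instead of allowing backtracking."""
        if not self.expect(tok):
            raise SyntaxError(f"expected {tok!r}")
        return True


def parse_if_header(tokens):
    p = ToyParser(tokens)
    if not p.expect("if"):
        return False       # not an 'if' header; another rule may still match
    p.pos += 1             # consume the condition (a single token in this toy)
    p.expect_forced(":")   # after 'if <cond>', the ':' is mandatory
    return True


print(parse_if_header(["if", "x", ":"]))
print(parse_if_header(["while", "x", ":"]))
```

Calling `parse_if_header(["if", "x"])` raises `SyntaxError` immediately, which is the behaviour the commit uses for a missing ':' before suites.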
Pablo Galindo 835f14ff8e
bpo-43017: Improve error message for unparenthesised tuples in comprehensions (GH-24314) 2021-01-31 22:52:56 +00:00
Lysandros Nikolaou 07dcd86cee
bpo-42860: Remove type error from grammar (GH-24156)
This is only there so that alternative implementations written in statically-typed languages can use this grammar without
having type errors in the way.

Automerge-Triggered-By: GH:lysnikolaou
2021-01-07 14:31:25 -08:00
Lysandros Nikolaou 2ea320dddd
bpo-40631: Disallow single parenthesized star target (GH-24027) 2021-01-03 01:14:21 +02:00
Pablo Galindo 43c4fb6c90
bpo-30858: Improve error location for expressions with assignments (GH-23753)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-12-13 16:46:48 +00:00
Pablo Galindo 9bdc40ee3e
Refactor the grammar to match the language specification docs (GH-23574) 2020-11-30 19:42:38 +00:00
Pablo Galindo b0aba1fcdc
bpo-42381: Allow walrus in set literals and set comprehensions (GH-23332)
Currently walruses are not allowed in set literals and set comprehensions:

>>> {y := 4, 4**2, 3**3}
  File "<stdin>", line 1
    {y := 4, 4**2, 3**3}
       ^
SyntaxError: invalid syntax

but they should be allowed as well, per PEP 572.
2020-11-17 01:17:12 +00:00
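On an interpreter that includes b0aba1fcdc, the set literal from the commit message evaluates normally. A small sketch, assuming Python 3.10 or later; the dictionary `ns` is just a scratch namespace for `eval`.

```python
ns = {}

# The set literal from the commit message: the walrus binds y as a side effect.
literal = eval("{y := 4, 4**2, 3**3}", ns)

# An unparenthesized walrus as the element of a set comprehension.
comprehension = eval("{last := i * 2 for i in range(3)}", ns)

print(sorted(literal), ns["y"])
print(sorted(comprehension))
```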
Lysandros Nikolaou cae60187cf
bpo-42316: Allow unparenthesized walrus operator in indexes (GH-23317) 2020-11-17 01:09:35 +02:00
Lysandros Nikolaou cb3e5ed071
bpo-42374: Allow unparenthesized walrus in genexps (GH-23319)
This fixes a regression that was introduced by the new parser.

Automerge-Triggered-By: GH:lysnikolaou
2020-11-16 15:08:35 -08:00
Pablo Galindo 06f8c3328d
bpo-42214: Fix check for NOTEQUAL token in the PEG parser for the barry_as_flufl rule (GH-23048) 2020-10-30 23:48:42 +00:00
Lysandros Nikolaou 15acc4eaba
bpo-41659: Disallow curly brace directly after primary (GH-22996) 2020-10-27 20:54:20 +02:00
Lysandros Nikolaou bca7014032
bpo-42123: Run the parser two times and only enable invalid rules on the second run (GH-22111)
* Implement running the parser a second time for the error messages

The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't, the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
2020-10-27 00:42:04 +02:00
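The two-pass strategy from bca7014032 can be sketched with a toy parser for `NAME = NUMBER`. Everything here is illustrative: `parse`, `run_parser` and the `enable_invalid_rules` flag are hypothetical stand-ins, not the actual C implementation.

```python
def parse(tokens, enable_invalid_rules=False):
    """Toy parser for NAME '=' NUMBER. Returns a tuple on success, None on failure."""
    if (len(tokens) == 3 and tokens[0].isidentifier()
            and tokens[1] == "=" and tokens[2].isdigit()):
        return ("assign", tokens[0], int(tokens[2]))
    if enable_invalid_rules:
        # Stand-in for an invalid_* rule: only consulted on the second pass.
        if len(tokens) >= 2 and tokens[1] == "=" and not tokens[0].isidentifier():
            raise SyntaxError(f"cannot assign to {tokens[0]!r}")
    return None


def run_parser(tokens):
    tree = parse(tokens)                      # first pass: no invalid_* rules
    if tree is not None:
        return tree
    parse(tokens, enable_invalid_rules=True)  # second pass: may raise a tailored error
    raise SyntaxError("invalid syntax")       # generic fallback


print(run_parser(["x", "=", "1"]))
```

Valid input never pays for the error rules, while invalid input is parsed twice so that the more specific message can be found.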
Lysandros Nikolaou 2e5ca9e3f6
bpo-41746: Cast to typed seqs in CHECK macros to avoid type erasure (GH-22864) 2020-10-21 22:53:14 +03:00
Batuhan Taskaya 48f305fd12
bpo-41979: Accept star-unpacking on with-item targets (GH-22611)
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-10-09 10:56:48 +01:00
Pablo Galindo a5634c4067
bpo-41746: Add type information to asdl_seq objects (GH-22223)
* Add new capability to the PEG parser to type variable assignments. For instance:
```
       | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```

* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`asdl_seq_SET` macros are now typed.
* New `asdl_seq_GET_UNTYPED`/`asdl_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Change all possible `asdl_seq` types to use specific versions everywhere.
2020-09-16 19:42:00 +01:00
Pablo Galindo 315a61f7a9
bpo-41697: Correctly handle KeywordOrStarred when parsing arguments in the parser (GH-22077) 2020-09-03 15:29:32 +01:00
Pablo Galindo 4a97b1517a
bpo-41690: Use a loop to collect args in the parser instead of recursion (GH-22053)
This program can segfault the parser by stack overflow:

```
import ast

code = "f(" + ",".join(['a' for _ in range(100000)]) + ")"
print("Ready!")
ast.parse(code)
```

The reason is that the rule for arguments has a simple recursion when collecting args:

args[expr_ty]:
    [...]
    | a=named_expression b=[',' c=args { c }] {
        [...] }
2020-09-02 17:44:19 +01:00
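The trade-off behind 4a97b1517a can be reproduced in pure Python: collecting one element per recursive call uses one stack frame per element, while a loop uses constant stack depth. This is an analogy under the assumption of a default recursion limit, not the parser's C code; both function names are hypothetical.

```python
def collect_recursive(items, start=0):
    """Right-recursive collection: one stack frame per element."""
    if start == len(items):
        return []
    return [items[start]] + collect_recursive(items, start + 1)


def collect_loop(items):
    """Iterative collection: constant stack depth regardless of length."""
    out = []
    for item in items:
        out.append(item)
    return out


args = ["a"] * 100_000

try:
    collect_recursive(args)
    overflowed = False
except RecursionError:
    # Python guards its stack with a recursion limit; the C parser had no
    # such guard, which is why the original problem was a segfault.
    overflowed = True

print(overflowed, len(collect_loop(args)))
```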
Batuhan Taskaya c8f29ad986
bpo-40769: Allow extra surrounding parentheses for invalid annotated assignment rule (GH-20387) 2020-06-27 19:33:08 +01:00
Lysandros Nikolaou 4b85e60601
bpo-41119: Output correct error message for list/tuple followed by colon (GH-21160) 2020-06-26 00:22:36 +01:00
Lysandros Nikolaou 6c4e0bd974
bpo-41060: Avoid SEGFAULT when calling GET_INVALID_TARGET in the grammar (GH-21020)
`GET_INVALID_TARGET` might unexpectedly return `NULL`, which if not
caught will cause a SEGFAULT. Therefore, this commit introduces a new
inline function `RAISE_SYNTAX_ERROR_INVALID_TARGET` that always
checks for `GET_INVALID_TARGET` returning NULL and can be used in
the grammar, replacing the long C ternary operation used till now.
2020-06-21 03:18:01 +01:00
Lysandros Nikolaou 01ece63d42
bpo-40334: Produce better error messages on invalid targets (GH-20106)
The following error messages get produced:
- `cannot delete ...` for invalid `del` targets
- `... is an illegal 'for' target` for invalid targets in for
  statements
- `... is an illegal 'with' target` for invalid targets in
  with statements

Additionally, a few `cut`s were added in various places before the
invocation of the `invalid_*` rule, in order to speed things
up.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-06-19 00:10:43 +01:00
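The messages produced for invalid targets can be inspected by compiling small snippets. The exact wording has changed between Python versions, so this sketch only collects whatever the running interpreter reports; `syntax_error_message` is a hypothetical helper.

```python
def syntax_error_message(source):
    """Compile source and return the SyntaxError message, or None if it compiles."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


samples = [
    "del f()",              # invalid 'del' target
    "for f() in x: pass",   # invalid 'for' target
    "with a as f(): pass",  # invalid 'with' target
]
messages = {src: syntax_error_message(src) for src in samples}
for src, msg in messages.items():
    print(f"{src!r}: {msg}")
```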
Pablo Galindo b4282dd150
Remove unnecessary grammar decorations and change header (GH-20819) 2020-06-12 00:51:44 +01:00
Lysandros Nikolaou bcd7deed91
bpo-40939: Remove PEG parser easter egg (__new_parser__) (#20802)
It no longer serves a purpose (there's only one parser) and having "new" in any name will eventually look odd. Also, it impinges on a potential sub-namespace, `__new_...__`.
2020-06-11 09:09:21 -07:00
Pablo Galindo c6483c9896
Raise specialised syntax error for invalid lambda parameters (GH-20776) 2020-06-10 14:07:06 +01:00
Pablo Galindo 9f495908c5
bpo-40903: Handle multiple '=' in invalid assignment rules in the PEG parser (GH-20697)
Automerge-Triggered-By: @pablogsal
2020-06-07 18:57:00 -07:00
Lysandros Nikolaou ae14583302
bpo-40334: Produce better error messages for non-parenthesized genexps (GH-20153)
The error message generated for a non-parenthesized generator expression
in a function call was still the generic `invalid syntax` when the generator expression did not appear as the first argument in the call. With this patch, even on input like `f(a, b, c for c in d, e)`, the correct error message gets produced.
2020-05-22 01:56:52 +01:00
Batuhan Taskaya b8a65ec1d3
bpo-40715: Reject dict unpacking on dict comprehensions (GH-20292)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-21 23:39:56 +01:00
Batuhan Taskaya 72e0aa2fd2
bpo-40176: Improve error messages for trailing comma on from import (GH-20294) 2020-05-21 21:41:58 +01:00
Lysandros Nikolaou 75b863aa97
bpo-40334: Reproduce error message for type comments on bare '*' in the new parser (GH-20151) 2020-05-18 20:14:47 +01:00
Pablo Galindo 16ab07063c
bpo-40334: Correctly identify invalid target in assignment errors (GH-20076)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-05-15 02:04:52 +01:00
Lysandros Nikolaou ce21cfca7b
bpo-40618: Disallow invalid targets in augassign and except clauses (GH-20083)
This commit fixes the new parser to disallow invalid targets in the
following scenarios:
- Augmented assignments must only accept a single target (Name,
  Attribute or Subscript), but no tuples or lists.
- `except` clauses should only accept a single `Name` as a target.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-14 21:13:50 +01:00
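The target restrictions from ce21cfca7b can be checked through the `ast` module. A short sketch, with hypothetical helper names `augassign_target_kind` and `parses`:

```python
import ast


def augassign_target_kind(source):
    """Return the AST node type of the augmented-assignment target,
    or None if the parser rejects the statement."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None
    return type(tree.body[0].target).__name__


def parses(source):
    """Return True if source parses without a SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


for src in ("x += 1", "a.b += 1", "a[0] += 1", "(a, b) += 1", "[a, b] += 1"):
    print(src, "->", augassign_target_kind(src))

# An except clause only accepts a plain name after 'as'.
print(parses("try:\n    pass\nexcept E as e:\n    pass\n"))
print(parses("try:\n    pass\nexcept E as a.b:\n    pass\n"))
```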
Lysandros Nikolaou a15c9b3a05
bpo-40334: Always show the caret on SyntaxErrors (GH-20050)
This commit fixes SyntaxError locations when the caret is not displayed,
by doing the following:

- `col_number` always gets set to the location of the offending
  node/expr. When no caret is to be displayed, this gets achieved
  by setting the object holding the error line to None.

- Introduce a new function `_PyPegen_raise_error_known_location`,
  which can be called when an arbitrary `lineno`/`col_offset`
  needs to be passed. This function then gets used in the grammar
  (through some new macros and inline functions) so that SyntaxError
  locations of the new parser match those of the old.
2020-05-13 20:36:27 +01:00
Shantanu 27c0d9b54a
bpo-40334: produce specialized errors for invalid del targets (GH-19911) 2020-05-11 14:53:58 -07:00
Lysandros Nikolaou 4638c64295
bpo-40334: Error message for invalid default args in function call (GH-19973)
When parsing something like `f(g()=2)`, where the name of a default arg
is not a NAME, but an arbitrary expression, a specialised error message
is emitted.
2020-05-07 11:44:06 +01:00
Pablo Galindo 99db2a1db7
bpo-40334: Allow trailing comma in parenthesised context managers (GH-19964) 2020-05-06 22:54:34 +01:00
Lysandros Nikolaou 999ec9ab6a
bpo-40334: Add type to the assignment rule in the grammar file (GH-19963) 2020-05-06 19:11:04 +01:00
Lysandros Nikolaou e10e7c771b
bpo-40334: Specialized error message for invalid args after bare '*' (GH-19865)
When parsing things like `def f(*): pass` the old parser used to output `SyntaxError: named arguments must follow bare *`, which the new parser wasn't able to do.
2020-05-04 11:58:31 +01:00
Shantanu 603d354626
bpo-40493: fix function type comment parsing (GH-19894)
The grammar for func_type_input rejected things like `(*t1) ->t2`. This fixes that.

Automerge-Triggered-By: @gvanrossum
2020-05-03 22:08:14 -07:00
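The input mentioned in 603d354626 can be tried with the `func_type` mode of `ast.parse`, which is available since Python 3.8. A minimal sketch:

```python
import ast

# Parse a function type comment signature with a starred argument type.
tree = ast.parse("(*t1) -> t2", mode="func_type")

print(type(tree).__name__)
print(ast.dump(tree))
```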
Guido van Rossum 3941d9700b
bpo-40334: Refactor lambda_parameters similar to parameters (GH-19830) 2020-05-01 17:42:03 +01:00
Pablo Galindo d955241469
bpo-40334: Correct return value of func_type_comment (GH-19833) 2020-05-01 08:32:09 -07:00
Batuhan Taskaya 76c1b4d5c5
bpo-40334: Improve column offsets for thrown syntax errors by Pegen (GH-19782) 2020-05-01 14:13:43 +01:00
Lysandros Nikolaou 3e0a6f37df
bpo-40334: Add support for feature_version in new PEG parser (GH-19827)
`ast.parse` and `compile` support a `feature_version` parameter that
tells the parser to parse the input string, as if it were written in
an older Python version.
The `feature_version` is propagated to the tokenizer, which uses it
to handle the three different stages of support for `async` and
`await`. Additionally, it disallows the following at parser level:
- The '@' operator in < 3.5
- Async functions in < 3.5
- Async comprehensions in < 3.6
- Underscores in numeric literals in < 3.6
- Await expression in < 3.5
- Variable annotations in < 3.6
- Async for-loops in < 3.5
- Async with-statements in < 3.5
- F-strings in < 3.6

Closes we-like-parsers/cpython#124.
2020-04-30 20:27:52 -07:00
Guido van Rossum c001c09e90
bpo-40334: Support type comments (GH-19780)
This implements full support for # type: <type> comments, # type: ignore <stuff> comments, and the func_type parsing mode for ast.parse() and compile().

Closes https://github.com/we-like-parsers/cpython/issues/95.

(For now, you need to use the master branch of mypy, since another issue unique to 3.9 had to be fixed there, and there's no mypy release yet.)

The only thing missing is `feature_version=N`, which is being tracked in https://github.com/we-like-parsers/cpython/issues/124.
2020-04-30 12:12:19 -07:00
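The type comment support described in c001c09e90 is exposed through the `type_comments` flag of `ast.parse`. A short sketch showing that the comment is only recorded when the flag is set:

```python
import ast

source = "x = []  # type: list[int]\n"

with_comments = ast.parse(source, type_comments=True)
without_comments = ast.parse(source)

print(with_comments.body[0].type_comment)
print(without_comments.body[0].type_comment)

# '# type: ignore' comments are collected on the module node instead.
ignored = ast.parse("x = 1  # type: ignore\n", type_comments=True)
print(len(ignored.type_ignores))
```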
Pablo Galindo 2b74c835a7
bpo-40334: Support CO_FUTURE_BARRY_AS_BDFL in the new parser (GH-19721)
This commit also allows passing flags to the new parser in all interfaces and fixes a bug in the parser generator that was causing rules with actions to be inlined, making them disappear.
2020-04-27 18:02:07 +01:00
Pablo Galindo c5fc156852
bpo-40334: PEP 617 implementation: New PEG parser for CPython (GH-19503)
Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
2020-04-22 23:29:27 +01:00