Left-recursive rules need to check for errors explicitly, since
even if the rule returns NULL, the parsing might continue and lead
to long-distance failures.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
* Implement running the parser a second time for the errors messages
The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
* Add new capability to the PEG parser to type variable assignments. For instance:
```
| a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```
* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`ast_seq_SET` macros now are typed.
* New `asdl_seq_GET_UNTYPED`/`ast_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Changes all possible `asdl_seq` types to use specific versions everywhere.
These are like keywords but they only work in context; they are not reserved except when there is an exact match.
This would enable things like match statements without reserving `match` (which would be bad for the `re.match()` function and probably lots of other places).
Automerge-Triggered-By: @gvanrossum
When there are 2 negative lookaheads in the same rule, let's say `!"(" blabla "," !")"`, there will the 2 `FunctionCall`'s where assigned value is None. Currently when the `add_var` is called
the first one will be ignored but when the second lookahead's var is sent to dedupe it
will be returned as `None_1` and this won't be ignored by the declaration generator in the `visit_Alt`. This patch adds an explicit check to `add_var` to distinguish whether if there is a variable or not.
The following improvements are implemented in this commit:
- `p->error_indicator` is set, in case malloc or realloc fail.
- Avoid memory leaks in the case that realloc fails.
- Call `PyErr_NoMemory()` instead of `PyErr_Format()`, because it requires no memory.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
This is for the C generator:
- Disallow rule and variable names starting with `_`
- Rename most local variable names generated by the parser to start with `_`
Exceptions:
- Renaming `p` to `_p` will be a separate PR
- There are still some names that might clash, e.g.
- anything starting with `Py`
- C reserved words (`if` etc.)
- Macros like `EXTRA` and `CHECK`
This commit also allows to pass flags to the new parser in all interfaces and fixes a bug in the parser generator that was causing to inline rules with actions, making them disappear.