* Add to the peg generator a new directive ('&&') that allows to expect
a token and hard fail the parsing if the token is not found. This
allows to quickly emmit syntax errors for missing tokens.
* Use the new grammar element to hard-fail if the ':' is missing before
suites.
Left-recursive rules need to check for errors explicitly, since
even if the rule returns NULL, the parsing might continue and lead
to long-distance failures.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
* Implement running the parser a second time for the errors messages
The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
This commit reverts commit ac0333e1e1 as the original links are working again and they provide extended features such as comments and alternative versions.
* Add new capability to the PEG parser to type variable assignments. For instance:
```
| a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```
* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`ast_seq_SET` macros now are typed.
* New `asdl_seq_GET_UNTYPED`/`ast_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Changes all possible `asdl_seq` types to use specific versions everywhere.
Currently, empty sequences in gather rules make the conditional for
gather rules fail as empty sequences evaluate as "False". We need to
explicitly check for "None" (the failure condition) to avoid false
negatives.
This commit removes the old parser, the deprecated parser module, the old parser compatibility flags and environment variables and all associated support code and documentation.
These are like keywords but they only work in context; they are not reserved except when there is an exact match.
This would enable things like match statements without reserving `match` (which would be bad for the `re.match()` function and probably lots of other places).
Automerge-Triggered-By: @gvanrossum
When there are 2 negative lookaheads in the same rule, let's say `!"(" blabla "," !")"`, there will the 2 `FunctionCall`'s where assigned value is None. Currently when the `add_var` is called
the first one will be ignored but when the second lookahead's var is sent to dedupe it
will be returned as `None_1` and this won't be ignored by the declaration generator in the `visit_Alt`. This patch adds an explicit check to `add_var` to distinguish whether if there is a variable or not.
The following improvements are implemented in this commit:
- `p->error_indicator` is set, in case malloc or realloc fail.
- Avoid memory leaks in the case that realloc fails.
- Call `PyErr_NoMemory()` instead of `PyErr_Format()`, because it requires no memory.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
This is for the C generator:
- Disallow rule and variable names starting with `_`
- Rename most local variable names generated by the parser to start with `_`
Exceptions:
- Renaming `p` to `_p` will be a separate PR
- There are still some names that might clash, e.g.
- anything starting with `Py`
- C reserved words (`if` etc.)
- Macros like `EXTRA` and `CHECK`
This commit also allows to pass flags to the new parser in all interfaces and fixes a bug in the parser generator that was causing to inline rules with actions, making them disappear.
Previously every test was building an extension module and
loading it into sys.modules. The tearDown function was thus
not able to clean up correctly, resulting in memory leaks.
With this commit, every test function now builds the extension
module and runs the actual test code in a new process
(using assert_python_ok), so that sys.modules stays intact
and no memory gets leaked.