Commit Graph

64 Commits

Author SHA1 Message Date
Pablo Galindo Salgado b01fd533fe
Extract visitors from the grammar nodes and call makers in the peg generator (GH-28172)
Simplify the peg generator logic by extracting as much visitors as possible to disentangle the flow and separate concerns.
2021-09-05 14:58:52 +01:00
Pablo Galindo Salgado 953d27261e
Update pegen to use the latest upstream developments (GH-27586) 2021-08-12 17:37:30 +01:00
Akira Nonaka aef1b58dc8
bpo-44345: Fix 'generated by' comment in parser.c (GH-26615) 2021-06-09 16:38:53 +02:00
Pablo Galindo c878a97968
bpo-44180: Fix edge cases in invalid assigment rules in the parser (GH-26283)
The invalid assignment rules are very delicate since the parser can
easily raise an invalid assignment when a keyword argument is provided.
As they are very deep into the grammar tree, is very difficult to
specify in which contexts these rules can be used and in which don't.
For that, we need to use a different version of the rule that doesn't do
error checking in those situations where we don't want the rule to raise
(keyword arguments and generator expressions).

We also need to check if we are in left-recursive rule, as those can try
to eagerly advance the parser even if the parse will fail at the end of
the expression. Failing to do this allows the parser to start parsing a
call as a tuple and incorrectly identify a keyword argument as an
invalid assignment, before it realizes that it was not a tuple after all.
2021-05-21 18:34:54 +01:00
Pablo Galindo b0544ba77c
bpo-38605: Revert making 'from __future__ import annotations' the default (GH-25490)
This reverts commits 044a1048ca and 1be456ae9d, adapting the code to changes that happened after it.
2021-04-21 12:41:19 +01:00
Pablo Galindo b280248be8
bpo-43822: Improve syntax errors for missing commas (GH-25377) 2021-04-15 21:38:45 +01:00
Pablo Galindo 58bafe42ab
Sanitize macros and debug functions in pegen.c (GH-25291) 2021-04-09 01:17:31 +01:00
Victor Stinner 8370e07e1e
bpo-43244: Remove the pyarena.h header (GH-25007)
Remove the pyarena.h header file with functions:

* PyArena_New()
* PyArena_Free()
* PyArena_Malloc()
* PyArena_AddPyObject()

These functions were undocumented, excluded from the limited C API,
and were only used internally by the compiler.

Add pycore_pyarena.h header. Rename functions:

* PyArena_New() => _PyArena_New()
* PyArena_Free() => _PyArena_Free()
* PyArena_Malloc() => _PyArena_Malloc()
* PyArena_AddPyObject() => _PyArena_AddPyObject()
2021-03-24 02:23:01 +01:00
Victor Stinner 57364ce34e
bpo-43244: Remove parser_interface.h header file (GH-25001)
Remove parser functions using the "struct _mod" type, because the
AST C API was removed:

* PyParser_ASTFromFile()
* PyParser_ASTFromFileObject()
* PyParser_ASTFromFilename()
* PyParser_ASTFromString()
* PyParser_ASTFromStringObject()

These functions were undocumented and excluded from the limited C
API.

Add pycore_parser.h internal header file. Rename functions:

* PyParser_ASTFromFileObject() => _PyParser_ASTFromFile()
* PyParser_ASTFromStringObject() => _PyParser_ASTFromString()

These functions are no longer exported (replace PyAPI_FUNC() with
extern).

Remove also _PyPegen_run_parser_from_file() function. Update
test_peg_generator to use _PyPegen_run_parser_from_file_pointer()
instead.
2021-03-24 01:29:09 +01:00
Victor Stinner a81fca6ec8
bpo-43244: Add pycore_compile.h header file (GH-25000)
Remove the compiler functions using "struct _mod" type, because the
public AST C API was removed:

* PyAST_Compile()
* PyAST_CompileEx()
* PyAST_CompileObject()
* PyFuture_FromAST()
* PyFuture_FromASTObject()

These functions were undocumented and excluded from the limited C API.

Rename functions:

* PyAST_CompileObject() => _PyAST_Compile()
* PyFuture_FromASTObject() => _PyFuture_FromAST()

Moreover, _PyFuture_FromAST() is no longer exported (replace
PyAPI_FUNC() with extern). _PyAST_Compile() remains exported for
test_peg_generator.

Remove also compatibility functions:

* PyAST_Compile()
* PyAST_CompileEx()
* PyFuture_FromAST()
2021-03-24 00:51:50 +01:00
Victor Stinner 6af528b4ab
bpo-43244: Fix test_peg_generators on Windows (GH-24913)
Don't redefine Py_DebugFlag, it's already defined in pydebug.h which
is included by Python.h
2021-03-18 09:54:13 +01:00
Victor Stinner e0bf70d08c
bpo-43244: Fix test_peg_generator for PyAST_Validate() (GH-24912)
test_peg_generator now defines _Py_TEST_PEGEN macro when building C
code to not call PyAST_Validate() in Parser/pegen.c. Moreover, it
defines Py_BUILD_CORE_MODULE macro to get access to the internal
C API.

Remove "global_ast_state" from Python-ast.c when it's built by
test_peg_generator: always get the AST state from the current interpreter.
2021-03-18 02:46:06 +01:00
Elisha Hollander e272528bbd
Remove unnecessary imports in the grammar parser (GH-24904) 2021-03-17 22:07:17 +00:00
Jozef Grajciar c994ffe695
bpo-11717: fix ssize_t redefinition error when targeting 32bit Windows app (GH-24479) 2021-03-01 11:18:33 +00:00
Pablo Galindo 58fb156edd
bpo-42997: Improve error message for missing : before suites (GH-24292)
* Add to the peg generator a new directive ('&&') that allows to expect
  a token and hard fail the parsing if the token is not found. This
  allows to quickly emmit syntax errors for missing tokens.

* Use the new grammar element to hard-fail if the ':' is missing before
  suites.
2021-02-02 19:54:22 +00:00
Pablo Galindo 3bcc4ead3f
Add small validator utility for PEG grammars (GH-23519) 2020-12-26 19:11:29 +00:00
Lysandros Nikolaou 02cdfc93f8
bpo-42218: Correctly handle errors in left-recursive rules (GH-23065)
Left-recursive rules need to check for errors explicitly, since
even if the rule returns NULL, the parsing might continue and lead
to long-distance failures.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-10-31 20:31:41 +02:00
Lysandros Nikolaou bca7014032
bpo-42123: Run the parser two times and only enable invalid rules on the second run (GH-22111)
* Implement running the parser a second time for the errors messages

The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
2020-10-27 00:42:04 +02:00
Andre Delfino e8a2076e14
Revert "Fix all Python Cookbook links (#22205)" (GH-22424)
This commit reverts commit ac0333e1e1 as the original links are working again and they provide extended features such as comments and alternative versions.
2020-09-27 01:47:25 +01:00
Pablo Galindo a5634c4067
bpo-41746: Add type information to asdl_seq objects (GH-22223)
* Add new capability to the PEG parser to type variable assignments. For instance:
```
       | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
```

* Add new sequence types from the asdl definition (automatically generated)
* Make `asdl_seq` type a generic aliasing pointer type.
* Create a new `asdl_generic_seq` for the generic case using `void*`.
* The old `asdl_seq_GET`/`ast_seq_SET` macros now are typed.
* New `asdl_seq_GET_UNTYPED`/`ast_seq_SET_UNTYPED` macros for dealing with generic sequences.
* Changes all possible `asdl_seq` types to use specific versions everywhere.
2020-09-16 19:42:00 +01:00
Andre Delfino ac0333e1e1
Fix all Python Cookbook links (#22205) 2020-09-15 21:13:26 +01:00
Pablo Galindo e55a0e971b
Fix 'gather' rules in the python parser generator (GH-22021)
Currently, empty sequences in gather rules make the conditional for
gather rules fail as empty sequences evaluate as "False". We need to
explicitly check for "None" (the failure condition) to avoid false
negatives.
2020-09-03 15:29:55 +01:00
Guido van Rossum 508ed2d912
Delete remaining references to Grammar/Grammar from docs (#21624)
(Ironically, the file itself remains, see https://github.com/we-like-parsers/cpython/issues/135.)
2020-07-26 08:27:52 -07:00
Pablo Galindo 1ac0cbca36
bpo-41215: Don't use NULL by default in the PEG parser keyword list (GH-21355)
Automerge-Triggered-By: @lysnikolaou
2020-07-06 12:31:16 -07:00
Batuhan Taskaya 55460ee6dc
bpo-41044: Generate valid PEG python parsers for opt+seq rules (GH-20995)
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-06-20 18:40:06 +01:00
Serhiy Storchaka 9355868458
bpo-41043: Escape literal part of the path for glob(). (GH-20994) 2020-06-20 11:10:31 +03:00
Pablo Galindo 5fc4f8ae68
Fix trailing whitespace in keyword.py (GH-20881) 2020-06-15 04:33:33 +01:00
Pablo Galindo 78319e373d
Include soft keywords in keyword.py (GH-20877) 2020-06-15 03:55:15 +01:00
Pablo Galindo 756180b4bf
bpo-40939: Clean and adapt the peg_generator directory after deleting the old parser (GH-20822) 2020-06-12 01:55:35 +01:00
Pablo Galindo 1ed83adb0e
bpo-40939: Remove the old parser (GH-20768)
This commit removes the old parser, the deprecated parser module, the old parser compatibility flags and environment variables and all associated support code and documentation.
2020-06-11 17:30:46 +01:00
Lysandros Nikolaou 9727694f08
bpo-40939: Generate keyword.py using the new parser (GH-20800) 2020-06-11 13:45:15 +01:00
Lysandros Nikolaou ba6fd87e41
Refactor scripts in Tools/peg_generator/scripts (GH-20401) 2020-06-05 21:21:40 -07:00
Pablo Galindo 404b23b85b
Fix lookahead of soft keywords in the PEG parser (GH-20436)
Automerge-Triggered-By: @gvanrossum
2020-05-26 16:15:52 -07:00
Guido van Rossum b45af1a569
Add soft keywords (GH-20370)
These are like keywords but they only work in context; they are not reserved except when there is an exact match.

This would enable things like match statements without reserving `match` (which would be bad for the `re.match()` function and probably lots of other places).

Automerge-Triggered-By: @gvanrossum
2020-05-26 10:58:44 -07:00
Ammar Askar a2bbedc8b1
Fix peg_generator compiler warnings under MSVC (GH-20405) 2020-05-26 05:33:35 +01:00
Lysandros Nikolaou 9645930b5b
bpo-40688: Use the correct parser in the peg_generator scripts (GH-20235)
The scripts in `Tools/peg_generator/scripts` mostly assume that
`ast.parse` and `compile` use the old parser, since this was the
state of things, while we were developing them. They need to be
updated to always use the correct parser. `_peg_parser` is being
extended to support both parsing and compiling with both parsers.
2020-05-25 20:51:58 +01:00
Pablo Galindo deb4355a37
bpo-40750: Do not expand the new parser debug flags if Py_BUILD_CORE is not defined (GH-20393) 2020-05-25 20:17:12 +01:00
Pablo Galindo 800a35c623
bpo-40750: Support -d flag in the new parser (GH-20340) 2020-05-25 18:38:45 +01:00
Batuhan Taskaya cba5031510
bpo-40334: Support suppressing of multiple optional variables in Pegen (GH-20367) 2020-05-24 23:20:18 +01:00
Pablo Galindo b831129123
Fix debug output in PEG parser generator (GH-20308) 2020-05-22 02:48:09 +01:00
Pablo Galindo d10fef35c6
Fix typing problems reported by mypy in pegen (GH-20297) 2020-05-21 21:39:44 +01:00
Batuhan Taskaya f50516e6a9
bpo-40334: Correctly generate C parser when assigned var is None (GH-20296)
When there are 2 negative lookaheads in the same rule, let's say `!"(" blabla "," !")"`, there will the 2 `FunctionCall`'s where assigned value is None. Currently when the `add_var` is called
the first one will be ignored but when the second lookahead's var is sent to dedupe it
will be returned as `None_1` and this won't be ignored by the declaration generator in the `visit_Alt`. This patch adds an explicit check to `add_var` to distinguish whether if there is a variable or not.
2020-05-21 20:57:52 +01:00
Pablo Galindo 3764069f3b
bpo-40669: Use requirements.pip when installing PEG dependencies (GH-20194) 2020-05-18 23:37:06 +01:00
Lysandros Nikolaou dc31800f86
bpo-40669: Install PEG benchmarking dependencies in a venv (GH-20183)
Create a `make venv` target, that creates a virtual environment
and installs the dependency in that venv. `make time` and all
the related targets are changed to use the virtual environment
python.

Automerge-Triggered-By: @pablogsal
2020-05-18 11:27:40 -07:00
Lysandros Nikolaou 7b7a21bc4f
bpo-40661: Fix segfault when parsing invalid input (GH-20165)
Fix segfaults when parsing very complex invalid input, like `import äˆ ð£„¯ð¢·žð±‹á”€ð””ð‘©±å®ä±¬ð©¾\n𗶽`.

Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
2020-05-18 18:32:03 +01:00
Lysandros Nikolaou 2c8cd06afe
bpo-40334: Improvements to error-handling code in the PEG parser (GH-20003)
The following improvements are implemented in this commit:
- `p->error_indicator` is set, in case malloc or realloc fail.
- Avoid memory leaks in the case that realloc fails.
- Call `PyErr_NoMemory()` instead of `PyErr_Format()`, because it requires no memory.

Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
2020-05-17 04:19:23 +01:00
Pablo Galindo ac7a92cc0a
bpo-40334: Avoid collisions between parser variables and grammar variables (GH-19987)
This is for the C generator:
- Disallow rule and variable names starting with `_`
- Rename most local variable names generated by the parser to start with `_`

Exceptions:
- Renaming `p` to `_p` will be a separate PR
- There are still some names that might clash, e.g.
  - anything starting with `Py`
  - C reserved words (`if` etc.)
  - Macros like `EXTRA` and `CHECK`
2020-05-09 21:34:50 -07:00
Pablo Galindo db9163ceef
bpo-40555: Check for p->error_indicator in loop rules after the main loop is done (GH-19986) 2020-05-08 03:38:44 +01:00
Pablo Galindo 470aac4d8e
bpo-40334: Generate comments in the parser code to improve debugging (GH-19966) 2020-05-06 23:14:43 +01:00
Anthony Shaw c95e691c90
Clean up unused imports for the peg generator module (GH-19891) 2020-05-04 03:03:05 +01:00