The majority of this PR is tediously passing `end_lineno` and `end_col_offset` everywhere. Here are non-trivial points:
* It is not possible to reconstruct end positions in AST "on the fly", some information is lost after an AST node is constructed, so we need two more attributes for every AST node `end_lineno` and `end_col_offset`.
* I add end position information to both CST and AST. Although it may be technically possible to avoid adding end positions to CST, the code becomes more cumbersome and less efficient.
* Since the end position is not known for non-leaf CST nodes while the next token is added, this requires a bit of extra care (see `_PyNode_FinalizeEndPos`). Unless I made some mistake, the algorithm should be linear.
* For statements, I "trim" the end position of suites to not include the terminal newlines and dedent (this seems to be what people would expect), for example in
```python
class C:
pass
pass
```
the end line and end column for the class definition is (2, 8).
* For `end_col_offset` I use the common Python convention for indexing, for example for `pass` the `end_col_offset` is 4 (not 3), so that `[0:4]` gives one the source code that corresponds to the node.
* I added a helper function `ast.get_source_segment()`, to get source text segment corresponding to a given AST node. It is also useful for testing.
An (inevitable) downside of this PR is that AST now takes almost 25% more memory. I think however it is probably justified by the benefits.
The lineno and col_offset attributes of AST nodes for list comprehensions,
generator expressions and tuples are now point to the opening parenthesis or
square brace. For tuples without parenthesis they point to the position
of the first item.
Remove the docstring attribute of AST types and restore docstring
expression as a first stmt in their body.
Co-authored-by: INADA Naoki <methane@users.noreply.github.com>
bpo-29463 added optional "docstring" field to 4 AST types.
While it is optional, it breaks backward compatibility because AST constructor
requires number of positional argument is same to number of fields.
AST types accepts empty arguments, and incomplete keyword arguments.
But it's not big problem because field can be filled after creation, and checked when compiling.
So stop requiring complete set of fields for positional arguments too.
* bpo-29463: Add docstring field to some AST nodes.
ClassDef, ModuleDef, FunctionDef, and AsyncFunctionDef has docstring
field for now. It was first statement of there body.
* fix document. thanks travis!
* doc fixes
The compile ignores constant statements and emit a SyntaxWarning warning.
Don't emit the warning for string statement because triple quoted string is a
common syntax for multiline comments.
Don't emit the warning on ellipis neither: 'def f(): ...' is a legit syntax for
abstract functions.
Changes:
* test_ast: ignore SyntaxWarning when compiling test statements. Modify
test_load_const() to use assignment expressions rather than constant
expression.
* test_code: add more kinds of constant statements, ignore SyntaxWarning when
testing that the compiler removes constant statements.
* test_grammar: ignore SyntaxWarning on the statement "1"
Issue #26146: Add a new kind of AST node: ast.Constant. It can be used by
external AST optimizers, but the compiler does not emit directly such node.
An optimizer can replace the following AST nodes with ast.Constant:
* ast.NameConstant: None, False, True
* ast.Num: int, float, complex
* ast.Str: str
* ast.Bytes: bytes
* ast.Tuple if items are constants too: tuple
* frozenset
Update code to accept ast.Constant instead of ast.Num and/or ast.Str:
* compiler
* docstrings
* ast.literal_eval()
* Tools/parser/unparse.py