:mod:`ast` --- Abstract Syntax Trees
====================================

.. module:: ast
   :synopsis: Abstract Syntax Tree classes and manipulation.

.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Georg Brandl <georg@python.org>

**Source code:** :source:`Lib/ast.py`

--------------

The :mod:`ast` module helps Python applications to process trees of the Python
abstract syntax grammar. The abstract syntax itself might change with each
Python release; this module helps to find out programmatically what the current
grammar looks like.

An abstract syntax tree can be generated by passing :data:`ast.PyCF_ONLY_AST` as
a flag to the :func:`compile` built-in function, or using the :func:`parse`
helper provided in this module. The result will be a tree of objects whose
classes all inherit from :class:`ast.AST`. An abstract syntax tree can be
compiled into a Python code object using the built-in :func:`compile` function.

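As a minimal sketch of the round trip described above (the source string and
the ``namespace`` dictionary are illustrative choices, not part of the module),
a tree can be obtained either way and then compiled back into a code object:

```python
import ast

source = "answer = 6 * 7\n"

# Two equivalent ways to obtain the tree.
tree_from_parse = ast.parse(source)
tree_from_compile = compile(source, "<unknown>", "exec", ast.PyCF_ONLY_AST)

# Every object in the tree is an instance of ast.AST.
all_nodes_are_ast = all(
    isinstance(node, ast.AST) for node in ast.walk(tree_from_parse)
)

# The tree can be handed back to compile() to get a code object.
code = compile(tree_from_parse, "<unknown>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["answer"])
```
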
Node classes
------------

.. class:: AST

   This is the base of all AST node classes. The actual node classes are
   derived from the :file:`Parser/Python.asdl` file, which is reproduced
   :ref:`below <abstract-grammar>`. They are defined in the :mod:`_ast` C
   module and re-exported in :mod:`ast`.

   There is one class defined for each left-hand side symbol in the abstract
   grammar (for example, :class:`ast.stmt` or :class:`ast.expr`). In addition,
   there is one class defined for each constructor on the right-hand side; these
   classes inherit from the classes for the left-hand side trees. For example,
   :class:`ast.BinOp` inherits from :class:`ast.expr`. For production rules
   with alternatives (aka "sums"), the left-hand side class is abstract: only
   instances of specific constructor nodes are ever created.

   .. index:: single: ? (question mark); in AST grammar
   .. index:: single: * (asterisk); in AST grammar

   .. attribute:: _fields

      Each concrete class has an attribute :attr:`_fields` which gives the names
      of all child nodes.

      Each instance of a concrete class has one attribute for each child node,
      of the type as defined in the grammar. For example, :class:`ast.BinOp`
      instances have an attribute :attr:`left` of type :class:`ast.expr`.

      If these attributes are marked as optional in the grammar (using a
      question mark), the value might be ``None``. If the attributes can have
      zero-or-more values (marked with an asterisk), the values are represented
      as Python lists. All possible attributes must be present and have valid
      values when compiling an AST with :func:`compile`.

   .. attribute:: lineno
                  col_offset
                  end_lineno
                  end_col_offset

      Instances of :class:`ast.expr` and :class:`ast.stmt` subclasses have
      :attr:`lineno`, :attr:`col_offset`, :attr:`end_lineno`, and
      :attr:`end_col_offset` attributes. The :attr:`lineno` and :attr:`end_lineno`
      are the first and last line numbers of the source text span (1-indexed, so
      the first line is line 1), and the :attr:`col_offset` and
      :attr:`end_col_offset` are the corresponding UTF-8 byte offsets of the
      first and last tokens that generated the node. The UTF-8 offset is
      recorded because the parser uses UTF-8 internally.

      Note that the end positions are not required by the compiler and are
      therefore optional. The end offset is *after* the last symbol; for example,
      one can get the source segment of a one-line expression node using
      ``source_line[node.col_offset : node.end_col_offset]``.

   The constructor of a class :class:`ast.T` parses its arguments as follows:

   * If there are positional arguments, there must be as many as there are items
     in :attr:`T._fields`; they will be assigned as attributes of these names.
   * If there are keyword arguments, they will set the attributes of the same
     names to the given values.

   For example, to create and populate an :class:`ast.UnaryOp` node, you could
   use ::

      node = ast.UnaryOp()
      node.op = ast.USub()
      node.operand = ast.Constant()
      node.operand.value = 5
      node.operand.lineno = 0
      node.operand.col_offset = 0
      node.lineno = 0
      node.col_offset = 0

   or the more compact ::

      node = ast.UnaryOp(ast.USub(), ast.Constant(5, lineno=0, col_offset=0),
                         lineno=0, col_offset=0)

.. versionchanged:: 3.8

   Class :class:`ast.Constant` is now used for all constants.

.. deprecated:: 3.8

   Old classes :class:`ast.Num`, :class:`ast.Str`, :class:`ast.Bytes`,
   :class:`ast.NameConstant` and :class:`ast.Ellipsis` are still available,
   but they will be removed in future Python releases. In the meantime,
   instantiating them will return an instance of a different class.

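The following sketch ties the points above together: it inspects the class
hierarchy and ``_fields``, builds the expression ``-5`` with keyword arguments,
and compiles it. Wrapping the node in :class:`ast.Expression` and filling in
locations with :func:`fix_missing_locations` are choices made for this example
so that the tree is acceptable to :func:`compile`:

```python
import ast

# Constructor classes inherit from the class of their left-hand side symbol.
binop_is_expr = issubclass(ast.BinOp, ast.expr)
print(ast.BinOp._fields)  # names of the child nodes

# Build the expression -5 by hand, using keyword arguments.
node = ast.UnaryOp(op=ast.USub(), operand=ast.Constant(value=5))

# compile() needs a root node and location information on expression nodes.
tree = ast.fix_missing_locations(ast.Expression(body=node))
result = eval(compile(tree, "<ast>", "eval"))
print(result)
```
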
.. _abstract-grammar:

Abstract Grammar
----------------

The abstract grammar is currently defined as follows:

.. literalinclude:: ../../Parser/Python.asdl
   :language: none

:mod:`ast` Helpers
------------------

Apart from the node classes, the :mod:`ast` module defines these utility functions
and classes for traversing abstract syntax trees:

.. function:: parse(source, filename='<unknown>', mode='exec', *, type_comments=False, feature_version=None)

   Parse the source into an AST node. Equivalent to ``compile(source,
   filename, mode, ast.PyCF_ONLY_AST)``.

   If ``type_comments=True`` is given, the parser is modified to check
   and return type comments as specified by :pep:`484` and :pep:`526`.
   This is equivalent to adding :data:`ast.PyCF_TYPE_COMMENTS` to the
   flags passed to :func:`compile()`. This will report syntax errors
   for misplaced type comments. Without this flag, type comments will
   be ignored, and the ``type_comment`` field on selected AST nodes
   will always be ``None``. In addition, the locations of ``# type:
   ignore`` comments will be returned as the ``type_ignores``
   attribute of :class:`Module` (otherwise it is always an empty list).

   In addition, if ``mode`` is ``'func_type'``, the input syntax is
   modified to correspond to :pep:`484` "signature type comments",
   e.g. ``(str, int) -> List[str]``.

   Also, setting ``feature_version`` to a tuple ``(major, minor)``
   will attempt to parse using that Python version's grammar.
   Currently ``major`` must equal ``3``. For example, setting
   ``feature_version=(3, 4)`` will allow the use of ``async`` and
   ``await`` as variable names. The lowest supported version is
   ``(3, 4)``; the highest is ``sys.version_info[0:2]``.

   .. warning::
      It is possible to crash the Python interpreter with a
      sufficiently large/complex string due to stack depth limitations
      in Python's AST compiler.

   .. versionchanged:: 3.8
      Added ``type_comments``, ``mode='func_type'`` and ``feature_version``.

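A short sketch of the ``type_comments`` and ``mode='func_type'`` behaviour
described for :func:`parse`; the source strings are made up for illustration:

```python
import ast

# Without type_comments, the type comment is ignored.
plain = ast.parse("x = []  # type: List[int]\n")
print(plain.body[0].type_comment)

# With type_comments=True, it is stored on the Assign node.
typed = ast.parse("x = []  # type: List[int]\n", type_comments=True)
print(typed.body[0].type_comment)

# Locations of "# type: ignore" comments end up in Module.type_ignores.
ignored = ast.parse("y = f()  # type: ignore\n", type_comments=True)
ignore_lines = [entry.lineno for entry in ignored.type_ignores]

# mode='func_type' parses a PEP 484 signature type comment.
signature = ast.parse("(str, int) -> List[str]", mode="func_type")
print(len(signature.argtypes), type(signature.returns).__name__)
```
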
.. function:: literal_eval(node_or_string)

   Safely evaluate an expression node or a string containing a Python literal or
   container display. The string or node provided may only consist of the
   following Python literal structures: strings, bytes, numbers, tuples, lists,
   dicts, sets, booleans, and ``None``.

   This can be used for safely evaluating strings containing Python values from
   untrusted sources without the need to parse the values oneself. It is not
   capable of evaluating arbitrarily complex expressions, for example involving
   operators or indexing.

   .. warning::
      It is possible to crash the Python interpreter with a
      sufficiently large/complex string due to stack depth limitations
      in Python's AST compiler.

   .. versionchanged:: 3.2
      Now allows bytes and set literals.

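One possible way to use :func:`literal_eval` on untrusted text is sketched
below. ``parse_literal`` is a hypothetical helper written for this example; it
handles only :exc:`ValueError` and :exc:`SyntaxError`, so other failures (for
instance on extremely large inputs) would still propagate:

```python
import ast

def parse_literal(text, default=None):
    """Hypothetical helper: return the literal value of *text*, or *default*."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Raised for anything that is not a plain literal or container display.
        return default

config = parse_literal("{'retries': 3, 'hosts': ['a', 'b'], 'timeout': None}")

# Function calls and indexing are not literals, so they are rejected.
rejected_call = parse_literal("__import__('os').getcwd()", default="rejected")
rejected_index = parse_literal("[1, 2, 3][0]", default="rejected")
```
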
.. function:: get_docstring(node, clean=True)

   Return the docstring of the given *node* (which must be a
   :class:`FunctionDef`, :class:`AsyncFunctionDef`, :class:`ClassDef`,
   or :class:`Module` node), or ``None`` if it has no docstring.
   If *clean* is true, clean up the docstring's indentation with
   :func:`inspect.cleandoc`.

   .. versionchanged:: 3.5
      :class:`AsyncFunctionDef` is now supported.

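For illustration, the sketch below reads docstrings from a small made-up
module; the function names ``greet`` and ``silent`` exist only in this example:

```python
import ast

source = (
    "def greet():\n"
    '    """Say hello.\n'
    "\n"
    "    Longer description.\n"
    '    """\n'
    "\n"
    "def silent():\n"
    "    pass\n"
)
greet, silent = ast.parse(source).body

cleaned = ast.get_docstring(greet)           # indentation removed
raw = ast.get_docstring(greet, clean=False)  # exactly as written
missing = ast.get_docstring(silent)          # no docstring at all
```
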
.. function:: get_source_segment(source, node, *, padded=False)

   Get source code segment of the *source* that generated *node*.
   If some location information (:attr:`lineno`, :attr:`end_lineno`,
   :attr:`col_offset`, or :attr:`end_col_offset`) is missing, return ``None``.

   If *padded* is ``True``, the first line of a multi-line statement will
   be padded with spaces to match its original position.

   .. versionadded:: 3.8

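The sketch below shows :func:`get_source_segment` next to plain slicing with
``col_offset`` and ``end_col_offset``. Because the offsets count UTF-8 bytes,
slicing a :class:`str` directly is only reliable for ASCII-only lines; the
sample sources are invented for this example:

```python
import ast

source = "total = price * quantity\n"
value = ast.parse(source).body[0].value  # the BinOp on the right-hand side
segment = ast.get_source_segment(source, value)

# For ASCII-only, one-line source, plain slicing gives the same text.
sliced = source[value.col_offset:value.end_col_offset]

# With a non-ASCII character earlier on the line, the byte offset of "x"
# no longer matches its character index, but the helper still finds it.
accented = "s = 'é' + x\n"
name_x = ast.parse(accented).body[0].value.right
x_segment = ast.get_source_segment(accented, name_x)

# A hand-made node carries no location information.
missing = ast.get_source_segment(source, ast.Name(id="price", ctx=ast.Load()))
```
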
.. function:: fix_missing_locations(node)

   When you compile a node tree with :func:`compile`, the compiler expects
   :attr:`lineno` and :attr:`col_offset` attributes for every node that supports
   them. This is rather tedious to fill in for generated nodes, so this helper
   adds these attributes where they are not already set, by setting them to
   the values of the parent node. It works recursively starting at *node*.

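A minimal sketch of the situation :func:`fix_missing_locations` addresses: a
tree built by hand, with no locations, is completed and then compiled. The
expression ``2 + 3`` and the ``"<generated>"`` file name are arbitrary:

```python
import ast

# Build "2 + 3" by hand; none of the nodes carry location information.
tree = ast.Expression(
    body=ast.BinOp(
        left=ast.Constant(value=2),
        op=ast.Add(),
        right=ast.Constant(value=3),
    )
)
had_location = hasattr(tree.body, "lineno")

# Fill in lineno/col_offset everywhere, then compile and evaluate.
ast.fix_missing_locations(tree)
result = eval(compile(tree, "<generated>", "eval"))
```
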
.. function:: increment_lineno(node, n=1)

   Increment the line number and end line number of each node in the tree
   starting at *node* by *n*. This is useful to "move code" to a different
   location in a file.

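For example, assuming a parsed snippet that should be reported as starting ten
lines further down a file:

```python
import ast

snippet = ast.parse("a = 1\nb = 2\n")
ast.increment_lineno(snippet, n=10)

# Both the start and end line numbers of every statement are shifted.
lines = [(stmt.lineno, stmt.end_lineno) for stmt in snippet.body]
print(lines)
```
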
.. function:: copy_location(new_node, old_node)

   Copy source location (:attr:`lineno`, :attr:`col_offset`, :attr:`end_lineno`,
   and :attr:`end_col_offset`) from *old_node* to *new_node* if possible,
   and return *new_node*.

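A small sketch of :func:`copy_location`, where a replacement node inherits the
position of the node it stands in for (the replacement value ``42`` is
arbitrary):

```python
import ast

old_node = ast.parse("x + 1", mode="eval").body
new_node = ast.copy_location(ast.Constant(value=42), old_node)

location = (
    new_node.lineno,
    new_node.col_offset,
    new_node.end_lineno,
    new_node.end_col_offset,
)
print(location)
```
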
.. function:: iter_fields(node)

   Yield a tuple of ``(fieldname, value)`` for each field in ``node._fields``
   that is present on *node*.

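As an illustration, the fields of a parsed assignment can be listed like this
(the exact set of fields depends on the grammar of the running Python version):

```python
import ast

assign = ast.parse("x = 1").body[0]

# Print each field name with the type of the value stored in it.
for name, value in ast.iter_fields(assign):
    print(name, type(value).__name__)

field_names = [name for name, _ in ast.iter_fields(assign)]
```
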
.. function:: iter_child_nodes(node)

   Yield all direct child nodes of *node*, that is, all fields that are nodes
   and all items of fields that are lists of nodes.

.. function:: walk(node)

   Recursively yield all descendant nodes in the tree starting at *node*
   (including *node* itself), in no specified order. This is useful if you only
   want to modify nodes in place and don't care about the context.

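The difference between :func:`iter_child_nodes` (direct children only) and
:func:`walk` (the whole subtree) can be sketched as follows; the assignment
being analysed is made up for the example:

```python
import ast

tree = ast.parse("result = alpha + beta * alpha")
assign = tree.body[0]

# Direct children: the assignment target and the value expression.
children = [type(child).__name__ for child in ast.iter_child_nodes(assign)]

# Whole subtree: collect every identifier, however deeply nested.
# A set is used because walk() does not promise any particular order.
names = sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})
```
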
.. class:: NodeVisitor()

   A node visitor base class that walks the abstract syntax tree and calls a
   visitor function for every node found. This function may return a value
   which is forwarded by the :meth:`visit` method.

   This class is meant to be subclassed, with the subclass adding visitor
   methods.

   .. method:: visit(node)

      Visit a node. The default implementation calls the method called
      :samp:`self.visit_{classname}` where *classname* is the name of the node
      class, or :meth:`generic_visit` if that method doesn't exist.

   .. method:: generic_visit(node)

      This visitor calls :meth:`visit` on all children of the node.

      Note that child nodes of nodes that have a custom visitor method won't be
      visited unless the visitor calls :meth:`generic_visit` or visits them
      itself.

   Don't use the :class:`NodeVisitor` if you want to apply changes to nodes
   during traversal. For this a special visitor exists
   (:class:`NodeTransformer`) that allows modifications.

   .. deprecated:: 3.8

      Methods :meth:`visit_Num`, :meth:`visit_Str`, :meth:`visit_Bytes`,
      :meth:`visit_NameConstant` and :meth:`visit_Ellipsis` are deprecated
      now and will not be called in future Python versions. Add the
      :meth:`visit_Constant` method to handle all constant nodes.

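A possible :class:`NodeVisitor` subclass is sketched below. ``CallCounter`` is
a hypothetical name chosen for this example; it counts calls to plainly named
functions and relies on :meth:`generic_visit` to reach nested calls:

```python
import ast
from collections import Counter

class CallCounter(ast.NodeVisitor):
    """Hypothetical visitor that counts calls to plainly named functions."""

    def __init__(self):
        self.counts = Counter()

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            self.counts[node.func.id] += 1
        # Without this, calls nested in the arguments would not be visited.
        self.generic_visit(node)

counter = CallCounter()
counter.visit(ast.parse("print(len(items))\nprint(max(items))\n"))
print(dict(counter.counts))
```
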
.. class:: NodeTransformer()

   A :class:`NodeVisitor` subclass that walks the abstract syntax tree and
   allows modification of nodes.

   The :class:`NodeTransformer` will walk the AST and use the return value of
   the visitor methods to replace or remove the old node. If the return value
   of the visitor method is ``None``, the node will be removed from its
   location, otherwise it is replaced with the return value. The return value
   may be the original node in which case no replacement takes place.

   Here is an example transformer that rewrites all occurrences of name lookups
   (``foo``) to ``data['foo']``::

      import ast
      from ast import Constant, Index, Load, Name, NodeTransformer, Subscript

      class RewriteName(NodeTransformer):

          def visit_Name(self, node):
              # Replace the name lookup with a subscript of ``data``.
              return Subscript(
                  value=Name(id='data', ctx=Load()),
                  slice=Index(value=Constant(value=node.id)),
                  ctx=node.ctx
              )

   Keep in mind that if the node you're operating on has child nodes you must
   either transform the child nodes yourself or call the :meth:`generic_visit`
   method for the node first.

   For nodes that were part of a collection of statements (that applies to all
   statement nodes), the visitor may also return a list of nodes rather than
   just a single node.

   If :class:`NodeTransformer` introduces new nodes (that weren't part of
   the original tree) without giving them location information (such as
   :attr:`lineno`), :func:`fix_missing_locations` should be called with
   the new sub-tree to recalculate the location information::

      tree = ast.parse('foo', mode='eval')
      new_tree = ast.fix_missing_locations(RewriteName().visit(tree))

   Usually you use the transformer like this::

      node = YourTransformer().visit(node)

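The replace-or-remove behaviour can be sketched with a different, deliberately
simple transformer. ``DoubleInts`` is a hypothetical class for this example: it
doubles integer constants and drops ``assert`` statements by returning
``None``:

```python
import ast

class DoubleInts(ast.NodeTransformer):
    """Hypothetical transformer: double ints and remove assert statements."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            # Replace the node, keeping the location of the original.
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node  # returning the original node means "no change"

    def visit_Assert(self, node):
        return None  # returning None removes the statement

source = "assert False, 'never runs'\nresult = 1 + 20\n"
new_tree = ast.fix_missing_locations(DoubleInts().visit(ast.parse(source)))

namespace = {}
exec(compile(new_tree, "<ast>", "exec"), namespace)
print(namespace["result"])
```
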
.. function:: dump(node, annotate_fields=True, include_attributes=False)

   Return a formatted dump of the tree in *node*. This is mainly useful for
   debugging purposes. If *annotate_fields* is true (the default),
   the returned string will show the names and the values for fields.
   If *annotate_fields* is false, the result string will be more compact by
   omitting unambiguous field names. Attributes such as line
   numbers and column offsets are not dumped by default. If this is wanted,
   *include_attributes* can be set to true.

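The effect of the two flags of :func:`dump` can be compared with a sketch like
the following. The exact output text varies between Python versions, so no
expected output is shown here:

```python
import ast

tree = ast.parse("x + 1", mode="eval")

annotated = ast.dump(tree)                            # field names shown
compact = ast.dump(tree, annotate_fields=False)       # field names omitted
located = ast.dump(tree, include_attributes=True)     # adds lineno and so on

for text in (annotated, compact, located):
    print(text)
```
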
.. seealso::

   `Green Tree Snakes <https://greentreesnakes.readthedocs.io/>`_, an external
   documentation resource, has good details on working with Python ASTs.