% XXX Label can't be _ast?
% XXX Where should this section/chapter go?
\chapter{Abstract Syntax Trees\label{ast}}

\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}

\versionadded{2.5}

The \code{_ast} module helps Python applications process
trees of the Python abstract syntax grammar. The Python compiler
currently provides read-only access to such trees: applications
can create a tree for a given piece of Python source code, but
generating byte code from a (potentially modified) tree is not
supported. The abstract syntax itself might change with
each Python release; this module helps to find out programmatically
what the current grammar looks like.

An abstract syntax tree can be generated by passing \code{_ast.PyCF_ONLY_AST}
as a flag to the built-in \function{compile} function. The result is a tree
of objects whose classes all inherit from \code{_ast.AST}.

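As a minimal sketch of the call described above, the following compiles
a one-line assignment into a tree. The source string and the
\code{<example>} file name are placeholders chosen for illustration;
only \function{compile}, \code{_ast.PyCF_ONLY_AST} and \code{_ast.AST}
are taken from the text.

```python
import _ast

# Illustrative source text; any valid Python source would do.
source = "x = 1 + 2\n"

# Passing PyCF_ONLY_AST asks compile() for the tree instead of a code object.
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

# The root node, like every node in the tree, is an instance of _ast.AST.
is_ast_node = isinstance(tree, _ast.AST)
print(type(tree).__name__, is_ast_node)
```
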
The actual classes are derived from the \code{Parser/Python.asdl} file,
which is reproduced below. There is one class defined for each left-hand
side symbol in the abstract grammar (for example, \code{_ast.stmt} or \code{_ast.expr}).
In addition, there is one class defined for each constructor on the
right-hand side; these classes inherit from the classes for the left-hand
side trees. For example, \code{_ast.BinOp} inherits from \code{_ast.expr}.
For production rules with alternatives (also known as ``sums''), the
left-hand side class is abstract: only instances of specific constructor
nodes are ever created.

Each concrete class has an attribute \code{_fields} which gives the
names of all child nodes.

Each instance of a concrete class has one attribute for each child node,
of the type defined in the grammar. For example, \code{_ast.BinOp}
instances have an attribute \code{left} of type \code{_ast.expr}.
Instances of \code{_ast.expr} and \code{_ast.stmt} subclasses also
have \code{lineno} and \code{col_offset} attributes. The \code{lineno}
is the line number of the source text (1-indexed, so the first line is
line 1), and the \code{col_offset} is the UTF-8 byte offset of the first
token that generated the node. The UTF-8 offset is recorded because the
parser uses UTF-8 internally.

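A short sketch of reading these attributes follows. The assignment
statement and the names \code{total}, \code{price} and \code{quantity}
are made up for the example; the source is pure ASCII, so byte offsets
and character offsets coincide here.

```python
import _ast

source = "total = price * quantity\n"
tree = compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

# The first statement is the assignment; its value is the BinOp node.
assign = tree.body[0]
binop = assign.value

# The child attribute holds a node of the type given in the grammar.
left_is_expr = isinstance(binop.left, _ast.expr)

# The first token of the BinOp is "price", on line 1, 8 bytes into the line.
position = (binop.lineno, binop.col_offset)
print(left_is_expr, position)
```
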
If an attribute is marked as optional in the grammar (using a
question mark), its value might be \code{None}. If an attribute
can have zero or more values (marked with an asterisk), the
values are represented as a Python list.

\section{Abstract Grammar}

The module defines a string constant \code{__version__} which
is the decimal Subversion revision number of the file shown below.

The abstract grammar is currently defined as follows:

\verbatiminput{../../Parser/Python.asdl}