Initial revision
parent
39789030bd
commit
46f3e00407
|
@ -0,0 +1,81 @@
|
|||
\chapter{Introduction}
|
||||
|
||||
This reference manual describes the Python programming language.
|
||||
It is not intended as a tutorial.
|
||||
|
||||
While I am trying to be as precise as possible, I chose to use English
rather than formal specifications for everything except syntax and
lexical analysis.  This should make the document more understandable
to the average reader, but it leaves room for ambiguities.
Consequently, if you were coming from Mars and tried to re-implement
Python from this document alone, you might have to guess things, and in
fact you would probably end up implementing quite a different language.
On the other hand, if you are using
Python and wonder what the precise rules about a particular area of
the language are, you should definitely be able to find them here.
|
||||
|
||||
It is dangerous to add too many implementation details to a language
|
||||
reference document --- the implementation may change, and other
|
||||
implementations of the same language may work differently. On the
|
||||
other hand, there is currently only one Python implementation, and
|
||||
its particular quirks are sometimes worth mentioning, especially
|
||||
where the implementation imposes additional limitations. Therefore,
|
||||
you'll find short ``implementation notes'' sprinkled throughout the
|
||||
text.
|
||||
|
||||
Every Python implementation comes with a number of built-in and
|
||||
standard modules. These are not documented here, but in the separate
|
||||
{\em Python Library Reference} document. A few built-in modules are
|
||||
mentioned when they interact in a significant way with the language
|
||||
definition.
|
||||
|
||||
\section{Notation}
|
||||
|
||||
The descriptions of lexical analysis and syntax use a modified BNF
|
||||
grammar notation. This uses the following style of definition:
|
||||
\index{BNF}
|
||||
\index{grammar}
|
||||
\index{syntax}
|
||||
\index{notation}
|
||||
|
||||
\begin{verbatim}
|
||||
name: lc_letter (lc_letter | "_")*
|
||||
lc_letter: "a"..."z"
|
||||
\end{verbatim}
|
||||
|
||||
The first line says that a \verb\name\ is an \verb\lc_letter\ followed by
|
||||
a sequence of zero or more \verb\lc_letter\s and underscores. An
|
||||
\verb\lc_letter\ in turn is any of the single characters `a' through `z'.
|
||||
(This rule is actually adhered to for the names defined in lexical and
|
||||
grammar rules in this document.)
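
As an informal illustration (not part of the notation itself), the two
example rules can be transcribed into a regular expression.  The sketch
below assumes a present-day Python 3 interpreter and its \verb\re\
module; the helper \verb\is_name\ is a name invented for this example.

```python
import re

# name:       lc_letter (lc_letter | "_")*
# lc_letter:  "a"..."z"
NAME = re.compile(r"[a-z][a-z_]*")

def is_name(text):
    """Return True if the whole string matches the example rule for `name`."""
    return NAME.fullmatch(text) is not None

print(is_name("lc_letter"))   # True
print(is_name("_private"))    # False: must start with a lowercase letter
print(is_name("Name"))        # False: uppercase letters are not allowed
```

The star in the rule becomes the star in the regular expression, and the
range \verb\"a"..."z"\ becomes a character class.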
|
||||
|
||||
Each rule begins with a name (which is the name defined by the rule)
|
||||
and a colon. A vertical bar (\verb\|\) is used to separate
|
||||
alternatives; it is the least binding operator in this notation. A
|
||||
star (\verb\*\) means zero or more repetitions of the preceding item;
|
||||
likewise, a plus (\verb\+\) means one or more repetitions, and a
|
||||
phrase enclosed in square brackets (\verb\[ ]\) means zero or one
|
||||
occurrences (in other words, the enclosed phrase is optional). The
|
||||
\verb\*\ and \verb\+\ operators bind as tightly as possible;
|
||||
parentheses are used for grouping. Literal strings are enclosed in
|
||||
double quotes. White space is only meaningful to separate tokens.
|
||||
Rules are normally contained on a single line; rules with many
|
||||
alternatives may be formatted alternatively with each line after the
|
||||
first beginning with a vertical bar.
|
||||
|
||||
In lexical definitions (such as the example above), two more conventions
are used: two literal characters separated by three dots mean a choice
of any single character in the given (inclusive) range of ASCII
characters.  A phrase between angle brackets (\verb\<...>\) gives an
informal description of the symbol defined; e.g., this could be used
to describe the notion of `control character' if needed.
|
||||
\index{lexical definitions}
|
||||
\index{ASCII}
|
||||
|
||||
Even though the notation used is almost the same, there is a big
|
||||
difference between the meaning of lexical and syntactic definitions:
|
||||
a lexical definition operates on the individual characters of the
|
||||
input source, while a syntax definition operates on the stream of
|
||||
tokens generated by the lexical analysis. All uses of BNF in the next
|
||||
chapter (``Lexical Analysis'') are lexical definitions; uses in
|
||||
subsequent chapters are syntactic definitions.
|
|
@ -0,0 +1,349 @@
|
|||
\chapter{Lexical analysis}
|
||||
|
||||
A Python program is read by a {\em parser}. Input to the parser is a
|
||||
stream of {\em tokens}, generated by the {\em lexical analyzer}. This
|
||||
chapter describes how the lexical analyzer breaks a file into tokens.
|
||||
\index{lexical analysis}
|
||||
\index{parser}
|
||||
\index{token}
|
||||
|
||||
\section{Line structure}
|
||||
|
||||
A Python program is divided into a number of logical lines.  The end of
|
||||
a logical line is represented by the token NEWLINE. Statements cannot
|
||||
cross logical line boundaries except where NEWLINE is allowed by the
|
||||
syntax (e.g. between statements in compound statements).
|
||||
\index{line structure}
|
||||
\index{logical line}
|
||||
\index{NEWLINE token}
|
||||
|
||||
\subsection{Comments}
|
||||
|
||||
A comment starts with a hash character (\verb\#\) that is not part of
|
||||
a string literal, and ends at the end of the physical line. A comment
|
||||
always signifies the end of the logical line. Comments are ignored by
|
||||
the syntax.
|
||||
\index{comment}
|
||||
\index{logical line}
|
||||
\index{physical line}
|
||||
\index{hash character}
|
||||
|
||||
\subsection{Line joining}
|
||||
|
||||
Two or more physical lines may be joined into logical lines using
|
||||
backslash characters (\verb/\/), as follows: when a physical line ends
|
||||
in a backslash that is not part of a string literal or comment, it is
|
||||
joined with the following line, forming a single logical line, deleting the
|
||||
backslash and the following end-of-line character. For example:
|
||||
\index{physical line}
|
||||
\index{line joining}
|
||||
\index{backslash character}
|
||||
%
|
||||
\begin{verbatim}
|
||||
month_names = ['Januari', 'Februari', 'Maart',     \
               'April',   'Mei',      'Juni',      \
               'Juli',    'Augustus', 'September', \
               'Oktober', 'November', 'December']
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Blank lines}
|
||||
|
||||
A logical line that contains only spaces, tabs, and possibly a
|
||||
comment, is ignored (i.e., no NEWLINE token is generated), except that
|
||||
during interactive input of statements, an entirely blank logical line
|
||||
terminates a multi-line statement.
|
||||
\index{blank line}
|
||||
|
||||
\subsection{Indentation}
|
||||
|
||||
Leading whitespace (spaces and tabs) at the beginning of a logical
|
||||
line is used to compute the indentation level of the line, which in
|
||||
turn is used to determine the grouping of statements.
|
||||
\index{indentation}
|
||||
\index{whitespace}
|
||||
\index{leading whitespace}
|
||||
\index{space}
|
||||
\index{tab}
|
||||
\index{grouping}
|
||||
\index{statement grouping}
|
||||
|
||||
First, tabs are replaced (from left to right) by one to eight spaces
such that the total number of characters up to and including the
replacement is a multiple of eight (this is intended to be the same
rule as used by {\UNIX}).  The
|
||||
total number of spaces preceding the first non-blank character then
|
||||
determines the line's indentation. Indentation cannot be split over
|
||||
multiple physical lines using backslashes.
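
The tab rule can be sketched as a small function.  This is an
illustration written for a present-day Python 3 interpreter, not the
code of the actual lexical analyzer; the name \verb\indentation\ and
its \verb\tabsize\ parameter are assumptions of this example.

```python
def indentation(line, tabsize=8):
    """Return the indentation of a line: a leading tab advances to the
    next multiple of `tabsize`, a leading space counts as one column."""
    column = 0
    for ch in line:
        if ch == "\t":
            column = (column // tabsize + 1) * tabsize
        elif ch == " ":
            column += 1
        else:
            break                   # first non-blank character
    return column

print(indentation("    x = 1"))     # 4
print(indentation("\tx = 1"))       # 8
print(indentation("   \tx = 1"))    # 8: the tab fills columns 3 through 7
print(indentation("\t  x = 1"))     # 10
```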
|
||||
|
||||
The indentation levels of consecutive lines are used to generate
|
||||
INDENT and DEDENT tokens, using a stack, as follows.
|
||||
\index{INDENT token}
|
||||
\index{DEDENT token}
|
||||
|
||||
Before the first line of the file is read, a single zero is pushed on
|
||||
the stack; this will never be popped off again. The numbers pushed on
|
||||
the stack will always be strictly increasing from bottom to top. At
|
||||
the beginning of each logical line, the line's indentation level is
|
||||
compared to the top of the stack. If it is equal, nothing happens.
|
||||
If it is larger, it is pushed on the stack, and one INDENT token is
|
||||
generated. If it is smaller, it {\em must} be one of the numbers
|
||||
occurring on the stack; all numbers on the stack that are larger are
|
||||
popped off, and for each number popped off a DEDENT token is
|
||||
generated. At the end of the file, a DEDENT token is generated for
|
||||
each number remaining on the stack that is larger than zero.
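
The stack algorithm just described can be sketched as follows.  The
function \verb\indent_tokens\ is hypothetical: it takes one indentation
level per logical line and uses the placeholder \verb\LINE\ for the
tokens of the line itself.  It is written for a present-day Python 3
interpreter and only illustrates the rules above.

```python
def indent_tokens(levels):
    """Turn per-line indentation levels into INDENT/DEDENT tokens.

    "LINE" stands in for the ordinary tokens of each logical line.
    """
    stack = [0]                      # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append("INDENT")
        elif level < stack[-1]:
            if level not in stack:   # must match an enclosing level
                raise ValueError("inconsistent dedent to column %d" % level)
            while stack[-1] > level:
                stack.pop()
                tokens.append("DEDENT")
        tokens.append("LINE")
    # End of file: one DEDENT for every number still larger than zero.
    while stack[-1] > 0:
        stack.pop()
        tokens.append("DEDENT")
    return tokens

print(indent_tokens([0, 4, 4, 8, 0]))
# ['LINE', 'INDENT', 'LINE', 'LINE', 'INDENT', 'LINE', 'DEDENT', 'DEDENT', 'LINE']
```

Because a larger level is always pushed and smaller ones are popped
first, the numbers on the stack stay strictly increasing from bottom to
top.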
|
||||
|
||||
Here is an example of a correctly (though confusingly) indented piece
|
||||
of Python code:
|
||||
|
||||
\begin{verbatim}
|
||||
def perm(l):
        # Compute the list of all permutations of l

    if len(l) <= 1:
                  return [l]
    r = []
    for i in range(len(l)):
             s = l[:i] + l[i+1:]
             p = perm(s)
             for x in p:
              r.append(l[i:i+1] + x)
    return r
|
||||
\end{verbatim}
|
||||
|
||||
The following example shows various indentation errors:
|
||||
|
||||
\begin{verbatim}
|
||||
    def perm(l):                        # error: first line indented
    for i in range(len(l)):             # error: not indented
        s = l[:i] + l[i+1:]
            p = perm(l[:i] + l[i+1:])   # error: unexpected indent
            for x in p:
                    r.append(l[i:i+1] + x)
                return r                # error: inconsistent dedent
|
||||
\end{verbatim}
|
||||
|
||||
(Actually, the first three errors are detected by the parser; only the
|
||||
last error is found by the lexical analyzer --- the indentation of
|
||||
\verb\return r\ does not match a level popped off the stack.)
|
||||
|
||||
\section{Other tokens}
|
||||
|
||||
Besides NEWLINE, INDENT and DEDENT, the following categories of tokens
|
||||
exist: identifiers, keywords, literals, operators, and delimiters.
|
||||
Spaces and tabs are not tokens, but serve to delimit tokens. Where
|
||||
ambiguity exists, a token comprises the longest possible string that
|
||||
forms a legal token, when read from left to right.
|
||||
|
||||
\section{Identifiers}
|
||||
|
||||
Identifiers (also referred to as names) are described by the following
|
||||
lexical definitions:
|
||||
\index{identifier}
|
||||
\index{name}
|
||||
|
||||
\begin{verbatim}
|
||||
identifier: (letter|"_") (letter|digit|"_")*
|
||||
letter: lowercase | uppercase
|
||||
lowercase: "a"..."z"
|
||||
uppercase: "A"..."Z"
|
||||
digit: "0"..."9"
|
||||
\end{verbatim}
|
||||
|
||||
Identifiers are unlimited in length. Case is significant.
|
||||
|
||||
\subsection{Keywords}
|
||||
|
||||
The following identifiers are used as reserved words, or {\em
|
||||
keywords} of the language, and cannot be used as ordinary
|
||||
identifiers. They must be spelled exactly as written here:
|
||||
\index{keyword}
|
||||
\index{reserved word}
|
||||
|
||||
\begin{verbatim}
|
||||
and       del       for       in        print
break     elif      from      is        raise
class     else      global    not       return
continue  except    if        or        try
def       finally   import    pass      while
|
||||
\end{verbatim}
|
||||
|
||||
% # This Python program sorts and formats the above table
% import string
% l = []
% try:
%     while 1:
%         l = l + string.split(raw_input())
% except EOFError:
%     pass
% l.sort()
% for i in range((len(l)+4)/5):
%     for j in range(i, len(l), 5):
%         print string.ljust(l[j], 10),
%     print
|
||||
|
||||
\section{Literals} \label{literals}
|
||||
|
||||
Literals are notations for constant values of some built-in types.
|
||||
\index{literal}
|
||||
\index{constant}
|
||||
|
||||
\subsection{String literals}
|
||||
|
||||
String literals are described by the following lexical definitions:
|
||||
\index{string literal}
|
||||
|
||||
\begin{verbatim}
|
||||
stringliteral: "'" stringitem* "'"
|
||||
stringitem: stringchar | escapeseq
|
||||
stringchar: <any ASCII character except newline or "\" or "'">
|
||||
escapeseq: "\" <any ASCII character except newline>
|
||||
\end{verbatim}
|
||||
\index{ASCII}
|
||||
|
||||
String literals cannot span physical line boundaries. Escape
|
||||
sequences in strings are actually interpreted according to rules
|
||||
similar to those used by Standard C. The recognized escape sequences
|
||||
are:
|
||||
\index{physical line}
|
||||
\index{escape sequence}
|
||||
\index{Standard C}
|
||||
\index{C}
|
||||
|
||||
\begin{center}
|
||||
\begin{tabular}{|l|l|}
|
||||
\hline
|
||||
\verb/\\/ & Backslash (\verb/\/) \\
|
||||
\verb/\'/ & Single quote (\verb/'/) \\
|
||||
\verb/\a/ & ASCII Bell (BEL) \\
|
||||
\verb/\b/ & ASCII Backspace (BS) \\
|
||||
%\verb/\E/ & ASCII Escape (ESC) \\
|
||||
\verb/\f/ & ASCII Formfeed (FF) \\
|
||||
\verb/\n/ & ASCII Linefeed (LF) \\
|
||||
\verb/\r/ & ASCII Carriage Return (CR) \\
|
||||
\verb/\t/ & ASCII Horizontal Tab (TAB) \\
|
||||
\verb/\v/ & ASCII Vertical Tab (VT) \\
|
||||
\verb/\/{\em ooo} & ASCII character with octal value {\em ooo} \\
|
||||
\verb/\x/{\em xx...} & ASCII character with hex value {\em xx...} \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
\index{ASCII}
|
||||
|
||||
In strict compatibility with Standard C, up to three octal digits are
accepted, but an unlimited number of hex digits is taken to be part of
the hex escape (and then the lower 8 bits of the resulting hex number
are used in all current implementations).
|
||||
|
||||
All unrecognized escape sequences are left in the string unchanged,
|
||||
i.e., {\em the backslash is left in the string.} (This behavior is
|
||||
useful when debugging: if an escape sequence is mistyped, the
|
||||
resulting output is more easily recognized as broken. It also helps a
|
||||
great deal for string literals used as regular expressions or
|
||||
otherwise passed to other modules that do their own escape handling.)
|
||||
\index{unrecognized escape sequence}
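
The escape rules above can be sketched as a small decoder.  This is an
illustration for a present-day Python 3 interpreter, not the
implementation's code; \verb\decode_escapes\ is a hypothetical helper
that receives the text between the quotes.  Masking octal values to
8 bits is an assumption of this sketch; the text above only states it
for hex escapes.

```python
SIMPLE = {"\\": "\\", "'": "'", "a": "\a", "b": "\b", "f": "\f",
          "n": "\n", "r": "\r", "t": "\t", "v": "\v"}
OCTAL = "01234567"
HEX = "0123456789abcdefABCDEF"

def decode_escapes(body):
    """Decode the text between the quotes of a string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\" or i + 1 == len(body):
            out.append(ch)
            i += 1
            continue
        nxt = body[i + 1]
        if nxt in SIMPLE:
            out.append(SIMPLE[nxt])
            i += 2
        elif nxt in OCTAL:
            j = i + 1
            while j < len(body) and j < i + 4 and body[j] in OCTAL:
                j += 1                      # at most three octal digits
            # Assumption of this sketch: octal values are masked to 8 bits.
            out.append(chr(int(body[i + 1:j], 8) & 0xFF))
            i = j
        elif nxt == "x" and i + 2 < len(body) and body[i + 2] in HEX:
            j = i + 2
            while j < len(body) and body[j] in HEX:
                j += 1                      # unlimited number of hex digits
            out.append(chr(int(body[i + 2:j], 16) & 0xFF))   # lower 8 bits
            i = j
        else:
            out.append("\\" + nxt)          # unrecognized: keep the backslash
            i += 2
    return "".join(out)

print(repr(decode_escapes(r"a\tb")))    # 'a\tb'
print(repr(decode_escapes(r"\x141")))   # 'A'  (0x141 & 0xFF is 0x41)
print(repr(decode_escapes(r"\d+")))     # '\\d+'  (backslash kept)
```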
|
||||
|
||||
\subsection{Numeric literals}
|
||||
|
||||
There are three types of numeric literals: plain integers, long
|
||||
integers, and floating point numbers.
|
||||
\index{number}
|
||||
\index{numeric literal}
|
||||
\index{integer literal}
|
||||
\index{plain integer literal}
|
||||
\index{long integer literal}
|
||||
\index{floating point literal}
|
||||
\index{hexadecimal literal}
|
||||
\index{octal literal}
|
||||
\index{decimal literal}
|
||||
|
||||
Integer and long integer literals are described by the following
|
||||
lexical definitions:
|
||||
|
||||
\begin{verbatim}
|
||||
longinteger: integer ("l"|"L")
|
||||
integer: decimalinteger | octinteger | hexinteger
|
||||
decimalinteger: nonzerodigit digit* | "0"
|
||||
octinteger: "0" octdigit+
|
||||
hexinteger: "0" ("x"|"X") hexdigit+
|
||||
|
||||
nonzerodigit: "1"..."9"
|
||||
octdigit: "0"..."7"
|
||||
hexdigit: digit|"a"..."f"|"A"..."F"
|
||||
\end{verbatim}
|
||||
|
||||
Although both lower case `l' and upper case `L' are allowed as a suffix
|
||||
for long integers, it is strongly recommended to always use `L', since
|
||||
the letter `l' looks too much like the digit `1'.
|
||||
|
||||
Plain integer decimal literals must be at most $2^{31} - 1$ (i.e., the
|
||||
largest positive integer, assuming 32-bit arithmetic). Plain octal and
|
||||
hexadecimal literals may be as large as $2^{32} - 1$, but values
|
||||
larger than $2^{31} - 1$ are converted to a negative value by
|
||||
subtracting $2^{32}$. There is no limit for long integer literals.
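
As a worked example of these limits, the sketch below computes the
value of a plain integer literal under 32-bit arithmetic.  It is written
for a present-day Python 3 interpreter (whose own integers are
unlimited); the function name \verb\plain_integer_value\ and the choice
of \verb\OverflowError\ for a literal that is too large are assumptions
of this example.

```python
def plain_integer_value(literal):
    """Value of a plain integer literal, assuming 32-bit arithmetic."""
    if literal[:2] in ("0x", "0X"):
        value, limit = int(literal[2:], 16), 2**32 - 1
    elif literal[0] == "0" and len(literal) > 1:
        value, limit = int(literal[1:], 8), 2**32 - 1
    else:
        value, limit = int(literal, 10), 2**31 - 1
    if value > limit:
        raise OverflowError("literal too large: " + literal)
    if value > 2**31 - 1:
        value -= 2**32          # only reachable for octal and hex literals
    return value

print(plain_integer_value("2147483647"))   # 2147483647
print(plain_integer_value("0177"))         # 127
print(plain_integer_value("0x80000000"))   # -2147483648
print(plain_integer_value("0xFFFFFFFF"))   # -1
```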
|
||||
|
||||
Some examples of plain and long integer literals:
|
||||
|
||||
\begin{verbatim}
|
||||
7 2147483647 0177 0x80000000
|
||||
3L 79228162514264337593543950336L 0377L 0x100000000L
|
||||
\end{verbatim}
|
||||
|
||||
Floating point literals are described by the following lexical
|
||||
definitions:
|
||||
|
||||
\begin{verbatim}
|
||||
floatnumber: pointfloat | exponentfloat
|
||||
pointfloat: [intpart] fraction | intpart "."
|
||||
exponentfloat: (intpart | pointfloat) exponent
|
||||
intpart: digit+
|
||||
fraction: "." digit+
|
||||
exponent: ("e"|"E") ["+"|"-"] digit+
|
||||
\end{verbatim}
|
||||
|
||||
The allowed range of floating point literals is
|
||||
implementation-dependent.
|
||||
|
||||
Some examples of floating point literals:
|
||||
|
||||
\begin{verbatim}
|
||||
3.14 10. .001 1e100 3.14e-10
|
||||
\end{verbatim}
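
The floating point rules can be transcribed into a regular expression
and checked against the examples.  This sketch assumes a present-day
Python 3 interpreter and its \verb\re\ module; the upper-case names are
invented here and mirror the rule names one for one.

```python
import re

INTPART = r"[0-9]+"
FRACTION = r"\.[0-9]+"
EXPONENT = r"[eE][+-]?[0-9]+"
# pointfloat: [intpart] fraction | intpart "."
POINTFLOAT = r"(?:(?:%s)?%s|%s\.)" % (INTPART, FRACTION, INTPART)
# exponentfloat: (intpart | pointfloat) exponent
EXPONENTFLOAT = r"(?:%s|%s)%s" % (POINTFLOAT, INTPART, EXPONENT)
# floatnumber: pointfloat | exponentfloat
FLOATNUMBER = re.compile(r"(?:%s|%s)" % (EXPONENTFLOAT, POINTFLOAT))

def is_float_literal(text):
    """Return True if the whole string is a floatnumber."""
    return FLOATNUMBER.fullmatch(text) is not None

for sample in ["3.14", "10.", ".001", "1e100", "3.14e-10", "7", "-1.0"]:
    print(sample, is_float_literal(sample))
# The first five print True; "7" is an integer and "-1.0" carries a sign,
# so the last two print False.
```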
|
||||
|
||||
Note that numeric literals do not include a sign; a phrase like
|
||||
\verb\-1\ is actually an expression composed of the operator
|
||||
\verb\-\ and the literal \verb\1\.
|
||||
|
||||
\section{Operators}
|
||||
|
||||
The following tokens are operators:
|
||||
\index{operators}
|
||||
|
||||
\begin{verbatim}
|
||||
+ - * / %
|
||||
<< >> & | ^ ~
|
||||
< == > <= <> != >=
|
||||
\end{verbatim}
|
||||
|
||||
The comparison operators \verb\<>\ and \verb\!=\ are alternate
|
||||
spellings of the same operator.
|
||||
|
||||
\section{Delimiters}
|
||||
|
||||
The following tokens serve as delimiters or otherwise have a special
|
||||
meaning:
|
||||
\index{delimiters}
|
||||
|
||||
\begin{verbatim}
|
||||
( ) [ ] { }
|
||||
; , : . ` =
|
||||
\end{verbatim}
|
||||
|
||||
The following printing ASCII characters are not used in Python. Their
|
||||
occurrence outside string literals and comments is an unconditional
|
||||
error:
|
||||
\index{ASCII}
|
||||
|
||||
\begin{verbatim}
|
||||
@ $ " ?
|
||||
\end{verbatim}
|
||||
|
||||
They may be used by future versions of the language though!
|
|
@ -0,0 +1,705 @@
|
|||
\chapter{Data model}
|
||||
|
||||
\section{Objects, values and types}
|
||||
|
||||
{\em Objects} are Python's abstraction for data. All data in a Python
|
||||
program is represented by objects or by relations between objects.
|
||||
(In a sense, and in conformance with Von Neumann's model of a
|
||||
``stored program computer'', code is also represented by objects.)
|
||||
\index{object}
|
||||
\index{data}
|
||||
|
||||
Every object has an identity, a type and a value. An object's {\em
|
||||
identity} never changes once it has been created; you may think of it
|
||||
as the object's address in memory. An object's {\em type} is also
|
||||
unchangeable. It determines the operations that an object supports
|
||||
(e.g. ``does it have a length?'') and also defines the possible
|
||||
values for objects of that type. The {\em value} of some objects can
|
||||
change. Objects whose value can change are said to be {\em mutable};
|
||||
objects whose value is unchangeable once they are created are called
|
||||
{\em immutable}. The type determines an object's (im)mutability.
|
||||
\index{identity of an object}
|
||||
\index{value of an object}
|
||||
\index{type of an object}
|
||||
\index{mutable object}
|
||||
\index{immutable object}
|
||||
|
||||
Objects are never explicitly destroyed; however, when they become
|
||||
unreachable they may be garbage-collected. An implementation is
|
||||
allowed to delay garbage collection or omit it altogether --- it is a
|
||||
matter of implementation quality how garbage collection is
|
||||
implemented, as long as no objects are collected that are still
|
||||
reachable. (Implementation note: the current implementation uses a
|
||||
reference-counting scheme which collects most objects as soon as they
|
||||
become unreachable, but never collects garbage containing circular
|
||||
references.)
|
||||
\index{garbage collection}
|
||||
\index{reference counting}
|
||||
\index{unreachable object}
|
||||
|
||||
Note that the use of the implementation's tracing or debugging
|
||||
facilities may keep objects alive that would normally be collectable.
|
||||
|
||||
Some objects contain references to ``external'' resources such as open
|
||||
files or windows. It is understood that these resources are freed
|
||||
when the object is garbage-collected, but since garbage collection is
|
||||
not guaranteed to happen, such objects also provide an explicit way to
|
||||
release the external resource, usually a \verb\close\ method.
|
||||
Programs are strongly recommended to always explicitly close such
|
||||
objects.
|
||||
|
||||
Some objects contain references to other objects; these are called
|
||||
{\em containers}. Examples of containers are tuples, lists and
|
||||
dictionaries. The references are part of a container's value. In
|
||||
most cases, when we talk about the value of a container, we imply the
|
||||
values, not the identities of the contained objects; however, when we
|
||||
talk about the (im)mutability of a container, only the identities of
|
||||
the immediately contained objects are implied. (So, if an immutable
|
||||
container contains a reference to a mutable object, its value changes
|
||||
if that mutable object is changed.)
|
||||
\index{container}
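
A short example may make the distinction concrete.  It is written for a
present-day Python 3 interpreter and uses a tuple (immutable) that
contains a list (mutable); the variable names are arbitrary.

```python
inner = []
t = (inner, 1)                  # an immutable container holding a mutable object
before = id(t[0])

inner.append("x")               # change the list, not the tuple

print(t)                        # (['x'], 1)
print(id(t[0]) == before)       # True: the tuple still refers to the same list

try:
    t[0] = []                   # replacing a contained object is not allowed
except TypeError as exc:
    print("TypeError:", exc)
```

The identities of the objects held by the tuple never change, yet the
value of the tuple as a whole changes when the list is modified.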
|
||||
|
||||
Types affect almost all aspects of objects' lives. Even the meaning
|
||||
of object identity is affected in some sense: for immutable types,
|
||||
operations that compute new values may actually return a reference to
|
||||
any existing object with the same type and value, while for mutable
|
||||
objects this is not allowed. E.g. after
|
||||
|
||||
\begin{verbatim}
|
||||
a = 1; b = 1; c = []; d = []
|
||||
\end{verbatim}
|
||||
|
||||
\verb\a\ and \verb\b\ may or may not refer to the same object with the
|
||||
value one, depending on the implementation, but \verb\c\ and \verb\d\
|
||||
are guaranteed to refer to two different, unique, newly created empty
|
||||
lists.
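
The guaranteed part of this statement can be checked directly.  The
sketch below assumes a present-day Python 3 interpreter, where the
\verb\is\ operator compares object identity.

```python
a = 1; b = 1; c = []; d = []

print(a == b)        # True
print(c == d)        # True: both lists are empty
print(c is d)        # False: two distinct list objects
# Whether `a is b` holds depends on the implementation.

c.append(0)
print(d)             # []: changing c does not affect d
```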
|
||||
|
||||
\section{The standard type hierarchy} \label{types}
|
||||
|
||||
Below is a list of the types that are built into Python. Extension
|
||||
modules written in C can define additional types. Future versions of
|
||||
Python may add types to the type hierarchy (e.g. rational or complex
|
||||
numbers, efficiently stored arrays of integers, etc.).
|
||||
\index{type}
|
||||
\indexii{data}{type}
|
||||
\indexii{type}{hierarchy}
|
||||
\indexii{extension}{module}
|
||||
\index{C}
|
||||
|
||||
Some of the type descriptions below contain a paragraph listing
|
||||
`special attributes'. These are attributes that provide access to the
|
||||
implementation and are not intended for general use. Their definition
|
||||
may change in the future. There are also some `generic' special
|
||||
attributes, not listed with the individual objects: \verb\__methods__\
|
||||
is a list of the method names of a built-in object, if it has any;
|
||||
\verb\__members__\ is a list of the data attribute names of a built-in
|
||||
object, if it has any.
|
||||
\index{attribute}
|
||||
\indexii{special}{attribute}
|
||||
\indexiii{generic}{special}{attribute}
|
||||
\ttindex{__methods__}
|
||||
\ttindex{__members__}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[None]
|
||||
This type has a single value. There is a single object with this value.
|
||||
This object is accessed through the built-in name \verb\None\.
|
||||
It is returned from functions that don't explicitly return an object.
|
||||
\ttindex{None}
|
||||
\obindex{None@{\tt None}}
|
||||
|
||||
\item[Numbers]
|
||||
These are created by numeric literals and returned as results by
|
||||
arithmetic operators and arithmetic built-in functions. Numeric
|
||||
objects are immutable; once created their value never changes. Python
|
||||
numbers are of course strongly related to mathematical numbers, but
|
||||
subject to the limitations of numerical representation in computers.
|
||||
\obindex{number}
|
||||
\obindex{numeric}
|
||||
|
||||
Python distinguishes between integers and floating point numbers:
|
||||
|
||||
\begin{description}
|
||||
\item[Integers]
|
||||
These represent elements from the mathematical set of whole numbers.
|
||||
\obindex{integer}
|
||||
|
||||
There are two types of integers:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Plain integers]
|
||||
These represent numbers in the range $-2^{31}$ through $2^{31}-1$.
|
||||
(The range may be larger on machines with a larger natural word
|
||||
size, but not smaller.)
|
||||
When the result of an operation falls outside this range, the
|
||||
exception \verb\OverflowError\ is raised.
|
||||
For the purpose of shift and mask operations, integers are assumed to
|
||||
have a binary, 2's complement notation using 32 or more bits, and
|
||||
hiding no bits from the user (i.e., all $2^{32}$ different bit
|
||||
patterns correspond to different values).
|
||||
\obindex{plain integer}
|
||||
|
||||
\item[Long integers]
|
||||
These represent numbers in an unlimited range, subject to available
|
||||
(virtual) memory only. For the purpose of shift and mask operations,
|
||||
a binary representation is assumed, and negative numbers are
|
||||
represented in a variant of 2's complement which gives the illusion of
|
||||
an infinite string of sign bits extending to the left.
|
||||
\obindex{long integer}
|
||||
|
||||
\end{description} % Integers
|
||||
|
||||
The rules for integer representation are intended to give the most
|
||||
meaningful interpretation of shift and mask operations involving
|
||||
negative integers and the least surprises when switching between the
|
||||
plain and long integer domains. For any operation except left shift,
|
||||
if it yields a result in the plain integer domain without causing
|
||||
overflow, it will yield the same result in the long integer domain or
|
||||
when using mixed operands.
|
||||
\indexii{integer}{representation}
|
||||
|
||||
\item[Floating point numbers]
|
||||
These represent machine-level double precision floating point numbers.
|
||||
You are at the mercy of the underlying machine architecture and
|
||||
C implementation for the accepted range and handling of overflow.
|
||||
\obindex{floating point}
|
||||
\indexii{floating point}{number}
|
||||
\index{C}
|
||||
|
||||
\end{description} % Numbers
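
The ``infinite string of sign bits'' described for long integers can be
observed in a present-day Python 3 interpreter, whose single integer
type is assumed here to behave like the long integers of this section
(there is no separate plain integer type to overflow).

```python
print(-1 >> 100)            # -1: sign bits keep shifting in from the left
print(-8 >> 1)              # -4
print(-1 & 0xFF)            # 255: the low eight bits of -1 are all ones
print(~0)                   # -1
print((-1 & 0xFFFFFFFF) == 2**32 - 1)   # True
print(1 << 40)              # 1099511627776: no overflow in the long domain
```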
|
||||
|
||||
\item[Sequences]
|
||||
These represent finite ordered sets indexed by natural numbers.
|
||||
The built-in function \verb\len()\ returns the number of elements
|
||||
of a sequence. When this number is $n$, the index set contains
|
||||
the numbers $0, 1, \ldots, n-1$. Element \verb\i\ of sequence
|
||||
\verb\a\ is selected by \verb\a[i]\.
|
||||
\obindex{sequence}
|
||||
\bifuncindex{len}
|
||||
\index{index operation}
|
||||
\index{item selection}
|
||||
\index{subscription}
|
||||
|
||||
Sequences also support slicing: \verb\a[i:j]\ selects all elements
|
||||
with index $k$ such that $i \leq k < j$.  When used as an expression,
|
||||
a slice is a sequence of the same type --- this implies that the
|
||||
index set is renumbered so that it starts at 0 again.
|
||||
\index{slicing}
|
||||
|
||||
Sequences are distinguished according to their mutability:
|
||||
|
||||
\begin{description}
|
||||
%
|
||||
\item[Immutable sequences]
|
||||
An object of an immutable sequence type cannot change once it is
|
||||
created. (If the object contains references to other objects,
|
||||
these other objects may be mutable and may be changed; however
|
||||
the collection of objects directly referenced by an immutable object
|
||||
cannot change.)
|
||||
\obindex{immutable sequence}
|
||||
\obindex{immutable}
|
||||
|
||||
The following types are immutable sequences:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Strings]
|
||||
The elements of a string are characters. There is no separate
|
||||
character type; a character is represented by a string of one element.
|
||||
Characters represent (at least) 8-bit bytes. The built-in
|
||||
functions \verb\chr()\ and \verb\ord()\ convert between characters
|
||||
and nonnegative integers representing the byte values.
|
||||
Bytes with the values 0-127 represent the corresponding ASCII values.
|
||||
The string data type is also used to represent arrays of bytes, e.g.
|
||||
to hold data read from a file.
|
||||
\obindex{string}
|
||||
\index{character}
|
||||
\index{byte}
|
||||
\index{ASCII}
|
||||
\bifuncindex{chr}
|
||||
\bifuncindex{ord}
|
||||
|
||||
(On systems whose native character set is not ASCII, strings may use
|
||||
EBCDIC in their internal representation, provided the functions
|
||||
\verb\chr()\ and \verb\ord()\ implement a mapping between ASCII and
|
||||
EBCDIC, and string comparison preserves the ASCII order.
|
||||
Or perhaps someone can propose a better rule?)
|
||||
\index{ASCII}
|
||||
\index{EBCDIC}
|
||||
\index{character set}
|
||||
\indexii{string}{comparison}
|
||||
\bifuncindex{chr}
|
||||
\bifuncindex{ord}
|
||||
|
||||
\item[Tuples]
|
||||
The elements of a tuple are arbitrary Python objects.
|
||||
Tuples of two or more elements are formed by comma-separated lists
|
||||
of expressions. A tuple of one element (a `singleton') can be formed
|
||||
by affixing a comma to an expression (an expression by itself does
|
||||
not create a tuple, since parentheses must be usable for grouping of
|
||||
expressions). An empty tuple can be formed by enclosing `nothing' in
|
||||
parentheses.
|
||||
\obindex{tuple}
|
||||
\indexii{singleton}{tuple}
|
||||
\indexii{empty}{tuple}
|
||||
|
||||
\end{description} % Immutable sequences
|
||||
|
||||
\item[Mutable sequences]
|
||||
Mutable sequences can be changed after they are created. The
|
||||
subscription and slicing notations can be used as the target of
|
||||
assignment and \verb\del\ (delete) statements.
|
||||
\obindex{mutable sequence}
|
||||
\obindex{mutable}
|
||||
\indexii{assignment}{statement}
|
||||
\index{delete}
|
||||
\stindex{del}
|
||||
\index{subscription}
|
||||
\index{slicing}
|
||||
|
||||
There is currently a single mutable sequence type:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Lists]
|
||||
The elements of a list are arbitrary Python objects. Lists are formed
|
||||
by placing a comma-separated list of expressions in square brackets.
|
||||
(Note that there are no special cases needed to form lists of length 0
|
||||
or 1.)
|
||||
\obindex{list}
|
||||
|
||||
\end{description} % Mutable sequences
|
||||
|
||||
\end{description} % Sequences
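
The sequence operations described above can be tried out as follows.
The example is written for a present-day Python 3 interpreter; the
variable names are arbitrary.

```python
a = ["p", "q", "r", "s", "t"]
i, j = 1, 4

s = a[i:j]                       # elements with index k, i <= k < j
print(s)                         # ['q', 'r', 's']
print(s[0])                      # q  (the slice is indexed from 0 again)
print(type(s) is type(a))        # True: a slice of a list is a list
print("python"[1:3])             # yt

singleton = 1,                   # trailing comma makes a one-element tuple
empty = ()
print(len(singleton), len(empty))    # 1 0

a[0] = "P"                       # subscription as an assignment target
del a[1:3]                       # slicing as a del target
print(a)                         # ['P', 's', 't']
```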
|
||||
|
||||
\item[Mapping types]
|
||||
These represent finite sets of objects indexed by arbitrary index sets.
|
||||
The subscript notation \verb\a[k]\ selects the element indexed
|
||||
by \verb\k\ from the mapping \verb\a\; this can be used in
|
||||
expressions and as the target of assignments or \verb\del\ statements.
|
||||
The built-in function \verb\len()\ returns the number of elements
|
||||
in a mapping.
|
||||
\bifuncindex{len}
|
||||
\index{subscription}
|
||||
\obindex{mapping}
|
||||
|
||||
There is currently a single mapping type:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Dictionaries]
|
||||
These represent finite sets of objects indexed by strings.
|
||||
Dictionaries are mutable; they are created by the \verb\{...}\
|
||||
notation (see section \ref{dict}). (Implementation note: the strings
|
||||
used for indexing must not contain null bytes.)
|
||||
\obindex{dictionary}
|
||||
\obindex{mutable}
|
||||
|
||||
\end{description} % Mapping types
|
||||
|
||||
\item[Callable types]
|
||||
These are the types to which the function call (invocation) operation,
|
||||
written as \verb\function(argument, argument, ...)\, can be applied:
|
||||
\indexii{function}{call}
|
||||
\index{invocation}
|
||||
\indexii{function}{argument}
|
||||
\obindex{callable}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[User-defined functions]
|
||||
A user-defined function object is created by a function definition
|
||||
(see section \ref{function}). It should be called with an argument
|
||||
list containing the same number of items as the function's formal
|
||||
parameter list.
|
||||
\indexii{user-defined}{function}
|
||||
\obindex{function}
|
||||
\obindex{user-defined function}
|
||||
|
||||
Special read-only attributes: \verb\func_code\ is the code object
|
||||
representing the compiled function body, and \verb\func_globals\ is (a
|
||||
reference to) the dictionary that holds the function's global
|
||||
variables --- it implements the global name space of the module in
|
||||
which the function was defined.
|
||||
\ttindex{func_code}
|
||||
\ttindex{func_globals}
|
||||
\indexii{global}{name space}
|
||||
|
||||
\item[User-defined methods]
|
||||
A user-defined method (a.k.a. {\em object closure}) is a pair of a
|
||||
class instance object and a user-defined function. It should be
|
||||
called with an argument list containing one item less than the number
|
||||
of items in the function's formal parameter list. When called, the
|
||||
class instance becomes the first argument, and the call arguments are
|
||||
shifted one to the right.
|
||||
\obindex{method}
|
||||
\obindex{user-defined method}
|
||||
\indexii{user-defined}{method}
|
||||
\index{object closure}
|
||||
|
||||
Special read-only attributes: \verb\im_self\ is the class instance
|
||||
object, \verb\im_func\ is the function object.
|
||||
\ttindex{im_func}
|
||||
\ttindex{im_self}
|
||||
|
||||
\item[Built-in functions]
|
||||
A built-in function object is a wrapper around a C function. Examples
|
||||
of built-in functions are \verb\len\ and \verb\math.sin\. There
|
||||
are no special attributes. The number and type of the arguments are
|
||||
determined by the C function.
|
||||
\obindex{built-in function}
|
||||
\obindex{function}
|
||||
\index{C}
|
||||
|
||||
\item[Built-in methods]
|
||||
This is really a different disguise of a built-in function, this time
|
||||
containing an object passed to the C function as an implicit extra
|
||||
argument. An example of a built-in method is \verb\list.append\ if
|
||||
\verb\list\ is a list object.
|
||||
\obindex{built-in method}
|
||||
\obindex{method}
|
||||
\indexii{built-in}{method}
|
||||
|
||||
\item[Classes]
|
||||
Class objects are described below. When a class object is called as a
|
||||
parameterless function, a new class instance (also described below) is
|
||||
created and returned. The class's initialization function is not
|
||||
called --- this is the responsibility of the caller. It is illegal to
|
||||
call a class object with one or more arguments.
|
||||
\obindex{class}
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{class object}{call}
|
||||
|
||||
\end{description}
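
The relation between a user-defined method, its instance and its
function can be sketched as follows. This assumes a current Python~3
interpreter, where the attributes called \verb\im_self\ and
\verb\im_func\ above are spelled \verb\__self__\ and \verb\__func__\;
the class \verb\Counter\ is invented for the example.

```python
class Counter:
    def add(self, amount):
        self.total = self.total + amount
        return self.total

c = Counter()                  # called without arguments
c.total = 0                    # the caller initializes the instance
m = c.add                      # a method object: the pair (c, Counter.add)

# Modern spellings of the im_self / im_func attributes (an assumption
# about the interpreter in use, not part of the text above).
same_instance = m.__self__ is c
same_function = m.__func__ is Counter.add

# Calling the method with one argument calls the function with two:
# the instance becomes the first argument.
r1 = m(5)
r2 = Counter.add(c, 5)
print(same_instance, same_function, r1, r2)
```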
|
||||
|
||||
\item[Modules]
|
||||
Modules are imported by the \verb\import\ statement (see section
|
||||
\ref{import}). A module object is a container for a module's name
|
||||
space, which is a dictionary (the same dictionary as referenced by the
|
||||
\verb\func_globals\ attribute of functions defined in the module).
|
||||
Module attribute references are translated to lookups in this
|
||||
dictionary. A module object does not contain the code object used to
|
||||
initialize the module (since it isn't needed once the initialization
|
||||
is done).
|
||||
\stindex{import}
|
||||
\obindex{module}
|
||||
|
||||
Attribute assignments update the module's name space dictionary.
|
||||
|
||||
Special read-only attributes: \verb\__dict__\ yields the module's name
|
||||
space as a dictionary object; \verb\__name__\ yields the module's name
|
||||
as a string object.
|
||||
\ttindex{__dict__}
|
||||
\ttindex{__name__}
|
||||
\indexii{module}{name space}
|
||||
|
||||
\item[Classes]
|
||||
Class objects are created by class definitions (see section
|
||||
\ref{class}). A class is a container for a dictionary containing the
|
||||
class's name space. Class attribute references are translated to
|
||||
lookups in this dictionary. When an attribute name is not found
|
||||
there, the attribute search continues in the base classes. The search
|
||||
is depth-first, left-to-right in the order of their occurrence in the
|
||||
base class list.
|
||||
\obindex{class}
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{class object}{call}
|
||||
\index{container}
|
||||
\index{dictionary}
|
||||
\indexii{class}{attribute}
|
||||
|
||||
Class attribute assignments update the class's dictionary, never the
|
||||
dictionary of a base class.
|
||||
\indexiii{class}{attribute}{assignment}
|
||||
|
||||
A class can be called as a parameterless function to yield a class
|
||||
instance (see above).
|
||||
\indexii{class object}{call}
|
||||
|
||||
Special read-only attributes: \verb\__dict__\ yields the dictionary
|
||||
containing the class's name space; \verb\__bases__\ yields a tuple
|
||||
(possibly empty or a singleton) containing the base classes, in the
|
||||
order of their occurrence in the base class list.
|
||||
\ttindex{__dict__}
|
||||
\ttindex{__bases__}
|
||||
|
||||
\item[Class instances]
|
||||
A class instance is created by calling a class object as a
|
||||
parameterless function. A class instance has a dictionary in which
|
||||
attribute references are searched. When an attribute is not found
|
||||
there, and the instance's class has an attribute by that name, and
|
||||
that class attribute is a user-defined function (and in no other
|
||||
cases), the instance attribute reference yields a user-defined method
|
||||
object (see above) constructed from the instance and the function.
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{class}{instance}
|
||||
\indexii{class instance}{attribute}
|
||||
|
||||
Attribute assignments update the instance's dictionary.
|
||||
\indexiii{class instance}{attribute}{assignment}
|
||||
|
||||
Class instances can pretend to be numbers, sequences, or mappings if
|
||||
they have methods with certain special names. These are described in
|
||||
section \ref{specialnames}.
|
||||
\obindex{number}
|
||||
\obindex{sequence}
|
||||
\obindex{mapping}
|
||||
|
||||
Special read-only attributes: \verb\__dict__\ yields the attribute
|
||||
dictionary; \verb\__class__\ yields the instance's class.
|
||||
\ttindex{__dict__}
|
||||
\ttindex{__class__}
|
||||
|
||||
\item[Files]
|
||||
A file object represents an open file. (It is a wrapper around a C
|
||||
{\tt stdio} file pointer.) File objects are created by the
|
||||
\verb\open()\ built-in function, and also by \verb\posix.popen()\ and
|
||||
the \verb\makefile\ method of socket objects. \verb\sys.stdin\,
|
||||
\verb\sys.stdout\ and \verb\sys.stderr\ are file objects corresponding
|
||||
to the interpreter's standard input, output and error streams.
|
||||
See the Python Library Reference for methods of file objects and other
|
||||
details.
|
||||
\obindex{file}
|
||||
\index{C}
|
||||
\index{stdio}
|
||||
\bifuncindex{open}
|
||||
\bifuncindex{popen}
|
||||
\bifuncindex{makefile}
|
||||
\ttindex{stdin}
|
||||
\ttindex{stdout}
|
||||
\ttindex{stderr}
|
||||
\ttindex{sys.stdin}
|
||||
\ttindex{sys.stdout}
|
||||
\ttindex{sys.stderr}
|
||||
|
||||
\item[Internal types]
|
||||
A few types used internally by the interpreter are exposed to the user.
|
||||
Their definition may change with future versions of the interpreter,
|
||||
but they are mentioned here for completeness.
|
||||
\index{internal type}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Code objects]
|
||||
Code objects represent executable code. The difference between a code
|
||||
object and a function object is that the function object contains an
|
||||
explicit reference to the function's context (the module in which it
|
||||
was defined) while a code object contains no context. There is no way
|
||||
to execute a bare code object.
|
||||
\obindex{code}
|
||||
|
||||
Special read-only attributes: \verb\co_code\ is a string representing
|
||||
the sequence of instructions; \verb\co_consts\ is a list of literals
|
||||
used by the code; \verb\co_names\ is a list of names (strings) used by
|
||||
the code; \verb\co_filename\ is the filename from which the code was
|
||||
compiled. (To find out the line numbers, you would have to decode the
|
||||
instructions; the standard library module \verb\dis\ contains an
|
||||
example of how to do this.)
|
||||
\ttindex{co_code}
|
||||
\ttindex{co_consts}
|
||||
\ttindex{co_names}
|
||||
\ttindex{co_filename}
|
||||
|
||||
\item[Frame objects]
|
||||
Frame objects represent execution frames. They may occur in traceback
|
||||
objects (see below).
|
||||
\obindex{frame}
|
||||
|
||||
Special read-only attributes: \verb\f_back\ refers to the previous
|
||||
stack frame (towards the caller), or \verb\None\ if this is the bottom
|
||||
stack frame; \verb\f_code\ is the code object being executed in this
|
||||
frame; \verb\f_globals\ is the dictionary used to look up global
|
||||
variables; \verb\f_locals\ is the dictionary used for local variables;
|
||||
\verb\f_lineno\ gives the line number and \verb\f_lasti\ gives the
|
||||
precise instruction (this is an index into the instruction string of
|
||||
the code object).
|
||||
\ttindex{f_back}
|
||||
\ttindex{f_code}
|
||||
\ttindex{f_globals}
|
||||
\ttindex{f_locals}
|
||||
\ttindex{f_lineno}
|
||||
\ttindex{f_lasti}
|
||||
|
||||
\item[Traceback objects]
|
||||
Traceback objects represent a stack trace of an exception. A
|
||||
traceback object is created when an exception occurs. When the search
|
||||
for an exception handler unwinds the execution stack, at each unwound
|
||||
level a traceback object is inserted in front of the current
|
||||
traceback. When an exception handler is entered, the stack trace is
|
||||
made available to the program as \verb\sys.exc_traceback\. When the
|
||||
program contains no suitable handler, the stack trace is written
|
||||
(nicely formatted) to the standard error stream; if the interpreter is
|
||||
interactive, it is also made available to the user as
|
||||
\verb\sys.last_traceback\.
|
||||
\obindex{traceback}
|
||||
\indexii{stack}{trace}
|
||||
\indexii{exception}{handler}
|
||||
\indexii{execution}{stack}
|
||||
\ttindex{exc_traceback}
|
||||
\ttindex{last_traceback}
|
||||
\ttindex{sys.exc_traceback}
|
||||
\ttindex{sys.last_traceback}
|
||||
|
||||
Special read-only attributes: \verb\tb_next\ is the next level in the
|
||||
stack trace (towards the frame where the exception occurred), or
|
||||
\verb\None\ if there is no next level; \verb\tb_frame\ points to the
|
||||
execution frame of the current level; \verb\tb_lineno\ gives the line
|
||||
number where the exception occurred; \verb\tb_lasti\ indicates the
|
||||
precise instruction. The line number and last instruction in the
|
||||
traceback may differ from the line number of its frame object if the
|
||||
exception occurred in a \verb\try\ statement with no matching
|
||||
\verb\except\ clause or with a \verb\finally\ clause.
|
||||
\ttindex{tb_next}
|
||||
\ttindex{tb_frame}
|
||||
\ttindex{tb_lineno}
|
||||
\ttindex{tb_lasti}
|
||||
\stindex{try}
|
||||
|
||||
\end{description} % Internal types
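
The following sketch walks a traceback from the handler towards the
frame where the exception occurred, using the attributes listed above.
It assumes a current Python~3 interpreter: the traceback is obtained
with \verb\sys.exc_info()\ instead of \verb\sys.exc_traceback\, and the
code object attribute \verb\co_name\ is not among those documented
here. The functions \verb\inner\ and \verb\outer\ are invented.

```python
import sys

def inner():
    return 1 / 0               # raises ZeroDivisionError

def outer():
    return inner()

names = []
lines = []
try:
    outer()
except ZeroDivisionError:
    tb = sys.exc_info()[2]     # traceback of the exception being handled
    while tb is not None:
        code = tb.tb_frame.f_code      # code object executing in that frame
        names.append(code.co_name)     # co_name: assumed modern attribute
        lines.append(tb.tb_lineno)     # line where the exception passed through
        tb = tb.tb_next                # one level closer to the failing frame

print(names, lines)
```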
|
||||
|
||||
\end{description} % Types
|
||||
|
||||
|
||||
\section{Special method names} \label{specialnames}
|
||||
|
||||
A class can implement certain operations that are invoked by special
|
||||
syntax (such as subscription or arithmetic operations) by defining
|
||||
methods with special names. For instance, if a class defines a
|
||||
method named \verb\__getitem__\, and \verb\x\ is an instance of this
|
||||
class, then \verb\x[i]\ is equivalent to \verb\x.__getitem__(i)\.
|
||||
(The reverse is not true --- if \verb\x\ is a list object,
|
||||
\verb\x.__getitem__(i)\ is not equivalent to \verb\x[i]\.)
|
||||
|
||||
Except for \verb\__repr__\ and \verb\__cmp__\, attempts to execute an
|
||||
operation raise an exception when no appropriate method is defined.
|
||||
For \verb\__repr__\ and \verb\__cmp__\, the traditional
|
||||
interpretations are used in this case.
|
||||
|
||||
|
||||
\subsection{Special methods for any type}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __repr__(self)]
|
||||
Called by the \verb\print\ statement and conversions (reverse quotes) to
|
||||
compute the string representation of an object.
|
||||
|
||||
\item[\tt __cmp__(self, other)]
|
||||
Called by all comparison operations. Should return -1 if
|
||||
\verb\self < other\, 0 if \verb\self == other\, +1 if
|
||||
\verb\self > other\. (Implementation note: due to limitations in the
|
||||
interpreter, exceptions raised by comparisons are ignored, and the
|
||||
objects will be considered equal in this case.)
|
||||
|
||||
\end{description}
|
||||
|
||||
|
||||
\subsection{Special methods for sequence and mapping types}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __len__(self)]
|
||||
Called to implement the built-in function \verb\len()\. Should return
|
||||
the length of the object, an integer \verb\>=\ 0. Also, an object
|
||||
whose \verb\__len__()\ method returns 0 is considered to be false in a
|
||||
Boolean context.
|
||||
|
||||
\item[\tt __getitem__(self, key)]
|
||||
Called to implement evaluation of \verb\self[key]\. Note that the
|
||||
special interpretation of negative keys (if the class wishes to
|
||||
emulate a sequence type) is up to the \verb\__getitem__\ method.
|
||||
|
||||
\item[\tt __setitem__(self, key, value)]
|
||||
Called to implement assignment to \verb\self[key]\. Same note as for
|
||||
\verb\__getitem__\.
|
||||
|
||||
\item[\tt __delitem__(self, key)]
|
||||
Called to implement deletion of \verb\self[key]\. Same note as for
|
||||
\verb\__getitem__\.
|
||||
|
||||
\end{description}
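
As a minimal sketch of the methods above, the following class forwards
subscription, assignment, deletion and \verb\len()\ to a dictionary
kept in an instance attribute. The class name \verb\Table\ and the
attribute \verb\data\ are invented for the example, and the code
assumes a current Python~3 interpreter.

```python
class Table:
    """Toy mapping-like class; name and storage layout are illustrative."""

    def __len__(self):
        return len(self.data)

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

t = Table()
t.data = {}                 # plain instance attribute holding the items
empty_is_false = not t      # __len__() returns 0, so t counts as false
t["a"] = 1                  # calls t.__setitem__("a", 1)
t["b"] = 2
value = t["a"]              # calls t.__getitem__("a")
del t["b"]                  # calls t.__delitem__("b")
size = len(t)               # calls t.__len__()
print(empty_is_false, value, size)
```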
|
||||
|
||||
|
||||
\subsection{Special methods for sequence types}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __getslice__(self, i, j)]
|
||||
Called to implement evaluation of \verb\self[i:j]\. Note that missing
|
||||
\verb\i\ or \verb\j\ are replaced by 0 or \verb\len(self)\,
|
||||
respectively, and \verb\len(self)\ has been added (once) to originally
|
||||
negative \verb\i\ or \verb\j\ by the time this function is called
|
||||
(unlike for \verb\__getitem__\).
|
||||
|
||||
\item[\tt __setslice__(self, i, j, sequence)]
|
||||
Called to implement assignment to \verb\self[i:j]\. Same notes as for
|
||||
\verb\__getslice__\.
|
||||
|
||||
\item[\tt __delslice__(self, i, j)]
|
||||
Called to implement deletion of \verb\self[i:j]\. Same notes as for
|
||||
\verb\__getslice__\.
|
||||
|
||||
\end{description}
|
||||
|
||||
|
||||
\subsection{Special methods for numeric types}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __add__(self, other)]\itemjoin
|
||||
\item[\tt __sub__(self, other)]\itemjoin
|
||||
\item[\tt __mul__(self, other)]\itemjoin
|
||||
\item[\tt __div__(self, other)]\itemjoin
|
||||
\item[\tt __mod__(self, other)]\itemjoin
|
||||
\item[\tt __divmod__(self, other)]\itemjoin
|
||||
\item[\tt __pow__(self, other)]\itemjoin
|
||||
\item[\tt __lshift__(self, other)]\itemjoin
|
||||
\item[\tt __rshift__(self, other)]\itemjoin
|
||||
\item[\tt __and__(self, other)]\itemjoin
|
||||
\item[\tt __xor__(self, other)]\itemjoin
|
||||
\item[\tt __or__(self, other)]\itembreak
|
||||
Called to implement the binary arithmetic operations (\verb\+\,
|
||||
\verb\-\, \verb\*\, \verb\/\, \verb\%\, \verb\divmod()\, \verb\pow()\,
|
||||
\verb\<<\, \verb\>>\, \verb\&\, \verb\^\, \verb\|\).
|
||||
|
||||
\item[\tt __neg__(self)]\itemjoin
|
||||
\item[\tt __pos__(self)]\itemjoin
|
||||
\item[\tt __abs__(self)]\itemjoin
|
||||
\item[\tt __invert__(self)]\itembreak
|
||||
Called to implement the unary arithmetic operations (\verb\-\, \verb\+\,
|
||||
\verb\abs()\ and \verb\~\).
|
||||
|
||||
\item[\tt __nonzero__(self)]
|
||||
Called to implement boolean testing; should return 0 or 1. An
|
||||
alternative name for this method is \verb\__len__\.
|
||||
|
||||
\item[\tt __coerce__(self, other)]
|
||||
Called to implement ``mixed-mode'' numeric arithmetic. Should either
|
||||
return a tuple containing self and other converted to a common numeric
|
||||
type, or None if no way of conversion is known. When the common type
|
||||
would be the type of other, it is sufficient to return None, since the
|
||||
interpreter will also ask the other object to attempt a coercion (but
|
||||
sometimes, if the implementation of the other type cannot be changed,
|
||||
it is useful to do the conversion to the other type here).
|
||||
|
||||
Note that this method is not called to coerce the arguments to \verb\+\
|
||||
and \verb\*\, because these are also used to implement sequence
|
||||
concatenation and repetition, respectively. Also note that, for the
|
||||
same reason, in \verb\n*x\, where \verb\n\ is a built-in number and
|
||||
\verb\x\ is an instance, a call to \verb\x.__mul__(n)\ is made.%
|
||||
\footnote{The interpreter should really distinguish between
|
||||
user-defined classes implementing sequences, mappings or numbers, but
|
||||
currently it doesn't --- hence this strange exception.}
|
||||
|
||||
\item[\tt __int__(self)]\itemjoin
|
||||
\item[\tt __long__(self)]\itemjoin
|
||||
\item[\tt __float__(self)]\itembreak
|
||||
Called to implement the built-in functions \verb\int()\, \verb\long()\
|
||||
and \verb\float()\. Should return a value of the appropriate type.
|
||||
|
||||
\end{description}
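
A few of the numeric methods can be sketched as follows. Only methods
that still exist under the same names in a current Python~3 interpreter
are used (an assumption about the interpreter, not a statement about
the implementation described here); the class \verb\Money\ and its
attribute \verb\cents\ are invented.

```python
class Money:
    """Toy numeric class holding a number of cents (illustrative only)."""

    def __add__(self, other):
        result = Money()
        result.cents = self.cents + other.cents
        return result

    def __neg__(self):
        result = Money()
        result.cents = -self.cents
        return result

    def __int__(self):
        return self.cents

    def __float__(self):
        return self.cents / 100.0

a = Money()
a.cents = 250
b = Money()
b.cents = 125

total = a + b               # calls a.__add__(b)
debt = -total               # calls total.__neg__()
print(int(total), float(debt))
```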
|
|
@ -0,0 +1,147 @@
|
|||
\chapter{Execution model}
|
||||
\index{execution model}
|
||||
|
||||
\section{Code blocks, execution frames, and name spaces} \label{execframes}
|
||||
\index{code block}
|
||||
\indexii{execution}{frame}
|
||||
\index{name space}
|
||||
|
||||
A {\em code block} is a piece of Python program text that can be
|
||||
executed as a unit, such as a module, a class definition or a function
|
||||
body. Some code blocks (like modules) are executed only once, others
|
||||
(like function bodies) may be executed many times. Code blocks may
|
||||
textually contain other code blocks. Code blocks may invoke other
|
||||
code blocks (that may or may not be textually contained in them) as
|
||||
part of their execution, e.g. by invoking (calling) a function.
|
||||
\index{code block}
|
||||
\indexii{code}{block}
|
||||
|
||||
The following are code blocks: A module is a code block. A function
|
||||
body is a code block. A class definition is a code block. Each
|
||||
command typed interactively is a separate code block; a script file is
|
||||
a code block. The string arguments passed to the built-in functions
|
||||
\verb\eval\ and \verb\exec\ are code blocks. And finally, the
|
||||
expression read and evaluated by the built-in function \verb\input\ is
|
||||
a code block.
|
||||
|
||||
A code block is executed in an execution frame. An {\em execution
|
||||
frame} contains some administrative information (used for debugging),
|
||||
determines where and how execution continues after the code block's
|
||||
execution has completed, and (perhaps most importantly) defines two
|
||||
name spaces, the local and the global name space, that affect
|
||||
execution of the code block.
|
||||
\indexii{execution}{frame}
|
||||
|
||||
A {\em name space} is a mapping from names (identifiers) to objects.
|
||||
A particular name space may be referenced by more than one execution
|
||||
frame, and from other places as well. Adding a name to a name space
|
||||
is called {\em binding} a name (to an object); changing the mapping of
|
||||
a name is called {\em rebinding}; removing a name is {\em unbinding}.
|
||||
Name spaces are functionally equivalent to dictionaries.
|
||||
\index{name space}
|
||||
\indexii{binding}{name}
|
||||
\indexii{rebinding}{name}
|
||||
\indexii{unbinding}{name}
|
||||
|
||||
The {\em local name space} of an execution frame determines the default
|
||||
place where names are defined and searched. The {\em global name
|
||||
space} determines the place where names listed in \verb\global\
|
||||
statements are defined and searched, and where names that are not
|
||||
explicitly bound in the current code block are searched.
|
||||
\indexii{local}{name space}
|
||||
\indexii{global}{name space}
|
||||
\stindex{global}
|
||||
|
||||
Whether a name is local or global in a code block is determined by
|
||||
static inspection of the source text for the code block: in the
|
||||
absence of \verb\global\ statements, a name that is bound anywhere in
|
||||
the code block is local in the entire code block; all other names are
|
||||
considered global. The \verb\global\ statement forces global
|
||||
interpretation of selected names throughout the code block. The
|
||||
following constructs bind names: formal parameters, \verb\import\
|
||||
statements, class and function definitions (these bind the class or
|
||||
function name), and targets that are identifiers if occurring in an
|
||||
assignment, \verb\for\ loop header, or \verb\except\ clause header.
|
||||
(A target occurring in a \verb\del\ statement does not bind a name.)
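
The effect of this static rule can be sketched with four small
functions. All names are invented for the example, and the behaviour
shown is that of a current Python~3 interpreter, where using a local
name before it is bound raises a subclass of \verb\NameError\.

```python
counter = 0

def read_only():
    return counter          # never bound in this block: global

def shadow():
    counter = 10            # bound in this block: local throughout it
    return counter

def forced_global():
    global counter          # forces the global interpretation
    counter = counter + 1
    return counter

def too_early():
    value = counter         # counter is local here, but not yet bound
    counter = 1
    return value

seen = read_only()
shadowed = shadow()         # leaves the global counter untouched
bumped = forced_global()    # rebinds the global counter

try:
    too_early()
    failed_early = False
except NameError:
    failed_early = True

print(seen, shadowed, bumped, counter, failed_early)
```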
|
||||
|
||||
When a global name is not found in the global name space, it is
|
||||
searched in the list of ``built-in'' names (which is actually the
|
||||
global name space of the module \verb\builtin\). When a name is not
|
||||
found at all, the \verb\NameError\ exception is raised.
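
A short sketch of this fallback, assuming a current Python~3
interpreter in which the module of built-in names is called
\verb\builtins\ (not \verb\builtin\ as above); the name
\verb\undefined_name_for_demo\ is deliberately left unbound.

```python
import builtins

def lookup():
    return len              # bound neither locally nor globally: built-in

found_builtin = lookup() is builtins.len

try:
    undefined_name_for_demo
    raised = False
except NameError:
    raised = True           # the name was not found in any name space

print(found_builtin, raised)
```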
|
||||
|
||||
The following table lists the meaning of the local and global name
|
||||
space for various types of code blocks. The name space for a
|
||||
particular module is automatically created when the module is first
|
||||
referenced.
|
||||
|
||||
\begin{center}
|
||||
\begin{tabular}{|l|l|l|l|}
|
||||
\hline
|
||||
Code block type & Global name space & Local name space & Notes \\
|
||||
\hline
|
||||
Module & n.s. for this module & same as global & \\
|
||||
Script & n.s. for \verb\__main__\ & same as global & \\
|
||||
Interactive command & n.s. for \verb\__main__\ & same as global & \\
|
||||
Class definition & global n.s. of containing block & new n.s. & \\
|
||||
Function body & global n.s. of containing block & new n.s. & \\
|
||||
String passed to \verb\exec\ or \verb\eval\
|
||||
& global n.s. of caller & local n.s. of caller & (1) \\
|
||||
File read by \verb\execfile\
|
||||
& global n.s. of caller & local n.s. of caller & (1) \\
|
||||
Expression read by \verb\input\
|
||||
& global n.s. of caller & local n.s. of caller & \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
|
||||
Notes:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[n.s.] means {\em name space}
|
||||
|
||||
\item[(1)] The global and local name space for these functions can be
|
||||
overridden with optional extra arguments.
|
||||
|
||||
\end{description}
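
Note (1) can be illustrated with a small sketch. It assumes a current
Python~3 interpreter, in which \verb\exec\ is a built-in function; the
names \verb\x\, \verb\y\ and \verb\scratch\ are invented.

```python
x = 1

inherited = eval("x + 1")                # uses the caller's name spaces
overridden = eval("x + 1", {"x": 41})    # explicit global name space

scratch = {}
exec("y = 6 * 7", scratch)               # names bound by the string land in scratch

print(inherited, overridden, scratch["y"])
```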
|
||||
|
||||
\section{Exceptions}
|
||||
|
||||
Exceptions are a means of breaking out of the normal flow of control
|
||||
of a code block in order to handle errors or other exceptional
|
||||
conditions. An exception is {\em raised} at the point where the error
|
||||
is detected; it may be {\em handled} by the surrounding code block or
|
||||
by any code block that directly or indirectly invoked the code block
|
||||
where the error occurred.
|
||||
\index{exception}
|
||||
\index{raise an exception}
|
||||
\index{handle an exception}
|
||||
\index{exception handler}
|
||||
\index{errors}
|
||||
\index{error handling}
|
||||
|
||||
The Python interpreter raises an exception when it detects a run-time
|
||||
error (such as division by zero). A Python program can also
|
||||
explicitly raise an exception with the \verb\raise\ statement.
|
||||
Exception handlers are specified with the \verb\try...except\
|
||||
statement.
|
||||
|
||||
Python uses the ``termination'' model of error handling: an exception
|
||||
handler can find out what happened and continue execution at an outer
|
||||
level, but it cannot repair the cause of the error and retry the
|
||||
failing operation (except by re-entering the offending piece of
|
||||
code from the top).
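
The following sketch shows what retrying looks like under the
termination model: the handler cannot resume the failed call, so the
loop re-enters it from the top. The function \verb\flaky\ is a
hypothetical operation that fails twice before succeeding, and the code
assumes a current Python~3 interpreter.

```python
attempts = []

def flaky(n):
    """Hypothetical operation that fails on its first two calls."""
    attempts.append(n)
    if len(attempts) < 3:
        raise ValueError("not yet")
    return "ok"

result = None
while result is None:
    try:
        result = flaky(len(attempts))   # re-entered from the top on each retry
    except ValueError:
        pass    # handler runs at this outer level; the failed call is not resumed

print(result, len(attempts))
```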
|
||||
|
||||
When an exception is not handled at all, the interpreter terminates
|
||||
execution of the program, or returns to its interactive main loop.
|
||||
|
||||
Exceptions are identified by string objects. Two different string
|
||||
objects with the same value identify different exceptions.
|
||||
|
||||
When an exception is raised, an object (maybe \verb\None\) is passed
|
||||
as the exception's ``parameter''; this object does not affect the
|
||||
selection of an exception handler, but is passed to the selected
|
||||
exception handler as additional information.
|
||||
|
||||
See also the description of the \verb\try\ and \verb\raise\
|
||||
statements.
|
|
@ -0,0 +1,672 @@
|
|||
\chapter{Expressions and conditions}
|
||||
\index{expression}
|
||||
\index{condition}
|
||||
|
||||
{\bf Note:} In this and the following chapters, extended BNF notation
|
||||
will be used to describe syntax, not lexical analysis.
|
||||
\index{BNF}
|
||||
|
||||
This chapter explains the meaning of the elements of expressions and
|
||||
conditions. Conditions are a superset of expressions, and a condition
|
||||
may be used wherever an expression is required by enclosing it in
|
||||
parentheses. The only places where expressions are used in the syntax
|
||||
instead of conditions are in expression statements and on the
|
||||
right-hand side of assignment statements; this catches some nasty bugs
|
||||
like accidentally writing \verb\x == 1\ instead of \verb\x = 1\.
|
||||
\indexii{assignment}{statement}
|
||||
|
||||
The comma plays several roles in Python's syntax. It is usually an
|
||||
operator with a lower precedence than all others, but occasionally
|
||||
serves other purposes as well; e.g. it separates function arguments,
|
||||
is used in list and dictionary constructors, and has special semantics
|
||||
in \verb\print\ statements.
|
||||
\index{comma}
|
||||
|
||||
When (one alternative of) a syntax rule has the form
|
||||
|
||||
\begin{verbatim}
|
||||
name: othername
|
||||
\end{verbatim}
|
||||
|
||||
and no semantics are given, the semantics of this form of \verb\name\
|
||||
are the same as for \verb\othername\.
|
||||
\index{syntax}
|
||||
|
||||
\section{Arithmetic conversions}
|
||||
\indexii{arithmetic}{conversion}
|
||||
|
||||
When a description of an arithmetic operator below uses the phrase
|
||||
``the numeric arguments are converted to a common type'',
|
||||
this means both that if either argument is not a number, a
|
||||
\verb\TypeError\ exception is raised, and that otherwise
|
||||
the following conversions are applied:
|
||||
\exindex{TypeError}
|
||||
\indexii{floating point}{number}
|
||||
\indexii{long}{integer}
|
||||
\indexii{plain}{integer}
|
||||
|
||||
\begin{itemize}
|
||||
\item first, if either argument is a floating point number,
|
||||
the other is converted to floating point;
|
||||
\item else, if either argument is a long integer,
|
||||
the other is converted to long integer;
|
||||
\item otherwise, both must be plain integers and no conversion
|
||||
is necessary.
|
||||
\end{itemize}
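
A small sketch of these rules, assuming a current Python~3 interpreter
(which no longer distinguishes plain from long integers, so only the
floating point rule and the \verb\TypeError\ case are shown):

```python
mixed = 1 + 2.5             # the integer is converted to floating point
plain = 7 + 8               # both arguments are integers: no conversion

try:
    1 + "one"               # one argument is not a number
    rejected = False
except TypeError:
    rejected = True

print(type(mixed).__name__, type(plain).__name__, rejected)
```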
|
||||
|
||||
\section{Atoms}
|
||||
\index{atom}
|
||||
|
||||
Atoms are the most basic elements of expressions. Forms enclosed in
|
||||
reverse quotes or in parentheses, brackets or braces are also
|
||||
categorized syntactically as atoms. The syntax for atoms is:
|
||||
|
||||
\begin{verbatim}
|
||||
atom: identifier | literal | enclosure
|
||||
enclosure: parenth_form | list_display | dict_display | string_conversion
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Identifiers (Names)}
|
||||
\index{name}
|
||||
\index{identifier}
|
||||
|
||||
An identifier occurring as an atom is a reference to a local, global
|
||||
or built-in name binding. If a name can be assigned to anywhere in a
|
||||
code block, and is not mentioned in a \verb\global\ statement in that
|
||||
code block, it refers to a local name throughout that code block.
|
||||
Otherwise, it refers to a global name if one exists, else to a
|
||||
built-in name.
|
||||
\indexii{name}{binding}
|
||||
\index{code block}
|
||||
\stindex{global}
|
||||
\indexii{built-in}{name}
|
||||
\indexii{global}{name}
|
||||
|
||||
When the name is bound to an object, evaluation of the atom yields
|
||||
that object. When a name is not bound, an attempt to evaluate it
|
||||
raises a \verb\NameError\ exception.
|
||||
\exindex{NameError}
|
||||
|
||||
\subsection{Literals}
|
||||
\index{literal}
|
||||
|
||||
Python knows string and numeric literals:
|
||||
|
||||
\begin{verbatim}
|
||||
literal: stringliteral | integer | longinteger | floatnumber
|
||||
\end{verbatim}
|
||||
|
||||
Evaluation of a literal yields an object of the given type (string,
|
||||
integer, long integer, floating point number) with the given value.
|
||||
The value may be approximated in the case of floating point literals.
|
||||
See section \ref{literals} for details.
|
||||
|
||||
All literals correspond to immutable data types, and hence the
|
||||
object's identity is less important than its value. Multiple
|
||||
evaluations of literals with the same value (either the same
|
||||
occurrence in the program text or a different occurrence) may obtain
|
||||
the same object or a different object with the same value.
|
||||
\indexiii{immutable}{data}{type}
|
||||
|
||||
(In the original implementation, all literals in the same code block
|
||||
with the same type and value yield the same object.)
|
||||
|
||||
\subsection{Parenthesized forms}
|
||||
\index{parenthesized form}
|
||||
|
||||
A parenthesized form is an optional condition list enclosed in
|
||||
parentheses:
|
||||
|
||||
\begin{verbatim}
|
||||
parenth_form: "(" [condition_list] ")"
|
||||
\end{verbatim}
|
||||
|
||||
A parenthesized condition list yields whatever that condition list
|
||||
yields.
|
||||
|
||||
An empty pair of parentheses yields an empty tuple object. Since
|
||||
tuples are immutable, the rules for literals apply here.
|
||||
\indexii{empty}{tuple}
|
||||
|
||||
(Note that tuples are not formed by the parentheses, but rather by use
|
||||
of the comma operator. The exception is the empty tuple, for which
|
||||
parentheses {\em are} required --- allowing unparenthesized ``nothing''
|
||||
in expressions would cause ambiguities and allow common typos to
|
||||
pass uncaught.)
|
||||
\index{comma}
|
||||
\indexii{tuple}{display}
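
The role of the comma and of the parentheses can be seen in the
following sketch (variable names are invented; behaviour as in a
current Python~3 interpreter):

```python
pair = 1, 2                 # the comma forms the tuple
single = 1,                 # a trailing comma forms a one-item tuple
grouped = (1)               # parentheses alone only group: the integer 1
empty = ()                  # the empty tuple does need the parentheses

print(pair, single, grouped, empty)
```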
|
||||
|
||||
\subsection{List displays}
|
||||
\indexii{list}{display}
|
||||
|
||||
A list display is a possibly empty series of conditions enclosed in
|
||||
square brackets:
|
||||
|
||||
\begin{verbatim}
|
||||
list_display: "[" [condition_list] "]"
|
||||
\end{verbatim}
|
||||
|
||||
A list display yields a new list object.
|
||||
\obindex{list}
|
||||
|
||||
If it has no condition list, the list object has no items. Otherwise,
|
||||
the elements of the condition list are evaluated from left to right
|
||||
and inserted in the list object in that order.
|
||||
\indexii{empty}{list}
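
The evaluation order can be made visible with a helper that records
each call. The helper \verb\note\ and the list \verb\order\ are
invented for this sketch, which assumes a current Python~3 interpreter.

```python
order = []

def note(tag):
    """Record that this element was evaluated, then return its value."""
    order.append(tag)
    return tag.upper()

items = [note("a"), note("b"), note("c")]

# Each evaluation of a display yields a new list object.
first = []
second = []
distinct = first is not second

print(order, items, distinct)
```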
|
||||
|
||||
\subsection{Dictionary displays} \label{dict}
|
||||
\indexii{dictionary}{display}
|
||||
|
||||
A dictionary display is a possibly empty series of key/datum pairs
|
||||
enclosed in curly braces:
|
||||
\index{key}
|
||||
\index{datum}
|
||||
\index{key/datum pair}
|
||||
|
||||
\begin{verbatim}
|
||||
dict_display: "{" [key_datum_list] "}"
|
||||
key_datum_list: key_datum ("," key_datum)* [","]
|
||||
key_datum: condition ":" condition
|
||||
\end{verbatim}
|
||||
|
||||
A dictionary display yields a new dictionary object.
|
||||
\obindex{dictionary}
|
||||
|
||||
The key/datum pairs are evaluated from left to right to define the
|
||||
entries of the dictionary: each key object is used as a key into the
|
||||
dictionary to store the corresponding datum.
|
||||
|
||||
Keys must be strings, otherwise a \verb\TypeError\ exception is
|
||||
raised. Clashes between duplicate keys are not detected; the last
|
||||
datum (textually rightmost in the display) stored for a given key
|
||||
value prevails.
|
||||
\exindex{TypeError}
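
The treatment of duplicate keys can be sketched as follows, with
invented keys. The sketch uses string keys only, as required above; a
current Python~3 interpreter also accepts other key types, so the
\verb\TypeError\ case is not shown.

```python
table = {"spam": 1, "eggs": 2, "spam": 3}   # "spam" occurs twice

winner = table["spam"]      # the textually rightmost datum prevails
size = len(table)           # the duplicate does not create a second entry

print(winner, size)
```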
|
||||
|
||||
\subsection{String conversions}
|
||||
\indexii{string}{conversion}
|
||||
|
||||
A string conversion is a condition list enclosed in reverse (or
|
||||
backward) quotes:
|
||||
|
||||
\begin{verbatim}
|
||||
string_conversion: "`" condition_list "`"
|
||||
\end{verbatim}
|
||||
|
||||
A string conversion evaluates the contained condition list and
|
||||
converts the resulting object into a string according to rules
|
||||
specific to its type.
|
||||
|
||||
If the object is a string, a number, \verb\None\, or a tuple, list or
|
||||
dictionary containing only objects whose type is one of these, the
|
||||
resulting string is a valid Python expression which can be passed to
|
||||
the built-in function \verb\eval()\ to yield an object with the
|
||||
same value (or an approximation, if floating point numbers are
|
||||
involved).
|
||||
|
||||
(In particular, converting a string adds quotes around it and converts
|
||||
``funny'' characters to escape sequences that are safe to print.)
|
||||
|
||||
It is illegal to attempt to convert recursive objects (e.g. lists or
|
||||
dictionaries that contain a reference to themselves, directly or
|
||||
indirectly).
|
||||
\obindex{recursive}
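
The round trip through \verb\eval()\ can be sketched as follows. The
sketch uses the built-in function \verb\repr()\ in place of reverse
quotes; that the two perform the same conversion is an assumption made
here so that the example also runs on interpreters that do not accept
the reverse-quote syntax.

\begin{verbatim}
x = ['spam', 42, None, (1, 2), {'a': 'b'}]
s = repr(x)            # assumed equivalent to `x`
y = eval(s)            # an object with the same value as x
\end{verbatim}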
|
||||
|
||||
\section{Primaries} \label{primaries}
|
||||
\index{primary}
|
||||
|
||||
Primaries represent the most tightly bound operations of the language.
|
||||
Their syntax is:
|
||||
|
||||
\begin{verbatim}
|
||||
primary: atom | attributeref | subscription | slicing | call
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Attribute references}
|
||||
\indexii{attribute}{reference}
|
||||
|
||||
An attribute reference is a primary followed by a period and a name:
|
||||
|
||||
\begin{verbatim}
|
||||
attributeref: primary "." identifier
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to an object of a type that supports
|
||||
attribute references, e.g. a module or a list. This object is then
|
||||
asked to produce the attribute whose name is the identifier. If this
|
||||
attribute is not available, the exception \verb\AttributeError\ is
|
||||
raised. Otherwise, the type and value of the object produced are
|
||||
determined by the object. Multiple evaluations of the same attribute
|
||||
reference may yield different objects.
|
||||
\obindex{module}
|
||||
\obindex{list}
|
||||
|
||||
\subsection{Subscriptions}
|
||||
\index{subscription}
|
||||
|
||||
A subscription selects an item of a sequence (string, tuple or list)
|
||||
or mapping (dictionary) object:
|
||||
\obindex{sequence}
|
||||
\obindex{mapping}
|
||||
\obindex{string}
|
||||
\obindex{tuple}
|
||||
\obindex{list}
|
||||
\obindex{dictionary}
|
||||
\indexii{sequence}{item}
|
||||
|
||||
\begin{verbatim}
|
||||
subscription: primary "[" condition "]"
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to an object of a sequence or mapping type.
|
||||
|
||||
If it is a mapping, the condition must evaluate to an object whose
|
||||
value is one of the keys of the mapping, and the subscription selects
|
||||
the value in the mapping that corresponds to that key.
|
||||
|
||||
If it is a sequence, the condition must evaluate to a plain integer.
|
||||
If this value is negative, the length of the sequence is added to it
|
||||
(so that, e.g. \verb\x[-1]\ selects the last item of \verb\x\).
|
||||
The resulting value must be a nonnegative integer smaller than the
|
||||
number of items in the sequence, and the subscription selects the item
|
||||
whose index is that value (counting from zero).
|
||||
|
||||
A string's items are characters. A character is not a separate data
|
||||
type but a string of exactly one character.
|
||||
\index{character}
|
||||
\indexii{string}{item}
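
A few subscriptions, given as an informal illustration of the rules
above:

\begin{verbatim}
x = ['a', 'b', 'c', 'd']
last = x[-1]           # 'd', the same item as x[len(x) - 1]
first = x[-len(x)]     # 'a', the same item as x[0]
c = 'spam'[1]          # 'p', a string of exactly one character
\end{verbatim}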
|
||||
|
||||
\subsection{Slicings}
|
||||
\index{slicing}
|
||||
\index{slice}
|
||||
|
||||
A slicing (or slice) selects a range of items in a sequence (string,
|
||||
tuple or list) object:
|
||||
\obindex{sequence}
|
||||
\obindex{string}
|
||||
\obindex{tuple}
|
||||
\obindex{list}
|
||||
|
||||
\begin{verbatim}
|
||||
slicing: primary "[" [condition] ":" [condition] "]"
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to a sequence object. The lower and upper
|
||||
bound expressions, if present, must evaluate to plain integers;
|
||||
defaults are zero and the sequence's length, respectively. If either
|
||||
bound is negative, the sequence's length is added to it. The slicing
|
||||
now selects all items with index $k$ such that $i \leq k < j$ where $i$
|
||||
and $j$ are the specified lower and upper bounds. This may be an
|
||||
empty sequence. It is not an error if $i$ or $j$ lies outside the
|
||||
range of valid indexes (such items don't exist so they aren't
|
||||
selected).
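
The treatment of defaults, negative bounds and out-of-range bounds can
be illustrated with a short string (an informal sketch):

\begin{verbatim}
s = 'python'
a = s[1:4]             # 'yth': items with index 1, 2 and 3
b = s[:2]              # 'py': the lower bound defaults to zero
c = s[-2:]             # 'on': -2 becomes len(s) - 2
d = s[4:100]           # 'on': a bound beyond the end is not an error
e = s[3:1]             # '': no index k satisfies 3 <= k < 1
\end{verbatim}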
|
||||
|
||||
\subsection{Calls} \label{calls}
|
||||
\index{call}
|
||||
|
||||
A call calls a callable object (e.g. a function) with a possibly empty
|
||||
series of arguments:
|
||||
\obindex{callable}
|
||||
|
||||
\begin{verbatim}
|
||||
call: primary "(" [condition_list] ")"
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to a callable object (user-defined
|
||||
functions, built-in functions, methods of built-in objects, class
|
||||
objects, and methods of class instances are callable). If it is a
|
||||
class, the argument list must be empty; otherwise, the arguments are
|
||||
evaluated.
|
||||
|
||||
A call always returns some value, possibly \verb\None\, unless it
|
||||
raises an exception. How this value is computed depends on the type
|
||||
of the callable object. If it is:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[a user-defined function:] the code block for the function is
|
||||
executed, passing it the argument list. The first thing the code
|
||||
block will do is bind the formal parameters to the arguments; this is
|
||||
described in section \ref{function}. When the code block executes a
|
||||
\verb\return\ statement, this specifies the return value of the
|
||||
function call.
|
||||
\indexii{function}{call}
|
||||
\indexiii{user-defined}{function}{call}
|
||||
\obindex{user-defined function}
|
||||
\obindex{function}
|
||||
|
||||
\item[a built-in function or method:] the result is up to the
|
||||
interpreter; see the library reference manual for the descriptions of
|
||||
built-in functions and methods.
|
||||
\indexii{function}{call}
|
||||
\indexii{built-in function}{call}
|
||||
\indexii{method}{call}
|
||||
\indexii{built-in method}{call}
|
||||
\obindex{built-in method}
|
||||
\obindex{built-in function}
|
||||
\obindex{method}
|
||||
\obindex{function}
|
||||
|
||||
\item[a class object:] a new instance of that class is returned.
|
||||
\obindex{class}
|
||||
\indexii{class object}{call}
|
||||
|
||||
\item[a class instance method:] the corresponding user-defined
|
||||
function is called, with an argument list that is one longer than the
|
||||
argument list of the call: the instance becomes the first argument.
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{instance}{call}
|
||||
\indexii{class instance}{call}
|
||||
|
||||
\end{description}
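
The last case can be sketched as follows. The class \verb\Counter\ and
its method \verb\add\ are hypothetical, and the sketch assumes an
interpreter that accepts this form of class definition.

\begin{verbatim}
class Counter:
    def add(self, n):
        # 'self' receives the instance, 'n' the argument of the call.
        return (self, n)

c = Counter()                      # calling a class: a new instance
via_method = c.add(5)              # one argument in the call ...
via_function = Counter.add(c, 5)   # ... two for the function
\end{verbatim}

Both calls hand the same two arguments to the user-defined function.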
|
||||
|
||||
\section{Unary arithmetic operations}
|
||||
\indexiii{unary}{arithmetic}{operation}
|
||||
\indexiii{unary}{bit-wise}{operation}
|
||||
|
||||
All unary arithmetic (and bit-wise) operations have the same priority:
|
||||
|
||||
\begin{verbatim}
|
||||
u_expr: primary | "-" u_expr | "+" u_expr | "~" u_expr
|
||||
\end{verbatim}
|
||||
|
||||
The unary \verb\"-"\ (minus) operator yields the negation of its
|
||||
numeric argument.
|
||||
\index{negation}
|
||||
\index{minus}
|
||||
|
||||
The unary \verb\"+"\ (plus) operator yields its numeric argument
|
||||
unchanged.
|
||||
\index{plus}
|
||||
|
||||
The unary \verb\"~"\ (invert) operator yields the bit-wise inversion
|
||||
of its plain or long integer argument. The bit-wise inversion of
|
||||
\verb\x\ is defined as \verb\-(x+1)\.
|
||||
\index{inversion}
|
||||
|
||||
In all three cases, if the argument does not have the proper type,
|
||||
a \verb\TypeError\ exception is raised.
|
||||
\exindex{TypeError}
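
The definition of bit-wise inversion can be checked on a few values
(an informal sketch):

\begin{verbatim}
values = [-3, -1, 0, 1, 7]
inverted = []
for x in values:
    inverted.append(~x)        # equals -(x + 1)
# inverted is now [2, 0, -1, -2, -8]
\end{verbatim}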
|
||||
|
||||
\section{Binary arithmetic operations}
|
||||
\indexiii{binary}{arithmetic}{operation}
|
||||
|
||||
The binary arithmetic operations have the conventional priority
|
||||
levels. Note that some of these operations also apply to certain
|
||||
non-numeric types. There is no ``power'' operator, so there are only
|
||||
two levels, one for multiplicative operators and one for additive
|
||||
operators:
|
||||
|
||||
\begin{verbatim}
|
||||
m_expr: u_expr | m_expr "*" u_expr
|
||||
| m_expr "/" u_expr | m_expr "%" u_expr
|
||||
a_expr: m_expr | a_expr "+" m_expr | a_expr "-" m_expr
|
||||
\end{verbatim}
|
||||
|
||||
The \verb\"*"\ (multiplication) operator yields the product of its
|
||||
arguments. The arguments must either both be numbers, or one argument
|
||||
must be a plain integer and the other must be a sequence. In the
|
||||
former case, the numbers are converted to a common type and then
|
||||
multiplied together. In the latter case, sequence repetition is
|
||||
performed; a negative repetition factor yields an empty sequence.
|
||||
\index{multiplication}
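
Sequence repetition, sketched for each of the sequence types:

\begin{verbatim}
a = 3 * 'ab'           # 'ababab'
b = [0] * 4            # [0, 0, 0, 0]
c = (1, 2) * 0         # ()
d = 'ab' * -1          # '': a negative factor yields an empty sequence
\end{verbatim}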
|
||||
|
||||
The \verb\"/"\ (division) operator yields the quotient of its
|
||||
arguments. The numeric arguments are first converted to a common
|
||||
type. Plain or long integer division yields an integer of the same
|
||||
type; the result is that of mathematical division with the `floor'
|
||||
function applied to the result. Division by zero raises the
|
||||
\verb\ZeroDivisionError\ exception.
|
||||
\exindex{ZeroDivisionError}
|
||||
\index{division}
|
||||
|
||||
The \verb\"%"\ (modulo) operator yields the remainder from the
|
||||
division of the first argument by the second. The numeric arguments
|
||||
are first converted to a common type. A zero right argument raises
|
||||
the \verb\ZeroDivisionError\ exception. The arguments may be floating
|
||||
point numbers, e.g. \verb\3.14 % 0.7\ equals \verb\0.34\. The modulo
|
||||
operator always yields a result with the same sign as its second
|
||||
operand (or zero); the absolute value of the result is strictly
|
||||
smaller than the absolute value of the second operand.
|
||||
\index{modulo}
|
||||
|
||||
The integer division and modulo operators are connected by the
|
||||
following identity: \verb\x == (x/y)*y + (x%y)\. Integer division and
|
||||
modulo are also connected with the built-in function \verb\divmod()\:
|
||||
\verb\divmod(x, y) == (x/y, x%y)\. These identities don't hold for
|
||||
floating point numbers; there a similar identity holds where
|
||||
\verb\x/y\ is replaced by \verb\floor(x/y)\.
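
The sign rule and the identity can be checked together for plain
integers. The sketch below goes through \verb\divmod()\ so that it
does not depend on how a particular interpreter defines \verb\"/"\;
the names are arbitrary.

\begin{verbatim}
pairs = [(7, 3), (-7, 3), (7, -3), (-7, -3)]
results = []
for x, y in pairs:
    q, r = divmod(x, y)
    # Here x == q*y + r, and r has the same sign as y (or is zero).
    results.append((q, r))
# results is now [(2, 1), (-3, 2), (-3, -2), (2, -1)]
\end{verbatim}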
|
||||
|
||||
The \verb\"+"\ (addition) operator yields the sum of its arguments.
|
||||
The arguments must either both be numbers, or both sequences of the
|
||||
same type. In the former case, the numbers are converted to a common
|
||||
type and then added together. In the latter case, the sequences are
|
||||
concatenated.
|
||||
\index{addition}
|
||||
|
||||
The \verb\"-"\ (subtraction) operator yields the difference of its
|
||||
arguments. The numeric arguments are first converted to a common
|
||||
type.
|
||||
\index{subtraction}
|
||||
|
||||
\section{Shifting operations}
|
||||
\indexii{shifting}{operation}
|
||||
|
||||
The shifting operations have lower priority than the arithmetic
|
||||
operations:
|
||||
|
||||
\begin{verbatim}
|
||||
shift_expr: a_expr | shift_expr ( "<<" | ">>" ) a_expr
|
||||
\end{verbatim}
|
||||
|
||||
These operators accept plain or long integers as arguments. The
|
||||
arguments are converted to a common type. They shift the first
|
||||
argument to the left or right by the number of bits given by the
|
||||
second argument.
|
||||
|
||||
A right shift by $n$ bits is defined as division by $2^n$. A left
|
||||
shift by $n$ bits is defined as multiplication with $2^n$; for plain
|
||||
integers there is no overflow check so this drops bits and flips the
|
||||
sign if the result is not less than $2^{31}$ in absolute value.
|
||||
|
||||
Negative shift counts raise a \verb\ValueError\ exception.
|
||||
\exindex{ValueError}
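
The definitions can be checked on small values. The sketch assumes
operands small enough that no bits are dropped, so the overflow
behaviour of plain integers does not come into play.

\begin{verbatim}
left = 5 << 3          # 40: 5 multiplied by 8
right = 40 >> 3        # 5: 40 divided by 8
neg = -9 >> 1          # -5: the division rounds toward minus infinity
\end{verbatim}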
|
||||
|
||||
\section{Binary bit-wise operations}
|
||||
\indexiii{binary}{bit-wise}{operation}
|
||||
|
||||
Each of the three bitwise operations has a different priority level:
|
||||
|
||||
\begin{verbatim}
|
||||
and_expr: shift_expr | and_expr "&" shift_expr
|
||||
xor_expr: and_expr | xor_expr "^" and_expr
|
||||
or_expr: xor_expr | or_expr "|" xor_expr
|
||||
\end{verbatim}
|
||||
|
||||
The \verb\"&"\ operator yields the bitwise AND of its arguments, which
|
||||
must be plain or long integers. The arguments are converted to a
|
||||
common type.
|
||||
\indexii{bit-wise}{and}
|
||||
|
||||
The \verb\"^"\ operator yields the bitwise XOR (exclusive OR) of its
|
||||
arguments, which must be plain or long integers. The arguments are
|
||||
converted to a common type.
|
||||
\indexii{bit-wise}{xor}
|
||||
\indexii{exclusive}{or}
|
||||
|
||||
The \verb\"|"\ operator yields the bitwise (inclusive) OR of its
|
||||
arguments, which must be plain or long integers. The arguments are
|
||||
converted to a common type.
|
||||
\indexii{bit-wise}{or}
|
||||
\indexii{inclusive}{or}
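
The three operations and their relative priorities, as an informal
sketch:

\begin{verbatim}
a = 12                 # binary 1100
b = 10                 # binary 1010
and_result = a & b     # 8   (binary 1000)
xor_result = a ^ b     # 6   (binary 0110)
or_result = a | b      # 14  (binary 1110)
mixed = 1 | 2 ^ 3 & 4  # 3: parsed as 1 | (2 ^ (3 & 4))
\end{verbatim}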
|
||||
|
||||
\section{Comparisons}
|
||||
\index{comparison}
|
||||
|
||||
Contrary to C, all comparison operations in Python have the same
|
||||
priority, which is lower than that of any arithmetic, shifting or
|
||||
bitwise operation. Also contrary to C, expressions like
|
||||
\verb\a < b < c\ have the interpretation that is conventional in
|
||||
mathematics:
|
||||
\index{C}
|
||||
|
||||
\begin{verbatim}
|
||||
comparison: or_expr (comp_operator or_expr)*
|
||||
comp_operator: "<"|">"|"=="|">="|"<="|"<>"|"!="|"is" ["not"]|["not"] "in"
|
||||
\end{verbatim}
|
||||
|
||||
Comparisons yield integer values: 1 for true, 0 for false.
|
||||
|
||||
Comparisons can be chained arbitrarily, e.g. $x < y \leq z$ is
equivalent to $x < y$ \verb\and\ $y \leq z$, except that $y$ is
|
||||
evaluated only once (but in both cases $z$ is not evaluated at all
|
||||
when $x < y$ is found to be false).
|
||||
\indexii{chaining}{comparisons}
|
||||
|
||||
Formally, $e_0 op_1 e_1 op_2 e_2 ...e_{n-1} op_n e_n$ is equivalent to
|
||||
$e_0 op_1 e_1$ \verb\and\ $e_1 op_2 e_2$ \verb\and\ ... \verb\and\
|
||||
$e_{n-1} op_n e_n$, except that each expression is evaluated at most once.
|
||||
|
||||
Note that $e_0 op_1 e_1 op_2 e_2$ does not imply any kind of comparison
|
||||
between $e_0$ and $e_2$, e.g. $x < y > z$ is perfectly legal.
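
That the middle operand of a chain is evaluated only once can be
observed with a function that records its calls. This is only a
sketch; the names \verb\calls\ and \verb\y\ are hypothetical.

\begin{verbatim}
calls = []

def y():
    # Record each evaluation of the middle operand.
    calls.append('y')
    return 5

chained = 1 < y() <= 9                # y() is evaluated once
count_chained = len(calls)            # 1
spelled_out = 1 < y() and y() <= 9    # y() is evaluated twice
count_total = len(calls)              # 3
legal = 1 < 5 > 2                     # true; 1 and 2 are not compared
\end{verbatim}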
|
||||
|
||||
The forms \verb\<>\ and \verb\!=\ are equivalent; for consistency with
|
||||
C, \verb\!=\ is preferred; where \verb\!=\ is mentioned below
|
||||
\verb\<>\ is also implied.
|
||||
|
||||
The operators {\tt "<", ">", "==", ">=", "<="}, and {\tt "!="} compare
|
||||
the values of two objects. The objects needn't have the same type.
|
||||
If both are numbers, they are converted to a common type. Otherwise,
|
||||
objects of different types {\em always} compare unequal, and are
|
||||
ordered consistently but arbitrarily.
|
||||
|
||||
(This unusual definition of comparison was chosen to simplify the
definition of operations like sorting and the \verb\in\ and
\verb\not in\ operators.)
|
||||
|
||||
Comparison of objects of the same type depends on the type:
|
||||
|
||||
\begin{itemize}
|
||||
|
||||
\item
|
||||
Numbers are compared arithmetically.
|
||||
|
||||
\item
|
||||
Strings are compared lexicographically using the numeric equivalents
|
||||
(the result of the built-in function \verb\ord\) of their characters.
|
||||
|
||||
\item
|
||||
Tuples and lists are compared lexicographically using comparison of
|
||||
corresponding items.
|
||||
|
||||
\item
|
||||
Mappings (dictionaries) are compared through lexicographic
|
||||
comparison of their sorted (key, value) lists.%
|
||||
\footnote{This is expensive since it requires sorting the keys first,
|
||||
but about the only sensible definition. An attempt to compare
dictionaries by identity only caused surprises because
|
||||
people expected to be able to test a dictionary for emptiness by
|
||||
comparing it to {\tt \{\}}.}
|
||||
|
||||
\item
|
||||
Most other types compare unequal unless they are the same object;
|
||||
the choice whether one object is considered smaller or larger than
|
||||
another one is made arbitrarily but consistently within one
|
||||
execution of a program.
|
||||
|
||||
\end{itemize}
|
||||
|
||||
The operators \verb\in\ and \verb\not in\ test for sequence
|
||||
membership: if $y$ is a sequence, $x ~\verb\in\~ y$ is true if and
|
||||
only if there exists an index $i$ such that $x = y[i]$.
|
||||
$x ~\verb\not in\~ y$ yields the inverse truth value. The exception
|
||||
\verb\TypeError\ is raised when $y$ is not a sequence, or when $y$ is
|
||||
a string and $x$ is not a string of length one.%
|
||||
\footnote{The latter restriction is sometimes a nuisance.}
|
||||
\opindex{in}
|
||||
\opindex{not in}
|
||||
\indexii{membership}{test}
|
||||
\obindex{sequence}
|
||||
|
||||
The operators \verb\is\ and \verb\is not\ test for object identity:
|
||||
$x ~\verb\is\~ y$ is true if and only if $x$ and $y$ are the same
|
||||
object. $x ~\verb\is not\~ y$ yields the inverse truth value.
|
||||
\opindex{is}
|
||||
\opindex{is not}
|
||||
\indexii{identity}{test}
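
The difference between comparing values and testing identity, and a
membership test, sketched with two lists of equal value:

\begin{verbatim}
a = [1, 2]
b = [1, 2]
same_value = (a == b)      # true: the values are equal
same_object = (a is b)     # false: two distinct list objects
alias = a
aliased = (alias is a)     # true: both names refer to one object
found = (2 in a)           # true, since a[1] == 2
missing = (3 not in a)     # true
\end{verbatim}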
|
||||
|
||||
\section{Boolean operations} \label{Booleans}
|
||||
\indexii{Boolean}{operation}
|
||||
|
||||
Boolean operations have the lowest priority of all Python operations:
|
||||
|
||||
\begin{verbatim}
|
||||
condition: or_test
|
||||
or_test: and_test | or_test "or" and_test
|
||||
and_test: not_test | and_test "and" not_test
|
||||
not_test: comparison | "not" not_test
|
||||
\end{verbatim}
|
||||
|
||||
In the context of Boolean operations, and also when conditions are
|
||||
used by control flow statements, the following values are interpreted
|
||||
as false: \verb\None\, numeric zero of all types, empty sequences
|
||||
(strings, tuples and lists), and empty mappings (dictionaries). All
|
||||
other values are interpreted as true.
|
||||
|
||||
The operator \verb\not\ yields 1 if its argument is false, 0 otherwise.
|
||||
\opindex{not}
|
||||
|
||||
The condition $x ~\verb\and\~ y$ first evaluates $x$; if $x$ is false,
|
||||
its value is returned; otherwise, $y$ is evaluated and the resulting
|
||||
value is returned.
|
||||
\opindex{and}
|
||||
|
||||
The condition $x ~\verb\or\~ y$ first evaluates $x$; if $x$ is true,
|
||||
its value is returned; otherwise, $y$ is evaluated and the resulting
|
||||
value is returned.
|
||||
\opindex{or}
|
||||
|
||||
(Note that \verb\and\ and \verb\or\ do not restrict the value and type
|
||||
they return to 0 and 1, but rather return the last evaluated argument.
|
||||
This is sometimes useful, e.g. if \verb\s\ is a string that should be
|
||||
replaced by a default value if it is empty, the expression
|
||||
\verb\s or 'foo'\ yields the desired value. Because \verb\not\ has to
|
||||
invent a value anyway, it does not bother to return a value of the
|
||||
same type as its argument, so e.g. \verb\not 'foo'\ yields \verb\0\,
|
||||
not \verb\''\.)
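
The values returned by \verb\and\ and \verb\or\ can be sketched as
follows (the names are arbitrary):

\begin{verbatim}
s = ''
t = 'bar'
name = s or 'foo'          # 'foo': s is empty, hence false
name2 = t or 'foo'         # 'bar': the first true argument is returned
both = t and 'baz'         # 'baz': t is true, so the second is returned
neither = s and 'baz'      # '': s is false and is returned as is
\end{verbatim}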
|
||||
|
||||
\section{Expression lists and condition lists}
|
||||
\indexii{expression}{list}
|
||||
\indexii{condition}{list}
|
||||
|
||||
\begin{verbatim}
|
||||
expr_list: or_expr ("," or_expr)* [","]
|
||||
condition_list: condition ("," condition)* [","]
|
||||
\end{verbatim}
|
||||
|
||||
The only difference between expression lists and condition lists is
|
||||
the lowest priority of operators that can be used in them without
|
||||
being enclosed in parentheses; condition lists allow all operators,
|
||||
while expression lists don't allow comparisons and Boolean operators
|
||||
(they do allow bitwise and shift operators though).
|
||||
|
||||
Expression lists are used in expression statements and assignments;
|
||||
condition lists are used everywhere else where a list of
|
||||
comma-separated values is required.
|
||||
|
||||
An expression (condition) list containing at least one comma yields a
|
||||
tuple. The length of the tuple is the number of expressions
|
||||
(conditions) in the list. The expressions (conditions) are evaluated
|
||||
from left to right. (Condition lists are used syntactically in a few
|
||||
places where no tuple is constructed but a list of values is needed
|
||||
nevertheless.)
|
||||
\obindex{tuple}
|
||||
|
||||
The trailing comma is required only to create a one-item tuple (a.k.a. a
|
||||
{\em singleton}); it is optional in all other cases. A single
|
||||
expression (condition) without a trailing comma doesn't create a
|
||||
tuple, but rather yields the value of that expression (condition).
|
||||
\indexii{trailing}{comma}
|
||||
|
||||
(To create an empty tuple, use an empty pair of parentheses:
|
||||
\verb\()\.)
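
The role of the trailing comma, as an informal sketch:

\begin{verbatim}
single = 1,                # one-item tuple: the comma is required
pair = 1, 2                # a tuple of length two
pair2 = 1, 2,              # the trailing comma is optional here
plain = (1)                # no comma: just the value 1
empty = ()                 # the empty tuple
\end{verbatim}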
|
|
@ -0,0 +1,81 @@
|
|||
\chapter{Introduction}
|
||||
|
||||
This reference manual describes the Python programming language.
|
||||
It is not intended as a tutorial.
|
||||
|
||||
While I am trying to be as precise as possible, I chose to use English
|
||||
rather than formal specifications for everything except syntax and
|
||||
lexical analysis. This should make the document more understandable
|
||||
to the average reader, but will leave room for ambiguities.
|
||||
Consequently, if you were coming from Mars and tried to re-implement
|
||||
Python from this document alone, you might have to guess things and in
|
||||
fact you would probably end up implementing quite a different language.
|
||||
On the other hand, if you are using
|
||||
Python and wonder what the precise rules about a particular area of
|
||||
the language are, you should definitely be able to find them here.
|
||||
|
||||
It is dangerous to add too many implementation details to a language
|
||||
reference document --- the implementation may change, and other
|
||||
implementations of the same language may work differently. On the
|
||||
other hand, there is currently only one Python implementation, and
|
||||
its particular quirks are sometimes worth mentioning, especially
|
||||
where the implementation imposes additional limitations. Therefore,
|
||||
you'll find short ``implementation notes'' sprinkled throughout the
|
||||
text.
|
||||
|
||||
Every Python implementation comes with a number of built-in and
|
||||
standard modules. These are not documented here, but in the separate
|
||||
{\em Python Library Reference} document. A few built-in modules are
|
||||
mentioned when they interact in a significant way with the language
|
||||
definition.
|
||||
|
||||
\section{Notation}
|
||||
|
||||
The descriptions of lexical analysis and syntax use a modified BNF
|
||||
grammar notation. This uses the following style of definition:
|
||||
\index{BNF}
|
||||
\index{grammar}
|
||||
\index{syntax}
|
||||
\index{notation}
|
||||
|
||||
\begin{verbatim}
|
||||
name: lc_letter (lc_letter | "_")*
|
||||
lc_letter: "a"..."z"
|
||||
\end{verbatim}
|
||||
|
||||
The first line says that a \verb\name\ is an \verb\lc_letter\ followed by
|
||||
a sequence of zero or more \verb\lc_letter\s and underscores. An
|
||||
\verb\lc_letter\ in turn is any of the single characters `a' through `z'.
|
||||
(This rule is actually adhered to for the names defined in lexical and
|
||||
grammar rules in this document.)
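
For readers more familiar with regular expressions, the two rules
above can be sketched with the standard \verb\re\ module. The names
\verb\NAME\ and \verb\is_name\ are hypothetical, and the pattern is my
transcription of the rules, not something the notation itself defines.

\begin{verbatim}
import re

# name:      lc_letter (lc_letter | "_")*
# lc_letter: "a"..."z"
NAME = re.compile(r'[a-z][a-z_]*\Z')

def is_name(s):
    # True if the whole string s matches the 'name' rule.
    return NAME.match(s) is not None
\end{verbatim}

With this sketch, \verb\is_name('lc_letter')\ is true, while a string
that starts with an underscore or contains a digit is rejected.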
|
||||
|
||||
Each rule begins with a name (which is the name defined by the rule)
|
||||
and a colon. A vertical bar (\verb\|\) is used to separate
|
||||
alternatives; it is the least binding operator in this notation. A
|
||||
star (\verb\*\) means zero or more repetitions of the preceding item;
|
||||
likewise, a plus (\verb\+\) means one or more repetitions, and a
|
||||
phrase enclosed in square brackets (\verb\[ ]\) means zero or one
|
||||
occurrences (in other words, the enclosed phrase is optional). The
|
||||
\verb\*\ and \verb\+\ operators bind as tightly as possible;
|
||||
parentheses are used for grouping. Literal strings are enclosed in
|
||||
double quotes. White space is only meaningful to separate tokens.
|
||||
Rules are normally contained on a single line; rules with many
|
||||
alternatives may be formatted alternatively with each line after the
|
||||
first beginning with a vertical bar.
|
||||
|
||||
In lexical definitions (such as the example above), two more conventions
|
||||
are used: Two literal characters separated by three dots mean a choice
|
||||
of any single character in the given (inclusive) range of ASCII
|
||||
characters. A phrase between angle brackets (\verb\<...>\) gives an
|
||||
informal description of the symbol defined; e.g. this could be used
|
||||
to describe the notion of `control character' if needed.
|
||||
\index{lexical definitions}
|
||||
\index{ASCII}
|
||||
|
||||
Even though the notation used is almost the same, there is a big
|
||||
difference between the meaning of lexical and syntactic definitions:
|
||||
a lexical definition operates on the individual characters of the
|
||||
input source, while a syntax definition operates on the stream of
|
||||
tokens generated by the lexical analysis. All uses of BNF in the next
|
||||
chapter (``Lexical Analysis'') are lexical definitions; uses in
|
||||
subsequent chapters are syntactic definitions.
|
|
@ -0,0 +1,349 @@
|
|||
\chapter{Lexical analysis}
|
||||
|
||||
A Python program is read by a {\em parser}. Input to the parser is a
|
||||
stream of {\em tokens}, generated by the {\em lexical analyzer}. This
|
||||
chapter describes how the lexical analyzer breaks a file into tokens.
|
||||
\index{lexical analysis}
|
||||
\index{parser}
|
||||
\index{token}
|
||||
|
||||
\section{Line structure}
|
||||
|
||||
A Python program is divided into a number of logical lines. The end of
|
||||
a logical line is represented by the token NEWLINE. Statements cannot
|
||||
cross logical line boundaries except where NEWLINE is allowed by the
|
||||
syntax (e.g. between statements in compound statements).
|
||||
\index{line structure}
|
||||
\index{logical line}
|
||||
\index{NEWLINE token}
|
||||
|
||||
\subsection{Comments}
|
||||
|
||||
A comment starts with a hash character (\verb\#\) that is not part of
|
||||
a string literal, and ends at the end of the physical line. A comment
|
||||
always signifies the end of the logical line. Comments are ignored by
|
||||
the syntax.
|
||||
\index{comment}
|
||||
\index{logical line}
|
||||
\index{physical line}
|
||||
\index{hash character}
|
||||
|
||||
\subsection{Line joining}
|
||||
|
||||
Two or more physical lines may be joined into logical lines using
|
||||
backslash characters (\verb/\/), as follows: when a physical line ends
|
||||
in a backslash that is not part of a string literal or comment, it is
|
||||
joined with the following line, forming a single logical line, deleting the
|
||||
backslash and the following end-of-line character. For example:
|
||||
\index{physical line}
|
||||
\index{line joining}
|
||||
\index{backslash character}
|
||||
%
|
||||
\begin{verbatim}
|
||||
month_names = ['Januari', 'Februari', 'Maart',     \
               'April',   'Mei',      'Juni',      \
               'Juli',    'Augustus', 'September', \
               'Oktober', 'November', 'December']
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Blank lines}
|
||||
|
||||
A logical line that contains only spaces, tabs, and possibly a
|
||||
comment, is ignored (i.e., no NEWLINE token is generated), except that
|
||||
during interactive input of statements, an entirely blank logical line
|
||||
terminates a multi-line statement.
|
||||
\index{blank line}
|
||||
|
||||
\subsection{Indentation}
|
||||
|
||||
Leading whitespace (spaces and tabs) at the beginning of a logical
|
||||
line is used to compute the indentation level of the line, which in
|
||||
turn is used to determine the grouping of statements.
|
||||
\index{indentation}
|
||||
\index{whitespace}
|
||||
\index{leading whitespace}
|
||||
\index{space}
|
||||
\index{tab}
|
||||
\index{grouping}
|
||||
\index{statement grouping}
|
||||
|
||||
First, tabs are replaced (from left to right) by one to eight spaces
|
||||
such that the total number of characters up to there is a multiple of
|
||||
eight (this is intended to be the same rule as used by {\UNIX}). The
|
||||
total number of spaces preceding the first non-blank character then
|
||||
determines the line's indentation. Indentation cannot be split over
|
||||
multiple physical lines using backslashes.
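
The tab rule can be sketched as a small function. The name
\verb\indentation\ is hypothetical; the sketch only illustrates the
rule stated above and is not the lexical analyzer's actual code.

\begin{verbatim}
def indentation(line):
    # Column reached by the leading whitespace of 'line', with
    # tab stops at every multiple of eight.
    column = 0
    for ch in line:
        if ch == ' ':
            column = column + 1
        elif ch == '\t':
            column = column + 8 - column % 8
        else:
            break
    return column
\end{verbatim}

For instance, a line starting with two spaces and a tab has the same
indentation (eight) as a line starting with a single tab.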
|
||||
|
||||
The indentation levels of consecutive lines are used to generate
|
||||
INDENT and DEDENT tokens, using a stack, as follows.
|
||||
\index{INDENT token}
|
||||
\index{DEDENT token}
|
||||
|
||||
Before the first line of the file is read, a single zero is pushed on
|
||||
the stack; this will never be popped off again. The numbers pushed on
|
||||
the stack will always be strictly increasing from bottom to top. At
|
||||
the beginning of each logical line, the line's indentation level is
|
||||
compared to the top of the stack. If it is equal, nothing happens.
|
||||
If it is larger, it is pushed on the stack, and one INDENT token is
|
||||
generated. If it is smaller, it {\em must} be one of the numbers
|
||||
occurring on the stack; all numbers on the stack that are larger are
|
||||
popped off, and for each number popped off a DEDENT token is
|
||||
generated. At the end of the file, a DEDENT token is generated for
|
||||
each number remaining on the stack that is larger than zero.
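
The stack discipline just described can be sketched as follows. The
function \verb\indent_tokens\ is hypothetical: it takes the indentation
levels of the logical lines of a file and returns only the INDENT and
DEDENT tokens, and the choice of \verb\ValueError\ for an inconsistent
dedent is my assumption, not something this document prescribes.

\begin{verbatim}
def indent_tokens(levels):
    # levels: the indentation level of each logical line, in order.
    stack = [0]                  # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append('INDENT')
        elif level < stack[-1]:
            if level not in stack:
                raise ValueError('inconsistent dedent')
            while stack[-1] > level:
                stack.pop()
                tokens.append('DEDENT')
    # End of file: one DEDENT for each remaining number above zero.
    while stack[-1] > 0:
        stack.pop()
        tokens.append('DEDENT')
    return tokens
\end{verbatim}

Under these assumptions, the levels 0, 4, 4, 8, 0 produce two INDENT
tokens followed by two DEDENT tokens, while the levels 0, 4, 2 are
rejected because 2 does not occur on the stack.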
|
||||
|
||||
Here is an example of a correctly (though confusingly) indented piece
|
||||
of Python code:
|
||||
|
||||
\begin{verbatim}
|
||||
def perm(l):
        # Compute the list of all permutations of l

    if len(l) <= 1:
                  return [l]
    r = []
    for i in range(len(l)):
             s = l[:i] + l[i+1:]
             p = perm(s)
             for x in p:
              r.append(l[i:i+1] + x)
    return r
|
||||
\end{verbatim}
|
||||
|
||||
The following example shows various indentation errors:
|
||||
|
||||
\begin{verbatim}
|
||||
    def perm(l):                        # error: first line indented
    for i in range(len(l)):             # error: not indented
        s = l[:i] + l[i+1:]
            p = perm(l[:i] + l[i+1:])   # error: unexpected indent
            for x in p:
                    r.append(l[i:i+1] + x)
                return r                # error: inconsistent dedent
|
||||
\end{verbatim}
|
||||
|
||||
(Actually, the first three errors are detected by the parser; only the
|
||||
last error is found by the lexical analyzer --- the indentation of
|
||||
\verb\return r\ does not match a level popped off the stack.)
|
||||
|
||||
\section{Other tokens}
|
||||
|
||||
Besides NEWLINE, INDENT and DEDENT, the following categories of tokens
|
||||
exist: identifiers, keywords, literals, operators, and delimiters.
|
||||
Spaces and tabs are not tokens, but serve to delimit tokens. Where
|
||||
ambiguity exists, a token comprises the longest possible string that
|
||||
forms a legal token, when read from left to right.
|
||||
|
||||
\section{Identifiers}
|
||||
|
||||
Identifiers (also referred to as names) are described by the following
|
||||
lexical definitions:
|
||||
\index{identifier}
|
||||
\index{name}
|
||||
|
||||
\begin{verbatim}
|
||||
identifier: (letter|"_") (letter|digit|"_")*
|
||||
letter: lowercase | uppercase
|
||||
lowercase: "a"..."z"
|
||||
uppercase: "A"..."Z"
|
||||
digit: "0"..."9"
|
||||
\end{verbatim}
|
||||
|
||||
Identifiers are unlimited in length. Case is significant.
|
||||
|
||||
\subsection{Keywords}
|
||||
|
||||
The following identifiers are used as reserved words, or {\em
|
||||
keywords} of the language, and cannot be used as ordinary
|
||||
identifiers. They must be spelled exactly as written here:
|
||||
\index{keyword}
|
||||
\index{reserved word}
|
||||
|
||||
\begin{verbatim}
|
||||
and del for in print
|
||||
break elif from is raise
|
||||
class else global not return
|
||||
continue except if or try
|
||||
def finally import pass while
|
||||
\end{verbatim}
|
||||
|
||||
% # This Python program sorts and formats the above table
|
||||
% import string
|
||||
% l = []
|
||||
% try:
|
||||
% while 1:
|
||||
% l = l + string.split(raw_input())
|
||||
% except EOFError:
|
||||
% pass
|
||||
% l.sort()
|
||||
% for i in range((len(l)+4)/5):
|
||||
% for j in range(i, len(l), 5):
|
||||
% print string.ljust(l[j], 10),
|
||||
% print
|
||||
|
||||
\section{Literals} \label{literals}
|
||||
|
||||
Literals are notations for constant values of some built-in types.
|
||||
\index{literal}
|
||||
\index{constant}
|
||||
|
||||
\subsection{String literals}
|
||||
|
||||
String literals are described by the following lexical definitions:
|
||||
\index{string literal}
|
||||
|
||||
\begin{verbatim}
|
||||
stringliteral: "'" stringitem* "'"
|
||||
stringitem: stringchar | escapeseq
|
||||
stringchar: <any ASCII character except newline or "\" or "'">
|
||||
escapeseq: "\" <any ASCII character except newline>
|
||||
\end{verbatim}
|
||||
\index{ASCII}
|
||||
|
||||
String literals cannot span physical line boundaries. Escape
|
||||
sequences in strings are actually interpreted according to rules
|
||||
similar to those used by Standard C. The recognized escape sequences
|
||||
are:
|
||||
\index{physical line}
|
||||
\index{escape sequence}
|
||||
\index{Standard C}
|
||||
\index{C}
|
||||
|
||||
\begin{center}
|
||||
\begin{tabular}{|l|l|}
|
||||
\hline
|
||||
\verb/\\/ & Backslash (\verb/\/) \\
|
||||
\verb/\'/ & Single quote (\verb/'/) \\
|
||||
\verb/\a/ & ASCII Bell (BEL) \\
|
||||
\verb/\b/ & ASCII Backspace (BS) \\
|
||||
%\verb/\E/ & ASCII Escape (ESC) \\
|
||||
\verb/\f/ & ASCII Formfeed (FF) \\
|
||||
\verb/\n/ & ASCII Linefeed (LF) \\
|
||||
\verb/\r/ & ASCII Carriage Return (CR) \\
|
||||
\verb/\t/ & ASCII Horizontal Tab (TAB) \\
|
||||
\verb/\v/ & ASCII Vertical Tab (VT) \\
|
||||
\verb/\/{\em ooo} & ASCII character with octal value {\em ooo} \\
|
||||
\verb/\x/{\em xx...} & ASCII character with hex value {\em xx...} \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
\index{ASCII}
|
||||
|
||||
In strict compatibility with Standard C, up to three octal digits are
|
||||
accepted, but an unlimited number of hex digits is taken to be part of
|
||||
the hex escape (and then the lower 8 bits of the resulting hex number
|
||||
are used in all current implementations...).
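As an illustration of the table above, the sketch below prints the character code that each recognized escape sequence stands for. It assumes a modern Python 3 interpreter, where hex escapes take exactly two digits rather than an unlimited number; the dictionary names are hypothetical.

```python
# Keys are raw strings showing the source spelling; values are the
# one-character strings that the escape sequences denote.
examples = {
    r"\\": "\\",
    r"\'": "\'",
    r"\a": "\a",
    r"\b": "\b",
    r"\f": "\f",
    r"\n": "\n",
    r"\r": "\r",
    r"\t": "\t",
    r"\v": "\v",
    r"\101": "\101",   # octal 101 is decimal 65, the letter A
    r"\x41": "\x41",   # hex 41 is decimal 65 as well
}

codes = {source: ord(char) for source, char in examples.items()}
for source, code in codes.items():
    print(f"{source:6} -> {code}")
```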
|
||||
|
||||
All unrecognized escape sequences are left in the string unchanged,
|
||||
i.e., {\em the backslash is left in the string.} (This behavior is
|
||||
useful when debugging: if an escape sequence is mistyped, the
|
||||
resulting output is more easily recognized as broken. It also helps a
|
||||
great deal for string literals used as regular expressions or
|
||||
otherwise passed to other modules that do their own escape handling.)
|
||||
\index{unrecognized escape sequence}
|
||||
|
||||
\subsection{Numeric literals}
|
||||
|
||||
There are three types of numeric literals: plain integers, long
|
||||
integers, and floating point numbers.
|
||||
\index{number}
|
||||
\index{numeric literal}
|
||||
\index{integer literal}
|
||||
\index{plain integer literal}
|
||||
\index{long integer literal}
|
||||
\index{floating point literal}
|
||||
\index{hexadecimal literal}
|
||||
\index{octal literal}
|
||||
\index{decimal literal}
|
||||
|
||||
Integer and long integer literals are described by the following
|
||||
lexical definitions:
|
||||
|
||||
\begin{verbatim}
|
||||
longinteger: integer ("l"|"L")
|
||||
integer: decimalinteger | octinteger | hexinteger
|
||||
decimalinteger: nonzerodigit digit* | "0"
|
||||
octinteger: "0" octdigit+
|
||||
hexinteger: "0" ("x"|"X") hexdigit+
|
||||
|
||||
nonzerodigit: "1"..."9"
|
||||
octdigit: "0"..."7"
|
||||
hexdigit: digit|"a"..."f"|"A"..."F"
|
||||
\end{verbatim}
|
||||
|
||||
Although both lower case `l' and upper case `L' are allowed as a suffix
|
||||
for long integers, it is strongly recommended to always use `L', since
|
||||
the letter `l' looks too much like the digit `1'.
|
||||
|
||||
Plain integer decimal literals must be at most $2^{31} - 1$ (i.e., the
|
||||
largest positive integer, assuming 32-bit arithmetic). Plain octal and
|
||||
hexadecimal literals may be as large as $2^{32} - 1$, but values
|
||||
larger than $2^{31} - 1$ are converted to a negative value by
|
||||
subtracting $2^{32}$. There is no limit for long integer literals.
|
||||
|
||||
Some examples of plain and long integer literals:
|
||||
|
||||
\begin{verbatim}
|
||||
7 2147483647 0177 0x80000000
|
||||
3L 79228162514264337593543950336L 0377L 0x100000000L
|
||||
\end{verbatim}
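The conversion of large plain octal and hexadecimal literals to negative values is plain arithmetic. The sketch below simulates the rule, assuming 32-bit plain integers and a modern Python 3 interpreter whose integers are unbounded; the function name \verb\plain_value\ is hypothetical.

```python
MAX_PLAIN = 2**31 - 1

def plain_value(magnitude):
    """Value of a plain octal or hex literal whose digits denote magnitude."""
    if not 0 <= magnitude <= 2**32 - 1:
        raise OverflowError("literal too large for a plain integer")
    if magnitude > MAX_PLAIN:
        # Values above 2**31 - 1 wrap around by subtracting 2**32.
        return magnitude - 2**32
    return magnitude

print(plain_value(0x7FFFFFFF))   # 2147483647, the largest plain integer
print(plain_value(0x80000000))   # -2147483648
print(plain_value(0xFFFFFFFF))   # -1
```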
|
||||
|
||||
Floating point literals are described by the following lexical
|
||||
definitions:
|
||||
|
||||
\begin{verbatim}
|
||||
floatnumber: pointfloat | exponentfloat
|
||||
pointfloat: [intpart] fraction | intpart "."
|
||||
exponentfloat: (intpart | pointfloat) exponent
|
||||
intpart: digit+
|
||||
fraction: "." digit+
|
||||
exponent: ("e"|"E") ["+"|"-"] digit+
|
||||
\end{verbatim}
|
||||
|
||||
The allowed range of floating point literals is
|
||||
implementation-dependent.
|
||||
|
||||
Some examples of floating point literals:
|
||||
|
||||
\begin{verbatim}
|
||||
3.14 10. .001 1e100 3.14e-10
|
||||
\end{verbatim}
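The floating point grammar can be checked mechanically against the examples. The following sketch builds a regular expression from the lexical definitions, one pattern per rule; it assumes a modern Python 3 interpreter, and all names in it are hypothetical.

```python
import re

INTPART = r"\d+"                       # intpart: digit+
FRACTION = r"\.\d+"                    # fraction: "." digit+
EXPONENT = r"[eE][+-]?\d+"             # exponent: ("e"|"E") ["+"|"-"] digit+
POINTFLOAT = rf"(?:(?:{INTPART})?{FRACTION}|{INTPART}\.)"
EXPONENTFLOAT = rf"(?:(?:{POINTFLOAT}|{INTPART}){EXPONENT})"
FLOATNUMBER = re.compile(rf"{EXPONENTFLOAT}|{POINTFLOAT}")

def is_float_literal(text):
    """True if the whole of text matches the floatnumber grammar."""
    return FLOATNUMBER.fullmatch(text) is not None

for literal in ["3.14", "10.", ".001", "1e100", "3.14e-10"]:
    print(literal, is_float_literal(literal))
```

A bare \verb\7\ is rejected by this pattern because it is an integer literal, and \verb\-1.5\ is rejected because the sign is not part of the literal.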
|
||||
|
||||
Note that numeric literals do not include a sign; a phrase like
|
||||
\verb\-1\ is actually an expression composed of the operator
|
||||
\verb\-\ and the literal \verb\1\.
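That the sign is an operator rather than part of the literal can be seen in a parse tree. This is a sketch assuming a modern Python 3 interpreter (3.8 or later) and its standard \verb\ast\ module, which is not part of the language version described here.

```python
import ast

# Parse the phrase "-1" as an expression and look at the top node.
node = ast.parse("-1", mode="eval").body

print(type(node).__name__)        # the expression is a unary operation
print(type(node.op).__name__)     # whose operator is the unary minus
print(node.operand.value)         # applied to the literal 1
```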
|
||||
|
||||
\section{Operators}
|
||||
|
||||
The following tokens are operators:
|
||||
\index{operators}
|
||||
|
||||
\begin{verbatim}
|
||||
+ - * / %
|
||||
<< >> & | ^ ~
|
||||
< == > <= <> != >=
|
||||
\end{verbatim}
|
||||
|
||||
The comparison operators \verb\<>\ and \verb\!=\ are alternate
|
||||
spellings of the same operator.
|
||||
|
||||
\section{Delimiters}
|
||||
|
||||
The following tokens serve as delimiters or otherwise have a special
|
||||
meaning:
|
||||
\index{delimiters}
|
||||
|
||||
\begin{verbatim}
|
||||
( ) [ ] { }
|
||||
; , : . ` =
|
||||
\end{verbatim}
|
||||
|
||||
The following printing ASCII characters are not used in Python. Their
|
||||
occurrence outside string literals and comments is an unconditional
|
||||
error:
|
||||
\index{ASCII}
|
||||
|
||||
\begin{verbatim}
|
||||
@ $ " ?
|
||||
\end{verbatim}
|
||||
|
||||
They may be used by future versions of the language though!
|
|
@ -0,0 +1,705 @@
|
|||
\chapter{Data model}
|
||||
|
||||
\section{Objects, values and types}
|
||||
|
||||
{\em Objects} are Python's abstraction for data. All data in a Python
|
||||
program is represented by objects or by relations between objects.
|
||||
(In a sense, and in conformance with Von Neumann's model of a
|
||||
``stored program computer'', code is also represented by objects.)
|
||||
\index{object}
|
||||
\index{data}
|
||||
|
||||
Every object has an identity, a type and a value. An object's {\em
|
||||
identity} never changes once it has been created; you may think of it
|
||||
as the object's address in memory. An object's {\em type} is also
|
||||
unchangeable. It determines the operations that an object supports
|
||||
(e.g. ``does it have a length?'') and also defines the possible
|
||||
values for objects of that type. The {\em value} of some objects can
|
||||
change. Objects whose value can change are said to be {\em mutable};
|
||||
objects whose value is unchangeable once they are created are called
|
||||
{\em immutable}. The type determines an object's (im)mutability.
|
||||
\index{identity of an object}
|
||||
\index{value of an object}
|
||||
\index{type of an object}
|
||||
\index{mutable object}
|
||||
\index{immutable object}
|
||||
|
||||
Objects are never explicitly destroyed; however, when they become
|
||||
unreachable they may be garbage-collected. An implementation is
|
||||
allowed to delay garbage collection or omit it altogether --- it is a
|
||||
matter of implementation quality how garbage collection is
|
||||
implemented, as long as no objects are collected that are still
|
||||
reachable. (Implementation note: the current implementation uses a
|
||||
reference-counting scheme which collects most objects as soon as they
|
||||
become unreachable, but never collects garbage containing circular
|
||||
references.)
|
||||
\index{garbage collection}
|
||||
\index{reference counting}
|
||||
\index{unreachable object}
|
||||
|
||||
Note that the use of the implementation's tracing or debugging
|
||||
facilities may keep objects alive that would normally be collectable.
|
||||
|
||||
Some objects contain references to ``external'' resources such as open
|
||||
files or windows. It is understood that these resources are freed
|
||||
when the object is garbage-collected, but since garbage collection is
|
||||
not guaranteed to happen, such objects also provide an explicit way to
|
||||
release the external resource, usually a \verb\close\ method.
|
||||
Programs are strongly recommended to always explicitly close such
|
||||
objects.
|
||||
|
||||
Some objects contain references to other objects; these are called
|
||||
{\em containers}. Examples of containers are tuples, lists and
|
||||
dictionaries. The references are part of a container's value. In
|
||||
most cases, when we talk about the value of a container, we imply the
|
||||
values, not the identities of the contained objects; however, when we
|
||||
talk about the (im)mutability of a container, only the identities of
|
||||
the immediately contained objects are implied. (So, if an immutable
|
||||
container contains a reference to a mutable object, its value changes
|
||||
if that mutable object is changed.)
|
||||
\index{container}
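The distinction between the value and the mutability of a container can be illustrated as follows. This is a small sketch, assuming a modern Python 3 interpreter; the variable names are hypothetical.

```python
inner = []                       # a mutable object
outer = (inner, "fixed")         # an immutable container referring to it

identities_before = [id(item) for item in outer]
inner.append(42)                 # change the contained mutable object
identities_after = [id(item) for item in outer]

# The tuple still refers to the very same objects, yet its value,
# as seen by comparison or printing, has changed.
print(identities_before == identities_after)
print(outer)
```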
|
||||
|
||||
Types affect almost all aspects of objects' lives. Even the meaning
|
||||
of object identity is affected in some sense: for immutable types,
|
||||
operations that compute new values may actually return a reference to
|
||||
any existing object with the same type and value, while for mutable
|
||||
objects this is not allowed. E.g. after
|
||||
|
||||
\begin{verbatim}
|
||||
a = 1; b = 1; c = []; d = []
|
||||
\end{verbatim}
|
||||
|
||||
\verb\a\ and \verb\b\ may or may not refer to the same object with the
|
||||
value one, depending on the implementation, but \verb\c\ and \verb\d\
|
||||
are guaranteed to refer to two different, unique, newly created empty
|
||||
lists.
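The guarantee for mutable objects can be checked directly; the outcome for the integers is deliberately not asserted, since it depends on the implementation. A minimal sketch, assuming a modern Python 3 interpreter:

```python
a = 1
b = 1
c = []
d = []

same_value = (a == b)            # always true
lists_equal = (c == d)           # the two lists have equal values
lists_distinct = (c is not d)    # but they are guaranteed distinct objects

print(same_value, lists_equal, lists_distinct)
print(a is b)                    # implementation-dependent result
```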
|
||||
|
||||
\section{The standard type hierarchy} \label{types}
|
||||
|
||||
Below is a list of the types that are built into Python. Extension
|
||||
modules written in C can define additional types. Future versions of
|
||||
Python may add types to the type hierarchy (e.g. rational or complex
|
||||
numbers, efficiently stored arrays of integers, etc.).
|
||||
\index{type}
|
||||
\indexii{data}{type}
|
||||
\indexii{type}{hierarchy}
|
||||
\indexii{extension}{module}
|
||||
\index{C}
|
||||
|
||||
Some of the type descriptions below contain a paragraph listing
|
||||
`special attributes'. These are attributes that provide access to the
|
||||
implementation and are not intended for general use. Their definition
|
||||
may change in the future. There are also some `generic' special
|
||||
attributes, not listed with the individual objects: \verb\__methods__\
|
||||
is a list of the method names of a built-in object, if it has any;
|
||||
\verb\__members__\ is a list of the data attribute names of a built-in
|
||||
object, if it has any.
|
||||
\index{attribute}
|
||||
\indexii{special}{attribute}
|
||||
\indexiii{generic}{special}{attribute}
|
||||
\ttindex{__methods__}
|
||||
\ttindex{__members__}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[None]
|
||||
This type has a single value. There is a single object with this value.
|
||||
This object is accessed through the built-in name \verb\None\.
|
||||
It is returned from functions that don't explicitly return an object.
|
||||
\ttindex{None}
|
||||
\obindex{None@{\tt None}}
|
||||
|
||||
\item[Numbers]
|
||||
These are created by numeric literals and returned as results by
|
||||
arithmetic operators and arithmetic built-in functions. Numeric
|
||||
objects are immutable; once created their value never changes. Python
|
||||
numbers are of course strongly related to mathematical numbers, but
|
||||
subject to the limitations of numerical representation in computers.
|
||||
\obindex{number}
|
||||
\obindex{numeric}
|
||||
|
||||
Python distinguishes between integers and floating point numbers:
|
||||
|
||||
\begin{description}
|
||||
\item[Integers]
|
||||
These represent elements from the mathematical set of whole numbers.
|
||||
\obindex{integer}
|
||||
|
||||
There are two types of integers:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Plain integers]
|
||||
These represent numbers in the range $-2^{31}$ through $2^{31}-1$.
|
||||
(The range may be larger on machines with a larger natural word
|
||||
size, but not smaller.)
|
||||
When the result of an operation falls outside this range, the
|
||||
exception \verb\OverflowError\ is raised.
|
||||
For the purpose of shift and mask operations, integers are assumed to
|
||||
have a binary, 2's complement notation using 32 or more bits, and
|
||||
hiding no bits from the user (i.e., all $2^{32}$ different bit
|
||||
patterns correspond to different values).
|
||||
\obindex{plain integer}
|
||||
|
||||
\item[Long integers]
|
||||
These represent numbers in an unlimited range, subject to available
|
||||
(virtual) memory only. For the purpose of shift and mask operations,
|
||||
a binary representation is assumed, and negative numbers are
|
||||
represented in a variant of 2's complement which gives the illusion of
|
||||
an infinite string of sign bits extending to the left.
|
||||
\obindex{long integer}
|
||||
|
||||
\end{description} % Integers
|
||||
|
||||
The rules for integer representation are intended to give the most
|
||||
meaningful interpretation of shift and mask operations involving
|
||||
negative integers and the least surprises when switching between the
|
||||
plain and long integer domains. For any operation except left shift,
|
||||
if it yields a result in the plain integer domain without causing
|
||||
overflow, it will yield the same result in the long integer domain or
|
||||
when using mixed operands.
|
||||
\indexii{integer}{representation}
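The ``infinite string of sign bits'' can be made concrete with a few shift and mask operations. This sketch assumes a modern Python 3 interpreter, where the single integer type behaves like the long integers described above; the variable names are hypothetical.

```python
minus_one = -1

# Shifting right never runs out of sign bits: the result stays -1.
shifted = minus_one >> 100

# Masking picks out the low bits of the 2's complement pattern.
low_byte = minus_one & 0xFF

# A right shift of a negative number rounds towards minus infinity.
sixteenth = -256 >> 4

print(shifted, low_byte, sixteenth)
```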
|
||||
|
||||
\item[Floating point numbers]
|
||||
These represent machine-level double precision floating point numbers.
|
||||
You are at the mercy of the underlying machine architecture and
|
||||
C implementation for the accepted range and handling of overflow.
|
||||
\obindex{floating point}
|
||||
\indexii{floating point}{number}
|
||||
\index{C}
|
||||
|
||||
\end{description} % Numbers
|
||||
|
||||
\item[Sequences]
|
||||
These represent finite ordered sets indexed by natural numbers.
|
||||
The built-in function \verb\len()\ returns the number of elements
|
||||
of a sequence. When this number is $n$, the index set contains
|
||||
the numbers $0, 1, \ldots, n-1$. Element \verb\i\ of sequence
|
||||
\verb\a\ is selected by \verb\a[i]\.
|
||||
\obindex{sequence}
|
||||
\bifuncindex{len}
|
||||
\index{index operation}
|
||||
\index{item selection}
|
||||
\index{subscription}
|
||||
|
||||
Sequences also support slicing: \verb\a[i:j]\ selects all elements
|
||||
with index $k$ such that $i \le k < j$. When used as an expression,
|
||||
a slice is a sequence of the same type --- this implies that the
|
||||
index set is renumbered so that it starts at 0 again.
|
||||
\index{slicing}
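The slicing rule and the renumbering of the index set can be sketched as follows, assuming a modern Python 3 interpreter; the variable names are hypothetical.

```python
a = ["p", "y", "t", "h", "o", "n"]
i, j = 2, 5

s = a[i:j]

# The slice holds exactly the elements with index k such that i <= k < j.
manual = [a[k] for k in range(len(a)) if i <= k < j]

print(s)
print(s == manual)
print(s[0] == a[i])     # the slice is indexed from 0 again
print(type(s) is type(a))
```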
|
||||
|
||||
Sequences are distinguished according to their mutability:
|
||||
|
||||
\begin{description}
|
||||
%
|
||||
\item[Immutable sequences]
|
||||
An object of an immutable sequence type cannot change once it is
|
||||
created. (If the object contains references to other objects,
|
||||
these other objects may be mutable and may be changed; however
|
||||
the collection of objects directly referenced by an immutable object
|
||||
cannot change.)
|
||||
\obindex{immutable sequence}
|
||||
\obindex{immutable}
|
||||
|
||||
The following types are immutable sequences:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Strings]
|
||||
The elements of a string are characters. There is no separate
|
||||
character type; a character is represented by a string of one element.
|
||||
Characters represent (at least) 8-bit bytes. The built-in
|
||||
functions \verb\chr()\ and \verb\ord()\ convert between characters
|
||||
and nonnegative integers representing the byte values.
|
||||
Bytes with the values 0-127 represent the corresponding ASCII values.
|
||||
The string data type is also used to represent arrays of bytes, e.g.
|
||||
to hold data read from a file.
|
||||
\obindex{string}
|
||||
\index{character}
|
||||
\index{byte}
|
||||
\index{ASCII}
|
||||
\bifuncindex{chr}
|
||||
\bifuncindex{ord}
|
||||
|
||||
(On systems whose native character set is not ASCII, strings may use
|
||||
EBCDIC in their internal representation, provided the functions
|
||||
\verb\chr()\ and \verb\ord()\ implement a mapping between ASCII and
|
||||
EBCDIC, and string comparison preserves the ASCII order.
|
||||
Or perhaps someone can propose a better rule?)
|
||||
\index{ASCII}
|
||||
\index{EBCDIC}
|
||||
\index{character set}
|
||||
\indexii{string}{comparison}
|
||||
\bifuncindex{chr}
|
||||
\bifuncindex{ord}
|
||||
|
||||
\item[Tuples]
|
||||
The elements of a tuple are arbitrary Python objects.
|
||||
Tuples of two or more elements are formed by comma-separated lists
|
||||
of expressions. A tuple of one element (a `singleton') can be formed
|
||||
by affixing a comma to an expression (an expression by itself does
|
||||
not create a tuple, since parentheses must be usable for grouping of
|
||||
expressions). An empty tuple can be formed by enclosing `nothing' in
|
||||
parentheses.
|
||||
\obindex{tuple}
|
||||
\indexii{singleton}{tuple}
|
||||
\indexii{empty}{tuple}
|
||||
|
||||
\end{description} % Immutable sequences
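The rules for forming tuples given above can be tried out as follows. This is a minimal sketch, assuming a modern Python 3 interpreter; the variable names are hypothetical.

```python
pair = 1, 2          # two or more elements: a comma-separated list
singleton = 1,       # one element: a trailing comma
grouped = (1)        # parentheses alone only group; this is an integer
empty = ()           # `nothing' in parentheses

for value in (pair, singleton, grouped, empty):
    print(type(value).__name__, value)
```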
|
||||
|
||||
\item[Mutable sequences]
|
||||
Mutable sequences can be changed after they are created. The
|
||||
subscription and slicing notations can be used as the target of
|
||||
assignment and \verb\del\ (delete) statements.
|
||||
\obindex{mutable sequence}
|
||||
\obindex{mutable}
|
||||
\indexii{assignment}{statement}
|
||||
\index{delete}
|
||||
\stindex{del}
|
||||
\index{subscription}
|
||||
\index{slicing}
|
||||
|
||||
There is currently a single mutable sequence type:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Lists]
|
||||
The elements of a list are arbitrary Python objects. Lists are formed
|
||||
by placing a comma-separated list of expressions in square brackets.
|
||||
(Note that there are no special cases needed to form lists of length 0
|
||||
or 1.)
|
||||
\obindex{list}
|
||||
|
||||
\end{description} % Mutable sequences
|
||||
|
||||
\end{description} % Sequences
|
||||
|
||||
\item[Mapping types]
|
||||
These represent finite sets of objects indexed by arbitrary index sets.
|
||||
The subscript notation \verb\a[k]\ selects the element indexed
|
||||
by \verb\k\ from the mapping \verb\a\; this can be used in
|
||||
expressions and as the target of assignments or \verb\del\ statements.
|
||||
The built-in function \verb\len()\ returns the number of elements
|
||||
in a mapping.
|
||||
\bifuncindex{len}
|
||||
\index{subscription}
|
||||
\obindex{mapping}
|
||||
|
||||
There is currently a single mapping type:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Dictionaries]
|
||||
These represent finite sets of objects indexed by strings.
|
||||
Dictionaries are mutable; they are created by the \verb\{...}\
|
||||
notation (see section \ref{dict}). (Implementation note: the strings
|
||||
used for indexing must not contain null bytes.)
|
||||
\obindex{dictionary}
|
||||
\obindex{mutable}
|
||||
|
||||
\end{description} % Mapping types
|
||||
|
||||
\item[Callable types]
|
||||
These are the types to which the function call (invocation) operation,
|
||||
written as \verb\function(argument, argument, ...)\, can be applied:
|
||||
\indexii{function}{call}
|
||||
\index{invocation}
|
||||
\indexii{function}{argument}
|
||||
\obindex{callable}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[User-defined functions]
|
||||
A user-defined function object is created by a function definition
|
||||
(see section \ref{function}). It should be called with an argument
|
||||
list containing the same number of items as the function's formal
|
||||
parameter list.
|
||||
\indexii{user-defined}{function}
|
||||
\obindex{function}
|
||||
\obindex{user-defined function}
|
||||
|
||||
Special read-only attributes: \verb\func_code\ is the code object
|
||||
representing the compiled function body, and \verb\func_globals\ is (a
|
||||
reference to) the dictionary that holds the function's global
|
||||
variables --- it implements the global name space of the module in
|
||||
which the function was defined.
|
||||
\ttindex{func_code}
|
||||
\ttindex{func_globals}
|
||||
\indexii{global}{name space}
|
||||
|
||||
\item[User-defined methods]
|
||||
A user-defined method (a.k.a. {\em object closure}) is a pair of a
|
||||
class instance object and a user-defined function. It should be
|
||||
called with an argument list containing one item less than the number
|
||||
of items in the function's formal parameter list. When called, the
|
||||
class instance becomes the first argument, and the call arguments are
|
||||
shifted one to the right.
|
||||
\obindex{method}
|
||||
\obindex{user-defined method}
|
||||
\indexii{user-defined}{method}
|
||||
\index{object closure}
|
||||
|
||||
Special read-only attributes: \verb\im_self\ is the class instance
|
||||
object, \verb\im_func\ is the function object.
|
||||
\ttindex{im_func}
|
||||
\ttindex{im_self}
|
||||
|
||||
\item[Built-in functions]
|
||||
A built-in function object is a wrapper around a C function. Examples
|
||||
of built-in functions are \verb\len\ and \verb\math.sin\. There
|
||||
are no special attributes. The number and type of the arguments are
|
||||
determined by the C function.
|
||||
\obindex{built-in function}
|
||||
\obindex{function}
|
||||
\index{C}
|
||||
|
||||
\item[Built-in methods]
|
||||
This is really a different disguise of a built-in function, this time
|
||||
containing an object passed to the C function as an implicit extra
|
||||
argument. An example of a built-in method is \verb\list.append\ if
|
||||
\verb\list\ is a list object.
|
||||
\obindex{built-in method}
|
||||
\obindex{method}
|
||||
\indexii{built-in}{method}
|
||||
|
||||
\item[Classes]
|
||||
Class objects are described below. When a class object is called as a
|
||||
parameterless function, a new class instance (also described below) is
|
||||
created and returned. The class's initialization function is not
|
||||
called --- this is the responsibility of the caller. It is illegal to
|
||||
call a class object with one or more arguments.
|
||||
\obindex{class}
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{class object}{call}
|
||||
|
||||
\end{description}
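The pairing of an instance with a function in a user-defined method, as described above, can be sketched as follows. The example assumes a modern Python 3 interpreter, in which the attributes called \verb\im_self\ and \verb\im_func\ in this document are spelled \verb\__self__\ and \verb\__func__\; the class \verb\Counter\ is hypothetical.

```python
class Counter:
    def add(self, amount):
        # The function has two formal parameters: self and amount.
        self.total = getattr(self, "total", 0) + amount
        return self.total

counter = Counter()
method = counter.add             # a method: (instance, function) pair

# Calling the method with one argument ...
result_bound = method(5)

# ... is the same as calling the function with the instance put first.
result_plain = method.__func__(method.__self__, 5)

print(result_bound, result_plain)
print(method.__self__ is counter)
```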
|
||||
|
||||
\item[Modules]
|
||||
Modules are imported by the \verb\import\ statement (see section
|
||||
\ref{import}). A module object is a container for a module's name
|
||||
space, which is a dictionary (the same dictionary as referenced by the
|
||||
\verb\func_globals\ attribute of functions defined in the module).
|
||||
Module attribute references are translated to lookups in this
|
||||
dictionary. A module object does not contain the code object used to
|
||||
initialize the module (since it isn't needed once the initialization
|
||||
is done).
|
||||
\stindex{import}
|
||||
\obindex{module}
|
||||
|
||||
Attribute assignments update the module's name space dictionary.
|
||||
|
||||
Special read-only attributes: \verb\__dict__\ yields the module's name
|
||||
space as a dictionary object; \verb\__name__\ yields the module's name
|
||||
as a string object.
|
||||
\ttindex{__dict__}
|
||||
\ttindex{__name__}
|
||||
\indexii{module}{name space}
|
||||
|
||||
\item[Classes]
|
||||
Class objects are created by class definitions (see section
|
||||
\ref{class}). A class is a container for a dictionary containing the
|
||||
class's name space. Class attribute references are translated to
|
||||
lookups in this dictionary. When an attribute name is not found
|
||||
there, the attribute search continues in the base classes. The search
|
||||
is depth-first, left-to-right in the order of their occurrence in the
|
||||
base class list.
|
||||
\obindex{class}
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{class object}{call}
|
||||
\index{container}
|
||||
\index{dictionary}
|
||||
\indexii{class}{attribute}
|
||||
|
||||
Class attribute assignments update the class's dictionary, never the
|
||||
dictionary of a base class.
|
||||
\indexiii{class}{attribute}{assignment}
|
||||
|
||||
A class can be called as a parameterless function to yield a class
|
||||
instance (see above).
|
||||
\indexii{class object}{call}
|
||||
|
||||
Special read-only attributes: \verb\__dict__\ yields the dictionary
|
||||
containing the class's name space; \verb\__bases__\ yields a tuple
|
||||
(possibly empty or a singleton) containing the base classes, in the
|
||||
order of their occurrence in the base class list.
|
||||
\ttindex{__dict__}
|
||||
\ttindex{__bases__}
|
||||
|
||||
\item[Class instances]
|
||||
A class instance is created by calling a class object as a
|
||||
parameterless function. A class instance has a dictionary in which
|
||||
attribute references are searched. When an attribute is not found
|
||||
there, and the instance's class has an attribute by that name, and
|
||||
that class attribute is a user-defined function (and in no other
|
||||
cases), the instance attribute reference yields a user-defined method
|
||||
object (see above) constructed from the instance and the function.
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{class}{instance}
|
||||
\indexii{class instance}{attribute}
|
||||
|
||||
Attribute assignments update the instance's dictionary.
|
||||
\indexiii{class instance}{attribute}{assignment}
|
||||
|
||||
Class instances can pretend to be numbers, sequences, or mappings if
|
||||
they have methods with certain special names. These are described in
|
||||
section \ref{specialnames}.
|
||||
\obindex{number}
|
||||
\obindex{sequence}
|
||||
\obindex{mapping}
|
||||
|
||||
Special read-only attributes: \verb\__dict__\ yields the attribute
|
||||
dictionary; \verb\__class__\ yields the instance's class.
|
||||
\ttindex{__dict__}
|
||||
\ttindex{__class__}
|
||||
|
||||
\item[Files]
|
||||
A file object represents an open file. (It is a wrapper around a C
|
||||
{\tt stdio} file pointer.) File objects are created by the
|
||||
\verb\open()\ built-in function, and also by \verb\posix.popen()\ and
|
||||
the \verb\makefile\ method of socket objects. \verb\sys.stdin\,
|
||||
\verb\sys.stdout\ and \verb\sys.stderr\ are file objects corresponding
|
||||
to the interpreter's standard input, output and error streams.
|
||||
See the Python Library Reference for methods of file objects and other
|
||||
details.
|
||||
\obindex{file}
|
||||
\index{C}
|
||||
\index{stdio}
|
||||
\bifuncindex{open}
|
||||
\bifuncindex{popen}
|
||||
\bifuncindex{makefile}
|
||||
\ttindex{stdin}
|
||||
\ttindex{stdout}
|
||||
\ttindex{stderr}
|
||||
\ttindex{sys.stdin}
|
||||
\ttindex{sys.stdout}
|
||||
\ttindex{sys.stderr}
|
||||
|
||||
\item[Internal types]
|
||||
A few types used internally by the interpreter are exposed to the user.
|
||||
Their definition may change with future versions of the interpreter,
|
||||
but they are mentioned here for completeness.
|
||||
\index{internal type}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[Code objects]
|
||||
Code objects represent executable code. The difference between a code
|
||||
object and a function object is that the function object contains an
|
||||
explicit reference to the function's context (the module in which it
|
||||
was defined), while a code object contains no context. There is no way
|
||||
to execute a bare code object.
|
||||
\obindex{code}
|
||||
|
||||
Special read-only attributes: \verb\co_code\ is a string representing
|
||||
the sequence of instructions; \verb\co_consts\ is a list of literals
|
||||
used by the code; \verb\co_names\ is a list of names (strings) used by
|
||||
the code; \verb\co_filename\ is the filename from which the code was
|
||||
compiled. (To find out the line numbers, you would have to decode the
|
||||
instructions; the standard library module \verb\dis\ contains an
|
||||
example of how to do this.)
|
||||
\ttindex{co_code}
|
||||
\ttindex{co_consts}
|
||||
\ttindex{co_names}
|
||||
\ttindex{co_filename}
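The attributes of a code object can be inspected without executing it. The sketch below assumes a modern Python 3 interpreter and uses the built-in function \verb\compile()\ to obtain a code object; in such an interpreter \verb\co_code\ is a byte string and the other two sequences are tuples rather than lists. The file name \verb\<example>\ is a placeholder.

```python
# Compile an expression; nothing is executed here.
code = compile("x + 'suffix'", "<example>", "eval")

print(code.co_names)                  # names used by the code
print("suffix" in code.co_consts)     # literals used by the code
print(code.co_filename)               # where the code was compiled from
print(type(code.co_code).__name__)    # the instruction sequence
```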
|
||||
|
||||
\item[Frame objects]
|
||||
Frame objects represent execution frames. They may occur in traceback
|
||||
objects (see below).
|
||||
\obindex{frame}
|
||||
|
||||
Special read-only attributes: \verb\f_back\ points to the previous
|
||||
stack frame (towards the caller), or \verb\None\ if this is the bottom
|
||||
stack frame; \verb\f_code\ is the code object being executed in this
|
||||
frame; \verb\f_globals\ is the dictionary used to look up global
|
||||
variables; \verb\f_locals\ is used for local variables;
|
||||
\verb\f_lineno\ gives the line number and \verb\f_lasti\ gives the
|
||||
precise instruction (this is an index into the instruction string of
|
||||
the code object).
|
||||
\ttindex{f_back}
|
||||
\ttindex{f_code}
|
||||
\ttindex{f_globals}
|
||||
\ttindex{f_locals}
|
||||
\ttindex{f_lineno}
|
||||
\ttindex{f_lasti}
|
||||
|
||||
\item[Traceback objects]
|
||||
Traceback objects represent a stack trace of an exception. A
|
||||
traceback object is created when an exception occurs. When the search
|
||||
for an exception handler unwinds the execution stack, at each unwound
|
||||
level a traceback object is inserted in front of the current
|
||||
traceback. When an exception handler is entered, the stack trace is
|
||||
made available to the program as \verb\sys.exc_traceback\. When the
|
||||
program contains no suitable handler, the stack trace is written
|
||||
(nicely formatted) to the standard error stream; if the interpreter is
|
||||
interactive, it is also made available to the user as
|
||||
\verb\sys.last_traceback\.
|
||||
\obindex{traceback}
|
||||
\indexii{stack}{trace}
|
||||
\indexii{exception}{handler}
|
||||
\indexii{execution}{stack}
|
||||
\ttindex{exc_traceback}
|
||||
\ttindex{last_traceback}
|
||||
\ttindex{sys.exc_traceback}
|
||||
\ttindex{sys.last_traceback}
|
||||
|
||||
Special read-only attributes: \verb\tb_next\ is the next level in the
|
||||
stack trace (towards the frame where the exception occurred), or
|
||||
\verb\None\ if there is no next level; \verb\tb_frame\ points to the
|
||||
execution frame of the current level; \verb\tb_lineno\ gives the line
|
||||
number where the exception occurred; \verb\tb_lasti\ indicates the
|
||||
precise instruction. The line number and last instruction in the
|
||||
traceback may differ from the line number of its frame object if the
|
||||
exception occurred in a \verb\try\ statement with no matching
|
||||
\verb\except\ clause or with a \verb\finally\ clause.
|
||||
\ttindex{tb_next}
|
||||
\ttindex{tb_frame}
|
||||
\ttindex{tb_lineno}
|
||||
\ttindex{tb_lasti}
|
||||
\stindex{try}
|
||||
|
||||
\end{description} % Internal types
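The chain of traceback objects described above can be followed through \verb\tb_next\ from the handler's level towards the frame where the exception occurred. This is a sketch assuming a modern Python 3 interpreter, where the current traceback is obtained from \verb\sys.exc_info()\ rather than \verb\sys.exc_traceback\; the functions \verb\inner\ and \verb\outer\ are hypothetical.

```python
import sys

def inner():
    raise ValueError("boom")

def outer():
    inner()

try:
    outer()
except ValueError:
    tb = sys.exc_info()[2]       # traceback of the exception being handled

levels = []
while tb is not None:
    # Each level records its frame and the line number in that frame.
    levels.append((tb.tb_frame.f_code.co_name, tb.tb_lineno))
    tb = tb.tb_next              # one step towards the raise

names = [name for name, lineno in levels]
print(names)
```

The last entry corresponds to the frame of \verb\inner\, where the exception was raised; the first corresponds to the frame containing the \verb\try\ statement.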
|
||||
|
||||
\end{description} % Types
|
||||
|
||||
|
||||
\section{Special method names} \label{specialnames}
|
||||
|
||||
A class can implement certain operations that are invoked by special
|
||||
syntax (such as subscription or arithmetic operations) by defining
|
||||
methods with special names. For instance, if a class defines a
|
||||
method named \verb\__getitem__\, and \verb\x\ is an instance of this
|
||||
class, then \verb\x[i]\ is equivalent to \verb\x.__getitem__(i)\.
|
||||
(The reverse is not true --- if \verb\x\ is a list object,
|
||||
\verb\x.__getitem__(i)\ is not equivalent to \verb\x[i]\.)
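The equivalence for class instances can be sketched as follows, assuming a modern Python 3 interpreter; the class \verb\Squares\ is hypothetical.

```python
class Squares:
    def __getitem__(self, i):
        # Invoked for the subscription x[i].
        return i * i

x = Squares()

by_syntax = x[4]
by_method = x.__getitem__(4)
print(by_syntax, by_method)
```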
|
||||
|
||||
Except for \verb\__repr__\ and \verb\__cmp__\, attempts to execute an
|
||||
operation raise an exception when no appropriate method is defined.
|
||||
For \verb\__repr__\ and \verb\__cmp__\, the traditional
|
||||
interpretations are used in this case.
|
||||
|
||||
|
||||
\subsection{Special methods for any type}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __repr__(self)]
|
||||
Called by the \verb\print\ statement and conversions (reverse quotes) to
|
||||
compute the string representation of an object.
|
||||
|
||||
\item[\tt __cmp__(self, other)]
|
||||
Called by all comparison operations. Should return -1 if
|
||||
\verb\self < other\, 0 if \verb\self == other\, +1 if
|
||||
\verb\self > other\. (Implementation note: due to limitations in the
|
||||
interpreter, exceptions raised by comparisons are ignored, and the
|
||||
objects will be considered equal in this case.)
|
||||
|
||||
\end{description}
|
||||
|
||||
|
||||
\subsection{Special methods for sequence and mapping types}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __len__(self)]
|
||||
Called to implement the built-in function \verb\len()\. Should return
|
||||
the length of the object, an integer \verb\>=\ 0. Also, an object
|
||||
whose \verb\__len__()\ method returns 0 is considered to be false in a
|
||||
Boolean context.
|
||||
|
||||
\item[\tt __getitem__(self, key)]
|
||||
Called to implement evaluation of \verb\self[key]\. Note that the
|
||||
special interpretation of negative keys (if the class wishes to
|
||||
emulate a sequence type) is up to the \verb\__getitem__\ method.
|
||||
|
||||
\item[\tt __setitem__(self, key, value)]
|
||||
Called to implement assignment to \verb\self[key]\. Same note as for
|
||||
\verb\__getitem__\.
|
||||
|
||||
\item[\tt __delitem__(self, key)]
|
||||
Called to implement deletion of \verb\self[key]\. Same note as for
|
||||
\verb\__getitem__\.
|
||||
|
||||
\end{description}
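
The four methods above can be sketched together as follows. This is a
minimal illustration, assuming a modern Python 3 interpreter rather than
the version this manual describes; the class \verb\Shelf\ and its
attribute \verb\items\ are hypothetical names chosen for the example.

```python
class Shelf:
    """A small sequence-like container (hypothetical example class)."""

    def __init__(self):
        self.items = []

    def __len__(self):
        # Must return an integer >= 0; a length of 0 makes the object false.
        return len(self.items)

    def _index(self, key):
        # Negative keys are not adjusted by the interpreter; do it here.
        if key < 0:
            key = key + len(self.items)
        return key

    def __getitem__(self, key):
        return self.items[self._index(key)]

    def __setitem__(self, key, value):
        self.items[self._index(key)] = value

    def __delitem__(self, key):
        del self.items[self._index(key)]


s = Shelf()
s.items = [10, 20, 30]
last = s[-1]     # calls s.__getitem__(-1)
s[0] = 11        # calls s.__setitem__(0, 11)
del s[1]         # calls s.__delitem__(1)
```

Because \verb\__len__\ also drives truth testing, an empty
\verb\Shelf\ instance counts as false in a Boolean context.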
|
||||
|
||||
|
||||
\subsection{Special methods for sequence types}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __getslice__(self, i, j)]
|
||||
Called to implement evaluation of \verb\self[i:j]\. Note that missing
|
||||
\verb\i\ or \verb\j\ are replaced by 0 or \verb\len(self)\,
|
||||
respectively, and \verb\len(self)\ has been added (once) to originally
|
||||
negative \verb\i\ or \verb\j\ by the time this function is called
|
||||
(unlike for \verb\__getitem__\).
|
||||
|
||||
\item[\tt __setslice__(self, i, j, sequence)]
|
||||
Called to implement assignment to \verb\self[i:j]\. Same notes as for
|
||||
\verb\__getslice__\.
|
||||
|
||||
\item[\tt __delslice__(self, i, j)]
|
||||
Called to implement deletion of \verb\self[i:j]\. Same notes as for
|
||||
\verb\__getslice__\.
|
||||
|
||||
\end{description}
|
||||
|
||||
|
||||
\subsection{Special methods for numeric types}
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[\tt __add__(self, other)]\itemjoin
|
||||
\item[\tt __sub__(self, other)]\itemjoin
|
||||
\item[\tt __mul__(self, other)]\itemjoin
|
||||
\item[\tt __div__(self, other)]\itemjoin
|
||||
\item[\tt __mod__(self, other)]\itemjoin
|
||||
\item[\tt __divmod__(self, other)]\itemjoin
|
||||
\item[\tt __pow__(self, other)]\itemjoin
|
||||
\item[\tt __lshift__(self, other)]\itemjoin
|
||||
\item[\tt __rshift__(self, other)]\itemjoin
|
||||
\item[\tt __and__(self, other)]\itemjoin
|
||||
\item[\tt __xor__(self, other)]\itemjoin
|
||||
\item[\tt __or__(self, other)]\itembreak
|
||||
Called to implement the binary arithmetic operations (\verb\+\,
|
||||
\verb\-\, \verb\*\, \verb\/\, \verb\%\, \verb\divmod()\, \verb\pow()\,
|
||||
\verb\<<\, \verb\>>\, \verb\&\, \verb\^\, \verb\|\).
|
||||
|
||||
\item[\tt __neg__(self)]\itemjoin
|
||||
\item[\tt __pos__(self)]\itemjoin
|
||||
\item[\tt __abs__(self)]\itemjoin
|
||||
\item[\tt __invert__(self)]\itembreak
|
||||
Called to implement the unary arithmetic operations (\verb\-\, \verb\+\,
|
||||
\verb\abs()\ and \verb\~\).
|
||||
|
||||
\item[\tt __nonzero__(self)]
|
||||
Called to implement boolean testing; should return 0 or 1. An
|
||||
alternative name for this method is \verb\__len__\.
|
||||
|
||||
\item[\tt __coerce__(self, other)]
|
||||
Called to implement ``mixed-mode'' numeric arithmetic. Should either
|
||||
return a tuple containing self and other converted to a common numeric
|
||||
type, or None if no way of conversion is known. When the common type
|
||||
would be the type of other, it is sufficient to return None, since the
|
||||
interpreter will also ask the other object to attempt a coercion (but
|
||||
sometimes, if the implementation of the other type cannot be changed,
|
||||
it is useful to do the conversion to the other type here).
|
||||
|
||||
Note that this method is not called to coerce the arguments to \verb\+\
|
||||
and \verb\*\, because these are also used to implement sequence
|
||||
concatenation and repetition, respectively. Also note that, for the
|
||||
same reason, in \verb\n*x\, where \verb\n\ is a built-in number and
|
||||
\verb\x\ is an instance, a call to \verb\x.__mul__(n)\ is made.%
|
||||
\footnote{The interpreter should really distinguish between
|
||||
user-defined classes implementing sequences, mappings or numbers, but
|
||||
currently it doesn't --- hence this strange exception.}
|
||||
|
||||
\item[\tt __int__(self)]\itemjoin
|
||||
\item[\tt __long__(self)]\itemjoin
|
||||
\item[\tt __float__(self)]\itembreak
|
||||
Called to implement the built-in functions \verb\int()\, \verb\long()\
|
||||
and \verb\float()\. Should return a value of the appropriate type.
|
||||
|
||||
\end{description}
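
A few of the numeric methods can be sketched as follows, assuming a
modern Python 3 interpreter (where \verb\__coerce__\, \verb\__long__\
and \verb\__nonzero__\ no longer exist under these names). The class
\verb\Money\ is a hypothetical example, not part of any library.

```python
class Money:
    """Hypothetical numeric-like class holding a whole number of cents."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Invoked for: self + other
        return Money(self.cents + other.cents)

    def __mul__(self, factor):
        # Invoked for: self * factor
        return Money(self.cents * factor)

    def __neg__(self):
        # Invoked for: -self
        return Money(-self.cents)

    def __int__(self):
        # Invoked for: int(self)
        return self.cents


total = Money(150) + Money(250) * 2
negated = int(-total)
```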
|
|
@ -0,0 +1,147 @@
|
|||
\chapter{Execution model}
|
||||
\index{execution model}
|
||||
|
||||
\section{Code blocks, execution frames, and name spaces} \label{execframes}
|
||||
\index{code block}
|
||||
\indexii{execution}{frame}
|
||||
\index{name space}
|
||||
|
||||
A {\em code block} is a piece of Python program text that can be
|
||||
executed as a unit, such as a module, a class definition or a function
|
||||
body. Some code blocks (like modules) are executed only once, others
|
||||
(like function bodies) may be executed many times. Code blocks may
|
||||
textually contain other code blocks. Code blocks may invoke other
|
||||
code blocks (that may or may not be textually contained in them) as
|
||||
part of their execution, e.g. by invoking (calling) a function.
|
||||
\index{code block}
|
||||
\indexii{code}{block}
|
||||
|
||||
The following are code blocks: A module is a code block. A function
|
||||
body is a code block. A class definition is a code block. Each
|
||||
command typed interactively is a separate code block; a script file is
|
||||
a code block. The string arguments passed to the built-in functions
|
||||
\verb\eval\ and \verb\exec\ are code blocks. And finally, the
|
||||
expression read and evaluated by the built-in function \verb\input\ is
|
||||
a code block.
|
||||
|
||||
A code block is executed in an execution frame. An {\em execution
|
||||
frame} contains some administrative information (used for debugging),
|
||||
determines where and how execution continues after the code block's
|
||||
execution has completed, and (perhaps most importantly) defines two
|
||||
name spaces, the local and the global name space, that affect
|
||||
execution of the code block.
|
||||
\indexii{execution}{frame}
|
||||
|
||||
A {\em name space} is a mapping from names (identifiers) to objects.
|
||||
A particular name space may be referenced by more than one execution
|
||||
frame, and from other places as well. Adding a name to a name space
|
||||
is called {\em binding} a name (to an object); changing the mapping of
|
||||
a name is called {\em rebinding}; removing a name is {\em unbinding}.
|
||||
Name spaces are functionally equivalent to dictionaries.
|
||||
\index{name space}
|
||||
\indexii{binding}{name}
|
||||
\indexii{rebinding}{name}
|
||||
\indexii{unbinding}{name}
|
||||
|
||||
The {\em local name space} of an execution frame determines the default
|
||||
place where names are defined and searched. The {\em global name
|
||||
space} determines the place where names listed in \verb\global\
|
||||
statements are defined and searched, and where names that are not
|
||||
explicitly bound in the current code block are searched.
|
||||
\indexii{local}{name space}
|
||||
\indexii{global}{name space}
|
||||
\stindex{global}
|
||||
|
||||
Whether a name is local or global in a code block is determined by
|
||||
static inspection of the source text for the code block: in the
|
||||
absence of \verb\global\ statements, a name that is bound anywhere in
|
||||
the code block is local in the entire code block; all other names are
|
||||
considered global. The \verb\global\ statement forces global
|
||||
interpretation of selected names throughout the code block. The
|
||||
following constructs bind names: formal parameters, \verb\import\
|
||||
statements, class and function definitions (these bind the class or
|
||||
function name), and targets that are identifiers if occurring in an
|
||||
assignment, \verb\for\ loop header, or \verb\except\ clause header.
|
||||
(A target occurring in a \verb\del\ statement does not bind a name.)
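
The static rule can be illustrated with the sketch below, assuming a
modern Python 3 interpreter. There the failure is reported as
\verb\UnboundLocalError\, a subclass of \verb\NameError\; the function
names are hypothetical.

```python
counter = 0  # a global name

def read_only():
    # 'counter' is never bound in this block, so it is global here.
    return counter

def broken():
    # The assignment below makes 'counter' local in the *entire* block,
    # so the read on the next line refers to an unbound local name.
    value = counter
    counter = value + 1
    return counter

def fixed():
    global counter  # force global interpretation throughout the block
    counter = counter + 1
    return counter

try:
    broken()
    outcome = "no error"
except NameError:  # Python 3 raises UnboundLocalError, a NameError subclass
    outcome = "NameError"
```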
|
||||
|
||||
When a global name is not found in the global name space, it is
|
||||
searched in the list of ``built-in'' names (which is actually the
|
||||
global name space of the module \verb\builtin\). When a name is not
|
||||
found at all, the \verb\NameError\ exception is raised.
|
||||
|
||||
The following table lists the meaning of the local and global name
|
||||
space for various types of code blocks. The name space for a
|
||||
particular module is automatically created when the module is first
|
||||
referenced.
|
||||
|
||||
\begin{center}
|
||||
\begin{tabular}{|l|l|l|l|}
|
||||
\hline
|
||||
Code block type & Global name space & Local name space & Notes \\
|
||||
\hline
|
||||
Module & n.s. for this module & same as global & \\
|
||||
Script & n.s. for \verb\__main__\ & same as global & \\
|
||||
Interactive command & n.s. for \verb\__main__\ & same as global & \\
|
||||
Class definition & global n.s. of containing block & new n.s. & \\
|
||||
Function body & global n.s. of containing block & new n.s. & \\
|
||||
String passed to \verb\exec\ or \verb\eval\
|
||||
& global n.s. of caller & local n.s. of caller & (1) \\
|
||||
File read by \verb\execfile\
|
||||
& global n.s. of caller & local n.s. of caller & (1) \\
|
||||
Expression read by \verb\input\
|
||||
& global n.s. of caller & local n.s. of caller & \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
|
||||
Notes:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[n.s.] means {\em name space}
|
||||
|
||||
\item[(1)] The global and local name space for these functions can be
|
||||
overridden with optional extra arguments.
|
||||
|
||||
\end{description}
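
Note (1) can be sketched as follows for \verb\eval\, assuming a modern
Python 3 interpreter. The dictionary names are hypothetical.

```python
x = 1

# With no extra arguments, eval() uses the caller's name spaces.
default_result = eval("x + 1")

# Explicit dictionaries override the global and local name spaces.
custom_globals = {"x": 100}
custom_locals = {"y": 5}
custom_result = eval("x + y", custom_globals, custom_locals)
```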
|
||||
|
||||
\section{Exceptions}
|
||||
|
||||
Exceptions are a means of breaking out of the normal flow of control
|
||||
of a code block in order to handle errors or other exceptional
|
||||
conditions. An exception is {\em raised} at the point where the error
|
||||
is detected; it may be {\em handled} by the surrounding code block or
|
||||
by any code block that directly or indirectly invoked the code block
|
||||
where the error occurred.
|
||||
\index{exception}
|
||||
\index{raise an exception}
|
||||
\index{handle an exception}
|
||||
\index{exception handler}
|
||||
\index{errors}
|
||||
\index{error handling}
|
||||
|
||||
The Python interpreter raises an exception when it detects a run-time
|
||||
error (such as division by zero). A Python program can also
|
||||
explicitly raise an exception with the \verb\raise\ statement.
|
||||
Exception handlers are specified with the \verb\try...except\
|
||||
statement.
|
||||
|
||||
Python uses the ``termination'' model of error handling: an exception
|
||||
handler can find out what happened and continue execution at an outer
|
||||
level, but it cannot repair the cause of the error and retry the
|
||||
failing operation (except by re-entering the offending piece of
|
||||
code from the top).
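
The termination model can be sketched as follows, assuming a modern
Python 3 interpreter (where exceptions are class instances rather than
the string objects described in this section). The function
\verb\parse_with_retry\ is a hypothetical helper.

```python
def parse_with_retry(candidates):
    """Try each candidate string in turn until one parses as an integer.

    The handler cannot resume the failed int() call; it can only let
    the loop re-enter the offending code from the top.
    """
    for text in candidates:
        try:
            return int(text)
        except ValueError:
            # Execution continues here, at the outer level, not inside int().
            continue
    return None


result = parse_with_retry(["abc", "4x", "42"])
```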
|
||||
|
||||
When an exception is not handled at all, the interpreter terminates
|
||||
execution of the program, or returns to its interactive main loop.
|
||||
|
||||
Exceptions are identified by string objects. Two different string
|
||||
objects with the same value identify different exceptions.
|
||||
|
||||
When an exception is raised, an object (possibly \verb\None\) is passed
|
||||
as the exception's ``parameter''; this object does not affect the
|
||||
selection of an exception handler, but is passed to the selected
|
||||
exception handler as additional information.
|
||||
|
||||
See also the description of the \verb\try\ and \verb\raise\
|
||||
statements.
|
|
@ -0,0 +1,672 @@
|
|||
\chapter{Expressions and conditions}
|
||||
\index{expression}
|
||||
\index{condition}
|
||||
|
||||
{\bf Note:} In this and the following chapters, extended BNF notation
|
||||
will be used to describe syntax, not lexical analysis.
|
||||
\index{BNF}
|
||||
|
||||
This chapter explains the meaning of the elements of expressions and
|
||||
conditions. Conditions are a superset of expressions, and a condition
|
||||
may be used wherever an expression is required by enclosing it in
|
||||
parentheses. The only places where expressions are used in the syntax
|
||||
instead of conditions are in expression statements and on the
|
||||
right-hand side of assignment statements; this catches some nasty bugs
|
||||
like accidentally writing \verb\x == 1\ instead of \verb\x = 1\.
|
||||
\indexii{assignment}{statement}
|
||||
|
||||
The comma plays several roles in Python's syntax. It is usually an
|
||||
operator with a lower precedence than all others, but occasionally
|
||||
serves other purposes as well; e.g. it separates function arguments,
|
||||
is used in list and dictionary constructors, and has special semantics
|
||||
in \verb\print\ statements.
|
||||
\index{comma}
|
||||
|
||||
When (one alternative of) a syntax rule has the form
|
||||
|
||||
\begin{verbatim}
|
||||
name: othername
|
||||
\end{verbatim}
|
||||
|
||||
and no semantics are given, the semantics of this form of \verb\name\
|
||||
are the same as for \verb\othername\.
|
||||
\index{syntax}
|
||||
|
||||
\section{Arithmetic conversions}
|
||||
\indexii{arithmetic}{conversion}
|
||||
|
||||
When a description of an arithmetic operator below uses the phrase
|
||||
``the numeric arguments are converted to a common type'',
|
||||
this means both that if either argument is not a number, a
|
||||
\verb\TypeError\ exception is raised, and that otherwise
|
||||
the following conversions are applied:
|
||||
\exindex{TypeError}
|
||||
\indexii{floating point}{number}
|
||||
\indexii{long}{integer}
|
||||
\indexii{plain}{integer}
|
||||
|
||||
\begin{itemize}
|
||||
\item first, if either argument is a floating point number,
|
||||
the other is converted to floating point;
|
||||
\item else, if either argument is a long integer,
|
||||
the other is converted to long integer;
|
||||
\item otherwise, both must be plain integers and no conversion
|
||||
is necessary.
|
||||
\end{itemize}
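
The conversion rules can be observed as follows. This sketch assumes a
modern Python 3 interpreter, which has a single integer type, so the
long integer case cannot be shown separately.

```python
mixed = 2 + 0.5        # the integer is converted to floating point
plain = 2 + 3          # both plain integers: no conversion needed

try:
    1 - "one"          # a non-numeric argument
    error = None
except TypeError:
    error = "TypeError"
```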
|
||||
|
||||
\section{Atoms}
|
||||
\index{atom}
|
||||
|
||||
Atoms are the most basic elements of expressions. Forms enclosed in
|
||||
reverse quotes or in parentheses, brackets or braces are also
|
||||
categorized syntactically as atoms. The syntax for atoms is:
|
||||
|
||||
\begin{verbatim}
|
||||
atom: identifier | literal | enclosure
|
||||
enclosure: parenth_form | list_display | dict_display | string_conversion
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Identifiers (Names)}
|
||||
\index{name}
|
||||
\index{identifier}
|
||||
|
||||
An identifier occurring as an atom is a reference to a local, global
|
||||
or built-in name binding. If a name can be assigned to anywhere in a
|
||||
code block, and is not mentioned in a \verb\global\ statement in that
|
||||
code block, it refers to a local name throughout that code block.
|
||||
Otherwise, it refers to a global name if one exists, else to a
|
||||
built-in name.
|
||||
\indexii{name}{binding}
|
||||
\index{code block}
|
||||
\stindex{global}
|
||||
\indexii{built-in}{name}
|
||||
\indexii{global}{name}
|
||||
|
||||
When the name is bound to an object, evaluation of the atom yields
|
||||
that object. When a name is not bound, an attempt to evaluate it
|
||||
raises a \verb\NameError\ exception.
|
||||
\exindex{NameError}
|
||||
|
||||
\subsection{Literals}
|
||||
\index{literal}
|
||||
|
||||
Python knows string and numeric literals:
|
||||
|
||||
\begin{verbatim}
|
||||
literal: stringliteral | integer | longinteger | floatnumber
|
||||
\end{verbatim}
|
||||
|
||||
Evaluation of a literal yields an object of the given type (string,
|
||||
integer, long integer, floating point number) with the given value.
|
||||
The value may be approximated in the case of floating point literals.
|
||||
See section \ref{literals} for details.
|
||||
|
||||
All literals correspond to immutable data types, and hence the
|
||||
object's identity is less important than its value. Multiple
|
||||
evaluations of literals with the same value (either the same
|
||||
occurrence in the program text or a different occurrence) may obtain
|
||||
the same object or a different object with the same value.
|
||||
\indexiii{immutable}{data}{type}
|
||||
|
||||
(In the original implementation, all literals in the same code block
|
||||
with the same type and value yield the same object.)
|
||||
|
||||
\subsection{Parenthesized forms}
|
||||
\index{parenthesized form}
|
||||
|
||||
A parenthesized form is an optional condition list enclosed in
|
||||
parentheses:
|
||||
|
||||
\begin{verbatim}
|
||||
parenth_form: "(" [condition_list] ")"
|
||||
\end{verbatim}
|
||||
|
||||
A parenthesized condition list yields whatever that condition list
|
||||
yields.
|
||||
|
||||
An empty pair of parentheses yields an empty tuple object. Since
|
||||
tuples are immutable, the rules for literals apply here.
|
||||
\indexii{empty}{tuple}
|
||||
|
||||
(Note that tuples are not formed by the parentheses, but rather by use
|
||||
of the comma operator. The exception is the empty tuple, for which
|
||||
parentheses {\em are} required --- allowing unparenthesized ``nothing''
|
||||
in expressions would cause ambiguities and allow common typos to
|
||||
pass uncaught.)
|
||||
\index{comma}
|
||||
\indexii{tuple}{display}
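
The role of the comma can be illustrated with a short sketch (variable
names are arbitrary):

```python
pair = 1, 2            # the comma forms the tuple; no parentheses needed
single = 1,            # a trailing comma forms a one-item tuple
grouped = (1)          # parentheses alone only group: this is the integer 1
empty = ()             # the one case where parentheses are required
```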
|
||||
|
||||
\subsection{List displays}
|
||||
\indexii{list}{display}
|
||||
|
||||
A list display is a possibly empty series of conditions enclosed in
|
||||
square brackets:
|
||||
|
||||
\begin{verbatim}
|
||||
list_display: "[" [condition_list] "]"
|
||||
\end{verbatim}
|
||||
|
||||
A list display yields a new list object.
|
||||
\obindex{list}
|
||||
|
||||
If it has no condition list, the list object has no items. Otherwise,
|
||||
the elements of the condition list are evaluated from left to right
|
||||
and inserted in the list object in that order.
|
||||
\indexii{empty}{list}
|
||||
|
||||
\subsection{Dictionary displays} \label{dict}
|
||||
\indexii{dictionary}{display}
|
||||
|
||||
A dictionary display is a possibly empty series of key/datum pairs
|
||||
enclosed in curly braces:
|
||||
\index{key}
|
||||
\index{datum}
|
||||
\index{key/datum pair}
|
||||
|
||||
\begin{verbatim}
|
||||
dict_display: "{" [key_datum_list] "}"
|
||||
key_datum_list: key_datum ("," key_datum)* [","]
|
||||
key_datum: condition ":" condition
|
||||
\end{verbatim}
|
||||
|
||||
A dictionary display yields a new dictionary object.
|
||||
\obindex{dictionary}
|
||||
|
||||
The key/datum pairs are evaluated from left to right to define the
|
||||
entries of the dictionary: each key object is used as a key into the
|
||||
dictionary to store the corresponding datum.
|
||||
|
||||
Keys must be strings, otherwise a \verb\TypeError\ exception is
|
||||
raised. Clashes between duplicate keys are not detected; the last
|
||||
datum (textually rightmost in the display) stored for a given key
|
||||
value prevails.
|
||||
\exindex{TypeError}
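
The handling of duplicate keys can be seen in this small sketch (the
key and datum values are arbitrary):

```python
# The key "spam" occurs twice; no error is reported and the
# textually rightmost datum is the one that is kept.
table = {"spam": 1, "eggs": 2, "spam": 3}
```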
|
||||
|
||||
\subsection{String conversions}
|
||||
\indexii{string}{conversion}
|
||||
|
||||
A string conversion is a condition list enclosed in reverse (or
|
||||
backward) quotes:
|
||||
|
||||
\begin{verbatim}
|
||||
string_conversion: "`" condition_list "`"
|
||||
\end{verbatim}
|
||||
|
||||
A string conversion evaluates the contained condition list and
|
||||
converts the resulting object into a string according to rules
|
||||
specific to its type.
|
||||
|
||||
If the object is a string, a number, \verb\None\, or a tuple, list or
|
||||
dictionary containing only objects whose type is one of these, the
|
||||
resulting string is a valid Python expression which can be passed to
|
||||
the built-in function \verb\eval()\ to yield an object with the
|
||||
same value (or an approximation, if floating point numbers are
|
||||
involved).
|
||||
|
||||
(In particular, converting a string adds quotes around it and converts
|
||||
``funny'' characters to escape sequences that are safe to print.)
|
||||
|
||||
It is illegal to attempt to convert recursive objects (e.g. lists or
|
||||
dictionaries that contain a reference to themselves, directly or
|
||||
indirectly).
|
||||
\obindex{recursive}
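
The round trip through \verb\eval()\ can be sketched as follows. This
assumes a modern Python 3 interpreter, where the reverse-quote syntax
no longer exists and the built-in function \verb\repr()\ performs the
same conversion.

```python
original = ["text with a\ttab", 42, None, (1, 2), {"key": [3.5]}]

text = repr(original)    # the conversion written with reverse quotes above
rebuilt = eval(text)     # a new object with the same value

# Converting a string adds quotes and escapes the tab character.
quoted = repr("a\tb")
```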
|
||||
|
||||
\section{Primaries} \label{primaries}
|
||||
\index{primary}
|
||||
|
||||
Primaries represent the most tightly bound operations of the language.
|
||||
Their syntax is:
|
||||
|
||||
\begin{verbatim}
|
||||
primary: atom | attributeref | subscription | slicing | call
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Attribute references}
|
||||
\indexii{attribute}{reference}
|
||||
|
||||
An attribute reference is a primary followed by a period and a name:
|
||||
|
||||
\begin{verbatim}
|
||||
attributeref: primary "." identifier
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to an object of a type that supports
|
||||
attribute references, e.g. a module or a list. This object is then
|
||||
asked to produce the attribute whose name is the identifier. If this
|
||||
attribute is not available, the exception \verb\AttributeError\ is
|
||||
raised. Otherwise, the type and value of the object produced are
|
||||
determined by the object. Multiple evaluations of the same attribute
|
||||
reference may yield different objects.
|
||||
\obindex{module}
|
||||
\obindex{list}
|
||||
|
||||
\subsection{Subscriptions}
|
||||
\index{subscription}
|
||||
|
||||
A subscription selects an item of a sequence (string, tuple or list)
|
||||
or mapping (dictionary) object:
|
||||
\obindex{sequence}
|
||||
\obindex{mapping}
|
||||
\obindex{string}
|
||||
\obindex{tuple}
|
||||
\obindex{list}
|
||||
\obindex{dictionary}
|
||||
\indexii{sequence}{item}
|
||||
|
||||
\begin{verbatim}
|
||||
subscription: primary "[" condition "]"
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to an object of a sequence or mapping type.
|
||||
|
||||
If it is a mapping, the condition must evaluate to an object whose
|
||||
value is one of the keys of the mapping, and the subscription selects
|
||||
the value in the mapping that corresponds to that key.
|
||||
|
||||
If it is a sequence, the condition must evaluate to a plain integer.
|
||||
If this value is negative, the length of the sequence is added to it
|
||||
(so that, e.g., \verb\x[-1]\ selects the last item of \verb\x\).
|
||||
The resulting value must be a nonnegative integer smaller than the
|
||||
number of items in the sequence, and the subscription selects the item
|
||||
whose index is that value (counting from zero).
|
||||
|
||||
A string's items are characters. A character is not a separate data
|
||||
type but a string of exactly one character.
|
||||
\index{character}
|
||||
\indexii{string}{item}
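
A short sketch of subscription on a string, assuming a modern Python 3
interpreter (which reports an out-of-range index with
\verb\IndexError\):

```python
letters = "abc"
first = letters[0]             # indexes count from zero
last = letters[-1]             # same item as letters[len(letters) - 1]

try:
    letters[3]                 # not smaller than the number of items
    outcome = "ok"
except IndexError:
    outcome = "IndexError"
```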
|
||||
|
||||
\subsection{Slicings}
|
||||
\index{slicing}
|
||||
\index{slice}
|
||||
|
||||
A slicing (or slice) selects a range of items in a sequence (string,
|
||||
tuple or list) object:
|
||||
\obindex{sequence}
|
||||
\obindex{string}
|
||||
\obindex{tuple}
|
||||
\obindex{list}
|
||||
|
||||
\begin{verbatim}
|
||||
slicing: primary "[" [condition] ":" [condition] "]"
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to a sequence object. The lower and upper
|
||||
bound expressions, if present, must evaluate to plain integers;
|
||||
defaults are zero and the sequence's length, respectively. If either
|
||||
bound is negative, the sequence's length is added to it. The slicing
|
||||
now selects all items with index $k$ such that $i \leq k < j$ where $i$
|
||||
and $j$ are the specified lower and upper bounds. This may be an
|
||||
empty sequence. It is not an error if $i$ or $j$ lies outside the
|
||||
range of valid indexes (such items don't exist so they aren't
|
||||
selected).
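
The slicing rules can be illustrated on a list (the values are
arbitrary):

```python
seq = [10, 20, 30, 40]

middle = seq[1:3]        # items with index 1 and 2
tail = seq[-2:]          # negative lower bound: the length is added to it
beyond = seq[2:100]      # an out-of-range upper bound is not an error
nothing = seq[3:1]       # no index k satisfies 3 <= k < 1
```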
|
||||
|
||||
\subsection{Calls} \label{calls}
|
||||
\index{call}
|
||||
|
||||
A call calls a callable object (e.g. a function) with a possibly empty
|
||||
series of arguments:
|
||||
\obindex{callable}
|
||||
|
||||
\begin{verbatim}
|
||||
call: primary "(" [condition_list] ")"
|
||||
\end{verbatim}
|
||||
|
||||
The primary must evaluate to a callable object (user-defined
|
||||
functions, built-in functions, methods of built-in objects, class
|
||||
objects, and methods of class instances are callable). If it is a
|
||||
class, the argument list must be empty; otherwise, the arguments are
|
||||
evaluated.
|
||||
|
||||
A call always returns some value, possibly \verb\None\, unless it
|
||||
raises an exception. How this value is computed depends on the type
|
||||
of the callable object. If it is:
|
||||
|
||||
\begin{description}
|
||||
|
||||
\item[a user-defined function:] the code block for the function is
|
||||
executed, passing it the argument list. The first thing the code
|
||||
block will do is bind the formal parameters to the arguments; this is
|
||||
described in section \ref{function}. When the code block executes a
|
||||
\verb\return\ statement, this specifies the return value of the
|
||||
function call.
|
||||
\indexii{function}{call}
|
||||
\indexiii{user-defined}{function}{call}
|
||||
\obindex{user-defined function}
|
||||
\obindex{function}
|
||||
|
||||
\item[a built-in function or method:] the result is up to the
|
||||
interpreter; see the library reference manual for the descriptions of
|
||||
built-in functions and methods.
|
||||
\indexii{function}{call}
|
||||
\indexii{built-in function}{call}
|
||||
\indexii{method}{call}
|
||||
\indexii{built-in method}{call}
|
||||
\obindex{built-in method}
|
||||
\obindex{built-in function}
|
||||
\obindex{method}
|
||||
\obindex{function}
|
||||
|
||||
\item[a class object:] a new instance of that class is returned.
|
||||
\obindex{class}
|
||||
\indexii{class object}{call}
|
||||
|
||||
\item[a class instance method:] the corresponding user-defined
|
||||
function is called, with an argument list that is one longer than the
|
||||
argument list of the call: the instance becomes the first argument.
|
||||
\obindex{class instance}
|
||||
\obindex{instance}
|
||||
\indexii{instance}{call}
|
||||
\indexii{class instance}{call}
|
||||
|
||||
\end{description}
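
The last two cases can be sketched as follows, assuming a modern
Python 3 interpreter. The class \verb\Greeter\ is a hypothetical
example.

```python
class Greeter:
    def greet(self, name):
        return "hello, " + name


g = Greeter()                              # calling a class returns a new instance
via_method = g.greet("world")              # one argument in the call ...
via_function = Greeter.greet(g, "world")   # ... two arguments to the function
```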
|
||||
|
||||
\section{Unary arithmetic operations}
|
||||
\indexiii{unary}{arithmetic}{operation}
|
||||
\indexiii{unary}{bit-wise}{operation}
|
||||
|
||||
All unary arithmetic (and bit-wise) operations have the same priority:
|
||||
|
||||
\begin{verbatim}
|
||||
u_expr: primary | "-" u_expr | "+" u_expr | "~" u_expr
|
||||
\end{verbatim}
|
||||
|
||||
The unary \verb\"-"\ (minus) operator yields the negation of its
|
||||
numeric argument.
|
||||
\index{negation}
|
||||
\index{minus}
|
||||
|
||||
The unary \verb\"+"\ (plus) operator yields its numeric argument
|
||||
unchanged.
|
||||
\index{plus}
|
||||
|
||||
The unary \verb\"~"\ (invert) operator yields the bit-wise inversion
|
||||
of its plain or long integer argument. The bit-wise inversion of
|
||||
\verb\x\ is defined as \verb\-(x+1)\.
|
||||
\index{inversion}
|
||||
|
||||
In all three cases, if the argument does not have the proper type,
|
||||
a \verb\TypeError\ exception is raised.
|
||||
\exindex{TypeError}
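
The definition of bit-wise inversion and the type check can be verified
with a short sketch:

```python
values = [-3, -1, 0, 1, 7]
inverted = [~x for x in values]
matches_definition = all(~x == -(x + 1) for x in values)

try:
    ~1.5                 # inversion needs an integer argument
    outcome = "ok"
except TypeError:
    outcome = "TypeError"
```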
|
||||
|
||||
\section{Binary arithmetic operations}
|
||||
\indexiii{binary}{arithmetic}{operation}
|
||||
|
||||
The binary arithmetic operations have the conventional priority
|
||||
levels. Note that some of these operations also apply to certain
|
||||
non-numeric types. There is no ``power'' operator, so there are only
|
||||
two levels, one for multiplicative operators and one for additive
|
||||
operators:
|
||||
|
||||
\begin{verbatim}
|
||||
m_expr: u_expr | m_expr "*" u_expr
|
||||
| m_expr "/" u_expr | m_expr "%" u_expr
|
||||
a_expr: m_expr | a_expr "+" m_expr | a_expr "-" m_expr
|
||||
\end{verbatim}
|
||||
|
||||
The \verb\"*"\ (multiplication) operator yields the product of its
|
||||
arguments. The arguments must either both be numbers, or one argument
|
||||
must be a plain integer and the other must be a sequence. In the
|
||||
former case, the numbers are converted to a common type and then
|
||||
multiplied together. In the latter case, sequence repetition is
|
||||
performed; a negative repetition factor yields an empty sequence.
|
||||
\index{multiplication}
|
||||
|
||||
The \verb\"/"\ (division) operator yields the quotient of its
|
||||
arguments. The numeric arguments are first converted to a common
|
||||
type. Plain or long integer division yields an integer of the same
|
||||
type; the result is that of mathematical division with the `floor'
|
||||
function applied to the result. Division by zero raises the
|
||||
\verb\ZeroDivisionError\ exception.
|
||||
\exindex{ZeroDivisionError}
|
||||
\index{division}
|
||||
|
||||
The \verb\"%"\ (modulo) operator yields the remainder from the
|
||||
division of the first argument by the second. The numeric arguments
|
||||
are first converted to a common type. A zero right argument raises
|
||||
the \verb\ZeroDivisionError\ exception. The arguments may be floating
|
||||
point numbers, e.g. \verb\3.14 % 0.7\ equals \verb\0.34\. The modulo
|
||||
operator always yields a result with the same sign as its second
|
||||
operand (or zero); the absolute value of the result is strictly
|
||||
smaller than the second operand.
|
||||
\index{modulo}
|
||||
|
||||
The integer division and modulo operators are connected by the
|
||||
following identity: \verb\x == (x/y)*y + (x%y)\. Integer division and
|
||||
modulo are also connected with the built-in function \verb\divmod()\:
|
||||
\verb\divmod(x, y) == (x/y, x%y)\. These identities don't hold for
|
||||
floating point numbers; there a similar identity holds where
|
||||
\verb\x/y\ is replaced by \verb\floor(x/y)\.
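
The integer identities can be checked as follows. This sketch assumes a
modern Python 3 interpreter, where integer (floor) division is written
\verb\//\ instead of \verb\/\.

```python
pairs = [(7, 3), (-7, 3), (7, -3), (-7, -3)]

# '//' is the Python 3 spelling of the integer division described above.
results = [(x // y, x % y) for x, y in pairs]

identity_holds = all(x == (x // y) * y + (x % y) for x, y in pairs)
divmod_agrees = all(divmod(x, y) == (x // y, x % y) for x, y in pairs)
# A nonzero remainder has the same sign as the second operand.
sign_follows_divisor = all((x % y > 0) == (y > 0) for x, y in pairs)
```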
|
||||
|
||||
The \verb\"+"\ (addition) operator yields the sum of its arguments.
|
||||
The arguments must either both be numbers, or both sequences of the
|
||||
same type. In the former case, the numbers are converted to a common
|
||||
type and then added together. In the latter case, the sequences are
|
||||
concatenated.
|
||||
\index{addition}
|
||||
|
||||
The \verb\"-"\ (subtraction) operator yields the difference of its
|
||||
arguments. The numeric arguments are first converted to a common
|
||||
type.
|
||||
\index{subtraction}
|
||||
|
||||
\section{Shifting operations}
|
||||
\indexii{shifting}{operation}
|
||||
|
||||
The shifting operations have lower priority than the arithmetic
|
||||
operations:
|
||||
|
||||
\begin{verbatim}
|
||||
shift_expr: a_expr | shift_expr ( "<<" | ">>" ) a_expr
|
||||
\end{verbatim}
|
||||
|
||||
These operators accept plain or long integers as arguments. The
|
||||
arguments are converted to a common type. They shift the first
|
||||
argument to the left or right by the number of bits given by the
|
||||
second argument.
|
||||
|
||||
A right shift by $n$ bits is defined as division by $2^n$. A left
|
||||
shift by $n$ bits is defined as multiplication with $2^n$; for plain
|
||||
integers there is no overflow check so this drops bits and flips the
|
||||
sign if the result is not less than $2^{31}$ in absolute value.
|
||||
|
||||
Negative shift counts raise a \verb\ValueError\ exception.
|
||||
\exindex{ValueError}
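
The shift definitions can be illustrated as follows. This is a sketch under the assumption of a current Python 3 interpreter, whose integers are unbounded (like this manual's long integers), so the overflow behaviour of plain integers is not reproduced; division here means floor division, written \verb\//\.

```python
# Assumption: Python 3 integers (unbounded, no 32-bit overflow).
n = 3
values = (40, 41, -41)
right = [x >> n for x in values]          # shift right by n bits
floor_div = [x // 2**n for x in values]   # floor division by 2**n
left = 5 << n                             # same as 5 * 2**n

# A negative shift count raises ValueError.
try:
    1 << -1
    negative_ok = True
except ValueError:
    negative_ok = False
```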
|
||||
|
||||
\section{Binary bit-wise operations}
|
||||
\indexiii{binary}{bit-wise}{operation}
|
||||
|
||||
Each of the three bitwise operations has a different priority level:
|
||||
|
||||
\begin{verbatim}
|
||||
and_expr: shift_expr | and_expr "&" shift_expr
|
||||
xor_expr: and_expr | xor_expr "^" and_expr
|
||||
or_expr: xor_expr | or_expr "|" xor_expr
|
||||
\end{verbatim}
|
||||
|
||||
The \verb\"&"\ operator yields the bitwise AND of its arguments, which
|
||||
must be plain or long integers. The arguments are converted to a
|
||||
common type.
|
||||
\indexii{bit-wise}{and}
|
||||
|
||||
The \verb\"^"\ operator yields the bitwise XOR (exclusive OR) of its
|
||||
arguments, which must be plain or long integers. The arguments are
|
||||
converted to a common type.
|
||||
\indexii{bit-wise}{xor}
|
||||
\indexii{exclusive}{or}
|
||||
|
||||
The \verb\"|"\ operator yields the bitwise (inclusive) OR of its
|
||||
arguments, which must be plain or long integers. The arguments are
|
||||
converted to a common type.
|
||||
\indexii{bit-wise}{or}
|
||||
\indexii{inclusive}{or}
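
The three operators and their relative priority can be sketched as below, assuming a current Python 3 interpreter (the binary literals are only a notational convenience).

```python
a, b = 0b1100, 0b1010
and_bits = a & b    # bits set in both
xor_bits = a ^ b    # bits set in exactly one
or_bits = a | b     # bits set in at least one

# "&" binds tighter than "^", which binds tighter than "|".
unparenthesized = 1 | 2 ^ 3 & 4
explicit = 1 | (2 ^ (3 & 4))
```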
|
||||
|
||||
\section{Comparisons}
|
||||
\index{comparison}
|
||||
|
||||
Contrary to C, all comparison operations in Python have the same
|
||||
priority, which is lower than that of any arithmetic, shifting or
|
||||
bitwise operation. Also contrary to C, expressions like
|
||||
\verb\a < b < c\ have the interpretation that is conventional in
|
||||
mathematics:
|
||||
\index{C}
|
||||
|
||||
\begin{verbatim}
|
||||
comparison: or_expr (comp_operator or_expr)*
|
||||
comp_operator: "<"|">"|"=="|">="|"<="|"<>"|"!="|"is" ["not"]|["not"] "in"
|
||||
\end{verbatim}
|
||||
|
||||
Comparisons yield integer values: 1 for true, 0 for false.
|
||||
|
||||
Comparisons can be chained arbitrarily, e.g. $x < y <= z$ is
|
||||
equivalent to $x < y$ \verb\and\ $y <= z$, except that $y$ is
|
||||
evaluated only once (but in both cases $z$ is not evaluated at all
|
||||
when $x < y$ is found to be false).
|
||||
\indexii{chaining}{comparisons}
|
||||
|
||||
Formally, $e_0 op_1 e_1 op_2 e_2 ...e_{n-1} op_n e_n$ is equivalent to
|
||||
$e_0 op_1 e_1$ \verb\and\ $e_1 op_2 e_2$ \verb\and\ ... \verb\and\
|
||||
$e_{n-1} op_n e_n$, except that each expression is evaluated at most once.
|
||||
|
||||
Note that $e_0 op_1 e_1 op_2 e_2$ does not imply any kind of comparison
|
||||
between $e_0$ and $e_2$, e.g. $x < y > z$ is perfectly legal.
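
The evaluation rules for chained comparisons can be observed with a small sketch. The helper \verb\traced\ is hypothetical and only records which operands get evaluated; in a current Python 3 interpreter the results are \verb\True\ and \verb\False\, which compare equal to 1 and 0.

```python
calls = []

def traced(name, value):
    """Record that `name` was evaluated, then return `value`."""
    calls.append(name)
    return value

# x < y <= z: y is evaluated once; z is evaluated because x < y holds.
first = traced("x", 1) < traced("y", 2) <= traced("z", 2)
first_calls = list(calls)

# When x < y is false, z is never evaluated.
calls.clear()
second = traced("x", 5) < traced("y", 2) <= traced("z", 9)
second_calls = list(calls)

# x < y > z is legal and says nothing about x versus z.
third = 1 < 3 > 2
```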
|
||||
|
||||
The forms \verb\<>\ and \verb\!=\ are equivalent; for consistency with
|
||||
C, \verb\!=\ is preferred; where \verb\!=\ is mentioned below
|
||||
\verb\<>\ is also implied.
|
||||
|
||||
The operators {\tt "<", ">", "==", ">=", "<="}, and {\tt "!="} compare
|
||||
the values of two objects. The objects needn't have the same type.
|
||||
If both are numbers, they are converted to a common type. Otherwise,
|
||||
objects of different types {\em always} compare unequal, and are
|
||||
ordered consistently but arbitrarily.
|
||||
|
||||
(This unusual definition of comparison was chosen to simplify the
|
||||
definition of operations like sorting and the \verb\in\ and \verb\not
|
||||
in\ operators.)
|
||||
|
||||
Comparison of objects of the same type depends on the type:
|
||||
|
||||
\begin{itemize}
|
||||
|
||||
\item
|
||||
Numbers are compared arithmetically.
|
||||
|
||||
\item
|
||||
Strings are compared lexicographically using the numeric equivalents
|
||||
(the result of the built-in function \verb\ord\) of their characters.
|
||||
|
||||
\item
|
||||
Tuples and lists are compared lexicographically using comparison of
|
||||
corresponding items.
|
||||
|
||||
\item
|
||||
Mappings (dictionaries) are compared through lexicographic
|
||||
comparison of their sorted (key, value) lists.%
|
||||
\footnote{This is expensive since it requires sorting the keys first,
|
||||
but it is about the only sensible definition. An attempt was made to compare
|
||||
dictionaries by identity only, but this caused surprises because
|
||||
people expected to be able to test a dictionary for emptiness by
|
||||
comparing it to {\tt \{\}}.}
|
||||
|
||||
\item
|
||||
Most other types compare unequal unless they are the same object;
|
||||
the choice whether one object is considered smaller or larger than
|
||||
another one is made arbitrarily but consistently within one
|
||||
execution of a program.
|
||||
|
||||
\end{itemize}
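
A few of the per-type rules listed above can be sketched as follows. The sketch assumes a current Python 3 interpreter and is deliberately limited to cases that behave as described there; ordering comparisons between objects of different non-numeric types, and between dictionaries, are left out because Python 3 raises \verb\TypeError\ for them.

```python
# Strings: lexicographic, by ord() of each character.
upper_before_lower = "Z" < "a"          # ord("Z") is 90, ord("a") is 97
prefix_is_smaller = "abc" < "abcd"

# Tuples and lists: lexicographic, item by item.
tuple_order = (1, 2, 3) < (1, 3)
list_order = [1, 2] < [1, 2, 0]

# Dictionaries: an empty one compares equal to {}.
empty_check = dict() == {}

# Numbers are converted to a common type first;
# a string and a number always compare unequal.
numeric_equal = 1 == 1.0
cross_type_equal = "1" == 1
```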
|
||||
|
||||
The operators \verb\in\ and \verb\not in\ test for sequence
|
||||
membership: if $y$ is a sequence, $x ~\verb\in\~ y$ is true if and
|
||||
only if there exists an index $i$ such that $x = y[i]$.
|
||||
$x ~\verb\not in\~ y$ yields the inverse truth value. The exception
|
||||
\verb\TypeError\ is raised when $y$ is not a sequence, or when $y$ is
|
||||
a string and $x$ is not a string of length one.%
|
||||
\footnote{The latter restriction is sometimes a nuisance.}
|
||||
\opindex{in}
|
||||
\opindex{not in}
|
||||
\indexii{membership}{test}
|
||||
\obindex{sequence}
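
The membership definition can be written out directly. The helper \verb\is_member\ below is a hypothetical restatement of the rule, not the interpreter's implementation, and the sketch assumes a current Python 3 interpreter.

```python
def is_member(x, y):
    """Hypothetical helper: true iff some index i has x == y[i]."""
    return any(x == y[i] for i in range(len(y)))

seq = [3, 1, 4]
found = (4 in seq, is_member(4, seq))
missing = (7 not in seq, not is_member(7, seq))

# TypeError is raised when the right operand is not a sequence.
try:
    1 in 5
    non_sequence_ok = True
except TypeError:
    non_sequence_ok = False
```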
|
||||
|
||||
The operators \verb\is\ and \verb\is not\ test for object identity:
|
||||
$x ~\verb\is\~ y$ is true if and only if $x$ and $y$ are the same
|
||||
object. $x ~\verb\is not\~ y$ yields the inverse truth value.
|
||||
\opindex{is}
|
||||
\opindex{is not}
|
||||
\indexii{identity}{test}
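
The difference between identity and equality can be seen in a short sketch (variable names are illustrative):

```python
a = [1, 2]
b = a            # second name for the same object
c = [1, 2]       # equal value, distinct object

same_object = a is b
equal_but_distinct = (a == c, a is not c)
```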
|
||||
|
||||
\section{Boolean operations} \label{Booleans}
|
||||
\indexii{Boolean}{operation}
|
||||
|
||||
Boolean operations have the lowest priority of all Python operations:
|
||||
|
||||
\begin{verbatim}
|
||||
condition: or_test
|
||||
or_test: and_test | or_test "or" and_test
|
||||
and_test: not_test | and_test "and" not_test
|
||||
not_test: comparison | "not" not_test
|
||||
\end{verbatim}
|
||||
|
||||
In the context of Boolean operations, and also when conditions are
|
||||
used by control flow statements, the following values are interpreted
|
||||
as false: \verb\None\, numeric zero of all types, empty sequences
|
||||
(strings, tuples and lists), and empty mappings (dictionaries). All
|
||||
other values are interpreted as true.
|
||||
|
||||
The operator \verb\not\ yields 1 if its argument is false, 0 otherwise.
|
||||
\opindex{not}
|
||||
|
||||
The condition $x ~\verb\and\~ y$ first evaluates $x$; if $x$ is false,
|
||||
its value is returned; otherwise, $y$ is evaluated and the resulting
|
||||
value is returned.
|
||||
\opindex{and}
|
||||
|
||||
The condition $x ~\verb\or\~ y$ first evaluates $x$; if $x$ is true,
|
||||
its value is returned; otherwise, $y$ is evaluated and the resulting
|
||||
value is returned.
|
||||
\opindex{or}
|
||||
|
||||
(Note that \verb\and\ and \verb\or\ do not restrict the value and type
|
||||
they return to 0 and 1, but rather return the last evaluated argument.
|
||||
This is sometimes useful, e.g. if \verb\s\ is a string that should be
|
||||
replaced by a default value if it is empty, the expression
|
||||
\verb\s or 'foo'\ yields the desired value. Because \verb\not\ has to
|
||||
invent a value anyway, it does not bother to return a value of the
|
||||
same type as its argument, so e.g. \verb\not 'foo'\ yields \verb\0\,
|
||||
not \verb\''\.)
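
The return values of the Boolean operators can be sketched as follows. The helper \verb\with_default\ is hypothetical; in a current Python 3 interpreter \verb\not\ yields \verb\False\ or \verb\True\, which compare equal to 0 and 1.

```python
def with_default(s):
    """Hypothetical helper: replace an empty string by 'foo'."""
    return s or 'foo'

defaults = (with_default(''), with_default('bar'))

# "and" returns the first false operand, or else the last operand.
and_values = (0 and 'x', 'x' and [], 'x' and 'y')

# "not" always yields a truth value, never '' or [].
not_values = (not 'foo', not '')
```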
|
||||
|
||||
\section{Expression lists and condition lists}
|
||||
\indexii{expression}{list}
|
||||
\indexii{condition}{list}
|
||||
|
||||
\begin{verbatim}
|
||||
expr_list: or_expr ("," or_expr)* [","]
|
||||
cond_list: condition ("," condition)* [","]
|
||||
\end{verbatim}
|
||||
|
||||
The only difference between expression lists and condition lists is
|
||||
the lowest priority of operators that can be used in them without
|
||||
being enclosed in parentheses; condition lists allow all operators,
|
||||
while expression lists don't allow comparisons and Boolean operators
|
||||
(they do allow bitwise and shift operators though).
|
||||
|
||||
Expression lists are used in expression statements and assignments;
|
||||
condition lists are used everywhere else where a list of
|
||||
comma-separated values is required.
|
||||
|
||||
An expression (condition) list containing at least one comma yields a
|
||||
tuple. The length of the tuple is the number of expressions
|
||||
(conditions) in the list. The expressions (conditions) are evaluated
|
||||
from left to right. (Condition lists are used syntactically in a few
|
||||
places where no tuple is constructed but a list of values is needed
|
||||
nevertheless.)
|
||||
\obindex{tuple}
|
||||
|
||||
The trailing comma is required only to create a one-item tuple (a.k.a. a
|
||||
{\em singleton}); it is optional in all other cases. A single
|
||||
expression (condition) without a trailing comma doesn't create a
|
||||
tuple, but rather yields the value of that expression (condition).
|
||||
\indexii{trailing}{comma}
|
||||
|
||||
(To create an empty tuple, use an empty pair of parentheses:
|
||||
\verb\()\.)
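
The tuple-forming rules can be summarized in a short sketch (variable names are illustrative):

```python
pair = 1, 2          # at least one comma: a tuple of length 2
single = 1,          # trailing comma: a tuple with one item
plain = (1)          # no comma: just the value 1, not a tuple
trailing = 1, 2,     # the trailing comma is optional here
empty = ()           # an empty pair of parentheses
```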
|