\section{Built-in Types \label{types}}
|
|
|
|
The following sections describe the standard types that are built into
|
|
the interpreter. Historically, Python's built-in types have differed
|
|
from user-defined types because it was not possible to use the built-in
|
|
types as the basis for object-oriented inheritance. With the 2.2
|
|
release this situation has started to change, although the intended
|
|
unification of user-defined and built-in types is as yet far from
|
|
complete.
|
|
|
|
The principal built-in types are numerics, sequences, mappings, files,
classes, instances and exceptions.
|
|
\indexii{built-in}{types}
|
|
|
|
Some operations are supported by several object types; in particular,
|
|
practically all objects can be compared, tested for truth value,
|
|
and converted to a string (with the \code{`\textrm{\ldots}`} notation,
|
|
the equivalent \function{repr()} function, or the slightly different
|
|
\function{str()} function). The latter
|
|
function is implicitly used when an object is written by the
|
|
\keyword{print}\stindex{print} statement.
|
|
(Information on the \ulink{\keyword{print} statement}{../ref/print.html}
|
|
and other language statements can be found in the
|
|
\citetitle[../ref/ref.html]{Python Reference Manual} and the
|
|
\citetitle[../tut/tut.html]{Python Tutorial}.)
|
|
|
|
|
|
\subsection{Truth Value Testing\label{truth}}
|
|
|
|
Any object can be tested for truth value, for use in an \keyword{if} or
|
|
\keyword{while} condition or as an operand of the Boolean operations below.
|
|
The following values are considered false:
|
|
\stindex{if}
|
|
\stindex{while}
|
|
\indexii{truth}{value}
|
|
\indexii{Boolean}{operations}
|
|
\index{false}
|
|
|
|
\begin{itemize}
|
|
|
|
\item \code{None}
|
|
\withsubitem{(Built-in object)}{\ttindex{None}}
|
|
|
|
\item \code{False}
|
|
\withsubitem{(Built-in object)}{\ttindex{False}}
|
|
|
|
\item zero of any numeric type, for example, \code{0}, \code{0L},
|
|
\code{0.0}, \code{0j}.
|
|
|
|
\item any empty sequence, for example, \code{''}, \code{()}, \code{[]}.
|
|
|
|
\item any empty mapping, for example, \code{\{\}}.
|
|
|
|
\item instances of user-defined classes, if the class defines a
|
|
\method{__nonzero__()} or \method{__len__()} method, when that
|
|
method returns the integer zero or \class{bool} value
|
|
\code{False}.\footnote{Additional
|
|
information on these special methods may be found in the
|
|
\citetitle[../ref/ref.html]{Python Reference Manual}.}
|
|
|
|
\end{itemize}
|
|
|
|
All other values are considered true --- so objects of many types are
|
|
always true.
|
|
\index{true}
|
|
|
|
Operations and built-in functions that have a Boolean result always
|
|
return \code{0} or \code{False} for false and \code{1} or \code{True}
|
|
for true, unless otherwise stated. (Important exception: the Boolean
|
|
operations \samp{or}\opindex{or} and \samp{and}\opindex{and} always
|
|
return one of their operands.)
|
|
\index{False}
|
|
\index{True}
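
The rules above can be checked with the built-in \function{bool()}
function.  The following is a minimal sketch: the class name
\class{Basket} is hypothetical, and it relies only on
\method{__len__()}, one of the two special methods named in the list
above.

```python
# Hypothetical container used only for illustration.
class Basket:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # An instance tests false when this returns zero.
        return len(self.items)

# Values the list above describes as false.
falsy = [None, False, 0, 0.0, 0j, '', (), [], {}, Basket([])]
# A few values that are not in that list, and so test true.
truthy = [1, -1, 'a', (0,), [0], {0: 0}, Basket(['egg'])]

falsy_results = [bool(value) for value in falsy]
truthy_results = [bool(value) for value in truthy]
```

Note that \code{(0,)} and \code{[0]} are true: only the length of a
sequence matters, not the truth value of its items.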
|
|
|
|
\subsection{Boolean Operations ---
|
|
\keyword{and}, \keyword{or}, \keyword{not}
|
|
\label{boolean}}
|
|
|
|
These are the Boolean operations, ordered by ascending priority:
|
|
\indexii{Boolean}{operations}
|
|
|
|
\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} or \var{y}}
          {if \var{x} is false, then \var{y}, else \var{x}}{(1)}
  \lineiii{\var{x} and \var{y}}
          {if \var{x} is false, then \var{x}, else \var{y}}{(1)}
  \hline
  \lineiii{not \var{x}}
          {if \var{x} is false, then \code{True}, else \code{False}}{(2)}
\end{tableiii}
|
|
\opindex{and}
|
|
\opindex{or}
|
|
\opindex{not}
|
|
|
|
\noindent
|
|
Notes:
|
|
|
|
\begin{description}
|
|
|
|
\item[(1)]
|
|
These only evaluate their second argument if needed for their outcome.
|
|
|
|
\item[(2)]
|
|
\samp{not} has a lower priority than non-Boolean operators, so
|
|
\code{not \var{a} == \var{b}} is interpreted as \code{not (\var{a} ==
|
|
\var{b})}, and \code{\var{a} == not \var{b}} is a syntax error.
|
|
|
|
\end{description}
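
The behaviour described in the table and notes can be observed
directly.  This is a small sketch; the helper \function{record()} is
hypothetical and exists only to show which operands are actually
evaluated.

```python
calls = []

def record(label, value):
    # Remember the label so we can see which operands were evaluated.
    calls.append(label)
    return value

# 'or' returns its first operand if that is true, otherwise its second.
result_or = record('a', 0) or record('b', 'fallback')

# 'and' returns its first operand if that is false; 'd' is never evaluated.
result_and = record('c', []) and record('d', 'never evaluated')

# Parsed as: not (1 == 2)
result_not = not 1 == 2
```

After this runs, \code{calls} holds \code{['a', 'b', 'c']}: the second
operand of \samp{and} was skipped because the first one was already
false.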
|
|
|
|
|
|
\subsection{Comparisons \label{comparisons}}
|
|
|
|
Comparison operations are supported by all objects. They all have the
|
|
same priority (which is higher than that of the Boolean operations).
|
|
Comparisons can be chained arbitrarily; for example, \code{\var{x} <
|
|
\var{y} <= \var{z}} is equivalent to \code{\var{x} < \var{y} and
|
|
\var{y} <= \var{z}}, except that \var{y} is evaluated only once (but
|
|
in both cases \var{z} is not evaluated at all when \code{\var{x} <
|
|
\var{y}} is found to be false).
|
|
\indexii{chaining}{comparisons}
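
As an illustration of chaining, the sketch below records the order in
which the operands are evaluated.  The helper \function{note()} is
hypothetical and is not part of the language.

```python
evaluated = []

def note(name, value):
    # Record that this operand was evaluated, then pass the value through.
    evaluated.append(name)
    return value

# All three operands are needed; 'y' is evaluated only once.
in_range = note('x', 1) < note('y', 5) <= note('z', 5)
first_order = list(evaluated)

del evaluated[:]

# x < y is false, so 'z' is not evaluated at all.
short_circuit = note('x', 9) < note('y', 5) <= note('z', 5)
second_order = list(evaluated)
```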
|
|
|
|
This table summarizes the comparison operations:
|
|
|
|
\begin{tableiii}{c|l|c}{code}{Operation}{Meaning}{Notes}
  \lineiii{<}{strictly less than}{}
  \lineiii{<=}{less than or equal}{}
  \lineiii{>}{strictly greater than}{}
  \lineiii{>=}{greater than or equal}{}
  \lineiii{==}{equal}{}
  \lineiii{!=}{not equal}{(1)}
  \lineiii{<>}{not equal}{(1)}
  \lineiii{is}{object identity}{}
  \lineiii{is not}{negated object identity}{}
\end{tableiii}
|
|
\indexii{operator}{comparison}
|
|
\opindex{==} % XXX *All* others have funny characters < ! >
|
|
\opindex{is}
|
|
\opindex{is not}
|
|
|
|
\noindent
|
|
Notes:
|
|
|
|
\begin{description}
|
|
|
|
\item[(1)]
|
|
\code{<>} and \code{!=} are alternate spellings for the same operator.
|
|
\code{!=} is the preferred spelling; \code{<>} is obsolescent.
|
|
|
|
\end{description}
|
|
|
|
Objects of different types, except different numeric types and different string types, never
|
|
compare equal; such objects are ordered consistently but arbitrarily
|
|
(so that sorting a heterogeneous array yields a consistent result).
|
|
Furthermore, some types (for example, file objects) support only a
|
|
degenerate notion of comparison where any two objects of that type are
|
|
unequal. Again, such objects are ordered arbitrarily but
|
|
consistently. The \code{<}, \code{<=}, \code{>} and \code{>=}
|
|
operators will raise a \exception{TypeError} exception when any operand
|
|
is a complex number.
|
|
\indexii{object}{numeric}
|
|
\indexii{objects}{comparing}
|
|
|
|
Instances of a class normally compare as non-equal unless the class
|
|
\withsubitem{(instance method)}{\ttindex{__cmp__()}}
|
|
defines the \method{__cmp__()} method. Refer to the
|
|
\citetitle[../ref/customization.html]{Python Reference Manual} for
|
|
information on the use of this method to effect object comparisons.
|
|
|
|
\strong{Implementation note:} Objects of different types except
|
|
numbers are ordered by their type names; objects of the same types
|
|
that don't support proper comparison are ordered by their address.
|
|
|
|
Two more operations with the same syntactic priority,
|
|
\samp{in}\opindex{in} and \samp{not in}\opindex{not in}, are supported
|
|
only by sequence types (below).
|
|
|
|
|
|
\subsection{Numeric Types ---
|
|
\class{int}, \class{float}, \class{long}, \class{complex}
|
|
\label{typesnumeric}}
|
|
|
|
There are four distinct numeric types: \dfn{plain integers},
|
|
\dfn{long integers},
|
|
\dfn{floating point numbers}, and \dfn{complex numbers}.
|
|
In addition, Booleans are a subtype of plain integers.
|
|
Plain integers (also just called \dfn{integers})
|
|
are implemented using \ctype{long} in C, which gives them at least 32
|
|
bits of precision. Long integers have unlimited precision. Floating
|
|
point numbers are implemented using \ctype{double} in C. All bets on
|
|
their precision are off unless you happen to know the machine you are
|
|
working with.
|
|
\obindex{numeric}
|
|
\obindex{Boolean}
|
|
\obindex{integer}
|
|
\obindex{long integer}
|
|
\obindex{floating point}
|
|
\obindex{complex number}
|
|
\indexii{C}{language}
|
|
|
|
Complex numbers have a real and imaginary part, which are each
|
|
implemented using \ctype{double} in C. To extract these parts from
|
|
a complex number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.
|
|
|
|
Numbers are created by numeric literals or as the result of built-in
|
|
functions and operators. Unadorned integer literals (including hex
|
|
and octal numbers) yield plain integers unless the value they denote
|
|
is too large to be represented as a plain integer, in which case
|
|
they yield a long integer. Integer literals with an
|
|
\character{L} or \character{l} suffix yield long integers
|
|
(\character{L} is preferred because \samp{1l} looks too much like
|
|
eleven!). Numeric literals containing a decimal point or an exponent
|
|
sign yield floating point numbers. Appending \character{j} or
|
|
\character{J} to a numeric literal yields a complex number with a
|
|
zero real part. A complex numeric literal is the sum of a real and
|
|
an imaginary part.
|
|
\indexii{numeric}{literals}
|
|
\indexii{integer}{literals}
|
|
\indexiii{long}{integer}{literals}
|
|
\indexii{floating point}{literals}
|
|
\indexii{complex number}{literals}
|
|
\indexii{hexadecimal}{literals}
|
|
\indexii{octal}{literals}
|
|
|
|
Python fully supports mixed arithmetic: when a binary arithmetic
|
|
operator has operands of different numeric types, the operand with the
|
|
``narrower'' type is widened to that of the other, where plain
|
|
integer is narrower than long integer is narrower than floating point is
|
|
narrower than complex.
|
|
Comparisons between numbers of mixed type use the same rule.\footnote{
|
|
As a consequence, the list \code{[1, 2]} is considered equal
|
|
to \code{[1.0, 2.0]}, and similarly for tuples.
|
|
} The constructors \function{int()}, \function{long()}, \function{float()},
|
|
and \function{complex()} can be used
|
|
to produce numbers of a specific type.
|
|
\index{arithmetic}
|
|
\bifuncindex{int}
|
|
\bifuncindex{long}
|
|
\bifuncindex{float}
|
|
\bifuncindex{complex}
|
|
|
|
All numeric types (except complex) support the following operations,
|
|
sorted by ascending priority (operations in the same box have the same
|
|
priority; all numeric operations have a higher priority than
|
|
comparison operations):
|
|
|
|
\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} + \var{y}}{sum of \var{x} and \var{y}}{}
  \lineiii{\var{x} - \var{y}}{difference of \var{x} and \var{y}}{}
  \hline
  \lineiii{\var{x} * \var{y}}{product of \var{x} and \var{y}}{}
  \lineiii{\var{x} / \var{y}}{quotient of \var{x} and \var{y}}{(1)}
  \lineiii{\var{x} \%{} \var{y}}{remainder of \code{\var{x} / \var{y}}}{(4)}
  \hline
  \lineiii{-\var{x}}{\var{x} negated}{}
  \lineiii{+\var{x}}{\var{x} unchanged}{}
  \hline
  \lineiii{abs(\var{x})}{absolute value or magnitude of \var{x}}{}
  \lineiii{int(\var{x})}{\var{x} converted to integer}{(2)}
  \lineiii{long(\var{x})}{\var{x} converted to long integer}{(2)}
  \lineiii{float(\var{x})}{\var{x} converted to floating point}{}
  \lineiii{complex(\var{re},\var{im})}{a complex number with real part \var{re}, imaginary part \var{im}.  \var{im} defaults to zero.}{}
  \lineiii{\var{c}.conjugate()}{conjugate of the complex number \var{c}}{}
  \lineiii{divmod(\var{x}, \var{y})}{the pair \code{(\var{x} // \var{y}, \var{x} \%{} \var{y})}}{(3)(4)}
  \lineiii{pow(\var{x}, \var{y})}{\var{x} to the power \var{y}}{}
  \lineiii{\var{x} ** \var{y}}{\var{x} to the power \var{y}}{}
\end{tableiii}
|
|
\indexiii{operations on}{numeric}{types}
|
|
\withsubitem{(complex number method)}{\ttindex{conjugate()}}
|
|
|
|
\noindent
|
|
Notes:
|
|
\begin{description}
|
|
|
|
\item[(1)]
|
|
For (plain or long) integer division, the result is an integer.
|
|
The result is always rounded towards minus infinity: 1/2 is 0,
|
|
(-1)/2 is -1, 1/(-2) is -1, and (-1)/(-2) is 0. Note that the result
|
|
is a long integer if either operand is a long integer, regardless of
|
|
the numeric value.
|
|
\indexii{integer}{division}
|
|
\indexiii{long}{integer}{division}
|
|
|
|
\item[(2)]
|
|
Conversion from floating point to (long or plain) integer may round or
|
|
truncate as in C; see functions \function{floor()} and
|
|
\function{ceil()} in the \refmodule{math}\refbimodindex{math} module
|
|
for well-defined conversions.
|
|
\withsubitem{(in module math)}{\ttindex{floor()}\ttindex{ceil()}}
|
|
\indexii{numeric}{conversions}
|
|
\indexii{C}{language}
|
|
|
|
\item[(3)]
|
|
See section \ref{built-in-funcs}, ``Built-in Functions,'' for a full
|
|
description.
|
|
|
|
\item[(4)]
|
|
The floor division operator, the modulo operator, and
\function{divmod()} are also defined for complex numbers.
|
|
|
|
\deprecated{2.3}{Instead convert to float using \function{abs()}
|
|
if appropriate.}
|
|
|
|
\end{description}
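
The rounding rule in note (1) and the relationship between
\function{divmod()} and the two operators can be sketched as follows.
The example uses the \code{//} operator that appears in the
\function{divmod()} row of the table, so the results do not depend on
how \code{/} is defined for integers.  The variable names are
illustrative.

```python
import math

# Integer quotients are rounded towards minus infinity.
quotients = [1 // 2, (-1) // 2, 1 // (-2), (-1) // (-2)]

# divmod(x, y) is the pair (x // y, x % y).
operands = [(7, 2), (-7, 2), (7, -2)]
pairs = [divmod(x, y) for (x, y) in operands]

# The quotient and remainder always reconstruct the dividend.
identity_holds = all([q * y + r == x
                      for ((x, y), (q, r)) in zip(operands, pairs)])

# floor() and ceil() give well-defined rounding for floats (note 2).
floor_value = math.floor(-2.5)
ceil_value = math.ceil(-2.5)
```

Here \code{quotients} is \code{[0, -1, -1, 0]}, matching the values
given in note (1), and the remainder takes the sign of the divisor.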
|
|
% XXXJH exceptions: overflow (when? what operations?) zerodivision
|
|
|
|
\subsubsection{Bit-string Operations on Integer Types \label{bitstring-ops}}
|
|
\nodename{Bit-string Operations}
|
|
|
|
Plain and long integer types support additional operations that make
|
|
sense only for bit-strings. Negative numbers are treated as their 2's
|
|
complement value (for long integers, this assumes a sufficiently large
|
|
number of bits that no overflow occurs during the operation).
|
|
|
|
The priorities of the binary bit-wise operations are all lower than
|
|
the numeric operations and higher than the comparisons; the unary
|
|
operation \samp{\~} has the same priority as the other unary numeric
|
|
operations (\samp{+} and \samp{-}).
|
|
|
|
This table lists the bit-string operations sorted in ascending
|
|
priority (operations in the same box have the same priority):
|
|
|
|
\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} | \var{y}}{bitwise \dfn{or} of \var{x} and \var{y}}{}
  \lineiii{\var{x} \^{} \var{y}}{bitwise \dfn{exclusive or} of \var{x} and \var{y}}{}
  \lineiii{\var{x} \&{} \var{y}}{bitwise \dfn{and} of \var{x} and \var{y}}{}
  % The empty groups below prevent conversion to guillemets.
  \lineiii{\var{x} <{}< \var{n}}{\var{x} shifted left by \var{n} bits}{(1), (2)}
  \lineiii{\var{x} >{}> \var{n}}{\var{x} shifted right by \var{n} bits}{(1), (3)}
  \hline
  \lineiii{\~\var{x}}{the bits of \var{x} inverted}{}
\end{tableiii}
|
|
\indexiii{operations on}{integer}{types}
|
|
\indexii{bit-string}{operations}
|
|
\indexii{shifting}{operations}
|
|
\indexii{masking}{operations}
|
|
|
|
\noindent
|
|
Notes:
|
|
\begin{description}
|
|
\item[(1)] Negative shift counts are illegal and cause a
|
|
\exception{ValueError} to be raised.
|
|
\item[(2)] A left shift by \var{n} bits is equivalent to
|
|
multiplication by \code{pow(2, \var{n})} without overflow check.
|
|
\item[(3)] A right shift by \var{n} bits is equivalent to
|
|
division by \code{pow(2, \var{n})} without overflow check.
|
|
\end{description}
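
The following sketch checks the table and the three notes on a few
small values.  The values and variable names are chosen only for
illustration; for negative numbers the right shift is compared against
floor division, consistent with the 2's complement treatment described
above.

```python
x, n = 13, 3

# Note (2): a left shift is a multiplication by pow(2, n).
left_matches = (x << n) == x * pow(2, n)

# Note (3): a right shift is a (floor) division by pow(2, n).
right_matches = (x >> n) == x // pow(2, n)
negative_right = (-13 >> 3) == (-13) // pow(2, 3)

# Inverting the bits of x gives -x - 1 in 2's complement.
inverted = ~x

# Masking operations on two 4-bit patterns, 1100 and 1010.
bit_or = 0x0C | 0x0A
bit_xor = 0x0C ^ 0x0A
bit_and = 0x0C & 0x0A

# Note (1): negative shift counts raise ValueError.
try:
    1 << -1
except ValueError:
    negative_shift_rejected = True
else:
    negative_shift_rejected = False
```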
|
|
|
|
|
|
\subsection{Iterator Types \label{typeiter}}
|
|
|
|
\versionadded{2.2}
|
|
\index{iterator protocol}
|
|
\index{protocol!iterator}
|
|
\index{sequence!iteration}
|
|
\index{container!iteration over}
|
|
|
|
Python supports a concept of iteration over containers. This is
|
|
implemented using two distinct methods; these are used to allow
|
|
user-defined classes to support iteration. Sequences, described below
|
|
in more detail, always support the iteration methods.
|
|
|
|
One method needs to be defined for container objects to provide
|
|
iteration support:
|
|
|
|
\begin{methoddesc}[container]{__iter__}{}
|
|
Return an iterator object. The object is required to support the
|
|
iterator protocol described below. If a container supports
|
|
different types of iteration, additional methods can be provided to
|
|
specifically request iterators for those iteration types. (An
|
|
example of an object supporting multiple forms of iteration would be
|
|
a tree structure which supports both breadth-first and depth-first
|
|
traversal.) This method corresponds to the \member{tp_iter} slot of
|
|
the type structure for Python objects in the Python/C API.
|
|
\end{methoddesc}
|
|
|
|
The iterator objects themselves are required to support the following
|
|
two methods, which together form the \dfn{iterator protocol}:
|
|
|
|
\begin{methoddesc}[iterator]{__iter__}{}
|
|
Return the iterator object itself. This is required to allow both
|
|
containers and iterators to be used with the \keyword{for} and
|
|
\keyword{in} statements. This method corresponds to the
|
|
\member{tp_iter} slot of the type structure for Python objects in
|
|
the Python/C API.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[iterator]{next}{}
|
|
Return the next item from the container. If there are no further
|
|
items, raise the \exception{StopIteration} exception. This method
|
|
corresponds to the \member{tp_iternext} slot of the type structure
|
|
for Python objects in the Python/C API.
|
|
\end{methoddesc}
|
|
|
|
Python defines several iterator objects to support iteration over
|
|
general and specific sequence types, dictionaries, and other more
|
|
specialized forms. The specific types are not important beyond their
|
|
implementation of the iterator protocol.
|
|
|
|
The intention of the protocol is that once an iterator's
|
|
\method{next()} method raises \exception{StopIteration}, it will
|
|
continue to do so on subsequent calls. Implementations that
|
|
do not obey this property are deemed broken. (This constraint
|
|
was added in Python 2.3; in Python 2.2, various iterators are
|
|
broken according to this rule.)
|
|
|
|
Python's generators provide a convenient way to implement the
|
|
iterator protocol. If a container object's \method{__iter__()}
|
|
method is implemented as a generator, it will automatically
|
|
return an iterator object (technically, a generator object)
|
|
supplying the \method{__iter__()} and \method{next()} methods.
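
The two approaches can be sketched side by side.  Both class names
below are hypothetical.  The alias \code{__next__ = next} is an
assumption added so that the sketch also runs on later interpreters
that look the method up under that name; it is not part of the protocol
as described in this section.

```python
class Countdown:
    """Hypothetical container whose __iter__() is a generator."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Calling this returns a generator object, which is an iterator.
        n = self.start
        while n > 0:
            yield n
            n -= 1

class CountdownIterator:
    """Hypothetical hand-written iterator over the same values."""
    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        # An iterator returns itself.
        return self

    def next(self):
        if self.remaining <= 0:
            # Keeps raising on every later call, as the protocol intends.
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value

    # Assumption: later Python versions spell this method __next__.
    __next__ = next

from_generator = list(Countdown(3))
from_class = list(CountdownIterator(3))

it = CountdownIterator(1)
values = [value for value in it]
still_exhausted = list(it) == []
```

A \class{Countdown} instance can be iterated over repeatedly because
each call to \method{__iter__()} creates a fresh generator, whereas a
\class{CountdownIterator} is used up after one pass.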
|
|
|
|
|
|
\subsection{Sequence Types ---
|
|
\class{str}, \class{unicode}, \class{list},
|
|
\class{tuple}, \class{buffer}, \class{xrange}
|
|
\label{typesseq}}
|
|
|
|
There are six sequence types: strings, Unicode strings, lists,
|
|
tuples, buffers, and xrange objects.
|
|
|
|
String literals are written in single or double quotes:
|
|
\code{'xyzzy'}, \code{"frobozz"}. See chapter 2 of the
|
|
\citetitle[../ref/strings.html]{Python Reference Manual} for more about
|
|
string literals. Unicode strings are much like strings, but are
|
|
specified in the syntax using a preceding \character{u} character:
|
|
\code{u'abc'}, \code{u"def"}. Lists are constructed with square brackets,
|
|
separating items with commas: \code{[a, b, c]}. Tuples are
|
|
constructed by the comma operator (not within square brackets), with
|
|
or without enclosing parentheses, but an empty tuple must have the
|
|
enclosing parentheses, such as \code{a, b, c} or \code{()}. A single
|
|
item tuple must have a trailing comma, such as \code{(d,)}.
|
|
\obindex{sequence}
|
|
\obindex{string}
|
|
\obindex{Unicode}
|
|
\obindex{tuple}
|
|
\obindex{list}
|
|
|
|
Buffer objects are not directly supported by Python syntax, but can be
|
|
created by calling the built-in function
|
|
\function{buffer()}.\bifuncindex{buffer} They don't support
|
|
concatenation or repetition.
|
|
\obindex{buffer}
|
|
|
|
Xrange objects are similar to buffers in that there is no specific
|
|
syntax to create them, but they are created using the \function{xrange()}
|
|
function.\bifuncindex{xrange} They don't support slicing,
|
|
concatenation or repetition, and using \code{in}, \code{not in},
|
|
\function{min()} or \function{max()} on them is inefficient.
|
|
\obindex{xrange}
|
|
|
|
Most sequence types support the following operations. The \samp{in} and
|
|
\samp{not in} operations have the same priorities as the comparison
|
|
operations. The \samp{+} and \samp{*} operations have the same
|
|
priority as the corresponding numeric operations.\footnote{They must
have the same priority, since the parser can't tell the type of the
operands.}
|
|
|
|
This table lists the sequence operations sorted in ascending priority
|
|
(operations in the same box have the same priority). In the table,
|
|
\var{s} and \var{t} are sequences of the same type; \var{n}, \var{i}
|
|
and \var{j} are integers:
|
|
|
|
\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  \lineiii{\var{x} in \var{s}}{\code{True} if an item of \var{s} is equal to \var{x}, else \code{False}}{(1)}
  \lineiii{\var{x} not in \var{s}}{\code{False} if an item of \var{s} is
equal to \var{x}, else \code{True}}{(1)}
  \hline
  \lineiii{\var{s} + \var{t}}{the concatenation of \var{s} and \var{t}}{(6)}
  \lineiii{\var{s} * \var{n}\textrm{,} \var{n} * \var{s}}{\var{n} shallow copies of \var{s} concatenated}{(2)}
  \hline
  \lineiii{\var{s}[\var{i}]}{\var{i}'th item of \var{s}, origin 0}{(3)}
  \lineiii{\var{s}[\var{i}:\var{j}]}{slice of \var{s} from \var{i} to \var{j}}{(3), (4)}
  \lineiii{\var{s}[\var{i}:\var{j}:\var{k}]}{slice of \var{s} from \var{i} to \var{j} with step \var{k}}{(3), (5)}
  \hline
  \lineiii{len(\var{s})}{length of \var{s}}{}
  \lineiii{min(\var{s})}{smallest item of \var{s}}{}
  \lineiii{max(\var{s})}{largest item of \var{s}}{}
\end{tableiii}
|
|
\indexiii{operations on}{sequence}{types}
|
|
\bifuncindex{len}
|
|
\bifuncindex{min}
|
|
\bifuncindex{max}
|
|
\indexii{concatenation}{operation}
|
|
\indexii{repetition}{operation}
|
|
\indexii{subscript}{operation}
|
|
\indexii{slice}{operation}
|
|
\indexii{extended slice}{operation}
|
|
\opindex{in}
|
|
\opindex{not in}
|
|
|
|
\noindent
|
|
Notes:
|
|
|
|
\begin{description}
|
|
\item[(1)] When \var{s} is a string or Unicode string object the
|
|
\code{in} and \code{not in} operations act like a substring test. In
|
|
Python versions before 2.3, \var{x} had to be a string of length 1.
|
|
In Python 2.3 and beyond, \var{x} may be a string of any length.
|
|
|
|
\item[(2)] Values of \var{n} less than \code{0} are treated as
|
|
\code{0} (which yields an empty sequence of the same type as
|
|
\var{s}). Note also that the copies are shallow; nested structures
|
|
are not copied. This often haunts new Python programmers; consider:
|
|
|
|
\begin{verbatim}
>>> lists = [[]] * 3
>>> lists
[[], [], []]
>>> lists[0].append(3)
>>> lists
[[3], [3], [3]]
\end{verbatim}
|
|
|
|
What has happened is that \code{[[]]} is a one-element list containing
|
|
an empty list, so all three elements of \code{[[]] * 3} are (pointers to)
|
|
this single empty list. Modifying any of the elements of \code{lists}
|
|
modifies this single list. You can create a list of different lists this
|
|
way:
|
|
|
|
\begin{verbatim}
>>> lists = [[] for i in range(3)]
>>> lists[0].append(3)
>>> lists[1].append(5)
>>> lists[2].append(7)
>>> lists
[[3], [5], [7]]
\end{verbatim}
|
|
|
|
\item[(3)] If \var{i} or \var{j} is negative, the index is relative to
|
|
the end of the sequence: \code{len(\var{s}) + \var{i}} or
|
|
\code{len(\var{s}) + \var{j}} is substituted. But note that \code{-0} is
|
|
still \code{0}.
|
|
|
|
\item[(4)] The slice of \var{s} from \var{i} to \var{j} is defined as
|
|
the sequence of items with index \var{k} such that \code{\var{i} <=
|
|
\var{k} < \var{j}}. If \var{i} or \var{j} is greater than
|
|
\code{len(\var{s})}, use \code{len(\var{s})}. If \var{i} is omitted,
|
|
use \code{0}. If \var{j} is omitted, use \code{len(\var{s})}. If
|
|
\var{i} is greater than or equal to \var{j}, the slice is empty.
|
|
|
|
\item[(5)] The slice of \var{s} from \var{i} to \var{j} with step
|
|
\var{k} is defined as the sequence of items with index
|
|
\code{\var{x} = \var{i} + \var{n}*\var{k}} such that
|
|
$0 \leq n < \frac{j-i}{k}$. In other words, the indices
|
|
are \code{i}, \code{i+k}, \code{i+2*k}, \code{i+3*k} and so on, stopping when
|
|
\var{j} is reached (but never including \var{j}). If \var{i} or \var{j}
|
|
is greater than \code{len(\var{s})}, use \code{len(\var{s})}. If
|
|
\var{i} or \var{j} are omitted then they become ``end'' values
|
|
(which end depends on the sign of \var{k}). Note, \var{k} cannot
|
|
be zero.
|
|
|
|
\item[(6)] If \var{s} and \var{t} are both strings, some Python
|
|
implementations such as CPython can usually perform an in-place optimization
|
|
for assignments of the form \code{\var{s}=\var{s}+\var{t}} or
|
|
\code{\var{s}+=\var{t}}. When applicable, this optimization makes
|
|
quadratic run-time much less likely. This optimization is both version
|
|
and implementation dependent. For performance sensitive code, it is
|
|
preferable to use the \method{str.join()} method which assures consistent
|
|
linear concatenation performance across versions and implementations.
|
|
\versionchanged[Formerly, string concatenation never occurred in-place]{2.4}
|
|
|
|
\end{description}
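
Several of the notes above can be illustrated on a single string.  This
is a sketch with arbitrarily chosen values; the same indexing and
slicing rules apply to lists and tuples.

```python
s = 'abcdefgh'

# Note (1): for strings, 'in' is a substring test.
substring_test = 'cde' in s

# Note (2): a negative repeat count is treated as 0.
repeated_negative = [0] * -2

# Note (3): a negative index counts from the end.
last_item = s[-1]

# Note (4): bounds past the end are clamped; i >= j gives an empty slice.
clamped = s[2:100]
empty = s[5:2]

# Note (5): indices 1, 3, 5 are selected; 7 itself is never included.
every_other = s[1:7:2]

# With a negative step, the omitted bounds become the opposite ends.
reversed_copy = s[::-1]
```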
|
|
|
|
|
|
\subsubsection{String Methods \label{string-methods}}
|
|
|
|
These are the string methods which both 8-bit strings and Unicode
|
|
objects support:
|
|
|
|
\begin{methoddesc}[string]{capitalize}{}
|
|
Return a copy of the string with only its first character capitalized.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{center}{width\optional{, fillchar}}
|
|
Return the string centered in a string of length \var{width}. Padding is done
|
|
using the specified \var{fillchar} (default is a space).
|
|
\versionchanged[Support for the \var{fillchar} argument]{2.4}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{count}{sub\optional{, start\optional{, end}}}
|
|
Return the number of occurrences of substring \var{sub} in string
|
|
\code{\var{s}[\var{start}:\var{end}]}. Optional arguments \var{start} and
|
|
\var{end} are interpreted as in slice notation.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{decode}{\optional{encoding\optional{, errors}}}
|
|
Decodes the string using the codec registered for \var{encoding}.
|
|
\var{encoding} defaults to the default string encoding. \var{errors}
|
|
may be given to set a different error handling scheme. The default is
|
|
\code{'strict'}, meaning that decoding errors raise
|
|
\exception{UnicodeError}. Other possible values are \code{'ignore'},
|
|
\code{'replace'} and any other name registered via
|
|
\function{codecs.register_error}.
|
|
\versionadded{2.2}
|
|
\versionchanged[Support for other error handling schemes added]{2.3}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{encode}{\optional{encoding\optional{,errors}}}
|
|
Return an encoded version of the string. Default encoding is the current
|
|
default string encoding. \var{errors} may be given to set a different
|
|
error handling scheme. The default for \var{errors} is
|
|
\code{'strict'}, meaning that encoding errors raise a
|
|
\exception{UnicodeError}. Other possible values are \code{'ignore'},
|
|
\code{'replace'}, \code{'xmlcharrefreplace'}, \code{'backslashreplace'}
|
|
and any other name registered via \function{codecs.register_error}.
|
|
For a list of possible encodings, see section~\ref{standard-encodings}.
|
|
\versionadded{2.0}
|
|
\versionchanged[Support for \code{'xmlcharrefreplace'} and
|
|
\code{'backslashreplace'} and other error handling schemes added]{2.3}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{endswith}{suffix\optional{, start\optional{, end}}}
|
|
Return \code{True} if the string ends with the specified \var{suffix},
|
|
otherwise return \code{False}. With optional \var{start}, test beginning at
|
|
that position. With optional \var{end}, stop comparing at that position.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{expandtabs}{\optional{tabsize}}
|
|
Return a copy of the string where all tab characters are expanded
|
|
using spaces. If \var{tabsize} is not given, a tab size of \code{8}
|
|
characters is assumed.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{find}{sub\optional{, start\optional{, end}}}
|
|
Return the lowest index in the string where substring \var{sub} is
|
|
found, such that \var{sub} is contained in the range [\var{start},
|
|
\var{end}). Optional arguments \var{start} and \var{end} are
|
|
interpreted as in slice notation. Return \code{-1} if \var{sub} is
|
|
not found.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{index}{sub\optional{, start\optional{, end}}}
|
|
Like \method{find()}, but raise \exception{ValueError} when the
|
|
substring is not found.
|
|
\end{methoddesc}
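
The difference between \method{find()} and \method{index()}, and the
slice-like meaning of \var{start} and \var{end}, can be sketched as
follows.  The sample text is arbitrary, and \method{count()} is
included because it takes the same optional arguments.

```python
text = 'spam and spam'

first = text.find('spam')         # lowest index of a match
second = text.find('spam', 1)     # search restricted to text[1:]
missing = text.find('eggs')       # -1 signals "not found"

occurrences = text.count('spam')
bounded = text.count('spam', 1, 13)   # counts within text[1:13]

# index() behaves like find() but raises instead of returning -1.
try:
    text.index('eggs')
except ValueError:
    index_raised = True
else:
    index_raised = False
```

Because \code{-1} is also a valid index (the last character), the
result of \method{find()} should be compared against \code{-1}
explicitly before it is used for indexing.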
|
|
|
|
\begin{methoddesc}[string]{isalnum}{}
|
|
Return true if all characters in the string are alphanumeric and there
|
|
is at least one character, false otherwise.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{isalpha}{}
|
|
Return true if all characters in the string are alphabetic and there
|
|
is at least one character, false otherwise.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{isdigit}{}
|
|
Return true if all characters in the string are digits and there
|
|
is at least one character, false otherwise.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{islower}{}
|
|
Return true if all cased characters in the string are lowercase and
|
|
there is at least one cased character, false otherwise.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{isspace}{}
|
|
Return true if there are only whitespace characters in the string and
|
|
there is at least one character, false otherwise.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{istitle}{}
|
|
Return true if the string is a titlecased string and there is at least one
character; that is, uppercase characters may only follow uncased
characters, and lowercase characters may only follow cased ones.  Return
false otherwise.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{isupper}{}
|
|
Return true if all cased characters in the string are uppercase and
|
|
there is at least one cased character, false otherwise.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{join}{seq}
|
|
Return a string which is the concatenation of the strings in the
|
|
sequence \var{seq}. The separator between elements is the string
|
|
providing this method.
|
|
\end{methoddesc}
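
Note that the separator is the string on which \method{join()} is
called, not an argument.  A brief sketch with illustrative values:

```python
words = ['alpha', 'beta', 'gamma']

# The string providing the method is placed between the elements.
csv_line = ', '.join(words)

# An empty separator gives plain concatenation.
no_separator = ''.join(words)

# With fewer than two elements no separator appears in the result.
single = '-'.join(['solo'])
nothing = '-'.join([])
```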
|
|
|
|
\begin{methoddesc}[string]{ljust}{width\optional{, fillchar}}
|
|
Return the string left justified in a string of length \var{width}.
|
|
Padding is done using the specified \var{fillchar} (default is a
|
|
space). The original string is returned if
|
|
\var{width} is less than \code{len(\var{s})}.
|
|
\versionchanged[Support for the \var{fillchar} argument]{2.4}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{lower}{}
|
|
Return a copy of the string converted to lowercase.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{lstrip}{\optional{chars}}
|
|
Return a copy of the string with leading characters removed. The
|
|
\var{chars} argument is a string specifying the set of characters
|
|
to be removed. If omitted or \code{None}, the \var{chars} argument
|
|
defaults to removing whitespace. The \var{chars} argument is not
|
|
a prefix; rather, all combinations of its values are stripped:
|
|
\begin{verbatim}
>>> ' spacious '.lstrip()
'spacious '
>>> 'www.example.com'.lstrip('cmowz.')
'example.com'
\end{verbatim}
|
|
\versionchanged[Support for the \var{chars} argument]{2.2.2}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{replace}{old, new\optional{, count}}
|
|
Return a copy of the string with all occurrences of substring
|
|
\var{old} replaced by \var{new}. If the optional argument
|
|
\var{count} is given, only the first \var{count} occurrences are
|
|
replaced.
|
|
\end{methoddesc}
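
As a quick illustration of the \var{count} argument, using an arbitrary
sample string:

```python
original = 'one fish, two fish, red fish'

# Without count, every occurrence is replaced.
all_replaced = original.replace('fish', 'bird')

# With count, only the first two occurrences are replaced.
first_two = original.replace('fish', 'bird', 2)
```

In both cases \code{original} itself is unchanged, since the method
returns a copy.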
|
|
|
|
\begin{methoddesc}[string]{rfind}{sub \optional{,start \optional{,end}}}
|
|
Return the highest index in the string where substring \var{sub} is
|
|
found, such that \var{sub} is contained within \code{\var{s}[\var{start}:\var{end}]}. Optional
|
|
arguments \var{start} and \var{end} are interpreted as in slice
|
|
notation. Return \code{-1} on failure.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{rindex}{sub\optional{, start\optional{, end}}}
|
|
Like \method{rfind()} but raises \exception{ValueError} when the
|
|
substring \var{sub} is not found.
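
The difference between \method{rfind()} and \method{rindex()} can be
sketched as follows (the sample string is arbitrary):

\begin{verbatim}
s = 'mississippi'
last = s.rfind('ss')             # 5, start of the rightmost 'ss'
bounded = s.rfind('ss', 0, 5)    # 2, search restricted to s[0:5]
missing = s.rfind('xyz')         # -1 on failure

raised = False
try:
    s.rindex('xyz')              # same search, but failure is an error
except ValueError:
    raised = True
\end{verbatim}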
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{rjust}{width\optional{, fillchar}}
|
|
Return the string right justified in a string of length \var{width}.
|
|
Padding is done using the specified \var{fillchar} (default is a space).
|
|
The original string is returned if
|
|
\var{width} is less than \code{len(\var{s})}.
|
|
\versionchanged[Support for the \var{fillchar} argument]{2.4}
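
A small sketch comparing \method{ljust()} and \method{rjust()} (the
sample values are arbitrary; the \var{fillchar} argument assumes
Python 2.4 or later):

\begin{verbatim}
name = 'abc'
left = name.ljust(6)            # 'abc   '
right = name.rjust(6, '*')      # '***abc'
same = name.rjust(2)            # 'abc': width is less than len(name)
\end{verbatim}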
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{rsplit}{\optional{sep \optional{,maxsplit}}}
|
|
Return a list of the words in the string, using \var{sep} as the
|
|
delimiter string. If \var{maxsplit} is given, at most \var{maxsplit}
|
|
splits are done, the \emph{rightmost} ones. If \var{sep} is not specified
|
|
or \code{None}, any whitespace string is a separator. Except for splitting
|
|
from the right, \method{rsplit()} behaves like \method{split()} which
|
|
is described in detail below.
|
|
\versionadded{2.4}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{rstrip}{\optional{chars}}
|
|
Return a copy of the string with trailing characters removed. The
|
|
\var{chars} argument is a string specifying the set of characters
|
|
to be removed. If omitted or \code{None}, the \var{chars} argument
|
|
defaults to removing whitespace. The \var{chars} argument is not
|
|
a suffix; rather, all combinations of its values are stripped:
|
|
\begin{verbatim}
|
|
>>> ' spacious '.rstrip()
|
|
' spacious'
|
|
>>> 'mississippi'.rstrip('ipz')
|
|
'mississ'
|
|
\end{verbatim}
|
|
\versionchanged[Support for the \var{chars} argument]{2.2.2}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{split}{\optional{sep \optional{,maxsplit}}}
|
|
Return a list of the words in the string, using \var{sep} as the
|
|
delimiter string. If \var{maxsplit} is given, at most \var{maxsplit}
|
|
splits are done (thus, the list will have at most \code{\var{maxsplit}+1}
|
|
elements). If \var{maxsplit} is not specified, then there
|
|
is no limit on the number of splits (all possible splits are made).
|
|
Consecutive delimiters are not grouped together and are
|
|
deemed to delimit empty strings (for example, \samp{'1,,2'.split(',')}
|
|
returns \samp{['1', '', '2']}). The \var{sep} argument may consist of
|
|
multiple characters (for example, \samp{'1, 2, 3'.split(', ')} returns
|
|
\samp{['1', '2', '3']}). Splitting an empty string with a specified
|
|
separator returns \samp{['']}.
|
|
|
|
If \var{sep} is not specified or is \code{None}, a different splitting
|
|
algorithm is applied. First, whitespace characters (spaces, tabs,
|
|
newlines, returns, and formfeeds) are stripped from both ends. Then,
|
|
words are separated by arbitrary length strings of whitespace
|
|
characters. Consecutive whitespace delimiters are treated as a single
|
|
delimiter (\samp{'1 2 3'.split()} returns \samp{['1', '2', '3']}).
|
|
Splitting an empty string or a string consisting of just whitespace
|
|
returns an empty list.
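
The two splitting algorithms can be compared side by side in the
following sketch (the sample strings are arbitrary;
\method{rsplit()} assumes Python 2.4 or later):

\begin{verbatim}
# Explicit separator: consecutive delimiters yield empty strings.
fields = 'a,b,,c'.split(',')             # ['a', 'b', '', 'c']
limited = 'a,b,,c'.split(',', 1)         # ['a', 'b,,c']
from_right = 'a,b,,c'.rsplit(',', 1)     # ['a,b,', 'c']
empty_sep = ''.split(',')                # ['']

# No separator: runs of whitespace count as a single delimiter.
words = '  1  2\t3\n'.split()            # ['1', '2', '3']
empty_ws = '   '.split()                 # []
\end{verbatim}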
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{splitlines}{\optional{keepends}}
|
|
Return a list of the lines in the string, breaking at line
|
|
boundaries. Line breaks are not included in the resulting list unless
|
|
\var{keepends} is given and true.
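
For instance (a sketch with an arbitrary sample string):

\begin{verbatim}
text = 'first\nsecond\r\nthird'
lines = text.splitlines()        # ['first', 'second', 'third']
kept = text.splitlines(True)     # ['first\n', 'second\r\n', 'third']
\end{verbatim}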
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{startswith}{prefix\optional{,
|
|
start\optional{, end}}}
|
|
Return \code{True} if string starts with the \var{prefix}, otherwise
|
|
return \code{False}. With optional \var{start}, test string beginning at
|
|
that position. With optional \var{end}, stop comparing string at that
|
|
position.
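
A sketch of how \var{start} and \var{end} limit the comparison (the
sample URL is arbitrary):

\begin{verbatim}
url = 'http://www.example.com'
plain = url.startswith('http')          # True
offset = url.startswith('www', 7)       # True: comparison begins at index 7
clipped = url.startswith('www', 7, 9)   # False: url[7:9] is only 'ww'
\end{verbatim}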
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{strip}{\optional{chars}}
|
|
Return a copy of the string with the leading and trailing characters
|
|
removed. The \var{chars} argument is a string specifying the set of
|
|
characters to be removed. If omitted or \code{None}, the \var{chars}
|
|
argument defaults to removing whitespace. The \var{chars} argument is not
|
|
a prefix or suffix; rather, all combinations of its values are stripped:
|
|
\begin{verbatim}
|
|
>>> ' spacious '.strip()
|
|
'spacious'
|
|
>>> 'www.example.com'.strip('cmowz.')
|
|
'example'
|
|
\end{verbatim}
|
|
\versionchanged[Support for the \var{chars} argument]{2.2.2}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{swapcase}{}
|
|
Return a copy of the string with uppercase characters converted to
|
|
lowercase and vice versa.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{title}{}
|
|
Return a titlecased version of the string: words start with uppercase
|
|
characters, all remaining cased characters are lowercase.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{translate}{table\optional{, deletechars}}
|
|
Return a copy of the string where all characters occurring in the
|
|
optional argument \var{deletechars} are removed, and the remaining
|
|
characters have been mapped through the given translation table, which
|
|
must be a string of length 256.
|
|
|
|
For Unicode objects, the \method{translate()} method does not
|
|
accept the optional \var{deletechars} argument. Instead, it
|
|
returns a copy of \var{s} where all characters have been mapped
|
|
through the given translation table which must be a mapping of
|
|
Unicode ordinals to Unicode ordinals, Unicode strings or \code{None}.
|
|
Unmapped characters are left untouched. Characters mapped to \code{None}
|
|
are deleted. Note, a more flexible approach is to create a custom
|
|
character mapping codec using the \refmodule{codecs} module (see
|
|
\module{encodings.cp1251} for an example).
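
The Unicode form can be sketched as follows; the mapping and sample
text are made up for illustration:

\begin{verbatim}
# Keys are Unicode ordinals; values are ordinals, Unicode strings or None.
table = {ord(u'l'): None,        # delete every 'l'
         ord(u'o'): u'0'}        # map 'o' to the string '0'
result = u'hello world'.translate(table)    # u'he0 w0rd'
\end{verbatim}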
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{upper}{}
|
|
Return a copy of the string converted to uppercase.
|
|
|
|
For 8-bit strings, this method is locale-dependent.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[string]{zfill}{width}
|
|
Return the numeric string left filled with zeros in a string
|
|
of length \var{width}. The original string is returned if
|
|
\var{width} is less than \code{len(\var{s})}.
|
|
\versionadded{2.2.2}
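
For example (sample values chosen for illustration):

\begin{verbatim}
padded = '42'.zfill(5)              # '00042'
unchanged = '12345678'.zfill(5)     # '12345678': already longer than width
\end{verbatim}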
|
|
\end{methoddesc}
|
|
|
|
|
|
\subsubsection{String Formatting Operations \label{typesseq-strings}}
|
|
|
|
\index{formatting, string (\%{})}
|
|
\index{interpolation, string (\%{})}
|
|
\index{string!formatting}
|
|
\index{string!interpolation}
|
|
\index{printf-style formatting}
|
|
\index{sprintf-style formatting}
|
|
\index{\protect\%{} formatting}
|
|
\index{\protect\%{} interpolation}
|
|
|
|
String and Unicode objects have one unique built-in operation: the
|
|
\code{\%} operator (modulo). This is also known as the string
|
|
\emph{formatting} or \emph{interpolation} operator. Given
|
|
\code{\var{format} \% \var{values}} (where \var{format} is a string or
|
|
Unicode object), \code{\%} conversion specifications in \var{format}
|
|
are replaced with zero or more elements of \var{values}. The effect
|
|
is similar to using \cfunction{sprintf()} in the C language. If
|
|
\var{format} is a Unicode object, or if any of the objects being
|
|
converted using the \code{\%s} conversion are Unicode objects, the
|
|
result will also be a Unicode object.
|
|
|
|
If \var{format} requires a single argument, \var{values} may be a
|
|
single non-tuple object.\footnote{To format only a tuple you
|
|
should therefore provide a singleton tuple whose only element
|
|
is the tuple to be formatted.} Otherwise, \var{values} must be a tuple with
|
|
exactly the number of items specified by the format string, or a
|
|
single mapping object (for example, a dictionary).
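
The accepted forms of \var{values} can be sketched as follows (the
format strings and data are arbitrary):

\begin{verbatim}
one = '%s items' % 3                        # single non-tuple object
two = '%s has %d items' % ('cart', 3)       # tuple, one item per specifier
tup = 'point: %s' % ((1, 2),)               # singleton tuple wrapping a tuple
by_key = '%(n)d items' % {'n': 3}           # single mapping object

# one == '3 items', two == 'cart has 3 items',
# tup == 'point: (1, 2)', by_key == '3 items'
\end{verbatim}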
|
|
|
|
A conversion specifier contains two or more characters and has the
|
|
following components, which must occur in this order:
|
|
|
|
\begin{enumerate}
|
|
\item The \character{\%} character, which marks the start of the
|
|
specifier.
|
|
\item Mapping key (optional), consisting of a parenthesised sequence
|
|
of characters (for example, \code{(somename)}).
|
|
\item Conversion flags (optional), which affect the result of some
|
|
conversion types.
|
|
\item Minimum field width (optional). If specified as an
|
|
\character{*} (asterisk), the actual width is read from the
|
|
next element of the tuple in \var{values}, and the object to
|
|
convert comes after the minimum field width and optional
|
|
precision.
|
|
\item Precision (optional), given as a \character{.} (dot) followed
|
|
by the precision. If specified as \character{*} (an
|
|
asterisk), the actual precision is read from the next element of
|
|
the tuple in \var{values}, and the value to convert comes after
|
|
the precision.
|
|
\item Length modifier (optional).
|
|
\item Conversion type.
|
|
\end{enumerate}
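
As a sketch of the \character{*} forms (the numbers are arbitrary),
the width and precision may be written in the format or taken from
the tuple, ahead of the value they apply to:

\begin{verbatim}
fixed = '%8.3f' % 3.14159                # '   3.142'
starred = '%*.*f' % (8, 3, 3.14159)      # same result: width 8, precision 3
\end{verbatim}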
|
|
|
|
When the right argument is a dictionary (or other mapping type), then
|
|
the formats in the string \emph{must} include a parenthesised mapping key into
|
|
that dictionary inserted immediately after the \character{\%}
|
|
character. The mapping key selects the value to be formatted from the
|
|
mapping. For example:
|
|
|
|
\begin{verbatim}
|
|
>>> print '%(language)s has %(#)03d quote types.' % \
|
|
{'language': "Python", "#": 2}
|
|
Python has 002 quote types.
|
|
\end{verbatim}
|
|
|
|
In this case no \code{*} specifiers may occur in a format (since they
|
|
require a sequential parameter list).
|
|
|
|
The conversion flag characters are:
|
|
|
|
\begin{tableii}{c|l}{character}{Flag}{Meaning}
|
|
\lineii{\#}{The value conversion will use the ``alternate form''
|
|
(where defined below).}
|
|
\lineii{0}{The conversion will be zero padded for numeric values.}
|
|
\lineii{-}{The converted value is left adjusted (overrides
|
|
the \character{0} conversion if both are given).}
|
|
\lineii{{~}}{(a space) A blank should be left before a positive number
|
|
(or empty string) produced by a signed conversion.}
|
|
\lineii{+}{A sign character (\character{+} or \character{-}) will
|
|
precede the conversion (overrides a ``space'' flag).}
|
|
\end{tableii}
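
The effect of each flag can be seen in this small sketch (the values
are arbitrary):

\begin{verbatim}
zero_padded = '%05d' % 42        # '00042'
left_adjusted = '%-5d|' % 42     # '42   |'
signed = '%+d' % 42              # '+42'
blank = '% d' % 42               # ' 42'
alternate = '%#x' % 255          # '0xff'
\end{verbatim}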
|
|
|
|
A length modifier (\code{h}, \code{l}, or \code{L}) may be
present, but is ignored as it is not necessary for Python.
|
|
|
|
The conversion types are:
|
|
|
|
\begin{tableiii}{c|l|c}{character}{Conversion}{Meaning}{Notes}
|
|
\lineiii{d}{Signed integer decimal.}{}
|
|
\lineiii{i}{Signed integer decimal.}{}
|
|
\lineiii{o}{Unsigned octal.}{(1)}
|
|
\lineiii{u}{Unsigned decimal.}{}
|
|
\lineiii{x}{Unsigned hexadecimal (lowercase).}{(2)}
|
|
\lineiii{X}{Unsigned hexadecimal (uppercase).}{(2)}
|
|
\lineiii{e}{Floating point exponential format (lowercase).}{}
|
|
\lineiii{E}{Floating point exponential format (uppercase).}{}
|
|
\lineiii{f}{Floating point decimal format.}{}
|
|
\lineiii{F}{Floating point decimal format.}{}
|
|
\lineiii{g}{Same as \character{e} if exponent is less than -4 or
not less than precision, \character{f} otherwise.}{}
\lineiii{G}{Same as \character{E} if exponent is less than -4 or
not less than precision, \character{F} otherwise.}{}
|
|
\lineiii{c}{Single character (accepts integer or single character
|
|
string).}{}
|
|
\lineiii{r}{String (converts any Python object using
|
|
\function{repr()}).}{(3)}
|
|
\lineiii{s}{String (converts any Python object using
|
|
\function{str()}).}{(4)}
|
|
\lineiii{\%}{No argument is converted, results in a \character{\%}
|
|
character in the result.}{}
|
|
\end{tableiii}
|
|
|
|
\noindent
|
|
Notes:
|
|
\begin{description}
|
|
\item[(1)]
|
|
The alternate form causes a leading zero (\character{0}) to be
|
|
inserted between left-hand padding and the formatting of the
|
|
number if the leading character of the result is not already a
|
|
zero.
|
|
\item[(2)]
|
|
The alternate form causes a leading \code{'0x'} or \code{'0X'}
|
|
(depending on whether the \character{x} or \character{X} format
|
|
was used) to be inserted between left-hand padding and the
|
|
formatting of the number if the leading character of the result is
|
|
not already a zero.
|
|
\item[(3)]
|
|
The \code{\%r} conversion was added in Python 2.0.
|
|
\item[(4)]
|
|
If the object or format provided is a \class{unicode} string,
|
|
the resulting string will also be \class{unicode}.
|
|
\end{description}
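
A few of the conversion types in use, as an illustrative sketch (the
values are arbitrary):

\begin{verbatim}
as_str = '%s' % 'spam'           # 'spam'
as_repr = '%r' % 'spam'          # "'spam'" (quotes added by repr())
from_int = '%c' % 65             # 'A'
from_char = '%c' % 'A'           # 'A'
percent = '%d%%' % 50            # '50%'
lower_hex = '%x' % 255           # 'ff'
upper_hex = '%X' % 255           # 'FF'
rounded = '%.2f' % 3.14159       # '3.14'
\end{verbatim}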
|
|
|
|
% XXX Examples?
|
|
|
|
Since Python strings have an explicit length, \code{\%s} conversions
|
|
do not assume that \code{'\e0'} is the end of the string.
|
|
|
|
For safety reasons, floating point precisions are clipped to 50;
|
|
\code{\%f} conversions for numbers whose absolute value is over 1e25
|
|
are replaced by \code{\%g} conversions.\footnote{
|
|
These numbers are fairly arbitrary. They are intended to
|
|
avoid printing endless strings of meaningless digits without hampering
|
|
correct use and without having to know the exact precision of floating
|
|
point values on a particular machine.
|
|
} All other errors raise exceptions.
|
|
|
|
Additional string operations are defined in standard modules
|
|
\refmodule{string}\refstmodindex{string}\ and
|
|
\refmodule{re}.\refstmodindex{re}
|
|
|
|
|
|
\subsubsection{XRange Type \label{typesseq-xrange}}
|
|
|
|
The \class{xrange}\obindex{xrange} type is an immutable sequence which
|
|
is commonly used for looping. The advantage of the \class{xrange}
|
|
type is that an \class{xrange} object will always take the same amount
|
|
of memory, no matter the size of the range it represents. There are
|
|
no consistent performance advantages.
|
|
|
|
XRange objects have very little behavior: they only support indexing,
|
|
iteration, and the \function{len()} function.
|
|
|
|
|
|
\subsubsection{Mutable Sequence Types \label{typesseq-mutable}}
|
|
|
|
List objects support additional operations that allow in-place
|
|
modification of the object.
|
|
Other mutable sequence types (when added to the language) should
|
|
also support these operations.
|
|
Strings and tuples are immutable sequence types: such objects cannot
|
|
be modified once created.
|
|
The following operations are defined on mutable sequence types (where
|
|
\var{x} is an arbitrary object):
|
|
\indexiii{mutable}{sequence}{types}
|
|
\obindex{list}
|
|
|
|
\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
|
|
\lineiii{\var{s}[\var{i}] = \var{x}}
|
|
{item \var{i} of \var{s} is replaced by \var{x}}{}
|
|
\lineiii{\var{s}[\var{i}:\var{j}] = \var{t}}
|
|
{slice of \var{s} from \var{i} to \var{j} is replaced by \var{t}}{}
|
|
\lineiii{del \var{s}[\var{i}:\var{j}]}
|
|
{same as \code{\var{s}[\var{i}:\var{j}] = []}}{}
|
|
\lineiii{\var{s}[\var{i}:\var{j}:\var{k}] = \var{t}}
|
|
{the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} are replaced by those of \var{t}}{(1)}
|
|
\lineiii{del \var{s}[\var{i}:\var{j}:\var{k}]}
|
|
{removes the elements of \code{\var{s}[\var{i}:\var{j}:\var{k}]} from the list}{}
|
|
\lineiii{\var{s}.append(\var{x})}
|
|
{same as \code{\var{s}[len(\var{s}):len(\var{s})] = [\var{x}]}}{(2)}
|
|
\lineiii{\var{s}.extend(\var{x})}
|
|
{same as \code{\var{s}[len(\var{s}):len(\var{s})] = \var{x}}}{(3)}
|
|
\lineiii{\var{s}.count(\var{x})}
|
|
{return number of \var{i}'s for which \code{\var{s}[\var{i}] == \var{x}}}{}
|
|
\lineiii{\var{s}.index(\var{x}\optional{, \var{i}\optional{, \var{j}}})}
|
|
{return smallest \var{k} such that \code{\var{s}[\var{k}] == \var{x}} and
|
|
\code{\var{i} <= \var{k} < \var{j}}}{(4)}
|
|
\lineiii{\var{s}.insert(\var{i}, \var{x})}
|
|
{same as \code{\var{s}[\var{i}:\var{i}] = [\var{x}]}}{(5)}
|
|
\lineiii{\var{s}.pop(\optional{\var{i}})}
|
|
{same as \code{\var{x} = \var{s}[\var{i}]; del \var{s}[\var{i}]; return \var{x}}}{(6)}
|
|
\lineiii{\var{s}.remove(\var{x})}
|
|
{same as \code{del \var{s}[\var{s}.index(\var{x})]}}{(4)}
|
|
\lineiii{\var{s}.reverse()}
|
|
{reverses the items of \var{s} in place}{(7)}
|
|
\lineiii{\var{s}.sort(\optional{\var{cmp}\optional{,
|
|
\var{key}\optional{, \var{reverse}}}})}
|
|
{sort the items of \var{s} in place}{(7), (8), (9), (10)}
|
|
\end{tableiii}
|
|
\indexiv{operations on}{mutable}{sequence}{types}
|
|
\indexiii{operations on}{sequence}{types}
|
|
\indexiii{operations on}{list}{type}
|
|
\indexii{subscript}{assignment}
|
|
\indexii{slice}{assignment}
|
|
\indexii{extended slice}{assignment}
|
|
\stindex{del}
|
|
\withsubitem{(list method)}{
|
|
\ttindex{append()}\ttindex{extend()}\ttindex{count()}\ttindex{index()}
|
|
\ttindex{insert()}\ttindex{pop()}\ttindex{remove()}\ttindex{reverse()}
|
|
\ttindex{sort()}}
|
|
\noindent
|
|
Notes:
|
|
\begin{description}
|
|
\item[(1)] \var{t} must have the same length as the slice it is
|
|
replacing.
|
|
|
|
\item[(2)] The C implementation of Python has historically accepted
|
|
multiple parameters and implicitly joined them into a tuple; this
|
|
no longer works in Python 2.0. Use of this misfeature has been
|
|
deprecated since Python 1.4.
|
|
|
|
\item[(3)] \var{x} can be any iterable object.
|
|
|
|
\item[(4)] Raises \exception{ValueError} when \var{x} is not found in
|
|
\var{s}. When a negative index is passed as the second or third parameter
|
|
to the \method{index()} method, the list length is added, as for slice
|
|
indices. If it is still negative, it is truncated to zero, as for
|
|
slice indices. \versionchanged[Previously, \method{index()} didn't
|
|
have arguments for specifying start and stop positions]{2.3}
|
|
|
|
\item[(5)] When a negative index is passed as the first parameter to
|
|
the \method{insert()} method, the list length is added, as for slice
|
|
indices. If it is still negative, it is truncated to zero, as for
|
|
slice indices. \versionchanged[Previously, all negative indices
|
|
were truncated to zero]{2.3}
|
|
|
|
\item[(6)] The \method{pop()} method is only supported by the list and
|
|
array types. The optional argument \var{i} defaults to \code{-1},
|
|
so that by default the last item is removed and returned.
|
|
|
|
\item[(7)] The \method{sort()} and \method{reverse()} methods modify the
|
|
list in place for economy of space when sorting or reversing a large
|
|
list. To remind you that they operate by side effect, they don't return
|
|
the sorted or reversed list.
|
|
|
|
\item[(8)] The \method{sort()} method takes optional arguments for
|
|
controlling the comparisons.
|
|
|
|
\var{cmp} specifies a custom comparison function of two arguments
|
|
(list items) which should return a negative, zero or positive number
|
|
depending on whether the first argument is considered smaller than,
|
|
equal to, or larger than the second argument:
|
|
\samp{\var{cmp}=\keyword{lambda} \var{x},\var{y}:
|
|
\function{cmp}(x.lower(), y.lower())}
|
|
|
|
\var{key} specifies a function of one argument that is used to
|
|
extract a comparison key from each list element:
|
|
\samp{\var{key}=\function{str.lower}}
|
|
|
|
\var{reverse} is a boolean value. If set to \code{True}, then the
|
|
list elements are sorted as if each comparison were reversed.
|
|
|
|
In general, the \var{key} and \var{reverse} conversion processes are
|
|
much faster than specifying an equivalent \var{cmp} function. This is
|
|
because \var{cmp} is called multiple times for each list element while
|
|
\var{key} and \var{reverse} touch each element only once.
|
|
|
|
\versionchanged[Support for \code{None} as an equivalent to omitting
|
|
\var{cmp} was added]{2.3}
|
|
|
|
\versionchanged[Support for \var{key} and \var{reverse} was added]{2.4}
|
|
|
|
\item[(9)] Starting with Python 2.3, the \method{sort()} method is
|
|
guaranteed to be stable. A sort is stable if it guarantees not to
|
|
change the relative order of elements that compare equal --- this is
|
|
helpful for sorting in multiple passes (for example, sort by
|
|
department, then by salary grade).
|
|
|
|
\item[(10)] While a list is being sorted, the effect of attempting to
|
|
mutate, or even inspect, the list is undefined. The C
|
|
implementation of Python 2.3 and newer makes the list appear empty
|
|
for the duration, and raises \exception{ValueError} if it can detect
|
|
that the list has been mutated during a sort.
|
|
\end{description}
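
The operations in the table can be traced step by step in the
following sketch. The data is made up for illustration, and the
\var{key} and \var{reverse} arguments assume Python 2.4 or later:

\begin{verbatim}
s = [0, 1, 2, 3, 4, 5]
s[1:3] = ['a', 'b', 'c']      # [0, 'a', 'b', 'c', 3, 4, 5]
del s[1:4]                    # [0, 3, 4, 5]
s[::2] = ['x', 'y']           # ['x', 3, 'y', 5]; lengths must match
s.append(6)                   # ['x', 3, 'y', 5, 6]
s.extend((7, 8))              # ['x', 3, 'y', 5, 6, 7, 8]; any iterable
s.insert(1, 'new')            # ['x', 'new', 3, 'y', 5, 6, 7, 8]
last = s.pop()                # 8
first = s.pop(0)              # 'x'; s is ['new', 3, 'y', 5, 6, 7]
s.remove('y')                 # ['new', 3, 5, 6, 7]
s.reverse()                   # [7, 6, 5, 3, 'new']

names = ['bob', 'Alice', 'carol', 'Dave']
names.sort(key=str.lower)     # ['Alice', 'bob', 'carol', 'Dave']
returned = names.sort(key=str.lower, reverse=True)
# names is ['Dave', 'carol', 'bob', 'Alice']; returned is None

# Stable sorting in two passes: secondary key first, primary key last.
staff = [('ann', 'ops', 2), ('bob', 'dev', 1),
         ('cy', 'ops', 1), ('di', 'dev', 2)]
staff.sort(key=lambda rec: rec[2])    # by salary grade
staff.sort(key=lambda rec: rec[1])    # by department, grades stay ordered
# [('bob', 'dev', 1), ('di', 'dev', 2), ('cy', 'ops', 1), ('ann', 'ops', 2)]
\end{verbatim}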
|
|
|
|
\subsection{Set Types ---
|
|
\class{set}, \class{frozenset}
|
|
\label{types-set}}
|
|
\obindex{set}
|
|
|
|
A \dfn{set} object is an unordered collection of immutable values.
|
|
Common uses include membership testing, removing duplicates from a sequence,
|
|
and computing mathematical operations such as intersection, union, difference,
|
|
and symmetric difference.
|
|
\versionadded{2.4}
|
|
|
|
Like other collections, sets support \code{\var{x} in \var{set}},
|
|
\code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an
|
|
unordered collection, sets do not record element position or order of
|
|
insertion. Accordingly, sets do not support indexing, slicing, or
|
|
other sequence-like behavior.
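
A minimal sketch of these basics (the sample data is arbitrary):

\begin{verbatim}
basket = set(['apple', 'pear', 'apple', 'fig'])   # duplicates collapse
has_pear = 'pear' in basket                       # True
size = len(basket)                                # 3
unique = sorted(set([3, 1, 3, 2, 1]))             # [1, 2, 3]

indexable = True
try:
    basket[0]                   # sets have no positions to index
except TypeError:
    indexable = False
\end{verbatim}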
|
|
|
|
There are currently two built-in set types, \class{set} and \class{frozenset}.
|
|
The \class{set} type is mutable --- the contents can be changed using methods
|
|
like \method{add()} and \method{remove()}. Since it is mutable, it has no
|
|
hash value and cannot be used as either a dictionary key or as an element of
|
|
another set. The \class{frozenset} type is immutable and hashable --- its
|
|
contents cannot be altered after it is created; however, it can be used as
|
|
a dictionary key or as an element of another set.
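
The hashability difference can be sketched as follows (the sample
values are arbitrary):

\begin{verbatim}
key = frozenset(['a', 'b'])
lookup = {key: 'pair'}                  # a frozenset may be a dictionary key
nested = set([frozenset([1, 2])])       # or an element of another set

rejected = False
try:
    {set(['a', 'b']): 'pair'}           # a mutable set has no hash value
except TypeError:
    rejected = True
\end{verbatim}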
|
|
|
|
Instances of \class{set} and \class{frozenset} provide the following operations:
|
|
|
|
\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
|
|
\lineiii{len(\var{s})}{}{cardinality of set \var{s}}
|
|
|
|
\hline
|
|
\lineiii{\var{x} in \var{s}}{}
|
|
{test \var{x} for membership in \var{s}}
|
|
\lineiii{\var{x} not in \var{s}}{}
|
|
{test \var{x} for non-membership in \var{s}}
|
|
\lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}}
|
|
{test whether every element in \var{s} is in \var{t}}
|
|
\lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}}
|
|
{test whether every element in \var{t} is in \var{s}}
|
|
|
|
\hline
|
|
\lineiii{\var{s}.union(\var{t})}{\var{s} | \var{t}}
|
|
{new set with elements from both \var{s} and \var{t}}
|
|
\lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}}
|
|
{new set with elements common to \var{s} and \var{t}}
|
|
\lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}}
|
|
{new set with elements in \var{s} but not in \var{t}}
|
|
\lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}}
|
|
{new set with elements in either \var{s} or \var{t} but not both}
|
|
\lineiii{\var{s}.copy()}{}
|
|
{new set with a shallow copy of \var{s}}
|
|
\end{tableiii}
|
|
|
|
Note, the non-operator versions of \method{union()}, \method{intersection()},
|
|
\method{difference()}, \method{symmetric_difference()},
|
|
\method{issubset()}, and \method{issuperset()} methods will accept any
|
|
iterable as an argument. In contrast, their operator based counterparts
|
|
require their arguments to be sets. This precludes error-prone constructions
|
|
like \code{set('abc') \&\ 'cbs'} in favor of the more readable
|
|
\code{set('abc').intersection('cbs')}.
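
The contrast can be checked directly, as in this short sketch:

\begin{verbatim}
common = set('abc').intersection('cbs')    # method accepts any iterable
# common == set('bc')

rejected = False
try:
    set('abc') & 'cbs'                     # operator requires a set
except TypeError:
    rejected = True
\end{verbatim}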
|
|
|
|
Both \class{set} and \class{frozenset} support set to set comparisons.
|
|
Two sets are equal if and only if every element of each set is contained in
|
|
the other (each is a subset of the other).
|
|
A set is less than another set if and only if the first set is a proper
|
|
subset of the second set (is a subset, but is not equal).
|
|
A set is greater than another set if and only if the first set is a proper
|
|
superset of the second set (is a superset, but is not equal).
|
|
|
|
Instances of \class{set} are compared to instances of \class{frozenset} based
|
|
on their members. For example, \samp{set('abc') == frozenset('abc')} returns
|
|
\code{True}.
|
|
|
|
The subset and equality comparisons do not generalize to a complete
|
|
ordering function. For example, any two disjoint sets are not equal and
|
|
are not subsets of each other, so \emph{all} of the following return
|
|
\code{False}: \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, or
|
|
\code{\var{a}>\var{b}}.
|
|
Accordingly, sets do not implement the \method{__cmp__} method.
|
|
|
|
Since sets only define partial ordering (subset relationships), the output
|
|
of the \method{list.sort()} method is undefined for lists of sets.
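
A sketch of the partial ordering, using arbitrary sample sets:

\begin{verbatim}
a = set([1, 2])
b = set([3, 4])                         # disjoint from a
results = (a < b, a == b, a > b)        # (False, False, False)

proper = set([1]) < set([1, 2])         # True: proper subset
not_proper = set([1, 2]) < set([1, 2])  # False: equal sets
subset = set([1, 2]) <= set([1, 2])     # True
\end{verbatim}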
|
|
|
|
Set elements are like dictionary keys; they need to define both
|
|
\method{__hash__} and \method{__eq__} methods.
|
|
|
|
Binary operations that mix \class{set} instances with \class{frozenset}
|
|
return the type of the first operand. For example:
|
|
\samp{frozenset('ab') | set('bc')} returns an instance of \class{frozenset}.
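
The rule can be checked with a short sketch:

\begin{verbatim}
r1 = frozenset('ab') | set('bc')        # a frozenset: first operand's type
r2 = set('ab') | frozenset('bc')        # a set
same_members = (r1 == r2)               # True: compared by members
\end{verbatim}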
|
|
|
|
The following table lists operations available for \class{set}
|
|
that do not apply to immutable instances of \class{frozenset}:
|
|
|
|
\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
|
|
\lineiii{\var{s}.update(\var{t})}
|
|
{\var{s} |= \var{t}}
|
|
{return set \var{s} with elements added from \var{t}}
|
|
\lineiii{\var{s}.intersection_update(\var{t})}
|
|
{\var{s} \&= \var{t}}
|
|
{return set \var{s} keeping only elements also found in \var{t}}
|
|
\lineiii{\var{s}.difference_update(\var{t})}
|
|
{\var{s} -= \var{t}}
|
|
{return set \var{s} after removing elements found in \var{t}}
|
|
\lineiii{\var{s}.symmetric_difference_update(\var{t})}
|
|
{\var{s} \textasciicircum= \var{t}}
|
|
{return set \var{s} with elements from \var{s} or \var{t}
|
|
but not both}
|
|
|
|
\hline
|
|
\lineiii{\var{s}.add(\var{x})}{}
|
|
{add element \var{x} to set \var{s}}
|
|
\lineiii{\var{s}.remove(\var{x})}{}
|
|
{remove \var{x} from set \var{s}; raises \exception{KeyError} if not present}
|
|
\lineiii{\var{s}.discard(\var{x})}{}
|
|
{removes \var{x} from set \var{s} if present}
|
|
\lineiii{\var{s}.pop()}{}
|
|
{remove and return an arbitrary element from \var{s}; raises
|
|
\exception{KeyError} if empty}
|
|
\lineiii{\var{s}.clear()}{}
|
|
{remove all elements from set \var{s}}
|
|
\end{tableiii}
|
|
|
|
Note, the non-operator versions of the \method{update()},
|
|
\method{intersection_update()}, \method{difference_update()}, and
|
|
\method{symmetric_difference_update()} methods will accept any iterable
|
|
as an argument.
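
The in-place operations can be traced in the following sketch (the
sample data is arbitrary):

\begin{verbatim}
s = set('abc')
s.update('cd')                          # set('abcd'); any iterable
s.intersection_update('abcx')           # set('abc')
s.difference_update(['a'])              # set('bc')
s.symmetric_difference_update('cz')     # set('bz')
s.add('q')                              # set('bzq')
s.discard('nope')                       # absent element: no error

missing = False
try:
    s.remove('nope')                    # absent element: KeyError
except KeyError:
    missing = True

item = s.pop()                          # an arbitrary element
s.clear()                               # s is now empty
\end{verbatim}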
|
|
|
|
|
|
\subsection{Mapping Types --- \class{dict} \label{typesmapping}}
|
|
\obindex{mapping}
|
|
\obindex{dictionary}
|
|
|
|
A \dfn{mapping} object maps immutable values to
|
|
arbitrary objects. Mappings are mutable objects. There is currently
|
|
only one standard mapping type, the \dfn{dictionary}. A dictionary's keys are
|
|
almost arbitrary values. Only values containing lists, dictionaries
|
|
or other mutable types (that are compared by value rather than by
|
|
object identity) may not be used as keys.
|
|
Numeric types used for keys obey the normal rules for numeric
|
|
comparison: if two numbers compare equal (such as \code{1} and
|
|
\code{1.0}) then they can be used interchangeably to index the same
|
|
dictionary entry.
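
Both points can be sketched briefly (the sample keys are arbitrary):

\begin{verbatim}
d = {1: 'one'}
same_entry = d[1.0]             # 'one': 1 and 1.0 compare equal
d[1.0] = 'uno'                  # overwrites the entry for 1
size = len(d)                   # still 1

d[(1, 2)] = 'pair'              # a tuple of immutable values is a valid key
rejected = False
try:
    d[[1, 2]] = 'list'          # a list is mutable and cannot be a key
except TypeError:
    rejected = True
\end{verbatim}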
|
|
|
|
Dictionaries are created by placing a comma-separated list of
|
|
\code{\var{key}: \var{value}} pairs within braces, for example:
|
|
\code{\{'jack': 4098, 'sjoerd': 4127\}} or
|
|
\code{\{4098: 'jack', 4127: 'sjoerd'\}}.
|
|
|
|
The following operations are defined on mappings (where \var{a} and
|
|
\var{b} are mappings, \var{k} is a key, and \var{v} and \var{x} are
|
|
arbitrary objects):
|
|
\indexiii{operations on}{mapping}{types}
|
|
\indexiii{operations on}{dictionary}{type}
|
|
\stindex{del}
|
|
\bifuncindex{len}
|
|
\withsubitem{(dictionary method)}{
|
|
\ttindex{clear()}
|
|
\ttindex{copy()}
|
|
\ttindex{has_key()}
|
|
\ttindex{fromkeys()}
|
|
\ttindex{items()}
|
|
\ttindex{keys()}
|
|
\ttindex{update()}
|
|
\ttindex{values()}
|
|
\ttindex{get()}
|
|
\ttindex{setdefault()}
|
|
\ttindex{pop()}
|
|
\ttindex{popitem()}
|
|
\ttindex{iteritems()}
|
|
\ttindex{iterkeys()}
|
|
\ttindex{itervalues()}}
|
|
|
|
\begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
|
|
\lineiii{len(\var{a})}{the number of items in \var{a}}{}
|
|
\lineiii{\var{a}[\var{k}]}{the item of \var{a} with key \var{k}}{(1)}
|
|
\lineiii{\var{a}[\var{k}] = \var{v}}
|
|
{set \code{\var{a}[\var{k}]} to \var{v}}
|
|
{}
|
|
\lineiii{del \var{a}[\var{k}]}
|
|
{remove \code{\var{a}[\var{k}]} from \var{a}}
|
|
{(1)}
|
|
\lineiii{\var{a}.clear()}{remove all items from \code{a}}{}
|
|
\lineiii{\var{a}.copy()}{a (shallow) copy of \code{a}}{}
|
|
\lineiii{\var{a}.has_key(\var{k})}
|
|
{\code{True} if \var{a} has a key \var{k}, else \code{False}}
|
|
{}
|
|
\lineiii{\var{k} \code{in} \var{a}}
|
|
{Equivalent to \var{a}.has_key(\var{k})}
|
|
{(2)}
|
|
\lineiii{\var{k} not in \var{a}}
|
|
{Equivalent to \code{not} \var{a}.has_key(\var{k})}
|
|
{(2)}
|
|
\lineiii{\var{a}.items()}
|
|
{a copy of \var{a}'s list of (\var{key}, \var{value}) pairs}
|
|
{(3)}
|
|
\lineiii{\var{a}.keys()}{a copy of \var{a}'s list of keys}{(3)}
|
|
\lineiii{\var{a}.update(\optional{\var{b}})}
|
|
{updates (and overwrites) key/value pairs from \var{b}}
|
|
{(9)}
|
|
\lineiii{\var{a}.fromkeys(\var{seq}\optional{, \var{value}})}
|
|
{Creates a new dictionary with keys from \var{seq} and values set to \var{value}}
|
|
{(7)}
|
|
\lineiii{\var{a}.values()}{a copy of \var{a}'s list of values}{(3)}
|
|
\lineiii{\var{a}.get(\var{k}\optional{, \var{x}})}
|
|
{\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
|
|
else \var{x}}
|
|
{(4)}
|
|
\lineiii{\var{a}.setdefault(\var{k}\optional{, \var{x}})}
|
|
{\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
|
|
else \var{x} (also setting it)}
|
|
{(5)}
|
|
\lineiii{\var{a}.pop(\var{k}\optional{, \var{x}})}
|
|
{\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
|
|
else \var{x} (and remove k)}
|
|
{(8)}
|
|
\lineiii{\var{a}.popitem()}
|
|
{remove and return an arbitrary (\var{key}, \var{value}) pair}
|
|
{(6)}
|
|
\lineiii{\var{a}.iteritems()}
|
|
{return an iterator over (\var{key}, \var{value}) pairs}
|
|
{(2), (3)}
|
|
\lineiii{\var{a}.iterkeys()}
|
|
{return an iterator over the mapping's keys}
|
|
{(2), (3)}
|
|
\lineiii{\var{a}.itervalues()}
|
|
{return an iterator over the mapping's values}
|
|
{(2), (3)}
|
|
\end{tableiii}
|
|
|
|
\noindent
|
|
Notes:
|
|
\begin{description}
|
|
\item[(1)] Raises a \exception{KeyError} exception if \var{k} is not
|
|
in the map.
|
|
|
|
\item[(2)] \versionadded{2.2}
|
|
|
|
\item[(3)] Keys and values are listed in an arbitrary order which is
|
|
non-random, varies across Python implementations, and depends on the
|
|
dictionary's history of insertions and deletions.
|
|
If \method{items()}, \method{keys()}, \method{values()},
|
|
\method{iteritems()}, \method{iterkeys()}, and \method{itervalues()}
|
|
are called with no intervening modifications to the dictionary, the
|
|
lists will directly correspond. This allows the creation of
|
|
\code{(\var{value}, \var{key})} pairs using \function{zip()}:
|
|
\samp{pairs = zip(\var{a}.values(), \var{a}.keys())}. The same
|
|
relationship holds for the \method{iterkeys()} and
|
|
\method{itervalues()} methods: \samp{pairs = zip(\var{a}.itervalues(),
|
|
\var{a}.iterkeys())} provides the same value for \code{pairs}.
|
|
Another way to create the same list is \samp{pairs = [(v, k) for (k,
|
|
v) in \var{a}.iteritems()]}.
|
|
|
|
\item[(4)] Never raises an exception if \var{k} is not in the map;
|
|
instead it returns \var{x}. \var{x} is optional; when \var{x} is not
|
|
provided and \var{k} is not in the map, \code{None} is returned.
|
|
|
|
\item[(5)] \function{setdefault()} is like \function{get()}, except
|
|
that if \var{k} is missing, \var{x} is both returned and inserted into
|
|
the dictionary as the value of \var{k}. \var{x} defaults to \code{None}.
|
|
|
|
\item[(6)] \function{popitem()} is useful to destructively iterate
|
|
over a dictionary, as often used in set algorithms. If the dictionary
|
|
is empty, calling \function{popitem()} raises a \exception{KeyError}.
|
|
|
|
\item[(7)] \function{fromkeys()} is a class method that returns a
|
|
new dictionary. \var{value} defaults to \code{None}. \versionadded{2.3}
|
|
|
|
\item[(8)] \function{pop()} raises a \exception{KeyError} when no default
|
|
value is given and the key is not found. \versionadded{2.3}
|
|
|
|
\item[(9)] \function{update()} accepts either another mapping object
|
|
or an iterable of key/value pairs (as tuples or other iterables of
length two).  If keyword arguments are specified, the mapping is
then updated with those key/value pairs:
|
|
\samp{d.update(red=1, blue=2)}.
|
|
\versionchanged[Allowed the argument to be an iterable of key/value
|
|
pairs and allowed keyword arguments]{2.4}
|
|
|
|
\end{description}
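The notes above can be illustrated with a short sketch. The dictionary
\code{d} and its keys are invented for the example, and
\method{items()} is used in place of \method{iteritems()} so that the
sketch should also run on later Python versions; treat it as an
illustration rather than a specification.

```python
# Illustrative dictionary; the name and contents are made up for this sketch.
d = {'red': 1, 'blue': 2}

# Note (4): get() never raises KeyError.
missing_default = d.get('green')        # no default given
missing_explicit = d.get('green', 0)    # explicit default

# Note (5): setdefault() returns the default and also inserts it.
inserted = d.setdefault('green', 3)

# Note (3): with no intervening modification, values() and keys() correspond.
pairs = list(zip(d.values(), d.keys()))
same_pairs = [(v, k) for (k, v) in d.items()]

# Note (8): pop() raises KeyError only when no default is given.
popped_default = d.pop('missing', 'fallback')
try:
    d.pop('missing')
    pop_raised = False
except KeyError:
    pop_raised = True

# Note (9): update() accepts a mapping, an iterable of pairs, or keywords.
d.update({'red': 10})
d.update([('blue', 20)])
d.update(green=30)

# Note (7): fromkeys() builds a new dictionary; the value defaults to None.
blank = dict.fromkeys(['a', 'b'])

# Note (6): popitem() supports destructive iteration over the dictionary.
seen = []
while d:
    seen.append(d.popitem())
```

After the loop \code{d} is empty, so one further call to
\method{popitem()} would raise \exception{KeyError}.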
|
|
|
|
\subsection{File Objects
|
|
\label{bltin-file-objects}}
|
|
|
|
File objects\obindex{file} are implemented using C's \code{stdio}
|
|
package and can be created with the built-in constructor
|
|
\function{file()}\bifuncindex{file} described in section
|
|
\ref{built-in-funcs}, ``Built-in Functions.''\footnote{\function{file()}
|
|
is new in Python 2.2. The older built-in \function{open()} is an
|
|
alias for \function{file()}.} File objects are also returned
|
|
by some other built-in functions and methods, such as
|
|
\function{os.popen()} and \function{os.fdopen()} and the
|
|
\method{makefile()} method of socket objects.
|
|
\refstmodindex{os}
|
|
\refbimodindex{socket}
|
|
|
|
When a file operation fails for an I/O-related reason, the exception
|
|
\exception{IOError} is raised. This includes situations where the
|
|
operation is not defined for some reason, like \method{seek()} on a tty
|
|
device or writing a file opened for reading.
|
|
|
|
Files have the following methods:
|
|
|
|
|
|
\begin{methoddesc}[file]{close}{}
|
|
Close the file. A closed file cannot be read or written any more.
|
|
Any operation which requires that the file be open will raise a
|
|
\exception{ValueError} after the file has been closed. Calling
|
|
\method{close()} more than once is allowed.
|
|
\end{methoddesc}
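The behaviour of \method{close()} can be sketched as follows. The
scratch file is created with the \module{tempfile} module purely for
the example, and \function{open()} is used rather than
\function{file()}; both choices are assumptions of this sketch.

```python
import os
import tempfile

# Create a scratch file; tempfile picks the path, it is not a fixed name.
fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'w')
f.write('data\n')
f.close()
f.close()                # calling close() a second time is allowed

try:
    f.write('more')      # the file is closed, so this operation fails
    raised = False
except ValueError:
    raised = True

is_closed = f.closed
os.remove(path)
```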
|
|
|
|
\begin{methoddesc}[file]{flush}{}
|
|
Flush the internal buffer, like \code{stdio}'s
|
|
\cfunction{fflush()}. This may be a no-op on some file-like
|
|
objects.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[file]{fileno}{}
|
|
\index{file descriptor}
|
|
\index{descriptor, file}
|
|
Return the integer ``file descriptor'' that is used by the
|
|
underlying implementation to request I/O operations from the
|
|
operating system. This can be useful for other, lower-level
|
|
interfaces that use file descriptors, such as the
|
|
\refmodule{fcntl}\refbimodindex{fcntl} module or
|
|
\function{os.read()} and friends. \note{File-like objects
|
|
which do not have a real file descriptor should \emph{not} provide
|
|
this method!}
|
|
\end{methoddesc}
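As a minimal sketch of passing the descriptor returned by
\method{fileno()} to a lower-level interface, the following reads a few
bytes with \function{os.read()}. The scratch file and its contents are
invented for the example.

```python
import os
import tempfile

# Scratch file with known contents; tempfile chooses the path.
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('hello world')
f.close()

f = open(path, 'r')
descriptor = f.fileno()           # integer descriptor owned by the file object
chunk = os.read(descriptor, 5)    # low-level read that bypasses f's own buffer
f.close()                         # closing f also closes the descriptor
os.remove(path)
```

Mixing descriptor-level reads with the file object's own buffered
methods on the same file is best avoided, since the two do not share a
buffer.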
|
|
|
|
\begin{methoddesc}[file]{isatty}{}
|
|
Return \code{True} if the file is connected to a tty(-like) device, else
|
|
\code{False}. \note{If a file-like object is not associated
|
|
with a real file, this method should \emph{not} be implemented.}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[file]{next}{}
|
|
A file object is its own iterator, that is, \code{iter(\var{f})} returns
|
|
\var{f} (unless \var{f} is closed). When a file is used as an
|
|
iterator, typically in a \keyword{for} loop (for example,
|
|
\code{for line in f: print line}), the \method{next()} method is
|
|
called repeatedly. This method returns the next input line, or raises
|
|
\exception{StopIteration} when \EOF{} is hit. In order to make a
|
|
\keyword{for} loop the most efficient way of looping over the lines of
|
|
a file (a very common operation), the \method{next()} method uses a
|
|
hidden read-ahead buffer. As a consequence of using a read-ahead
|
|
buffer, combining \method{next()} with other file methods (like
|
|
\method{readline()}) does not work reliably. However, using
|
|
\method{seek()} to reposition the file to an absolute position will
|
|
flush the read-ahead buffer.
|
|
\versionadded{2.3}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[file]{read}{\optional{size}}
|
|
Read at most \var{size} bytes from the file (less if the read hits
|
|
\EOF{} before obtaining \var{size} bytes). If the \var{size}
|
|
argument is negative or omitted, read all data until \EOF{} is
|
|
reached. The bytes are returned as a string object. An empty
|
|
string is returned when \EOF{} is encountered immediately. (For
|
|
certain files, like ttys, it makes sense to continue reading after
|
|
an \EOF{} is hit.) Note that this method may call the underlying
|
|
C function \cfunction{fread()} more than once in an effort to
|
|
acquire as close to \var{size} bytes as possible. Also note that
|
|
when in non-blocking mode, less data than what was requested may
|
|
be returned, even if no \var{size} parameter was given.
|
|
\end{methoddesc}
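The effect of the \var{size} argument of \method{read()} can be
sketched with a small scratch file. The file name comes from
\module{tempfile} and the contents are ASCII-only so that byte and
character counts coincide; both are assumptions of this example.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('abcdef')
f.close()

f = open(path, 'r')
first = f.read(4)        # at most 4 bytes are returned
rest = f.read(4)         # only 2 bytes remain before EOF
at_eof = f.read(4)       # EOF is hit immediately: empty string
f.seek(0)
everything = f.read()    # size omitted: read until EOF
f.close()
os.remove(path)
```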
|
|
|
|
\begin{methoddesc}[file]{readline}{\optional{size}}
|
|
Read one entire line from the file. A trailing newline character is
|
|
kept in the string (but may be absent when a file ends with an
|
|
incomplete line).\footnote{
|
|
The advantage of leaving the newline on is that
|
|
returning an empty string is then an unambiguous \EOF{}
|
|
indication. It is also possible (in cases where it might
|
|
matter, for example, if you
|
|
want to make an exact copy of a file while scanning its lines)
|
|
to tell whether the last line of a file ended in a newline
|
|
or not (yes, this happens!).
|
|
} If the \var{size} argument is present and
|
|
non-negative, it is a maximum byte count (including the trailing
|
|
newline) and an incomplete line may be returned.
|
|
An empty string is returned \emph{only} when \EOF{} is encountered
|
|
immediately. \note{Unlike \code{stdio}'s \cfunction{fgets()}, the
|
|
returned string contains null characters (\code{'\e 0'}) if they
|
|
occurred in the input.}
|
|
\end{methoddesc}
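A short sketch of \method{readline()}, using an invented scratch file
whose last line deliberately lacks a trailing newline:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('first\nsecond')     # the last line is incomplete (no newline)
f.close()

f = open(path, 'r')
part = f.readline(3)         # size given: an incomplete line may be returned
remainder = f.readline()     # rest of the first line, newline kept
last = f.readline()          # final line has no trailing newline
eof = f.readline()           # empty string only at EOF
f.close()
os.remove(path)
```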
|
|
|
|
\begin{methoddesc}[file]{readlines}{\optional{sizehint}}
|
|
Read until \EOF{} using \method{readline()} and return a list containing
|
|
the lines thus read. If the optional \var{sizehint} argument is
|
|
present, instead of reading up to \EOF, whole lines totalling
|
|
approximately \var{sizehint} bytes (possibly after rounding up to an
|
|
internal buffer size) are read. Objects implementing a file-like
|
|
interface may choose to ignore \var{sizehint} if it cannot be
|
|
implemented, or cannot be implemented efficiently.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[file]{xreadlines}{}
|
|
This method returns the same thing as \code{iter(f)}.
|
|
\versionadded{2.1}
|
|
\deprecated{2.3}{Use \samp{for \var{line} in \var{file}} instead.}
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[file]{seek}{offset\optional{, whence}}
|
|
Set the file's current position, like \code{stdio}'s \cfunction{fseek()}.
|
|
The \var{whence} argument is optional and defaults to \code{0}
|
|
(absolute file positioning); other values are \code{1} (seek
|
|
relative to the current position) and \code{2} (seek relative to the
|
|
file's end). There is no return value. Note that if the file is
|
|
opened for appending (mode \code{'a'} or \code{'a+'}), any
|
|
\method{seek()} operations will be undone at the next write. If the
|
|
file is only opened for writing in append mode (mode \code{'a'}),
|
|
this method is essentially a no-op, but it remains useful for files
|
|
opened in append mode with reading enabled (mode \code{'a+'}). If the
|
|
file is opened in text mode (mode \code{'t'}), only offsets returned
|
|
by \method{tell()} are legal. Use of other offsets causes undefined
|
|
behavior.
|
|
|
|
Note that not all file objects are seekable.
|
|
\end{methoddesc}
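The three values of \var{whence} can be sketched as follows. The file
is opened in binary mode so that arbitrary offsets are legal; the
scratch file and its ten-byte contents are made up for the example.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('0123456789')        # ten bytes, no newline
f.close()

f = open(path, 'rb')
f.seek(3)                    # whence defaults to 0: absolute position
pos_absolute = f.tell()
f.seek(2, 1)                 # whence 1: relative to the current position
pos_relative = f.tell()
f.seek(-1, 2)                # whence 2: relative to the file's end
pos_from_end = f.tell()
f.close()
os.remove(path)
```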
|
|
|
|
\begin{methoddesc}[file]{tell}{}
|
|
Return the file's current position, like \code{stdio}'s
|
|
\cfunction{ftell()}.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[file]{truncate}{\optional{size}}
|
|
Truncate the file's size. If the optional \var{size} argument is
|
|
present, the file is truncated to (at most) that size. The size
|
|
defaults to the current position. The current file position is
|
|
not changed. Note that if a specified size exceeds the file's
|
|
current size, the result is platform-dependent: possibilities
|
|
include that the file may remain unchanged, increase to the specified
|
|
size as if zero-filled, or increase to the specified size with
|
|
undefined new content.
|
|
Availability: Windows, many \UNIX{} variants.
|
|
\end{methoddesc}
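A minimal sketch of \method{truncate()}, covering only the
well-defined case in which the requested size does not exceed the
current size. The scratch file is an assumption of the example.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('0123456789')
f.close()

# Explicit size: the file is cut down to 4 bytes.
f = open(path, 'r+b')
f.truncate(4)
f.close()
size_explicit = os.path.getsize(path)

# No size: the current position (here 2) is used.
f = open(path, 'r+b')
f.seek(2)
f.truncate()
f.close()
size_default = os.path.getsize(path)

os.remove(path)
```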
|
|
|
|
\begin{methoddesc}[file]{write}{str}
|
|
Write a string to the file. There is no return value. Due to
|
|
buffering, the string may not actually show up in the file until
|
|
the \method{flush()} or \method{close()} method is called.
|
|
\end{methoddesc}
|
|
|
|
\begin{methoddesc}[file]{writelines}{sequence}
|
|
Write a sequence of strings to the file. The sequence can be any
|
|
iterable object producing strings, typically a list of strings.
|
|
There is no return value.
|
|
(The name is intended to match \method{readlines()};
|
|
\method{writelines()} does not add line separators.)
|
|
\end{methoddesc}
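Because \method{writelines()} does not add line separators, the
strings must carry their own newlines if separate lines are wanted. A
small sketch with an invented scratch file:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, 'w')
f.writelines(['one\n', 'two\n'])     # newlines supplied by the caller
f.writelines(['no', 'separator'])    # nothing is inserted between items
f.close()

f = open(path, 'r')
lines = f.readlines()
f.close()
os.remove(path)
```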
|
|
|
|
|
|
Files support the iterator protocol. Each iteration returns the same
|
|
result as \code{\var{file}.readline()}, and iteration ends when the
|
|
\method{readline()} method returns an empty string.
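The correspondence between iteration and \method{readline()} can be
sketched by reading the same invented scratch file both ways:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('a\nb\nc')
f.close()

# Read the lines by iterating over the file object.
f = open(path, 'r')
by_iteration = [line for line in f]
f.close()

# Read the lines with explicit readline() calls until an empty string.
f = open(path, 'r')
by_readline = []
while True:
    line = f.readline()
    if not line:
        break
    by_readline.append(line)
f.close()
os.remove(path)
```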
|
|
|
|
|
|
File objects also offer a number of other interesting attributes.
|
|
These are not required for file-like objects, but should be
|
|
implemented if they make sense for the particular object.
|
|
|
|
\begin{memberdesc}[file]{closed}
|
|
Boolean indicating whether the file object is closed. This is a
|
|
read-only attribute; the \method{close()} method changes the value.
|
|
It may not be available on all file-like objects.
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[file]{encoding}
|
|
The encoding that this file uses. When Unicode strings are written
|
|
to a file, they will be converted to byte strings using this encoding.
|
|
In addition, when the file is connected to a terminal, the attribute
|
|
gives the encoding that the terminal is likely to use (that
|
|
information might be incorrect if the user has misconfigured the
|
|
terminal). The attribute is read-only and may not be present on
|
|
all file-like objects. It may also be \code{None}, in which case
|
|
the file uses the system default encoding for converting Unicode
|
|
strings.
|
|
|
|
\versionadded{2.3}
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[file]{mode}
|
|
The I/O mode for the file. If the file was created using the
|
|
\function{open()} built-in function, this will be the value of the
|
|
\var{mode} parameter. This is a read-only attribute and may not be
|
|
present on all file-like objects.
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[file]{name}
|
|
If the file object was created using \function{open()}, the name of
|
|
the file. Otherwise, some string that indicates the source of the
|
|
file object, of the form \samp{<\mbox{\ldots}>}. This is a read-only
|
|
attribute and may not be present on all file-like objects.
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[file]{newlines}
|
|
If Python was built with the \longprogramopt{with-universal-newlines}
|
|
option to \program{configure} (the default) this read-only attribute
|
|
exists, and for files opened in
|
|
universal newline read mode it keeps track of the types of newlines
|
|
encountered while reading the file. The values it can take are
|
|
\code{'\e r'}, \code{'\e n'}, \code{'\e r\e n'}, \code{None} (unknown,
|
|
no newlines read yet) or a tuple containing all the newline
|
|
types seen, to indicate that multiple
|
|
newline conventions were encountered. For files not opened in universal
|
|
newline read mode the value of this attribute will be \code{None}.
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[file]{softspace}
|
|
Boolean that indicates whether a space character needs to be printed
|
|
before another value when using the \keyword{print} statement.
|
|
Classes that are trying to simulate a file object should also have a
|
|
writable \member{softspace} attribute, which should be initialized to
|
|
zero. This will be automatic for most classes implemented in Python
|
|
(care may be needed for objects that override attribute access); types
|
|
implemented in C will have to provide a writable
|
|
\member{softspace} attribute.
|
|
\note{This attribute is not used to control the
|
|
\keyword{print} statement, but to allow the implementation of
|
|
\keyword{print} to keep track of its internal state.}
|
|
\end{memberdesc}
|
|
|
|
|
|
\subsection{Other Built-in Types \label{typesother}}
|
|
|
|
The interpreter supports several other kinds of objects.
|
|
Most of these support only one or two operations.
|
|
|
|
|
|
\subsubsection{Modules \label{typesmodules}}
|
|
|
|
The only special operation on a module is attribute access:
|
|
\code{\var{m}.\var{name}}, where \var{m} is a module and \var{name}
|
|
accesses a name defined in \var{m}'s symbol table. Module attributes
|
|
can be assigned to. (Note that the \keyword{import} statement is not,
|
|
strictly speaking, an operation on a module object; \code{import
|
|
\var{foo}} does not require a module object named \var{foo} to exist,
|
|
rather it requires an (external) \emph{definition} for a module named
|
|
\var{foo} somewhere.)
|
|
|
|
A special member of every module is \member{__dict__}.
|
|
This is the dictionary containing the module's symbol table.
|
|
Modifying this dictionary will actually change the module's symbol
|
|
table, but direct assignment to the \member{__dict__} attribute is not
|
|
possible (you can write \code{\var{m}.__dict__['a'] = 1}, which
|
|
defines \code{\var{m}.a} to be \code{1}, but you can't write
|
|
\code{\var{m}.__dict__ = \{\}}). Modifying \member{__dict__} directly
|
|
is not recommended.
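The relationship between a module's attributes and its
\member{__dict__} can be sketched as follows. The module object is
created with \code{types.ModuleType} and the name \code{'example'} is
hypothetical; the exception raised by the rejected assignment differs
between Python versions, so the sketch catches both candidates.

```python
import types

m = types.ModuleType('example')    # hypothetical module, not imported from disk

m.__dict__['a'] = 1                # modifies the module's symbol table
value_a = m.a

m.b = 2                            # attribute assignment lands in __dict__
value_b = m.__dict__['b']

try:
    m.__dict__ = {}                # direct assignment is not possible
    replaced = True
except (TypeError, AttributeError):
    replaced = False
```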
|
|
|
|
Modules built into the interpreter are written like this:
|
|
\code{<module 'sys' (built-in)>}. If loaded from a file, they are
|
|
written as \code{<module 'os' from
|
|
'/usr/local/lib/python\shortversion/os.pyc'>}.
|
|
|
|
|
|
\subsubsection{Classes and Class Instances \label{typesobjects}}
|
|
\nodename{Classes and Instances}
|
|
|
|
See chapters 3 and 7 of the \citetitle[../ref/ref.html]{Python
|
|
Reference Manual} for these.
|
|
|
|
|
|
\subsubsection{Functions \label{typesfunctions}}
|
|
|
|
Function objects are created by function definitions. The only
|
|
operation on a function object is to call it:
|
|
\code{\var{func}(\var{argument-list})}.
|
|
|
|
There are really two flavors of function objects: built-in functions
|
|
and user-defined functions. Both support the same operation (to call
|
|
the function), but the implementation is different, hence the
|
|
different object types.
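As a small illustration of the two flavors, the sketch below compares
a built-in function with a user-defined one. The function
\code{double()} is invented for the example; the type names come from
the standard \module{types} module.

```python
import types

def double(x):
    # A trivial user-defined function for comparison with a built-in.
    return 2 * x

is_user_defined = isinstance(double, types.FunctionType)
is_builtin = isinstance(len, types.BuiltinFunctionType)
different_types = type(double) is not type(len)

# Both flavors support the same operation: being called.
results = (double(21), len('abc'))
```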
|
|
|
|
See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
|
|
information.
|
|
|
|
\subsubsection{Methods \label{typesmethods}}
|
|
\obindex{method}
|
|
|
|
Methods are functions that are called using the attribute notation.
|
|
There are two flavors: built-in methods (such as \method{append()} on
|
|
lists) and class instance methods. Built-in methods are described
|
|
with the types that support them.
|
|
|
|
The implementation adds two special read-only attributes to class
|
|
instance methods: \code{\var{m}.im_self} is the object on which the
|
|
method operates, and \code{\var{m}.im_func} is the function
|
|
implementing the method. Calling \code{\var{m}(\var{arg-1},
|
|
\var{arg-2}, \textrm{\ldots}, \var{arg-n})} is completely equivalent to
|
|
calling \code{\var{m}.im_func(\var{m}.im_self, \var{arg-1},
|
|
\var{arg-2}, \textrm{\ldots}, \var{arg-n})}.
|
|
|
|
Class instance methods are either \emph{bound} or \emph{unbound},
|
|
referring to whether the method was accessed through an instance or a
|
|
class, respectively. When a method is unbound, its \code{im_self}
|
|
attribute will be \code{None} and if called, an explicit \code{self}
|
|
object must be passed as the first argument. In this case,
|
|
\code{self} must be an instance of the unbound method's class (or a
|
|
subclass of that class), otherwise a \code{TypeError} is raised.
|
|
|
|
Like function objects, method objects support getting
|
|
arbitrary attributes. However, since method attributes are actually
|
|
stored on the underlying function object (\code{meth.im_func}),
|
|
setting method attributes on either bound or unbound methods is
|
|
disallowed. Attempting to set a method attribute results in a
|
|
\code{TypeError} being raised. In order to set a method attribute,
|
|
you need to explicitly set it on the underlying function object:
|
|
|
|
\begin{verbatim}
class C:
    def method(self):
        pass

c = C()
c.method.im_func.whoami = 'my name is c'
\end{verbatim}
|
|
|
|
See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
|
|
information.
|
|
|
|
|
|
\subsubsection{Code Objects \label{bltin-code-objects}}
|
|
\obindex{code}
|
|
|
|
Code objects are used by the implementation to represent
|
|
``pseudo-compiled'' executable Python code such as a function body.
|
|
They differ from function objects because they don't contain a
|
|
reference to their global execution environment. Code objects are
|
|
returned by the built-in \function{compile()} function and can be
|
|
extracted from function objects through their \member{func_code}
|
|
attribute.
|
|
\bifuncindex{compile}
|
|
\withsubitem{(function object attribute)}{\ttindex{func_code}}
|
|
|
|
A code object can be executed or evaluated by passing it (instead of a
|
|
source string) to the \keyword{exec} statement or the built-in
|
|
\function{eval()} function.
|
|
\stindex{exec}
|
|
\bifuncindex{eval}
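Because a code object carries no reference to a global environment,
the same object can be evaluated against different namespaces. A
minimal sketch, in which the expression and the namespace contents are
invented for the example:

```python
# Compile an expression once; '<string>' is the conventional placeholder
# file name for source that does not come from a file.
code = compile('a + b', '<string>', 'eval')

# The namespace is supplied at evaluation time, not stored in the code object.
numeric = eval(code, {'a': 1, 'b': 2})
textual = eval(code, {'a': 'x', 'b': 'y'})
```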
|
|
|
|
See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
|
|
information.
|
|
|
|
|
|
\subsubsection{Type Objects \label{bltin-type-objects}}
|
|
|
|
Type objects represent the various object types. An object's type is
|
|
accessed by the built-in function \function{type()}. There are no special
|
|
operations on types. The standard module \refmodule{types} defines names
|
|
for all standard built-in types.
|
|
\bifuncindex{type}
|
|
\refstmodindex{types}
|
|
|
|
Types are written like this: \code{<type 'int'>}.
|
|
|
|
|
|
\subsubsection{The Null Object \label{bltin-null-object}}
|
|
|
|
This object is returned by functions that don't explicitly return a
|
|
value. It supports no special operations. There is exactly one null
|
|
object, named \code{None} (a built-in name).
|
|
|
|
It is written as \code{None}.
|
|
|
|
|
|
\subsubsection{The Ellipsis Object \label{bltin-ellipsis-object}}
|
|
|
|
This object is used by extended slice notation (see the
|
|
\citetitle[../ref/ref.html]{Python Reference Manual}). It supports no
|
|
special operations. There is exactly one ellipsis object, named
|
|
\constant{Ellipsis} (a built-in name).
|
|
|
|
It is written as \code{Ellipsis}.
|
|
|
|
\subsubsection{Boolean Values}
|
|
|
|
Boolean values are the two constant objects \code{False} and
|
|
\code{True}. They are used to represent truth values (although other
|
|
values can also be considered false or true). In numeric contexts
|
|
(for example when used as the argument to an arithmetic operator),
|
|
they behave like the integers 0 and 1, respectively. The built-in
|
|
function \function{bool()} can be used to cast any value to a Boolean,
|
|
if the value can be interpreted as a truth value (see section
\ref{truth}, ``Truth Value Testing,'' above).
|
|
|
|
They are written as \code{False} and \code{True}, respectively.
|
|
\index{False}
|
|
\index{True}
|
|
\indexii{Boolean}{values}
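A brief sketch of the numeric behaviour and of \function{bool()}; the
sample values are chosen only for illustration:

```python
# In numeric contexts False and True behave like 0 and 1.
total = True + True
product = False * 10

# bool() converts any value using the standard truth-testing rules.
flags = [bool(0), bool(''), bool([]), bool(3), bool('x'), bool([0])]
```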
|
|
|
|
|
|
\subsubsection{Internal Objects \label{typesinternal}}
|
|
|
|
See the \citetitle[../ref/ref.html]{Python Reference Manual} for this
|
|
information. It describes stack frame objects, traceback objects, and
|
|
slice objects.
|
|
|
|
|
|
\subsection{Special Attributes \label{specialattrs}}
|
|
|
|
The implementation adds a few special read-only attributes to several
|
|
object types, where they are relevant. Some of these are not reported
|
|
by the \function{dir()} built-in function.
|
|
|
|
\begin{memberdesc}[object]{__dict__}
|
|
A dictionary or other mapping object used to store an
|
|
object's (writable) attributes.
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[object]{__methods__}
|
|
\deprecated{2.2}{Use the built-in function \function{dir()} to get a
|
|
list of an object's attributes. This attribute is no longer available.}
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[object]{__members__}
|
|
\deprecated{2.2}{Use the built-in function \function{dir()} to get a
|
|
list of an object's attributes. This attribute is no longer available.}
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[instance]{__class__}
|
|
The class to which a class instance belongs.
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[class]{__bases__}
|
|
The tuple of base classes of a class object. If there are no base
|
|
classes, this will be an empty tuple.
|
|
\end{memberdesc}
|
|
|
|
\begin{memberdesc}[class]{__name__}
|
|
The name of the class or type.
|
|
\end{memberdesc}
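The special attributes above can be inspected directly. In this
sketch the classes \class{Base} and \class{Derived} are hypothetical
and derive from \class{object}, an assumption made so that the results
are the same across Python versions.

```python
class Base(object):
    pass

class Derived(Base):
    def __init__(self):
        self.size = 3          # stored in the instance's __dict__

obj = Derived()

instance_class = obj.__class__       # class the instance belongs to
bases = Derived.__bases__            # tuple of base classes
class_name = Derived.__name__        # name of the class
attributes = obj.__dict__            # the instance's writable attributes
no_bases = object.__bases__          # empty tuple: no base classes
```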
|