cpython/Doc/lib/librandom.tex

288 lines
11 KiB
TeX
Raw Normal View History

\section{\module{random} ---
Generate pseudo-random numbers}
\declaremodule{standard}{random}
\modulesynopsis{Generate pseudo-random numbers with various common
distributions.}
1997-04-03 18:41:49 -04:00
This module implements pseudo-random number generators for various
distributions.
For integers, uniform selection from a range.
For sequences, uniform selection of a random element, and a function to
generate a random permutation of a list in-place.
On the real line, there are functions to compute uniform, normal (Gaussian),
lognormal, negative exponential, gamma, and beta distributions.
For generating distribution of angles, the circular uniform and
von Mises distributions are available.
Almost all module functions depend on the basic function
\function{random()}, which generates a random float uniformly in
the semi-open range [0.0, 1.0). Python uses the standard Wichmann-Hill
generator, combining three pure multiplicative congruential
generators of modulus 30269, 30307 and 30323. Its period (how many
numbers it generates before repeating the sequence exactly) is
6,953,607,871,644. While of much higher quality than the \function{rand()}
function supplied by most C libraries, the theoretical properties
are much the same as for a single linear congruential generator of
large modulus. It is not suitable for all purposes, and is completely
unsuitable for cryptographic purposes.
The functions in this module are not threadsafe: if you want to call these
functions from multiple threads, you should explicitly serialize the calls.
Else, because no critical sections are implemented internally, calls
from different threads may see the same return values.
The functions supplied by this module are actually bound methods of a
2001-02-01 22:42:31 -04:00
hidden instance of the \class{random.Random} class. You can
instantiate your own instances of \class{Random} to get generators
that don't share state. This is especially useful for multi-threaded
programs, creating a different instance of \class{Random} for each
thread, and using the \method{jumpahead()} method to ensure that the
generated sequences seen by each thread don't overlap (see example
below).
Class \class{Random} can also be subclassed if you want to use a
different basic generator of your own devising: in that case, override
the \method{random()}, \method{seed()}, \method{getstate()},
\method{setstate()} and \method{jumpahead()} methods.
Here's one way to create threadsafe distinct and non-overlapping generators:
\begin{verbatim}
def create_generators(num, delta, firstseed=None):
"""Return list of num distinct generators.
Each generator has its own unique segment of delta elements
from Random.random()'s full period.
Seed the first generator with optional arg firstseed (default
is None, to seed from current time).
"""
from random import Random
g = Random(firstseed)
result = [g]
for i in range(num - 1):
laststate = g.getstate()
g = Random()
g.setstate(laststate)
g.jumpahead(delta)
result.append(g)
return result
gens = create_generators(10, 1000000)
\end{verbatim}
That creates 10 distinct generators, which can be passed out to 10
distinct threads. The generators don't share state so can be called
safely in parallel. So long as no thread calls its \code{g.random()}
more than a million times (the second argument to
\function{create_generators()}, the sequences seen by each thread will
not overlap. The period of the underlying Wichmann-Hill generator
limits how far this technique can be pushed.
Just for fun, note that since we know the period, \method{jumpahead()}
can also be used to ``move backward in time:''
\begin{verbatim}
>>> g = Random(42) # arbitrary
>>> g.random()
0.25420336316883324
>>> g.jumpahead(6953607871644L - 1) # move *back* one
>>> g.random()
0.25420336316883324
\end{verbatim}
Bookkeeping functions:
\begin{funcdesc}{seed}{\optional{x}}
Initialize the basic random number generator.
Optional argument \var{x} can be any hashable object.
2001-04-21 02:56:06 -03:00
If \var{x} is omitted or \code{None}, current system time is used;
current system time is also used to initialize the generator when the
module is first imported.
2001-04-21 02:56:06 -03:00
If \var{x} is not \code{None} or an int or long,
\code{hash(\var{x})} is used instead.
If \var{x} is an int or long, \var{x} is used directly.
Distinct values between 0 and 27814431486575L inclusive are guaranteed
to yield distinct internal states (this guarantee is specific to the
default Wichmann-Hill generator, and may not apply to subclasses
supplying their own basic generator).
\end{funcdesc}
\begin{funcdesc}{whseed}{\optional{x}}
This is obsolete, supplied for bit-level compatibility with versions
of Python prior to 2.1.
See \function{seed} for details. \function{whseed} does not guarantee
that distinct integer arguments yield distinct internal states, and can
yield no more than about 2**24 distinct internal states in all.
\end{funcdesc}
\begin{funcdesc}{getstate}{}
2001-02-01 22:42:31 -04:00
Return an object capturing the current internal state of the
generator. This object can be passed to \function{setstate()} to
restore the state.
\versionadded{2.1}
\end{funcdesc}
\begin{funcdesc}{setstate}{state}
\var{state} should have been obtained from a previous call to
2001-02-01 22:42:31 -04:00
\function{getstate()}, and \function{setstate()} restores the
internal state of the generator to what it was at the time
\function{setstate()} was called.
\versionadded{2.1}
2001-02-01 22:42:31 -04:00
\end{funcdesc}
\begin{funcdesc}{jumpahead}{n}
2001-02-01 22:42:31 -04:00
Change the internal state to what it would be if \function{random()}
were called \var{n} times, but do so quickly. \var{n} is a
non-negative integer. This is most useful in multi-threaded
2001-04-21 02:56:06 -03:00
programs, in conjuction with multiple instances of the \class{Random}
2001-02-01 22:42:31 -04:00
class: \method{setstate()} or \method{seed()} can be used to force
all instances into the same internal state, and then
\method{jumpahead()} can be used to force the instances' states as
far apart as you like (up to the period of the generator).
\versionadded{2.1}
\end{funcdesc}
Functions for integers:
\begin{funcdesc}{randrange}{\optional{start,} stop\optional{, step}}
Return a randomly selected element from \code{range(\var{start},
\var{stop}, \var{step})}. This is equivalent to
\code{choice(range(\var{start}, \var{stop}, \var{step}))},
but doesn't actually build a range object.
\versionadded{1.5.2}
\end{funcdesc}
\begin{funcdesc}{randint}{a, b}
Return a random integer \var{N} such that
\code{\var{a} <= \var{N} <= \var{b}}.
\end{funcdesc}
Functions for sequences:
\begin{funcdesc}{choice}{seq}
Return a random element from the non-empty sequence \var{seq}.
\end{funcdesc}
\begin{funcdesc}{shuffle}{x\optional{, random}}
Shuffle the sequence \var{x} in place.
The optional argument \var{random} is a 0-argument function
returning a random float in [0.0, 1.0); by default, this is the
function \function{random()}.
Note that for even rather small \code{len(\var{x})}, the total
number of permutations of \var{x} is larger than the period of most
random number generators; this implies that most permutations of a
long sequence can never be generated.
\end{funcdesc}
\begin{funcdesc}{sample}{population, k}
Return a \var{k} length list of unique elements chosen from the
population sequence. Used for random sampling without replacement.
\versionadded{2.3}
Returns a new list containing elements from the population while
leaving the original population unchanged. The resulting list is
in selection order so that all sub-slices will also be valid random
samples. This allows raffle winners (the sample) to be partitioned
into grand prize and second place winners (the subslices).
Members of the population need not be hashable or unique. If the
population contains repeats, then each occurrence is a possible
selection in the sample.
To choose a sample from a range of integers, use \function{xrange}
as an argument. This is especially fast and space efficient for
sampling from a large population: \code{sample(xrange(10000000), 60)}.
\end{funcdesc}
The following functions generate specific real-valued distributions.
Function parameters are named after the corresponding variables in the
distribution's equation, as used in common mathematical practice; most of
these equations can be found in any statistics text.
\begin{funcdesc}{random}{}
Return the next random floating point number in the range [0.0, 1.0).
\end{funcdesc}
\begin{funcdesc}{uniform}{a, b}
Return a random real number \var{N} such that
\code{\var{a} <= \var{N} < \var{b}}.
\end{funcdesc}
1998-03-08 04:13:53 -04:00
\begin{funcdesc}{betavariate}{alpha, beta}
Beta distribution. Conditions on the parameters are
\code{\var{alpha} > -1} and \code{\var{beta} > -1}.
Returned values range between 0 and 1.
1997-04-03 18:41:49 -04:00
\end{funcdesc}
1998-03-08 04:13:53 -04:00
\begin{funcdesc}{cunifvariate}{mean, arc}
Circular uniform distribution. \var{mean} is the mean angle, and
\var{arc} is the range of the distribution, centered around the mean
angle. Both values must be expressed in radians, and can range
between 0 and \emph{pi}. Returned values range between
\code{\var{mean} - \var{arc}/2} and \code{\var{mean} +
\var{arc}/2} and are normalized to between 0 and \emph{pi}.
\deprecated{2.3}{Instead, use \code{(\var{mean} + \var{arc} *
(random.random() - 0.5)) \% math.pi}.}
1997-04-03 18:41:49 -04:00
\end{funcdesc}
\begin{funcdesc}{expovariate}{lambd}
Exponential distribution. \var{lambd} is 1.0 divided by the desired
mean. (The parameter would be called ``lambda'', but that is a
reserved word in Python.) Returned values range from 0 to
positive infinity.
1997-04-03 18:41:49 -04:00
\end{funcdesc}
\begin{funcdesc}{gammavariate}{alpha, beta}
Gamma distribution. (\emph{Not} the gamma function!) Conditions on
the parameters are \code{\var{alpha} > 0} and \code{\var{beta} > 0}.
1997-04-03 18:41:49 -04:00
\end{funcdesc}
1998-03-08 04:13:53 -04:00
\begin{funcdesc}{gauss}{mu, sigma}
Gaussian distribution. \var{mu} is the mean, and \var{sigma} is the
standard deviation. This is slightly faster than the
\function{normalvariate()} function defined below.
1997-04-03 18:41:49 -04:00
\end{funcdesc}
1998-03-08 04:13:53 -04:00
\begin{funcdesc}{lognormvariate}{mu, sigma}
Log normal distribution. If you take the natural logarithm of this
distribution, you'll get a normal distribution with mean \var{mu}
and standard deviation \var{sigma}. \var{mu} can have any value,
and \var{sigma} must be greater than zero.
1997-04-03 18:41:49 -04:00
\end{funcdesc}
1998-03-08 04:13:53 -04:00
\begin{funcdesc}{normalvariate}{mu, sigma}
Normal distribution. \var{mu} is the mean, and \var{sigma} is the
standard deviation.
1997-04-03 18:41:49 -04:00
\end{funcdesc}
1998-03-08 04:13:53 -04:00
\begin{funcdesc}{vonmisesvariate}{mu, kappa}
\var{mu} is the mean angle, expressed in radians between 0 and
2*\emph{pi}, and \var{kappa} is the concentration parameter, which
must be greater than or equal to zero. If \var{kappa} is equal to
zero, this distribution reduces to a uniform random angle over the
range 0 to 2*\emph{pi}.
1997-04-03 18:41:49 -04:00
\end{funcdesc}
\begin{funcdesc}{paretovariate}{alpha}
Pareto distribution. \var{alpha} is the shape parameter.
\end{funcdesc}
\begin{funcdesc}{weibullvariate}{alpha, beta}
Weibull distribution. \var{alpha} is the scale parameter and
\var{beta} is the shape parameter.
\end{funcdesc}
\begin{seealso}
\seetext{Wichmann, B. A. \& Hill, I. D., ``Algorithm AS 183:
An efficient and portable pseudo-random number generator'',
\citetitle{Applied Statistics} 31 (1982) 188-190.}
\end{seealso}