\section{Standard Module \sectcode{regsub}}
\label{module-regsub}

\stmodindex{regsub}
This module defines a number of functions useful for working with
regular expressions (see built-in module \code{regex}).

Warning: these functions are not thread-safe.

\strong{Obsolescence note:}
This module is obsolete as of Python version 1.5; it is still being
maintained because much existing code uses it. All new code in
need of regular expressions should use the new \module{re} module, which
supports the more powerful and more consistent Perl-style regular
expressions. Existing code should be converted. The standard library
module \module{reconvert} helps in converting \code{regex}-style regular
expressions to \module{re}-style regular expressions. (For more
conversion help, see the URL
\url{http://starship.skyport.net/crew/amk/howto/regex-to-re.html}.)
\begin{funcdesc}{sub}{pat, repl, str}
Replace the first occurrence of pattern \var{pat} in string
\var{str} by replacement \var{repl}. If the pattern isn't found,
the string is returned unchanged. The pattern may be a string or an
already compiled pattern. The replacement may contain references
\samp{\e \var{digit}} to subpatterns and escaped backslashes.
\end{funcdesc}
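For code that is being converted as the obsolescence note recommends,
the behaviour of \code{sub()} can be approximated with the \module{re}
module. The following is a minimal sketch rather than part of this
module: it assumes a Python version that provides \module{re}, it uses
\module{re}-style patterns, and its variable names and sample strings
are chosen purely for illustration.

\begin{verbatim}
import re

# sub() replaces only the first match; count=1 asks
# re.sub() for the same behaviour.
first = re.sub(':', '-', 'a:b:c', count=1)      # 'a-b:c'

# A pattern that is not found leaves the string unchanged.
unchanged = re.sub('x', '-', 'a:b:c', count=1)  # 'a:b:c'

# The replacement may refer to subpatterns by number.
swapped = re.sub(r'(\w+)=(\w+)', r'\2=\1',
                 'key=value', count=1)          # 'value=key'
\end{verbatim}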
\begin{funcdesc}{gsub}{pat, repl, str}
Replace all (non-overlapping) occurrences of pattern \var{pat} in
string \var{str} by replacement \var{repl}. The same rules as for
\code{sub()} apply. Empty matches for the pattern are replaced only
when not adjacent to a previous match, so e.g.
\code{gsub('', '-', 'abc')} returns \code{'-a-b-c-'}.
\end{funcdesc}
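When converting to the \module{re} module, replacing every occurrence
roughly corresponds to calling \code{re.sub()} without a count. The
sketch below is an illustration only; it assumes \module{re} is
available, and the variable names and sample strings are not taken
from this module.

\begin{verbatim}
import re

# Without a count, re.sub() replaces every non-overlapping match.
replaced = re.sub(':', '-', 'a:b:c')   # 'a-b-c'

# An empty pattern matches between characters and at both ends.
dashed = re.sub('', '-', 'abc')        # '-a-b-c-'
\end{verbatim}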
\begin{funcdesc}{split}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields. Only
non-empty matches for the pattern are considered, so e.g.
\code{split('a:b', ':*')} returns \code{['a', 'b']} and
\code{split('abc', '')} returns \code{['abc']}. The optional
\var{maxsplit} argument defaults to 0. If it is nonzero, at most
\var{maxsplit} splits occur, and the remainder of the string is
returned as the final element of the list.
\end{funcdesc}
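A comparable call in the \module{re} module is \code{re.split()}, which
takes the pattern first and the string second. Because
\code{re.split()} may treat empty matches differently from the rule
described above, the following sketch uses a pattern that cannot match
the empty string. It assumes \module{re} is available; the variable
names and sample strings are illustrative only.

\begin{verbatim}
import re

# split(str, pat) takes the string first;
# re.split(pattern, string) takes the pattern first.
fields = re.split(':+', 'a:b')                   # ['a', 'b']

# A run of delimiters counts as a single separator.
run = re.split(':+', 'a:::b')                    # ['a', 'b']

# maxsplit=2 performs two splits; the rest of the string
# becomes the final element of the list.
limited = re.split(':+', 'a:b:c:d', maxsplit=2)  # ['a', 'b', 'c:d']
\end{verbatim}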
\begin{funcdesc}{splitx}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields as well
as the separators. For example, \code{splitx('a:::b', ':*')} returns
\code{['a', ':::', 'b']}. Otherwise, this function behaves the same
as \code{split()}.
\end{funcdesc}
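In the \module{re} module, a similar result can be obtained by wrapping
the delimiter pattern in a capturing group, since \code{re.split()}
then includes the matched separators in its result. This is a sketch
under the assumption that \module{re} is available and that the
delimiter pattern cannot match the empty string; the variable names
and sample strings are illustrative.

\begin{verbatim}
import re

# The parentheses make re.split() keep each separator.
parts = re.split('(:+)', 'a:::b')              # ['a', ':::', 'b']

# With maxsplit=1 the remainder is left unsplit.
once = re.split('(:+)', 'a:b:c', maxsplit=1)   # ['a', ':', 'b:c']
\end{verbatim}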
\begin{funcdesc}{capwords}{s\optional{, pat}}
Capitalize the words in string \var{s}, where words are separated by
the optional pattern \var{pat}. The default pattern uses any
characters except letters, digits and underscores as word delimiters.
Capitalization is done by changing the first character of each word to
upper case.
\end{funcdesc}
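The behaviour described above might be sketched with the \module{re}
module as follows. The helper function and its default pattern are
hypothetical, written only to follow the description; they are not the
implementation used by this module. The sketch assumes that \module{re}
is available and that the delimiter pattern contains no groups of its
own.

\begin{verbatim}
import re

def capwords_sketch(s, pat='[^a-zA-Z0-9_]+'):
    # Hypothetical helper, not a library function.
    # Splitting on a capturing group keeps the delimiters, so
    # words sit at even indexes and delimiters at odd indexes.
    parts = re.split('(' + pat + ')', s)
    for i in range(0, len(parts), 2):
        # Upper-case only the first character of each word.
        parts[i] = parts[i][:1].upper() + parts[i][1:]
    return ''.join(parts)

titled = capwords_sketch('hello, big_world')    # 'Hello, Big_world'
custom = capwords_sketch('one-two three', '-')  # 'One-Two three'
\end{verbatim}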
\begin{funcdesc}{clear_cache}{}
The \module{regsub} module maintains a cache of compiled regular
expressions, keyed on the regular expression string and the syntax of
the \code{regex} module at the time the expression was compiled. This
function clears that cache.
\end{funcdesc}
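The caching scheme described above can be illustrated with a small
sketch built on the \module{re} module. All names in it are
hypothetical, and the compilation flags stand in for the syntax
setting as an assumption made for this example; it does not reproduce
the cache kept by this module.

\begin{verbatim}
import re

# Hypothetical cache: (pattern string, flags) -> compiled pattern.
_cache = {}

def compile_cached(pat, flags=0):
    # Compile each (pattern, flags) pair only once.
    key = (pat, flags)
    if key not in _cache:
        _cache[key] = re.compile(pat, flags)
    return _cache[key]

def clear_cache_sketch():
    # Forget every compiled pattern held in the sketch's cache.
    _cache.clear()

compile_cached(':+')
compile_cached(':+', re.IGNORECASE)   # different key, second entry
clear_cache_sketch()                  # the cache is now empty

# The re module keeps its own cache; re.purge() clears it.
re.purge()
\end{verbatim}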