81 lines
3.3 KiB
TeX
81 lines
3.3 KiB
TeX
|
\section{\module{reconvert} ---
|
||
|
Convert regular expressions from regex to re form}
|
||
|
\declaremodule{standard}{reconvert}
|
||
|
\moduleauthor{Andrew M. Kuchling}{amk@amk.ca}
|
||
|
\sectionauthor{Skip Montanaro}{skip@pobox.com}
|
||
|
|
||
|
|
||
|
\modulesynopsis{Convert regex-, emacs- or sed-style regular expressions
|
||
|
to re-style syntax.}
|
||
|
|
||
|
|
||
|
This module provides a facility to convert regular expressions from the
|
||
|
syntax used by the deprecated \module{regex} module to those used by the
|
||
|
newer \module{re} module. Because of similarity between the regular
|
||
|
expression syntax of \code{sed(1)} and \code{emacs(1)} and the
|
||
|
\module{regex} module, it is also helpful to convert patterns written for
|
||
|
those tools to \module{re} patterns.
|
||
|
|
||
|
When used as a script, a Python string literal (or any other expression
|
||
|
evaluating to a string) is read from stdin, and the translated expression is
|
||
|
written to stdout as a string literal. Unless stdout is a tty, no trailing
|
||
|
newline is written to stdout. This is done so that it can be used with
|
||
|
Emacs \code{C-U M-|} (shell-command-on-region) which filters the region
|
||
|
through the shell command.
|
||
|
|
||
|
\begin{seealso}
|
||
|
\seetitle{Mastering Regular Expressions}{Book on regular expressions
|
||
|
by Jeffrey Friedl, published by O'Reilly. The second
|
||
|
edition of the book no longer covers Python at all,
|
||
|
but the first edition covered writing good regular expression
|
||
|
patterns in great detail.}
|
||
|
\end{seealso}
|
||
|
|
||
|
\subsection{Module Contents}
|
||
|
\nodename{Contents of Module reconvert}
|
||
|
|
||
|
The module defines two functions and a handful of constants.
|
||
|
|
||
|
\begin{funcdesc}{convert}{pattern\optional{, syntax=None}}
|
||
|
Convert a \var{pattern} representing a \module{regex}-stype regular
|
||
|
expression into a \module{re}-style regular expression. The optional
|
||
|
\var{syntax} parameter is a bitwise-or'd set of flags that control what
|
||
|
constructs are converted. See below for a description of the various
|
||
|
constants.
|
||
|
\end{funcdesc}
|
||
|
|
||
|
\begin{funcdesc}{quote}{s\optional{, quote=None}}
|
||
|
Convert a string object to a quoted string literal.
|
||
|
|
||
|
This is similar to \function{repr} but will return a "raw" string (r'...'
|
||
|
or r"...") when the string contains backslashes, instead of doubling all
|
||
|
backslashes. The resulting string does not always evaluate to the same
|
||
|
string as the original; however it will do just the right thing when passed
|
||
|
into re.compile().
|
||
|
|
||
|
The optional second argument forces the string quote; it must be a single
|
||
|
character which is a valid Python string quote. Note that prior to Python
|
||
|
2.5 this would not accept triple-quoted string delimiters.
|
||
|
\end{funcdesc}
|
||
|
|
||
|
\begin{datadesc}{RE_NO_BK_PARENS}
|
||
|
Suppress paren conversion. This should be omitted when converting
|
||
|
\code{sed}-style or \code{emacs}-style regular expressions.
|
||
|
\end{datadesc}
|
||
|
|
||
|
\begin{datadesc}{RE_NO_BK_VBAR}
|
||
|
Suppress vertical bar conversion. This should be omitted when converting
|
||
|
\code{sed}-style or \code{emacs}-style regular expressions.
|
||
|
\end{datadesc}
|
||
|
|
||
|
\begin{datadesc}{RE_BK_PLUS_QM}
|
||
|
Enable conversion of \code{+} and \code{?} characters. This should be
|
||
|
added to the \var{syntax} arg of \function{convert} when converting
|
||
|
\code{sed}-style regular expressions and omitted when converting
|
||
|
\code{emacs}-style regular expressions.
|
||
|
\end{datadesc}
|
||
|
|
||
|
\begin{datadesc}{RE_NEWLINE_OR}
|
||
|
When set, newline characters are replaced by \code{|}.
|
||
|
\end{datadesc}
|