Update documentation for re library to explain that a backreference `\g<0>` is
expanded to the entire string when using Match.expand().
Note that numeric backreferences to group 0 (`\0`) are not supported.
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
The GH-93000 change set inadvertently caused a sentence in re.compile()
documentation to refer to details that no longer followed. Correct this
with a link to the Flags sub-subsection.
Co-authored-by: Adam Turner <9087854+aa-turner@users.noreply.github.com>
Renamed re.error for clarity, and kept re.error for backward compatibility.
Updated idlelib files at TJR's request.
---------
Co-authored-by: Matthias Bussonnier <mbussonnier@ucmerced.edu>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Deprecate passing optional arguments maxsplit, count and flags in
module-level functions re.split(), re.sub() and re.subn() as positional.
They should only be passed by keyword.
A backslash-character pair that is not a valid escape sequence now
generates a SyntaxWarning, instead of DeprecationWarning. For
example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an
invalid escape sequence), use raw strings for regular expression:
re.compile(r"\d+\.\d+"). In a future Python version, SyntaxError will
eventually be raised, instead of SyntaxWarning.
Octal escapes with value larger than 0o377 (ex: "\477"), deprecated
in Python 3.11, now produce a SyntaxWarning, instead of
DeprecationWarning. In a future Python version they will be
eventually a SyntaxError.
codecs.escape_decode() and codecs.unicode_escape_decode() are left
unchanged: they still emit DeprecationWarning.
* The parser only emits SyntaxWarning for Python 3.12 (feature
version), and still emits DeprecationWarning on older Python
versions.
* Fix SyntaxWarning by using raw strings in Tools/c-analyzer/ and
wasm_build.py.
In very rare circumstances the JUMP opcode could be confused with the
argument of the opcode in the "then" part which doesn't end with the
JUMP opcode. This led to incorrect detection of the final JUMP opcode
and incorrect calculation of the size of the subexpression.
NOTE: Changed return value of functions _validate_inner() and
_validate_charset() in Modules/_sre/sre.c. Now they return 0 on success,
-1 on failure, and 1 if the last op is JUMP (which usually is a failure).
Previously they returned 1 on success and 0 on failure.
The current re.VERBOSE documentation example leaves space for ambiguous
interpretation. One may read that spaces within the `(?:` token are
spaces inside the non-capturing group (such as `(?: )`). This patch
removes the ambiguity by including examples after the statement.
Only sequence of ASCII digits is now accepted as a numerical reference.
The group name in bytes patterns and replacement strings can now only
contain ASCII letters and digits and underscore.
Only sequence of ASCII digits will be accepted as a numerical reference.
The group name in bytes patterns and replacement strings could only
contain ASCII letters and digits and underscore.
Prior to 3.7, re.escape escaped many characters that don't have
special meaning in Python, but that use to require escaping in other
tools and languages. This commit aims to make it clear which characters
were, but are no longer escaped.
Need to reset capturing groups between two SRE(match) callings in loops, this fixes wrong capturing groups in rare cases.
Also add a missing index in re.rst.
1) Convert weird field name "typ" to the more standard "type".
2) For the NUMBER type, convert the value to an int() or float().
3) Simplify ``group(kind)`` to the shorter and faster ``group()`` call.
4) Simplify logic go a single if-elif chain to make this easier to extend.
5) Reorder the tests to match the order the tokens are specified.
This isn't necessary for correctness but does make the example
easier to follow.
6) Move the "column" calculation before the if-elif chain so that
users have the option of using this value in error messages.