Commit Graph

137 Commits

Author SHA1 Message Date
Serhiy Storchaka 345b390ed6
bpo-433030: Add support of atomic grouping in regular expressions (GH-31982)
* Atomic grouping: (?>...).
* Possessive quantifiers: x++, x*+, x?+, x{m,n}+.
  Equivalent to (?>x+), (?>x*), (?>x?), (?>x{m,n}).

Co-authored-by: Jeffrey C. Jacobs <timehorse@users.sourceforge.net>
2022-03-21 18:28:22 +02:00
Serhiy Storchaka 92a6abf72e
bpo-47066: Convert a warning about flags not at the start of the regular expression into error (GH-31994) 2022-03-19 16:10:44 +02:00
Serhiy Storchaka 4142961b9f
bpo-39394: Improve warning message in the re module (GH-31988)
A warning about inline flags not at the start of the regular
expression now contains the position of the flag.
2022-03-19 14:13:31 +02:00
yannvgn 9f55551f3d bpo-37723: Fix performance regression on regular expression parsing. (GH-15030)
Improve performance of sre_parse._uniq function.
2019-07-31 21:50:39 +03:00
Serhiy Storchaka e0c19ddc66
bpo-34681: Rename class Pattern in sre_parse to State. (GH-9310)
Also rename corresponding attributes, parameters and variables.
2018-09-18 09:16:26 +03:00
Victor Stinner 5e922658fb
bpo-34605: Avoid master/slave terms (GH-9101)
* Replace "master process" with "parent process"
* Replace "master option mappings" with "main option mappings"
* Replace "master pattern object" with "main pattern object"
* ssl: replace "master" with "server"
* And some other similar changes
2018-09-07 17:30:33 +02:00
Zhou Fangyi 5df5286abd bpo-30688: Import unicodedata only when needed. (GH-5606)
Importing unicodedata in sre_parse leads to failure in compilation.
unicodedata is unused during compilation, and is not compiled when this
file is imported. The error occurs when generating posix variables,
pprint is required. The dependency chain goes on like this:

sysconfig -> pprint -> re -> sre_compile -> sre_parse (this file)

This commits fixes compilation issues introduced by
2272cec13b.
(Issue 30688, GH-5588)
2018-02-10 08:59:29 +02:00
Serhiy Storchaka a445feb729
bpo-30688: Support \N{name} escapes in re patterns. (GH-5588)
Co-authored-by: Jonathan Eunice <jonathan.eunice@gmail.com>
2018-02-10 00:08:17 +02:00
Serhiy Storchaka 05cb728d68
bpo-30349: Raise FutureWarning for nested sets and set operations (#1553)
in regular expressions.
2017-11-16 12:38:26 +02:00
Serhiy Storchaka 3557b05c5a bpo-31690: Allow the inline flags "a", "L", and "u" to be used as group flags for RE. (#3885) 2017-10-24 23:31:42 +03:00
Roy Williams 171b9a354e bpo-30605: Fix compiling binary regexs with BytesWarnings enabled. (#2016)
Running our unit tests with `-bb` enabled triggered this failure.
2017-06-10 08:01:16 +03:00
Serhiy Storchaka c7ac7280c3 bpo-30375: Correct the stacklevel of regex compiling warnings. (#1595)
Warnings emitted when compile a regular expression now always point
to the line in the user code.  Previously they could point into inners
of the re module if emitted from inside of groups or conditionals.
2017-05-16 15:16:15 +03:00
Serhiy Storchaka 821a9d146b bpo-30340: Enhanced regular expressions optimization. (#1542)
This increased the performance of matching some patterns up to 25 times.
2017-05-14 08:32:33 +03:00
Serhiy Storchaka 305ccbe27e bpo-30298: Weaken the condition of deprecation warnings for inline modifiers. (#1490)
Now allowed several subsequential inline modifiers at the start of the
pattern (e.g. '(?i)(?s)...').  In verbose mode whitespaces and comments
now are allowed before and between inline modifiers (e.g.
'(?x) (?i) (?s)...').
2017-05-10 06:05:20 +03:00
Serhiy Storchaka 662cef66d7 Issue #25953: re.sub() now raises an error for invalid numerical group
reference in replacement template even if the pattern is not found in
the string.  Error message for invalid group reference now includes the
group index and the position of the reference.
Based on patch by SilentGhost.
2016-10-23 12:11:19 +03:00
Serhiy Storchaka abf275af58 Issue #22493: Warning message emitted by using inline flags in the middle of
regular expression now contains a (truncated) regex pattern.
Patch by Tim Graham.
2016-09-17 01:29:58 +03:00
Serhiy Storchaka bd48d27944 Issue #22493: Inline flags now should be used only at the start of the
regular expression.  Deprecation warning is emitted if uses them in the
middle of the regular expression.
2016-09-11 12:50:02 +03:00
Serhiy Storchaka d65cd091e9 Issue #28070: Fixed parsing inline verbose flag in regular expressions. 2016-09-11 01:39:01 +03:00
Serhiy Storchaka be9a4e5c85 Issue #433028: Added support of modifier spans in regular expressions. 2016-09-10 00:57:55 +03:00
Serhiy Storchaka 9bd85b83f6 Issue #27030: Unknown escapes consisting of ``'\'`` and ASCII letter in
regular expressions now are errors.
2016-06-11 19:15:00 +03:00
Serhiy Storchaka a01a144aab Issue #26475: Fixed debugging output for regular expressions with the (?x) flag. 2016-03-06 09:15:47 +02:00
Serhiy Storchaka b5d0a21553 Issue #25554: Got rid of circular references in regular expression parsing. 2015-11-05 17:49:26 +02:00
Serhiy Storchaka 485407ce1e Issue #24580: Symbolic group references to open group in re patterns now are
explicitly forbidden as well as numeric group references.
2015-07-18 23:27:00 +03:00
Serhiy Storchaka 07360df481 Issue #14260: The groupindex attribute of regular expression pattern object
now is non-modifiable mapping.
2015-03-30 01:01:48 +03:00
Serhiy Storchaka 632a77e6a3 Issue #22364: Improved some re error messages using regex for hints. 2015-03-25 21:03:47 +02:00
Serhiy Storchaka 15fa1c4ade Fixed using deprecated escaping in regular expression in _strptime.py (issue23622). 2015-03-25 01:21:50 +02:00
Serhiy Storchaka a54aae0683 Issue #23622: Unknown escapes in regular expressions that consist of ``'\'``
and ASCII letter now raise a deprecation warning and will be forbidden in
Python 3.6.
2015-03-24 22:58:14 +02:00
Serhiy Storchaka 4eea62fd2e Issues #814253, #9179: Group references and conditional group references now
work in lookbehind assertions in regular expressions.
2015-02-21 10:07:35 +02:00
Serhiy Storchaka 22a309a434 Issue #21032: Deprecated the use of re.LOCALE flag with str patterns or
re.ASCII. It was newer worked.
2014-12-01 11:50:07 +02:00
Benjamin Peterson 16e802f4ae merge 3.4 (#9179) 2014-11-30 11:51:16 -05:00
Benjamin Peterson 66323415c7 backout 9fcf4008b626 (#9179) for further consideration 2014-11-30 11:49:00 -05:00
Serhiy Storchaka ab14088141 Minor code clean up and improvements in the re module. 2014-11-11 21:13:28 +02:00
Serhiy Storchaka 1b2004f059 Fixed error position for the backslash at the end of regex pattern. 2014-11-10 18:28:53 +02:00
Serhiy Storchaka b99c132bd9 Fixed AttributeError when the regular expression starts from illegal escape. 2014-11-10 14:38:16 +02:00
Serhiy Storchaka ad446d57a9 Issue #22578: Added attributes to the re.error class. 2014-11-10 13:49:00 +02:00
Serhiy Storchaka 5f33677219 Merge heads 2014-11-10 10:21:03 +02:00
Raymond Hettinger df1b699447 Issue #22823: Use set literals instead of creating a set from a list 2014-11-09 15:56:33 -08:00
Serhiy Storchaka c7f7d3897e Issue #22434: Constants in sre_constants are now named constants (enum-like). 2014-11-09 20:48:36 +02:00
Serhiy Storchaka 6276b32799 Issues #814253, #9179: Group references and conditional group references now
work in lookbehind assertions in regular expressions.
2014-11-07 21:45:17 +02:00
Serhiy Storchaka 84df7fe6a2 Issues #814253, #9179: Group references and conditional group references now
work in lookbehind assertions in regular expressions.
2014-11-07 21:43:57 +02:00
Serhiy Storchaka e2ccf5608c Issue #19380: Optimized parsing of regular expressions. 2014-10-10 11:14:49 +03:00
Serhiy Storchaka 7438e4b56f Issue 1519638: Now unmatched groups are replaced with empty strings in re.sub()
and re.subn().
2014-10-10 11:06:31 +03:00
Serhiy Storchaka 9baa5b2de2 Issue #22437: Number of capturing groups in regular expression is no longer
limited by 100.
2014-09-29 22:49:23 +03:00
Serhiy Storchaka c563caf3a2 Issue #22362: Forbidden ambiguous octal escapes out of range 0-0o377 in
regular expressions.
2014-09-23 23:22:41 +03:00
Serhiy Storchaka 44dae8bde3 Issue #22423: Fixed debugging output of the GROUPREF_EXISTS opcode in the re
module.
2014-09-21 22:47:55 +03:00
Raymond Hettinger 1c99bc84bd Issue #8343: Named group error msgs did not show the group name. 2014-06-22 19:47:22 -07:00
Victor Stinner 7fa767e517 Issue #20976: pyflakes: Remove unused imports 2014-03-20 09:16:38 +01:00
Serhiy Storchaka 9c15ec1ce6 Issue #19365: Optimized the parsing of long replacement string in re.sub*()
functions.
2013-10-23 22:27:52 +03:00
Serhiy Storchaka 9d96542b6d Issue #18647: Correctly bound calculated min/max width of a subexpression.
Now max width is MAXREPEAT on 32- and 64-bit platforms when one of
subexpressions is unbounded repetition.
2013-08-19 22:50:54 +03:00
R David Murray 26dfaac9ac #17341: Include name in re error message about invalid group name.
Patch by Jason Michalski.
2013-04-14 13:00:54 -04:00