224 Commits

Author SHA1 Message Date
Serhiy Storchaka 5619ab926b Issue #12728: Different Unicode characters having the same uppercase but
different lowercase are now matched in case-insensitive regular expressions.
2014-11-10 12:43:14 +02:00
Serhiy Storchaka 0c938f6d24 Issue #12728: Different Unicode characters having the same uppercase but
different lowercase are now matched in case-insensitive regular expressions.
2014-11-10 12:37:16 +02:00
Serhiy Storchaka c7f7d3897e Issue #22434: Constants in sre_constants are now named constants (enum-like). 2014-11-09 20:48:36 +02:00
Serhiy Storchaka 6276b32799 Issues #814253, #9179: Group references and conditional group references now
work in lookbehind assertions in regular expressions.
2014-11-07 21:45:17 +02:00
Serhiy Storchaka 84df7fe6a2 Issues #814253, #9179: Group references and conditional group references now
work in lookbehind assertions in regular expressions.
2014-11-07 21:43:57 +02:00
Serhiy Storchaka 4b8f8949b4 Issue #17381: Fixed handling of case-insensitive ranges in regular expressions.
Added new opcode RANGE_IGNORE.
2014-10-31 12:36:56 +02:00
Serhiy Storchaka 7cc0a1f7cb Issue #22410: Module level functions in the re module now cache compiled
locale-dependent regular expressions taking into account the locale.
2014-10-31 00:56:45 +02:00
Serhiy Storchaka 4659cc0756 Issue #22410: Module level functions in the re module now cache compiled
locale-dependent regular expressions taking into account the locale.
2014-10-31 00:53:49 +02:00
Victor Stinner 55e614a2a8 Issue #11957: Explicit parameter name when calling re.split() and re.sub() 2014-10-29 16:58:59 +01:00
Serhiy Storchaka 7438e4b56f Issue 1519638: Now unmatched groups are replaced with empty strings in re.sub()
and re.subn().
2014-10-10 11:06:31 +03:00
Serhiy Storchaka 9baa5b2de2 Issue #22437: Number of capturing groups in regular expression is no longer
limited by 100.
2014-09-29 22:49:23 +03:00
Serhiy Storchaka c563caf3a2 Issue #22362: Forbidden ambiguous octal escapes out of range 0-0o377 in
regular expressions.
2014-09-23 23:22:41 +03:00
Serhiy Storchaka cd9032d45b Fixed bytes literals in tests. 2014-09-23 23:04:21 +03:00
Serhiy Storchaka 44dae8bde3 Issue #22423: Fixed debugging output of the GROUPREF_EXISTS opcode in the re
module.
2014-09-21 22:47:55 +03:00
Serhiy Storchaka b1847e7541 Issue #17381: Fixed handling of case-insensitive ranges in regular expressions. 2014-10-31 12:37:50 +02:00
Serhiy Storchaka b85a97600a Restored re pickling test. 2014-09-15 11:33:19 +03:00
Serhiy Storchaka d9cf65f00e Use more appropriate asserts in re tests. 2014-09-14 16:20:20 +03:00
Serhiy Storchaka a25875cfd0 Fixed re tests incorrectly ported from 2.x to 3.x. 2014-09-14 15:56:27 +03:00
Serhiy Storchaka 429b59ec69 Issue #20998: Fixed re.fullmatch() of repeated single character pattern
with ignore case.  Original patch by Matthew Barnett.
2014-05-14 21:48:17 +03:00
Serhiy Storchaka a537eb45fd Issue #20283: RE pattern methods now accept the string keyword parameters
as documented.  The pattern and source keyword parameters are left as
deprecated aliases.
2014-03-06 11:36:15 +02:00
Serhiy Storchaka ccdf352370 Issue #20283: RE pattern methods now accept the string keyword parameters
as documented.  The pattern and source keyword parameters are left as
deprecated aliases.
2014-03-06 11:28:32 +02:00
Antoine Pitrou c49672f25e Issue #20426: When passing the re.DEBUG flag, re.compile() displays the debug output every time it is called, regardless of the compilation cache. 2014-02-03 21:01:35 +01:00
Antoine Pitrou d2cc743ca4 Issue #20426: When passing the re.DEBUG flag, re.compile() displays the debug output every time it is called, regardless of the compilation cache. 2014-02-03 20:59:59 +01:00
Serhiy Storchaka 32eddc1bbc Issue #16203: Add re.fullmatch() function and regex.fullmatch() method,
which anchor the pattern at both ends of the string to match.

Original patch by Matthew Barnett.
2013-11-23 23:20:30 +02:00
Serhiy Storchaka 5c24d0e504 Issue #13592: Improved the repr for regular expression pattern objects.
Based on patch by Hugo Lopes Tavares.
2013-11-23 22:42:43 +02:00
Serhiy Storchaka 9eabac68a3 Issue #18685: Restore re performance to pre-PEP 393 levels. 2013-10-26 10:45:48 +03:00
Antoine Pitrou 79aa68dfc1 Issue #19387: explain and test the sre overlap table 2013-10-25 21:36:10 +02:00
Serhiy Storchaka 8b150ecfc9 Issue #19327: Fixed the working of regular expressions with too big charset. 2013-10-24 22:04:37 +03:00
Serhiy Storchaka be80fc9a84 Issue #19327: Fixed the working of regular expressions with too big charset. 2013-10-24 22:02:58 +03:00
Serhiy Storchaka 36af10c1f7 Issue #17087: Improved the repr for regular expression match objects. 2013-10-20 13:13:31 +03:00
Serhiy Storchaka 25324971fb Issue #18468: The re.split, re.findall, and re.sub functions and the group()
and groups() methods of match object now always return a string or a bytes
object.
2013-10-16 12:46:28 +03:00
Georg Brandl daa1fa991c Back out accidentally pushed changeset b51218966201. 2013-10-13 09:32:59 +02:00
Georg Brandl 4300019e1a Add re.fullmatch() function and regex.fullmatch() method, which anchor the
pattern at both ends of the string to match.

Patch by Matthew Barnett.
Closes #16203.
2013-10-13 09:18:45 +02:00
Serhiy Storchaka 98985a1980 Issue #2537: Remove breaked check which prevented valid regular expressions.
Patch by Meador Inge.

See also issue #18647.
2013-08-19 23:18:23 +03:00
Serhiy Storchaka 1f35ae0a3c Issue #17998: Fix an internal error in regular expression engine. 2013-08-03 19:18:38 +03:00
R David Murray 26dfaac9ac #17341: Include name in re error message about invalid group name.
Patch by Jason Michalski.
2013-04-14 13:00:54 -04:00
Georg Brandl 1d472b74cb Closes #14462: allow any valid Python identifier in sre group names, as documented. 2013-04-14 11:40:00 +02:00
Ezio Melotti eadece2865 #12749: add a test for non-BMP ranges in character classes. 2013-02-23 08:40:07 +02:00
Serhiy Storchaka b0c75a7dec Issue #9669: Protect re against infinite loops on zero-width matching in
non-greedy repeat.  Patch by Matthew Barnett.
2013-02-16 21:25:05 +02:00
Serhiy Storchaka fa46816915 Issue #9669: Protect re against infinite loops on zero-width matching in
non-greedy repeat.  Patch by Matthew Barnett.
2013-02-16 21:23:53 +02:00
Serhiy Storchaka a0eb809995 Issue #13169: The maximal repetition number in a regular expression has been
increased from 65534 to 2147483647 (on 32-bit platform) or 4294967294 (on
64-bit).
2013-02-16 16:54:33 +02:00
Serhiy Storchaka 70ca0210e8 Issue #13169: The maximal repetition number in a regular expression has been
increased from 65534 to 2147483647 (on 32-bit platform) or 4294967294 (on
64-bit).
2013-02-16 16:47:47 +02:00
Ezio Melotti adfbb8e8ec #13899: merge with 3.2. 2013-01-11 08:43:53 +02:00
Ezio Melotti fe8e6e7414 #13899: \A, \Z, and \B now correctly match the A, Z, and B literals when used inside character classes (e.g. [A]). Patch by Matthew Barnett. 2013-01-11 08:32:01 +02:00
Serhiy Storchaka c1b59d4552 Issue #16688: Fix backreferences did make case-insensitive regex fail on non-ASCII strings.
Patch by Matthew Barnett.
2012-12-29 23:38:48 +02:00
Antoine Pitrou 56a2ae27e3 Fix test splitting in previous commit. 2012-12-03 21:09:08 +01:00
Antoine Pitrou 86067c2e17 Fix test splitting in previous commit. 2012-12-03 21:08:43 +01:00
Antoine Pitrou b33941ab02 Split the bigmem re test in two separate tests with different memory requirements. 2012-12-03 20:55:56 +01:00
Antoine Pitrou 1f1888ec1e Split the bigmem re test in two separate tests with different memory requirements. 2012-12-03 20:53:12 +01:00
Antoine Pitrou 9a2b26748b Issue #10182: The re module doesn't truncate indices to 32 bits anymore.
Patch by Serhiy Storchaka.
2012-12-02 12:54:28 +01:00
Antoine Pitrou 43fb54cd4f Issue #10182: The re module doesn't truncate indices to 32 bits anymore.
Patch by Serhiy Storchaka.
2012-12-02 12:52:36 +01:00
Antoine Pitrou a34412a992 Merge test from issue #1160. 2012-11-20 22:35:53 +01:00
Antoine Pitrou 39bdad813a Issue #1160: Fix compiling large regular expressions on UCS2 builds.
Patch by Serhiy Storchaka.
2012-11-20 22:30:42 +01:00
Ezio Melotti 68600aff3a #12759: merge with 3.2. 2012-11-03 20:33:38 +02:00
Ezio Melotti 0941d9fc64 #12759: sre_parse now raises a proper error when the name of the group is missing. Initial patch by Serhiy Storchaka. 2012-11-03 20:33:08 +02:00
Antoine Pitrou 463badf06c Issue #3665: \u and \U escapes are now supported in unicode regular expressions.
Patch by Serhiy Storchaka.
2012-06-23 13:29:19 +02:00
Sean Reifschneider 7b3c975aaf closes #14259 re.finditer() now takes keyword arguments: pos, endpos.
Contrary to the documentation, finditer() did not take pos and endpos
keyword arguments.
2012-03-12 18:22:38 -06:00
Ezio Melotti cc50ba26bd #14179: merge with 3.2. 2012-03-13 01:33:30 +02:00
Ezio Melotti df723e1e5e #14179: add tests for re.compile. Patch by Florian Mladitsch. 2012-03-13 01:29:48 +02:00
Benjamin Peterson 33d21a24fa merge 3.2 (#14212) 2012-03-07 14:59:13 -06:00
Benjamin Peterson e48944b69c keep the buffer object around while we're using it (closes #14212) 2012-03-07 14:50:25 -06:00
Ezio Melotti 0b8123d8ae #10713: merge with 3.2. 2012-02-29 11:49:45 +02:00
Ezio Melotti 5a045b9f54 #10713: Improve documentation for \b and \B and add a few tests. Initial patch and tests by Martin Pool. 2012-02-29 11:48:44 +02:00
Martin v. Löwis d63a3b8beb Implement PEP 393. 2011-09-28 07:41:54 +02:00
Ezio Melotti 88fdeb45ef #2650: re.escape() no longer escapes the "_". 2011-04-10 12:59:16 +03:00
Ezio Melotti 213eb96902 #2650: Merge with 3.1. 2011-03-25 14:25:36 +02:00
Ezio Melotti 7b9e97b487 #2650: Add tests with non-ascii chars for re.escape. 2011-03-25 14:09:33 +02:00
Ezio Melotti d2114ebd97 #2650: Refactor the tests for re.escape. 2011-03-25 14:08:44 +02:00
Ezio Melotti 4969f709cc #11515: Merge with 3.1. 2011-03-15 05:59:46 +02:00
Ezio Melotti 42da663e6f #11515: fix several typos. Patch by Piotr Kasprzyk. 2011-03-15 05:18:48 +02:00
Antoine Pitrou 3060c4573f Reapply r83877. 2010-08-13 16:27:38 +00:00
Antoine Pitrou aba74bddd6 Revert r83877 in order to fix compilation 2010-08-09 10:47:46 +00:00
Senthil Kumaran 9f347ea545 reapply the revert made in r83875
Now the _collections is statically built, the build dependencies are in proper
order and build works fine.

Commit Log from r83874:
Issue 9396.   Apply functools.lru_cache in the place of the
random flushing cache in the re module.
2010-08-09 07:30:53 +00:00
Raymond Hettinger 31022301b5 Revert 83784 adding functools.lru_cache() to the re module.
The problem is that the re module is imported by sysconfig
and re needs functools which uses collections.OrderedDict()
but the _collectionsmodule.c code is not yet constructed
at this point in the build.

The likely best solution will be to include _collections
as part of the static build before the rest of the
boot-strapping.
2010-08-09 05:56:50 +00:00
Raymond Hettinger 4f859ed9c7 Issue 9396. Apply functools.lru_cache in the place of the
random flushing cache in the re module.
2010-08-09 04:24:42 +00:00
Gregory P. Smith 5a63183a8b The default size of the re module's compiled regular expression cache has
been increased from 100 to 500 and the cache replacement policy has changed
from simply clearing the entire cache on overflow to randomly forgetting 20%
of the existing cached compiled regular expressions.  This is a performance
win for applications that use a lot of regular expressions and limits the
impact of the performance hit anytime the cache is exceeded.
2010-07-27 05:31:29 +00:00
Georg Brandl 1b37e8728c Merged revisions 78093 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78093 | georg.brandl | 2010-02-07 18:03:15 +0100 (So, 07 Feb 2010) | 1 line

  Remove unused imports in test modules.
........
2010-03-14 10:45:50 +00:00
Ezio Melotti dab886ab0f Merged revisions 78729 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r78729 | ezio.melotti | 2010-03-06 17:24:08 +0200 (Sat, 06 Mar 2010) | 1 line

  #6509: fix re.sub to work properly when the pattern, the string, and the replacement were all bytes. Patch by Antoine Pitrou.
........
2010-03-06 15:27:04 +00:00
Ezio Melotti b92ed7cf36 #6509: fix re.sub to work properly when the pattern, the string, and the replacement were all bytes. Patch by Antoine Pitrou. 2010-03-06 15:24:08 +00:00
Victor Stinner 26c966b91d Merged revisions 78664 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r78664 | victor.stinner | 2010-03-04 22:59:53 +0100 (jeu., 04 mars 2010) | 3 lines

  Issue #3299: replace PyObject_DEL() by Py_DECREF() in _sre module to fix a
  crash in pydebug mode.
........
2010-03-04 22:01:47 +00:00
Victor Stinner 5abeafbb0f Issue #3299: replace PyObject_DEL() by Py_DECREF() in _sre module to fix a
crash in pydebug mode.
2010-03-04 21:59:53 +00:00
Ezio Melotti 68338cd63f Merged revisions 77708 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r77708 | ezio.melotti | 2010-01-23 12:49:39 +0200 (Sat, 23 Jan 2010) | 9 lines

  Merged revisions 77706 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r77706 | ezio.melotti | 2010-01-23 12:43:05 +0200 (Sat, 23 Jan 2010) | 1 line

    Increased the overflow value on test_dealloc to make sure that it is big enough even for wide builds.
  ........
................
2010-01-23 10:54:37 +00:00
Ezio Melotti 0f77f465ff Merged revisions 77706 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r77706 | ezio.melotti | 2010-01-23 12:43:05 +0200 (Sat, 23 Jan 2010) | 1 line

  Increased the overflow value on test_dealloc to make sure that it is big enough even for wide builds.
........
2010-01-23 10:49:39 +00:00
Antoine Pitrou 0560e8a8f8 Merged revisions 77501 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r77501 | antoine.pitrou | 2010-01-14 18:34:48 +0100 (jeu., 14 janv. 2010) | 10 lines

  Merged revisions 77499 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r77499 | antoine.pitrou | 2010-01-14 18:25:24 +0100 (jeu., 14 janv. 2010) | 4 lines

    Issue #3299: Fix possible crash in the _sre module when given bad
    argument values in debug mode.  Patch by Victor Stinner.
  ........
................
2010-01-14 17:37:24 +00:00
Antoine Pitrou 82feb1f360 Merged revisions 77499 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r77499 | antoine.pitrou | 2010-01-14 18:25:24 +0100 (jeu., 14 janv. 2010) | 4 lines

  Issue #3299: Fix possible crash in the _sre module when given bad
  argument values in debug mode.  Patch by Victor Stinner.
........
2010-01-14 17:34:48 +00:00
Georg Brandl ab91fdef1f Merged revisions 73715 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r73715 | benjamin.peterson | 2009-07-01 01:06:06 +0200 (Mi, 01 Jul 2009) | 1 line

  convert old fail* assertions to assert*
........
2009-08-13 08:51:18 +00:00
Mark Dickinson 1f268285ff Issue #6561: '\d' in a regular expression should match only Unicode
character category [Nd],  not [No].
2009-07-28 17:22:36 +00:00
R. David Murray d98ef1a1ac Merged revisions 74118 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r74118 | r.david.murray | 2009-07-20 13:34:54 -0400 (Mon, 20 Jul 2009) | 5 lines

  Remove apparently unneeded and un-cleaned-up munging of sys.path from
  test_re.  Tests pass on my machine without it, and I can't see
  any obvious place in the tests that would need it.
........
2009-07-21 14:03:55 +00:00
R. David Murray d33396c22b Remove apparently unneeded and un-cleaned-up munging of sys.path from
test_re.  Tests pass on my machine without it, and I can't see
any obvious place in the tests that would need it.
2009-07-20 17:34:54 +00:00
Benjamin Peterson c9c0f201fe convert old fail* assertions to assert* 2009-06-30 23:06:06 +00:00
Guido van Rossum 698280df7c Issue #3756: make re.escape() handle bytes as well as str.
Patch by Andrew McNamara, reviewed and tweaked by myself.
2008-09-10 17:44:35 +00:00
Guido van Rossum 92f8f3e013 Merged revisions 66364 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r66364 | guido.van.rossum | 2008-09-10 07:27:00 -0700 (Wed, 10 Sep 2008) | 3 lines

  Issue #3629: Fix sre "bytecode" validator for an end case.
  Reviewed by Amaury.
........
2008-09-10 14:30:50 +00:00
Brett Cannon 1cd0247a4d Merged revisions 66321 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r66321 | brett.cannon | 2008-09-08 17:49:16 -0700 (Mon, 08 Sep 2008) | 7 lines

  warnings.catch_warnings() now returns a list or None instead of the custom
  WarningsRecorder object. This makes the API simpler to use as no special object
  must be learned.

  Closes issue 3781.
  Review by Benjamin Peterson.
........
2008-09-09 01:52:27 +00:00
Benjamin Peterson a786b026c9 Merged revisions 65910,65977,65980,65984,65986,66000,66011-66012,66014,66017,66020 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r65910 | benjamin.peterson | 2008-08-20 09:07:59 -0500 (Wed, 20 Aug 2008) | 1 line

  fix up the multiprocessing docs a little
........
  r65977 | christian.heimes | 2008-08-22 14:47:25 -0500 (Fri, 22 Aug 2008) | 3 lines

  Silenced compiler warning
  Objects/stringlib/find.h:97: warning: 'stringlib_contains_obj' defined but not used
  Reviewed by Benjamin Peterson
........
  r65980 | christian.heimes | 2008-08-22 15:10:27 -0500 (Fri, 22 Aug 2008) | 3 lines

  Fixed two format strings in the _collections module. For example
  Modules/_collectionsmodule.c:674: warning: format '%i' expects type 'int', but argument 2 has type 'Py_ssize_t'
  Reviewed by Benjamin Peterson
........
  r65984 | christian.heimes | 2008-08-22 16:23:47 -0500 (Fri, 22 Aug 2008) | 1 line

  d is the correct format string
........
  r65986 | mark.hammond | 2008-08-22 19:59:14 -0500 (Fri, 22 Aug 2008) | 2 lines

  Fix bug 3625: test issues on 64bit windows. r=pitrou
........
  r66000 | benjamin.peterson | 2008-08-23 15:27:43 -0500 (Sat, 23 Aug 2008) | 5 lines

  #3643 add a few more checks to _testcapi to prevent segfaults

  Author: Victor Stinner
  Reviewer: Benjamin Peterson
........
  r66011 | neal.norwitz | 2008-08-24 12:27:43 -0500 (Sun, 24 Aug 2008) | 1 line

  Ignore a couple more tests that report leaks inconsistently.
........
  r66012 | neal.norwitz | 2008-08-24 12:29:53 -0500 (Sun, 24 Aug 2008) | 1 line

  Use the actual blacklist of leaky tests
........
  r66014 | georg.brandl | 2008-08-24 13:11:07 -0500 (Sun, 24 Aug 2008) | 2 lines

  #3654: fix duplicate test method name. Review by Benjamin P.
........
  r66017 | benjamin.peterson | 2008-08-24 16:55:03 -0500 (Sun, 24 Aug 2008) | 1 line

  remove note about unimplemented feature
........
  r66020 | brett.cannon | 2008-08-24 18:15:19 -0500 (Sun, 24 Aug 2008) | 1 line

  Clarify that some attributes/methods are listed somewhat separately because they are not part of the threading API.
........
2008-08-25 21:05:21 +00:00
Antoine Pitrou fd036451bf #2834: Change re module semantics, so that str and bytes mixing is forbidden,
and str (unicode) patterns get full unicode matching by default. The re.ASCII
flag is also introduced to ask for ASCII matching instead.
2008-08-19 17:56:33 +00:00
Antoine Pitrou 22628c4d6a #3231: re.compile fails with some bytes patterns 2008-07-22 17:53:22 +00:00
Amaury Forgeot d'Arc e43d33a4db #3247 Get rid of Py_FindMethod; use tp_members instead.
Otherwise dir(_sre.SRE_Match) returns an empty list.

First step: handle most occurrences, remove tp_getattr and fill the tp_methods and tp_members slots.
Add some test about attribute access.
2008-07-02 20:50:16 +00:00
Benjamin Peterson ee8712cda4 #2621 rename test.test_support to test.support 2008-05-20 21:35:26 +00:00
Brett Cannon cf3520c2bf Remove the sre module. 2008-05-11 22:47:58 +00:00
Thomas Wouters 40a088dc27 Fix 're' to work on bytes. It could do with a few more tests, though. 2008-03-18 20:19:54 +00:00
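
Several of the behavior changes recorded in this log can be checked from Python directly. The sketch below is illustrative only and is not taken from any of the commits above. It assumes a Python 3 interpreter at version 3.5 or later, and the helper name demo() and the result keys are hypothetical. It exercises re.fullmatch() (issue #16203), unmatched groups in re.sub() (issue 1519638), the ban on mixing str and bytes together with the re.ASCII flag (#2834), the \d digit category (issue #6561), and re.escape() leaving "_" alone (#2650).

```python
import re


def demo():
    """Collect the observable results of a few changes named in the log.

    The function name and the dictionary keys are hypothetical; only the
    re module calls are real.
    """
    results = {}

    # Issue #16203: fullmatch() succeeds only if the whole string matches.
    results["fullmatch_whole"] = re.fullmatch(r"\d+", "123") is not None
    results["fullmatch_partial"] = re.fullmatch(r"\d+", "123abc") is not None

    # Issue 1519638: a group that did not take part in the match is
    # replaced with an empty string in the replacement template.
    results["sub_unmatched"] = re.sub(r"(a)|(b)", r"[\1\2]", "ab")

    # #2834: a bytes pattern cannot be applied to a str subject.
    try:
        re.search(b"a", "abc")
        results["mixing_rejected"] = False
    except TypeError:
        results["mixing_rejected"] = True

    # #2834: str patterns use Unicode matching unless re.ASCII is given.
    results["unicode_word"] = re.fullmatch(r"\w+", "caf\u00e9") is not None
    results["ascii_word"] = (
        re.fullmatch(r"\w+", "caf\u00e9", re.ASCII) is not None
    )

    # Issue #6561: \d matches category Nd (ARABIC-INDIC DIGIT THREE)
    # but not category No (SUPERSCRIPT TWO).
    results["digit_nd"] = re.fullmatch(r"\d", "\u0663") is not None
    results["digit_no"] = re.fullmatch(r"\d", "\u00b2") is not None

    # #2650: re.escape() leaves the underscore unescaped.
    results["escape_underscore"] = re.escape("a_b")

    return results


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value)
```

Behavior on interpreters older than the commits in question will differ, which is the point of the entries above.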