cpython

Commit Graph

Author	SHA1	Message	Date
Andrew M. Kuchling	c24fe36c57	Allow _sre.c to compile with Python 2.2	2003-04-30 13:09:08 +00:00
Gustavo Niemeyer	caf1c9dfe7	- Included detailed documentation in _sre.c explaining how, when, and why to use LASTMARK_SAVE()/LASTMARK_RESTORE(), based on the discussion in patch #712900. - Cleaned up LASTMARK_SAVE()/LASTMARK_RESTORE() usage, based on the established rules. - Moved the upper part of the just commited patch (relative to bug #725106) to outside the for() loop of BRANCH OP. There's no need to mark_save() in every loop iteration.	2003-04-27 14:42:54 +00:00
Gustavo Niemeyer	3646ab98af	Fix for part of the problem mentioned in #725149 by Greg Chapman. This problem is related to a wrong behavior from mark_save/restore(), which don't restore the mark_stack_base before restoring the marks. Greg's suggestion was to change the asserts, which happen to be the only recursive ops that can continue the loop, but the problem would happen to any operation with the same behavior. So, rather than hardcoding this into asserts, I have changed mark_save/restore() to always restore the stackbase before restoring the marks. Both solutions should fix these two cases, presented by Greg: >>> re.match('(a)(?:(?=(b))c)', 'abb').groups() ('b', None) >>> re.match('(a)((?!(b)))', 'abb').groups() ('b', None, None) The rest of the bug and patch in #725149 must be discussed further.	2003-04-27 13:25:21 +00:00
Gustavo Niemeyer	c34f2555bd	Applied patch #725106, by Greg Chapman, fixing capturing groups within repeats of alternatives. The only change to the original patch was to convert the tests to the new test_re.py file. This patch fixes cases like: >>> re.match('((a)|b)*', 'abc').groups() ('b', '') Which is wrong (it's impossible to match the empty string), and incompatible with other regex systems, like the following examples show: % perl -e '"abc" =~ /^((a)|b)*/; print "$1 $2\n";' b a % echo "abc" | sed -r -e "s/^((a)|b)*/\1 \2|/" b a|c	2003-04-27 12:34:14 +00:00
Gustavo Niemeyer	c23fb77477	Applying patch #726869 by Andrew I MacIntyre, reducing in _sre.c the recursion limit for certain setups of FreeBSD and OS/2.	2003-04-27 06:58:54 +00:00
Gustavo Niemeyer	3c9068bbec	Made MAX_UNTIL/MIN_UNTIL code more coherent about mark protection, accordingly to further discussions with Greg Chapman in patch #712900.	2003-04-22 15:39:09 +00:00
Gustavo Niemeyer	be733ee7fb	More work on bug #672491 and patch #712900. I've applied a modified version of Greg Chapman's patch. I've included the fixes without introducing the reorganization mentioned, for the sake of stability. Also, the second fix mentioned in the patch don't fix the mentioned problem anymore, because of the change introduced by patch #720991 (by Greg as well). The new fix wasn't complicated though, and is included as well. As a note. It seems that there are other places that require the "protection" of LASTMARK_SAVE()/LASTMARK_RESTORE(), and are just waiting for someone to find how to break them. Particularly, I belive that every recursion of SRE_MATCH() should be protected by these macros. I won't do that right now since I'm not completely sure about this, and we don't have much time for testing until the next release.	2003-04-20 07:35:44 +00:00
Gustavo Niemeyer	1aca359e89	- Fixed bug #672491. This change restores the behavior of lastindex/lastgroup to be compliant with previous python versions, by backing out the changes made in revision 2.84 which affected this. The bugfix for backtracking is still maintained.	2003-04-20 00:45:13 +00:00
Martin v. Löwis	78e2f06cc6	Fully support 32-bit codes. Enable BIGCHARSET in UCS-4 builds.	2003-04-19 12:56:08 +00:00
Guido van Rossum	41c99e7f96	SF patch #720991 by Gary Herron: A small fix for bug #545855 and Greg Chapman's addition of op code SRE_OP_MIN_REPEAT_ONE for eliminating recursion on simple uses of pattern '*?' on a long string.	2003-04-14 17:59:34 +00:00
Fredrik Lundh	09705f0b89	fix for SF #635398 (don't "downcast" return strings from unicode to ascii)	2002-11-22 12:46:35 +00:00
Neal Norwitz	addfe0c09c	Make private functions static so we don't pollute the namespace	2002-11-10 14:33:26 +00:00
Gustavo Niemeyer	c523b04b0f	Fixed sre bug "[#581080] Provoking infinite scanner loops". This bug happened because: 1) the scanner_search and scanner_match methods were not checking the buffer limits before increasing the current pointer; and 2) SRE_SEARCH was using "if (ptr == end)" as a loop break, instead of "if (ptr >= end)". * Modules/_sre.c (SRE_SEARCH): Check for "ptr >= end" to break loops, so that we don't hang forever if a pointer passing the buffer limit is used. (scanner_search,scanner_match): Don't increment the current pointer if we're going to pass the buffer limit. * Misc/NEWS Mention the fix.	2002-11-07 03:28:56 +00:00
Gustavo Niemeyer	4e7be06a65	Fixed bug #470582, using a modified version of patch #527371, from Greg Chapman. * Modules/_sre.c (lastmark_restore): New function, implementing algorithm to restore a state to a given lastmark. In addition to the similar algorithm used in a few places of SRE_MATCH, restore lastindex when restoring lastmark. (SRE_MATCH): Replace lastmark inline restoring by lastmark_restore(), function. Also include it where missing. In SRE_OP_MARK, set lastindex only if i > lastmark. * Lib/test/re_tests.py * Lib/test/test_sre.py Included regression tests for the fixed bugs. * Misc/NEWS Mention fixes.	2002-11-06 14:06:53 +00:00
Michael W. Hudson	b6a4505123	Cray fixup as seen in bug #558153.	2002-07-31 09:54:24 +00:00
Mark Hammond	8235ea1c3a	Land Patch [ 566100 ] Rationalize DL_IMPORT and DL_EXPORT.	2002-07-19 06:55:41 +00:00
Jeremy Hylton	938ace69a0	staticforward bites the dust. The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.) I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it. XXX I haven't updated the documentation.	2002-07-17 16:30:39 +00:00
Neal Norwitz	35fc7606f0	SF #561244 Micro optimizations Convert loops to memset()s.	2002-06-13 21:11:11 +00:00
Neal Norwitz	bb2769f580	Revert use of METH_OLDARGS (use 0) to support 1.5.2	2002-03-31 15:46:00 +00:00
Neal Norwitz	b049325e92	Use symbolic METH_VARARGS/METH_OLDARGS instead of 1/0 for ml_flags	2002-03-31 14:44:22 +00:00
Fredrik Lundh	82b230732f	bug #133283, #477728, #483789, #490573 backed out of broken minimal repeat patch from July also fixed a couple of minor potential resource leaks in pattern_subx (Guido had already fixed the big one)	2001-12-09 16:13:15 +00:00
Guido van Rossum	146483964e	Patch supplied by Burton Radons for his own SF bug #487390: Modifying type.__module__ behavior. This adds the module name and a dot in front of the type name in every type object initializer, except for built-in types (and those that already had this). Note that it touches lots of Mac modules -- I have no way to test these but the changes look right. Apologies if they're not. This also touches the weakref docs, which contains a sample type object initializer. It also touches the mmap test output, because the mmap type's repr is included in that output. It touches object.h to put the correct description in a comment.	2001-12-08 18:02:58 +00:00
Guido van Rossum	4e173846c8	Fix for #489672 (Neil Norwitz): memory leak in test_sre. (At least for the repeatable test case that Tim produced.) pattern_subx(): Add missing DECREF(filter) in both exit branches (normal and error return). Also fix a DECREF(args) that should certainly be a DECREF(match) -- because it's inside if (!args) and right after allocation of match.	2001-12-07 04:25:10 +00:00
Fredrik Lundh	703ce8122c	(experimental) "finditer" method/function. this works pretty much like findall, but returns an iterator (which returns match objects) instead of a list of strings/tuples.	2001-10-24 22:16:30 +00:00
Fredrik Lundh	6de22ef677	another major speedup: let sre.sub/subn check for escapes in the template string, and don't call the template compiler if we can avoid it.	2001-10-22 21:18:08 +00:00
Fredrik Lundh	f864aa8fd9	sre.split should return the last segment, even if empty (sorry, barry)	2001-10-22 06:01:56 +00:00
Fredrik Lundh	dac58492aa	fixed character set description in docstring (SRE uses Python strings, not C strings) removed USE_PYTHON defines, and related sre.py helpers skip calling the subx helper if the template is callable. interestingly enough, this means that def callback(m): return literal result = pattern.sub(callback, string) is much faster than result = pattern.sub(literal, string)	2001-10-21 21:48:30 +00:00
Fredrik Lundh	1296a8d77e	sre.Scanner fixes (from Greg Chapman). also added a Scanner sanity check to the test suite. added a few missing exception checks in the _sre module	2001-10-21 18:04:11 +00:00
Fredrik Lundh	bec95b9d88	rewrote the pattern.sub and pattern.subn methods in C removed (conceptually flawed) getliteral helper; the new sub/subn code uses a faster code path for literal replacement strings, but doesn't (yet) look for literal patterns. added STATE_OFFSET macro, and use it to convert state.start/ptr to char indexes	2001-10-21 16:47:57 +00:00
Fredrik Lundh	971e78b55b	rewrote the pattern.split method in C also restored SRE Unicode support for 1.6/2.0/2.1	2001-10-20 17:48:46 +00:00
Fredrik Lundh	397a654791	SRE bug #441409: compile should raise error for non-strings SRE bug #432570, 448951: reset group after failed match also bumped version number to 2.2.0	2001-10-18 19:30:16 +00:00
Fredrik Lundh	59b68656f8	fixed #449964: sre.sub raises an exception if the template contains a \g<x> group reference followed by a character escape (also restructured a few things on the way to fixing #449000)	2001-09-18 20:55:24 +00:00
Fredrik Lundh	21009b9c6f	an SRE bugfix a day keeps Guido away... #462270: sub-tle difference between pre.sub and sre.sub. PRE ignored an empty match at the previous location, SRE didn't. also synced with Secret Labs "sreopen" codebase.	2001-09-18 18:47:09 +00:00
Sjoerd Mullender	89dfe9e292	Removed unreachable return to silence SGI compiler.	2001-08-30 14:37:07 +00:00
Martin v. Löwis	339d0f720e	Patch #445762: Support --disable-unicode - Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled - check for Py_USING_UNICODE in all places that use Unicode functions - disables unicode literals, and the builtin functions - add the types.StringTypes list - remove Unicode literals from most tests.	2001-08-17 18:39:25 +00:00
Barry Warsaw	214a0b1382	init_sre(): Plug a little leak reported by Insure.	2001-08-16 20:33:48 +00:00
Fredrik Lundh	2d96f11d07	map re.sub() to string.replace(), when possible	2001-07-08 13:26:57 +00:00
Fredrik Lundh	d89a2e7731	bug #416670 added copy/deepcopy support to SRE (still not enabled, since it's not covered by the test suite)	2001-07-03 20:32:36 +00:00
Fredrik Lundh	df781e6a3f	reapplied darryl gallion's minimizing repeat fix. I'm still not 100% sure about this one, but test #133283 now works even with the fix in place, and so does the test suite. we'll see what comes up...	2001-07-02 19:54:28 +00:00
Fredrik Lundh	f71ae461bf	pythonware repository roundtrip (untabification)	2001-07-02 17:04:48 +00:00
Fredrik Lundh	19af43d78a	added martin's BIGCHARSET patch to SRE 2.1.1. martin reports 2x speedups for certain unicode character ranges.	2001-07-02 16:58:38 +00:00
Fredrik Lundh	b0f05bdfd3	merged with pythonware's SRE 2.1.1 codebase	2001-07-02 16:42:49 +00:00
Fredrik Lundh	9c7eab82b3	SRE: made "copyright" string static, to avoid potential linking conflicts.	2001-04-15 19:00:58 +00:00
Fredrik Lundh	b25e1ad253	sre 2.1b2 update: - take locale into account for word boundary anchors (#410271) - restored 2.0's *? behaviour (#233283, #408936 and others) - speed up re.sub/re.subn	2001-03-22 15:50:10 +00:00
Tim Peters	5687ffe0c5	SF patch 404928: Support for next Cygwin gcc (2.95.2-8)	2001-02-28 16:44:18 +00:00
Fredrik Lundh	1c5aa6901f	bumped SRE version number to 2.1. cleaned up and added 1.5.2 compatibility patches.	2001-01-16 07:37:30 +00:00
Fredrik Lundh	6f5cba68fc	fixed a memory leak in pattern cleanup (patch #103248 by cgw)	2001-01-16 07:05:29 +00:00
Fredrik Lundh	b35ffc0417	added "magic" number to the _sre module, to avoid weird errors caused by compiler/engine mismatches	2001-01-15 12:46:09 +00:00
Fredrik Lundh	fa25a7d51f	-- don't use recursion for unbounded non-greedy repeat (bugs #115903, #115696) This is based on a patch by Darrel Gallion. I'm not 100% sure about this fix, but I haven't managed to come up with any test case it cannot handle...	2001-01-14 23:55:55 +00:00
Fredrik Lundh	770617b23e	SRE fixes for 2.1 alpha: -- added some more docstrings -- fixed typo in scanner class (#125531) -- the multiline flag (?m) should't affect the \Z operator (#127259) -- fixed non-greedy backtracking bug (#123769, #127259) -- added sre.DEBUG flag (currently dumps the parsed pattern structure) -- fixed a couple of glitches in groupdict (the #126587 memory leak had already been fixed by AMK)	2001-01-14 15:06:11 +00:00


96 Commits