cpython

Commit Graph

Author	SHA1	Message	Date
Guido van Rossum	572dbf8f13	Checkpoint. Manipulated things so that string literals are always unicode, and a few other compensating changes, e.g. str <- unicode, chr <- unichr, and repr() of a unicode string no longer starts with 'u'. Lots of unit tests are broken, but some basic things work, in particular distutils works so the extensions can be built, and test_builtin.py works.	2007-04-27 23:53:51 +00:00
Guido van Rossum	e2a383d062	Rip out 'long' and 'L'-suffixed integer literals. (Rough first cut.)	2007-01-15 16:59:06 +00:00
Andrew M. Kuchling	c30faa812c	[Bug #1177831] Fix generation of code for GROUPREF_EXISTS. Thanks to Andre Malo for the fix.	2005-06-02 13:35:52 +00:00
Raymond Hettinger	049ade2997	Complete the previous effort to factor out constant expressions and improve the speed of the if/elif/else blocks.	2005-02-28 19:27:52 +00:00
Fredrik Lundh	5e7d51b62c	make sure to check for this limit even if we're running with -O	2004-10-15 06:15:08 +00:00
Martin v. Löwis	7d9c6c7e8c	Fix _sre.CODESIZE on 64-bit machines in UCS-4 mode. Fixes #931848. Backported to 2.3.	2004-05-07 07:18:13 +00:00
Raymond Hettinger	d732c95eb0	Revert 1.51 booleans so that sre will still run on old pythons.	2004-03-27 09:24:36 +00:00
Raymond Hettinger	29e383754e	Remove unnecessary test. (Thanks Skip)	2004-03-26 20:16:39 +00:00
Raymond Hettinger	01c9f8c35f	Simple optimizations: * pre-build a single identity function for the fixup function * pre-build membership tests in dictionaries instead of in-line tuples * assign len() to a local variable * assign append() methods to a local variable * use xrange() instead of range() * replace "x<<1" with "x+x"	2004-03-26 11:16:55 +00:00
Martin v. Löwis	bc503d1e90	Use True/False instead of 0/1 for character classes.	2004-03-25 13:50:59 +00:00
Gustavo Niemeyer	ad3fc44ccb	Implemented non-recursive SRE matching.	2003-10-17 22:13:16 +00:00
Just van Rossum	74902508dc	Addendum to #764548: restore 2.1 compatibility.	2003-07-02 21:37:16 +00:00
Just van Rossum	12723bacea	Fix and test for bug #764548: Use isinstance() instead of comparing types directly, to enable subclasses of str and unicode to be used as patterns. Blessed by /F.	2003-07-02 20:03:04 +00:00
Martin v. Löwis	78e2f06cc6	Fully support 32-bit codes. Enable BIGCHARSET in UCS-4 builds.	2003-04-19 12:56:08 +00:00
Guido van Rossum	41c99e7f96	SF patch #720991 by Gary Herron: A small fix for bug #545855 and Greg Chapman's addition of op code SRE_OP_MIN_REPEAT_ONE for eliminating recursion on simple uses of pattern '*?' on a long string.	2003-04-14 17:59:34 +00:00
Guido van Rossum	577fb5a1db	Fix from SF patch #633359 by Greg Chapman for SF bug #610299: The problem is in sre_compile.py: the call to _compile_charset near the end of _compile_info forgets to pass in the flags, so that the info charset is not compiled with re.U. (The info charset is used when searching to find the first character at which a match could start; it is not generated for patterns beginning with a repeat like '\w{1}'.)	2003-02-24 01:18:35 +00:00
Martin v. Löwis	67c4cb1f13	Disable big charsets in UCS-4 builds. Works around #599377. Will backport to 2.2	2002-09-26 16:39:20 +00:00
Fredrik Lundh	4fb7027ec0	made the code match the comments (1.5.2 compatibility)	2002-06-27 20:08:25 +00:00
Raymond Hettinger	f13eb55d59	Replace boolean test with is None.	2002-06-02 00:40:05 +00:00
Fred Drake	b8f2274985	Added docstrings by Neal Norwitz. This closes SF bug #450980.	2001-09-04 19:10:20 +00:00
Tim Peters	87cc0c329e	Whitespace normalization, plus: + test_quopri.py relied on significant trailing spaces. Fixed. + test_dircache.py (still) doesn't work on Windows (directory mtime on Windows doesn't work like it does on Unix).	2001-07-21 01:41:30 +00:00
Martin v. Löwis	3550dd30bb	Patch #442512: put block indices in the right byte order on bigendian systems.	2001-07-19 14:26:10 +00:00
Fredrik Lundh	19af43d78a	added martin's BIGCHARSET patch to SRE 2.1.1. martin reports 2x speedups for certain unicode character ranges.	2001-07-02 16:58:38 +00:00
Fredrik Lundh	b25e1ad253	sre 2.1b2 update: - take locale into account for word boundary anchors (#410271) - restored 2.0's *? behaviour (#233283, #408936 and others) - speed up re.sub/re.subn	2001-03-22 15:50:10 +00:00
Fredrik Lundh	f2989b22ff	- restored 1.5.2 compatibility (sorry, eric) - removed __all__ cruft from internal modules (sorry, skip) - don't assume ASCII for string escapes (sorry, per)	2001-02-18 12:05:16 +00:00
Skip Montanaro	0de65807e6	bunch more __all__ lists also modified check_all function to suppress all warnings since they aren't relevant to what this test is doing (allows quiet checking of regsub, for instance)	2001-02-15 22:15:14 +00:00
Fredrik Lundh	2e24044f9d	from the really-stupid-bug department: uppercase literals should match uppercase strings also when the IGNORECASE flag is set (bug #128899) (also added test cases for recently fixed bugs to the regression suite -- or in other words, check in re_tests.py too...)	2001-01-15 18:28:14 +00:00
Fredrik Lundh	b35ffc0417	added "magic" number to the _sre module, to avoid weird errors caused by compiler/engine mismatches	2001-01-15 12:46:09 +00:00
Fredrik Lundh	770617b23e	SRE fixes for 2.1 alpha: -- added some more docstrings -- fixed typo in scanner class (#125531) -- the multiline flag (?m) shouldn't affect the \Z operator (#127259) -- fixed non-greedy backtracking bug (#123769, #127259) -- added sre.DEBUG flag (currently dumps the parsed pattern structure) -- fixed a couple of glitches in groupdict (the #126587 memory leak had already been fixed by AMK)	2001-01-14 15:06:11 +00:00
Fredrik Lundh	13ac9926ac	Fixed too ambitious "nothing to repeat" check. Closes bug #114033.	2000-10-07 17:38:23 +00:00
Fredrik Lundh	7898c3e685	-- reset marks if repeat_one tail doesn't match (this should fix Sjoerd's xmllib problem) -- added skip field to INFO header -- changed compiler to generate charset INFO header -- changed trace messages to support post-mortem analysis	2000-08-07 20:59:04 +00:00
Fredrik Lundh	e186983842	final 0.9.8 updates: -- added REPEAT_ONE operator -- added ANY_ALL operator (used to represent "(?s).")	2000-08-01 22:47:49 +00:00
Fredrik Lundh	2f2c67d7e5	-- fixed width calculations for alternations -- fixed literal check in branch operator (this broke test_tokenize, as reported by Mark Favas) -- added REPEAT_ONE operator (still not enabled, though) -- added some debugging stuff (maxlevel)	2000-08-01 21:05:41 +00:00
Fredrik Lundh	29c4ba9ada	SRE 0.9.8: passes the entire test suite -- reverted REPEAT operator to use "repeat context" strategy (from 0.8.X), but done right this time. -- got rid of backtracking stack; use nested SRE_MATCH calls instead (should probably put it back again in 0.9.9 ;-) -- properly reset state in scanner mode -- don't use aggressive inlining by default	2000-08-01 18:20:07 +00:00
Fredrik Lundh	8a3ebf8ca8	-- SRE 0.9.6 sync. this includes: + added "regs" attribute + fixed "pos" and "endpos" attributes + reset "lastindex" and "lastgroup" in scanner methods + removed (?P#id) syntax; the "lastindex" and "lastgroup" attributes are now always set + removed string module dependencies in sre_parse + better debugging support in sre_parse + various tweaks to build under 1.5.2	2000-07-23 21:46:17 +00:00
Fredrik Lundh	2855290b84	maintenance release: - reorganized some code to get rid of -Wall and -W4 warnings - fixed default argument handling for sub/subn/split methods (reported by Peter Schneider-Kamp).	2000-07-05 21:14:16 +00:00
Fredrik Lundh	72b82ba16d	- fixed grouping error bug - changed "group" operator to "groupref"	2000-07-03 21:31:48 +00:00
Fredrik Lundh	6f01398236	- added lookbehind support (?<=pattern), (?<!pattern). the pattern must have a fixed width. - got rid of array-module dependencies; the match program is now stored inside the pattern object, rather than in an extra string buffer. - cleaned up various potential leaks, api abuses, and other minors in the engine module. - use mal's new isalnum macro, rather than my own workaround. - untabified test_sre.py. seems like I removed a couple of trailing spaces in the process...	2000-07-03 18:44:21 +00:00
Fredrik Lundh	c2301730b8	- experimental: added two new attributes to the match object: "lastgroup" is the name of the last matched capturing group, "lastindex" is the index of the same group. if no group was matched, both attributes are set to None. the (?P#) feature will be removed in the next release.	2000-07-02 22:25:39 +00:00
Fredrik Lundh	7cafe4d7e4	- actually enabled charset anchors in the engine (still not used by the code generator) - changed max repeat value in engine (to match earlier array fix) - added experimental "which part matched?" mechanism to sre; see http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954 or python-dev for details.	2000-07-02 17:33:27 +00:00
Fredrik Lundh	3562f11764	-- use charset bitmaps where appropriate. this gives a 5-10% speedup for some tests, including the python tokenizer. -- added support for an optional charset anchor to the engine (currently unused by the code generator). -- removed workaround for array module bug.	2000-07-02 12:00:07 +00:00
Fredrik Lundh	22d2546520	today's SRE update: -- changed 1.6 to 2.0 in the file headers -- fixed ISALNUM macro for the unicode locale. this solution isn't perfect, but the best I can do with Python's current unicode database.	2000-07-01 17:50:59 +00:00
Fredrik Lundh	55a4f4a528	- fixed code generation error in multiline mode - fixed parser flag propagation (of all stupid bugs...)	2000-06-30 22:37:31 +00:00
Fredrik Lundh	4ccea94152	- reverted to "\x is binary byte" - removed evil tabs from sre_parse and sre_compile	2000-06-30 18:39:20 +00:00
Fredrik Lundh	0640e1161f	the mad patcher strikes again: -- added pickling support (only works if sre is imported) -- fixed wordsize problems in engine (instead of casting literals down to the character size, cast characters up to the literal size (same as the code word size). this prevents false hits when you're matching a unicode pattern against an 8-bit string. (unfortunately, this broke another test, but I think the test should be changed in this case; more on that on python-dev) -- added sre.purge function (unofficial, clears the cache)	2000-06-30 13:55:15 +00:00
Fredrik Lundh	43b3b49b5a	- fixed lookahead assertions (#10, #11, #12) - untabified sre_constants.py	2000-06-30 10:41:31 +00:00
Fredrik Lundh	90a0791322	- pedantic: make sure "python -t" doesn't complain...	2000-06-30 07:50:59 +00:00
Fredrik Lundh	01016fe972	- fixed split behaviour on empty matches - fixed compiler problems when using locale/unicode flags - fixed group/octal code parsing in sub/subn templates	2000-06-30 00:27:46 +00:00
Fredrik Lundh	29c08beab0	still trying to figure out how to fix the remaining group reset problem. in the meantime, I added some optimizations: - added "inline" directive to LOCAL (this assumes that AC_C_INLINE does what it's supposed to do). to compile SRE on a non-unix platform that doesn't support inline, you have to add a "#define inline" somewhere... - added code to generate a SRE_OP_INFO primitive - added code to do fast prefix search (enabled by the USE_FAST_SEARCH define; default is on, in this release)	2000-06-29 23:33:12 +00:00
Fredrik Lundh	8094611eb8	- fixed another split problem (those semantics are weird...) - got rid of $Id$'s (for the moment, at least). in other words, there should be no more "empty" checkins. - internal: some minor cleanups.	2000-06-29 18:03:25 +00:00
Fredrik Lundh	be2211e940	- fixed split (test_sre still complains about split, but that's caused by the group reset bug, not split itself) - added more mark slots (should be dynamically allocated, but 100 is better than 32. and checking for the upper limit is better than overwriting the memory ;-) - internal: renamed the cursor helper class - internal: removed some bloat from sre_compile	2000-06-29 16:57:40 +00:00
Fredrik Lundh	6c68dc7b1a	- removed "alpha only" licensing restriction - removed some hacks that worked around 1.6 alpha bugs - removed bogus test code from sre_parse	2000-06-29 10:34:56 +00:00
Fredrik Lundh	436c3d58a2	towards 1.6b1	2000-06-29 08:58:44 +00:00
Jeremy Hylton	b1aa19515f	Fredrik Lundh: here's the 96.6% version of SRE	2000-06-01 17:39:12 +00:00
Guido van Rossum	b81e70ebdb	Fredrik Lundh: new snapshot. Mostly reindented. This one should work with unicode expressions, and compile a bit more silently.	2000-04-10 17:10:48 +00:00
Andrew M. Kuchling	e3ba931aa4	This patch looks large, but it just deletes the ^M characters and untabifies the files. No actual code changes were made.	2000-04-02 05:22:30 +00:00
Guido van Rossum	7627c0de69	Added Fredrik Lundh's sre module and its supporting cast. NOTE: THIS IS VERY ROUGH ALPHA CODE!	2000-03-31 14:58:54 +00:00
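
Two behaviours described in the log above (the "lastgroup"/"lastindex" match attributes from c2301730b8, and the fixed-width requirement for lookbehind from 6f01398236) can be sketched against the present-day `re` module. This is only an illustration, assuming the current standard library still behaves as those commit messages describe; the helper names `last_matched_group` and `lookbehind_compiles` are hypothetical and do not appear in the SRE sources.

```python
import re


def last_matched_group(pattern, text):
    """Return (lastgroup, lastindex) for a match at the start of text.

    Hypothetical helper: returns None if the pattern does not match at all.
    """
    m = re.match(pattern, text)
    if m is None:
        return None
    # lastgroup is the name of the last matched capturing group,
    # lastindex is its index; both are None if no group took part.
    return (m.lastgroup, m.lastindex)


def lookbehind_compiles(pattern):
    """Return True if the pattern compiles, False if re rejects it.

    Hypothetical helper: used here to probe the fixed-width
    requirement for (?<=...) and (?<!...).
    """
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


if __name__ == "__main__":
    # Second alternative matches, so the named group "num" (index 2) is reported.
    print(last_matched_group(r"(?P<word>[a-z]+)|(?P<num>\d+)", "42"))
    # No capturing groups in the pattern at all.
    print(last_matched_group(r"abc", "abc"))
    # Fixed-width lookbehind is accepted; a variable-width one is not.
    print(lookbehind_compiles(r"(?<=foo)bar"))
    print(lookbehind_compiles(r"(?<=fo+)bar"))
```

The helpers return plain values rather than printing so that the behaviour can be checked without relying on the exact wording of any error message.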

107 Commits