- reorganized some code to get rid of -Wall and -W4
warnings
- fixed default argument handling for sub/subn/split
methods (reported by Peter Schneider-Kamp).
- Actually count the linefeeds in the CDATA content.
- Don't call the endtag handler for an unmatched endtag (this makes
the base class simpler since it doesn't have to deal with unopened
endtags).
- If the __init__ method is called with keyword argument
translate_attribute_references=0, don't attempt to translate
character and entity references in attribute values.
the pattern must have a fixed width.
- got rid of array-module dependencies; the match program
is now stored inside the pattern object, rather than in
an extra string buffer.
- cleaned up various potential leaks, api abuses,
and other minor issues in the engine module.
- use mal's new isalnum macro, rather than my own
workaround.
- untabified test_sre.py. seems like I removed a couple
of trailing spaces in the process...
openpty(): Fallback code when os.openpty() does not exist attempted to
call _slave_open(), which should have been slave_open().
This bug only showed on platforms which do not provide a working openpty()
in the C library.
This patch implements relative-path semantics for the "source" facility resembling
those of cpp(1), documents the change, and improves the shlex test main to
make it easier to test this feature. Along the way, it fixes a name error
in the existing docs.
[Additional documentation markup changes for consistency by FLD.]
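A minimal sketch of the relative-path behaviour described above, using the shlex module as it exists in current Python (an assumption; the patch itself targeted the 2000-era module). The file names main.txt and sub/extra.txt are made up for illustration. The included path is resolved against the directory of the including file, not the current working directory.

```python
import os
import shlex
import tempfile


def tokens_with_include():
    """Tokenize a file that pulls in a second file via a relative path."""
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "sub"))
        with open(os.path.join(top, "sub", "extra.txt"), "w") as f:
            f.write("beta gamma\n")

        main_path = os.path.join(top, "main.txt")
        with open(main_path, "w") as f:
            # The quoted path is relative to main.txt, as with cpp(1).
            f.write('alpha\nsource "sub/extra.txt"\nomega\n')

        with open(main_path) as stream:
            # Passing the file name lets the lexer resolve relative includes.
            lexer = shlex.shlex(stream, main_path)
            lexer.source = "source"  # keyword that triggers inclusion
            return list(lexer)


tokens = tokens_with_include()
print(tokens)
```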
This patch delegates more string functions to string object methods,
uses the varargs delegation syntax, and stops using stringold.
Closes SourceForge patch #100712.
"lastgroup" is the name of the last matched capturing group,
"lastindex" is the index of the same group. if no group was
matched, both attributes are set to None.
the (?P#) feature will be removed in the next release.
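For illustration, a short sketch of the two attributes using the re module in current Python (assumed here to behave as described above). The pattern and group names are invented for the example.

```python
import re

# Two alternative named groups; only one of them takes part in a match.
pattern = re.compile(r"(?P<number>\d+)|(?P<name>[a-z]+)")

m_num = pattern.match("42")
m_name = pattern.match("abc")
# A pattern without any capturing group leaves both attributes unset.
m_plain = re.match(r"\d+", "42")

for m in (m_num, m_name, m_plain):
    print(m.lastgroup, m.lastindex)
```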
used by the code generator)
- changed max repeat value in engine (to match earlier array fix)
- added experimental "which part matched?" mechanism to sre; see
http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954
or python-dev for details.
speedup for some tests, including the python tokenizer.
-- added support for an optional charset anchor to the engine
(currently unused by the code generator).
-- removed workaround for array module bug.
-- changed 1.6 to 2.0 in the file headers
-- fixed ISALNUM macro for the unicode locale. this
solution isn't perfect, but the best I can do with
Python's current unicode database.
This'll work fine with 2.0 or 1.5.2, but is less than ideal for
1.6a1/a2. But the code to accommodate 1.6a1/a2 was released with
Distutils 0.9, so it can go away now.
allows the caller to execute the various tests in pseudo-random order -
default is still to execute tests in the order returned by findtests().
* moved initialization of the various flag variables to the main() function
definition, making it possible to execute regrtest.main() interactively
and still override default behavior.
-- added pickling support (only works if sre is imported)
-- fixed wordsize problems in engine
(instead of casting literals down to the character size,
cast characters up to the literal size, which is the same
as the code word size. this prevents false hits when you're
matching a unicode pattern against an 8-bit string.
unfortunately, this broke another test, but I think the test
should be changed in this case; more on that on python-dev)
-- added sre.purge function
(unofficial, clears the cache)
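A rough sketch of the pickling and purge items above, written against the re module in current Python, where purge is exposed as re.purge (an assumption relative to the sre module named here). The pattern is invented for the example.

```python
import pickle
import re

original = re.compile(r"(\w+)@(\w+)")

# Round-trip the compiled pattern through pickle.
restored = pickle.loads(pickle.dumps(original))
print(restored.match("a@b").groups())

# Clear the module's cache of compiled patterns; existing pattern
# objects stay usable afterwards.
re.purge()
print(original.match("c@d").groups())
```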
group reset problem. in the meantime, I added some
optimizations:
- added "inline" directive to LOCAL
(this assumes that AC_C_INLINE does what it's
supposed to do). to compile SRE on a non-unix
platform that doesn't support inline, you have
to add a "#define inline" somewhere...
- added code to generate a SRE_OP_INFO primitive
- added code to do fast prefix search
(enabled by the USE_FAST_SEARCH define; default
is on, in this release)
errors in some of the hash algorithms. For example, in float_hash and
complex_hash a certain part of the value is not included in the hash
calculation. See Tim's, Guido's, and my discussion of this on
python-dev in May under the title "fix float_hash and complex_hash for
64-bit *nix"
(2) The hash algorithms that use pointers (e.g. func_hash, code_hash)
are universally not correct on Win64 (they assume that sizeof(long) ==
sizeof(void*))
As well, this patch significantly cleans up the hash code. It adds the
two functions _Py_HashDouble and _PyHash_VoidPtr that the various
hashing routines are changed to use.
These help maintain the hash function invariant: (a==b) =>
(hash(a)==hash(b)). I have added Lib/test/test_hash.py and
Lib/test/output/test_hash to test this for some cases.
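The invariant can be spot-checked from Python. This is only an illustrative sketch with hand-picked values, not the contents of Lib/test/test_hash.py; the helper name invariant_holds is made up.

```python
def invariant_holds(a, b):
    """True unless a == b while hash(a) != hash(b)."""
    return a != b or hash(a) == hash(b)


# Pairs that compare equal across numeric types, including a value
# too large for a 64-bit integer and the two floating-point zeros.
pairs = [
    (1, 1.0),
    (1.0, complex(1, 0)),
    (2 ** 70, 2.0 ** 70),
    (0.0, -0.0),
]

checks = [invariant_holds(a, b) for a, b in pairs]
print(checks)
```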
get_starttag_text(): New method.
Return the text of the most recently parsed start tag, from
the '<' to the '>' or '/'. Not really useful for structure
processing, but requested for Web-related use. May also be
useful for being able to re-generate the input from the parse
events, but there's no equivalent for end tags.
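As a sketch of how the method might be used: the example below relies on html.parser.HTMLParser from current Python, which offers a get_starttag_text() method of the same name. That choice of class is an assumption, since this log entry does not say which parser module it applies to. The class name StartTagRecorder is made up.

```python
from html.parser import HTMLParser


class StartTagRecorder(HTMLParser):
    """Collect the raw source text of every start tag seen."""

    def __init__(self):
        super().__init__()
        self.raw_tags = []

    def handle_starttag(self, tag, attrs):
        # Raw text of the start tag currently being handled.
        self.raw_tags.append(self.get_starttag_text())


recorder = StartTagRecorder()
recorder.feed('<p><a href="x.html" class=big>link</a></p>')
recorder.close()
print(recorder.raw_tags)
```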
attrfind: Be a little more forgiving of unquoted attribute values.
(those semantics are weird...)
- got rid of $Id$'s (for the moment, at least). in other
words, there should be no more "empty" checkins.
- internal: some minor cleanups.
(test_sre still complains about split, but that's caused by
the group reset bug, not split itself)
- added more mark slots
(should be dynamically allocated, but 100 is better than 32.
and checking for the upper limit is better than overwriting
the memory ;-)
- internal: renamed the cursor helper class
- internal: removed some bloat from sre_compile
accidentally wiped out by Ping's patch (which shouldn't have affected
this file at all, had Ping done a cvs update).
This checkin restores Gordon's version, with Fredrik's change merged
back in.
tests in sre_patch back to previous version
- fixed return value from findall
- renamed a bunch of functions inside _sre (way too
many leading underscores...)
</F>
Changed 'prune_file_list()' so it also prunes out RCS and CVS directories.
Added 'is_regex' parameter to 'select_pattern()', 'exclude_pattern()',
and 'translate_pattern()', so that you don't have to be constrained
by the simple shell-glob-like pattern language, and can escape into
full-blown regexes when needed. Currently this is only available
in code -- it's not exposed in the manifest template mini-language.
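The idea behind 'is_regex' can be sketched as follows. This is a simplified stand-in built on fnmatch, not the Distutils implementation: the function below only shares the name and the parameter, and its glob handling differs (for instance, '*' here also matches path separators).

```python
import fnmatch
import re


def translate_pattern(pattern, is_regex=False):
    """Return a compiled regex for a glob, or for a raw regex if asked."""
    if is_regex:
        # Caller supplied a full regular expression; use it untouched.
        return re.compile(pattern)
    # Otherwise treat the pattern as a shell-style glob.
    return re.compile(fnmatch.translate(pattern))


glob_re = translate_pattern("*.py")
full_re = translate_pattern(r"foo\d+\.txt$", is_regex=True)

print(bool(glob_re.match("setup.py")), bool(full_re.match("foo12.txt")))
```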
Added 'prune' option (controlled by --prune and --no-prune) to determine
whether we call 'prune_file_list()' or not -- it's true by default.
Fixed 'negative_opt' -- it was misnamed and not being seen by dist.py.
Added --no-defaults to the option table, so it's seen by FancyGetopt.
Testing: test_array.py was also extended to check that one can set the
full range of values for each of the integral signed and unsigned
array types.
This closes SourceForge patch #100506.
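A rough sketch of the kind of range check described above, not the actual test_array.py code. The helper names int_range and check_full_range are made up, and the ranges are derived from each type's itemsize on the running platform.

```python
from array import array


def int_range(typecode):
    """Return (lowest, highest) storable value for an integral typecode."""
    bits = 8 * array(typecode).itemsize
    if typecode.islower():  # lowercase typecodes are the signed ones
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def check_full_range(typecode):
    lo, hi = int_range(typecode)
    array(typecode, [lo, hi])  # both extremes must be storable
    for bad in (lo - 1, hi + 1):
        try:
            array(typecode, [bad])
        except OverflowError:
            continue  # out-of-range value rejected, as expected
        return False
    return True


results = {tc: check_full_range(tc) for tc in "bBhHiIlL"}
print(results)
```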