(this should fix Sjoerd's xmllib problem)
-- added skip field to INFO header
-- changed compiler to generate charset INFO header
-- changed trace messages to support post-mortem analysis
This doesn't change the copyright status for these files -- just the
markings! Doing it on the main branch for these three files for which
the HEAD revision was pushed back into 1.6.
-- fixed literal check in branch operator
(this broke test_tokenize, as reported by Mark Favas)
-- added REPEAT_ONE operator (still not enabled, though)
-- added some debugging stuff (maxlevel)
-- reverted REPEAT operator to use "repeat context" strategy
(from 0.8.X), but done right this time.
-- got rid of backtracking stack; use nested SRE_MATCH calls
instead (should probably put it back again in 0.9.9 ;-)
-- properly reset state in scanner mode
-- don't use aggressive inlining by default
* After discussion with Trent, all INT_PTR references have been removed in favour of the HANDLE it should always have been. Trent can see no 64bit issues here.
* In this process, I noticed that the close operation was dangerous, in that we could end up passing bogus results to the Win32 API. These result of the API functions passed the bogus values were never (and still are not) checked, but this is closer to "the right thing" (tm) than before.
Tested on Windows and Linux.
Checkin that replaces the INT_PTR types with HANDLEs still TBD (but as that is a "spelling" patch, rather than a functional one, I will commit it seperately.
originally submitted by Bill Tutt
Note: This code is actually going to be replaced in 2.0 by /F's new
database. Until then, this patch keeps the test suite working.
for systems that are missing those declarations from system include files.
Start by moving a pointy-haired ones from their previous locations to the
new section.
(The gethostname() one, for instance, breaks on several systems, because
some define it as (char *, size_t) and some as (char *, int).)
I purposely decided not to include the summary of used #defines like Tim did
in the first section of pyport.h. In my opinion, the number of #defines
likedly to be used by this section would make such an overview unwieldy. I
would suggest documenting the non-obvious ones, though.
+ added "regs" attribute
+ fixed "pos" and "endpos" attributes
+ reset "lastindex" and "lastgroup" in scanner methods
+ removed (?P#id) syntax; the "lastindex" and "lastgroup"
attributes are now always set
+ removed string module dependencies in sre_parse
+ better debugging support in sre_parse
+ various tweaks to build under 1.5.2
handlers "return void", according to ANSI C.
Removed the new Py_RETURN_FROM_SIGNAL_HANDLER macro.
Left RETSIGTYPE in the config stuff, because it's not clear to
me that others aren't relying on it (e.g., extension modules).
#if RETSIGTYPE != void
That isn't C, and MSVC properly refuses to compile it.
Introduced new Py_RETURN_FROM_SIGNAL_HANDLER macro in pyport.h
to expand to the correct thing based on RETSIGTYPE. However,
only void is ANSI! Do we still have platforms that return int?
The Unix config mess appears to #define RETSIGTYPE by magic
without being asked to, so I assume it's "a problem" across
Unices still.
to return something if RETSIGTYPE isn't void, in functions that are defined
to return RETSIGTYPE. Work around an argumentlist mismatch ('void' vs.
'void *') by using a static wrapper function.
and a couple of functions that were missed in the previous batches. Not
terribly tested, but very carefully scrutinized, three times.
All these were found by the little findkrc.py that I posted to python-dev,
which means there might be more lurking. Cases such as this:
long
func(a, b)
long a;
long b; /* flagword */
{
and other cases where the last ; in the argument list isn't followed by a
newline and an opening curly bracket. Regexps to catch all are welcome, of
course ;)
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
* There was no error reported if the .read() method returns a non-string
* If read() returned too much data, the buffer would be overflowed causing a
core dump
* Used strncpy, not memcpy, which seems incorrect if there are embedded \0s.
* The args and bytes objects were leaked
The first two warnings seem harmless enough,
but the last one looks like a potential bug: an
uninitialized int is returned on error. (I also
ended up reformatting some of the code,
because it was hard to read.)
about int size mismatches at two calls to s_rand. Stuffed in
casts to make the code do what it did before but w/o warnings --
although unclear that's correct!
windows.
- added optional mode argument to popen2/popen3
for unix; if the second argument is an integer,
it's assumed to be the buffer size.
- changed nt.popen2/popen3/popen4 return values
to match the popen2 module (stdout first, not
stdin).
just for the sake of it.
note that this only covers the unlikely case that size_t
is smaller than a long; it's probably more likely that
there are platforms out there where size_t is *larger*
than a long, and mmapmodule cannot really deal with that
today.
cast to make sure Py_BuildValue gets the right thing.
this change eliminates bogus return codes from successful
spawn calls (e.g. 2167387144924954624 instead of 0).
staring at the diffs before checking this one in. let me know
asap if it breaks things on your platform.
-- ANSI-fying
(patch #100763 by Peter Schneider-Kamp, minus the
indentation changes and minus the changes the broke
the windows build)
In posixmodule.c:posix_fork, the function PyOS_AfterFork is called for
both the parent and the child, despite the docs stating that it should
be called in the new (child) process.
This causes problems in the parent since the forking thread becomes the
main thread according to the signal module.
Calling PyOS_AfterFork() only in the child fixes this. Changed for both
fork() and forkpty().
Somebody w/ gcc please check that the wngs are gone!
There are cheaper (at runtime) ways to prevent the wngs, but
they're obscure and delicate. I'm going for the easy Big
Hammer here under the theory that PCRE will be replaced by
SRE anyway.
- reorganized some code to get rid of -Wall and -W4
warnings
- fixed default argument handling for sub/subn/split
methods (reported by Peter Schneider-Kamp).
It gets initialized when pyexpat is imported, and is only accessible as an
attribute of pyexpat; it cannot be imported itself. This allows it to at
least be importable after pyexpat itself has been imported by adding it
to sys.modules, so it is not quite as strange.
This arrangement needs to be better thought out.
the pattern must have a fixed width.
- got rid of array-module dependencies; the match pro-
gram is now stored inside the pattern object, rather
than in an extra string buffer.
- cleaned up a various of potential leaks, api abuses,
and other minors in the engine module.
- use mal's new isalnum macro, rather than my own work-
around.
- untabified test_sre.py. seems like I removed a couple
of trailing spaces in the process...
Revise math_1(), math_2(), stub-generating macros, and function tables to
use PyArg_ParseTuple() and properly provide the function name for error
message generation.
Fix pow() docstring for MPW 3.1; had said "power" instead of "pow".
"lastgroup" is the name of the last matched capturing group,
"lastindex" is the index of the same group. if no group was
matched, both attributes are set to None.
the (?P#) feature will be removed in the next relase.
used by the code generator)
- changed max repeat value in engine (to match earlier array fix)
- added experimental "which part matched?" mechanism to sre; see
http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954
or python-dev for details.
speedup for some tests, including the python tokenizer.
-- added support for an optional charset anchor to the engine
(currently unused by the code generator).
-- removed workaround for array module bug.
-- changed 1.6 to 2.0 in the file headers
-- fixed ISALNUM macro for the unicode locale. this
solution isn't perfect, but the best I can do with
Python's current unicode database.
CHAR_MAX, use hardcoded -128 and 127. This may seem strange, unless
you realize that we're talking about signed bytes here! Bytes are
always 8 bits and 2's complement. CHAR_MIN and CHAR_MAX are
properties of the char data type, which is guaranteed to hold at least
8 bits anyway.
Otherwise you'd get failing tests on platforms where unsigned char is
the default (e.g. AIX).
Thanks, Vladimir Marangozov, for finding this nit!
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.
The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!
The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.
Notes on the patch:
There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)
Closes SourceForge patch #100505.
-- added pickling support (only works if sre is imported)
-- fixed wordsize problems in engine
(instead of casting literals down to the character size,
cast characters up to the literal size (same as the code
word size). this prevents false hits when you're matching
a unicode pattern against an 8-bit string. (unfortunately,
this broke another test, but I think the test should be
changed in this case; more on that on python-dev)
-- added sre.purge function
(unofficial, clears the cache)
This patch fixes possible overflows in the socket module for 64-bit
platforms (mainly Win64). The changes are:
- abstract the socket type to SOCKET_T (this is SOCKET on Windows, int
on Un*x), this is necessary because sizeof(SOCKET) > sizeof(int) on
Win64
- use INVALID_SOCKET on Win32/64 for an error return value for
accept()
- ensure no overflow of the socket variable for: (1) a PyObject return
value (use PyLong_FromLongLong if necessary); and (2) printf
formatting in repr().
Closes SourceForge patch #100516.
Tim posted a long comment to python-dev (subject: "Controversial patch
(cmath)"; date: 6/29/00). The conclusion is that this whole module
stinks and this patch isn't perfect, but it's better than the acosh
and asinh we had, so let's check it in.
group reset problem. in the meantime, I added some
optimizations:
- added "inline" directive to LOCAL
(this assumes that AC_C_INLINE does what it's
supposed to do). to compile SRE on a non-unix
platform that doesn't support inline, you have
to add a "#define inline" somewhere...
- added code to generate a SRE_OP_INFO primitive
- added code to do fast prefix search
(enabled by the USE_FAST_SEARCH define; default
is on, in this release)
This patch fixes a possible overflow in the Sleep system call on
Win32/64 in the time_sleep() function in the time module. For very
large values of the give time to sleep the number of milliseconds can
overflow and give unexpected sleep intervals. THis patch raises an
OverflowError if the value overflows.
Closes SourceForge patch #100514.
This patch fixes the posix module for large file support mainly on
Win64, although some general cleanup is done as well.
The changes are:
- abstract stat->STAT, fstat->FSTAT, and struct stat->STRUCT_STAT
This is because stat() etc. are not the correct functions to use on
Win64 (nor maybe on other platforms?, if not then it is now trivial to
select the appropriate one). On Win64 the appropriate system functions
are _stati64(), etc.
- add _pystat_fromstructstat(), it builds the return tuple for the
fstat system call. This functionality was being duplicated. As well
the construction of the tuple was modified to ensure no overflow of
the time_t elements (sizeof(time_t) > sizeof(long) on Win64).
- add overflow protection for the return values of posix_spawnv and
posix_spawnve
- use the proper 64-bit capable lseek() on Win64
- use intptr_t instead of long where appropriate from Win32/64 blocks
(sizeof(void*) > sizeof(long) on Win64)
This closes SourceForge patch #100513.
Mark Hammond provided (a long time ago) a better Win32 specific
time_clock implementation in timemodule.c. The library for this
implementation does not exist on Win64 (yet, at least). This patch
makes Win64 fall back on the system's clock() function for
time_clock().
This closes SourceForge patch #100512.
(those semantics are weird...)
- got rid of $Id$'s (for the moment, at least). in other
words, there should be no more "empty" checkins.
- internal: some minor cleanups.
(test_sre still complains about split, but that's caused by
the group reset bug, not split itself)
- added more mark slots
(should be dynamically allocated, but 100 is better than 32.
and checking for the upper limit is better than overwriting
the memory ;-)
- internal: renamed the cursor helper class
- internal: removed some bloat from sre_compile
tests in sre_patch back to previous version
- fixed return value from findall
- renamed a bunch of functions inside _sre (way too
many leading underscores...)
</F>
Fix warnings on 64-bit build build of signalmodule.c
- Though I know that SIG_DFL and SIG_IGN are just small constants,
there are cast to function pointers so the appropriate Python call is
PyLong_FromVoidPtr so that the pointer value cannot overflow on Win64
where sizeof(long) < sizeof(void*).
This patch fixes cPickle.c for 64-bit platforms.
- The false assumption sizeof(long) == size(void*) exists where
PyInt_FromLong is used to represent a pointer. The safe Python call
for this is PyLong_FromVoidPtr. (On platforms where the above
assumption *is* true a PyInt is returned as before so there is no
effective change.)
- use size_t instead of int for some variables