* No error was reported if the .read() method returned a non-string
* If read() returned too much data, the buffer would overflow, causing a
core dump
* Used strncpy, not memcpy, which seems incorrect if there are embedded \0s.
* The args and bytes objects were leaked
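A rough illustration of the embedded-\0 point above, using ctypes as a
stand-in for the C calls (this is not the patched code; the helper name
is made up for the example). The NUL-terminated view stops at the first
\0, much like a strncpy-style copy, while the length-based view keeps
every byte, like memcpy:

```python
import ctypes


def nul_terminated_vs_length_based(data):
    """Return (NUL-terminated view, length-based view) of one buffer.

    Hypothetical helper: ctypes only mimics what a strncpy-style copy
    and a memcpy-style copy would each preserve.
    """
    buf = ctypes.create_string_buffer(data)  # adds one trailing NUL
    # .value stops at the first NUL byte, like a C string function would.
    c_string_view = buf.value
    # .raw exposes the whole buffer; slice off the extra trailing NUL.
    full_view = buf.raw[:len(data)]
    return c_string_view, full_view
```

For data containing b"\x00" in the middle, the first element of the
result is shorter than the input, which is the data loss the commit
message is worried about.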
The first two warnings seem harmless enough,
but the last one looks like a potential bug: an
uninitialized int is returned on error. (I also
ended up reformatting some of the code,
because it was hard to read.)
about int size mismatches at two calls to s_rand. Stuffed in
casts to make the code do what it did before but w/o warnings --
although unclear that's correct!
windows.
- added optional mode argument to popen2/popen3
for unix; if the second argument is an integer,
it's assumed to be the buffer size.
- changed nt.popen2/popen3/popen4 return values
to match the popen2 module (stdout first, not
stdin).
just for the sake of it.
note that this only covers the unlikely case that size_t
is smaller than a long; it's probably more likely that
there are platforms out there where size_t is *larger*
than a long, and mmapmodule cannot really deal with that
today.
cast to make sure Py_BuildValue gets the right thing.
this change eliminates bogus return codes from successful
spawn calls (e.g. 2167387144924954624 instead of 0).
staring at the diffs before checking this one in. let me know
asap if it breaks things on your platform.
-- ANSI-fying
(patch #100763 by Peter Schneider-Kamp, minus the
indentation changes and minus the changes that broke
the windows build)
In posixmodule.c:posix_fork, the function PyOS_AfterFork is called for
both the parent and the child, despite the docs stating that it should
be called in the new (child) process.
This causes problems in the parent since the forking thread becomes the
main thread according to the signal module.
Calling PyOS_AfterFork() only in the child fixes this. Changed for both
fork() and forkpty().
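A small Python-level sketch of the same idea, assuming a POSIX system
where os.fork() is available. The function name and the message are
invented for the example; the real fix is in C and calls
PyOS_AfterFork(). Only the branch where fork() returned 0 runs the
post-fork step:

```python
import os


def run_child_only_hook():
    """Fork, run a post-fork step in the child only, report it via a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: the only place the post-fork step runs.
        os.close(read_fd)
        os.write(write_fd, b"after-fork hook ran in child")
        os._exit(0)
    # Parent process: no post-fork step, so its state is left untouched.
    os.close(write_fd)
    message = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return message
```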
Somebody w/ gcc please check that the wngs are gone!
There are cheaper (at runtime) ways to prevent the wngs, but
they're obscure and delicate. I'm going for the easy Big
Hammer here under the theory that PCRE will be replaced by
SRE anyway.
- reorganized some code to get rid of -Wall and -W4
warnings
- fixed default argument handling for sub/subn/split
methods (reported by Peter Schneider-Kamp).
It gets initialized when pyexpat is imported, and is only accessible as an
attribute of pyexpat; it cannot be imported itself. This allows it to at
least be importable after pyexpat itself has been imported by adding it
to sys.modules, so it is not quite as strange.
This arrangement needs to be better thought out.
the pattern must have a fixed width.
- got rid of array-module dependencies; the match program
is now stored inside the pattern object, rather
than in an extra string buffer.
- cleaned up various potential leaks, api abuses,
and other minor issues in the engine module.
- use mal's new isalnum macro, rather than my own
workaround.
- untabified test_sre.py. seems like I removed a couple
of trailing spaces in the process...
Revise math_1(), math_2(), stub-generating macros, and function tables to
use PyArg_ParseTuple() and properly provide the function name for error
message generation.
Fix pow() docstring for MPW 3.1; had said "power" instead of "pow".
"lastgroup" is the name of the last matched capturing group,
"lastindex" is the index of the same group. if no group was
matched, both attributes are set to None.
the (?P#) feature will be removed in the next release.
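A short sketch of how "lastgroup" and "lastindex" can be used, written
against the re module of current Python (an assumption; the check-in
itself targets sre). The pattern and the helper name are made up for
the example:

```python
import re

# Two alternative named groups; only one of them can match at a time.
PATTERN = re.compile(r"(?P<word>[a-z]+)|(?P<number>[0-9]+)")


def which_part_matched(text):
    """Return (lastgroup, lastindex) for a match, or None if no match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    return match.lastgroup, match.lastindex
```

With a pattern that has no groups at all, both attributes are None on
the resulting match object.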
used by the code generator)
- changed max repeat value in engine (to match earlier array fix)
- added experimental "which part matched?" mechanism to sre; see
http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954
or python-dev for details.
speedup for some tests, including the python tokenizer.
-- added support for an optional charset anchor to the engine
(currently unused by the code generator).
-- removed workaround for array module bug.
-- changed 1.6 to 2.0 in the file headers
-- fixed ISALNUM macro for the unicode locale. this
solution isn't perfect, but the best I can do with
Python's current unicode database.
CHAR_MAX, use hardcoded -128 and 127. This may seem strange, unless
you realize that we're talking about signed bytes here! Bytes are
always 8 bits and 2's complement. CHAR_MIN and CHAR_MAX are
properties of the char data type, which is guaranteed to hold at least
8 bits anyway.
Otherwise you'd get failing tests on platforms where unsigned char is
the default (e.g. AIX).
Thanks, Vladimir Marangozov, for finding this nit!
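To make the -128/127 point concrete, here is a minimal sketch with
hypothetical helper names. It treats a byte as an 8-bit two's
complement value regardless of whether the platform's plain char is
signed:

```python
SIGNED_BYTE_MIN = -128  # hardcoded on purpose, not derived from char
SIGNED_BYTE_MAX = 127


def check_signed_byte(value):
    """Range-check a value destined for a signed 8-bit slot."""
    if not SIGNED_BYTE_MIN <= value <= SIGNED_BYTE_MAX:
        raise OverflowError("signed byte value out of range")
    return value


def decode_signed_byte(raw):
    """Interpret a single byte as 8-bit two's complement."""
    return int.from_bytes(raw, "big", signed=True)
```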
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.
The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!
The result is that the hex value of a pointer may or may not have a 0x
prepended to it.
Notes on the patch:
There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)
Closes SourceForge patch #100505.
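The truncation described above can be imitated from Python with
ctypes. This is only a sketch: c_uint32 stands in for a 32-bit long as
found on Win64, and the function name and sample address are invented:

```python
import ctypes


def cast_to_32bit_long(address):
    """Mimic casting a pointer value to a 32-bit unsigned long in C.

    ctypes integer types do no overflow checking, so the high bits are
    silently dropped, as they would be by the C cast.
    """
    return ctypes.c_uint32(address).value
```

A 64-bit address such as 0x00007FFE12345678 comes back as 0x12345678,
so printing the result with "%lx" would show the wrong pointer.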
-- added pickling support (only works if sre is imported)
-- fixed wordsize problems in engine
(instead of casting literals down to the character size,
cast characters up to the literal size (same as the code
word size). this prevents false hits when you're matching
a unicode pattern against an 8-bit string. unfortunately,
this broke another test, but I think the test should be
changed in this case; more on that on python-dev)
-- added sre.purge function
(unofficial, clears the cache)
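A quick sketch of the pickling support and the purge function,
assuming the re module of current Python exposes the same behaviour
that this check-in adds to sre. The helper name is hypothetical:

```python
import pickle
import re


def roundtrip_pattern(pattern_text):
    """Compile a pattern, pickle it, and return the unpickled copy."""
    compiled = re.compile(pattern_text)
    return pickle.loads(pickle.dumps(compiled))
```

Calling re.purge() afterwards clears the cache of compiled patterns;
it takes no arguments and returns None.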
This patch fixes possible overflows in the socket module for 64-bit
platforms (mainly Win64). The changes are:
- abstract the socket type to SOCKET_T (this is SOCKET on Windows, int
on Un*x), this is necessary because sizeof(SOCKET) > sizeof(int) on
Win64
- use INVALID_SOCKET on Win32/64 for an error return value for
accept()
- ensure no overflow of the socket variable for: (1) a PyObject return
value (use PyLong_FromLongLong if necessary); and (2) printf
formatting in repr().
Closes SourceForge patch #100516.
Tim posted a long comment to python-dev (subject: "Controversial patch
(cmath)"; date: 6/29/00). The conclusion is that this whole module
stinks and this patch isn't perfect, but it's better than the acosh
and asinh we had, so let's check it in.
group reset problem. in the meantime, I added some
optimizations:
- added "inline" directive to LOCAL
(this assumes that AC_C_INLINE does what it's
supposed to do). to compile SRE on a non-unix
platform that doesn't support inline, you have
to add a "#define inline" somewhere...
- added code to generate a SRE_OP_INFO primitive
- added code to do fast prefix search
(enabled by the USE_FAST_SEARCH define; default
is on, in this release)
This patch fixes a possible overflow in the Sleep system call on
Win32/64 in the time_sleep() function in the time module. For very
large values of the given time to sleep the number of milliseconds can
overflow and give unexpected sleep intervals. This patch raises an
OverflowError if the value overflows.
Closes SourceForge patch #100514.
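The check can be sketched as follows. The limit assumes the
millisecond count has to fit a 32-bit unsigned value (the Win32
Sleep() argument is a DWORD); the function name, constant name, and
error text are illustrative rather than taken from the patch:

```python
ULONG_MAX_32 = 2**32 - 1  # assumed limit for the millisecond argument


def seconds_to_millis(seconds):
    """Convert a sleep time in seconds to whole milliseconds.

    Raises OverflowError instead of letting the count wrap around.
    """
    millis = seconds * 1000.0
    if millis > ULONG_MAX_32:
        raise OverflowError("sleep length is too large")
    return int(millis)
```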
This patch fixes the posix module for large file support mainly on
Win64, although some general cleanup is done as well.
The changes are:
- abstract stat->STAT, fstat->FSTAT, and struct stat->STRUCT_STAT
This is because stat() etc. are not the correct functions to use on
Win64 (and possibly not on other platforms either; if so, it is now
trivial to select the appropriate ones). On Win64 the appropriate
system functions are _stati64(), etc.
- add _pystat_fromstructstat(), which builds the return tuple for the
fstat system call. This functionality was being duplicated. In addition,
the construction of the tuple was modified to ensure no overflow of
the time_t elements (sizeof(time_t) > sizeof(long) on Win64).
- add overflow protection for the return values of posix_spawnv and
posix_spawnve
- use the proper 64-bit capable lseek() on Win64
- use intptr_t instead of long where appropriate from Win32/64 blocks
(sizeof(void*) > sizeof(long) on Win64)
This closes SourceForge patch #100513.
Mark Hammond provided (a long time ago) a better Win32 specific
time_clock implementation in timemodule.c. The library for this
implementation does not exist on Win64 (yet, at least). This patch
makes Win64 fall back on the system's clock() function for
time_clock().
This closes SourceForge patch #100512.
(those semantics are weird...)
- got rid of $Id$'s (for the moment, at least). in other
words, there should be no more "empty" checkins.
- internal: some minor cleanups.
(test_sre still complains about split, but that's caused by
the group reset bug, not split itself)
- added more mark slots
(should be dynamically allocated, but 100 is better than 32.
and checking for the upper limit is better than overwriting
the memory ;-)
- internal: renamed the cursor helper class
- internal: removed some bloat from sre_compile
tests in sre_patch back to previous version
- fixed return value from findall
- renamed a bunch of functions inside _sre (way too
many leading underscores...)
</F>
Fix warnings on 64-bit build of signalmodule.c
- Though I know that SIG_DFL and SIG_IGN are just small constants,
they are cast to function pointers, so the appropriate Python call is
PyLong_FromVoidPtr so that the pointer value cannot overflow on Win64
where sizeof(long) < sizeof(void*).
This patch fixes cPickle.c for 64-bit platforms.
- The false assumption sizeof(long) == sizeof(void*) exists where
PyInt_FromLong is used to represent a pointer. The safe Python call
for this is PyLong_FromVoidPtr. (On platforms where the above
assumption *is* true a PyInt is returned as before so there is no
effective change.)
- use size_t instead of int for some variables
This patch fixes a possible overflow of the optional timeout
parameter for the select() function (selectmodule.c). This timeout is
passed in as a double and then truncated to an int. If the double is
sufficiently large you can get unexpected results as it
overflows. This patch raises an OverflowError if the given select timeout
overflows.
[GvR: To my embarrassment, the original code was assuming an int could
always hold a million. Note that the overflow check doesn't test for
a very large *negative* timeout passed in -- but who in the world
would do such a thing?]
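A sketch of the timeout handling under stated assumptions: a 32-bit
int for the seconds field, and hypothetical names throughout. Like the
patch as described, it does not test for a very large negative
timeout:

```python
INT_MAX = 2**31 - 1  # assumed size of a C int


def timeout_to_timeval(timeout):
    """Split a float timeout into (seconds, microseconds).

    Raises OverflowError when the seconds part cannot fit a C int.
    """
    if timeout > INT_MAX:
        raise OverflowError("timeout period too long")
    seconds = int(timeout)
    microseconds = int((timeout - seconds) * 1000000.0)
    return seconds, microseconds
```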
The cause: Relatively recent (last month) patches to getargs.c added
overflow checking to the PyArg_Parse*() integral formatters, thereby
restricting 'b' to unsigned char values and 'h', 'i', and 'l' to signed
integral values (i.e. if the incoming value is outside of the
specified bounds you get an OverflowError; previously it silently
overflowed).
The problem: This broke the array module (as Fredrik pointed out)
because *its* formatters relied on the loose allowance of signed and
unsigned ranges being able to pass through PyArg_Parse*()'s
formatters.
The fix: This patch fixes the array module to work with the more
strict bounds checking now in PyArg_Parse*().
How: If the type signature of a formatter in the arraymodule exactly
matches one in PyArg_Parse*(), then use that directly. If there is no
equivalent type signature in PyArg_Parse*() (e.g. there is no unsigned
int formatter in PyArg_Parse*()), then use the next one up and do some
extra bounds checking in the array module.
This partially closes SourceForge patch #100506.
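The "use the next one up and do some extra bounds checking" approach
can be sketched in Python. The helper below is hypothetical and
assumes a 16-bit unsigned short; it accepts a wider integer and then
range-checks it by hand before storing it:

```python
from array import array

USHORT_MAX = 0xFFFF  # assumes a 16-bit unsigned short


def store_unsigned_short(values, item):
    """Append item to an array('H') after an explicit range check."""
    if item < 0:
        raise OverflowError("unsigned short is less than minimum")
    if item > USHORT_MAX:
        raise OverflowError("unsigned short is greater than maximum")
    values.append(item)
```

The array module of current Python applies the same kind of check
itself: array("b", [200]) raises OverflowError rather than wrapping
silently.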
This patch adds the openpty() and forkpty() library calls to posixmodule.c,
when they are available on the target
system. (glibc-2.1-based Linux systems, FreeBSD and BSDI at least, probably
the other BSD-based systems as well.)
Lib/pty.py is also rewritten to use openpty when available, but falls
back to the old SGI method or the "manual" BSD open-a-pty
code. Openpty() is necessary to use the Unix98 ptys under Linux 2.2,
or when using non-standard tty names under (at least) BSDI, which is
why I needed it, myself ;-) forkpty() is included for symmetry.
New ucnhash module by Bill Tutt. This module contains the hash
table needed to map Unicode character names to Unicode ordinals
and is loaded on-the-fly by the standard unicode-escape codec.
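What the name table enables can be shown from Python. This sketch uses
the unicodedata module of current Python for comparison (an
assumption; the check-in itself adds ucnhash), and the helper name is
made up:

```python
import unicodedata


def decode_named_escape(name):
    """Decode a \\N{...} escape through the unicode-escape codec."""
    escaped = ("\\N{%s}" % name).encode("ascii")
    return escaped.decode("unicode-escape")
```

For a valid character name the result matches
unicodedata.lookup(name).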
I discovered the [MREMAP_MAYMOVE] symbol is only defined when _GNU_SOURCE is
defined; therefore, here is the change: if we are compiling for linux,
define _GNU_SOURCE before including mman.h, and all is done.