into. Jim writes:
The core dump was due to a C decrement operation
in a macro invocation in load_pop. (BAD)
I fixed this by moving the decrement outside
the macro call.
I added a comment to load_pop and load_mark
to document the fact that cPickle separates the
unpickling stack into two separate stacks, one for
objects and one for marks.
I also moved some increments out of some macro
calls (PyTuple_SET_ITEM and PyList_SET_ITEM).
This wasn't necessary, but made me feel better. :)
I tested these changes in *my* cPickle, which
doesn't have the new Unicode stuff.
Fix overflow bug in ldexp(x, exp). The 'exp' argument maps to a C int for the
math library call [double ldexp(double, int)], however the 'd'
PyArg_ParseTuple formatter was used to yield a double, which was subsequently
cast to an int. This could overflow.
[GvR: mysteriously, on Solaris 2.7, ldexp(1, 2147483647) returns Inf
while ldexp(1, 2147483646) raises OverflowError; this seems a bug in
the math library (it also takes a real long time to compute the
Inf outcome). Does this point to a bug in the CHECK() macro? It
should have discovered that the result was outside the HUGE_VAL range.]
instead. This seems more robust than returning an Unicode string with
some unconverted charcters in it.
This still doesn't support getting truly binary data out of Tcl, since
we look for the trailing null byte; but the old (pre-Unicode) code did
this too, so apparently there's no need. (Plus, I really don't feel
like finding out how Tcl deals with this in each version.)
1. In Tcl 8.2 and later, use Tcl_NewUnicodeObj() when passing a Python
Unicode object rather than going through UTF-8. (This function
doesn't exist in Tcl 8.1, so there the original UTF-8 code is still
used; in Tcl 8.0 there is no support for Unicode.) This assumes that
Tcl_UniChar is the same thing as Py_UNICODE; a run-time error is
issued if this is not the case.
2. In Tcl 8.1 and later (i.e., whenever Tcl supports Unicode), when a
string returned from Tcl contains bytes with the top bit set, we
assume it is encoded in UTF-8, and decode it into a Unicode string
object.
Notes:
- Passing Unicode strings to Tcl 8.0 does not do the right thing; this
isn't worth fixing.
- When passing an 8-bit string to Tcl 8.1 or later that has bytes with
the top bit set, Tcl tries to interpret it as UTF-8; it seems to fall
back on Latin-1 for non-UTF-8 bytes. I'm not sure what to do about
this besides telling the user to disambiguate such strings by
converting them to Unicode (forcing the user to be explicit about the
encoding).
- Obviously it won't be possible to get binary data out of Tk this
way. Do we need that ability? How to do it?
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
Checkin 2.131 of posixmodule.c changed os.stat on Windows, so that
"/bin/" type notation (trailing backslash) would work on Windows to
be consistent with Unix.
However, the patch broke the simple case of: os.stat("\\")
This did work in 1.5.2, and obviously should!
This patch addresses this, and restores the correct behaviour.
utime(path, NULL) call, setting the atime and mtime of the file to the
current time. The previous signature utime(path, (atime, mtime)) is
of course still allowed.
This patch is a workaround for Macintosh, where the GUSI I/O library
(time, stat, etc) use the MacOS epoch of 1-Jan-1904 and the MSL C
library (ctime, localtime, etc) uses the (apparently ANSI standard)
epoch of 1-Jan-1900. Python programs see the MacOS epoch and we
convert values when needed.
This patch changes posixmodule.c:execv to
a) check for zero length args (does this to execve, too), raising
ValueError.
b) raises more rational exceptions for various flavours of duff arguments.
I *hate*
TypeError: "illegal argument type for built-in operation"
It has to be one of the most frustrating error messages ever.
socklen_t (unsigned int) for most size parameters. Apparently this is
part of the UNIX 98 standard.
[GvR: the changes to configure.in etc. that I just checked in make
sure that socklen_t is defined everywhere, so I deleted the little
part of Jack's mod to define socklen_t if not in GUSI2. I suppose I
will have to add it to the Windows config.h in a minute.]
"""
Problem description:
Run the following script:
import test.test_cpickle
for x in xrange(1000000):
reload(test.test_cpickle)
Watch Python's memory use go up up and away!
In the course of debugging this I also saw that cPickle is
inconsistent with pickle - if you attempt a pickle.load or pickle.dump
on a closed file, you get a ValueError, whereas the corresponding
cPickle operations give an IOError. Since cPickle is advertised as
being compatible with pickle, I changed these exceptions to match.
"""
Windows), soclose (on OS2), or to close (everywhere else).
Hopefully this fixes a new compilation error that I suddenly get on
Windows because the macro definition for close -> closesocket
apparently was done before including io.h, which contains a prototype
for close. (No idea why this wasn't an error before.)
backslash from the pathname argument to stat() on Windows -- while on
Unix, stat("/bin/") succeeds and does the same thing as stat("/bin"),
on Windows, stat("\\windows\\") fails while stat("\\windows") succeeds.
This modified version of the patch recognizes both / and \.
(This is odd behavior of the MS C library, since
os.listdir("\\windows\\") succeeds!)
The bug is in mmap_read_line_method(), and its loop that searches for
newlines. After the loop reaches EOF, eol is incremented and points
after the end of the memory. This results in readline() method
sometimes picking up and returning a byte after the end of the string.
This is usually a bogus \0, but it could cause SIGSEGV if it's after
the end of the page).
The patch fixes the problem. Also, it uses memchr() for finding a
character, which is in fact the "strnchr" the comment is asking for.
memchr() is already used in Python sources, so there should be no
portability problems.
The Setup.in entry is sort of a lie; it links with -lexpat, but
Expat's Makefile doesn't actually build a libexpat.a. I'll send
Expat's author a patch to do that; if he doesn't accept it, this
rule will have to list Expat's object files (ick!), or have a
comment explaining how to build a .a file.
None in an argument list *terminates* the argument list: further
arguments are *ignored*. This isn't kosher, but too much code relies
on it, implicitly. For example, IDLE was pretty broken.