This doesn't change the copyright status for these files -- just the
markings! Doing it on the main branch for these three files for which
the HEAD revision was pushed back into 1.6.
-- fixed literal check in branch operator
(this broke test_tokenize, as reported by Mark Favas)
-- added REPEAT_ONE operator (still not enabled, though)
-- added some debugging stuff (maxlevel)
-- reverted REPEAT operator to use "repeat context" strategy
(from 0.8.X), but done right this time.
-- got rid of backtracking stack; use nested SRE_MATCH calls
instead (should probably put it back again in 0.9.9 ;-)
-- properly reset state in scanner mode
-- don't use aggressive inlining by default
* After discussion with Trent, all INT_PTR references have been removed in favour of the HANDLE it should always have been. Trent can see no 64bit issues here.
* In this process, I noticed that the close operation was dangerous, in that we could end up passing bogus results to the Win32 API. These result of the API functions passed the bogus values were never (and still are not) checked, but this is closer to "the right thing" (tm) than before.
Tested on Windows and Linux.
Checkin that replaces the INT_PTR types with HANDLEs still TBD (but as that is a "spelling" patch, rather than a functional one, I will commit it seperately.
originally submitted by Bill Tutt
Note: This code is actually going to be replaced in 2.0 by /F's new
database. Until then, this patch keeps the test suite working.
for systems that are missing those declarations from system include files.
Start by moving a pointy-haired ones from their previous locations to the
new section.
(The gethostname() one, for instance, breaks on several systems, because
some define it as (char *, size_t) and some as (char *, int).)
I purposely decided not to include the summary of used #defines like Tim did
in the first section of pyport.h. In my opinion, the number of #defines
likedly to be used by this section would make such an overview unwieldy. I
would suggest documenting the non-obvious ones, though.
+ added "regs" attribute
+ fixed "pos" and "endpos" attributes
+ reset "lastindex" and "lastgroup" in scanner methods
+ removed (?P#id) syntax; the "lastindex" and "lastgroup"
attributes are now always set
+ removed string module dependencies in sre_parse
+ better debugging support in sre_parse
+ various tweaks to build under 1.5.2
handlers "return void", according to ANSI C.
Removed the new Py_RETURN_FROM_SIGNAL_HANDLER macro.
Left RETSIGTYPE in the config stuff, because it's not clear to
me that others aren't relying on it (e.g., extension modules).
#if RETSIGTYPE != void
That isn't C, and MSVC properly refuses to compile it.
Introduced new Py_RETURN_FROM_SIGNAL_HANDLER macro in pyport.h
to expand to the correct thing based on RETSIGTYPE. However,
only void is ANSI! Do we still have platforms that return int?
The Unix config mess appears to #define RETSIGTYPE by magic
without being asked to, so I assume it's "a problem" across
Unices still.
to return something if RETSIGTYPE isn't void, in functions that are defined
to return RETSIGTYPE. Work around an argumentlist mismatch ('void' vs.
'void *') by using a static wrapper function.
and a couple of functions that were missed in the previous batches. Not
terribly tested, but very carefully scrutinized, three times.
All these were found by the little findkrc.py that I posted to python-dev,
which means there might be more lurking. Cases such as this:
long
func(a, b)
long a;
long b; /* flagword */
{
and other cases where the last ; in the argument list isn't followed by a
newline and an opening curly bracket. Regexps to catch all are welcome, of
course ;)
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
* There was no error reported if the .read() method returns a non-string
* If read() returned too much data, the buffer would be overflowed causing a
core dump
* Used strncpy, not memcpy, which seems incorrect if there are embedded \0s.
* The args and bytes objects were leaked
The first two warnings seem harmless enough,
but the last one looks like a potential bug: an
uninitialized int is returned on error. (I also
ended up reformatting some of the code,
because it was hard to read.)
about int size mismatches at two calls to s_rand. Stuffed in
casts to make the code do what it did before but w/o warnings --
although unclear that's correct!
windows.
- added optional mode argument to popen2/popen3
for unix; if the second argument is an integer,
it's assumed to be the buffer size.
- changed nt.popen2/popen3/popen4 return values
to match the popen2 module (stdout first, not
stdin).
just for the sake of it.
note that this only covers the unlikely case that size_t
is smaller than a long; it's probably more likely that
there are platforms out there where size_t is *larger*
than a long, and mmapmodule cannot really deal with that
today.
cast to make sure Py_BuildValue gets the right thing.
this change eliminates bogus return codes from successful
spawn calls (e.g. 2167387144924954624 instead of 0).
staring at the diffs before checking this one in. let me know
asap if it breaks things on your platform.
-- ANSI-fying
(patch #100763 by Peter Schneider-Kamp, minus the
indentation changes and minus the changes the broke
the windows build)
In posixmodule.c:posix_fork, the function PyOS_AfterFork is called for
both the parent and the child, despite the docs stating that it should
be called in the new (child) process.
This causes problems in the parent since the forking thread becomes the
main thread according to the signal module.
Calling PyOS_AfterFork() only in the child fixes this. Changed for both
fork() and forkpty().
Somebody w/ gcc please check that the wngs are gone!
There are cheaper (at runtime) ways to prevent the wngs, but
they're obscure and delicate. I'm going for the easy Big
Hammer here under the theory that PCRE will be replaced by
SRE anyway.
- reorganized some code to get rid of -Wall and -W4
warnings
- fixed default argument handling for sub/subn/split
methods (reported by Peter Schneider-Kamp).
It gets initialized when pyexpat is imported, and is only accessible as an
attribute of pyexpat; it cannot be imported itself. This allows it to at
least be importable after pyexpat itself has been imported by adding it
to sys.modules, so it is not quite as strange.
This arrangement needs to be better thought out.
the pattern must have a fixed width.
- got rid of array-module dependencies; the match pro-
gram is now stored inside the pattern object, rather
than in an extra string buffer.
- cleaned up a various of potential leaks, api abuses,
and other minors in the engine module.
- use mal's new isalnum macro, rather than my own work-
around.
- untabified test_sre.py. seems like I removed a couple
of trailing spaces in the process...
Revise math_1(), math_2(), stub-generating macros, and function tables to
use PyArg_ParseTuple() and properly provide the function name for error
message generation.
Fix pow() docstring for MPW 3.1; had said "power" instead of "pow".
"lastgroup" is the name of the last matched capturing group,
"lastindex" is the index of the same group. if no group was
matched, both attributes are set to None.
the (?P#) feature will be removed in the next relase.
used by the code generator)
- changed max repeat value in engine (to match earlier array fix)
- added experimental "which part matched?" mechanism to sre; see
http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954
or python-dev for details.
speedup for some tests, including the python tokenizer.
-- added support for an optional charset anchor to the engine
(currently unused by the code generator).
-- removed workaround for array module bug.
-- changed 1.6 to 2.0 in the file headers
-- fixed ISALNUM macro for the unicode locale. this
solution isn't perfect, but the best I can do with
Python's current unicode database.
CHAR_MAX, use hardcoded -128 and 127. This may seem strange, unless
you realize that we're talking about signed bytes here! Bytes are
always 8 bits and 2's complement. CHAR_MIN and CHAR_MAX are
properties of the char data type, which is guaranteed to hold at least
8 bits anyway.
Otherwise you'd get failing tests on platforms where unsigned char is
the default (e.g. AIX).
Thanks, Vladimir Marangozov, for finding this nit!
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.
The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!
The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.
Notes on the patch:
There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)
Closes SourceForge patch #100505.
-- added pickling support (only works if sre is imported)
-- fixed wordsize problems in engine
(instead of casting literals down to the character size,
cast characters up to the literal size (same as the code
word size). this prevents false hits when you're matching
a unicode pattern against an 8-bit string. (unfortunately,
this broke another test, but I think the test should be
changed in this case; more on that on python-dev)
-- added sre.purge function
(unofficial, clears the cache)
This patch fixes possible overflows in the socket module for 64-bit
platforms (mainly Win64). The changes are:
- abstract the socket type to SOCKET_T (this is SOCKET on Windows, int
on Un*x), this is necessary because sizeof(SOCKET) > sizeof(int) on
Win64
- use INVALID_SOCKET on Win32/64 for an error return value for
accept()
- ensure no overflow of the socket variable for: (1) a PyObject return
value (use PyLong_FromLongLong if necessary); and (2) printf
formatting in repr().
Closes SourceForge patch #100516.
Tim posted a long comment to python-dev (subject: "Controversial patch
(cmath)"; date: 6/29/00). The conclusion is that this whole module
stinks and this patch isn't perfect, but it's better than the acosh
and asinh we had, so let's check it in.
group reset problem. in the meantime, I added some
optimizations:
- added "inline" directive to LOCAL
(this assumes that AC_C_INLINE does what it's
supposed to do). to compile SRE on a non-unix
platform that doesn't support inline, you have
to add a "#define inline" somewhere...
- added code to generate a SRE_OP_INFO primitive
- added code to do fast prefix search
(enabled by the USE_FAST_SEARCH define; default
is on, in this release)
This patch fixes a possible overflow in the Sleep system call on
Win32/64 in the time_sleep() function in the time module. For very
large values of the give time to sleep the number of milliseconds can
overflow and give unexpected sleep intervals. THis patch raises an
OverflowError if the value overflows.
Closes SourceForge patch #100514.
This patch fixes the posix module for large file support mainly on
Win64, although some general cleanup is done as well.
The changes are:
- abstract stat->STAT, fstat->FSTAT, and struct stat->STRUCT_STAT
This is because stat() etc. are not the correct functions to use on
Win64 (nor maybe on other platforms?, if not then it is now trivial to
select the appropriate one). On Win64 the appropriate system functions
are _stati64(), etc.
- add _pystat_fromstructstat(), it builds the return tuple for the
fstat system call. This functionality was being duplicated. As well
the construction of the tuple was modified to ensure no overflow of
the time_t elements (sizeof(time_t) > sizeof(long) on Win64).
- add overflow protection for the return values of posix_spawnv and
posix_spawnve
- use the proper 64-bit capable lseek() on Win64
- use intptr_t instead of long where appropriate from Win32/64 blocks
(sizeof(void*) > sizeof(long) on Win64)
This closes SourceForge patch #100513.
Mark Hammond provided (a long time ago) a better Win32 specific
time_clock implementation in timemodule.c. The library for this
implementation does not exist on Win64 (yet, at least). This patch
makes Win64 fall back on the system's clock() function for
time_clock().
This closes SourceForge patch #100512.
(those semantics are weird...)
- got rid of $Id$'s (for the moment, at least). in other
words, there should be no more "empty" checkins.
- internal: some minor cleanups.
(test_sre still complains about split, but that's caused by
the group reset bug, not split itself)
- added more mark slots
(should be dynamically allocated, but 100 is better than 32.
and checking for the upper limit is better than overwriting
the memory ;-)
- internal: renamed the cursor helper class
- internal: removed some bloat from sre_compile
tests in sre_patch back to previous version
- fixed return value from findall
- renamed a bunch of functions inside _sre (way too
many leading underscores...)
</F>
Fix warnings on 64-bit build build of signalmodule.c
- Though I know that SIG_DFL and SIG_IGN are just small constants,
there are cast to function pointers so the appropriate Python call is
PyLong_FromVoidPtr so that the pointer value cannot overflow on Win64
where sizeof(long) < sizeof(void*).
This patch fixes cPickle.c for 64-bit platforms.
- The false assumption sizeof(long) == size(void*) exists where
PyInt_FromLong is used to represent a pointer. The safe Python call
for this is PyLong_FromVoidPtr. (On platforms where the above
assumption *is* true a PyInt is returned as before so there is no
effective change.)
- use size_t instead of int for some variables
This patches fixes a possible overflow of the optional timeout
parameter for the select() function (selectmodule.c). This timeout is
passed in as a double and then truncated to an int. If the double is
sufficiently large you can get unexpected results as it
overflows. This patch raises an overflow if the given select timeout
overflows.
[GvR: To my embarrassment, the original code was assuming an int could
always hold a million. Note that the overflow check doesn't test for
a very large *negative* timeout passed in -- but who in the world
would do such a thing?]
The cause: Relatively recent (last month) patches to getargs.c added
overflow checking to the PyArg_Parse*() integral formatters thereby
restricting 'b' to unsigned char value and 'h','i', and 'l' to signed
integral values (i.e. if the incoming value is outside of the
specified bounds you get an OverflowError, previous it silently
overflowed).
The problem: This broke the array module (as Fredrik pointed out)
because *its* formatters relied on the loose allowance of signed and
unsigned ranges being able to pass through PyArg_Parse*()'s
formatters.
The fix: This patch fixes the array module to work with the more
strict bounds checking now in PyArg_Parse*().
How: If the type signature of a formatter in the arraymodule exactly
matches one in PyArg_Parse*(), then use that directly. If there is no
equivalent type signature in PyArg_Parse*() (e.g. there is no unsigned
int formatter in PyArg_Parse*()), then use the next one up and do some
extra bounds checking in the array module.
This partially closes SourceForge patch #100506.
This patch adds the openpty() and forkpty() library calls to posixmodule.c,
when they are available on the target
system. (glibc-2.1-based Linux systems, FreeBSD and BSDI at least, probably
the other BSD-based systems as well.)
Lib/pty.py is also rewritten to use openpty when available, but falls
back to the old SGI method or the "manual" BSD open-a-pty
code. Openpty() is necessary to use the Unix98 ptys under Linux 2.2,
or when using non-standard tty names under (at least) BSDI, which is
why I needed it, myself ;-) forkpty() is included for symmetry.
New ucnhash module by Bill Tutt. This module contains the hash
table needed to map Unicode character names to Unicode ordinals
and is loaded on-the-fly by the standard unicode-escape codec.
I discovered the [MREMAP_MAYMOVE] symbol is only defined when _GNU_SOURCE is
defined; therefore, here is the change: if we are compiling for linux,
define _GNU_SOURCE before including mman.h, and all is done.
this patch adds a fast _flatten function to the _tkinter
module, and imports it from Tkinter.py (if available).
this speeds up canvas operations like create_line and
create_polygon. for example, a create_line with 5000
vertices runs about 50 times faster with this patch in
place.
The seek() method is broken for any 'whence' value (seek from
start, current, orend) other than the default. I have a patch
that fixes that as well as gets mmap'd files working on
Linux64 and Win64.
Python on UNIX now trusts PYTHONHOME unconditionally
Modules/getpath.c:
Landmark changed to os.py.
Setting PYTHONHOME now unconditionally sets sys.prefix
(and sys.exec_prefix). No further checks are done whether the
standard lib can be found in that location or not. This is in
sync with the PC subdir getpath implementations.
PC/getpathp.c:
Landmark changed to os.py.
PC/os2vacpp/getpathp.c:
Landmark changed to os.py.
Note: BAW's checkin on exceptions.c eliminates earlier concerns about
a bogus PYTHONHOME value leading to a core dump. Instead it causes a
useless sys.path and prevents imports.
Lots of typo fixes (a bit too much cut-and-paste in this module)
Aliases removed: attr_on, attr_off, attr_set
Lowercased the names COLOR_PAIR and PAIR_NUMBER
#ifdef's for compiling on Solaris added (need to understand SYSV curses
versions better and generalize this)
Bumped version number bumped to 1.6
that before) in the previous patch has one problem; rint() is not in
the C math library on all platforms (e.g. not for VC++). Make it
conditional on HAVE_RINT.
into. Jim writes:
The core dump was due to a C decrement operation
in a macro invocation in load_pop. (BAD)
I fixed this by moving the decrement outside
the macro call.
I added a comment to load_pop and load_mark
to document the fact that cPickle separates the
unpickling stack into two separate stacks, one for
objects and one for marks.
I also moved some increments out of some macro
calls (PyTuple_SET_ITEM and PyList_SET_ITEM).
This wasn't necessary, but made me feel better. :)
I tested these changes in *my* cPickle, which
doesn't have the new Unicode stuff.
Fix overflow bug in ldexp(x, exp). The 'exp' argument maps to a C int for the
math library call [double ldexp(double, int)], however the 'd'
PyArg_ParseTuple formatter was used to yield a double, which was subsequently
cast to an int. This could overflow.
[GvR: mysteriously, on Solaris 2.7, ldexp(1, 2147483647) returns Inf
while ldexp(1, 2147483646) raises OverflowError; this seems a bug in
the math library (it also takes a real long time to compute the
Inf outcome). Does this point to a bug in the CHECK() macro? It
should have discovered that the result was outside the HUGE_VAL range.]
instead. This seems more robust than returning an Unicode string with
some unconverted charcters in it.
This still doesn't support getting truly binary data out of Tcl, since
we look for the trailing null byte; but the old (pre-Unicode) code did
this too, so apparently there's no need. (Plus, I really don't feel
like finding out how Tcl deals with this in each version.)
1. In Tcl 8.2 and later, use Tcl_NewUnicodeObj() when passing a Python
Unicode object rather than going through UTF-8. (This function
doesn't exist in Tcl 8.1, so there the original UTF-8 code is still
used; in Tcl 8.0 there is no support for Unicode.) This assumes that
Tcl_UniChar is the same thing as Py_UNICODE; a run-time error is
issued if this is not the case.
2. In Tcl 8.1 and later (i.e., whenever Tcl supports Unicode), when a
string returned from Tcl contains bytes with the top bit set, we
assume it is encoded in UTF-8, and decode it into a Unicode string
object.
Notes:
- Passing Unicode strings to Tcl 8.0 does not do the right thing; this
isn't worth fixing.
- When passing an 8-bit string to Tcl 8.1 or later that has bytes with
the top bit set, Tcl tries to interpret it as UTF-8; it seems to fall
back on Latin-1 for non-UTF-8 bytes. I'm not sure what to do about
this besides telling the user to disambiguate such strings by
converting them to Unicode (forcing the user to be explicit about the
encoding).
- Obviously it won't be possible to get binary data out of Tk this
way. Do we need that ability? How to do it?
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
Checkin 2.131 of posixmodule.c changed os.stat on Windows, so that
"/bin/" type notation (trailing backslash) would work on Windows to
be consistent with Unix.
However, the patch broke the simple case of: os.stat("\\")
This did work in 1.5.2, and obviously should!
This patch addresses this, and restores the correct behaviour.
utime(path, NULL) call, setting the atime and mtime of the file to the
current time. The previous signature utime(path, (atime, mtime)) is
of course still allowed.
This patch is a workaround for Macintosh, where the GUSI I/O library
(time, stat, etc) use the MacOS epoch of 1-Jan-1904 and the MSL C
library (ctime, localtime, etc) uses the (apparently ANSI standard)
epoch of 1-Jan-1900. Python programs see the MacOS epoch and we
convert values when needed.
This patch changes posixmodule.c:execv to
a) check for zero length args (does this to execve, too), raising
ValueError.
b) raises more rational exceptions for various flavours of duff arguments.
I *hate*
TypeError: "illegal argument type for built-in operation"
It has to be one of the most frustrating error messages ever.
socklen_t (unsigned int) for most size parameters. Apparently this is
part of the UNIX 98 standard.
[GvR: the changes to configure.in etc. that I just checked in make
sure that socklen_t is defined everywhere, so I deleted the little
part of Jack's mod to define socklen_t if not in GUSI2. I suppose I
will have to add it to the Windows config.h in a minute.]
"""
Problem description:
Run the following script:
import test.test_cpickle
for x in xrange(1000000):
reload(test.test_cpickle)
Watch Python's memory use go up up and away!
In the course of debugging this I also saw that cPickle is
inconsistent with pickle - if you attempt a pickle.load or pickle.dump
on a closed file, you get a ValueError, whereas the corresponding
cPickle operations give an IOError. Since cPickle is advertised as
being compatible with pickle, I changed these exceptions to match.
"""
Windows), soclose (on OS2), or to close (everywhere else).
Hopefully this fixes a new compilation error that I suddenly get on
Windows because the macro definition for close -> closesocket
apparently was done before including io.h, which contains a prototype
for close. (No idea why this wasn't an error before.)
backslash from the pathname argument to stat() on Windows -- while on
Unix, stat("/bin/") succeeds and does the same thing as stat("/bin"),
on Windows, stat("\\windows\\") fails while stat("\\windows") succeeds.
This modified version of the patch recognizes both / and \.
(This is odd behavior of the MS C library, since
os.listdir("\\windows\\") succeeds!)
The bug is in mmap_read_line_method(), and its loop that searches for
newlines. After the loop reaches EOF, eol is incremented and points
after the end of the memory. This results in readline() method
sometimes picking up and returning a byte after the end of the string.
This is usually a bogus \0, but it could cause SIGSEGV if it's after
the end of the page).
The patch fixes the problem. Also, it uses memchr() for finding a
character, which is in fact the "strnchr" the comment is asking for.
memchr() is already used in Python sources, so there should be no
portability problems.
The Setup.in entry is sort of a lie; it links with -lexpat, but
Expat's Makefile doesn't actually build a libexpat.a. I'll send
Expat's author a patch to do that; if he doesn't accept it, this
rule will have to list Expat's object files (ick!), or have a
comment explaining how to build a .a file.
None in an argument list *terminates* the argument list: further
arguments are *ignored*. This isn't kosher, but too much code relies
on it, implicitly. For example, IDLE was pretty broken.
Reformatted for 8-space tabs and fitted into 80-char lines by GvR.
Mark writes:
* the Win32 version now accepts the same args as the Unix version.
The win32 specific "tag" param is now optional. The end result is
that the exact same test suite runs on Windows (definately a worthy
goal!).
* I changed the error object. All occurences of the error, except
for 1, corresponds to an underlying OS error. This one was changed
to a ValueError (a better error for that condition), and the module
error object is now simply EnvironmentError. All win32 error
routines now call the new Windows specific error handler.
(1) In opendir(), don't call the lock-release macros; we're
manipulating list objects and that shouldn't be done in unlocked
state.
(2) Don't use posix_strint() for chmod() -- the mode_t arg might be a
64 bit int (reported by Nick Maclaren).
This was originally submitted by Martin von Loewis as part of his
Unicode patch; all I did was add special cases for Python int and
float objects and rearrange the object type tests somewhat to speed up
the common cases (string, int, float, tuple, unicode, object).
The attached patch set includes a workaround to get Python with
Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause
is a bug in the BSDI wchar.h header file) and Python interfaces
for the MBCS codec donated by Mark Hammond.
Also included are some minor corrections w/r to the docs of
the new "es" and "es#" parser markers (use PyMem_Free() instead
of free(); thanks to Mark Hammond for finding these).
The unicodedata tests are now in a separate file
(test_unicodedata.py) to avoid problems if the module cannot
be found.
a Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS block, but it
calls Py_BLOCK_THREADS anyway. The change moves Py_BLOCK_THREADS
to inside the if, so it's only executed when the function
actually returns unexpectedly.
Attached you find an update of the Unicode implementation.
The patch is against the current CVS version. I would appreciate
if someone with CVS checkin permissions could check the changes
in.
The patch contains all bugs and patches sent this week and also
fixes a leak in the codecs code and a bug in the free list code
for Unicode objects (which only shows up when compiling Python
with Py_DEBUG; thanks to MarkH for spotting this one).
This patch fixes 3 small problems.
1) If a map is used which is generated with 'makedbm -a',
the trailing '\0' is now handled correctely.
2) The nis.maps() function skipped the first map in the output list.
3) The library '-lnsl' is added in Setup.in (needed on Linux glibc2 and
Solaris systems. Maybe on other systems too?)
[I note that this still doesn't work when you are using NIS+ --GvR]
This patch allows building the Python 'mpzmodule' under SuSE Linux
without having to install the source package of the GMP-libary.
The gmp-mparam.h seems to be an internal header file. The patch
shouldn't hurt any other platforms.
The same problem (mixed mallocs) exists for the pcre stack.
The buffers md->... are allocated via PyMem_RESIZE in grow_stack(),
while in free_stack() they are released with free() instead of
PyMem_DEL().
The buffers self->regex and self->regex_extra are allocated in
pcre_compile() and pcre_study() via pcre_malloc, but are released
via free() instead of pcre_free.
(e.g. used for ZIP files).
The patch includes code that says:
+ Copyright (C) 1986 Gary S. Brown. You may use this program, or
+ code or tables extracted from it, as desired without restriction.
My interpretation (and Jim's) is that Gary S Brown has no claims under
copyright, patent or other rights or interests. Lawyers might disagree.
only. Through some mysterious interaction, they would take 9 separate
arguments as well. This misfeature is now disabled (to end a
difference with JPython).
building the dicts used to inform the user about the defined
constants when using the *conf*() APIs.
Thanks to Mark Hammond <mhammond@skippinet.com.au>.
strings to integers for the *conf*() functions.
Added code to sort the tables at module initialization. Three
dictionaries, confstr_names, sysconf_names, and pathconf_names, are
added to the module as well. These map known configuration setting
names to the numeric value which is used to represent the setting in
the system call. This code is always called.
Updated related comments.
pathconf() names, from Sjoerd.
Added code to verify that these tables are properly ordered, only
included and used when CHECK_CONFNAME_TABLES is defined. This is only
needed to test the tables, so I haven't enabled this by default.
available since the interface is poorly defined on at least one major
platform (Solaris).
Moved table of constant names for fpathconf() & pathconf() into the
conditional that defines the conv_path_confname() helper; Mark Hammond
reported that defining the table when none of the constants were
defined causes the compiler to complain (won't allow 0-length array,
imagine that!).
In posix_fpathconf(), use conv_path_confname() as the O& conversion
function, instead of the conv_confname() helper, which has the wrong
signature (posix_pathconf() already used the right thing).
and TMP_MAX.
Converted all functions that used PyArg_Parse() or PyArg_NoArgs() to
use PyArg_ParseTuple() and specified all function names using the
:name syntax in the format strings, to allow better error messages
when TypeError is raised for parameter type mismatches.
Brian E Gallew, which were improved and adapted to OpenSSL 0.9.4 by
Laszlo Kovacs of HP. Both have kindly given permission to include
the patches in the Python distribution. Final formatting by GvR.
Brian E Gallew, which were improved and adapted to OpenSSL 0.9.4 by
Laszlo Kovacs of HP. Both have kindly given permission to include
the patches in the Python distribution. Final formatting by GvR.
new:
readline.get_begidx() -> int
gets the beginning index in the command line string
delimiting the tab-completion scope. This would
probably be used from within a tab-completion
handler
readline.get_endidx() -> int
gets the ending index in the command line string
delimiting the tab-completion scope. This would
probably be used from within a tab-compeltion
handler
readline.set_completer_delims(string) -> None
sets the delimiters used by readline as word breakpoints
for tab-completion
readline.get_completer_delims() -> string
gets the delimiters used by readline as word breakpoints
for tab-completion
fixed:
readline.get_line_buffer() -> string
doesnt cause a debug message every other call
simply moves the call to Tk_MainWindow() after the Tcl/Tk
initialization calls. The patch is unconditional, it works with
earlier and later versions as well.
reformatted.)
- Illegal padding is now ignored. (Recommendation by GvR.)
- Padding no longer removes characters from data string (resulting in
lost data/strings with negative lengths).
- Illegal characters outside the ASCII range are now ignored, instead
of possibly being remapped to a valid character.
the right variant of gethostbyname_r for us, since not all Linuxes are
equal in this respect. Reported by Laurent Pointal.
(2) On BeOS, Chris Herborth reports that instead of arpa/inet.h you
must include net/netdb.h to get the inet_ntoa() and inet_addr()
prototypes.
names match the documentation.
Removed broken code that supports the __methods__ attribute on ast
objects; the right magic was added to Py_FindMethod() since this was
originally written. <ast-object>.__methods__ now works, so dir() and
rlcompleter are happy.
Treat them as read-only, and make a copy as appropriately. This was
first reported by Bill Janssend and later by Craig Rowland and Ron
Sedlmeyer. This fix is mine.
"""
It fixes a memory corruption error resulting from BadPickleGet
exceptions in load_get, load_binget and load_long_binget. This was
initially reported on c.l.py as a problem with Cookie.py; see the thread
titled "python core dump (SIGBUS) on Solaris" for more details.
If PyDict_GetItem(self->memo, py_key) call failed, then py_key was being
Py_DECREF'd out of existence before call was made to
PyErr_SetObject(BadPickleGet, py_key).
The bug can be duplicated as follows:
import cPickle
cPickle.loads('garyp')
This raises a BadPickleGet exception whose value is a freed object. A
core dump will soon follow.
"""
Jim Fulton approves of the patch.
different values in the environ dict with the same key (although he
couldn't explain exactly how this came to be). Since getenv() uses
the first one, Python should do too. (Some doubts about case
sensitivity, but for now this at least seems the right thing to do
regardless of platform.)
- Don't call Py_FatalError() when initialization fails.
- Fix bogus use of return value from PyRun_String().
- Fix misc. compiler errors on some platforms.
I've updated cPickle.c to use class exceptions:
Changed pickle error types to classes:
PickleError
PicklingError
UnpickleableError
UnpicklingError
And change the handling of unpickleable objects so that an UnpickleableError
is raised with the unpickleable object as the argument. UnpickleableError
has a reasonable string representation and provides access to the problem
object, which is useful during debugging.
[I'm still waiting for patches to do the same to pickle.py.]
The test really wanted to distinguish between the two. So now we test
for __GLIBC__ instead. I have confirmed that this works for glibc and
I have an email from Christian Tanzer confirming that it works for
libc5, so it should be fine.
I have attached a new cPickle that adds a new control attribute
to unpicklers:
Added new Unpickler attribute, find_global. If set to None, then
global and instance pickles are disabled. Otherwise, it should be set to
a callable object that takes two arguments, a module name and an
object name, and returns an object. If the attribute is unset, then
the default mechanism is used.
This feature provides an additional mechanism for controlling which
classes can be used for unpickling.
Without this, if inflate() returned Z_BUF_ERROR asking for more output
space, we would report the error; now, we increase the buffer size and
try again, just as for Z_OK.
"""
The GNU folks, in their infinite wisdom, have decided not to implement
altzone in libc6; this would not be horrible, except that timezone
(which is implemented) includes the current DST setting (i.e. timezone
for Central is 18000 in summer and 21600 in winter). So Python's
timezone and altzone variables aren't set correctly during DST.
Here's a patch relative to 1.5.2b2 that (a) makes timezone and altzone
show the "right" thing on Linux (by using the tm_gmtoff stuff
available in BSD, which is how the GLIBC manual claims things should
be done) and (b) should cope with the southern hemisphere. In pursuit
of (b), I also took the liberty of renaming the "summer" and "winter"
variables to "july" and "jan". This patch should also make certain
time calculations on Linux actually work right (like the tz-aware
functions in the rfc822 module).
(It's hard to find DST that's currently being used in the southern
hemisphere; I tested using Africa/Windhoek.)
"""
is not an empty string, this means that you have arrived at the
end of the stream of compressed data, and the contents of .unused_data are
whatever follows the compressed stream.
data struct before calling gethostby{name,addr}_r(); (2) ignore the
3/5/6 args determinations made by the configure script and switch on
platform identifiers instead:
AIX, OSF have 3 args
Sun, SGI have 5 args
Linux has 6 args
On all other platforms, undef HAVE_GETHOSTBYNAME_R altogether.
- Use HAVE_GETHOSTBYNAME_R_6_ARG instead of testing for Linux and
glibc2.
- If gethostbyname takes 3 args, undefine HAVE_GETHOSTBYNAME_R --
don't know what code should be used.
- New symbol USE_GETHOSTBYNAME_LOCK defined iff the lock should be used.
- Modify the gethostbyaddr() code to also hold on to the lock until
after it is safe to release, overlapping with the Python lock.
(Note: I think that it could in theory be possible that Python code
executed while gethostbyname_lock is held could attempt to reacquire
the lock -- e.g. in a signal handler or destructor. I will simply say
"don't do that then.")
Here's a patch to fix the race condition, which wasn't fixed by Rob's
patch. It holds the gethostbyname lock until the results are copied out,
which means that this lock and the Python global lock are held at the same
time. This shouldn't be a problem as long as the gethostbyname lock is
always acquired when the global lock is not held.
He writes:
I had an off-by-1000 error in floatsleep(),
and the problem with time.clock() is that it's not implemented properly
on QNX... ANSI says it's supposed to return _CPU_ time used by the
process, but on QNX it returns the amount of real time used... so I was
confused.
guessing what happened when strftime() returns 0. Is it buffer
overflow or was the result simply 0 bytes long? (This happens for an
empty format string, or when the format string is a single %Z and the
timezone is unknown.) if the buffer is at least 256 times as long as
the format, assume the latter.
converted was a "digit" -- use isalnum(). This test is there only to
guard against "+" or "-" being interpreted as a valid int literal.
Reported by Takahiro Nakayama.
up the _tkinter main loop. Not clear why; the _kbhit() call _tkinter
makes probably confuses the stdio library when buffering isn't set to
whatever it is by default.
f_fsid field, since it's not a scalar on all systems supporting this
call (in particular, it's a tuple of two longs on AIX). Since it's
not particularly useful, just nuke it. Adapted the doc strings too.
was appended to a list. Lists are reference count neutral, so the
string must be DECREF'd. Also added some checks for the return value
of PyList_Append().
Note: there are still some memory problems reported by Purify (I get
two Array Bounds Reads still and an Unitialized Memory Read). Also,
in scanning the code, there appears to be some potential problems
where return values aren't checked. To much to attack now though.
Purify) being caused by a bug in the readline library. Nothing we can
do about it.
Cause: readline_initialize_everything() throws away the return value
from rl_read_init_file(), but that happens to be the last reference to
a dynamically allocated char*.
decompressor object. This required adding a flag to the struct which is
true if initialisation was completed; on object destruction, deflateEnd()
is only called if the flag is true.
NOTE: There is still a bug of some sort in the behavior of zlib. In
at least one case, inflate returns Z_OK (which is typically
interpreted to mean that more output space is needed) when it has
finished inflating a buffer. This has been reported as a bug to the
zlib maintainers; we may need to change the Python interface.
> mpz.mpz('\xff') should return mpz(255). Instead it returns
> mpz(4294967295L). Looks like the constructor doesn't work with strings
> containing characters above chr(128).
Caused by using just 'char' where 'unsigned char' should have been used.
But IMHO, this problem really reveals an annoyance in Python's
makesetup. makesetup puts the global include directories "$(INCLUDEPY)
$(EXECINCLUDEPY)" in front of the directories defined by the module in
Setup. Therefore global (potentially older) header files are preferred
over the ones set by the module, which makes it hard to compile new
versions of modules when the old versions are installed. AFAIK, the
other way around is common practice for most other software.
This patch to makesetup would be an potential fix for this problem,
though I don't know if it breaks anything else.
- New copyright. (Open source)
- Added new protocol for binary string pickles that
takes out unneeded puts:
p=Pickler()
p.dump(x)
p.dump(y)
thePickle=p.getvalue()
This has little or no impact on pickling time, but
often reduces unpickling time and pickle size, sometimes
significantly.
- Changed unpickler to use internal data structure instead
of list to reduce unpickling times by about a third.
- Many cleanups to get rid of obfuscated error handling
involving 'goto finally' and status variables.
- Extensive reGuidofication. (formatting :)
- Fixed binary floating-point pickling bug. 0.0 was not
pickled correctly.
- Now use binary floating point format when saving
floats in binary mode.
- Fixed some error message spelling error.
- New copyright. (Open source)
- Fixed problem in seek method. The seek method should (and now does)
fill with nulls when seeking past the end of the "file".
device and control pseudo-device:
- first look for the device filename in the environment variable
AUDIODEV.
- if not found, use /dev/audio
- calculate the control device by tacking "ctl" onto the base device
name.
We now do this. Also, if the open fails, we call
PyErr_SetFromErrnoWithFilename() to give a more informative error
message.
Added a fileno() method to the audio object returned from open().
This returns the file descriptor which can be used by applications to
set up SIGPOLL notification, as per the manpage.
solves the conflict with curses over the 'clear' entry point much
nicer. (Jack had checked in the changes to cstubs eons ago, but I
never regenrated glmodule.c :-( )
the compilation flags for the gl, fl and fm modules. This avoids a
name conflict with the curses module (both gl and curses have an entry
point called 'clear').
thread state of the thread calling mainloop() (or another event
handling function) rather than the thread state of the function that
created the client data structure.
2-digit years are now converted using rules that are (according to
Fredrik Lundh) recommended by POSIX or X/Open: 0-68 mean 2000-2068,
69-99 mean 1969-1999.
2-digit years are now only accepted if time.accept2dyear is set to a
nonzero integer; if it is zero or not an integer or absent, only year
values >= 1900 are accepted. Year values 100-1899 and negative year
values are never accepted.
The initial value of time.accept2dyear depends on the environment
variable PYTHONY2K: if PYTHONY2K is set and non-empty,
time.accept2dyear is initialized to 0; if PYTHONY2K is empty or not
set, time.accept2dyear is initialized to 0.
I had to make a slight diddle to work with Python 1.4, which
we and some of our customers are still using. :(
I've also made a few minor enhancements:
- You can now both get and set the memo using a 'memo'
attribute. This is handy for certain advanced applications
that we have.
- Added a 'binary' attribute to get and set the binary
mode for a pickler.
- Added a somewhat experimental 'fast' attribute. When this
is set, objects are not placed in the memo during pickling.
This should lead to faster pickling and smaller pickles in
cases where:
o you *know* there are no circular references, and
o either you've:
- preloaded the memo with class information
by pickling classes in non-fast mode or by
manipilating the memo directly, or
- aren't pickling instances.
1. Only DECREF the class's module when the module is retrieved via
PyImport_Import. If it is retrieved from the modules dictionary with
PyDict_GetItem, it is using a borrowed reference.
2. If the module doesn't define the desired class, raise the same
SystemError that pickle.py does instead of returning an AttributeError
(which is cryptic at best).
Also, fix the PyArg_ParseTuple in cpm_loads (the externally visible
loads) function: Use "S" instead of "O" because cStringIO will croak
with a "bad arguments to internal function" if passed anything other
than a string.
gethostbyaddr(). (Plain gethostbyname() returns only the IP address.)
This moves the code shared by gethostbyaddr() and gethostbyname_ex()
to a subroutine.
Original patch by Dan Stromberg; some tweaks by GvR.
exceptions:
posix_error_with_filename(): New function which calls
PyErr_SetFromErrnoWithFilename()
The following methods have been changed to call
posix_error_with_filename():
posix_1str()
posix_strint()
posix_strintint()
posix_do_stat()
posix_mkdir()
posix_utime()
posix_readlink()
posix_open()
INITFUNC(): os.error (nee PosixError) is PyExc_OSError
low-level Python exit handler. This can attempt to call Python code
at a point that the interpreter and thread state have already been
destroyed, causing a Bus Error. Given the intended use of
Py_AtExit(), I'm not convinced that it's a good idea to call it
earlier during Python's finalization sequence... (Although this is
the only use for it in the entire distribution.)
PythonCmd_Error() but failed to return. The error wasn't very likely
(only when we run out of memory) but since the check is there we might
as well return the error. (I think that Barry introduced this buglet
when he added error checks everywhere.)