This patch fixes a problem on AIX with the signed int case code in
getargs.c, after Trent Mick's intervention about MIN/MAX overflow
checks. The AIX compiler/optimizer generates bogus code with the
default flags "-g -O" causing test_builtin to fail: int("10", 16) <>
16L. Swapping the two checks in the signed int code makes the problem
go away.
Also, make the error messages fit in 80 char lines in the
source.
The depth field was never decremented inside w_object(), and it was
never initialized in PyMarshal_WriteObjectToFile().
This caused imports from .pyc files to fail mysteriously when the .pyc
file was written by the broken code -- w_object() would bail out
early, but PyMarshal_WriteObjectToFile() doesn't check the error or
return an error code, and apparently the marshalling code doesn't call
PyErr_Check() either. (That's a separate patch if I feel like it.)
Various small fixes to the builtin module to ensure no buffer
overflows.
- chunk #1:
Proper casting to ensure no truncation, and hence no surprises, in the
comparison.
- chunk #2:
The id() function guarantees a unique return value for different
objects. It does this by returning the pointer to the object. By
returning a PyInt, on Win64 (sizeof(long) < sizeof(void*)) the pointer
is truncated and the guarantee may be proven false. The appropriate
return function is PyLong_FromVoidPtr, which returns a PyLong if that
is necessary to return the pointer without truncation.
[GvR: note that this means that id() can now return a long on Win32
platforms. This *might* break some code...]
- chunk #3:
Ensure no overflow in raw_input(). Granted the user would have to pass
in >2GB of data but it *is* a possible buffer overflow condition.
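The truncation described in chunk #2 can be shown with plain integer
arithmetic. This is only a sketch: the two addresses are made up, the
helper names are hypothetical, and the 32-bit mask stands in for passing
a 64-bit pointer through a 32-bit C long.

```python
# Hypothetical 64-bit object addresses that differ only above bit 31.
addr_a = 0x0000000112345678
addr_b = 0x0000000212345678

LONG_BITS = 32  # assumption: sizeof(long) == 4 while sizeof(void*) == 8
MASK = (1 << LONG_BITS) - 1


def truncated_id(addr):
    """Model returning the pointer through a 32-bit C long."""
    return addr & MASK


def full_id(addr):
    """Model returning the pointer without truncation."""
    return addr
```

With these values the truncated ids collide even though the objects are
distinct, while the untruncated ids stay different.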
As I really do not have anything better to do at the moment, I have written
a patch to Python/marshal.c that prevents Python dumping core when trying
to marshal a stack-bustingly deep (or recursive) data structure.
It just throws an exception; even slightly clever handling of recursive
data is what pickle is for...
[Fred Drake:] Moved magic constant 5000 to a #define.
This closes SourceForge patch #100645.
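A rough Python model of the depth guard, not the actual marshal.c code.
The class name, the output format and the small limit of 50 are
assumptions made for this sketch (the patch itself uses 5000). Note that
the depth is decremented on every exit path, so one failed write does not
poison later ones.

```python
MAX_DEPTH = 50  # assumption: kept small so the sketch stays below Python's own recursion limit


class SketchWriter:
    """Toy serializer that tracks nesting depth while writing lists."""

    def __init__(self):
        self.depth = 0
        self.out = []

    def write(self, obj):
        self.depth += 1
        try:
            if self.depth > MAX_DEPTH:
                # Raise instead of recursing until the C stack overflows.
                raise ValueError("object too deeply nested")
            if isinstance(obj, list):
                self.out.append("[")
                for item in obj:
                    self.write(item)
                self.out.append("]")
            else:
                self.out.append(repr(obj))
        finally:
            # Always undo the increment, even when an exception propagates.
            self.depth -= 1
```

A self-referencing list (`loop = []; loop.append(loop)`) hits the limit
and raises ValueError, and the writer's depth is back at 0 afterwards.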
the number of children of a node exceeds the max possible value for
the short that is used to count them. The Python runtime converts
this parser error into the SyntaxError "expression too long."
module and into _exceptions.c. This includes all the PyExc_* globals,
the bltin_exc table, init_class_exc(), fini_instances(),
finierrors().
Renamed _PyBuiltin_Init_1() to _PyBuiltin_Init() since the two phase
initializations are not necessary any more.
Removed as obsolete _PyBuiltin_Init_2(), _PyBuiltin_Fini_1() and
_PyBuiltin_Fini_2().
need two phase init or fini of the builtin module. Change the call of
_PyBuiltin_Init_1() to _PyBuiltin_Init(). Add a call to
init_exceptions().
Py_Finalize(): Don't call _PyBuiltin_Fini_1(). Instead call
fini_exceptions() but move this to before the thread state is
cleared.
Limit the 'b' formatter of PyArg_ParseTuple to valid values of an unsigned
char, i.e. [0,UCHAR_MAX]. It is expected that this is the common usage of 'b'.
An OverflowError is raised if the parsed value is outside this range.
Changes the 'b', 'h', and 'i' formatters in PyArg_ParseTuple to raise an
Overflow exception if they overflow (previously they just silently
overflowed).
Changes by Guido: always accept values [0..255] (in addition to
[CHAR_MIN..CHAR_MAX]) for 'b' format; changed some spaces into tabs in
other code.
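The accepted range after Guido's change can be sketched as follows. This
is an illustration in Python, not the getargs.c code; `parse_b` is a
hypothetical name, the error text is invented, and an 8-bit signed char
is assumed.

```python
CHAR_MIN, CHAR_MAX = -128, 127  # assumption: 8-bit signed char
UCHAR_MAX = 255


def parse_b(value):
    """Accept [CHAR_MIN..CHAR_MAX] and [0..UCHAR_MAX]; reject everything else."""
    if not (CHAR_MIN <= value <= UCHAR_MAX):
        raise OverflowError("byte-sized integer out of range: %d" % value)
    # Return the bit pattern that would be stored in the single byte.
    return value & 0xFF
```

Under these assumptions both -1 and 255 are accepted and map to the same
stored byte, while 256 and -129 raise OverflowError instead of silently
wrapping.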
who wrote:
Here's the new version of thread_nt.h. More particularly, there is a
new version of thread lock that uses a kernel object (e.g. semaphore)
only in case of contention; in other cases it simply uses interlocked
functions, which are faster by an order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery is initialised and (mostly) the interpreter global lock is on,
the difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With the original thread_nt.h,
Pystone results with initialised threads are twofold worse than w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
- When 'import exceptions' fails, don't suggest to use -v to print the traceback;
this doesn't actually work.
- Remove comment about fallback to string exceptions.
- Remove a PyErr_Occurred() check after all is said and done that can
never trigger.
- Remove static function newstdexception() which is no longer called.
Added 'u' and 'u#' tags for PyArg_ParseTuple - these turn a
PyUnicodeObject argument into a Py_UNICODE * buffer, or a Py_UNICODE *
buffer plus a length with the '#'. Also added an analog to 'U'
for Py_BuildValue.
return 0 (exceptions don't match). This means that if an ImportError
is raised because exceptions.py can't be imported, the interpreter
will exit "cleanly" with an error message instead of just core
dumping.
PyErr_SetFromErrnoWithFilename(), PyErr_SetFromWindowsErrWithFilename():
Don't test on Py_UseClassExceptionsFlag.
are no longer supported (i.e. -X option is removed).
_PyBuiltin_Init_1(): Don't call initerrors(). This does mean that it
is possible to raise an ImportError before that exception has been
initialized, say because exceptions.py can't be found, or contains
bogosity. See changes to errors.c for how this is handled.
_PyBuiltin_Init_2(): Don't test Py_UseClassExceptionsFlag, just go
ahead and initialize the class-based standard exceptions. If this
fails, we throw a Py_FatalError.
Changed all references to the MAGIC constant to use a global
pyc_magic instead. This global is initially set to MAGIC, but can be
changed by the _PyImport_Init() function to provide for
special features implemented in the compiler which are settable
using command line switches and affect the way PYC files are
generated.
Currently this change is only done for the -U flag.
Support for the new -U command line option:
with the option enabled the Python compiler
interprets all "..." strings as u"..." (same with r"..." and
ur"...").
Follow a suggestion in an /*XXX*/ comment [in com_add()] to speed up
compilation by using supplemental dictionaries to keep track of names
and constants, eliminating quadratic behavior. With this patch in
place, the time to import a 5000-line file with lots of constants [at
the global level] is reduced from 20 seconds to under 3 on my system.
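The idea of pairing the list with a lookup dictionary can be sketched
like this. It is not the compile.c code; `NameTable` and `add_linear` are
hypothetical names, and keying on (type, value) is an assumption made so
that 1, 1.0 and True stay distinct entries. Values are assumed hashable.

```python
class NameTable:
    """Append-only list plus a dictionary mapping each entry to its index."""

    def __init__(self):
        self.items = []
        self.index = {}

    def add(self, value):
        key = (type(value), value)
        pos = self.index.get(key)
        if pos is None:
            # First time we see this entry: append and remember where it went.
            pos = len(self.items)
            self.items.append(value)
            self.index[key] = pos
        return pos


def add_linear(items, value):
    """The slow alternative: scan the whole list on every insertion."""
    for pos, existing in enumerate(items):
        if type(existing) is type(value) and existing == value:
            return pos
    items.append(value)
    return len(items) - 1
```

Both functions hand out the same indices; the dictionary version does a
constant-time lookup per call where the scan is linear in the table size,
which is what makes n insertions quadratic.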
Here's a patch which changes modsupport to add 'u' and 'u#',
to support building Unicode objects from a null-terminated
Py_UNICODE *, and a Py_UNICODE * with length, respectively.
[Conversion from 'U' to 'u' by Fred, based on python-dev comments.]
Note that the use of None for NULL values of the Py_UNICODE* value is
still in; I'm not sure of the conclusion on that issue.
remaining object references if the environment variable PYTHONDUMPREFS
exists. The default behaviour caused problems for background or
otherwise invisible processes that use the debug build of Python.
Fixed a memory leak found by Fredrik Lundh. Instead of
PyUnicode_AsUTF8String() we now use _PyUnicode_AsUTF8String() which
returns the string object without incremented refcount (and assures
that the so obtained object remains alive until the Unicode object is
garbage collected).
"""
Running "test_extcall" repeatedly results in memory leaks.
One of these can't be fixed (at least not easily!), it happens since
this code:
    def saboteur(**kw):
        kw['x'] = locals()
    d = {}
    saboteur(a=1, **d)
creates a circular reference - d['x']['d']==d
The others are due to some missing decrefs in ceval.c, fixed by the
patch attached below.
Note: I originally wrote this without the "goto", just adding the
missing decref's where needed. But I think the goto is justified in
keeping the executable code size of ceval as small as possible.
"""
[I think the circular reference is more like kw['x']['kw'] == kw. --GvR]
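The cycle can be checked directly in a present-day Python 3 interpreter
(an assumption; the report was against a much older one). The only change
to the quoted code is returning kw so that it can be inspected.

```python
def saboteur(**kw):
    kw['x'] = locals()   # the locals mapping contains 'kw' itself
    return kw            # returned only so the cycle can be inspected


d = {}
kw = saboteur(a=1, **d)
# kw -> kw['x'] -> kw['x']['kw'] -> kw again
cycle_found = kw['x']['kw'] is kw
```

Here `cycle_found` is true, which matches the kw['x']['kw'] reading of
the cycle.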
Added special case to unicode(): when being passed a
Unicode object as first argument, return the object as-is.
Raises an exception when given a Unicode object *and* an
encoding name.
comparing code objects. This gives less surprising results in
-Optimized code. It also sorts code objects by name, now.
[I changed the patch to hash() slightly to touch fewer lines.]
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent). Checkin messages:
New Unicode support for int(), float(), complex() and long().
- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
Unicode to a decimal char* string (used in the above new
APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above
Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
are masked, all other errors such as ValueError during
Unicode coercion are passed through (note that PyUnicode_Compare
does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
masked and 0 returned; all other errors are passed through
Better testing support for the standard codecs.
Misc minor enhancements, such as an alias dbcs for the mbcs codec.
Changes:
- PyLong_FromString() now applies the same error checks as
does PyInt_FromString(): trailing garbage is reported
as error and no longer silently ignored. The only characters
which may be trailing the digits are 'L' and 'l' -- these
are still silently ignored.
- string.ato?() now directly interface to int(), long() and
float(). The error strings are now a little different, but
the type still remains the same. These functions are now
ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
in the input string; PyNumber_Long() already did this (and
still does)
Followed by:
Looks like I've gone a step too far there... (and test_contains.py
seems to have a bug too).
I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
If a non-tuple sequence is passed as the *arg, convert it to a tuple
before checking its length.
If named keyword arguments are used in combination with **kwargs, make
a copy of kwargs before inserting the new keys.
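Both behaviours are visible from Python code. A small sketch, run against
a present-day Python 3 (an assumption); `show`, `seq` and `extra` are
names made up for the example.

```python
def show(*args, **kwargs):
    return args, kwargs


seq = [1, 2, 3]       # a list, not a tuple
extra = {"b": 2}      # passed with **, alongside a named keyword

args, kwargs = show(*seq, a=1, **extra)
```

The callee receives a tuple even though a list was passed, and the
caller's dictionary is left without the extra key "a".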
the return value of PySequence_Length(). If an exception occurred,
the returned length will be -1. Make sure this doesn't get obscured,
and that the bogus length isn't used.
executive summary:
Instead of typing 'apply(f, args, kwargs)' you can type 'f(*args, **kwargs)'.
Some file-by-file details follow.
Grammar/Grammar:
simplify varargslist, replacing '*' '*' with '**'
add * & ** options to arglist
Include/opcode.h & Lib/dis.py:
define three new opcodes
CALL_FUNCTION_VAR
CALL_FUNCTION_KW
CALL_FUNCTION_VAR_KW
Python/ceval.c:
extend TypeError "keyword parameter redefined" message to include
the name of the offending keyword
reindent CALL_FUNCTION using four spaces
add handling of sequences and dictionaries using extend calls
fix function import_from to use PyErr_Format
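A short sketch of the equivalence from the executive summary. Since
apply() no longer exists in current Python, a stand-in is defined here;
that helper, `area` and `call_with_duplicate` are hypothetical names for
the example only.

```python
def apply(func, args=(), kwargs=None):
    """Stand-in for the old builtin apply(); defined only for this sketch."""
    return func(*args, **(kwargs or {}))


def area(width, height=1):
    return width * height


def call_with_duplicate():
    """Pass 'height' both by name and through **; return the error text."""
    try:
        area(2, height=3, **{"height": 4})
    except TypeError as exc:
        return str(exc)
    return None
```

`apply(area, (2,), {"height": 3})` and `area(*(2,), **{"height": 3})`
give the same result, and the TypeError for the redefined keyword names
the offending keyword.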
The attached patch set includes a workaround to get Python with
Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause
is a bug in the BSDI wchar.h header file) and Python interfaces
for the MBCS codec donated by Mark Hammond.
Also included are some minor corrections w/r to the docs of
the new "es" and "es#" parser markers (use PyMem_Free() instead
of free(); thanks to Mark Hammond for finding these).
The unicodedata tests are now in a separate file
(test_unicodedata.py) to avoid problems if the module cannot
be found.
Attached you find the latest update of the Unicode implementation.
The patch is against the current CVS version.
It includes the fix I posted yesterday for the core dump problem
in codecs.c (was introduced by my previous patch set -- sorry),
adds more tests for the codecs and two new parser markers
"es" and "es#".
Andy Robinson noted a core dump in the codecs.c file. This
was introduced by my latest patch which fixed a memory leak
in codecs.c. The bug causes all successful codec lookups to fail.
Attached you find an update of the Unicode implementation.
The patch is against the current CVS version. I would appreciate
if someone with CVS checkin permissions could check the changes
in.
The patch contains all bug fixes and patches sent this week and also
fixes a leak in the codecs code and a bug in the free list code
for Unicode objects (which only shows up when compiling Python
with Py_DEBUG; thanks to MarkH for spotting this one).
Added wrapping macros to dictobject.c, listobject.c, tupleobject.c,
frameobject.c, traceback.c that safely prevent core dumps
on stack overflow. Macros and functions in object.c, object.h.
The method is an "elevator destructor" that turns cascading
deletes into tail recursive behavior when some limit is hit.
* Changes to a recent patch by Chris Tismer to errors.c. Chris' patch
always used FormatMessage() to get the error message passing the error code
from errno - but errno and FormatMessage use a different numbering scheme.
The main reason the patch looked OK was that ENOFILE==ERROR_FILE_NOT_FOUND -
but that is about the only shared error code :-). The MS CRT docs tell you
to use _sys_errlist()/_sys_nerr. My patch also does this, and adds a very
similar function specifically for win32 error codes.
PR#175 -- when exec is passed a code object, it didn't sync the locals
from the dictionary back into their fast representation.
Also took the time to remove some repetitive code there and to do the
syncing even when an exception is raised (since a partial effect
should still be synced).
* in import.c, #ifdef out references to dynamic loading based on
HAVE_DYNAMIC_LOADING
* clean out the platform-specific crud from importdl.c.
[ maybe fold this function into import.c and drop the importdl.c file? Greg.]
* change GetDynLoadFunc's "funcname" parameter to "shortname". change
"name" to "fqname" for clarification.
* each GetDynLoadFunc now creates its own funcname value.
WARNING: as I mentioned previously, we may run into an issue with a
missing "_" on some platforms. Testing will show this pretty quickly,
however.
* move pathname munging into dynload_shlib.c
Here's a patch that avoids a warning caused by the "const char* pathname"
declaration for _PyImport_GetDynLoadFunc (in dynload_aix). The "aix_load"
function's 1st arg is prototyped as "char *pathname".
file per platform (really: per style of Dl API; e.g. all platforms
using dlopen() are grouped together in dynload_shlib.c.).
This is part of a set of patches by Greg Stein.
Duzan, for AIX, to support C++ objects with static initializers, when
using the genuine IBM C++ compiler (namely xlC/xlC_r).
See accompanying patches to configure.in and acconfig.h.
not as descriptive as what Barry suggests, but this also catches the
(in my opinion important) case where some other C code besides apply()
constructs a kwdict that doesn't have the right format. All the other
possibilities of getting it wrong (non-dict, wrong keywords etc) are
already caught so this makes sense to check here.
For a long time I've seen absurd tracebacks under -O (e.g., negative
line numbers), but very rarely. Since I was looking at tracebacks
anyway, thought I'd track it down. Turns out to be Guido's only
predictable blind spot <wink -- "char" is signed on some non-GvR
systems>. Patch follows.
tracefunc (or profilefunc -- we're not sure which), zap the global
trace and profile funcs so that we can't get into a recursive loop when
instantiating the resulting class based exception.
"""
Following up Robin Dunn's troubles with freeze, here's a patch that
fixes an oddity regarding the import logic of shared modules on AIX.
Symbol resolution of shared modules is now handled properly for the cases
when the python library is linked to a binary with an arbitrary name.
This includes the standard python[version] executable, but also applications
that are embedding the python core (i.e. linked with libpython[version].a,
the latter being static or shared).
"""
Introduce a new builtin exception, UnboundLocalError, raised when ceval.c
tries to retrieve or delete a local name that isn't bound to a value.
Currently raises NameError, which makes this behavior a FAQ since the same
error is raised for "missing" global names too: when the user has a global
of the same name as the unbound local, NameError makes no sense to them.
Even in the absence of shadowing, knowing whether a bogus name is local or
global is a real aid to quick understanding.
Example:
D:\src\PCbuild>type local.py
x = 42

def f():
    print x
    x = 13
    return x

f()

D:\src\PCbuild>python local.py
Traceback (innermost last):
  File "local.py", line 8, in ?
    f()
  File "local.py", line 4, in f
    print x
UnboundLocalError: x

D:\src\PCbuild>
Note that UnboundLocalError is a subclass of NameError, for compatibility
with existing class-exception code that may be trying to catch this as a
NameError. Unfortunately, I see no way to make this wholly compatible
with -X (see comments in bltinmodule.c): under -X, [UnboundLocalError
is an alias for NameError --GvR].
[The ceval.c patch differs slightly from the second version that Tim
submitted; I decided not to raise UnboundLocalError for DELETE_NAME,
only for DELETE_LOCAL. DELETE_NAME is only generated at the module
level, and since at that level a NameError is raised for referencing
an undefined name, it should also be raised for deleting one.]
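The subclass relationship can be checked from Python. This sketch uses
present-day Python 3 syntax (an assumption, hence print() as a function);
`classify` is a hypothetical helper.

```python
x = 42


def f():
    print(x)   # x is local here because of the assignment below
    x = 13
    return x


def classify():
    """Call f() and report which exception it raised."""
    try:
        f()
    except UnboundLocalError as exc:
        return type(exc).__name__, isinstance(exc, NameError)
    return None
```

Code that catches NameError keeps working, because the new exception is
also an instance of NameError.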
We occasionally received reports from people getting "invalid tstate"
crashes (this is a fatal error in PyThreadState_Delete()). Finally
several people were able to reproduce it reliably and Tim Peters
discovered that there is a race condition when multiple threads are
calling this function without holding the global interpreter lock (the
function may be called without holding that).
Solved the race condition by adding a lock around the mutating uses of
interp->tstate_head. Tim and Jonathan Giddy have run tests that make
it likely that this fixes the crashes -- although Tim hasn't heard
from the person who reported the original problem.
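The shape of the fix, a lock around every mutation of the list head, can
be sketched in Python. This is a model, not the pystate.c code:
`StateList`, `worker` and the node layout are assumptions made for the
example.

```python
import threading


class StateList:
    """Singly linked list whose head is only mutated while holding a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._head = None  # each node is a two-item list: [state, next_node]

    def add(self, state):
        with self._lock:
            self._head = [state, self._head]

    def delete(self, state):
        with self._lock:
            prev, node = None, self._head
            while node is not None and node[0] is not state:
                prev, node = node, node[1]
            if node is None:
                raise RuntimeError("invalid tstate")
            if prev is None:
                self._head = node[1]
            else:
                prev[1] = node[1]

    def __len__(self):
        with self._lock:
            count, node = 0, self._head
            while node is not None:
                count, node = count + 1, node[1]
            return count


def worker(states, rounds):
    """Register and unregister a fresh state object repeatedly."""
    for _ in range(rounds):
        token = object()
        states.add(token)
        states.delete(token)
```

Several threads running `worker` against one StateList should leave it
empty; deleting a state that was never added raises the error.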
ExtensionClasses in isinstance() and issubclass().
- abstract instance and class protocols are used *only* in those
cases that would generate errors before the patch. That is, there's
no penalty for the normal case.
- instance protocol: an object smells like an instance if it
has a __class__ attribute that smells like a class.
- class protocol: an object smells like a class if it has a
__bases__ attribute that is a tuple with elements that
smell like classes (although not all elements may actually get
sniffed ;).
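The two protocols can be sketched as a pair of predicates. The function
names and `FakeClass` are hypothetical, and unlike the description above
this version sniffs every element of __bases__.

```python
def smells_like_class(obj):
    """True if obj has a __bases__ tuple whose elements all smell like classes."""
    bases = getattr(obj, "__bases__", None)
    if not isinstance(bases, tuple):
        return False
    return all(smells_like_class(base) for base in bases)


def smells_like_instance(obj):
    """True if obj has a __class__ attribute that smells like a class."""
    cls = getattr(obj, "__class__", None)
    return cls is not None and smells_like_class(cls)


class FakeClass:
    """Not a real class, but it carries a __bases__ tuple like one."""

    def __init__(self, *bases):
        self.__bases__ = bases
```

In present-day Python every object has a __class__, so nearly everything
smells like an instance; the class test is the more selective of the two.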
man pages suggest that the proper thing to do is to add THR_NEW_LWP to
the flags on thr_create(), and that there really isn't a downside, so
I'll do that.
"""
Spec says that on success pthread_create returns 0. It does not say
that an error code will be < 0. Linux glibc2 pthread_create() returns
ENOMEM (12) when one exceeds process limits. (It looks like it should
return EAGAIN, but that's another story.)
For reference, see:
http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_create.html
"""
[I have a feeling that similar bugs were fixed before; perhaps someone
could check that all error checks now check for != 0?]
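The difference between the two checks, modelled in Python. The function
names are hypothetical, and the value 12 is simply the one quoted in the
report; errno values are platform-specific in general.

```python
ENOMEM = 12  # value quoted in the report above; not portable


def failed_old(status):
    """Old-style check: only negative values count as errors."""
    return status < 0


def failed_new(status):
    """Check matching the spec: 0 is success, anything else is an error."""
    return status != 0
```

The old check lets a positive error code such as 12 pass as success; the
new one catches it while still treating 0 as success.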
xrange(), especially for platforms where int and long are different
sizes (so sys.maxint isn't actually the theoretical limit for the
length of a list, but the largest C int is -- sys.maxint is the
largest Python int, which is actually a C long).
test for classes with a __complex__() method. The attribute is pulled
out of the instance with PyObject_GetAttr() but this transfers
ownership and the function object was never DECREF'd.
v temporary variable was never decref'd. Test this by starting up the
interpreter, hitting C-c, then immediately exiting.
Same potential leak can occur if error is E_NOMEM, since the return is
done in the case block. Added Py_XDECREF(v); to both blocks, just
before the return.
think we have our own DOS box (i.e. we're not started from a command
line shell), we print a message and wait for the user to hit a key
before the DOS box is closed.
The hacky heuristic for determining whether we have our *own* DOS box
(due to Mark Hammond) is to test whether we're on line zero...
The following patches (relative to 1.5.2b1) enable Python dynamic
loading to work on NetBSD platforms that use ELF (presently mips and
alpha systems). They automatically determine whether the system is ELF or
a.out rather than using a static list of platforms so that when other
NetBSD platforms move to ELF, python will continue to work without
change.
In other words, sys.hexversion == 0x010502b2 for Python 1.5.2b2.
This is derived from the new variable PY_VERSION_HEX defined in patchlevel.h.
(Cute, eh?)
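How the fields pack into that number, as a sketch. The helper name and
the release-level table are assumptions based on the usual layout (one
byte each for major, minor and micro, then a nibble for the release level
and a nibble for the serial).

```python
RELEASE_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def version_hex(major, minor, micro, level, serial):
    """Pack version fields into a single integer, most significant first."""
    return (
        (major << 24)
        | (minor << 16)
        | (micro << 8)
        | (RELEASE_LEVELS[level] << 4)
        | serial
    )
```

For 1.5.2b2 this gives 0x010502b2. Because the fields are packed most
significant first, comparing two such integers orders the versions
correctly.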
Also (non-BSDI specific):
- Change the CHECK_STATUS() macro so it tests for nonzero error codes
instead of negative error codes only (this was needed for BSDI, but
appears to be correct according to the PTHREADS spec).
- use memset() to zero out the allocated lock structure. Again, this
was needed for BSDI, but can't hurt elsewhere either.
initialization of class exceptions. Specifically:
init_class_exc(): This function now returns an integer status of the
class exception initialization. No fatal errors in this method now.
Also, use PySys_WriteStderr() when writing error messages. When an
error occurs in this function, 0 is returned, but the partial creation
of the exception classes is not undone (this happens elsewhere).
Things that could trigger the fallback:
- exceptions.py fails to be imported (due to syntax error, etc.)
- one of the exception classes is missing (e.g. due to library
version mismatch)
- exception class can't be inserted into __builtin__'s dictionary
- MemoryError instance can't be pre-allocated
- some other PyErr_Occurred
newstdexception(): Changed the error message. This is still a fatal
error because if the string based exceptions can't be created, we
really can't continue.
initerrors(): Be sure to xdecref the .exc field, which might be
non-NULL if class exceptions init was aborted.
_PyBuiltin_Init_2(): If class exception init fails, print a warning
message and reinstate the string based exceptions.
that file in fact did not exist or at least was not used. Change this
so that __file__ is *only* set to the .pyc/.pyo file when it actually
read the code object from it; otherwise __file__ is set to the .py
file.
happen when you use a non-keyword argument after a keyword argument,
and in this case you also get a syntax error. I fully suspect that
the underflow is caused by the code that stops generating code when it
detects the syntax error, but I can't find the culprit right now. I
know, I know.)
The MS compiler doesn't call it 'long long', it uses __int64,
so a new #define, LONG_LONG, has been added and all occurrences
of 'long long' are replaced with it.
This is a patch that Bill Bummgarner did for 1.4 that hasn't made its
way into the distribution yet. This is important if you want to use
the ObjC module.
frozen packages. (I *think* this means that we can now have a
built-in module bar that's a submodule of a frozen package foo, by
registering the built-in module with a name "foo.bar" in the table of
builtin modules.)
an exception from errno, with a supplied filename (primarily used by
IOError and OSError). If class exceptions are used then the exception
is instantiated with a 3-tuple: (errno, strerror, filename). For
backwards compatibility reasons, if string exceptions are used,
filename is ignored.
PyErr_SetFromErrno(): Implement in terms of
PyErr_SetFromErrnoWithFilename().
OSError. The EnvironmentError serves primarily as the (common
implementation) base class for IOError and OSError. OSError is used
by posixmodule.c
Also added tuple definition of EnvironmentError when using string
based exceptions.
(1) If a sequence S is shorter than len(S) indicated, don't fail --
just use the shorter size. (I.e, len(S) is just a hint.)
(2) Implement the special case map(None, S) as list(S) -- it's faster.
must be enabled here, otherwise the errno we set on overflows is not
the errno that's being read by compile.c. Wonder how many other files
that do their own "#include config.h" need this too :-(
(Because of the structure of autoconf, it's not so simple to get this
into config.h...)
the filename contains at least a rudimentary pathname.
(The bad part is that we need to call getcwd() because only a prefix
of ".\\" is not enough -- we prefix the drive letter.)
and lists; if the size is negative, raise an exception. Also raise an
exception when an undefined type is found -- all this to increase the
chance that garbage input causes an exception instead of a core dump.
swapped arguments].
Also make sure that no use of a function pointer gotten from a
tp_as_sequence or tp_as_mapping structure is made without checking it
for NULL first.
the code here becomes much simpler. In particular: abs(), divmod(),
pow(), int(), long(), float(), len(), tuple(), list().
Also make sure that no use of a function pointer gotten from a
tp_as_sequence or tp_as_mapping structure is made without checking it
for NULL first.
A few other cosmetic things, such as properly reindenting slice().
old value in a temporary and XDECREF it only after the new value has
been set. This prevents the (unlikely) case where the destructor of
the member uses the containing object -- it would find it in an
undefined state.
because the path through the code would notice that sys.__path__ did
not exist and it would fall back to the default path (builtins +
sys.path) instead of failing). No longer.
Date: Thu, 14 Sep 1995 12:18:20 -0400
From: Alan Morse <alan@dvcorp.com>
To: python-list@cwi.nl
Subject: getargs bug in 1.2 and 1.3 BETA
We have found a bug in the part of the getargs code that we added
and submitted, and which was incorporated into 1.1.
The parsing of "O?" format specifiers is not handled correctly;
there is no "else" for the "if" and therefore it can never fail.
What's worse, the advancing of the varargs pointer is not
handled properly, so from then on it is out of sync, wreaking
all sorts of havoc. (If it had failed properly, then the out-of-sync
varargs would not have been an issue.)
Below is the context diff for the change.
Note that I have made a few stylistic changes beyond adding the
else case, namely:
1) Making the "O" case follow the convention established by the other
format specifiers of getting all their vararg arguments before
performing the test, rather than getting some before and some after
the test passes.
2) Making the logic of the tests parallel, so the "if" part indicates
that the format is accepted and the "else" part indicates that the
format has failed. They were inconsistent with each other and with the
other format specifiers.
-Alan Morse (amorse@dvcorp.com)
to the table of built-in modules. This should normally be called
*before* Py_Initialize(). When the malloc() or realloc() call fails,
-1 is returned and the existing table is unchanged.
After a similar function by Just van Rossum.
int PyImport_ExtendInittab(struct _inittab *newtab);
int PyImport_AppendInittab(char *name, void (*initfunc)());
Adapted from code submitted by Just van Rossum.
PySys_WriteStdout(format, ...)
PySys_WriteStderr(format, ...)
The first function writes to sys.stdout; the second to sys.stderr. When
there is a problem, they write to the real (C level) stdout or stderr;
no exceptions are raised (but a pending exception may be cleared when a
new exception is caught).
Both take a printf-style format string as their first argument followed
by a variable length argument list determined by the format string.
*** WARNING ***
The format should limit the total size of the formatted output string to
1000 bytes. In particular, this means that no unrestricted "%s" formats
should occur; these should be limited using "%.<N>s" where <N> is a
decimal number calculated so that <N> plus the maximum size of other
formatted text does not exceed 1000 bytes. Also watch out for "%f",
which can print hundreds of digits for very large numbers.
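One way to calculate <N>, sketched in Python, whose %-formatting accepts
the same "%.<N>s" precision. `bounded_format` is a hypothetical helper;
it assumes one "%s" per message, fixed text that is plain ASCII, and no
"%" characters in the fixed text.

```python
LIMIT = 1000  # total size allowed for the formatted output


def bounded_format(prefix, suffix):
    """Build a format whose single %s cannot push the output past LIMIT."""
    room = LIMIT - len(prefix) - len(suffix)
    if room < 0:
        raise ValueError("fixed text alone exceeds the limit")
    return prefix + "%." + str(room) + "s" + suffix
```

For example, `bounded_format("cannot import '", "'\n") % ("x" * 5000,)`
is cut off at the limit, while a short argument is printed in full.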
PyThreadState_GetDict() returns a dictionary that can be used to hold such
state; the caller should pick a unique key and store its state there. If
PyThreadState_GetDict() returns NULL, an exception has been raised (most
likely MemoryError) and the caller should pass on the exception. */
PyObject *
PyThreadState_GetDict()
Frozen packages are indicated by a negative size (the code string
is the __import__.py file). A frozen package module has its __path__
set to a string, the package name.
time can be in PyImport_ImportModuleEx(). Recursive calls from the
same thread are okay.
Potential problems:
- The lock should really be part of the interpreter state rather than
global, but that would require modifying more files, and I first want
to figure out whether this works at all.
- One could argue that the lock should be per module -- however that
would be complicated to implement. We would have to have a linked
list of locks per module name, *or* invent a new object type to
represent a lock, so we can store the locks in the module or in a
separate dictionary. Both seem unwarranted. The one situation where
this can cause problems is when loading a module takes a long time,
e.g. when the module's initialization code interacts with the user --
during that time, no other threads can run. I say, "too bad."
(modified) and use that.
Some differences in the cleanup algorithm:
- Clear __main__ before the other modules.
- Delete more sys variables: including ps1, ps2, exitfunc, argv, and
even path -- this will prevent new imports!
- Restore stdin, stdout, stderr from __stdin__, __stdout__,
__stderr__, effectively deleting hooks that the user might have
installed -- so their (the hooks') destructors will run.
This is an option for OS-es with case-insensitive but case-preserving
filesystems. It is currently supported for Win32 and MacOS. To
enable it, #define CHECK_IMPORT_CASE in your platform specific
config.h. It is enabled by default on those systems where it is
supported. On Win32, it can be disabled at runtime by setting the
environment variable PYTHONCASEOK (to any value).
When enabled, the feature checks that the case of the requested module
name matches that of the filename found in the filesystem, and raises
a NameError exception when they don't match.
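The check can be modelled on a directory listing held in memory. This is
a sketch, not the import.c code: `check_case`, its parameters and the
error text are assumptions, and `case_ok` plays the role of PYTHONCASEOK.

```python
def check_case(requested, directory_listing, suffix=".py", case_ok=False):
    """Return the matching filename, or None if there is no match at all."""
    wanted = requested + suffix
    for entry in directory_listing:
        # A case-insensitive filesystem would treat these as the same file.
        if entry.lower() == wanted.lower():
            if entry != wanted and not case_ok:
                raise NameError(
                    "Case mismatch for module name %s (filename %s)"
                    % (requested, entry)
                )
            return entry
    return None
```

Importing "string" when the directory holds "String.py" raises NameError
unless the check is switched off.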
pass it the true file. This is used to set __file__ properly, instead
of believing what the code object carries with it. (If the pointer
is NULL, the code object's co_filename is still used.)
- Add Py_FrozenFlag, intended to suppress error messages from
getpath.c in frozen binaries.
- Add Py_GetPythonHome() and Py_SetPythonHome(), intended to allow
embedders to force a different PYTHONHOME.
- Add new interface PyErr_PrintEx(flag); same as PyErr_Print() but
flag determines whether sys.last_* are set or not. PyErr_Print()
now simply calls PyErr_PrintEx(1).
(1) Explicitly clear __builtin__._ and sys.{last,exc}_* before
clearing anything else. These are common places where user values
hide and people complain when their destructors fail. Since the
modules containing them are deleted *last* of all, they would come too
late in the normal destruction order. Sigh.
(2) Add some debugging aid to cleanup (after a suggestion by Marc
Lemburg) -- print the names of the modules being cleaned, and (when
-vv is used) print the names of the variables being cleared.
now implement the following finalization strategy.
1. Whenever this code deletes a module, its directory is cleared
carefully, as follows:
- set all names to None that begin with exactly one underscore
- set all names to None that don't begin with two underscores
- clear the directory
2. Modules are deleted in the following order:
- modules with a reference count of 1, except __builtin__ or sys
- repeat until no more are found with a reference count of 1
- __main__ if it's still there
- all remaining modules except __builtin__ or sys
- sys
- __builtin__
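
Step 1 of the strategy can be modelled in Python on a plain dict. This
is only a sketch: clear_namespace() is a hypothetical name, and the
returned list exists purely so the order can be inspected.

```python
def clear_namespace(namespace):
    """Clear a dict in the phased order described above.

    Returns the names in the order they were set to None.
    """
    order = []
    # Phase 1: names that begin with exactly one underscore.
    for name in list(namespace):
        if name.startswith("_") and not name.startswith("__"):
            namespace[name] = None
            order.append(name)
    # Phase 2: every remaining name that does not begin with two underscores.
    for name in list(namespace):
        if not name.startswith("__") and name not in order:
            namespace[name] = None
            order.append(name)
    # Phase 3: drop whatever is left, including the __dunder__ names.
    namespace.clear()
    return order
```

Replacing values with None before the final clear() lets destructors
run while the other names in the namespace still exist.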
This is a bit of a hack: when the shared library is loaded, the module
name is "package.module", but the module calls Py_InitModule*() with just
"module" for the name. The shared library loader squirrels away the true
name of the module in _Py_PackageContext, and Py_InitModule*() will
substitute this (if the name actually matches).
1) The __builtins__ variable in the __main__ module is set to the
__builtin__ module instead of its __dict__.
2) Get rid of the SIGHUP and SIGTERM handlers. They can't be made to
work reliably when threads may be in use, they are Unix specific, and
Python programmers can now program this functionality in a safer way
using the signal module.
Setting interp->builtins to the __builtin__ module instead of to its
dictionary had the unfortunate side effect of always running in
restricted execution mode :-(
I will check in a different way of setting __main__.__builtins__ to
the __builtin__ module later.
Also, there was a typo -- a comment was unfinished, and as a result
some finalizations were not being executed.
In Bart Simpson style,
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
I Will Not Check In Untested Changes.
- The interp->builtins variable (and hence, __main__.__builtins__) is
once again initialized to the built-in *module* instead of its
dictionary.
- The finalization order is once again changed. Signals are finalized
relatively early, because (1) it DECREF's the signal handlers, and if
a signal handler happens to be a bound method, deleting it could cause
problems when there's no current thread around, and (2) we don't want
to risk executing signal handlers during finalization.
__init__.py (or __init__.pyc/.pyo, whichever applies) is considered a
package. All other subdirectories are left alone. Should make Konrad
Hinsen happy!
tstate swapping. Only the acquiring and releasing of the lock is
conditional (twice, under ``#ifdef WITH_THREAD'' and inside ``if
(interpreter_lock)'').
but annoying memory leak. This was introduced when PyExc_Exception
was added; the loop above populating the PyExc_StandardError exception
tuple started at index 1 in bltin_exc, but PyExc_Exception was added
at index 0, so PyExc_StandardError was getting inserted in itself!
How else can a tuple include itself?!
Change the loop to start at index 2.
This was a *fun* one! :-)
dummy entry to sys.modules, marking the absence of a submodule by the
same name.
Thus, if module foo.bar executes the statement "import time",
sys.modules['foo.time'] will be set to None, once the absence of a
module foo.time is confirmed (by looking for it in foo's path).
The next time when foo.bar (or any other submodule of foo) executes
"import time", no I/O is necessary to determine that there is no
module foo.time.
(Justification: It may seem strange to pollute sys.modules. However,
since we're doing the lookup anyway it's definitely the fastest
solution. This is the same convention that 'ni' uses and I haven't
heard any complaints.)
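
The miss-caching idea can be sketched with an ordinary dict standing in
for sys.modules. The names import_from_package() and search_package are
hypothetical, and the sketch assumes the global module is already
present in the dict:

```python
def import_from_package(package, name, modules, search_package):
    """Resolve `name` as seen from inside `package`.

    `modules` plays the role of sys.modules; `search_package` is the
    expensive lookup along the package's path and returns None on a miss.
    """
    qualified = package + "." + name
    if qualified in modules:
        cached = modules[qualified]
        if cached is not None:
            return cached
        # None marks a recorded miss: skip the package search entirely.
        return modules[name]
    found = search_package(package, name)
    if found is None:
        modules[qualified] = None  # remember that foo.<name> does not exist
        return modules[name]
    modules[qualified] = found
    return found
```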
right thing.
Still to do:
- Make reload() of a submodule work.
- Performance tweaks -- currently, a submodule that tries to import a
global module *always* searches the package directory first, even if
the global module was already imported. Not sure how to solve this
one; probably need to record misses per package.
- Documentation!
This doesn't yet support "import a.b.c" or "from a.b.c import x", but
it does recognize directories. When importing a directory, it
initializes __path__ to a list containing the directory name, and
loads the __init__ module if found.
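
The same behavior can still be observed from Python today. The sketch
below builds a throwaway directory (the name pkg_sketch_demo is made up
for illustration) and uses the modern importlib API, which did not exist
when this change was written:

```python
import importlib
import os
import sys
import tempfile

# Build a directory containing only an __init__ module.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "pkg_sketch_demo")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("loaded = True\n")

sys.path.insert(0, root)
try:
    importlib.invalidate_caches()
    demo = importlib.import_module("pkg_sketch_demo")
finally:
    sys.path.remove(root)

# __path__ holds the directory name, and __init__ has been executed.
package_path = list(demo.__path__)
```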
The (internal) find_module() and load_module() functions are
restructured so that they both also handle built-in and frozen modules
and Mac resources (and directories of course). The imp module's
find_module() and (new) load_module() also have this functionality.
Moreover, imp unconditionally defines constants for all module types,
and has two more new functions: find_module_in_package() and
find_module_in_directory().
There's also a new API function, PyImport_ImportModuleEx(), which
takes all four __import__ arguments (name, globals, locals, fromlist).
The last three may be NULL. This is currently the same as
PyImport_ImportModule() but in the future it will be able to do
relative dotted-path imports.
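
At the Python level the same four arguments are visible on the
__import__ built-in. A small sketch using the behavior of current
Python, which postdates this change, with None standing in for the NULL
globals and locals:

```python
import os.path

# With an empty fromlist, a dotted import returns the top-level package.
top = __import__("os.path", None, None, [])

# With a non-empty fromlist, it returns the submodule itself.
sub = __import__("os.path", None, None, ["join"])
```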
Other changes:
- bltinmodule.c: in __import__, call PyImport_ImportModuleEx().
- ceval.c: always pass the fromlist to __import__, even if it is a C
function, so PyImport_ImportModuleEx() is useful.
- getmtime.c: the function has a second argument, the FILE*, on which
it applies fstat(). According to Sjoerd this is much faster. The
first (pathname) argument is ignored, but remains for backward
compatibility (so the Mac version still works without changes).
By cleverly combining the new imp functionality, the full support for
dotted names in Python (mini.py, not checked in) is now about 7K,
lavishly commented (vs. 14K for ni plus 11K for ihooks, also lavishly
commented).
Good night!
- Changed semantics for initialized flag (again); forget the ref
counting, forget the fatal errors -- redundant calls to
Py_Initialize() or Py_Finalize() are simply ignored.
- Automatically import site.py on initialization, unless a flag is set
not to do this by main().
Added PyErr_MemoryErrorInst to hold the pre-instantiated instance when
using class based exceptions.
Simplified the creation of all built-in exceptions, both class based
and string based. Actually, for class based exceptions, the string
ones are still created just in case there's a problem creating the
class based ones (so you still get *some* exception handling!). Now
the init and fini functions run through a list of structure elements,
creating the strings (and optionally classes) for every entry.
initerrors(): the new base class exceptions StandardError,
LookupError, and NumberError are initialized when using string
exceptions, to tuples containing the list of derived string
exceptions. This GvR trick enables forward compatibility! One bit of
nastiness is that the C code has to know the inheritance tree embodied
in exceptions.py.
Added the two phase init and fini functions.
the -X command line option.
Py_Initialize(): Handle the two phase initialization of the built-in
module.
Py_Finalize(): Handle the two phase finalization of the built-in
module.
parse_syntax_error(): New function which parses syntax errors that
PyErr_Print() will catch. This correctly parses such errors
regardless of whether PyExc_SyntaxError is an old-style string
exception or new-fangled class exception.
PyErr_Print(): Many changes:
1. Normalize the exception.
2. Handle SystemExit exceptions which might be class based. Digs
the exit code out of the "code" attribute. String based
SystemExit is handled the same as before.
3. Handle SyntaxError exceptions which might be class based. Digs
the various information bits out of the instance's attributes
(see parse_syntax_error() for details). String based
SyntaxError still works too.
4. Don't write the `:' after the exception if the exception is
class based and has an empty string str() value.
(PyExc_MemoryErrorInst) raise this instead of PyExc_MemoryError. This
only happens when exception classes are enabled (e.g. when Python is
started with -X).
former rather than the latter, since PyErr_NormalizeException takes
PyObject** and I didn't want to change the interface for set_exc_info
(but I did want the changes propagated to eval_code2!).
UNPACK_LIST byte codes and added a third code path that allows
generalized sequence unpacking. Now both syntaxes:
a, b, c = seq
[a, b, c] = seq
can be used to unpack any sequence with the exact right number of
items.
unpack_sequence(): out-lined implementation of generalized sequence
unpacking. tuple and list unpacking are still inlined.
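
A short illustration in current Python, which goes further than this
change by accepting any iterable and which reports a length mismatch
with ValueError. The function unpack_three() is a hypothetical helper:

```python
# Tuple-style targets with a non-tuple source.
a, b, c = "xyz"

# List-style targets with a non-list source.
[d, e, f] = range(3)


def unpack_three(seq):
    """Return the three items of `seq`, or None if the length is wrong."""
    try:
        x, y, z = seq
    except ValueError:
        return None
    return (x, y, z)
```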
PyErr_GivenExceptionMatches().
set_exc_info(): make sure to normalize exceptions.
do_raise(): Use PyErr_NormalizeException() if type is a class.
loop_subscript(): Use PyErr_ExceptionMatches() instead of raw pointer
compare for PyExc_IndexError.
- int PyErr_GivenExceptionMatches(obj1, obj2)
Returns 1 if obj1 and obj2 are the same object, or if obj1 is an
instance of type obj2, or of a class derived from obj2
- int PyErr_ExceptionMatches(obj)
Higher level wrapper around PyErr_GivenExceptionMatches() which uses
PyErr_Occurred() as obj1. This will be the more commonly called
function.
- void PyErr_NormalizeException(typeptr, valptr, tbptr)
Normalizes exceptions, and places the normalized values in the
arguments. If type is not a class, this does nothing. If type is a
class, then it makes sure that value is an instance of the class by:
1. if instance is of the type, or a class derived from type, it does
nothing.
2. otherwise it instantiates the class, using the value as an
argument. If value is None, it uses an empty arg tuple, and if
the value is a tuple, it uses just that.
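
The rules above can be restated as a Python sketch. These are
hypothetical pure-Python models, not the C functions themselves: the
traceback argument is left out, and the normalized pair is returned
instead of being written back through pointers.

```python
def given_exception_matches(given, expected):
    """Model of the matching rule: same object, or instance of `expected`."""
    if given is expected:
        return True
    return isinstance(expected, type) and isinstance(given, expected)


def normalize_exception(exc_type, value):
    """Model of normalization; returns the (type, value) pair."""
    if not isinstance(exc_type, type):
        return exc_type, value            # not a class: nothing to do
    if isinstance(value, exc_type):
        return exc_type, value            # already a suitable instance
    if value is None:
        return exc_type, exc_type()       # empty argument tuple
    if isinstance(value, tuple):
        return exc_type, exc_type(*value) # the tuple is the argument list
    return exc_type, exc_type(value)      # single argument
```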
classes as their second arguments. The former takes a class as the
first argument and returns true iff first is second, or is a subclass
of second.
The latter takes any object as the first argument and returns true iff
first is an instance of the second, or any subclass of second.
Also, replace all occurrences of pointer compares against
PyExc_IndexError with PyErr_ExceptionMatches() calls.
ExitThread(). As discussed in c.l.p, this takes care of
initialization and finalization of thread-local storage allocated by
the C runtime system. Not sure whether non-MS compilers grok this
though (but who cares :-).
scheme based on object's types, have a simple two-phase scheme based
on object's *names*:
/* To make the execution order of destructors for global
objects a bit more predictable, we first zap all objects
whose name starts with a single underscore, before we clear
the entire dictionary. We zap them by replacing them with
None, rather than deleting them from the dictionary, to
avoid rehashing the dictionary (to some extent). */
Py_InitModule(), which is a macro wrapper around it).
The return value is now a NULL pointer if the initialization failed.
This may make old modules fail with a SEGFAULT, since they don't
expect this kind of failure. That's OK, since (a) it "never" happens,
and (b) they would fail with a fatal error otherwise, anyway.
Tons of extension modules should now check the return value of
Py_InitModule*() -- that's on my TODO list.
importdl.c: the MAXSUFFIXSIZE macro is now defined in importdl.h, and
the modules dictionary is now passed using PyImport_GetModuleDict().
Also undefine USE_SHLIB for AIX -- in AIX 4.2 and up, dlfcn.h exists
but we don't want to use it.
- Got rid of inspection of some environment variables.
- Got rid of Py_GetProgramName() and related logic.
- Print the version header *after* successful initialization.
for more!).
- The global flags that can be set from environment variables are now
set in Py_Initialize (except the silly Py_SuppressPrint, which no
longer exists). This saves duplicate code in frozenmain.c and main.c.
- Py_GetProgramName() is now here; added Py_SetProgramName(). An
embedding program should no longer provide Py_GetProgramName(),
instead it should call Py_SetProgramName() *before* calling
Py_Initialize().
PyThreadState pointer instead of a (frame) PyObject pointer. This
makes much more sense. It is backward incompatible, but that's no
problem, because (a) the heaviest users are the Py_{BEGIN,END}_
ALLOW_THREADS macros here, which have been fixed too; (b) there are
very few direct users; (c) those who do use it will probably
appreciate the change.
Also, added new functions PyEval_AcquireThread() and
PyEval_ReleaseThread() which allow the threads created by the thread
module as well as threads created by others (!) to set/reset the current
thread, and at the same time acquire/release the interpreter lock.
Much saner.
int+int, int-int, int <compareop> int, and list[int].
(Unfortunately, int*int is way too much code to inline.)
Also corrected a NULL that should have been a zero.
replaces its own entry in sys.modules, reference count errors ensue;
even if there is no reference count problem, it would be preferable
for the import to yield the new thing in sys.modules anyway (if only
because that's what later imports will yield). This opens the road to
an official hack to implement a __getattr__ like feature for modules:
stick an instance in sys.modules[__name__].
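
A minimal sketch of that hack, run against current Python. The class
AttributeFaker and the module name settings_sketch are made up for
illustration; a real module would assign to sys.modules[__name__] from
its own body instead.

```python
import sys


class AttributeFaker:
    """Stand-in object that computes attributes on demand."""

    def __getattr__(self, name):
        # Leave __dunder__ lookups alone so the import machinery's
        # probes fail normally.
        if name.startswith("__"):
            raise AttributeError(name)
        return "computed:" + name


# The import statement yields whatever object sits in sys.modules.
sys.modules["settings_sketch"] = AttributeFaker()
import settings_sketch

color = settings_sketch.color
```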
have a unique name, otherwise they get squished by locals2fast (or
fast2locals, I dunno) when the debugger is invoked before they have
been transferred to real locals.
get/set/del item). This removes a pile of duplication. There's no
abstract operator for 'not' but I removed the function call for it
anyway -- it's a little faster in-line.
dirname in sys.path. This means that you can create a symbolic link
foo in /usr/local/bin pointing to /usr/yourname/src/foo/foo.py, and
then invoking foo will insert /usr/yourname/src/foo in sys.path, not
/usr/local/bin. This makes it easier to have multifile programs
(before, the program would have to do an os.readlink(sys.argv[0])
itself and insert the resulting directory in sys.path -- Grail does
this).
Note that the expansion is only used for sys.path; sys.argv[0] is
still the original, unadorned filename (/usr/local/bin/foo in the
example).
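
In Python terms the computation amounts to something like the sketch
below. The helper script_directory() is hypothetical, and
os.path.realpath() resolves every symbolic link in the path, which may
be more than the interpreter does:

```python
import os


def script_directory(argv0):
    """Directory to insert in sys.path for the script named by `argv0`.

    Symbolic links are expanded first, so a link in /usr/local/bin
    yields the directory of the file it points to.
    """
    return os.path.dirname(os.path.realpath(argv0))
```

The unexpanded os.path.dirname(argv0) corresponds to the old behavior.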
2. Fix two bugs in complex():
- Memory leak when using complex(classinstance) -- r was never
DECREF'ed.
- Conversion of the second argument, if not complex, was done using
the type vector of the 1st.
recognized by the code generator and code generation for the test and
the subsequent suite is suppressed.
One must write *exactly* ``if __debug__:'' or ``elif __debug__:'' --
no parentheses or operators must be present, or the optimization is
not carried through. Whitespace doesn't matter. Other uses of
__debug__ will find __debug__ defined as 0 or 1 in the __builtin__
module.
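
For illustration, a function written in the exact form the code
generator recognizes. The name checked_divide() is hypothetical, and in
current Python __debug__ is True or False rather than 0 or 1:

```python
def checked_divide(a, b):
    """Divide a by b, with an extra check that optimized runs skip."""
    if __debug__:
        # No code is generated for this suite when Python runs with -O.
        if b == 0:
            raise ValueError("b must be nonzero")
    return a / b
```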
Py_FdIsInteractive(). The flag is supposed to be set by the -i
command line option. The function is supposed to be called instead of
isatty(). This is used for Lee Busby's wish #1, to have an option
that pretends stdin is interactive even when it really isn't.
by the frameobject dealloc when it is time for the locals to go. When
there's still a traceback object referencing this stack frame, we
don't want the local variables to disappear yet.
(Hmm... Shouldn't they be copied to the f_locals dictionary?)
- Use co->... instead of f->f_code->... (saves an extra lookup of what
we already have in a local variable).
- Remove test for nlocals > 0 before setting fastlocals to
f->f_localsplus; 0 is a rare case and the assignment is safe even
then.
called with keyword arguments -- the keyword and value were leaked.
This affected, for instance, instances with a __call__() method.
Bug reported and fix supplied by Jim Fulton.
i.e., counting opcode frequencies, or (with DXPAIRS defined) opcode
pair frequencies. Define DYNAMIC_EXECUTION_PROFILE on the command
line (for this file and for sysmodule.c) to enable.
table which is incorporated in the code object. This way, the runtime
overhead to keep track of line numbers is only incurred when an
exception has to be reported.
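
The table stored in a code object can be inspected from Python with the
dis module. This sketch uses current Python, whose table encoding
differs from the one introduced here; sample() is a made-up function:

```python
import dis


def sample(x):
    y = x + 1
    return y * 2


# (bytecode offset, source line) pairs decoded from the code object's
# line number table. Newer versions may report None for some entries.
line_starts = [
    (offset, line)
    for offset, line in dis.findlinestarts(sample.__code__)
    if line is not None
]
```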
This is safe now that both intrcheck() and signalmodule.c schedule a
sigcheck() call via Py_AddPendingCall().
This gives another 7% speedup (never run such a test twice ;-).
to PyCode_New() argument list. Move MAXBLOCKS constant to compile.h.
Added accurate calculation of the actual stack size needed by the
generated code.
Also commented out all fprintf statements (except for a new one to
diagnose stack underflow, and one in #ifdef'ed out code), and added
some new TO DO suggestions (now that the stacksize is taken off the TO
DO list).
The raise logic has one additional feature: if you raise <class>,
<value> where <value> is not an instance, it will construct an
instance using <value> as argument. If <value> is None, <class> is
instantiated without arguments. If <value> is a tuple, it is used as
the argument list.
This feature is intended to make it easier to upgrade code from using
string exceptions to using class exceptions; without this feature,
you'd have to change every raise statement from ``raise X'' to ``raise
X()'' and from ``raise X, y'' to ``raise X(y)''. The latter is still
the recommended form (because it has no ambiguities about the number
of arguments), but this change makes the transition less painful.
be Ellipsis!).
Bumped the API version because a linker-visible symbol is affected.
Old C code will still compile -- there's a b/w compat macro.
Similarly, old Python code will still run: builtin exports both
Ellipses and Ellipsis.