Improvements:
- does no longer need any extra memory
- has no relationship to tstate
- works in debug mode
- can easily be modified for free threading (hi Greg:)
Side effects:
Trashcan does change the order of object destruction.
Prevending that would be quite an immense effort, as
my attempts have shown. This version works always
the same, with debug mode or not. The slightly
changed destruction order should therefore be no problem.
Algorithm:
While the old idea of delaying the destruction of some
obejcts at a certain recursion level was kept, we now
no longer aloocate an object to hold these objects.
The delayed objects are instead chained together
via their ob_type field. The type is encoded via
ob_refcnt. When it comes to the destruction of the
chain of waiting objects, the topmost object is popped
off the chain and revived with type and refcount 1,
then it gets a normal Py_DECREF.
I am confident that this solution is near optimum
for minimizing side effects and code bloat.
Changed PyUnicode_Splitlines() maxsplit argument to keepends.
The maxsplit functionality was replaced by the keepends
functionality which allows keeping the line end markers together
with the string.
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent). Checkin messages:
New Unicode support for int(), float(), complex() and long().
- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
Unicode to a decimal char* string (used in the above new
APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above
Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
are masked, all other errors such as ValueError during
Unicode coercion are passed through (note that PyUnicode_Compare
does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
masked and 0 returned; all other errors are passed through
Better testing support for the standard codecs.
Misc minor enhancements, such as an alias dbcs for the mbcs codec.
Changes:
- PyLong_FromString() now applies the same error checks as
does PyInt_FromString(): trailing garbage is reported
as error and not longer silently ignored. The only characters
which may be trailing the digits are 'L' and 'l' -- these
are still silently ignored.
- string.ato?() now directly interface to int(), long() and
float(). The error strings are now a little different, but
the type still remains the same. These functions are now
ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
in the input string; PyNumber_Long() already did this (and
still does)
Followed by:
Looks like I've gone a step too far there... (and test_contains.py
seem to have a bug too).
I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
executive summary:
Instead of typing 'apply(f, args, kwargs)' you can type 'f(*arg, **kwargs)'.
Some file-by-file details follow.
Grammar/Grammar:
simplify varargslist, replacing '*' '*' with '**'
add * & ** options to arglist
Include/opcode.h & Lib/dis.py:
define three new opcodes
CALL_FUNCTION_VAR
CALL_FUNCTION_KW
CALL_FUNCTION_VAR_KW
Python/ceval.c:
extend TypeError "keyword parameter redefined" message to include
the name of the offending keyword
reindent CALL_FUNCTION using four spaces
add handling of sequences and dictionaries using extend calls
fix function import_from to use PyErr_Format
The attached patch set includes a workaround to get Python with
Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause
is a bug in the BSDI wchar.h header file) and Python interfaces
for the MBCS codec donated by Mark Hammond.
Also included are some minor corrections w/r to the docs of
the new "es" and "es#" parser markers (use PyMem_Free() instead
of free(); thanks to Mark Hammond for finding these).
The unicodedata tests are now in a separate file
(test_unicodedata.py) to avoid problems if the module cannot
be found.
/* More standard operations (at end for binary compatibility) */
should now be:
/* More standard operations (here for binary compatibility) */
since they're no longer at the end!
Attached you find an update of the Unicode implementation.
The patch is against the current CVS version. I would appreciate
if someone with CVS checkin permissions could check the changes
in.
The patch contains all bugs and patches sent this week and also
fixes a leak in the codecs code and a bug in the free list code
for Unicode objects (which only shows up when compiling Python
with Py_DEBUG; thanks to MarkH for spotting this one).
Added wrapping macros to dictobject.c, listobject.c, tupleobject.c,
frameobject.c, traceback.c that safely prevends core dumps
on stack overflow. Macros and functions in object.c, object.h.
The method is an "elevator destructor" that turns cascading
deletes into tail recursive behavior when some limit is hit.
a new proc type (objobjproc), a new slot sq_contains to
PySequenceMethods, and a new flag Py_TPFLAGS_HAVE_SEQUENCE_IN to
Py_TPFLAGS_DEFAULT. More to follow.
Introduce a new builtin exception, UnboundLocalError, raised when ceval.c
tries to retrieve or delete a local name that isn't bound to a value.
Currently raises NameError, which makes this behavior a FAQ since the same
error is raised for "missing" global names too: when the user has a global
of the same name as the unbound local, NameError makes no sense to them.
Even in the absence of shadowing, knowing whether a bogus name is local or
global is a real aid to quick understanding.
Example:
D:\src\PCbuild>type local.py
x = 42
def f():
print x
x = 13
return x
f()
D:\src\PCbuild>python local.py
Traceback (innermost last):
File "local.py", line 8, in ?
f()
File "local.py", line 4, in f
print x
UnboundLocalError: x
D:\src\PCbuild>
Note that UnboundLocalError is a subclass of NameError, for compatibility
with existing class-exception code that may be trying to catch this as a
NameError. Unfortunately, I see no way to make this wholly compatible
with -X (see comments in bltinmodule.c): under -X, [UnboundLocalError
is an alias for NameError --GvR].
[The ceval.c patch differs slightly from the second version that Tim
submitted; I decided not to raise UnboundLocalError for DELETE_NAME,
only for DELETE_LOCAL. DELETE_NAME is only generated at the module
level, and since at that level a NameError is raised for referencing
an undefined name, it should also be raised for deleting one.]
indicate to those that are using the CVS access that they are using a
newer-than-1.2.5 version, without committing to a particular version
number or patch level.
PycStringIO_IMPORT. While arguably the name used in the documentation
is more consistent, I think it's probably safer not to change the
macro definition and instead fix the doco.
Add a new member to the PyBufferProcs struct, bf_getcharbuffer. For
backward compatibility, this member should only be used (this includes
testing for NULL!) when the flag Py_TPFLAGS_HAVE_GETCHARBUFFER is set
in the type structure, below. Note that if its flag is not set, we
may be looking at an extension module compiled for 1.5.1, which will
have garbage at the bf_getcharbuffer member (because the struct wasn't
as long then). If the flag is one, the pointer may still be NULL.
The function found at this member is used in a similar manner as
bf_getreadbuffer, but it is known to point to 8-bit character data.
(See discussion in getargs.c checked in later.)
As a general feature for extending the type structure and the various
structures that (may) hang off it in a backwards compatible way, we
rename the tp_xxx4 "spare" slot to tp_flags. In 1.5.1 and before,
this slot was always zero. In 1.5.1, it may contain various flags
indicating extra fields that weren't present in 1.5.1. The only flag
defined so far is for the bf_getcharbuffer member of the PyBufferProcs
struct.
Note that the new spares (tp_xxx5 - tp_xxx8), once they become used,
should also be protected by a flag (or flags) in tp_flags.
The MS compiler doesn't call it 'long long', it uses __int64,
so a new #define, LONG_LONG, has been added and all occurrences
of 'long long' are replaced with it.
clear_carefully() used to do in import.c. Differences: leave only
__builtins__ alone in the 2nd pass; and don't clear the dictionary (on
the theory that as long as there are references left to the
dictionary, those might be destructors that might expect __builtins__
to be alive when they run; and __builtins__ can't normally be part of
a cycle).
embedders to force a different PYTHONHOME.
- Add new interface PyErr_PrintEx(flag); same as PyErr_Print() but
flag determines whether sys.last_* are set or not. PyErr_Print()
now simply calls PyErr_PrintEx(1).
signal handlers in a fork()ed child process when Python is compiled with
thread support. The bug was reported by Scott <scott@chronis.icgroup.com>.
What happens is that after a fork(), the variables used by the signal
module to determine whether this is the main thread or not are bogus,
and it decides that no thread is the main thread, so no signals will
be delivered.
The solution is the addition of PyOS_AfterFork(), which fixes the signal
module's variables. A dummy version of the function is present in the
intrcheck.c source file which is linked when the signal module is not
used.
is like PyImport_ImporModule(name) but receives the globals and locals
dict and the fromlist arguments as well. (The name is a char*; the
others are PyObject*s).
PyExc_NumberError, and PyExc_LookupError. Also added extern for
pre-instantiated exception instance PyExc_MemoryErrorInst.
Removed extern of obsolete exception PyExc_AccessError.
- int PyErr_GivenExceptionMatches(obj1, obj2)
Returns 1 if obj1 and obj2 are the same object, or if obj1 is an
instance of type obj2, or of a class derived from obj2
- int PyErr_ExceptionMatches(obj)
Higher level wrapper around PyErr_GivenExceptionMatches() which uses
PyErr_Occurred() as obj1. This will be the more commonly called
function.
- void PyErr_NormalizeException(typeptr, valptr, tbptr)
Normalizes exceptions, and places the normalized values in the
arguments. If type is not a class, this does nothing. If type is a
class, then it makes sure that value is an instance of the class by:
1. if instance is of the type, or a class derived from type, it does
nothing.
2. otherwise it instantiates the class, using the value as an
argument. If value is None, it uses an empty arg tuple, and if
the value is a tuple, it uses just that.
Py_DECREF, to reduce the warnings when compiling with reference count
debugging on. (There are still warnings for each call to
_Py_NewReference -- too bad.)
Introduce truly separate (sub)interpreter objects. For now, these
must be used by separate threads, created from C. See Demo/pysvr for
an example of how to use this. This also rationalizes Python's
initialization and finalization behavior:
Py_Initialize() -- initialize the whole interpreter
Py_Finalize() -- finalize the whole interpreter
tstate = Py_NewInterpreter() -- create a new (sub)interpreter
Py_EndInterpreter(tstate) -- delete a new (sub)interpreter
There are also new interfaces relating to threads and the interpreter
lock, which can be used to create new threads, and sometimes have to
be used to manipulate the interpreter lock when creating or deleting
sub-interpreters. These are only defined when WITH_THREAD is defined:
PyEval_AcquireLock() -- acquire the interpreter lock
PyEval_ReleaseLock() -- release the interpreter lock
PyEval_AcquireThread(tstate) -- acquire the lock and make the thread current
PyEval_ReleaseThread(tstate) -- release the lock and make NULL current
Other administrative changes:
- The header file bltinmodule.h is deleted.
- The init functions for Import, Sys and Builtin are now internal and
declared in pythonrun.h.
- Py_Setup() and Py_Cleanup() are no longer declared.
- The interpreter state and thread state structures are now linked
together in a chain (the chain of interpreters is a static variable
in pythonrun.c).
- Some members of the interpreter and thread structures have new,
shorter, more consistent, names.
- Added declarations for _PyImport_{Find,Fixup}Extension() to import.h.
PyThreadState pointer instead of a (frame) PyObject pointer. This
makes much more sense. It is backward incompatible, but that's no
problem, because (a) the heaviest users are the Py_{BEGIN,END}_
ALLOW_THREADS macros here, which have been fixed too; (b) there are
very few direct users; (c) those who use it are there will probably
appreciate the change.
Also, added new functions PyEval_AcquireThread() and
PyEval_ReleaseThread() which allows the threads created by the thread
module as well threads created by others (!) to set/reset the current
thread, and at the same time acquire/release the interpreter lock.
Much saner.
last variable to which a floating point expression is assigned. The
macro passes its address to a dummy function so that the optimizer
can't delay calculating its value until after the macro.
hash value. Interning strings (which requires hash caching) tries to
ensure that only one string object with a given value exists, so
equality tests are one pointer comparison. Together, these can speed
the interpreter up by as much as 20%. Each costs the size of a long
or pointer per string object. In addition, interned strings live
until the end of times. If you are concerned about memory footprint,
simply comment the #define out here (and rebuild everything!).
be Ellipsis!).
Bumped the API version because a linker-visible symbol is affected.
Old C code will still compile -- there's a b/w compat macro.
Similarly, old Python code will still run, builtin exports both
Ellipses and Ellipsis.
Under Windows, add MS_DLL_ID and MS_DLL_VERSION_ID for Mark H.
Independent change: if Py_TRACE_REFS is defined, rename Py_InitModule4
so so linking with incompatible modules will create a link time error.
[Backing out of previous changes (also for modsupport.c) to test
the latter at runtime.]
use the new names exclusively, and the linker will see the new names.
Files that import "Python.h" also only see the new names. Files that
import "allobjects.h" will continue to be able to use the old names,
due to the inclusion (in allobjects.h) of "rename2.h".
object.h: made sizes and refcnts signed ints.
stringobject.h: make getstrsize() signed int.
methodobject.h: add METH_VARARGS and METH_FREENAME flag bit definitions.
and __setattr__ support to override getattr(x, name) and
setattr(x, name, value) for class instances. This uses a special
hack whereby the class is supposed to be static: the __getattr__
and __setattr__ methods are looked up only once and saved in the
instance structure for speed
added config.h, config.h.in
moved parser.h to ../Parser, patchlevel.h to ../Python
allobjects.h: include config.h
some: remove all refs to THINK_C_3_0
mymalloc.h: di HAVE_STDLIB differently, use size_t instead of MALLARG
* dosmodule.c: MSDOS specific stuff from posixmodule.c.
* posixmodule.c: removed all MSDOS specific stuff.
* tokenizer.h, parsetok.h: in prototypes, don't mix named and unnamed
parameters (MSC doesn't like this).
* funcobject.c (func_repr): don't call getstringvalue(None) for anonymous
functions.
* bltinmodule.c: removed lambda (which is now a built-in function);
removed implied lambda for string arg to filter/map/reduce.
* Grammar, graminit.[ch], compile.[ch]: replaced lambda as built-in
function by lambda as grammar entity: instead of "lambda('x: x+1')" you
write "lambda x: x+1".
* Xtmodule.c (checkargdict): return 0, not NULL, for error.
* posixmodule.c: don't prototype getcwd() -- it's not portable...
* mappingobject.c: double-check validity of last_name_char in
dict{lookup,insert,remove}.
* arraymodule.c: need memmove only for non-STDC Suns.
* Makefile: comment out HTML_LIBS and XT_USE by default
* pythonmain.c: don't prototype getopt() -- it's not standardized
* socketmodule.c: cast flags arg to {get,set}sockopt() and addrbuf arg to
recvfrom() to (ANY*).
* pythonrun.c (initsigs): fix prototype, make it static
* intobject.c (LONG_BIT): only #define it if not already defined
* classobject.[ch]: remove all references to unused instance_convert()
* mappingobject.c (getmappingsize): Don't return NULL in int function.
* object.[ch], bltinmodule.c, fileobject.c: changed str() to call
strobject() which calls an object's __str__ method if it has one.
strobject() is also called by writeobject() when PRINT_RAW is passed.
* ceval.c: rationalize code for PRINT_ITEM (no change in function!)
* funcobject.c, codeobject.c: added compare and hash functionality.
Functions with identical code objects and the same global dictionary are
equal. Code objects are equal when their code, constants list and names
list are identical (i.e. the filename and code name don't count).
(hash doesn't work yet since the constants are in a list and lists can't
be hashed -- suppose this should really be done with a tuple now we have
resizetuple!)
* PROTO.h, mymalloc.h: added #ifdefs for TURBOC and GNUC.
* allobjects.h: added #include "rangeobject.h"
* Grammar: added lambda_input; relaxed syntax for exec.
* bltinmodule.c: added bagof, map, reduce, lambda, xrange.
* tupleobject.[ch]: added resizetuple().
* rangeobject.[ch]: new object type to speed up range operations (not
convinced this is needed!!!)
* Grammar: add exec statement; allow testlist in expr statement.
* ceval.c, compile.c, opcode.h: support exec statement;
avoid optimizing locals when it is used
* fileobject.{c,h}: add getfilename() internal function.
image objects, and lots of new methods.
* Added counting of allocations and deallocations of builtin types if
COUNT_ALLOCS is defined. Had to move calls to NEWREF down in some
files.
* Bug fix in sorting lists.
* Makefile: change location of FORMS library.
* posixmodule.c: turn #if 0 into #ifdef MSDOS (stuff in unistd.h or not)
* Almost all .h files: added CPP magic to avoid duplicate inclusions and
to support inclusion from C++.
objects of its derived classes; allow anything that has an attribute
named "__privileged__" access to anything.
* object.[ch]: added hasattr() -- test whether getattr() will succeed.
Added $(SYSDEF) to its build rule in Makefile.
* cgensupport.[ch], modsupport.[ch]: removed some old stuff. Also
changed files that still used it... And made several things static
that weren't but should have been... And other minor cleanups...
* listobject.[ch]: add external interfaces {set,get}listslice
* socketmodule.c: fix bugs in new send() argument parsing.
* sunaudiodevmodule.c: added flush() and close().
function found as instance data.
* socketmodule.c: added 'flags' argument sendto/recvfrom, rewrite
argument parsing in send/recv.
* More changes related to access (terminology change: owner instead of
class; allow any object as owner; local/global variables are owned
by their dictionary, only class/instance data is owned by the class;
"from...import *" now only imports objects with public access; etc.)
* Added "access *: ...", made access work for class methods.
* Introduced subclass check: make sure that when calling
ClassName.methodname(instance, ...), the instance is an instance of
ClassName or of a subclass thereof (this might break some old code!)
yet). The class is now passed to eval_code and stored in the current
frame. It is also stored in instance method objects. An "unbound"
instance method is now returned when a function is retrieved through
"classname.funcname", which when called passes the class to eval_code.
(1) dictionaries/mappings now have attributes values() and items() as
well as keys(); at the C level, use the new function mappinggetnext()
to iterate over a dictionary.
(2) "class C(): ..." is now illegal; you must write "class C: ...".
(3) Class objects now know their own name (finally!); and minor
improvements to the way how classes, functions and methods are
represented as strings.
(4) Added an "access" statement and semantics. (This is still
experimental -- as long as you don't use the keyword 'access' nothing
should be changed.)
f_fastlocals in a traceback object (this is a core dump hazard
if there are <nil> entries), but instead eval_code() merges the fast
locals back into the locals dictionary if it looks like the local
variables will be retained. Also, the merge routines save
exceptions since this is sometimes needed (alas!).
* Added id() to bltinmodule.c, which returns an object's address
(identity). Useful to walk arbitrary data structures containing
cycles.
* Added compile() to bltinmodule.c and compile_string() to
pythonrun.[ch]: support to exec/eval arbitrary code objects. The
code that defaults globals and locals is moved from run_node in
pythonrun.c (which is now identical to eval_node) to eval_code in
ceval.c. [XXX For elegance a clean-up session is necessary.]
lookup (opcode.h, ceval.[ch], compile.c, frameobject.[ch],
pythonrun.c, import.c). The .pyc MAGIC number is changed again.
Added get_menu_text to flmodule.
* Stubs for faster implementation of local variables (not yet finished)
* Added function name to code object. Print it for code and function
objects. THIS MAKES THE .PYC FILE FORMAT INCOMPATIBLE (the version
number has changed accordingly)
* Print address of self for built-in methods
* New internal functions getattro and setattro (getattr/setattr with
string object arg)
* Replaced "dictobject" with more powerful "mappingobject"
* New per-type functio tp_hash to implement arbitrary object hashing,
and hashobject() to interface to it
* Added built-in functions hash(v) and hasattr(v, 'name')
* classobject: made some functions static that accidentally weren't;
added __hash__ special instance method to implement hash()
* Added proper comparison for built-in methods and functions
* Fixcprt.py: added [-y file] option, do only files younger than file.
* modsupport.[ch]: added vmkvalue().
* intobject.c: use mkvalue().
* stringobject.c: added "formatstring"; renamed string* to string_*;
ceval.c: call formatstring for string % value.
* longobject.c: close memory leak in divmod.
* parsetok.c: set result node to NULL when returning an error.
* various modules: added 1993 to copyright.
* thread.c: added copyright notice.
* ceval.c: minor change to error message for "+"
* stdwinmodule.c: check for error from wfetchcolor
* config.c: MS-DOS fixes (define PYTHONPATH, use DELIM, use osdefs.h)
* Add declaration of inittab to import.h
* sysmodule.c: added sys.builtin_module_names
* xxmodule.c, xxobject.c: fix minor errors
* posixmodule.c: move extern function declarations to top
* listobject.c: cmp() arguments must be void* if __STDC__
* Makefile, allobjects.h, panelmodule.c, modsupport.c: get rid of
strdup() -- it is a portability risk
* Makefile: enclosed ranlib command in parentheses for Sequent Make
which aborts if the command is not found even if '-' is present
* timemodule.c: time() returns a floating point number, in microsecond
precision if BSD_TIME is defined.
return NULL for malloc(0) or realloc(p, 0). (This should be done
differently than wasting one byte, but alas...)
* Moved "add'l libraries" option in Makefile to an earlier place.
* Remove argument compatibility hacks (b) and (c).
* Add grey2mono, dither2mono and mono2grey to imageop.
* Dup the fd in socket.fromfd().
* Added new modules mpz, md5 (by JH, requiring GNU MP 1.2). Affects
Makefile and config.c.
* socketmodule.c: added socket.fromfd(fd, family, type, [proto]),
converted socket() to use of getargs().
* flmodule.c: added {do,check}_only_forms to fl's list of functions;
and don't print a message when an unknown object is returned.
* pythonrun.c: catch SIGHUP and SIGTERM to do essential cleanup.
* Made jpegmodule.c smaller by using getargs() and mkvalue() consistently.
* Increased parser stack size to 500 in parser.h.
* Implemented custom allocation of stack frames to frameobject.c and
added dynamic stack overflow checks (value stack only) to ceval.c.
(There seems to be a bug left: sometimes stack traces don't make sense.)
sys.stderr or sys.stdin, and to work with any object as long as it has
a write() (respectively readline()) methods. Some functions that took
a FILE* argument now take an object* argument.
argument to malloc() (size_t or unsigned int)
* listobject.c: check for overflow of the size of the object,
so things like range(0x7fffffff) will raise MemoryError instead
of calling malloc() with -4 (and then crashing -- malloc's fault)
coercion is now completely generic.
* ceval.c: for instances, don't coerce for + and *; * reverses
arguments if left one is non-instance numeric and right one sequence.
* socketmodule.c: get rid of makepair(); fix makesocketaddr to fix
broken recvfrom()
* socketmodule: get rid of getStrarg()
* ceval.h: move eval_code() to new file eval.h, so compile.h is no
longer needed.
* ceval.c: move thread comments to ceval.h; always make save/restore
thread functions available (for dynloaded modules)
* cdmodule.c, listobject.c: don't include compile.h
* flmodule.c: include ceval.h
* import.c: include eval.h instead of ceval.h
* cgen.py: add forground(); noport(); winopen(""); to initgl().
* bltinmodule.c, socketmodule.c, fileobject.c, posixmodule.c,
selectmodule.c:
adapt to threads (add BGN/END SAVE macros)
* stdwinmodule.c: adapt to threads and use a special stdwin lock.
* pythonmain.c: don't include getpythonpath().
* pythonrun.c: use BGN/END SAVE instead of direct calls; also more
BGN/END SAVE calls etc.
* thread.c: bigger stack size for sun; change exit() to _exit()
* threadmodule.c: use BGN/END SAVE macros where possible
* timemodule.c: adapt better to threads; use BGN/END SAVE; add
longsleep internal function if BSD_TIME; cosmetics
* split pythonmain.c in two: most stuff goes to pythonrun.c, in the library.
* new optional built-in threadmodule.c, build upon Sjoerd's thread.{c,h}.
* new module from Sjoerd: mmmodule.c (dynamically loaded).
* new module from Sjoerd: sv (svgen.py, svmodule.c.proto).
* new files thread.{c,h} (from Sjoerd).
* new xxmodule.c (example only).
* myselect.h: bzero -> memset
* select.c: bzero -> memset; removed global variable