Added special case to unicode(): when being passed a
Unicode object as first argument, return the object as-is.
Raises an exception when given a Unicode object *and* an
encoding name.
comparing code objects. This give sless surprising results in
-Optimized code. It also sorts code objects by name, now.
[I changed the patch to hash() slightly to touch fewer lines.]
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent). Checkin messages:
New Unicode support for int(), float(), complex() and long().
- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
Unicode to a decimal char* string (used in the above new
APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above
Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
are masked, all other errors such as ValueError during
Unicode coercion are passed through (note that PyUnicode_Compare
does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
masked and 0 returned; all other errors are passed through
Better testing support for the standard codecs.
Misc minor enhancements, such as an alias dbcs for the mbcs codec.
Changes:
- PyLong_FromString() now applies the same error checks as
does PyInt_FromString(): trailing garbage is reported
as error and not longer silently ignored. The only characters
which may be trailing the digits are 'L' and 'l' -- these
are still silently ignored.
- string.ato?() now directly interface to int(), long() and
float(). The error strings are now a little different, but
the type still remains the same. These functions are now
ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
in the input string; PyNumber_Long() already did this (and
still does)
Followed by:
Looks like I've gone a step too far there... (and test_contains.py
seem to have a bug too).
I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
If a non-tuple sequence is passed as the *arg, convert it to a tuple
before checking its length.
If named keyword arguments are used in combination with **kwargs, make
a copy of kwargs before inserting the new keys.
the return value of PySequence_Length(). If an exception occurred,
the returned length will be -1. Make sure this doesn't get obscurred,
and that the bogus length isn't used.
executive summary:
Instead of typing 'apply(f, args, kwargs)' you can type 'f(*arg, **kwargs)'.
Some file-by-file details follow.
Grammar/Grammar:
simplify varargslist, replacing '*' '*' with '**'
add * & ** options to arglist
Include/opcode.h & Lib/dis.py:
define three new opcodes
CALL_FUNCTION_VAR
CALL_FUNCTION_KW
CALL_FUNCTION_VAR_KW
Python/ceval.c:
extend TypeError "keyword parameter redefined" message to include
the name of the offending keyword
reindent CALL_FUNCTION using four spaces
add handling of sequences and dictionaries using extend calls
fix function import_from to use PyErr_Format
The attached patch set includes a workaround to get Python with
Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause
is a bug in the BSDI wchar.h header file) and Python interfaces
for the MBCS codec donated by Mark Hammond.
Also included are some minor corrections w/r to the docs of
the new "es" and "es#" parser markers (use PyMem_Free() instead
of free(); thanks to Mark Hammond for finding these).
The unicodedata tests are now in a separate file
(test_unicodedata.py) to avoid problems if the module cannot
be found.
Attached you find the latest update of the Unicode implementation.
The patch is against the current CVS version.
It includes the fix I posted yesterday for the core dump problem
in codecs.c (was introduced by my previous patch set -- sorry),
adds more tests for the codecs and two new parser markers
"es" and "es#".
Andy Robinson noted a core dump in the codecs.c file. This
was introduced by my latest patch which fixed a memory leak
in codecs.c. The bug causes all successful codec lookups to fail.
Attached you find an update of the Unicode implementation.
The patch is against the current CVS version. I would appreciate
if someone with CVS checkin permissions could check the changes
in.
The patch contains all bugs and patches sent this week and also
fixes a leak in the codecs code and a bug in the free list code
for Unicode objects (which only shows up when compiling Python
with Py_DEBUG; thanks to MarkH for spotting this one).
Added wrapping macros to dictobject.c, listobject.c, tupleobject.c,
frameobject.c, traceback.c that safely prevends core dumps
on stack overflow. Macros and functions in object.c, object.h.
The method is an "elevator destructor" that turns cascading
deletes into tail recursive behavior when some limit is hit.
* Changes to a recent patch by Chris Tismer to errors.c. Chris' patch
always used FormatMessage() to get the error message passing the error code
from errno - but errno and FormatMessage use a different numbering scheme.
The main reason the patch looked OK was that ENOFILE==ERROR_FILE_NOT_FOUND -
but that is about the only shared error code :-). The MS CRT docs tell you
to use _sys_errlist()/_sys_nerr. My patch does also this, and adds a very
similar function specifically for win32 error codes.
PR#175 -- when exec is passed a code object, it didn't sync the locals
from the dictionary back into their fast representation.
Also took the time to remove some repetitive code there and to do the
syncing even when an exception is raised (since a partial effect
should still be synced).
* in import.c, #ifdef out references to dynamic loading based on
HAVE_DYNAMIC_LOADING
* clean out the platform-specific crud from importdl.c.
[ maybe fold this function into import.c and drop the importdl.c file? Greg.]
* change GetDynLoadFunc's "funcname" parameter to "shortname". change
"name" to "fqname" for clarification.
* each GetDynLoadFunc now creates its own funcname value.
WARNING: as I mentioned previously, we may run into an issue with a
missing "_" on some platforms. Testing will show this pretty quickly,
however.
* move pathname munging into dynload_shlib.c
Here's a patch that avoids a warning caused by the "const char* pathname"
declaration for _PyImport_GetDynLoadFunc (in dynload_aix). The "aix_load"
function's 1st arg is prototyped as "char *pathname".