Removed all instances of Py_UCS2 from the codebase, and so also (I hope)
the last remaining reliance on the platform having an integral type
with exactly 16 bits.
PyUnicode_DecodeUTF16() and PyUnicode_EncodeUTF16() now read and write
one byte at a time.
with functionality needed for both unix-Python and MacPython and a
new smaller ./Mac/Python/macglue.c which contains MacPython stuff only.
pymactoolbox.h has moved to ./Include from ./Mac/Include and now also
contains the relevant stuff from macglue.h.
The net effect of this is that the ./Mac subdirectory is not needed
anymore for building the unix-Python core on MacOSX (it is needed
for building the extension modules).
This introduces:
- A new operator // that means floor division (the kind of division
where 1/2 is 0).
- The "future division" statement ("from __future__ import division)
which changes the meaning of the / operator to implement "true
division" (where 1/2 is 0.5).
- New overloadable operators __truediv__ and __floordiv__.
- New slots in the PyNumberMethods struct for true and floor division,
new abstract APIs for them, new opcodes, and so on.
I emphasize that without the future division statement, the semantics
of / will remain unchanged until Python 3.0.
Not yet implemented are warnings (default off) when / is used with int
or long arguments.
This has been on display since 7/31 as SF patch #443474.
Flames to /dev/null.
- Add an explicit call to PyType_Ready(&PyList_Type) to pythonrun.c
(just for the heck of it, really -- we should either explicitly
ready all types, or none).
Python warning which can be catched by means of the Python warning
framework.
It also adds two new APIs which hopefully make it easier for Python
to switch to buffer overflow safe [v]snprintf() APIs for error
reporting et al. The two new APIs are PyOS_snprintf() and
PyOS_vsnprintf() and work just like the standard ones in many
C libs. On platforms which have snprintf(), the native APIs are used,
on all other an emulation with snprintf() tries to do its best.
And remove all the extern decls in the middle of .c files.
Apparently, it was excluded from the header file because it is
intended for internal use by the interpreter. It's still intended for
internal use and documented as such in the header file.
that 'yield' is a keyword. This doesn't help test_generators at all! I
don't know why not. These things do work now (and didn't before this
patch):
1. "from __future__ import generators" now works in a native shell.
2. Similarly "python -i xxx.py" now has generators enabled in the
shell if xxx.py had them enabled.
3. This program (which was my doctest proxy) works fine:
from __future__ import generators
source = """\
def f():
yield 1
"""
exec compile(source, "", "single") in globals()
print type(f())
that info to code dynamically compiled *by* code compiled with generators
enabled. Doesn't yet work because there's still no way to tell the parser
that "yield" is OK (unlike nested_scopes, the parser has its fingers in
this too).
Replaced PyEval_GetNestedScopes by a more-general
PyEval_MergeCompilerFlags. Perhaps I should not have? I doubted it was
*intended* to be part of the public API, so just did.
the yield statement. I figure we have to have this in before I can
release 2.2a1 on Wednesday.
Note: test_generators is currently broken, I'm counting on Tim to fix
this.
Although this is a one-character change, more work needs to be done:
the compiler can get rid of a lot of non-nested-scopes code, the
documentation needs to be updated, the future statement can be
simplified, etc.
But this change enables the nested scope semantics everywhere, and
that's the important part -- we can now test code for compatibility
with nested scopes by default.
path (with no profile/trace function) through eval_code2() and
eval_frame() avoids several checks.
In the common cases of calls, returns, and exception propogation,
eval_code2() and eval_frame() used to test two values in the
thread-state: the profiling function and the tracing function. With
this change, a flag is set in the thread-state if either of these is
active, allowing a single check to suffice when both are NULL. This
also simplifies the code needed when either function is in use but is
already active (to avoid profiling/tracing the profiler/tracer); the
flag is set to 0 when the profile/trace code is entered, allowing the
same check to suffice for "already in the tracer" for call/return/
exception events.
Python interpreter.
This change adds two new C-level APIs: PyEval_SetProfile() and
PyEval_SetTrace(). These can be used to install profile and trace
functions implemented in C, which can operate at much higher speeds
than Python-based functions. The overhead for calling a C-based
profile function is a very small fraction of a percent of the overhead
involved in calling a Python-based function.
The machinery required to call a Python-based profile or trace
function been moved to sysmodule.c, where sys.setprofile() and
sys.setprofile() simply become users of the new interface.
Implement sys.maxunicode.
Explicitly wrap around upper/lower computations for wide Py_UNICODE.
When decoding large characters with UTF-8, represent expected test
results using the \U notation.
Add configure option --enable-unicode.
Add config.h macros Py_USING_UNICODE, PY_UNICODE_TYPE, Py_UNICODE_SIZE,
SIZEOF_WCHAR_T.
Define Py_UCS2.
Encode and decode large UTF-8 characters into single Py_UNICODE values
for wide Unicode types; likewise for UTF-16.
Remove test whether sizeof Py_UNICODE is two.
unicodeobject.h, which forces sizeof(Py_UNICODE) == sizeof(Py_UCS4).
(this may be good enough for platforms that doesn't have a 16-bit
type. the UTF-16 codecs don't work, though)
the next free valuestack slot, not to the base (in America, stacks push
and pop at the top -- they mutate at the bottom in Australia <winK>).
eval_frame(): assert that f_stacktop isn't NULL upon entry.
frame_delloc(): avoid ordered pointer comparisons involving f_stacktop
when f_stacktop is NULL.
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.
I don't consider this a bugfix candidate, as it's a performance boost.
Added _PyString_Join() to the internal string API. If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
_PyLong_FromByteArray
_PyLong_AsByteArray
Untested and probably buggy -- they compile OK, but nothing calls them
yet. Will soon be called by the struct module, to implement x-platform
'q' and 'Q'.
If other people have uses for them, we could move them into the public API.
See longobject.h for usage details.
UTF-16 codec will now interpret and remove a *leading* BOM mark. Sub-
sequent BOM characters are no longer interpreted and removed.
UTF-16-LE and -BE pass through all BOM mark characters.
These changes should get the UTF-16 codec more in line with what
the Unicode FAQ recommends w/r to BOM marks.
and introduces a new method .decode().
The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.
Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.
The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.
As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).
Written by Marc-Andre Lemburg. Copyright assigned to the PSF.