This introduces:
- A new operator // that means floor division (the kind of division
where 1/2 is 0).
- The "future division" statement ("from __future__ import division)
which changes the meaning of the / operator to implement "true
division" (where 1/2 is 0.5).
- New overloadable operators __truediv__ and __floordiv__.
- New slots in the PyNumberMethods struct for true and floor division,
new abstract APIs for them, new opcodes, and so on.
I emphasize that without the future division statement, the semantics
of / will remain unchanged until Python 3.0.
Not yet implemented are warnings (default off) when / is used with int
or long arguments.
This has been on display since 7/31 as SF patch #443474.
Flames to /dev/null.
- Add an explicit call to PyType_Ready(&PyList_Type) to pythonrun.c
(just for the heck of it, really -- we should either explicitly
ready all types, or none).
Python warning which can be catched by means of the Python warning
framework.
It also adds two new APIs which hopefully make it easier for Python
to switch to buffer overflow safe [v]snprintf() APIs for error
reporting et al. The two new APIs are PyOS_snprintf() and
PyOS_vsnprintf() and work just like the standard ones in many
C libs. On platforms which have snprintf(), the native APIs are used,
on all other an emulation with snprintf() tries to do its best.
And remove all the extern decls in the middle of .c files.
Apparently, it was excluded from the header file because it is
intended for internal use by the interpreter. It's still intended for
internal use and documented as such in the header file.
that 'yield' is a keyword. This doesn't help test_generators at all! I
don't know why not. These things do work now (and didn't before this
patch):
1. "from __future__ import generators" now works in a native shell.
2. Similarly "python -i xxx.py" now has generators enabled in the
shell if xxx.py had them enabled.
3. This program (which was my doctest proxy) works fine:
from __future__ import generators
source = """\
def f():
yield 1
"""
exec compile(source, "", "single") in globals()
print type(f())
that info to code dynamically compiled *by* code compiled with generators
enabled. Doesn't yet work because there's still no way to tell the parser
that "yield" is OK (unlike nested_scopes, the parser has its fingers in
this too).
Replaced PyEval_GetNestedScopes by a more-general
PyEval_MergeCompilerFlags. Perhaps I should not have? I doubted it was
*intended* to be part of the public API, so just did.
the yield statement. I figure we have to have this in before I can
release 2.2a1 on Wednesday.
Note: test_generators is currently broken, I'm counting on Tim to fix
this.
Although this is a one-character change, more work needs to be done:
the compiler can get rid of a lot of non-nested-scopes code, the
documentation needs to be updated, the future statement can be
simplified, etc.
But this change enables the nested scope semantics everywhere, and
that's the important part -- we can now test code for compatibility
with nested scopes by default.
path (with no profile/trace function) through eval_code2() and
eval_frame() avoids several checks.
In the common cases of calls, returns, and exception propogation,
eval_code2() and eval_frame() used to test two values in the
thread-state: the profiling function and the tracing function. With
this change, a flag is set in the thread-state if either of these is
active, allowing a single check to suffice when both are NULL. This
also simplifies the code needed when either function is in use but is
already active (to avoid profiling/tracing the profiler/tracer); the
flag is set to 0 when the profile/trace code is entered, allowing the
same check to suffice for "already in the tracer" for call/return/
exception events.
Python interpreter.
This change adds two new C-level APIs: PyEval_SetProfile() and
PyEval_SetTrace(). These can be used to install profile and trace
functions implemented in C, which can operate at much higher speeds
than Python-based functions. The overhead for calling a C-based
profile function is a very small fraction of a percent of the overhead
involved in calling a Python-based function.
The machinery required to call a Python-based profile or trace
function been moved to sysmodule.c, where sys.setprofile() and
sys.setprofile() simply become users of the new interface.
Implement sys.maxunicode.
Explicitly wrap around upper/lower computations for wide Py_UNICODE.
When decoding large characters with UTF-8, represent expected test
results using the \U notation.
Add configure option --enable-unicode.
Add config.h macros Py_USING_UNICODE, PY_UNICODE_TYPE, Py_UNICODE_SIZE,
SIZEOF_WCHAR_T.
Define Py_UCS2.
Encode and decode large UTF-8 characters into single Py_UNICODE values
for wide Unicode types; likewise for UTF-16.
Remove test whether sizeof Py_UNICODE is two.
unicodeobject.h, which forces sizeof(Py_UNICODE) == sizeof(Py_UCS4).
(this may be good enough for platforms that doesn't have a 16-bit
type. the UTF-16 codecs don't work, though)
the next free valuestack slot, not to the base (in America, stacks push
and pop at the top -- they mutate at the bottom in Australia <winK>).
eval_frame(): assert that f_stacktop isn't NULL upon entry.
frame_delloc(): avoid ordered pointer comparisons involving f_stacktop
when f_stacktop is NULL.
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.
I don't consider this a bugfix candidate, as it's a performance boost.
Added _PyString_Join() to the internal string API. If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
_PyLong_FromByteArray
_PyLong_AsByteArray
Untested and probably buggy -- they compile OK, but nothing calls them
yet. Will soon be called by the struct module, to implement x-platform
'q' and 'Q'.
If other people have uses for them, we could move them into the public API.
See longobject.h for usage details.
UTF-16 codec will now interpret and remove a *leading* BOM mark. Sub-
sequent BOM characters are no longer interpreted and removed.
UTF-16-LE and -BE pass through all BOM mark characters.
These changes should get the UTF-16 codec more in line with what
the Unicode FAQ recommends w/r to BOM marks.
and introduces a new method .decode().
The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.
Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.
The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.
As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).
Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
Store floats and doubles to full precision in marshal.
Test that floats read from .pyc/.pyo closely match those read from .py.
Declare PyFloat_AsString() in floatobject header file.
Add new PyFloat_AsReprString() API function.
Document the functions declared in floatobject.h.
safely together and don't duplicate logic (the common logic was factored
out into new private API function _PySequence_IterContains()).
Visible change:
some_complex_number in some_instance
no longer blows up if some_instance has __getitem__ but neither
__contains__ nor __iter__. test_iter changed to ensure that remains true.
NEEDS DOC CHANGES.
This one surprised me! While I expected tuple() to be a no-brainer, turns
out it's actually dripping with consequences:
1. It will *allow* the popular PySequence_Fast() to work with any iterable
object (code for that not yet checked in, but should be trivial).
2. It caused two std tests to fail. This because some places used
PyTuple_Sequence() (the C spelling of tuple()) as an indirect way to test
whether something *is* a sequence. But tuple() code only looked for the
existence of sq->item to determine that, and e.g. an instance passed
that test whether or not it supported the other operations tuple()
needed (e.g., __len__). So some things the tests *expected* to fail
with an AttributeError now fail with a TypeError instead. This looks
like an improvement to me; e.g., test_coercion used to produce 559
TypeErrors and 2 AttributeErrors, and now they're all TypeErrors. The
error details are more informative too, because the places calling this
were *looking* for TypeErrors in order to replace the generic tuple()
"not a sequence" msg with their own more specific text, and
AttributeErrors snuck by that.
patch for sharing single character Unicode objects.
Martin's patch had to be reworked in a number of ways to take Unicode
resizing into consideration as well. Here's what the updated patch
implements:
* Single character Unicode strings in the Latin-1 range are shared
(not only ASCII chars as in Martin's original patch).
* The ASCII and Latin-1 codecs make use of this optimization,
providing a noticable speedup for single character strings. Most
Unicode methods can use the optimization as well (by virtue
of using PyUnicode_FromUnicode()).
* Some code cleanup was done (replacing memcpy with Py_UNICODE_COPY)
* The PyUnicode_Resize() can now also handle the case of resizing
unicode_empty which previously resulted in an error.
* Modified the internal API _PyUnicode_Resize() and
the public PyUnicode_Resize() API to handle references to
shared objects correctly. The _PyUnicode_Resize() signature
changed due to this.
* Callers of PyUnicode_FromUnicode() may now only modify the Unicode
object contents of the returned object in case they called the API
with NULL as content template.
Note that even though this patch passes the regression tests, there
may still be subtle bugs in the sharing code.
sees it (test_iter.py is unchanged).
- Added a tp_iternext slot, which calls the iterator's next() method;
this is much faster for built-in iterators over built-in types
such as lists and dicts, speeding up pybench's ForLoop with about
25% compared to Python 2.1. (Now there's a good argument for
iterators. ;-)
- Renamed the built-in sequence iterator SeqIter, affecting the C API
functions for it. (This frees up the PyIter prefix for generic
iterator operations.)
- Added PyIter_Check(obj), which checks that obj's type has a
tp_iternext slot and that the proper feature flag is set.
- Added PyIter_Next(obj) which calls the tp_iternext slot. It has a
somewhat complex return condition due to the need for speed: when it
returns NULL, it may not have set an exception condition, meaning
the iterator is exhausted; when the exception StopIteration is set
(or a derived exception class), it means the same thing; any other
exception means some other error occurred.
new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines
TODO:
documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)
Update docstring and library reference section on 'sys' module.
New API PyErr_Display, just for displaying errors, called by excepthook.
Uncaught exceptions now call sys.excepthook; if that fails, we fall back
to calling PyErr_Display directly.
Also comes with sys.__excepthook__ and sys.__displayhook__.
must now initialize the extra field used by the weak-ref machinery to
NULL themselves, to avoid having to require PyObject_INIT() to check
if the type supports weak references and do it there. This causes less
work to be done for all objects (the type object does not need to be
consulted to check for the Py_TPFLAGS_HAVE_WEAKREFS bit).
If a module has a future statement enabling nested scopes, they are
also enable for the exec statement and the functions compile() and
execfile() if they occur in the module.
If Python is run with the -i option, which enters interactive mode
after executing a script, and the script it runs enables nested
scopes, they are also enabled in interactive mode.
XXX The use of -i with -c "from __future__ import nested_scopes" is
not supported. What's the point?
To support these changes, many function variants have been added to
pythonrun.c. All the variants names end with Flags and they take an
extra PyCompilerFlags * argument. It is possible that this complexity
will be eliminated in a future version of the interpreter in which
nested scopes are not optional.
with free variables. Thanks to Martin v. Loewis for finding two of
the problems. This fixes SF buf 405583.
There is also a C API change: PyFrame_New() is reverting to its
pre-2.1 signature. The change introduced by nested scopes was a
mistake. XXX Is this okay between beta releases?
cell_clear(), the GC helper, must decref its reference to break
cycles.
frame_dealloc() must dealloc all cell vars and free vars in addition
to locals.
eval_code2() setup code must INCREF cells it copies out of the
closure.
The STORE_DEREF opcode implementation must DECREF the object it passes
to PyCell_Set().
(Also remove warning about module-level global decl, because we can't
distinguish from code passed to exec.)
Define PyCompilerFlags type contains a single element,
cf_nested_scopes, that is true if a nested scopes future statement has
been entered at the interactive prompt.
New API functions:
PyNode_CompileFlags()
PyRun_InteractiveOneFlags()
-- same as their non Flags counterparts except that the take an
optional PyCompilerFlags pointer
compile.c: In jcompile() use PyCompilerFlags argument. If
cf_nested_scopes is true, compile code with nested scopes. If it
is false, but the code has a valid future nested scopes statement,
set it to true.
pythonrun.c: Create a new PyCompilerFlags object in
PyRun_InteractiveLoop() and thread it through to
PyRun_InteractiveOneFlags().
for errors raised in future.c.
Move some helper functions from compile.c to errors.c and make them
API functions: PyErr_SyntaxLocation() and PyErr_ProgramText().
XXX still need to integrate into symtable API
compile.h: Remove ff_n_simple_stmt; obsolete.
Add ff_found_docstring used internally to skip one and only
one string at the beginning of a module.
compile.c: Add check for from __future__ imports to far into the file.
In symtable_global() check for -1 returned from
symtable_lookup(), which signifies name not defined.
Add missing DECERF in symtable_add_def.
Free c->c_future.
future.c: Add special handling for multiple statements joined on a
single line using one or more semicolons; this form can
include an illegal future statement that would otherwise be
hard to detect.
Add support for detecting and skipping doc strings.
Makefile.pre.in: add target future.o
Include/compile.h: define PyFutureFeaters and PyNode_Future()
add c_future slot to struct compiling
Include/symtable.h: add st_future slot to struct symtable
Python/future.c: implementation of PyNode_Future()
Python/compile.c: use PyNode_Future() for nested_scopes support
Python/symtable.c: include compile.h to pick up PyFutureFeatures decl
compile.h: #define NESTED_SCOPES_DEFAULT 0 for Python 2.1
__future__ feature name: "nested_scopes"
symtable.h: Add st_nested_scopes slot. Define flags to track exec and
import star.
Lib/test/test_scope.py: requires nested scopes
compile.c: Fiddle with error messages.
Reverse the sense of ste_optimized flag on
PySymtableEntryObjects. If it is true, there is an optimization
conflict.
Modify get_ref_type to respect st_nested_scopes flags.
Refactor symtable_load_symbols() into several smaller functions,
which use struct symbol_info to share variables. In new function
symtable_update_flags(), raise an error or warning for import * or
bare exec that conflicts with nested scopes. Also, modify handle
for free variables to respect st_nested_scopes flag.
In symtable_init() assign st_nested_scopes flag to
NESTED_SCOPES_DEFAULT (defined in compile.h).
Add preliminary and often incorrect implementation of
symtable_check_future().
Add symtable_lookup() helper for future use.
release the interned string dictionary. This is useful for memory
use debugging because it eliminates a huge source of noise from the
reports. Only defined when INTERN_STRINGS is defined.
of nested functions. Either is allowed in a function if it contains
no defs or lambdas or the defs and lambdas it contains have no free
variables. If a function is itself nested and has free variables,
either is illegal.
Revise the symtable to use a PySymtableEntryObject, which holds all
the revelent information for a scope, rather than using a bunch of
st_cur_XXX pointers in the symtable struct. The changes simplify the
internal management of the current symtable scope and of the stack.
Added new C source file: Python/symtable.c. (Does the Windows build
process need to be updated?)
As part of these changes, the initial _symtable module interface
introduced in 2.1a2 is replaced. A dictionary of
PySymtableEntryObjects are returned.
symtable.h, so that they can be used by external module.
Improve error handling in symtable_enter_scope(), which return an
error code that went unchecked by most callers. XXX The error handling
in symtable code is sloppy in general.
Modify symtable to record the line number that begins each scope.
This can help to identify which code block is being referred to when
multiple blocks are bound to the same name.
Add st_scopes dict that is used to preserve scope info when
PyNode_CompileSymtable() is called. Otherwise, this information is
tossed as soon as it is no longer needed.
Add Py_SymtableString() to pythonrun; analogous to Py_CompileString().
discussion on python-dev. 'from mod import *' is still banned except
at the module level.
Fix value for special NOOPT entry in symtable. Initialze to 0 instead
of None, so that later uses of PyInt_AS_LONG() are valid. (Bug
reported by Donn Cave.)
replace local REPR macros with PyObject_REPR in object.h
This change eliminates an extra malloc/free when a frame with free
variables is created. Any cell vars or free vars are stored in
f_localsplus after the locals and before the stack.
eval_code2() fills in the appropriate values after handling
initialization of locals.
To track the size the frame has an f_size member that tracks the total
size of f_localsplus. It used to be implicitly f_nlocals + f_stacksize.
They're named as if public, so I did a Bad Thing by changing
PyMarshal_ReadObjectFromFile() to suck up the remainder of the file in one
gulp: anyone who counted on that leaving the file pointer merely at the
end of the next object would be screwed. So restored
PyMarshal_ReadObjectFromFile() to its earlier state, renamed the new greedy
code to PyMarshal_ReadLastObjectFromFile(), and changed Python internals to
call the latter instead.
The majority of the changes are in the compiler. The mainloop changes
primarily to implement the new opcodes and to pass a function's
closure to eval_code2(). Frames and functions got new slots to hold
the closure.
Include/compile.h
Add co_freevars and co_cellvars slots to code objects.
Update PyCode_New() to take freevars and cellvars as arguments
Include/funcobject.h
Add func_closure slot to function objects.
Add GetClosure()/SetClosure() functions (and corresponding
macros) for getting at the closure.
Include/frameobject.h
PyFrame_New() now takes a closure.
Include/opcode.h
Add four new opcodes: MAKE_CLOSURE, LOAD_CLOSURE, LOAD_DEREF,
STORE_DEREF.
Remove comment about old requirement for opcodes to fit in 7
bits.
compile.c
Implement changes to code objects for co_freevars and co_cellvars.
Modify symbol table to use st_cur_name (string object for the name
of the current scope) and st_cur_children (list of nested blocks).
Also define st_nested, which might more properly be called
st_cur_nested. Add several DEF_XXX flags to track def-use
information for free variables.
New or modified functions of note:
com_make_closure(struct compiling *, PyCodeObject *)
Emit LOAD_CLOSURE opcodes as needed to pass cells for free
variables into nested scope.
com_addop_varname(struct compiling *, int, char *)
Emits opcodes for LOAD_DEREF and STORE_DEREF.
get_ref_type(struct compiling *, char *name)
Return NAME_CLOSURE if ref type is FREE or CELL
symtable_load_symbols(struct compiling *)
Decides what variables are cell or free based on def-use info.
Can now raise SyntaxError if nested scopes are mixed with
exec or from blah import *.
make_scope_info(PyObject *, PyObject *, int, int)
Helper functions for symtable scope stack.
symtable_update_free_vars(struct symtable *)
After a code block has been analyzed, it must check each of
its children for free variables that are not defined in the
block. If a variable is free in a child and not defined in
the parent, then it is defined by block the enclosing the
current one or it is a global. This does the right logic.
symtable_add_use() is now a macro for symtable_add_def()
symtable_assign(struct symtable *, node *)
Use goto instead of for (;;)
Fixed bug in symtable where name of keyword argument in function
call was treated as assignment in the scope of the call site. Ex:
def f():
g(a=2) # a was considered a local of f
ceval.c
eval_code2() now take one more argument, a closure.
Implement LOAD_CLOSURE, LOAD_DEREF, STORE_DEREF, MAKE_CLOSURE>
Also: When name error occurs for global variable, report that the
name was global in the error mesage.
Objects/frameobject.c
Initialize f_closure to be a tuple containing space for cellvars
and freevars. f_closure is NULL if neither are present.
Objects/funcobject.c
Add support for func_closure.
Python/import.c
Change the magic number.
Python/marshal.c
Track changes to code objects.