Adds a single function to improve generated bytecode. Has a single line
attachment point, so it is completely de-coupled from both the compiler
and ceval.c.
Makes three simple transforms that do not require a basic block analysis
or re-ordering of code. Gives improved timings on pystone, pybench,
and any code using either "while 1" or "x,y=y,x".
Arranged that all the objects exposed by __builtin__ appear in the list
of all objects. I basically peed away two days tracking down a mystery
leak in sys.gettotalrefcount() in a ZODB app (== tons of code), because
the object leaking the references didn't appear in the sys.getobjects(0)
list. The object happened to be False. Now False is in the list, along
with other popular & previously missing leak candidates (like None).
Alas, we still don't have a choke point covering *all* Python objects,
so the list of all objects may still be incomplete.
variables to store internal data. As a result, any atempts to use the
unicode system with multiple active interpreters, or successive
interpreter executions, would fail.
Now that information is stored into members of the PyInterpreterState
structure.
Added two predictions:
GET_ITER --> FOR_ITER
FOR_ITER --> STORE_FAST or UNPACK_SEQUENCE
Improves timings on pybench and timeit.py. Pystone results are neutral.
Applied to common cases:
COMPARE_OP is often followed by a JUMP_IF.
JUMP_IF is usually followed by POP_TOP.
Shows improved timings on PyStone, PyBench, and specific tests
using timeit.py:
python timeit.py -s "x=1" "if x==1: pass"
python timeit.py -s "x=1" "if x==2: pass"
python timeit.py -s "x=1" "if x: pass"
python timeit.py -s "x=100" "while x!=1: x-=1"
Potential future candidates:
GET_ITER predicts FOR_ITER
FOR_ITER predicts STORE_FAST or UNPACK_SEQUENCE
Also, applied missing goto fast_next_opcode to DUP_TOPX.
My previous patches should have used fast_next_opcode
in a few places instead of continue.
Also, applied one PyInt_AS_LONG macro in a place where
the type had already been checked.
rarely needed, but can sometimes be useful to release objects
referenced by the traceback held in sys.exc_info()[2]. (SF patch
#693195.) Thanks to Kevin Jacobs!
import warnings.py _after_ site.py has run. This ensures that site.py
is again the first .py to be imported, giving it back full control over
sys.path.
with an indented code block but no newline would raise SyntaxError.
This would have been a four-line change in parsetok.c... Except
codeop.py depends on this behavior, so a compilation flag had to be
invented that causes the tokenizer to revert to the old behavior;
this required extra changes to 2 .h files, 2 .c files, and 2 .py
files. (Fixes SF bug #501622.)
mostly from SF patch #683257, but I had to change unlock_import() to
return an error value to avoid fatal error.
Should this be backported? The patch requested this, but it's a new
feature.
"Unsigned" (i.e., positive-looking, but really negative) hex/oct
constants with a leading minus sign are once again properly negated.
The micro-optimization for negated numeric constants did the wrong
thing for such hex/oct constants. The patch avoids the optimization
for all hex/oct constants.
This needs to be backported to Python 2.2!
object is not a real str or unicode but an instance
of a subclass, construct the output via looping
over __getitem__. This guarantees that the result
is the same for function==None and function==lambda x:x
This doesn't happen for tuples, because filtertuple()
uses PyTuple_GetItem().
(This was discussed on SF bug #665835).
-DCALL_PROFILE: Count the number of function calls executed.
When this symbol is defined, the ceval mainloop and helper functions
count the number of function calls made. It keeps detailed statistics
about what kind of object was called and whether the call hit any of
the special fast paths in the code.
Optimization:
When we take the fast_function() path, which seems to be taken for
most function calls, and there is minimal frame setup to do, avoid
call PyEval_EvalCodeEx(). The eval code ex function does a lot of
work to handle keywords args and star args, free variables,
generators, etc. The inlined version simply allocates the frame and
copies the arguments values into the frame.
The optimization gets a little help from compile.c which adds a
CO_NOFREE flag to code objects that don't have free variables or cell
variables. This change allows fast_function() to get into the fast
path with fewer tests.
I measure a couple of percent speedup in pystone with this change, but
there's surely more that can be done.
blindly assumed that tp_as_sequence->sq_item always returns
a str or unicode object. This might fail with str or unicode
subclasses.
This patch checks whether the object returned from __getitem__
is a str/unicode object and raises a TypeError if not (and
the filter function returned true).
Furthermore the result for __getitem__ can be more than one
character long, so checks for enough memory have to be done.
__module__ is the string name of the module the function was defined
in, just like __module__ of classes. In some cases, particularly for
C functions, the __module__ may be None.
Change PyCFunction_New() from a function to a macro, but keep an
unused copy of the function around so that we don't change the binary
API.
Change pickle's save_global() to use whichmodule() if __module__ is
None, but add the __module__ logic to whichmodule() since it might be
used outside of pickle.