This change implements a new bytecode compiler, based on a
transformation of the parse tree to an abstract syntax defined in
Parser/Python.asdl.
The compiler implementation is not complete, but it is in stable
enough shape to run the entire test suite excepting two disabled
tests.
- SF Bug #772896, unknown encoding results in MemoryError, which is not helpful
I will only backport the segfault fix. I'll let Anthony decide if he wants
the other changes backported. I will do the backport if asked.
because (essentially) I didn't realise that PY_BEGIN/END_ALLOW_THREADS
actually expanded to nothing under a no-threads build, so if you somehow
NULLed out the threadstate (e.g. by calling PyThread_SaveThread) it would
stay NULLed when you return to Python. Argh!
Backport candidate.
modes like non-interactive modes. This allows for non-latin-1 users
to write unicode strings directly and sets Japanese users free from
weird manual escaping <wink> in shift_jis environments.
(Reviewed by Martin v. Loewis)
[ 960406 ] unblock signals in threads
although the changes do not correspond exactly to any patch attached to
that report.
Non-main threads no longer have all signals masked.
A different interface to readline is used.
The handling of signals inside calls to PyOS_Readline is now rather
different.
These changes are all a bit scary! Review and cross-platform testing
much appreciated.
A change from Duncan Booth, to deal with changes in the way pgen gets
built. Note that graminit.[ch] aren't normally built on Windows (they're
obtained from CVS).
the build, so I'm restoring it. I'm not sure what Neal's intent was,
since the line following the one he removed was "REQN(i, 1)" so i is
obviously used. ;)
with an indented code block but no newline would raise SyntaxError.
This would have been a four-line change in parsetok.c... Except
codeop.py depends on this behavior, so a compilation flag had to be
invented that causes the tokenizer to revert to the old behavior;
this required extra changes to 2 .h files, 2 .c files, and 2 .py
files. (Fixes SF bug #501622.)
(see Patch 512981). I changed stdin to sys_stdin in the body of the
function, but did not change stderr to sys_stdout, though I suspect that may
be the correct course. I don't know the code involved well enough to judge.
-- replace then with slightly faster PyObject_Call(o,a,NULL). (The
difference is that the latter requires a to be a tuple; the former
allows other values and wraps them in a tuple if necessary; it
involves two more levels of C function calls to accomplish all that.)
Change the parser and compiler to use PyMalloc.
Only the files implementing processes that will request memory
allocations small enough for PyMalloc to be a win have been
changed, which are:-
- Python/compile.c
- Parser/acceler.c
- Parser/node.c
- Parser/parsetok.c
This augments the aggressive overallocation strategy implemented by
Tim Peters in PyNode_AddChild() [Parser/node.c], in reducing the
impact of platform malloc()/realloc()/free() corner case behaviour.
Such corner cases are known to be triggered by test_longexp and
test_import.
Jeremy Hylton, in accepting this patch, recommended this as a
bugfix candidate for 2.2. While the changes to Python/compile.c
and Parser/node.c backport easily (and could go in), the changes
to Parser/acceler.c and Parser/parsetok.c require other not
insignificant changes as a result of the differences in the memory
APIs between 2.3 and 2.2, which I'm not in a position to work
through at the moment. This is a pity, as the Parser/parsetok.c
changes are the most important after the Parser/node.c changes, due
to the size of the memory requests involved and their frequency.
disaster too, so this change is here to stay. Beefed up the comments
and added some stats Andrew reported. Also a small change to the
macro body, to make it obvious how XXXROUNDUP(0) ends up returning 0.
See SF patch 578297 for context.
Not a bugfix candidate, as the functional changes here have already
been backported to the 2.2 line (this patch just improves clarity).
This gets us closer to consistent Ctrl+C behaviour on NT and Win9x. NT now reliably generates KeyboardInterrupt exceptions for NT when a file IO operation was aborted. Bugfix candidate
more trivial lexical helper macros so that uses of these guys expand
to nothing at all when they're not enabled. This should help sub-
standard compilers that can't do a good job of optimizing away the
previous "(void)0" expressions.
Py_DECREF: There's only one definition of this now. Yay! That
was that last one in the family defined multiple times in an #ifdef
maze.
Py_FatalError(): Changed the char* signature to const char*.
_Py_NegativeRefcount(): New helper function for the Py_REF_DEBUG
expansion of Py_DECREF. Calling an external function cuts down on
the volume of generated code. The previous inline expansion of abort()
didn't work as intended on Windows (the program often kept going, and
the error msg scrolled off the screen unseen). _Py_NegativeRefcount
calls Py_FatalError instead, which captures our best knowledge of
how to abort effectively across platforms.
children gets large, to avoid severe platform realloc() degeneration
in extreme cases (like test_longexp).
Bugfix candidate.
This was doing extremely timid over-allocation, just rounding up to the
nearest multiple of 3. Now so long as the number of children is <= 128,
it rounds up to a multiple of 4 but via a much faster method. When the
number of children exceeds 128, though, and more space is needed, it
doubles the capacity. This is aggressive over-allocation.
SF patch <http://www.python.org/sf/578297> has Andrew MacIntyre using
PyMalloc in the parser to overcome platform malloc problems in
test_longexp on OS/2 EMX. Jack Jansen notes there that it didn't help
him on the Mac, because the Mac has problems with frequent ever-growing
reallocs, not just with gazillions of teensy mallocs. Win98 has no
visible problems with test_longexp, but I tried boosting the test-case
size and soon got "senseless" MemoryErrors out of it, and soon after
crashed the OS: as I've seen in many other contexts before, while the
Win98 realloc remains zippy in bad cases, it leads to extreme
fragmentation of user address space, to the point that the OS barfs.
I don't yet know whether this fixes Jack's Mac problems, but it does cure
Win98's problems when boosting the test case size. It also speeds
test_longexp in its unaltered state.
Highlights: import and friends will understand any of \r, \n and \r\n
as end of line. Python file input will do the same if you use mode 'U'.
Everything can be disabled by configuring with --without-universal-newlines.
See PEP278 for details.
This introduces:
- A new operator // that means floor division (the kind of division
where 1/2 is 0).
- The "future division" statement ("from __future__ import division)
which changes the meaning of the / operator to implement "true
division" (where 1/2 is 0.5).
- New overloadable operators __truediv__ and __floordiv__.
- New slots in the PyNumberMethods struct for true and floor division,
new abstract APIs for them, new opcodes, and so on.
I emphasize that without the future division statement, the semantics
of / will remain unchanged until Python 3.0.
Not yet implemented are warnings (default off) when / is used with int
or long arguments.
This has been on display since 7/31 as SF patch #443474.
Flames to /dev/null.
This is really stupid because it cannot be suppressed or altered using
the warning framework; that's because the warning framework is built
on Python interpreter internals, and the parser generator doesn't have
access to any of those (you cannot use anything of type PyObject * in
the parser).
But it's better than nothing, and implementing a proper check for this
appears to require modifying compile.c in a dozen places, for which I
don't have the stamina today. I promise we'll do better in 2.2a2.
At least it tells you the filename and line number (unlike the first
hack I considered :-).
that 'yield' is a keyword. This doesn't help test_generators at all! I
don't know why not. These things do work now (and didn't before this
patch):
1. "from __future__ import generators" now works in a native shell.
2. Similarly "python -i xxx.py" now has generators enabled in the
shell if xxx.py had them enabled.
3. This program (which was my doctest proxy) works fine:
from __future__ import generators
source = """\
def f():
yield 1
"""
exec compile(source, "", "single") in globals()
print type(f())
the yield statement. I figure we have to have this in before I can
release 2.2a1 on Wednesday.
Note: test_generators is currently broken, I'm counting on Tim to fix
this.
#115555.
The error from s_push() on stack overflow was -1, which was passed
through unchanged by push(), but not tested for by push()'s caller --
which only expected positive error codes. Fixed by changing s_push()
to return E_NOMEM on stack overflow. (Not quite the proper error code
either, but I can't be bothered adding a new E_STACKOVERFLOW error
code in all the right places.)
Add definitions of INT_MAX and LONG_MAX to pyport.h.
Remove includes of limits.h and conditional definitions of INT_MAX
and LONG_MAX elsewhere.
This closes SourceForge patch #101659 and bug #115323.
Add the EXTENDED_ARG opcode to the virtual machine, allowing 32-bit
arguments to opcodes instead of being forced to stick to the 16-bit
limit. This is especially useful for machine-generated code, which
can be too long for the SET_LINENO parameter to fit into 16 bits.
This closes the implementation portion of SourceForge patch #100893.
fields token and expected must also be initialized, otherwise the
tests in parsetok() can generate uninitialized memory read errors.
This quiets an Insure warning.
handlers "return void", according to ANSI C.
Removed the new Py_RETURN_FROM_SIGNAL_HANDLER macro.
Left RETSIGTYPE in the config stuff, because it's not clear to
me that others aren't relying on it (e.g., extension modules).
#if RETSIGTYPE != void
That isn't C, and MSVC properly refuses to compile it.
Introduced new Py_RETURN_FROM_SIGNAL_HANDLER macro in pyport.h
to expand to the correct thing based on RETSIGTYPE. However,
only void is ANSI! Do we still have platforms that return int?
The Unix config mess appears to #define RETSIGTYPE by magic
without being asked to, so I assume it's "a problem" across
Unices still.
Work around intrcheck.c's desire to pass 'PyErr_CheckSignals' to
'Py_AddPendingCall' by providing a (static) wrapper function that has the
right number of arguments.
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
used for indentation related errors. This patch includes Ping's
improvements for indentation-related error messages.
Closes SourceForge patches #100734 and #100856.
the number of children of a node exceeds the max possible value for
the short that is used to count them. The Python runtime converts
this parser error into the SyntaxError "expression too long."
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
signal handlers in a fork()ed child process when Python is compiled with
thread support. The bug was reported by Scott <scott@chronis.icgroup.com>.
What happens is that after a fork(), the variables used by the signal
module to determine whether this is the main thread or not are bogus,
and it decides that no thread is the main thread, so no signals will
be delivered.
The solution is the addition of PyOS_AfterFork(), which fixes the signal
module's variables. A dummy version of the function is present in the
intrcheck.c source file which is linked when the signal module is not
used.
Also grandly renamed.
Here's the new interface:
When WITH_READLINE is defined, two functions are defined:
- PyOS_GnuReadline (what used to be my_readline() with WITH_READLINE)
- PyOS_ReadlineInit (for Dave Ascher)
Always, these functions are defined:
- PyOS_StdioReadline (what used to be my_readline() without WITH_READLINE)
- PyOS_Readline (the interface used by tokenizer.c and [raw_]input().
There's a global function pointer PyOS_ReadlineFunctionPointer,
initialized to NULL. When PyOS_Readline finds this to be NULL, it
sets it to either PyOS_GnuReadline or PyOS_StdioReadline depending on
which one makes more sense (i.e. it uses GNU only if it is defined
*and* stdin is indeed a tty device).
An embedding program that has its own wishes can set the function
pointer to a function of its own design. It should take a char*
prompt argument (which may be NULL) and return a string *ending in a
\n character* -- or "" for EOF or NULL for a user interrupt.
--Guido van Rossum (home page: http://www.python.org/~guido/)