Heavily revised, comprising revisions:
46640 - original trunk revision (backed out in r46655)
46647 - markup fix (backed out in r46655)
46692:46918 merged from branch aimacintyre-sf1454481
branch tested on buildbots (Windows buildbots had problems
not related to these changes).
46640 Patch #1454481: Make thread stack size runtime tunable.
46647 Markup fix
The first is causing many buildbots to fail test runs, and there
are multiple causes with seemingly no immediate prospects for
repairing them. See python-dev discussion.
Note that a branch can (and should) be created for resolving these
problems, like
svn copy svn+ssh://svn.python.org/python/trunk -r46640 svn+ssh://svn.python.org/python/branches/NEW_BRANCH
followed by merging rev 46647 to the new branch.
set_exc_info(), reset_exc_info(): By exploiting the
likely (who knows?) invariant that when an exception's
`type` is NULL, its `value` and `traceback` are also NULL,
save some cycles in heavily-executed code.
This is a "a kronar saved is a kronar earned" patch: the
speedup isn't reliably measurable, but it obviously does
reduce the operation count in the normal (no exception
raised) path through PyEval_EvalFrameEx().
The tim-exc_sanity branch tries to push this harder, but
is still blowing up (at least in part due to pre-existing
subtle bugs that appear to have no other visible
consequences!).
Not a bugfix candidate.
both mystrtoul.c and longobject.c. Share the table instead. Also
cut its size by 64 entries (they had been used for an inscrutable
trick originally, but the code no longer tries to use that trick).
Applied patch zombie-frames-2.diff from sf patch 876206 with updates for
Python 2.5 and also modified to retain the free_list to avoid the 67%
slow-down in pybench recursion test. 5% speed up in function call pybench.
- Warn-raise ImportWarning when importing would have picked up a directory
as package, if only it'd had an __init__.py. This swaps two tests (for
case-ness and __init__-ness), but case-test is not really more expensive,
and it's not in a speed-critical section.
- Test for the new warning by importing a common non-package directory on
sys.path: site-packages
- In regrtest.py, silence warnings generated by the build-environment
because Modules/ (which is added to sys.path for Setup-created modules)
has 'zlib' and '_ctypes' directories without __init__.py's.
MAXPATHLEN-sized buffers for various output-buffers (like to realpath()),
and that's correct on BSD platforms, but not Linux (which uses PATH_MAX, and
does not define MAXPATHLEN.) Cursory googling suggests Linux is following a
newer standard than BSD, but in cases like this, who knows. Using the
greater of PATH_MAX and 1024 as a fallback for MAXPATHLEN seems to be the
most portable solution.
Py_VISIT: cast the `op` argument to PyObject* when calling
`visit()`. Else the caller has to pay too much attention to
this silly detail (e.g., frame_traverse needs to traverse
`struct _frame *` and `PyCodeObject *` pointers too).
that are suspended outside of any try/except/finally blocks to be
garbage collected even if they are part of a cycle. Generators that
suspend inside of an active try/except or try/finally block (including
those created by a ``with`` statement) are still not GC-able if they
are part of a cycle, however.
tracing/line number table in except blocks.
Reflow long lines introduced by col_offset changes. Update test_ast
to handle new fields in excepthandler.
As note in Python.asdl says, we might want to rethink how attributes
are handled. Perhaps they should be the same as other fields, with
the primary difference being how they are defined for all types within
a sum.
Also fix asdl_c so that constructors with int fields don't fail when
passed a zero value.
adds the following API calls: PySet_Clear(), _PySet_Next(), and
_PySet_Update(). The latter two are considered non-public. Tests and
documentation (for the public API) are included.
objimpl.h, pymem.h: Stop mapping PyMem_{Del, DEL} and PyMem_{Free, FREE}
to PyObject_{Free, FREE} in a release build. They're aliases for the
system free() now.
_subprocess.c/sp_handle_dealloc(): Since the memory was originally
obtained via PyObject_NEW, it must be released via PyObject_FREE (or
_DEL).
pythonrun.c, tokenizer.c, parsermodule.c: I lost count of the number of
PyObject vs PyMem mismatches in these -- it's like the specific
function called at each site was picked at random, sometimes even with
memory obtained via PyMem getting released via PyObject. Changed most
to use PyObject uniformly, since the blobs allocated are predictably
small in most cases, and obmalloc is generally faster than system
mallocs then.
If extension modules in real life prove as sloppy as Python's front
end, we'll have to revert the objimpl.h + pymem.h part of this patch.
Note that no problems will show up in a debug build (all calls still go
thru obmalloc then). Problems will show up only in a release build, most
likely segfaults.
of tuple) that provides incremental decoders and encoders (a way to use
stateful codecs without the stream API). Functions
codecs.getincrementaldecoder() and codecs.getincrementalencoder() have
been added.
added message attribute compared to the previous version of Exception. It is
also a new-style class, making all exceptions now new-style. KeyboardInterrupt
and SystemExit inherit from BaseException directly. String exceptions now
raise DeprecationWarning.
Applies patch 1104669, and closes bugs 1012952 and 518846.
- IMPORT_NAME takes an extra argument from the stack: the relativeness of
the import. Only passed to __import__ when it's not -1.
- __import__() takes an optional 5th argument for the same thing; it
__defaults to -1 (old semantics: try relative, then absolute)
- 'from . import name' imports name (be it module or regular attribute)
from the current module's *package*. Likewise, 'from .module import name'
will import name from a sibling to the current module.
- Importing from outside a package is not allowed; 'from . import sys' in a
toplevel module will not work, nor will 'from .. import sys' in a
(single-level) package.
- 'from __future__ import absolute_import' will turn on the new semantics
for import and from-import: imports will be absolute, except for
from-import with dots.
Includes tests for regular imports and importhooks, parser changes and a
NEWS item, but no compiler-package changes or documentation changes.
This was started by Mike Bland and completed by Guido
(with help from Neal).
This still needs a __future__ statement added;
Thomas is working on Michael's patch for that aspect.
There's a small amount of code cleanup and refactoring
in ast.c, compile.c and ceval.c (I fixed the lltrace
behavior when EXT_POP is used -- however I had to make
lltrace a static global).
breaks the parser module, because it adds the if/else construct as well as
two new grammar rules for backward compatibility. If no one else fixes
parsermodule, I guess I'll go ahead and fix it later this week.
The TeX code was checked with texcheck.py, but not rendered. There is
actually a slight incompatibility:
>>> (x for x in lambda:0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: iteration over non-sequence
changes into
>>> (x for x in lambda: 0)
File "<stdin>", line 1
(x for x in lambda: 0)
^
SyntaxError: invalid syntax
Since there's no way the former version can be useful, it's probably a
bugfix ;)
Add C API function Py_GetBuildNumber(), add it to the interactive prompt
banner (i.e. Py_GetBuildInfo()), and add it as the sys.build_number
attribute. The build number is a string instead of an int because it may
contain a trailing 'M' if there are local modifications.
Strip off leading dots and slash so the generated files are the same regardless
of whether you configure in the checkout directory or build.
If anyone configures in a different directory, we might want a cleaner
approach using os.path.*(). Hopefully this is good enough.
In C++, it's an error to pass a string literal to a char* function
without a const_cast(). Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.
I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc. Predictably, there were a large set of functions that
needed to be fixed as a result of these changes. The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKewords() to be a const char *kwlist[].
One cast was required as a result of the changes: A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *.
_PyUnicode_IsLinebreak():
Changed the declarations to match the definitions.
Don't know why they differed; MSVC warned about it;
don't know why only these two functions use "const".
Someone who does may want to do something saner ;-).
INT_MIN is used in Python/compile.c, but it was also used
in Objects/abstract.c Python/getargs.c. If we need it for compile.c,
we can get it from the same place as the other files.
This change implements a new bytecode compiler, based on a
transformation of the parse tree to an abstract syntax defined in
Parser/Python.asdl.
The compiler implementation is not complete, but it is in stable
enough shape to run the entire test suite excepting two disabled
tests.
about illegal code points. The codec now supports PEP 293 style error handlers.
(This is a variant of the Nik Haldimann's patch that detects truncated data)
* Bring in free list from dictionary code.
* Improve several comments.
* Differencing can leave many dummy entries. If more than
1/6 are dummies, then resize them away.
* Factor-out common code with new macro, PyAnySet_CheckExact.
- Handle both frozenset() and frozenset([]).
- Do not use singleton for frozenset subclasses.
- Finalize the singleton.
- Add test cases.
* Factor-out set_update_internal() from set_update(). Simplifies the
code for several internal callers.
* Factor constant expressions out of loop in set_merge_internal().
* Minor comment touch-ups.
[ 1180995 ] binary formats for marshalling floats
Adds 2 new type codes for marshal (binary floats and binary complexes), a
new marshal version (2), updates MAGIC and fiddles the de-serializing of
code objects to be less likely to clobber the real reason for failing if
it fails.
[ 1181301 ] make float packing copy bytes when they can
which hasn't been reviewed, despite numerous threats to check it in
anyway if noone reviews it. Please read the diff on the checkin list,
at least!
The basic idea is to examine the bytes of some 'probe values' to see if
the current platform is a IEEE 754-ish platform, and if so
_PyFloat_{Pack,Unpack}{4,8} just copy bytes around.
The rest is hair for testing, and tests.
(Contributed by Bob Ippolito.)
This patch trims down the Python core on Darwin by making it
independent of CoreFoundation and CoreServices. It does this by:
Changed linker flags in configure/configure.in
Removed the unused PyMac_GetAppletScriptFile
Moved the implementation of PyMac_StrError to the MacOS module
Moved the implementation of PyMac_GetFullPathname to the
Carbon.File module
In cyclic gc, clear weakrefs to unreachable objects before allowing any
Python code (weakref callbacks or __del__ methods) to run.
This is a critical bugfix, affecting all versions of Python since weakrefs
were introduced. I'll backport to 2.3.
exposed in header files. Fixed a few comments in these headers.
As we might have expected, writing down invariants systematically exposed a
(minor) bug. In this case, function objects have a writeable func_code
attribute, which could be set to code objects with the wrong number of
free variables. Calling the resulting function segfaulted the interpreter.
Added a corresponding test.
today. pyconfig.h can override it if not, and can also override
Py_IS_INFINITY now. Py_IS_NAN and Py_IS_INFINITY are overridden now
for Microsoft compilers, using efficient MS-specific spellings.
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
This checkin is adapted from part 2 (of 3) of Trevor Perrin's patch set.
BACKWARD INCOMPATIBILITY: SHIFT must now be divisible by 5. AFAIK,
nobody will care. long_pow() could be complicated to worm around that,
if necessary.
long_pow():
- BUGFIX: This leaked the base and power when the power was negative
(and so the computation delegated to float pow).
- Instead of doing right-to-left exponentiation, do left-to-right. This
is more efficient for small bases, which is the common case.
- In addition, if the exponent is large (more than FIVEARY_CUTOFF
digits), precompute [a**i % c for i in range(32)], and go left to
right 5 bits at a time.
l_divmod():
- The signature changed so that callers who don't want the quotient,
or don't want the remainder, can pass NULL in the slot they don't
want. This saves them from having to declare a vrbl for unwanted
stuff, and remembering to decref it.
long_mod(), long_div(), long_classic_div():
- Adjust to new l_divmod() signature, and simplified as a result.
This checkin is adapted from part 1 (of 3) of Trevor Perrin's patch set.
x_mul()
- sped a little by optimizing the C
- sped a lot (~2X) if it's doing a square; note that long_pow() squares
often
k_mul()
- more cache-friendly now if it's doing a square
KARATSUBA_CUTOFF
- boosted; gradeschool mult is quicker now, and it may have been too low
for many platforms anyway
KARATSUBA_SQUARE_CUTOFF
- new
- since x_mul is a lot faster at squaring now, the point at which
Karatsuba pays for squaring is much higher than for general mult
happen in 2.3, but nobody noticed it still was getting generated (the
warning was disabled by default). OverflowWarning and
PyExc_OverflowWarning should be removed for 2.5, and left notes all over
saying so.
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
def width(u):
w = 0
for c in unicodedata.normalize('NFC', u):
cwidth = unicodedata.east_asian_width(c)
if cwidth in ('W', 'F'): w += 2
else: w += 1
return w
or broken by basic ctype functions in 4.4BSD descendants. This
will be fixed in their future development branches but they'll keep
the POSIX-incompatibility for their backward-compatiblities in near
future.
to NULL during the lifetime of the object.
* listobject.c nevertheless did not conform to the other invariants,
either; fixed.
* listobject.c now uses list_clear() as the obvious internal way to clear
a list, instead of abusing list_ass_slice() for that. It makes it easier
to enforce the invariant about ob_item == NULL.
* listsort() sets allocated to -1 during sort; any mutation will set it
to a value >= 0, so it is a safe way to detect mutation. A negative
value for allocated does not cause a problem elsewhere currently.
test_sort.py has a new test for this fix.
* listsort() leak: if items were added to the list during the sort, AND if
these items had a __del__ that puts still more stuff into the list,
then this more stuff (and the PyObject** array to hold them) were
overridden at the end of listsort() and never released.
mutation during list.sort() used to rely on that listobject.c always
NULL'ed ob_item when ob_size fell to 0. That's no longer true, so the
test for list mutation during a sort is no longer reliable. Changed the
test to rely instead on that listobject.c now never NULLs-out ob_item
after (if ever) ob_item gets a non-NULL value. This new assumption is
also documented now, as a required invariant in listobject.h.
The new assumption allowed some real simplification to some of the
hairier code in listsort(), so is a Good Thing on that count.
Rewrote Py_RETURN_{NONE, TRUE, FALSE} to expand to comma expressions
rather than "do {} while(0)" thingies. The OP complained because he
likes using MS /W4 sometimes, and then all his uses of these things
generate nuisance warnings about testing a constant expression (in
the "while(0)" part). Comma expressions don't have this problem
(although it's a lucky accident that comma expressions suffice for these
macros!).
- weakref.ref and weakref.ReferenceType will become aliases for each
other
- weakref.ref will be a modern, new-style class with proper __new__
and __init__ methods
- weakref.WeakValueDictionary will have a lighter memory footprint,
using a new weakref.ref subclass to associate the key with the
value, allowing us to have only a single object of overhead for each
dictionary entry (currently, there are 3 objects of overhead per
entry: a weakref to the value, a weakref to the dictionary, and a
function object used as a weakref callback; the weakref to the
dictionary could be avoided without this change)
- a new macro, PyWeakref_CheckRefExact(), will be added
- PyWeakref_CheckRef() will check for subclasses of weakref.ref
This closes SF patch #983019.
The builtin eval() function now accepts any mapping for the locals argument.
Time sensitive steps guarded by PyDict_CheckExact() to keep from slowing
down the normal case. My timings so no measurable impact.
The LaTeX is untested (well, so is the new API, for that matter).
Note that I also changed NULL to get spelled consistently in concrete.tex.
If that was a wrong thing to do, Fred should yell at me.
New include file timefuncs.h exports private API function
_PyTime_DoubleToTimet() from timemodule.c. timemodule should export
some other functions too (look for painful bits in datetimemodule.c).
Added insane-argument checking to datetime's assorted fromtimestamp()
and utcfromtimestamp() methods. Added insane-argument tests of these
to test_datetime, and insane-argument tests for ctime(), localtime()
and gmtime() to test_time.
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
(Code contributed by Jiwon Seo.)
The documentation portion of the patch is being re-worked and will be
checked-in soon. Likewise, PEP 289 will be updated to reflect Guido's
rationale for the design decisions on binding behavior (as described in
in his patch comments and in discussions on python-dev).
The test file, test_genexps.py, is written in doctest format and is
meant to exercise all aspects of the the patch. Further additions are
welcome from everyone. Please stress test this new feature as much as
possible before the alpha release.
realloc(). This is achieved by tracking the overallocation size in a new
field and using that information to skip calls to realloc() whenever
possible.
* Simplified and tightened the amount of overallocation. For larger lists,
this overallocates by 1/8th (compared to the previous scheme which ranged
between 1/4th to 1/32nd over-allocation). For smaller lists (n<6), the
maximum overallocation is one byte (formerly it could be upto eight bytes).
This saves memory in applications with large numbers of small lists.
* Eliminated the NRESIZE macro in favor of a new, static list_resize function
that encapsulates the resizing logic. Coverting this back to macro would
give a small (under 1%) speed-up. This was too small to warrant the loss
of readability, maintainability, and de-coupling.
* Some functions using NRESIZE had grown unnecessarily complex in their
efforts to bend to the macro's calling pattern. With the new list_resize
function in place, those other functions could be simplified. That is
being saved for a separate patch.
* The ob_item==NULL check could be eliminated from the new list_resize
function. This would entail finding each piece of code that sets ob_item
to NULL and adding a new line to invalidate the overallocation tracking
field. Rather than impose a new requirement on other pieces of list code,
it was preferred to leave the NULL check in place and retain the benefits
of decoupling, maintainability and information hiding (only PyList_New()
and list_sort() need to know about the new field). This approach also
reduces the odds of breaking an extension module.
(Collaborative effort by Raymond Hettinger, Hye-Shik Chang, Tim Peters,
and Armin Rigo.)
semantics to include subtypes. Most concrete object APIs then had
a Py<type>_CheckExact() macro added to test for an object's type
not including subtypes.
The PyDict_CheckExact() macro wasn't created at that time, so I've added
it for API completeness/symmetry - even though nobody has complained
about its absence in the time since 2.2 was released.
Not a backport candidate.
with most other concrete object checks, but the docs weren't brought into
line.
PyList_CheckExact() was added at 2.2 but never documented.
backport candidate.
bit by checking the value of UCHAR_MAX in Include/Python.h. There was a
check in Objects/stringobject.c. Remove that. (Note that we don't define
UCHAR_MAX if it's not defined as the old test did.)
* Add more tests
* Refactor and neaten the code a bit.
* Rename union_update() to update().
* Improve the algorithms (making them a closer to sets.py).
Also SF patch 843455.
This is a critical bugfix.
I'll backport to 2.3 maint, but not beyond that. The bugs this fixes
have been there since weakrefs were introduced.
* Install the unittests, docs, newsitem, include file, and makefile update.
* Exercise the new functions whereever sets.py was being used.
Includes the docs for libfuncs.tex. Separate docs for the types are
forthcoming.
The embed2.diff patch solves the user's problem by exporting the missing
symbols from the Python core so Python can be embedded in another Cygwin
application (well, at lest vim).
A new API (only accessible from C) to interrupt a thread by sending it
an exception. This is not always effective, but might help some people.
Requested by Just van Rossum and Alex Martelli. It is intentional
that you have to write your own C extension to call it from Python.
Docs will have to wait.
the purpose. Increased my claim to two bits, hoping that nobody
will complain about it. I'm taking the highest two bits, whatever
the integer word size may be.
The compiler was reseting the list comprehension tmpname counter for each function, but the symtable was using the same counter for the entire module. Repair by move tmpname into the symtable entry.
Bugfix candidate.
* Can now test for basic blocks.
* Optimize inverted comparisions.
* Optimize unary_not followed by a conditional jump.
* Added a new opcode, NOP, to keep code size constant.
* Applied NOP to previous transformations where appropriate.
Note, the NOP would not be necessary if other functions were
added to re-target jump addresses and update the co_lnotab mapping.
That would yield slightly faster and cleaner bytecode at the
expense of optimizer simplicity and of keeping it decoupled
from the line-numbering structure.
new line.
New pvt API function _Py_PrintReferenceAddresses(): Prints only the
addresses and refcnts of the live objects. This is always safe to call,
because it has no dependence on Python's C API.
Py_Finalize(): If envar PYTHONDUMPREFS is set, call (the new)
_Py_PrintReferenceAddresses() right before dumping final pymalloc stats.
We can't print the reprs of the objects here because too much of the
interpreter has been shut down. You need to correlate the addresses
displayed here with the object reprs printed by the earlier
PYTHONDUMPREFS call to _Py_PrintReferences().
New functions:
unsigned long PyInt_AsUnsignedLongMask(PyObject *);
unsigned PY_LONG_LONG) PyInt_AsUnsignedLongLongMask(PyObject *);
unsigned long PyLong_AsUnsignedLongMask(PyObject *);
unsigned PY_LONG_LONG) PyLong_AsUnsignedLongLongMask(PyObject *);
New and changed format codes:
b unsigned char 0..UCHAR_MAX
B unsigned char none **
h unsigned short 0..USHRT_MAX
H unsigned short none **
i int INT_MIN..INT_MAX
I * unsigned int 0..UINT_MAX
l long LONG_MIN..LONG_MAX
k * unsigned long none
L long long LLONG_MIN..LLONG_MAX
K * unsigned long long none
Notes:
* New format codes.
** Changed from previous "range-and-a-half" to "none"; the
range-and-a-half checking wasn't particularly useful.
New test test_getargs2.py, to verify all this.
Arranged that all the objects exposed by __builtin__ appear in the list
of all objects. I basically peed away two days tracking down a mystery
leak in sys.gettotalrefcount() in a ZODB app (== tons of code), because
the object leaking the references didn't appear in the sys.getobjects(0)
list. The object happened to be False. Now False is in the list, along
with other popular & previously missing leak candidates (like None).
Alas, we still don't have a choke point covering *all* Python objects,
so the list of all objects may still be incomplete.
_Py_AddToAllObjects() that simply inserts an object at the front of
the doubly-linked list of all objects. Changed PyType_Ready() (the
closest thing we've got to a choke point for type objects) to call
that.
variables to store internal data. As a result, any atempts to use the
unicode system with multiple active interpreters, or successive
interpreter executions, would fail.
Now that information is stored into members of the PyInterpreterState
structure.