This was started by Mike Bland and completed by Guido
(with help from Neal).
This still needs a __future__ statement added;
Thomas is working on Michael's patch for that aspect.
There's a small amount of code cleanup and refactoring
in ast.c, compile.c and ceval.c (I fixed the lltrace
behavior when EXT_POP is used -- however I had to make
lltrace a static global).
breaks the parser module, because it adds the if/else construct as well as
two new grammar rules for backward compatibility. If no one else fixes
parsermodule, I guess I'll go ahead and fix it later this week.
The TeX code was checked with texcheck.py, but not rendered. There is
actually a slight incompatibility:
>>> (x for x in lambda:0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: iteration over non-sequence
changes into
>>> (x for x in lambda: 0)
File "<stdin>", line 1
(x for x in lambda: 0)
^
SyntaxError: invalid syntax
Since there's no way the former version can be useful, it's probably a
bugfix ;)
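The difference can be checked without an interactive prompt. The following is a minimal sketch for a current Python 3 interpreter; classify() is a hypothetical helper, and the exception messages will not match the 2.x output quoted above.

```python
def classify(source):
    """Report how far a generator-expression snippet gets (hypothetical helper)."""
    try:
        code = compile(source, "<example>", "eval")
    except SyntaxError:
        return "SyntaxError"
    try:
        # Consume the generator so a TypeError is seen whether it is
        # raised when the generator is created or when it is first iterated.
        list(eval(code))
    except TypeError:
        return "TypeError"
    return "ok"


print(classify("(x for x in lambda: 0)"))    # unparenthesized lambda
print(classify("(x for x in (lambda: 0))"))  # parenthesized lambda
print(classify("(x for x in [1, 2])"))       # ordinary iterable
```

Parenthesizing the lambda gets past the parser, which leaves only the run-time complaint about iterating over a function.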
Add C API function Py_GetBuildNumber(), add it to the interactive prompt
banner (i.e. Py_GetBuildInfo()), and add it as the sys.build_number
attribute. The build number is a string instead of an int because it may
contain a trailing 'M' if there are local modifications.
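Because the build number is a string, callers that want a number have to strip the marker themselves. A small sketch, assuming the value is a run of digits optionally followed by 'M'; split_build_number() and the sample values are hypothetical.

```python
def split_build_number(build_number):
    """Split a build-number string into (revision, locally_modified).

    Assumes the format is digits with an optional trailing 'M';
    this helper is illustrative and not part of any API.
    """
    modified = build_number.endswith("M")
    digits = build_number[:-1] if modified else build_number
    return int(digits), modified


print(split_build_number("41234"))
print(split_build_number("41234M"))
```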
Strip off leading dots and slash so the generated files are the same regardless
of whether you configure in the checkout directory or build.
If anyone configures in a different directory, we might want a cleaner
approach using os.path.*(). Hopefully this is good enough.
In C++, it's an error to pass a string literal to a char* function
without a const_cast(). Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.
I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc. Predictably, there was a large set of functions that
needed to be fixed as a result of these changes. The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKeywords() a const char *kwlist[].
One cast was required as a result of the changes: A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *.
_PyUnicode_IsLinebreak():
Changed the declarations to match the definitions.
Don't know why they differed; MSVC warned about it;
don't know why only these two functions use "const".
Someone who does may want to do something saner ;-).
INT_MIN is used in Python/compile.c, but it was also used
in Objects/abstract.c and Python/getargs.c. If we need it for compile.c,
we can get it from the same place as the other files.
This change implements a new bytecode compiler, based on a
transformation of the parse tree to an abstract syntax defined in
Parser/Python.asdl.
The compiler implementation is not complete, but it is in stable
enough shape to run the entire test suite excepting two disabled
tests.
about illegal code points. The codec now supports PEP 293 style error handlers.
(This is a variant of Nik Haldimann's patch that detects truncated data)
* Bring in free list from dictionary code.
* Improve several comments.
* Differencing can leave many dummy entries. If more than
1/6 are dummies, then resize them away.
* Factor-out common code with new macro, PyAnySet_CheckExact.
- Handle both frozenset() and frozenset([]).
- Do not use singleton for frozenset subclasses.
- Finalize the singleton.
- Add test cases.
* Factor-out set_update_internal() from set_update(). Simplifies the
code for several internal callers.
* Factor constant expressions out of loop in set_merge_internal().
* Minor comment touch-ups.
[ 1180995 ] binary formats for marshalling floats
Adds 2 new type codes for marshal (binary floats and binary complexes), a
new marshal version (2), updates MAGIC and fiddles the de-serializing of
code objects to be less likely to clobber the real reason for failing if
it fails.
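The effect of the new marshal version on floats can be observed from Python. This sketch assumes a current CPython, where marshal.dumps() still accepts a version argument; it prints the leading type code rather than assuming which letters are used.

```python
import marshal

value = 0.1

text_form = marshal.dumps(value, 1)    # version 1: float written as text
binary_form = marshal.dumps(value, 2)  # version 2: float written as raw bytes

# First byte is the type code, the rest is the payload.
print(text_form[:1], len(text_form))
print(binary_form[:1], len(binary_form))

# Both forms load back to the same float.
print(marshal.loads(text_form) == marshal.loads(binary_form) == value)
```

The binary form has a fixed size (one type byte plus eight data bytes), while the text form grows with the number of digits needed.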
[ 1181301 ] make float packing copy bytes when they can
which hasn't been reviewed, despite numerous threats to check it in
anyway if no one reviews it. Please read the diff on the checkin list,
at least!
The basic idea is to examine the bytes of some 'probe values' to see if
the current platform is an IEEE 754-ish platform, and if so
_PyFloat_{Pack,Unpack}{4,8} just copy bytes around.
The rest is hair for testing, and tests.
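The probe idea can be imitated from Python by looking at the native bytes of a known double. This is only a sketch: the probe value and the helper detect_double_format() are chosen here for illustration, and the C code may use different probes.

```python
import struct
import sys

# A double whose IEEE 754 big-endian encoding has eight distinct bytes,
# so byte order and format can both be read off the raw representation.
PROBE = 9006104071832581.0
BIG_ENDIAN_BYTES = b"\x43\x3f\xff\x01\x02\x03\x04\x05"
LITTLE_ENDIAN_BYTES = BIG_ENDIAN_BYTES[::-1]


def detect_double_format():
    """Classify the platform's native double layout (hypothetical helper)."""
    raw = struct.pack("@d", PROBE)  # native layout, copied without conversion
    if raw == LITTLE_ENDIAN_BYTES:
        return "IEEE 754, little-endian"
    if raw == BIG_ENDIAN_BYTES:
        return "IEEE 754, big-endian"
    return "unknown"


print(detect_double_format(), "(sys.byteorder is %s)" % sys.byteorder)
```

When the layout is recognized, packing and unpacking reduce to copying (and possibly reversing) eight bytes.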
(Contributed by Bob Ippolito.)
This patch trims down the Python core on Darwin by making it
independent of CoreFoundation and CoreServices. It does this by:
Changed linker flags in configure/configure.in
Removed the unused PyMac_GetAppletScriptFile
Moved the implementation of PyMac_StrError to the MacOS module
Moved the implementation of PyMac_GetFullPathname to the
Carbon.File module
In cyclic gc, clear weakrefs to unreachable objects before allowing any
Python code (weakref callbacks or __del__ methods) to run.
This is a critical bugfix, affecting all versions of Python since weakrefs
were introduced. I'll backport to 2.3.
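The ordering described above is visible from Python: by the time a weakref callback runs for an object freed by the cyclic collector, the weakref is already dead. A minimal sketch on a current CPython, with hypothetical names Node and callback:

```python
import gc
import weakref


class Node:
    pass


observed = []


def callback(ref):
    # Record whether the weakref was already cleared when the callback ran.
    observed.append(ref() is None)


node = Node()
node.me = node                      # reference cycle: only cyclic gc can free it
ref = weakref.ref(node, callback)   # the weakref itself stays reachable
del node
gc.collect()

print(ref(), observed)
```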
exposed in header files. Fixed a few comments in these headers.
As we might have expected, writing down invariants systematically exposed a
(minor) bug. In this case, function objects have a writeable func_code
attribute, which could be set to code objects with the wrong number of
free variables. Calling the resulting function segfaulted the interpreter.
Added a corresponding test.
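The check can be exercised from Python. The sketch below uses the Python 3 spelling of the attribute, __code__ (func_code in the 2.x series); make_closure() and plain() are hypothetical.

```python
def make_closure():
    captured = 1

    def inner():
        return captured   # one free variable

    return inner


def plain():
    return 0              # no free variables


try:
    # Code object expects one free variable; plain() has no closure cells.
    plain.__code__ = make_closure().__code__
except ValueError as exc:
    outcome = "rejected"
    print("rejected:", exc)
else:
    outcome = "accepted"

print(plain())
```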
today. pyconfig.h can override it if not, and can also override
Py_IS_INFINITY now. Py_IS_NAN and Py_IS_INFINITY are overridden now
for Microsoft compilers, using efficient MS-specific spellings.
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
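A short sketch of the new StreamReader arguments on a UTF-16 stream, run against a current Python 3 where the same arguments still exist. The sample text is made up for the example.

```python
import codecs
import io

data = "h\u00e9llo\nw\u00f6rld\n".encode("utf-16")

# readline() works on UTF-16 because the reader buffers decoded characters.
reader = codecs.getreader("utf-16")(io.BytesIO(data))
first = reader.readline(keepends=False)   # line ending stripped
second = reader.readline()                # keepends defaults to true

# read(chars=N) returns at most N decoded characters.
reader = codecs.getreader("utf-16")(io.BytesIO(data))
head = reader.read(chars=3)

print(repr(first), repr(second), repr(head))
```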
This checkin is adapted from part 2 (of 3) of Trevor Perrin's patch set.
BACKWARD INCOMPATIBILITY: SHIFT must now be divisible by 5. AFAIK,
nobody will care. long_pow() could be complicated to worm around that,
if necessary.
long_pow():
- BUGFIX: This leaked the base and power when the power was negative
(and so the computation delegated to float pow).
- Instead of doing right-to-left exponentiation, do left-to-right. This
is more efficient for small bases, which is the common case.
- In addition, if the exponent is large (more than FIVEARY_CUTOFF
digits), precompute [a**i % c for i in range(32)], and go left to
right 5 bits at a time.
l_divmod():
- The signature changed so that callers who don't want the quotient,
or don't want the remainder, can pass NULL in the slot they don't
want. This saves them from having to declare a variable for unwanted
stuff, and remembering to decref it.
long_mod(), long_div(), long_classic_div():
- Adjust to new l_divmod() signature, and simplified as a result.
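The two exponentiation strategies described for long_pow() can be sketched in Python. This is an illustration of the technique on Python ints, not a transcription of the C code; the function names are hypothetical, and the FIVEARY_CUTOFF decision between the two is left out.

```python
def pow_left_to_right(a, b, c):
    """Binary exponentiation scanning the exponent's bits from the top down."""
    result = 1 % c
    for bit in bin(b)[2:]:
        result = result * result % c
        if bit == "1":
            result = result * a % c   # multiply by the (small) base itself
    return result


def pow_five_bits_at_a_time(a, b, c):
    """Left-to-right exponentiation consuming 5 exponent bits per step."""
    table = [a**i % c for i in range(32)]   # precomputed as in the text above

    # Split the exponent into 5-bit digits, least significant first.
    digits = []
    while b:
        digits.append(b & 31)
        b >>= 5

    result = 1 % c
    for digit in reversed(digits):          # most significant digit first
        for _ in range(5):
            result = result * result % c
        result = result * table[digit] % c
    return result


print(pow_left_to_right(3, 200, 1000003), pow_five_bits_at_a_time(3, 200, 1000003))
```

Both functions assume a non-negative exponent and a positive modulus. In the left-to-right form every extra multiplication uses the original base, which is why small bases benefit.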
This checkin is adapted from part 1 (of 3) of Trevor Perrin's patch set.
x_mul()
- sped a little by optimizing the C
- sped a lot (~2X) if it's doing a square; note that long_pow() squares
often
k_mul()
- more cache-friendly now if it's doing a square
KARATSUBA_CUTOFF
- boosted; gradeschool mult is quicker now, and it may have been too low
for many platforms anyway
KARATSUBA_SQUARE_CUTOFF
- new
- since x_mul is a lot faster at squaring now, the point at which
Karatsuba pays for squaring is much higher than for general mult
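One common way to make gradeschool squaring cheaper, which may or may not match x_mul() in detail, is to compute each cross product a[i]*a[j] once and double it. The sketch below works on lists of base-2**15 digits; the digit size and all helper names are assumptions made for the example.

```python
BASE = 1 << 15   # assumed digit size for this sketch


def to_digits(n):
    """Non-negative int -> list of digits, least significant first."""
    digits = []
    while n:
        digits.append(n % BASE)
        n //= BASE
    return digits or [0]


def from_digits(digits):
    return sum(d * BASE**i for i, d in enumerate(digits))


def square_digits(a):
    """Gradeschool squaring that visits each pair (i, j) with i < j only once."""
    out = [0] * (2 * len(a))
    for i, ai in enumerate(a):
        out[2 * i] += ai * ai                 # diagonal term
        for j in range(i + 1, len(a)):
            out[i + j] += 2 * ai * a[j]       # off-diagonal term, counted twice
    # Normalize: propagate carries so every digit is below BASE.
    carry = 0
    for k in range(len(out)):
        total = out[k] + carry
        out[k] = total % BASE
        carry = total // BASE
    return out


n = 12345678901234567890
print(from_digits(square_digits(to_digits(n))) == n * n)
```

This does roughly half the digit multiplications of a general product, which is consistent with the ~2X figure quoted above.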
happen in 2.3, but nobody noticed it still was getting generated (the
warning was disabled by default). OverflowWarning and
PyExc_OverflowWarning should be removed for 2.5; I left notes all over
saying so.
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
import unicodedata

def width(u):
    w = 0
    for c in unicodedata.normalize('NFC', u):
        cwidth = unicodedata.east_asian_width(c)
        if cwidth in ('W', 'F'): w += 2
        else: w += 1
    return w
or broken by basic ctype functions in 4.4BSD descendants. This
will be fixed in their future development branches but they'll keep
the POSIX incompatibility for backward compatibility in the near
future.
to NULL during the lifetime of the object.
* listobject.c nevertheless did not conform to the other invariants,
either; fixed.
* listobject.c now uses list_clear() as the obvious internal way to clear
a list, instead of abusing list_ass_slice() for that. It makes it easier
to enforce the invariant about ob_item == NULL.
* listsort() sets allocated to -1 during sort; any mutation will set it
to a value >= 0, so it is a safe way to detect mutation. A negative
value for allocated does not cause a problem elsewhere currently.
test_sort.py has a new test for this fix.
* listsort() leak: if items were added to the list during the sort, AND if
these items had a __del__ that puts still more stuff into the list,
  then these extra items (and the PyObject** array holding them) were
  overwritten at the end of listsort() and never released.
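The mutation detection in listsort() is observable from Python. A minimal sketch on a current CPython, using a key function (a Python 3 idiom; at the time a comparison function would have played this role); meddling_key() is hypothetical.

```python
data = [3, 1, 2]


def meddling_key(item):
    data.append(item)   # mutate the list while it is being sorted
    return item


try:
    data.sort(key=meddling_key)
except ValueError as exc:
    outcome = str(exc)
else:
    outcome = "no error"

print(outcome)
print(sorted(data))     # the original items survive; the appended ones do not
```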
mutation during list.sort() used to rely on listobject.c always
NULLing ob_item when ob_size fell to 0. That's no longer true, so the
test for list mutation during a sort is no longer reliable. Changed the
test to rely instead on the fact that listobject.c now never NULLs out ob_item
after (if ever) ob_item gets a non-NULL value. This new assumption is
also documented now, as a required invariant in listobject.h.
The new assumption allowed some real simplification to some of the
hairier code in listsort(), so is a Good Thing on that count.
Rewrote Py_RETURN_{NONE, TRUE, FALSE} to expand to comma expressions
rather than "do {} while(0)" thingies. The OP complained because he
likes using MS /W4 sometimes, and then all his uses of these things
generate nuisance warnings about testing a constant expression (in
the "while(0)" part). Comma expressions don't have this problem
(although it's a lucky accident that comma expressions suffice for these
macros!).
- weakref.ref and weakref.ReferenceType will become aliases for each
other
- weakref.ref will be a modern, new-style class with proper __new__
and __init__ methods
- weakref.WeakValueDictionary will have a lighter memory footprint,
using a new weakref.ref subclass to associate the key with the
value, allowing us to have only a single object of overhead for each
dictionary entry (currently, there are 3 objects of overhead per
entry: a weakref to the value, a weakref to the dictionary, and a
function object used as a weakref callback; the weakref to the
dictionary could be avoided without this change)
- a new macro, PyWeakref_CheckRefExact(), will be added
- PyWeakref_CheckRef() will check for subclasses of weakref.ref
This closes SF patch #983019.
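The idea of a weakref.ref subclass that carries the dictionary key can be sketched as follows. KeyedRefSketch and Value are hypothetical names for this example; the code runs on a current Python 3 and is not the WeakValueDictionary implementation itself.

```python
import gc
import weakref


class KeyedRefSketch(weakref.ref):
    """A weak reference that also remembers a key (illustrative only)."""

    __slots__ = ("key",)

    def __new__(cls, referent, callback, key):
        self = super().__new__(cls, referent, callback)
        self.key = key
        return self

    def __init__(self, referent, callback, key):
        super().__init__(referent, callback)


class Value:
    pass


removed = []
value = Value()
# One object per entry: the callback learns the key from the ref itself.
ref = KeyedRefSketch(value, lambda r: removed.append(r.key), "spam")
alive_before = ref() is value

del value
gc.collect()
print(alive_before, ref(), removed)
```

Because the key travels with the reference, a single shared callback can remove the right entry without a per-entry function object.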
The builtin eval() function now accepts any mapping for the locals argument.
Time sensitive steps guarded by PyDict_CheckExact() to keep from slowing
down the normal case. My timings show no measurable impact.
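A minimal sketch of what "any mapping" allows, on a current Python 3: the locals object below is not a dict and only implements __getitem__. RecordingNamespace is a hypothetical class for this example.

```python
class RecordingNamespace:
    """Non-dict mapping that records every name looked up in it."""

    def __init__(self, values):
        self.values = values
        self.lookups = []

    def __getitem__(self, name):
        self.lookups.append(name)
        # A KeyError here lets eval() fall back to globals and builtins.
        return self.values[name]


namespace = RecordingNamespace({"x": 20, "name": "ab"})
result = eval("x * 2 + len(name)", {}, namespace)

print(result, namespace.lookups)
```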
The LaTeX is untested (well, so is the new API, for that matter).
Note that I also changed NULL to get spelled consistently in concrete.tex.
If that was a wrong thing to do, Fred should yell at me.
New include file timefuncs.h exports private API function
_PyTime_DoubleToTimet() from timemodule.c. timemodule should export
some other functions too (look for painful bits in datetimemodule.c).
Added insane-argument checking to datetime's assorted fromtimestamp()
and utcfromtimestamp() methods. Added insane-argument tests of these
to test_datetime, and insane-argument tests for ctime(), localtime()
and gmtime() to test_time.
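The kind of argument these checks guard against can be tried from Python. A sketch for a current Python 3; rejects() is a hypothetical helper, and the exact exception type is left open because it varies by function and platform.

```python
import datetime
import time


def rejects(func, value):
    """Return True if func(value) raises instead of returning garbage."""
    try:
        func(value)
    except (OverflowError, ValueError, OSError):
        return True
    return False


insane = 1e200   # far outside any platform's time_t range

for func in (datetime.datetime.fromtimestamp,
             datetime.date.fromtimestamp,
             time.ctime,
             time.localtime,
             time.gmtime):
    print(func.__qualname__, rejects(func, insane))
```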
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
(Code contributed by Jiwon Seo.)
The documentation portion of the patch is being re-worked and will be
checked-in soon. Likewise, PEP 289 will be updated to reflect Guido's
rationale for the design decisions on binding behavior (as described in
his patch comments and in discussions on python-dev).
The test file, test_genexps.py, is written in doctest format and is
meant to exercise all aspects of the patch. Further additions are
welcome from everyone. Please stress test this new feature as much as
possible before the alpha release.
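For anyone adding cases, a doctest-style example of the binding behavior might look like the sketch below. The examples and the name genexp_binding are made up here and are not taken from test_genexps.py; the runner code uses the standard doctest module on a current Python 3.

```python
import doctest

EXAMPLES = """
Free variables in the expression are looked up when the generator runs:

>>> x = 10
>>> g = (i * x for i in range(3))
>>> x = 5
>>> list(g)
[0, 5, 10]

A generator expression as the sole argument needs no extra parentheses:

>>> sum(n * n for n in range(4))
14
"""

# Parse the examples and run them in a fresh namespace.
test = doctest.DocTestParser().get_doctest(EXAMPLES, {}, "genexp_binding", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(results.failed, results.attempted)
```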
realloc(). This is achieved by tracking the overallocation size in a new
field and using that information to skip calls to realloc() whenever
possible.
* Simplified and tightened the amount of overallocation. For larger lists,
this overallocates by 1/8th (compared to the previous scheme which ranged
between 1/4th and 1/32nd over-allocation). For smaller lists (n<6), the
maximum overallocation is one byte (formerly it could be up to eight bytes).
This saves memory in applications with large numbers of small lists.
* Eliminated the NRESIZE macro in favor of a new, static list_resize function
that encapsulates the resizing logic. Converting this back to a macro would
give a small (under 1%) speed-up. This was too small to warrant the loss
of readability, maintainability, and de-coupling.
* Some functions using NRESIZE had grown unnecessarily complex in their
efforts to bend to the macro's calling pattern. With the new list_resize
function in place, those other functions could be simplified. That is
being saved for a separate patch.
* The ob_item==NULL check could be eliminated from the new list_resize
function. This would entail finding each piece of code that sets ob_item
to NULL and adding a new line to invalidate the overallocation tracking
field. Rather than impose a new requirement on other pieces of list code,
it was preferred to leave the NULL check in place and retain the benefits
of decoupling, maintainability and information hiding (only PyList_New()
and list_sort() need to know about the new field). This approach also
reduces the odds of breaking an extension module.
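The effect of tracking the allocated size can be modeled in a few lines of Python. This is a toy model, not the C code: TrackedList is hypothetical, the 1/8th growth comes from the description above, and the extra "+ 1" padding is an assumption made so that tiny lists also get some slack.

```python
class TrackedList:
    """Toy model of a list that remembers how much room it has allocated."""

    def __init__(self):
        self.size = 0
        self.allocated = 0
        self.reallocs = 0    # how many times a real realloc() would be needed

    def resize(self, newsize):
        if self.allocated >= newsize:
            # Enough room already: skip the realloc entirely.
            self.size = newsize
            return
        # Overallocate by 1/8th (plus assumed padding of one slot).
        self.allocated = newsize + (newsize >> 3) + 1
        self.reallocs += 1
        self.size = newsize

    def append(self):
        self.resize(self.size + 1)


lst = TrackedList()
for _ in range(1000):
    lst.append()

print(lst.size, lst.allocated, lst.reallocs)
```

In this model a thousand appends need only a few dozen reallocations, since each one buys room for about an eighth more elements.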
(Collaborative effort by Raymond Hettinger, Hye-Shik Chang, Tim Peters,
and Armin Rigo.)
semantics to include subtypes. Most concrete object APIs then had
a Py<type>_CheckExact() macro added to test for an object's type
not including subtypes.
The PyDict_CheckExact() macro wasn't created at that time, so I've added
it for API completeness/symmetry - even though nobody has complained
about its absence in the time since 2.2 was released.
Not a backport candidate.
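At the Python level, the distinction between the two kinds of check corresponds roughly to isinstance() versus an exact type comparison. The sketch below is an analogy only; check() and check_exact() are hypothetical stand-ins, not wrappers around the C macros.

```python
class TracingDict(dict):
    """A dict subclass, standing in for any user-defined subtype."""


def check(obj):
    # Analogous to a Py<type>_Check(): subtypes are accepted.
    return isinstance(obj, dict)


def check_exact(obj):
    # Analogous to a Py<type>_CheckExact(): only the exact type is accepted.
    return type(obj) is dict


for obj in ({}, TracingDict(), []):
    print(type(obj).__name__, check(obj), check_exact(obj))
```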
with most other concrete object checks, but the docs weren't brought into
line.
PyList_CheckExact() was added at 2.2 but never documented.
backport candidate.
bit by checking the value of UCHAR_MAX in Include/Python.h. There was a
check in Objects/stringobject.c. Remove that. (Note that we don't define
UCHAR_MAX if it's not defined as the old test did.)
* Add more tests
* Refactor and neaten the code a bit.
* Rename union_update() to update().
* Improve the algorithms (making them closer to sets.py).
Also SF patch 843455.
This is a critical bugfix.
I'll backport to 2.3 maint, but not beyond that. The bugs this fixes
have been there since weakrefs were introduced.