Added 0b and 0o literals to tokenizer.
Modified PyOS_strtoul to support 0b and 0o inputs.
Modified PyLong_FromString to support guessing 0b and 0o inputs.
Renamed test_hexoct.py to test_int_literal.py and added binary tests.
Added upper and lower case 0b, 0O, and 0X tests to test_int_literal.py
Added "Z" format_char to PyOS_ascii_formatd to support empty float presentation type.
Renamed buf_size in PyOS_ascii_formatd to more accurately reflect it's meaning.
Modified format.__float__ to use the new "Z" format as the default.
Added test cases.
across platforms: it should now raise OverflowError on all
platforms. (Previously it raised OverflowError only on
non IEEE 754 platforms.)
Also fix the (already existing) test for this behaviour
so that it actually raises TestFailed instead of just
referencing it.
which forbids constructing types that have it set. The effect is to speed
./python.exe -m timeit -s 'import abc' -s 'class Foo(object): __metaclass__ = abc.ABCMeta' 'Foo()'
up from 2.5us to 0.201us. This fixes issue 1762.
Highlights:
- Adding PyObject_Format.
- Adding string.Format class.
- Adding __format__ for str, unicode, int, long, float, datetime.
- Adding builtin format.
- Adding ''.format and u''.format.
- str/unicode fixups for formatters.
The files in Objects/stringlib that implement PEP 3101 (stringdefs.h,
unicodedefs.h, formatter.h, string_format.h) are identical in trunk
and py3k. Any changes from here on should be made to trunk, and
changes will propogate to py3k).
_PyLong_Format. In longobject.c, changed long_format to
_PyLong_Format. In intobject.c, changed uses of PyOS_snprintf to
_PyInt_Format instead.
_PyLong_Format is similar to py3k's routine of the same name, except
it has 2 additional parameters: addL and newstyle. addL was existing
in long_format, and controls adding the trailing "L". This is
unneeded in py3k. newstyle is used to control whether octal prepends
"0" (the pre-2.6 style), or "0o" (the 3.0 sytle).
PyNumber_ToBase is needed for PEP 3127 (Integer Literal Support and
Syntax) and PEP 3101 (Advanced String Formatting).
This changeset does not need merging into py3k.
in Object/ are named ``free_list``, the counter ``numfree`` and the upper
limit is a macro ``PyName_MAXFREELIST`` inside an #ifndef block.
The chances should make it easier to adjust Python for platforms with
less memory, e.g. mobile phones.
I implemented the function sys._compact_freelists() and C API functions PyInt_/PyFloat_CompactFreeList() to compact the pre-allocated blocks of ints and floats. They allow the user to reduce the memory usage of a Python process that deals with lots of numbers.
The patch also renames sys._cleartypecache to sys._clear_type_cache
Add PyFrozenSet_Check(), which was not needed before; The list of Py*Set_Check* macros seems to be complete now.
Add missing NEWS entries about all this.
Works like PyTuple_SetItem() to build-up values in a brand new frozenset.
Also, PyFrozenSet_New() is now guaranteed to produce a distinct new frozenset.
First chapter of the Python 3.0 io framework back port: _fileio
The next step depends on a working bytearray type which itself depends on a backport of the nwe buffer API.
round included:
* Revert round to its 2.6 behavior (half away from 0).
* Because round, floor, and ceil always return float again, it's no
longer necessary to have them delegate to __xxx___, so I've ripped
that out of their implementations and the Real ABC. This also helps
in implementing types that work in both 2.6 and 3.0: you return int
from the __xxx__ methods, and let it get enabled by the version
upgrade.
* Make pow(-1, .5) raise a ValueError again.
the complex_pow part), r56649, r56652, r56715, r57296, r57302, r57359, r57361,
r57372, r57738, r57739, r58017, r58039, r58040, and r59390, and new
documentation. The only significant difference is that round(x) returns a float
to preserve backward-compatibility. See http://bugs.python.org/issue1689.
This changes the rules for when __hash__ is inherited slightly,
by allowing it to be inherited when one or more of __lt__, __le__,
__gt__, __ge__ are overridden, as long as __eq__ and __ne__ aren't.
Allows dictionaries to be pre-sized (upto 255 elements) saving time lost
to re-sizes with their attendant mallocs and re-insertions.
Has zero effect on small dictionaries (5 elements or fewer), a slight
benefit for dicts upto 22 elements (because they had to resize once
anyway), and more benefit for dicts upto 255 elements (saving multiple
resizes during the build-up and reducing the number of collisions on
the first insertions). Beyond 255 elements, there is no addional benefit.
Issue #1580: New free format floating point representation based on "Floating-Point Printer Sample Code", by Robert G. Burger. For example repr(11./5) now returns '2.2' instead of '2.2000000000000002'.
Thanks to noam for the patch! I had to modify doubledigits.c slightly to support X64 and IA64 machines on Windows. I also added the new file to the three project files.
Added PyFloat_GetMax(), PyFloat_GetMin() and PyFloat_GetInfo() to the float API.
Added a dictionary sys.float_info with information about the internal floating point type to the sys module.
I've finished the last task for the PCbuild9 directory today. I don't think there is much left to do. Now you can all play around with the shiny new VS 2008 and try the PGO builds. I was able to get a speed improvement of about 10% on py3k.
Have fun! :)
as usual with slicing (both with str and unicode strings). This
fixes issue 1259.
For str only the stringobject.c file was modified. But for unicode,
I needed to repeat in the four functions a lot of code, so created
a new function that does part of the job for them (and placed it in
find.h, following a suggestion of Barry).
Also added tests for this behaviour.
also hex escapes) -- this was reaching beyond the end of the input string
buffer, even though it is not supposed to be \0-terminated.
This has no visible effect but is clearly the correct thing to do.
(In 3.0 it had a visible effect after removing ob_sstate from PyString.)
Python code; but it is possible from C. object.__str__ had the issue of not
expecting a type to doing something within it's tp_str implementation that
could trigger an infinite recursion, but it could in C code.. Both found
thanks to BaseException and how it handles its repr.
Closes issue #1686386. Thanks to Thomas Herve for taking an initial stab at
coming up with a solution.
predictable to being completely predictable. The value of hash(n)
is unchanged for any n that's small enough to be representable as an
int, and also unchanged for the vast majority of long integers n of
reasonable size.
Backport abc.py and isinstance/issubclass overloading to 2.6.
I had to backport test_typechecks.py myself, and make one small change
to abc.py to avoid duplicate work when x.__class__ and type(x) are the
same.
ever going back out to Python code in PyObject_Call(). Required introducing a
static RuntimeError instance so that normalizing an exception there is no
reliance on a recursive call that would put the exception system over the
recursion check itself.
- Specialcase extended slices that amount to a shallow copy the same way as
is done for simple slices, in the tuple, string and unicode case.
- Specialcase step-1 extended slices to optimize the common case for all
involved types.
- For lists, allow extended slice assignment of differing lengths as long
as the step is 1. (Previously, 'l[:2:1] = []' failed even though
'l[:2] = []' and 'l[:2:None] = []' do not.)
- Implement extended slicing for buffer, array, structseq, mmap and
UserString.UserString.
- Implement slice-object support (but not non-step-1 slice assignment) for
UserString.MutableString.
- Add tests for all new functionality.
Py_ssize_t members.
Simplify the implementation of UnicodeError objects:
start and end attributes are now stored directly as
Py_ssize_t members, which simplifies various get and
set functions.
a large width is passed on 32-bit platforms. Found by Google.
It would be good for people to review this especially carefully and verify
I don't have an off by one error and there is no other way to cause overflow.
- Reenable modules on x64 that had been disabled aeons ago for Itanium.
- Cleared up confusion about compilers for 64 bit windows. There is only Itanium and x64. Added macros MS_WINI64 and MS_WINX64 for those rare cases where it matters, such as the disabling of modules above.
- Set target platform (_WIN32_WINNT and WINVER) to 0x0501 (XP) for x64, and 0x0400 (NT 4.0) otherwise, which are the targeted minimum platforms.
- Fixed thread_nt.h. The emulated InterlockedCompareExchange function didn´t work on x64, probaby due to the lack of a "volatile" specifier. Anyway, win95 is no longer a target platform.
- Itertools module used wrong constant to check for overflow in count()
- PyInt_AsSsize_t couldn't deal with attribute error when accessing the __long__ member.
- PyLong_FromSsize_t() incorrectly specified that the operand were unsigned.
With these changes, the x64 passes the testsuite, for those modules present.
http://mail.python.org/pipermail/python-dev/2007-March/071796.html .
I've kept a couple of still-valid extra tests in test_descr, but didn't
bother to sort through the new comments and refactorings added in r53997
to see if some of them could be kept. If so, they could go in a
follow-up check-in.
type.__new__(), and then calls object.__init__(cls), just to be anal.
This allows us to restore the code in string.py's _TemplateMetaclass
that called super(...).__init__(name, bases, dct), which I commented
out yesterday since it broke due to the stricter argument checking
added to object.__init__().
now stricter in rejecting excess arguments. The only time when
either allows excess arguments is when it is not overridden and the
other one is. For backwards compatibility, when both are
overridden, it is a deprecation warning (for now; maybe a Py3k
warning later).
When merging this into 3.0, the warnings should become errors.
Note: without the change to string.py, lots of spurious warnings happen.
What's going on there?
to complex using its __complex__() method before falling back to the
__float__() method. Therefore, the functions in the cmath module now
can operate on objects that define a __complex__() method.
(backport)
Patch #1591665: implement the __dir__() special function lookup in PyObject_Dir.
Had to change a few bits of the patch because classobjs and __methods__ are still
in Py2.6.
We add some new rules that are required for preserving internal
invariants of types.
1. If type (or a subclass of type) appears in bases, it must appear
before any non-type bases. If a non-type base (like a regular
new-style class) occurred first, it could trick type into
allocating the new class an __dict__ which must be impossible.
2. There are several checks that are made of bases when creating a
type. Those checks are now repeated when assigning to __bases__.
We also add the restriction that assignment to __bases__ may not
change the metaclass of the type.
Add new tests for these cases and for a few other oddball errors that
were no previously tested. Remove a crasher test that was fixed.
Also some internal refactoring: Extract the code to find the most
derived metaclass of a type and its bases. It is now needed in two
places. Rewrite the TypeError checks in test_descr to use doctest.
The tests now clearly show what exception they expect to see.
Fixes bug 1569356, but at the cost of a minor incompatibility in
locals(). Add test that verifies that the class namespace is not
polluted. Also clarify the behavior in the library docs.
Along the way, cleaned up the dict_to_map and map_to_dict
implementations and added some comments that explain what they do.
of some of the common builtin types.
Use a bit in tp_flags for each common builtin type. Check the bit
to determine if any instance is a subclass of these common types.
The check avoids a function call and O(n) search of the base classes.
The check is done in the various Py*_Check macros rather than calling
PyType_IsSubtype().
All the bits are set in tp_flags when the type is declared
in the Objects/*object.c files because PyType_Ready() is not called
for all the types. Should PyType_Ready() be called for all types?
If so and the change is made, the changes to the Objects/*object.c files
can be reverted (remove setting the tp_flags). Objects/typeobject.c
would also have to be modified to add conditions
for Py*_CheckExact() in addition to each the PyType_IsSubtype check.
When running the interpreter in an environment that would cause it to set
stdout/stderr/stdin's encoding, having a sitecustomize that would replace
them with something other than PyFile objects would crash the interpreter.
Fix it by simply ignoring the encoding-setting for non-files.
This could do with a test, but I can think of no maintainable and portable
way to test this bug, short of adding a sitecustomize.py to the buildsystem
and have it always run with it (hmmm....)
* unified the way intobject, longobject and mystrtoul handle
values around -sys.maxint-1.
* in general, trying to entierely avoid overflows in any computation
involving signed ints or longs is extremely involved. Fixed a few
simple cases where a compiler might be too clever (but that's all
guesswork).
* more overflow checks against bad data in marshal.c.
* 2.5 specific: fixed a number of places that were still confusing int
and Py_ssize_t. Some of them could potentially have caused
"real-world" breakage.
* list.pop(x): fixing overflow issues on x was messy. I just reverted
to PyArg_ParseTuple("n"), which does the right thing. (An obscure
test was trying to give a Decimal to list.pop()... doesn't make
sense any more IMHO)
* trying to write a few tests...
i_divmod(): As discussed on Python-Dev, changed the overflow
checking to live happily with recent gcc optimizations that
assume signed integer arithmetic never overflows.
This differs from the corresponding change on the 2.5 and 2.4
branches, using a less obscure approach, but one that /may/
tickle platform idiocies in their definitions of LONG_MIN.
The 2.4 + 2.5 change avoided introducing a dependence on
LONG_MIN, at the cost of substantially goofier code.
OverflowError while x*x succeeds and produces infinity; apparently
these inconsistencies cannot be fixed across ``all'' platforms and
there's a widespread feeling that therefore ``every'' platform
should keep suffering forevermore. Ah well.
inf) but didn't; added a test to test_float to verify that, and ignored the
ERANGE value for errno in the pow operation to make the new test pass (with
help from Marilyn Davis at the Google Python Sprint -- thanks!).
Replace UnicodeDecodeErrors raised during == and !=
compares of Unicode and other objects with a new
UnicodeWarning.
All other comparisons continue to raise exceptions.
Exceptions other than UnicodeDecodeErrors are also left
untouched.
were failing due to inappropriate clipping of numbers larger than 2**31
with new-style classes. (typeobject.c) In reviewing the code for classic
classes, there were 2 problems. Any negative value return could be returned.
Always return -1 if there was an error. Also make the checks similar
with the new-style classes. I believe this is correct for 32 and 64 bit
boxes, including Windows64.
Add a test of classic classes too.
I modified this patch some by fixing style, some error checking, and adding
XXX comments. This patch requires review and some changes are to be expected.
I'm checking in now to get the greatest possible review and establish a
baseline for moving forward. I don't want this to hold up release if possible.
This is the first batch of fixes that should be easy to verify based on context.
This fixes problem numbers: 220 (ast), 323-324 (symtable),
321-322 (structseq), 215 (array), 210 (hotshot), 182 (codecs), 209 (etree).
PyMapping_Size and PySequence_Size.
Because len() tries first sequence, then mapping size, it will always
raise a "non-mapping object has no len" error which is confusing.
be wrong.
The real change is to pass (bufsz - 1) to PyOS_ascii_formatd and 1
to strncat. strncat copies n+1 bytes from src (not dest).
Reported by Klocwork #58.
The problem of checking too eagerly for recursive calls is the
following: if a RuntimeError is caused by recursion, and if code needs
to normalize it immediately (as in the 2nd test), then
PyErr_NormalizeException() needs a call to the RuntimeError class to
instantiate it, and this hits the recursion limit again... causing
PyErr_NormalizeException() to never finish.
Moved this particular recursion check to slot_tp_call(), which is not
involved in instantiating built-in exceptions.
Backport candidate.
arguments in reverse, the interpreter would infinitely recourse trying to get a
coercion that worked. So put in a recursion check after a coercion is made and
the next call to attempt to use the coerced values.
Fixes bug #992017 and closes crashers/coerce.py .
the char buffer was requested. Now it actually returns the char buffer if
available or raises a TypeError if it isn't (as is raised for the other buffer
types if they are not present but requested).
Not a backport candidate since it does change semantics of the buffer object
(although it could be argued this is enough of a bug to bother backporting).
Give a consistent behavior for comparison and hashing of method objects
(both user- and built-in methods). Now compares the 'self' recursively.
The hash was already asking for the hash of 'self'.
to each allocated block. This was using 4 bytes for each such
piece of info regardless of platform. This didn't really matter
before (proof: no bug reports, and the debug-build obmalloc would
have assert-failed if it was ever asked for a chunk of memory
>= 2**32 bytes), since container indices were plain ints. But after
the Py_ssize_t changes, it's at least theoretically possible to
allocate a list or string whose guts exceed 2**32 bytes, and the
PYMALLOC_DEBUG routines would fail then (having only 4 bytes
to record the originally requested size).
Now we use sizeof(size_t) bytes for each of a PYMALLOC_DEBUG
build's extra debugging fields. This won't make any difference
on 32-bit boxes, but will add 16 bytes to each allocation in
a debug build on a 64-bit box.
he didn't know this), so merged in some changes I made during
review. Nothing material apart from changing a new `mask` local
from int to Py_ssize_t. Mostly this is repairing comments that
were made incorrect, and adding new comments. Also a few
minor code rewrites for clarity or helpful succinctness.
a new comment) suggests there are almost certainly large input
integers in all non-binary input bases for which one Python digit
too few is initally allocated to hold the final result. Instead
of assert-failing when that happens, allocate more space. Alas,
I estimate it would take a few days to find a specific such case,
so this isn't backed up by a new test (not to mention that such
a case may take hours to run, since conversion time is quadratic
in the number of digits, and preliminary attempts suggested that
the smallest such inputs contain at least a million digits).
Make some functions that should have been static static.
Fix a bunch of refleaks by fixing the definition of
MiddlingExtendsException.
Remove all the __new__ implementations apart from
BaseException_new. Rewrite most code that needs it to cope with
NULL fields (such code could get excercised anyway, the
__new__-removal just makes it more likely). This involved
editing the code for WindowsError, which I can't test.
This fixes all the refleaks in at least the start of a regrtest
-R :: run.
Fix a number of problems with the need for speed code:
One is doing this sort of thing:
Py_DECREF(self->field);
self->field = newval;
Py_INCREF(self->field);
without being very sure that self->field doesn't start with a
value that has a __del__, because that almost certainly can lead
to segfaults.
As self->args is constrained to be an exact tuple we may as well
exploit this fact consistently. This leads to quite a lot of
simplification (and, hey, probably better performance).
Add some error checking in places lacking it.
Fix some rather strange indentation in the Unicode code.
Delete some trailing whitespace.
More to come, I haven't fixed all the reference leaks yet...
(If compiled without FAST search support, changed the pre-memcmp test
to check the last character as well as the first. This gave a 25%
speedup for my test case.)
Rewrote the split algorithms so they stop when maxsplit gets to 0.
Previously they did a string match first then checked if the maxsplit
was reached. The new way prevents a needless string search.
results list.
Originally it allocated 0 items and used the list growth during append. Now
it preallocates 12 items so the first few appends don't need list reallocs.
("Here are some words ."*2).split(None, 1) is 7% faster
("Here are some words ."*2).split() is is 15% faster
(Your milage may vary, see dealership for details.)
File parsing like this
for line in f:
count += len(line.split())
is also about 15% faster. There is a slowdown of about 3% for large
strings because of the additional overhead of checking if the append is
to a preallocated region of the list or not. This will be the rare case.
It could be improved with special case code but we decided it was not
useful enough.
There is a cost of 12*sizeof(PyObject *) bytes per list. For the normal
case of file parsing this is not a problem because of the lists have
a short lifetime. We have not come up with cases where this is a problem
in real life.
I chose 12 because human text averages about 11 words per line in books,
one of my data sets averages 6.2 words with a final peak at 11 words per
line, and I work with a tab delimited data set with 8 tabs per line (or
9 words per line). 12 encompasses all of these.
Also changed the last rstrip code to append then reverse, rather than
doing insert(0). The strip() and rstrip() times are now comparable.
this is on par with a corresponding find, and nearly twice as fast
as split(sep, 1)
full tests, a unicode version, and documentation will follow to-
morrow.
The SIGCHECK macro defined here has always been bizarre, but
it apparently causes compiler warnings on "Sun Studio 11".
I believe the warnings are bogus, but it doesn't hurt to make
the macro definition saner.
Bugfix candidate (but I'm not going to bother).