(high_bits << PyLong_SHIFT) + low_bits with
(high_bits << PyLong_SHIFT) | low_bits
in Objects/longobject.c. Motivation:
- shouldn't unnecessarily mix bit ops with arithmetic ops (style)
- this pattern should be spelt the same way thoughout (consistency)
- it's very very very slightly faster: no need to worry about
carries to the high digit (nano-optimization).
The debug memory api now keeps track of which external API (PyMem_* or PyObject_*) was used to allocate each block and treats any API violation as an error. Added separate _PyMem_DebugMalloc functions for the Py_Mem API instead of having it use the _PyObject_DebugMalloc functions.
Without this change, test_unicode.UnicodeTest.test_codecs_utf7 crashes in
debug mode. What happens is the unicode string u'\U000abcde' with a length
of 1 encodes to the string '+2m/c3g-' of length 8. Since only 5 bytes is
reserved in the buffer, a buffer overrun occurs.
lnotab-based tracing is very complicated and isn't documented very well. There
were at least 3 comment blocks purporting to document co_lnotab, and none did a
very good job. This patch unifies them into Objects/lnotab_notes.txt which
tries to completely capture the current state of affairs.
I also discovered that we've attached 2 layers of patches to the basic tracing
scheme. The first layer avoids jumping to instructions that don't start a line,
to avoid problems in if statements and while loops. The second layer
discovered that jumps backward do need to trace at instructions that don't
start a line, so it added extra lnotab entries for 'while' and 'for' loops, and
added a special case for backward jumps within the same line. I replaced these
patches by just treating forward and backward jumps differently.
made it try to set the line number from the trace callback for a 'call' event.
This patch makes the error message a little more helpful in that case, and
makes it a little less likely that a future editor will make the same mistake
in test_trace.
Most uses of PyCode_Addr2Line
(http://www.google.com/codesearch?q=PyCode_Addr2Line) are just trying to get
the line number of a specified frame, but there's no way to do that directly.
Forcing people to go through the code object makes them know more about the
guts of the interpreter than they should need.
The remaining uses of PyCode_Addr2Line seem to be getting the line from a
traceback (for example,
http://www.google.com/codesearch/p?hl=en#u_9_nDrchrw/pygame-1.7.1release/src/base.c&q=PyCode_Addr2Line),
which is replaced by the tb_lineno field. So we may be able to deprecate
PyCode_Addr2Line entirely for external use.
Most uses of PyCode_New found by http://www.google.com/codesearch?q=PyCode_New
are trying to build an empty code object, usually to put it in a dummy frame
object. This patch adds a PyCode_NewEmpty wrapper which lets the user specify
just the filename, function name, and first line number, instead of also
requiring lots of code internals.
the __doc__ into the subclass instance __dict__. The fix refactors
property_copy to call property_init in such a way that the __doc__
logic is re-executed correctly when getter_doc is 1, thus simplifying
property_copy.
PyUnicode_DecodeUTF8() once, remember the result and output it in a second
step. This avoids problems with counting UTF-8 bytes that ignores the effect
of using the replace error handler in PyUnicode_DecodeUTF8().
If anyone wants to clean up the documentation, feel free. It's my first documentation foray, and it's not that great.
Will port to py3k with a different strategy.
- simplify parsing and printing of complex numbers
- make complex(repr(z)) round-tripping work for complex
numbers involving nans, infs, or negative zeros
- don't accept some of the stranger complex strings
that were previously allowed---e.g., complex('1..1j')
int, long, and float __format__(), and it keeps their implementation
in sync with py3k.
Also added PyOS_double_to_string. This is the "fallback" version
that's also available in trunk, and should be kept in sync with that
code. I'll add an issue to document PyOS_double_to_string in the C
API.
There are many internal cleanups. Externally visible changes include:
- Implement PEP 378, Format Specifier for Thousands Separator, for
floats, ints, and longs.
- Issue #5515: 'n' formatting for ints, longs, and floats handles
leading zero formatting poorly.
- Issue #5772: For float.__format__, don't add a trailing ".0" if
we're using no type code and we have an exponent.
correctly rounded, using round-half-to-even. This ensures that the
value of float(n) doesn't depend on whether we're using 15-bit digits
or 30-bit digits for Python longs.
untrackable objects are not tracked by the garbage collector. This can
reduce the size of collections and therefore the garbage collection overhead
on long-running programs, depending on their particular use of datatypes.
(trivia: this makes the "binary_trees" benchmark from the Computer Language
Shootout 40% faster)
The basic algorithm remains the same; the most significant speedups
come from the following three changes:
(1) normalize by shifting instead of multiplying and dividing
(2) the old algorithm usually did an unnecessary extra iteration of
the outer loop; remove this. As a special case, this means that
long divisions with a single-digit result run twice as fast as
before.
(3) make inner loop much tighter.
Various benchmarks show speedups of between 50% and 150% for long
integer divisions and modulo operations.
sizeof(Py_UNICODE) == 2, PyUnicode_FromWideChar now converts
each character outside the BMP to the appropriate surrogate pair.
Thanks Victor Stinner for the patch.
(backport of r70452 from py3k to trunk)
For simple uses for str.format(), this makes the typing easier. Hopfully this
will help in the adoption of str.format().
For example:
'The {} is {}'.format('sky', 'blue')
You can mix and matcth auto-numbering and named replacement fields:
'The {} is {color}'.format('sky', color='blue')
But you can't mix and match auto-numbering and specified numbering:
'The {0} is {}'.format('sky', 'blue')
ValueError: cannot switch from manual field specification to automatic field numbering
Will port to 3.1.
and cleanups in Objects/longobject.c. The most significant change is that
longs now use less memory: average savings are 2 bytes per long on 32-bit
systems and 6 bytes per long on 64-bit systems. (This memory saving already
exists in py3k.)
platforms. The previous code was fragile, depending on the twin
accidents that:
(1) in C, casting the double value 2.**63 to long returns the integer
value -2**63, and
(2) in Python, hash(-2**63) == hash(2**63).
There's already a test for this in test_hash.
- fix some places where counters into ob_digit were declared as
int instead of Py_ssize_t
- add (twodigit) casts where necessary
- fix code in _PyLong_AsByteArray that uses << on negative values
#4759: allow None as first argument of bytearray.translate(), for consistency with bytes.translate().
Also fix segfault for bytearray.translate(x, None) -- will backport this part to 3.0 and 2.6.
the __long__ slot is allowed to return either int or long, but the behaviour of
float objects should not change between 2.5 and 2.6.
Reviewed by Benjamin Peterson
exception afterwards (for a subsequent parameter), the user code will
not call PyBuffer_Release() and memory will leak.
Reviewed by Amaury Forgeot d'Arc.
match Python 2.5 speed despite the __instancecheck__ / __subclasscheck__
mechanism. In the process, fix a bug where isinstance() and issubclass(),
when given a tuple of classes as second argument, were looking up
__instancecheck__ / __subclasscheck__ on the tuple rather than on each
type object.
Reviewed by Benjamin Peterson and Raymond Hettinger.
* crashes on memory allocation failure found with failmalloc
* memory leaks found with valgrind
* compiler warnings in opt mode which would lead to invalid memory reads
* problem using wrong name in decimal module reported by pychecker
Update the valgrind suppressions file with new leaks that are small/one-time
leaks we don't care about (ie, they are too hard to fix).
TBR=barry
TESTED=./python -E -tt ./Lib/test/regrtest.py -uall (both debug and opt modes)
in opt mode:
valgrind -q --leak-check=yes --suppressions=Misc/valgrind-python.supp \
./python -E -tt ./Lib/test/regrtest.py -uall,-bsddb,-compiler \
-x test_logging test_ssl test_multiprocessing
valgrind -q --leak-check=yes --suppressions=Misc/valgrind-python.supp \
./python -E -tt ./Lib/test/regrtest.py test_multiprocessing
for i in `seq 1 4000` ; do
LD_PRELOAD=~/local/lib/libfailmalloc.so FAILMALLOC_INTERVAL=$i \
./python -c pass
done
At least some of these fixes should probably be backported to 2.5.
rewrite float.fromhex to only allow ASCII hex digits on all platforms.
(Tests for this are already present, but the test_float failures
on Solaris hadn't been noticed before.)
Reviewed by Antoine Pitrou.
Optimization of str.format() for cases with str, unicode, int, long,
and float arguments. This gives about 30% speed improvement for the
simplest (but most common) cases. This patch skips the __format__
dispatch, and also avoids creating an object to hold the format_spec.
Unfortunately there's a complication in 2.6 with int, long, and float
because they always expect str format_specs. So in the unicode
version of this optimization, just check for unicode objects. int,
float, long, and str can be added later, if needed.