Reduce the base by the modulus when the base is larger than
the modulus. This can unboundedly speed the "startup costs"
of doing modular exponentiation, particularly in cases where
the base is much larger than the modulus. Original patch
by Armin Rigo, inspired by https://github.com/pyca/ed25519.
(grafted from f34c59494420765b013136ca93f63b716d9f1d30)
(the latter renamed to _PyLong_Frexp) now use the same core code. The
exponent produced by _PyLong_Frexp now has type Py_ssize_t instead of the
previously used int, and no longer needs scaling by PyLong_SHIFT. This
frees the math module from having to know anything about the PyLong
implementation. This closes issue #5576.
(high_bits << PyLong_SHIFT) + low_bits with
(high_bits << PyLong_SHIFT) | low_bits
in Objects/longobject.c. Motivation:
- shouldn't unnecessarily mix bit ops with arithmetic ops (style)
- this pattern should be spelt the same way thoughout (consistency)
- it's very very very slightly faster: no need to worry about
carries to the high digit (nano-optimization).
correctly rounded, using round-half-to-even. This ensures that the
value of float(n) doesn't depend on whether we're using 15-bit digits
or 30-bit digits for Python longs.
The basic algorithm remains the same; the most significant speedups
come from the following three changes:
(1) normalize by shifting instead of multiplying and dividing
(2) the old algorithm usually did an unnecessary extra iteration of
the outer loop; remove this. As a special case, this means that
long divisions with a single-digit result run twice as fast as
before.
(3) make inner loop much tighter.
Various benchmarks show speedups of between 50% and 150% for long
integer divisions and modulo operations.
and cleanups in Objects/longobject.c. The most significant change is that
longs now use less memory: average savings are 2 bytes per long on 32-bit
systems and 6 bytes per long on 64-bit systems. (This memory saving already
exists in py3k.)
- fix some places where counters into ob_digit were declared as
int instead of Py_ssize_t
- add (twodigit) casts where necessary
- fix code in _PyLong_AsByteArray that uses << on negative values
long raise ValueError instead of returning 0. Also, change the error
message for conversion of an infinity to an integer, replacing 'long' by
'integer', so that it's appropriate for both long(float('inf')) and
int(float('inf')).
The patch also adds acosh, asinh, atanh, log1p and copysign to all platforms. Finally it fixes differences between platforms like different results or exceptions for edge cases. Have fun :)
Rather than sprinkle casts throughout the code, change Py_CHARMASK to
always cast it's result to an unsigned char. This should ensure we
do the right thing when accessing an array with the result.