mirror of https://github.com/python/cpython
gh-100833: Remove 'volatile' qualifiers in fsum algorithm (#100845)
This PR removes the `volatile` qualifier on various intermediate quantities in the `math.fsum` implementation, and updates the notes preceding the algorithm accordingly (as well as fixing some of the exsting notes). See the linked issue #100833 for discussion.
This commit is contained in:
parent
b139bcd892
commit
87d3bd0e02
|
@ -0,0 +1 @@
|
|||
Speed up :func:`math.fsum` by removing defensive ``volatile`` qualifiers.
|
|
@ -1358,30 +1358,30 @@ FUNC1(tanh, tanh, 0,
|
|||
Dickinson's post at <http://bugs.python.org/file10357/msum4.py>.
|
||||
See those links for more details, proofs and other references.
|
||||
|
||||
Note 1: IEEE 754R floating point semantics are assumed,
|
||||
but the current implementation does not re-establish special
|
||||
value semantics across iterations (i.e. handling -Inf + Inf).
|
||||
Note 1: IEEE 754 floating-point semantics with a rounding mode of
|
||||
roundTiesToEven are assumed.
|
||||
|
||||
Note 2: No provision is made for intermediate overflow handling;
|
||||
therefore, sum([1e+308, 1e-308, 1e+308]) returns 1e+308 while
|
||||
sum([1e+308, 1e+308, 1e-308]) raises an OverflowError due to the
|
||||
Note 2: No provision is made for intermediate overflow handling;
|
||||
therefore, fsum([1e+308, -1e+308, 1e+308]) returns 1e+308 while
|
||||
fsum([1e+308, 1e+308, -1e+308]) raises an OverflowError due to the
|
||||
overflow of the first partial sum.
|
||||
|
||||
Note 3: The intermediate values lo, yr, and hi are declared volatile so
|
||||
aggressive compilers won't algebraically reduce lo to always be exactly 0.0.
|
||||
Also, the volatile declaration forces the values to be stored in memory as
|
||||
regular doubles instead of extended long precision (80-bit) values. This
|
||||
prevents double rounding because any addition or subtraction of two doubles
|
||||
can be resolved exactly into double-sized hi and lo values. As long as the
|
||||
hi value gets forced into a double before yr and lo are computed, the extra
|
||||
bits in downstream extended precision operations (x87 for example) will be
|
||||
exactly zero and therefore can be losslessly stored back into a double,
|
||||
thereby preventing double rounding.
|
||||
Note 3: The algorithm has two potential sources of fragility. First, C
|
||||
permits arithmetic operations on `double`s to be performed in an
|
||||
intermediate format whose range and precision may be greater than those of
|
||||
`double` (see for example C99 §5.2.4.2.2, paragraph 8). This can happen for
|
||||
example on machines using the now largely historical x87 FPUs. In this case,
|
||||
`fsum` can produce incorrect results. If `FLT_EVAL_METHOD` is `0` or `1`, or
|
||||
`FLT_EVAL_METHOD` is `2` and `long double` is identical to `double`, then we
|
||||
should be safe from this source of errors. Second, an aggressively
|
||||
optimizing compiler can re-associate operations so that (for example) the
|
||||
statement `yr = hi - x;` is treated as `yr = (x + y) - x` and then
|
||||
re-associated as `yr = y + (x - x)`, giving `y = yr` and `lo = 0.0`. That
|
||||
re-association would be in violation of the C standard, and should not occur
|
||||
except possibly in the presence of unsafe optimizations (e.g., -ffast-math,
|
||||
-fassociative-math). Such optimizations should be avoided for this module.
|
||||
|
||||
Note 4: A similar implementation is in Modules/cmathmodule.c.
|
||||
Be sure to update both when making changes.
|
||||
|
||||
Note 5: The signature of math.fsum() differs from builtins.sum()
|
||||
Note 4: The signature of math.fsum() differs from builtins.sum()
|
||||
because the start argument doesn't make sense in the context of
|
||||
accurate summation. Since the partials table is collapsed before
|
||||
returning a result, sum(seq2, start=sum(seq1)) may not equal the
|
||||
|
@ -1467,7 +1467,7 @@ math_fsum(PyObject *module, PyObject *seq)
|
|||
Py_ssize_t i, j, n = 0, m = NUM_PARTIALS;
|
||||
double x, y, t, ps[NUM_PARTIALS], *p = ps;
|
||||
double xsave, special_sum = 0.0, inf_sum = 0.0;
|
||||
volatile double hi, yr, lo;
|
||||
double hi, yr, lo;
|
||||
|
||||
iter = PyObject_GetIter(seq);
|
||||
if (iter == NULL)
|
||||
|
|
Loading…
Reference in New Issue