The setobject freelist was consuming memory but not providing much value.
Even when a freelisted setobject was available, most of the setobject
fields still needed to be initialized and the small table still required
a memset(). This meant that the custom freelisting scheme for sets was
providing almost no incremental benefit over the default Python freelist
scheme used by _PyObject_Malloc() in Objects/obmalloc.c.
Modern processors tend to make consecutive memory accesses cheaper than
random probes into memory.
Small sets can fit into L1 cache, so they get less benefit. But they do
come out ahead because the consecutive probes don't probe the same key
more than once and because the randomization step occurs less frequently
(or not at all).
For the open addressing step, putting the perturb shift before the index
calculation gets the upper bits into play sooner.
The Gdb prettyprint plugin depended on the dummy object being displayable.
Other solutions besides a unicode object are possible. For now, get it
back up and running.
The identity checks in lookkey() need to be there to prevent the dummy
object from leaking through Py_RichCompareBool() into user code in the
rare circumstance where the dummy's hash value exactly matches the hash
value of the actual key being looked up.
Letting the compiler decide how to optimize the multiply by five
gives it the freedom to make better choices for the best technique
for a given target machine.
For example, GCC on x86_64 produces a little bit better code:
Old-way (3 steps with a data dependency between each step):
shrq $5, %r13
leaq 1(%rbx,%r13), %rax
leaq (%rax,%rbx,4), %rbx
New-way (3 steps with no dependency between the first two steps
which can be run in parallel):
leaq (%rbx,%rbx,4), %rax # i*5
shrq $5, %r13 # perturb >>= PERTURB_SHIFT
leaq 1(%r13,%rax), %rbx # 1 + perturb + i*5
computation as the overflow behavior of signed integers is undefined.
NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done. I added the comment that my change in 3.2 added so that the
code would match up. Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
computation as the overflow behavior of signed integers is undefined.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.