Commit Graph

267 Commits

Author SHA1 Message Date
Raymond Hettinger 1bd8d75be3 Issue #23290: Optimize set_merge() for cases where the target is empty.
(Contributed by Serhiy Storchaka.)
2015-05-13 01:26:14 -07:00
Raymond Hettinger 438f9134cf Micro-optimizations to reduce register spills and reloads observed on CLANG and GCC. 2015-02-09 06:48:29 -06:00
Raymond Hettinger 8249282622 Minor code clean up. 2015-02-04 08:37:02 -08:00
Raymond Hettinger 06bb1226d1 Issue 23359: Reduce size of code in set_lookkey. Only do linear probes when there is no wrap-around.
Nice simplification contributed by Serhiy Storchaka :-)
2015-02-03 08:15:30 -08:00
Raymond Hettinger c658d85487 Issue 23359: Tighten inner search loop for sets (don't and-mask every entry lookup). 2015-02-02 08:35:00 -08:00
Raymond Hettinger 59ecabd12a Keep the definition of i consistent between set_lookkey() and set_insert_clean(). 2015-01-31 02:45:12 -08:00
Raymond Hettinger 9edd753229 Minor tweak to improve code clarity. 2015-01-30 20:09:23 -08:00
Raymond Hettinger 06a1c8dfa0 Fix typo in a comment. 2015-01-30 18:02:15 -08:00
Raymond Hettinger f8d1a31e70 Revert unintended part of the commit (the key==dummy test wasn't supposed to change). 2015-01-26 22:06:43 -08:00
Raymond Hettinger a5ebbf6295 Remove unneeded dummy test from the set search loop (when the hashes match we know the key is not a dummy). 2015-01-26 21:54:35 -08:00
Raymond Hettinger 3037e84ad1 Issue #23269: Tighten search_loop in set_insert_clean()
Instead of masking and shifting every lookup, move the wrap-around
test outside of the inner-loop.
2015-01-26 21:33:48 -08:00
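
The idea of hoisting the wrap-around test out of the inner loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from setobject.c: the table is a plain array of ints where 0 marks a free slot, the helper names `find_free_masked` and `find_free_hoisted` are hypothetical, and the value of `LINEAR_PROBES` is assumed for the example.

```c
#include <assert.h>
#include <stddef.h>

#define LINEAR_PROBES 9              /* assumed value, for illustration only */
#define NOT_FOUND ((size_t)-1)

/* Baseline: and-mask the index on every probe so it wraps around the table. */
size_t find_free_masked(const int *table, size_t mask, size_t i)
{
    assert((mask & (mask + 1)) == 0);    /* table size must be a power of two */
    for (size_t j = 0; j <= LINEAR_PROBES; j++) {
        size_t k = (i + j) & mask;
        if (table[k] == 0)
            return k;
    }
    return NOT_FOUND;
}

/* Hoisted: test for wrap-around once, then scan consecutive slots
   without masking. Falls back to the masked loop near the table end. */
size_t find_free_hoisted(const int *table, size_t mask, size_t i)
{
    assert((mask & (mask + 1)) == 0);
    if (i + LINEAR_PROBES <= mask) {
        const int *entry = &table[i];
        for (size_t j = 0; j <= LINEAR_PROBES; j++) {
            if (entry[j] == 0)
                return i + j;
        }
        return NOT_FOUND;
    }
    return find_free_masked(table, mask, i);
}
```

Both helpers visit the same slots; the hoisted version simply pays for the bounds test once per window instead of once per probe.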
Raymond Hettinger b335dfe7fa Set the hash values of dummy entries to -1. Improves quality of entry->hash == hash tests. 2015-01-25 16:38:52 -08:00
Raymond Hettinger 4d45c1069b Update out-of-date comments. 2015-01-25 16:27:40 -08:00
Raymond Hettinger 93035c44fd Issue #23119: Simplify setobject by inlining the special case for unicode equality testing. 2015-01-25 16:12:49 -08:00
Raymond Hettinger ed741d4ff0 A hybrid of and-masking and a conditional-set-to-zero produces an even faster search loop. 2015-01-18 21:25:15 -08:00
Raymond Hettinger bd9b200b87 Update copyright for 2015 updates. 2015-01-18 16:10:30 -08:00
Raymond Hettinger 9cd6a789c6 Clean-up, simplify, and slightly speed-up bounds logic in set_pop().
Elsewhere in the setobject.c code we do a bitwise-and with the mask
instead of using a conditional to reset to zero on wrap-around.
Using that same technique here gives cleaner, faster, and more
consistent code.
2015-01-18 16:06:18 -08:00
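
The two wrap-around styles contrasted in this message can be sketched as below. This is an illustration only, not the set_pop() code itself: the helper names are hypothetical, and the table size is assumed to be a power of two so that `mask` equals the size minus one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance an index and reset it to zero on
   wrap-around using a conditional. */
size_t next_index_conditional(size_t i, size_t mask)
{
    i++;
    if (i > mask)
        i = 0;
    return i;
}

/* Hypothetical helper: same result using a bitwise-and with the mask.
   Only valid when mask + 1 is a power of two. */
size_t next_index_masked(size_t i, size_t mask)
{
    assert((mask & (mask + 1)) == 0);
    return (i + 1) & mask;
}
```

For any index inside the table the two forms agree; the masked form avoids a branch.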
Raymond Hettinger 1202a4733e Issue 23261: Clean-up the hack to store the set.pop() search finger in a hash field instead of the setobject. 2015-01-18 13:12:42 -08:00
Raymond Hettinger 8edf27c134 Small clean-up. Factor-out common code for add, contains, and discard function pairs. 2014-12-26 23:08:58 -08:00
Raymond Hettinger 08e3dc0ad6 Issue #23107: Tighten-up loops in setobject.c
* Move the test for an exact key match to after a hash match
* Use "used" as a loop counter instead of "fill"
* Minor improvements to variable names and code consistency
2014-12-26 20:14:00 -08:00
Victor Stinner 12174a5dca Issue #22156: Fix "comparison between signed and unsigned integers" compiler
warnings in the Objects/ subdirectory.

PyType_FromSpecWithBases() and PyType_FromSpec() now reject explicitly negative
slot identifiers.
2014-08-15 23:17:38 +02:00
Raymond Hettinger 426d9958a2 Add development comments to setobject.c 2014-05-18 21:40:20 +01:00
Eric V. Smith 6ba5665fc7 Fix typo in comment. 2014-01-14 08:15:03 -05:00
Raymond Hettinger 74fc8c47f6 Add comments to frozenset_hash().
Also, provide a minor hint to the compiler on how to group the xors.
2014-01-05 12:00:31 -08:00
Raymond Hettinger e259f13874 Minor code clean-up. Keep the C-API all in one section. 2013-12-15 11:56:14 -08:00
Raymond Hettinger 710a67edfc Note that LINEAR_PROBES can be set to zero. 2013-09-21 20:17:31 -07:00
Raymond Hettinger 4ef0528b97 Minor beautification. Put updates and declarations in a more logical order. 2013-09-21 15:39:49 -07:00
Raymond Hettinger 0ce1953bf7 When LINEAR_PROBES=0, let the compiler remove the dead code on its own. 2013-09-21 14:07:18 -07:00
Raymond Hettinger c70a2b7bb9 Make the linear probe sequence clearer. 2013-09-21 14:02:55 -07:00
Raymond Hettinger 8408dc581e Issue 18771: Make it possible to set the number of linear probes at compile-time. 2013-09-15 14:57:15 -07:00
Raymond Hettinger 742d8716ff Put the defines in the logical section and fix indentation. 2013-09-08 00:25:57 -07:00
Raymond Hettinger 583cd03fd1 Minor code beautification. 2013-09-07 22:06:35 -07:00
Raymond Hettinger 4ea9080da9 Improve code clarity by removing two unattractive macros. 2013-09-07 21:01:29 -07:00
Raymond Hettinger 8f8839e10a Remove the freelist scheme for setobjects.
The setobject freelist was consuming memory but not providing much value.
Even when a freelisted setobject was available, most of the setobject
fields still needed to be initialized and the small table still required
a memset().  This meant that the custom freelisting scheme for sets was
providing almost no incremental benefit over the default Python freelist
scheme used by _PyObject_Malloc() in Objects/obmalloc.c.
2013-09-07 20:26:50 -07:00
Raymond Hettinger 04fd9dd52b Small rearrangement to bring together the three functions for probing the hash table. 2013-09-07 17:41:01 -07:00
Raymond Hettinger ae7b00e2d3 Move the overview comment to the top of the file. 2013-09-07 15:05:00 -07:00
Raymond Hettinger c56e0e3980 Minor touchups. 2013-09-02 16:32:27 -07:00
Raymond Hettinger 69492dab07 Factor-out the common code for setting a KeyError. 2013-09-02 15:59:26 -07:00
Raymond Hettinger a35adf5b09 Instead of XORed indices, switch to a hybrid of linear probing and open addressing.
Modern processors tend to make consecutive memory accesses cheaper than
random probes into memory.

Small sets can fit into L1 cache, so they get less benefit.  But they do
come out ahead because the consecutive probes don't probe the same key
more than once and because the randomization step occurs less frequently
(or not at all).

For the open addressing step, putting the perturb shift before the index
calculation gets the upper bits into play sooner.
2013-09-02 03:23:21 -07:00
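
A rough sketch of the probe order this message describes: a short run of consecutive slots, then a perturbed jump to a new region of the table. It is a simplified model rather than the committed code. The function name `probe_sequence` is hypothetical, the value of `LINEAR_PROBES` is assumed, and the jump uses the `i*5 + 1 + perturb` recurrence with the perturb shift applied first, as the message states.

```c
#include <assert.h>
#include <stddef.h>

#define LINEAR_PROBES 9     /* assumed value, for illustration only */
#define PERTURB_SHIFT 5

/* Fill out[0..n-1] with the first n slot indices visited for `hash`
   in a table of size mask + 1 (a power of two). */
void probe_sequence(size_t hash, size_t mask, size_t *out, size_t n)
{
    assert((mask & (mask + 1)) == 0);
    size_t perturb = hash;
    size_t i = hash & mask;
    size_t count = 0;

    while (count < n) {
        out[count++] = i;                         /* initial probe */
        for (size_t j = 1; j <= LINEAR_PROBES && count < n; j++)
            out[count++] = (i + j) & mask;        /* consecutive probes */
        perturb >>= PERTURB_SHIFT;                /* shift before the index calculation */
        i = (i * 5 + 1 + perturb) & mask;         /* jump to a new region */
    }
}
```

Consecutive probes touch neighbouring memory, which is the cache-friendliness argument made above; the perturbed jump only happens once the whole run has failed.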
Raymond Hettinger 6c3c1ccd1b Update copyright. 2013-08-31 21:34:24 -07:00
Raymond Hettinger 95c0d67581 Further reduce the cost of hash collisions by inspecting an additional nearby entry. 2013-08-31 21:27:08 -07:00
Raymond Hettinger afe890923f Tighten-up the lookkey() logic and beautify the code a bit.
Use less code by moving many of the steps from the initial
lookup into the main search loop.

Beautify the code but keep the overall logic unchanged.
2013-08-28 20:59:31 -07:00
Antoine Pitrou 9d95254bb7 Issue #18772: fix the gdb plugin after the set implementation changes 2013-08-24 21:07:07 +02:00
Raymond Hettinger bfc1e1a9cd Add the same dummy type that is used in dictionaries. 2013-08-23 03:22:15 -05:00
Raymond Hettinger fcf3b500ba Issue 18797: Remove unneeded refcount adjustments for dummy objects.
It suffices to keep just one reference when the object is created.
2013-08-22 08:20:31 -07:00
Raymond Hettinger 5bb1b1dd6f Hoist the global dummy lookup out of the inner loop for set_merge(). 2013-08-21 01:34:18 -07:00
Raymond Hettinger 929cbac307 Remove a redundant hash table probe (this was an artifact from an earlier draft of the patch). 2013-08-20 23:03:28 -07:00
Raymond Hettinger ae9e616a00 Issue 18772: Restore set dummy object back to unicode and restore the identity checks in lookkey().
The Gdb prettyprint plugin depended on the dummy object being displayable.
Other solutions besides a unicode object are possible.  For now, get it
back up and running.

The identity checks in lookkey() need to be there to prevent the dummy
object from leaking through Py_RichCompareBool() into user code in the
rare circumstance where the dummy's hash value exactly matches the hash
value of the actual key being looked up.
2013-08-20 22:28:24 -07:00
Raymond Hettinger 3c0a4f5def Issue 18771: Reduce the cost of hash collisions for set objects. 2013-08-19 07:36:04 -07:00
Raymond Hettinger 07351a0449 Remove the else-clause because the conditions are no longer mutually exclusive. 2013-08-17 02:39:46 -07:00
Raymond Hettinger 237b34b074 Use a known unique object for the dummy entry.
This lets us run PyObject_RichCompareBool() without
first needing to check whether the entry is a dummy.
2013-08-17 02:31:53 -07:00
Raymond Hettinger 8ad3919577 Hoist the global "dummy" lookup outside of the reinsertion loop. 2013-08-15 02:18:55 -07:00
Antoine Pitrou 9ed5f27266 Issue #18722: Remove uses of the "register" keyword in C code. 2013-08-13 20:18:52 +02:00
Raymond Hettinger c629d4c9a2 Replace outdated optimization with clearer code that compiles better.
Letting the compiler decide how to optimize the multiply by five
gives it the freedom to make better choices for the best technique
for a given target machine.

For example, GCC on x86_64 produces a little bit better code:

Old-way (3 steps with a data dependency between each step):

    shrq    $5, %r13
    leaq    1(%rbx,%r13), %rax
    leaq    (%rax,%rbx,4), %rbx

New-way (3 steps with no dependency between the first two steps
         which can be run in parallel):

    leaq    (%rbx,%rbx,4), %rax     # i*5
    shrq    $5, %r13                # perturb >>= PERTURB_SHIFT
    leaq    1(%r13,%rax), %rbx      # 1 + perturb + i*5
2013-08-05 22:24:50 -07:00
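
At the C level, the change described above plausibly looks like the pair of functions below. This is a reconstruction inferred from the two assembly listings, not a quote of the diff: the function names are hypothetical and the shift-and-add spelling of the old form is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define PERTURB_SHIFT 5

/* Old style (assumed): multiply by five spelled out as shift-and-add. */
size_t next_index_shift_add(size_t i, size_t *perturb)
{
    assert(perturb != NULL);
    *perturb >>= PERTURB_SHIFT;
    return (i << 2) + i + *perturb + 1;
}

/* New style: write the multiply directly and let the compiler choose
   the instruction sequence for the target machine. */
size_t next_index_multiply(size_t i, size_t *perturb)
{
    assert(perturb != NULL);
    *perturb >>= PERTURB_SHIFT;
    return i * 5 + *perturb + 1;
}
```

Both functions compute the same value; the difference is only in how much freedom the compiler has when scheduling the instructions.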
Raymond Hettinger c86d7e989c Silence compiler warning for an unused declaration 2013-08-04 12:00:36 -07:00
Antoine Pitrou 5e946bacef Fix compilation warning with gcc 4.8 (unused typedef) 2013-06-18 23:28:18 +02:00
Gregory P. Smith c2176e46d7 Fix the internals of our hash functions to use unsigned values during hash
computation as the overflow behavior of signed integers is undefined.

NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done.  I added the comment that my change in 3.2 added so that the
code would match up.  Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.

In practice we require compiling everything with -fwrapv which forces overflow
to be defined as two's complement but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.

Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).

Cleanup only - no functionality or hash values change.
2012-12-10 18:32:53 -08:00
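
The point about signed overflow can be illustrated with a small hash-combining loop that does all of its arithmetic in an unsigned type, where wrap-around is well defined by the C standard. This is a hedged sketch and not one of CPython's hash functions: the function name `combine_hashes` and the multiplier are hypothetical, chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical multiplier, chosen only for illustration. */
#define MULTIPLIER UINT64_C(1000003)

/* Combine item hashes using unsigned arithmetic only. Unsigned
   multiplication wraps modulo 2^64, so no step relies on -fwrapv. */
uint64_t combine_hashes(const uint64_t *items, size_t n)
{
    assert(items != NULL || n == 0);
    uint64_t h = 0;
    for (size_t k = 0; k < n; k++)
        h = (h ^ items[k]) * MULTIPLIER;   /* wraps, never undefined */
    return h;
}
```

Had `h` been a signed 64-bit integer, the same multiplication could overflow, which is undefined behavior unless the compiler is told otherwise.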
Gregory P. Smith 27cbcd6241 Fix the internals of our hash functions to use unsigned values during hash
computation as the overflow behavior of signed integers is undefined.

In practice we require compiling everything with -fwrapv which forces overflow
to be defined as two's complement but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.

Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).

Cleanup only - no functionality or hash values change.
2012-12-10 18:15:46 -08:00
Ezio Melotti 0e1af282b8 Fix typo. 2012-09-28 16:43:40 +03:00
David Malcolm 49526f48fc Issue #14785: Add sys._debugmallocstats() to help debug low-level memory allocation issues 2012-06-22 14:55:41 -04:00
Antoine Pitrou a701388de1 Rename _PyIter_GetBuiltin to _PyObject_GetBuiltin, and do not include it in the stable ABI. 2012-04-05 00:04:20 +02:00
Kristján Valur Jónsson 31668b8f7a Issue #14288: Serialization support for builtin iterators. 2012-04-03 10:49:41 +00:00
Antoine Pitrou 093ce9cd8c Issue #6695: Full garbage collection runs now clear the freelist of set objects.
Initial patch by Matthias Troffaes.
2011-12-16 11:24:27 +01:00
Benjamin Peterson 1cebc207ea merge 3.2 2011-10-30 14:24:59 -04:00
Benjamin Peterson 2b50a01d11 remove unused variable 2011-10-30 14:24:44 -04:00
Petri Lehtinen c34f5c256a Fix the return value of set_discard (issue #10519) 2011-10-30 14:35:39 +02:00
Petri Lehtinen e0aa803714 Fix the return value of set_discard (issue #10519) 2011-10-30 14:35:12 +02:00
Petri Lehtinen 7c5e34d8a3 Avoid unnecessary recursive function calls (closes #10519) 2011-10-30 13:57:45 +02:00
Petri Lehtinen 5acc27ebe4 Avoid unnecessary recursive function calls (closes #10519) 2011-10-30 13:56:41 +02:00
Martin v. Löwis bd928fef42 Rename _Py_identifier to _Py_IDENTIFIER. 2011-10-14 10:20:37 +02:00
Martin v. Löwis 1ee1b6fe0d Use identifier API for PyObject_GetAttrString. 2011-10-10 18:11:30 +02:00
Martin v. Löwis d63a3b8beb Implement PEP 393. 2011-09-28 07:41:54 +02:00
Mark Dickinson 57e683e53e Issue #1621: Fix undefined behaviour in bytes.__hash__, str.__hash__, tuple.__hash__, frozenset.__hash__ and set indexing operations. 2011-09-24 18:18:40 +01:00
Brian Curtin dfc80e3d97 Replace Py_NotImplemented returns with the macro form Py_RETURN_NOTIMPLEMENTED.
The macro was introduced in #12724.
2011-08-10 20:28:54 -05:00
Victor Stinner 4f2dab5c33 Revert my commit 7ba176c2f558: "Avoid useless "++" at the end of functions
Warnings found by the Clang Static Analyzer."

Most people prefer ++ at the end of functions.
2011-05-27 16:46:51 +02:00
Victor Stinner a1a807b6ef set_repr(): correctly handle PyUnicode_FromUnicode() error (MemoryError)
Bug found by the Clang Static Analyzer.
2011-05-26 14:24:30 +02:00
Victor Stinner 97e561ef24 Avoid useless "++" at the end of functions
Warnings found by the Clang Static Analyzer.
2011-05-26 13:53:47 +02:00
Éric Araujo bbcfc1f2d9 Merge from 3.1.
The fix was already committed to 3.2, but I merged two small changes
recommended by Raymond while I was working on the 2.7 patch to ease
future merges.
2011-03-23 03:43:22 +01:00
Éric Araujo 48049911d6 Fix obscure set crashers (#8420). Backport of d56b3cafb1e6, reviewed by Raymond. 2011-03-23 02:08:07 +01:00
Antoine Pitrou 715f3cd10d Issue #8685: Speed up set difference `a - b` when source set `a` is
much larger than operand `b`.  Patch by Andrew Bennetts.
2010-11-30 22:23:20 +00:00
Antoine Pitrou fbb1c6191c Follow up to #9778: fix regressions on 64-bit Windows builds 2010-10-23 17:37:54 +00:00
Georg Brandl 00da4e0b5a Remove unneeded casts to hashfunc. 2010-10-18 07:32:48 +00:00
Benjamin Peterson 8f67d0893f make hashes always the size of pointers; introduce Py_hash_t #9778 2010-10-17 20:54:53 +00:00
Georg Brandl 2d444496b3 Reindent. 2010-09-03 10:52:55 +00:00
Raymond Hettinger faf7b7f4ec Issue 8420: Fix obscure set crashers. 2010-09-03 10:00:50 +00:00
Daniel Stutzbach 928d4eeee8 Removed an extraneous semicolon 2010-09-02 15:06:03 +00:00
Antoine Pitrou f72006f442 Merged revisions 84146-84147,84150 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r84146 | antoine.pitrou | 2010-08-17 19:55:07 +0200 (Tue, 17 Aug 2010) | 4 lines

  Issue #9612: The set object is now 64-bit clean under Windows.
........
  r84147 | antoine.pitrou | 2010-08-17 20:30:06 +0200 (Tue, 17 Aug 2010) | 3 lines

  Fix <deque iterator>.__length_hint__() under 64-bit Windows.
........
  r84150 | antoine.pitrou | 2010-08-17 21:33:30 +0200 (Tue, 17 Aug 2010) | 3 lines

  Clean some 64-bit issues. Also, always spell "ssize_t" "Py_ssize_t".
........
2010-08-17 19:39:39 +00:00
Antoine Pitrou 671b4d948e Issue #9612: The set object is now 64-bit clean under Windows. 2010-08-17 17:55:07 +00:00
Raymond Hettinger 51ced7afe7 Issue8757: Implicit set-to-frozenset conversion not thread-safe. 2010-08-06 09:57:49 +00:00
Raymond Hettinger 38bf2ccf4c Issue8757: Implicit set-to-frozenset conversion not thread-safe. 2010-08-06 09:52:17 +00:00
Antoine Pitrou 7f14f0d8a0 Recorded merge of revisions 81032 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r81032 | antoine.pitrou | 2010-05-09 17:52:27 +0200 (Sun, 09 May 2010) | 9 lines

  Recorded merge of revisions 81029 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r81029 | antoine.pitrou | 2010-05-09 16:46:46 +0200 (Sun, 09 May 2010) | 3 lines

    Untabify C files. Will watch buildbots.
  ........
................
2010-05-09 16:14:21 +00:00
Antoine Pitrou f95a1b3c53 Recorded merge of revisions 81029 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81029 | antoine.pitrou | 2010-05-09 16:46:46 +0200 (Sun, 09 May 2010) | 3 lines

  Untabify C files. Will watch buildbots.
........
2010-05-09 15:52:27 +00:00
Raymond Hettinger 3fb156caa4 Issue 8436: set.__init__ accepts keyword args 2010-04-18 23:05:22 +00:00
Raymond Hettinger dbe961215a Issue 8436: set.__init__ accepts keyword args 2010-04-18 23:03:16 +00:00
Raymond Hettinger b136a9c9d7 Issue 8420: Fix ref counting problem in set_repr(). 2010-04-18 20:28:33 +00:00
Raymond Hettinger f88db8de76 Issue 8420: Fix ref counting problem in set_repr(). 2010-04-18 20:26:14 +00:00
Victor Stinner 08b36bdab4 Merged revisions 78886 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78886 | victor.stinner | 2010-03-13 01:13:22 +0100 (Sat, 13 Mar 2010) | 2 lines

  Issue #7818: set().test_c_api() doesn't expect a set('abc'), modify the set.
........
2010-03-13 00:19:17 +00:00
Ezio Melotti 807e98e0af Merged revisions 78541 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r78541 | ezio.melotti | 2010-03-01 06:08:34 +0200 (Mon, 01 Mar 2010) | 17 lines

  Merged revisions 78515-78516,78522 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r78515 | georg.brandl | 2010-02-28 20:19:17 +0200 (Sun, 28 Feb 2010) | 1 line

    #8030: make builtin type docstrings more consistent: use "iterable" instead of "seq(uence)", use "new" to show that set() always returns a new object.
  ........
    r78516 | georg.brandl | 2010-02-28 20:26:37 +0200 (Sun, 28 Feb 2010) | 1 line

    The set types can also be called without arguments.
  ........
    r78522 | ezio.melotti | 2010-03-01 01:59:00 +0200 (Mon, 01 Mar 2010) | 1 line

    #8030: more docstring fix for builtin types.
  ........
................
2010-03-01 04:10:55 +00:00
Ezio Melotti 7f807b79d8 Merged revisions 78515-78516,78522 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78515 | georg.brandl | 2010-02-28 20:19:17 +0200 (Sun, 28 Feb 2010) | 1 line

  #8030: make builtin type docstrings more consistent: use "iterable" instead of "seq(uence)", use "new" to show that set() always returns a new object.
........
  r78516 | georg.brandl | 2010-02-28 20:26:37 +0200 (Sun, 28 Feb 2010) | 1 line

  The set types can also be called without arguments.
........
  r78522 | ezio.melotti | 2010-03-01 01:59:00 +0200 (Mon, 01 Mar 2010) | 1 line

  #8030: more docstring fix for builtin types.
........
2010-03-01 04:08:34 +00:00
Raymond Hettinger c566df3f55 Issue 7263: Fix set.intersection() docstring. 2009-11-19 00:01:54 +00:00