Bump the blocksize up from 62 to 64 to speed up the modulo calculation.
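
To illustrate why a power-of-two block length helps, here is a minimal Python sketch. It is not the C code in deque_item(); the function names are hypothetical and it ignores the deque's left-index offset. It only shows that, for a power of two, the division and modulo used to locate an element reduce to a shift and a mask.

```python
BLOCKLEN = 64  # power of two, as set by this commit


def locate_divmod(i, blocklen=BLOCKLEN):
    """Return (block number, offset in block) using division and modulo."""
    return divmod(i, blocklen)


def locate_bitwise(i, blocklen=BLOCKLEN):
    """Same result using a shift and a mask; valid only for powers of two."""
    if blocklen <= 0 or (blocklen & (blocklen - 1)) != 0:
        raise ValueError("blocklen must be a power of two")
    shift = blocklen.bit_length() - 1
    return i >> shift, i & (blocklen - 1)


# Both approaches agree for every non-negative index.
for i in range(1000):
    assert locate_divmod(i) == locate_bitwise(i)
```

With the old value of 62 the bitwise form is not available, so a real division is needed.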

Remove the old comment suggesting that it was desirable to have
blocksize+2 as a multiple of the cache line length.  That would
have made sense only if the block structure start point was always
aligned to a cache line boundary.  However, the memory allocations
are 16 byte aligned, so we don't really have control over whether
the struct spills across cache line boundaries.
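
The alignment argument can be checked with a small sketch. The 64-byte cache line and 8-byte pointer are assumptions (typical for a 64-bit build, not stated in the commit); the 16-byte alignment comes from the message above. The helper names are hypothetical.

```python
CACHE_LINE = 64  # assumed cache line size in bytes
POINTER = 8      # assumed pointer size in bytes (64-bit build)
ALIGNMENT = 16   # allocator alignment stated in the commit message


def cache_lines_touched(start, size, line=CACHE_LINE):
    """Count the cache lines covered by bytes [start, start + size)."""
    first = start // line
    last = (start + size - 1) // line
    return last - first + 1


def block_size(blocklen, pointer=POINTER):
    """Bytes in a block: blocklen data pointers plus two link pointers."""
    return (blocklen + 2) * pointer


for blocklen in (62, 64):
    size = block_size(blocklen)
    # Try every 16-byte-aligned start offset within one cache line.
    counts = [cache_lines_touched(start, size)
              for start in range(0, CACHE_LINE, ALIGNMENT)]
    print(blocklen, size, counts)
# prints:
# 62 512 [8, 9, 9, 9]
# 64 528 [9, 9, 9, 9]
```

Under these assumptions, a 62-entry block fits exactly into 8 cache lines only when it happens to start on a cache line boundary; at the other three possible 16-byte offsets it spans 9 lines, the same as a 64-entry block.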
Raymond Hettinger 2015-02-26 23:21:29 -08:00
parent b1e6e57a17
commit daf57f25e5
2 changed files with 4 additions and 7 deletions

@@ -542,7 +542,7 @@ class TestBasic(unittest.TestCase):
     @support.cpython_only
     def test_sizeof(self):
-        BLOCKLEN = 62
+        BLOCKLEN = 64
         basesize = support.calcobjsize('2P4nlP')
         blocksize = struct.calcsize('2P%dP' % BLOCKLEN)
         self.assertEqual(object.__sizeof__(deque()), basesize)
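
As a rough illustration of what the test's blocksize expression evaluates to, the sketch below applies the same `'2P%dP'` format (two link pointers plus BLOCKLEN data pointers) to the old and new block lengths. The helper name `block_nbytes` is hypothetical and not part of the test suite.

```python
import struct


def block_nbytes(blocklen):
    """Size in bytes of two link pointers plus blocklen data pointers."""
    return struct.calcsize('2P%dP' % blocklen)


pointer = struct.calcsize('P')  # native pointer size on this platform
for blocklen in (62, 64):
    nbytes = block_nbytes(blocklen)
    # Print the block length, its size in bytes, and its size in pointers.
    print(blocklen, nbytes, nbytes // pointer)
```

On a build with 8-byte pointers this gives 512 bytes for the old block and 528 bytes for the new one, so the change adds two pointers per block.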

@@ -10,14 +10,11 @@
 /* The block length may be set to any number over 1. Larger numbers
  * reduce the number of calls to the memory allocator, give faster
  * indexing and rotation, and reduce the link::data overhead ratio.
  *
- * Ideally, the block length will be set to two less than some
- * multiple of the cache-line length (so that the full block
- * including the leftlink and rightlink will fit neatly into
- * cache lines).
+ * Making the block length a power of two speeds-up the modulo
+ * calculation in deque_item().
  */
-#define BLOCKLEN 62
+#define BLOCKLEN 64
 #define CENTER ((BLOCKLEN - 1) / 2)
 /* A `dequeobject` is composed of a doubly-linked list of `block` nodes.