Commit Graph

80 Commits

Author SHA1 Message Date
Tim Peters e70ddf3a99 Widespread, but mostly in _PyMalloc_Malloc: optimize away all expensive
runtime multiplications and divisions, via the scheme developed with
Vladimir Marangozov on Python-Dev.  The pool_header struct loses its
capacity member, but gains nextoffset and maxnextoffset members; this
still leaves it at 32 bytes on a 32-bit box (it has to be padded to a
multiple of 8 bytes).
2002-04-05 04:32:29 +00:00
Tim Peters b7265dbe3e _PyMalloc_Realloc(): removed a now-pointless cast. 2002-04-04 05:08:31 +00:00
Tim Peters 84c1b97467 _PyMalloc_{Malloc, Realloc}: Strive to meet the doc's promises about
what these do given a 0 size argument.  This is so that when pymalloc
is enabled, we don't need to wrap pymalloc calls in goofy little
routines special-casing 0.  Note that it's virtually impossible to meet
the doc's promise that malloc(0) will never return NULL; this makes a
best effort, but not an insane effort.  The code does promise that
realloc(not-NULL, 0) will never return NULL (malloc(0) is much harder).

_PyMalloc_Realloc:  Changed to take over all requests for 0 bytes, and
rearranged to be a little quicker in expected cases.

All over the place:  when resorting to the platform allocator, call
free/malloc/realloc directly, without indirecting thru macros.  This
should avoid needing a nightmarish pile of #ifdef-ery if PYMALLOC_DEBUG
is changed so that pymalloc takes over all Py(Mem, Object} memory
operations (which would add useful debugging info to PyMem_xyz
allocations too).
2002-04-04 04:44:32 +00:00
Tim Peters 6169f09d64 Fixed errors in two comments. 2002-04-01 20:12:59 +00:00
Tim Peters 338e010b45 Restructured my pool-management overview in terms of the three
possible pool states.  I think it's much clearer now.

Added a new long overdue block-management overview comment block.

I believe the comments are in good shape now.

Added two comments about possible small optimizations (one getting rid
of runtime multiplications at the cost of a new pool_header member; the
other getting rid of runtime divisions and the pool_header capacity
member, at the cost of a static const vector of 32 uints).
2002-04-01 19:23:44 +00:00
Tim Peters 7ccfadf3a8 New PYMALLOC_DEBUG function void _PyMalloc_DebugDumpStats(void).
This displays stats about the # of arenas, pools, blocks and bytes, to
stderr, both used and reserved but unused.

CAUTION:  Because PYMALLOC_DEBUG is on, the debug malloc routine adds
16 bytes to each request.  This makes each block appear two size classes
higher than it would be if PYMALLOC_DEBUG weren't on.

So far, playing with this confirms the obvious:  there's a lot of activity
in the "small dict" size class, but nothing in the core makes any use of
the 8-byte or 16-byte classes.
2002-04-01 06:04:21 +00:00
Tim Peters 57b17ad6ae Add one more assert that indirectly interlocking conditions are consistent
with each other.
2002-03-31 02:59:48 +00:00
Tim Peters 4c5be0ce09 Fixed an error in a new assert. 2002-03-31 02:52:29 +00:00
Tim Peters b1da050131 Fixed a typo in a new comment. 2002-03-31 02:51:40 +00:00
Tim Peters 2c95c99a64 _PyMalloc_Free(): As was already done for _PyMalloc_Malloc, rearranged
the code so that the most frequent cases come first.  Added comments.
Found a hidden assumption that a pool contains room for at least two
blocks, and added an assert to catch a violation if it ever happens in
a place where that matters.  Gave the normal "I allocated this block"
case a longer basic block to work with before it has to do its first
branch (via breaking apart an embedded assignment in an "if", and
hoisting common code out of both branches).
2002-03-31 02:18:01 +00:00
Tim Peters 1e16db6d3b Added a long-overdue comment block giving an overview of pool operations
and terminology, plus explanation of some extreme obscurities.
2002-03-31 01:05:22 +00:00
Tim Peters c2ce91af5f It's once again thought safe to call the pymalloc free/realloc with an
address obtained from system malloc/realloc without holding the GIL.

When the vector of arena base addresses has to grow, the old vector is
deliberately leaked.  This makes "stale" x-thread references safe.
arenas and narenas are also declared volatile, and changed in an order
that prevents a thread from picking up a value of narenas too large
for the value of arenas it sees.

Added more asserts.

Fixed an old inaccurate comment.

Added a comment explaining why it's safe to call pymalloc free/realloc
with an address obtained from system malloc/realloc even when arenas is
still NULL (this is obscure, since the ADDRESS_IN_RANGE macro
appears <wink> to index into arenas).
2002-03-30 21:36:04 +00:00
Tim Peters 7b85b4aa7f new_arena(): In error cases, reset the number of available pools to 0.
Else the pymalloc malloc will go insane the next time it's called.
2002-03-30 10:42:09 +00:00
Tim Peters 1d99af8d69 Changed the #-of-arenas counters to uints -- no need to be insane about
this.  But added an overflow check just in case there is.

Got rid of the ushort macro.  It wasn't used anymore (it was only used
in the no-longer-exists off_t macro), and there's no plausible use for it.
2002-03-30 10:35:09 +00:00
Tim Peters df4d1377ed Turns out the off_t macro isn't used anymore, so got rid of it. 2002-03-30 07:07:24 +00:00
Tim Peters 3c83df2047 Now that we're no longer linking arenas together, there's no need to
waste the first pool if malloc happens to return a pool-aligned address.

This means the number of pools per arena can now vary by 1.  Unfortunately,
the code counted up from 0 to a presumed constant number of pools.  So
changed the increasing "watermark" counter to a decreasing "nfreepools"
counter instead, and fiddled various stuff accordingly.  This also allowed
getting rid of two more macros.

Also changed the code to align the first address to a pool boundary
instead of a page boundary.  These are two parallel sets of macro #defines
that happen to be identical now, but the page macros are in theory more
restrictive (bigger), and there's simply no reason I can see that it
wasn't aligning to the less restrictive pool size all along (the code
only relies on pool alignment).

Hmm.  The "page size" macros aren't used for anything *except* defining
the pool size macros, and the comments claim the latter isn't necessary.
So this has the feel of a layer of indirection that doesn't serve a
purpose; should probably get rid of the page macros now.
2002-03-30 07:04:41 +00:00
Tim Peters 12300686ca Retract the claim that this is always safe if PyMem_{Del, DEL, Free, FREE}
are called without the GIL.  It's incredibly unlikely to fail, but I can't
make this bulletproof without either adding a lock for exclusion, or
giving up on growing the arena base-address vector (it would be safe if
this were a static array).
2002-03-30 06:20:23 +00:00
Tim Peters d97a1c008c Lots of changes:
+ A new scheme for determining whether an address belongs to a pymalloc
  arena.  This should be 100% reliable.  The poolp->pooladdr and
  poolp->magic members are gone.  A new poolp->arenaindex member takes
  their place.  Note that the pool header overhead doesn't actually
  shrink, though, since the header is padded to a multiple of 8 bytes.

+ _PyMalloc_Free and _PyMalloc_Realloc should now be safe to call for
  any legit address, whether obtained from a _PyMalloc function or from
  the system malloc/realloc.  It should even be safe to call
   _PyMalloc_Free when *not* holding the GIL, provided that the passed-in
  address was obtained from system malloc/realloc.  Since this is
  accomplished without any locks, you better believe the code is subtle.
  I hope it's sufficiently commented.

+ The above implies we don't need the new PyMalloc_{New, NewVar, Del}
  API anymore, and could switch back to PyObject_XXX without breaking
  existing code mixing PyObject_XXX with PyMem_{Del, DEL, Free, FREE}.
  Nothing is done here about that yet, and I'd like to see this new
  code exercised more first.

+ The small object threshhold is boosted to 256 (the max).  We should
  play with that some more, but the old 64 was way too small for 2.3.

+ Getting a new arena is now done via new function new_arena().

+ Removed some unused macros, and squashed out some macros that were
  used only once to define other macros.

+ Arenas are no longer linked together.  A new vector of arena base
  addresses had to be created anyway to make address classification
  bulletproof.

+ A lot of the patch size is an illusion:  given the way address
  classification works now, it was more convenient to switch the
  sense of the prime "if" tests in the realloc and free functions,
  so the "if" and "else" blocks got swapped.

+ Assorted minor code, comment and whitespace cleanup.

Back to the Windows installer <wink>.
2002-03-30 06:09:22 +00:00
Neil Schemenauer bd02b14255 Add missing "void" to function. 2002-03-28 21:05:38 +00:00
Tim Peters d1139e043c PYMALLOC_DEBUG routines: The "check API family" gimmick was going nowhere
fast, and just cluttered the code.  Get rid of it for now.  If a compelling
case can be made for it, easy to restore it later.
2002-03-28 07:32:11 +00:00
Tim Peters e085017ab7 _PyMalloc_DebugRealloc(): simplify decl of "fresh".
Assorted:  bump the serial number via a trivial new bumpserialno()
function.  The point is to give a single place to set a breakpoint when
waiting for a specific serial number.
2002-03-24 00:34:21 +00:00
Tim Peters 62c06ba6a9 Minor code cleanup -- no semantic changes. 2002-03-23 22:28:18 +00:00
Tim Peters ddea208be9 Give Python a debug-mode pymalloc, much as sketched on Python-Dev.
When WITH_PYMALLOC is defined, define PYMALLOC_DEBUG to enable the debug
allocator.  This can be done independent of build type (release or debug).
A debug build automatically defines PYMALLOC_DEBUG when pymalloc is
enabled.  It's a detected error to define PYMALLOC_DEBUG when pymalloc
isn't enabled.

Two debugging entry points defined only under PYMALLOC_DEBUG:

+ _PyMalloc_DebugCheckAddress(const void *p) can be used (e.g., from gdb)
  to sanity-check a memory block obtained from pymalloc.  It sprays
  info to stderr (see next) and dies via Py_FatalError if the block is
  detectably damaged.

+ _PyMalloc_DebugDumpAddress(const void *p) can be used to spray info
  about a debug memory block to stderr.

A tiny start at implementing "API family" checks isn't good for
anything yet.

_PyMalloc_DebugRealloc() has been optimized to do little when the new
size is <= old size.  However, if the new size is larger, it really
can't call the underlying realloc() routine without either violating its
contract, or knowing something non-trivial about how the underlying
realloc() works.  A memcpy is always done in this case.

This was a disaster for (and only) one of the std tests:  test_bufio
creates single text file lines up to a million characters long.  On
Windows, fileobject.c's get_line() uses the horridly funky
getline_via_fgets(), which keeps growing and growing a string object
hoping to find a newline.  It grew the string object 1000 bytes each
time, so for a million-character string it took approximately forever
(I gave up after a few minutes).

So, also:

fileobject.c, getline_via_fgets():  When a single line is outrageously
long, grow the string object at a mildly exponential rate, instead of
just 1000 bytes at a time.

That's enough so that a debug-build test_bufio finishes in about 5 seconds
on my Win98SE box.  I'm curious to try this on Win2K, because it has very
different memory behavior than Win9X, and test_bufio always took a factor
of 10 longer to complete on Win2K.  It *could* be that the endless
reallocs were simply killing it on Win2K even in the release build.
2002-03-23 10:03:50 +00:00
Tim Peters ce7fb9b515 Just whitespace fiddling. 2002-03-23 00:28:57 +00:00
Tim Peters 1221c0a435 Build obmalloc.c directly instead of #include'ing from object.c.
Also move all _PyMalloc_XXX entry points into obmalloc.c.

The Windows build works fine.
The Unix build is changed here (Makefile.pre.in), but not tested.
No other platform's build process has been fiddled.
2002-03-23 00:20:15 +00:00
Neil Schemenauer 558ba52f10 Remove malloc hooks. 2002-03-22 23:20:15 +00:00
Neil Schemenauer 25f3dc21b5 Drop the PyCore_* memory API. 2002-03-18 21:06:21 +00:00
Neil Schemenauer 11f5be8d88 Simpilify PyCore_* macros by assuming the function prototypes for
malloc() and free() don't change.
2002-03-18 18:13:41 +00:00
Tim Peters b2336529ef Identifiers matching _[A-Z_]\w* are reserved for C implementations.
May or may not be related to bug 407680 (obmalloc.c - looks like it's
corrupted).  This repairs the illegal vrbl names, but leaves a pile of
illegal macro names (_THIS_xxx, _SYSTEM_xxx, _SET_HOOKS, _FETCH_HOOKS).
2001-03-11 18:36:13 +00:00
Neil Schemenauer a35c688055 Add Vladimir Marangozov's object allocator. It is disabled by default. This
closes SF patch #401229.
2001-02-27 04:45:05 +00:00