Commit Graph

200 Commits

Author SHA1 Message Date
Tim Peters 6fc13d9595 Finished transitioning to using gc_refs to track gc objects' states.
This was mostly a matter of adding comments and light code rearrangement.
Upon untracking, gc_next is still set to NULL.  It's a cheap way to
provoke memory faults if calling code is insane.  It's also used in some
way by the trashcan mechanism.
2002-07-02 18:12:35 +00:00
Tim Peters ea405639bf Reserved another gc_refs value for untracked objects. Every live gc
object should now have a well-defined gc_refs value, with clear transitions
among gc_refs states.  As a result, none of the visit_XYZ traversal
callbacks need to check IS_TRACKED() anymore, and those tests were removed.
(They were already looking for objects with specific gc_refs states, and
the gc_refs state of an untracked object can no longer match any other
gc_refs state by accident.)
Added more asserts.
I expect that the gc_next == NULL indicator for an untracked object is
now redundant and can also be removed, but I ran out of time for this.
2002-07-02 00:52:30 +00:00
Tim Peters 19b74c7868 OK, I couldn't stand it <0.5 wink>: removed all uncertainty about what's
in gc_refs, even at the cost of putting back a test+branch in
visit_decref.

The good news:  since gc_refs became utterly tame then, it became
clear that another special value could be useful.  The move_roots() and
move_root_reachable() passes have now been replaced by a single
move_unreachable() pass.  Besides saving a pass over the generation, this
has a better effect:  most of the time everything turns out to be
reachable, so we were breaking the generation list apart and moving it
into into the reachable list, one element at a time.  Now the reachable
stuff stays in the generation list, and the unreachable stuff is moved
instead.  This isn't quite as good as it sounds, since sometimes we
guess wrongly that a thing is unreachable, and have to move it back again.

Still, overall, it yields a significant (but not dramatic) boost in
collection speed.
2002-07-01 03:52:19 +00:00
Tim Peters 93cd83e4ae visit_decref(): Two optimizations.
1. You're not supposed to call this with a NULL argument, although the
   docs could be clearer about that.  The other visit_XYZ() functions
   don't bother to check.  This doesn't either now, although it does
   assert non-NULL-ness now.

2. It doesn't matter whether the object is currently tracked, so don't
   bother checking that either (if it isn't currently tracked, it may
   have some nonsense value in gc_refs, but it doesn't hurt to
   decrement gibberish, and it's cheaper to do so than to make everyone
   test for trackedness).

It would be nice to get rid of the other tests on IS_TRACKED.  Perhaps
trackedness should not be a matter of not being in any gc list, but
should be a matter of being in a new "untracked" gc list.  This list
simply wouldn't be involved in the collection mechanism.  A newly
created object would be put in the untracked list.  Tracking would
simply unlink it and move it into the gen0 list.  Untracking would do
the reverse.  No test+branch needed then.  visit_move() may be vulnerable
then, though, and I don't know how this would work with the trashcan.
2002-06-30 21:31:03 +00:00
Tim Peters 8839617cc9 SF bug #574132: Major GC related performance regression
"The regression" is actually due to that 2.2.1 had a bug that prevented
the regression (which isn't a regression at all) from showing up.  "The
regression" is actually a glitch in cyclic gc that's been there forever.

As the generation being collected is analyzed, objects that can't be
collected (because, e.g., we find they're externally referenced, or
are in an unreachable cycle but have a __del__ method) are moved out
of the list of candidates.  A tricksy scheme uses negative values of
gc_refs to mark such objects as being moved.  However, the exact
negative value set at the start may become "more negative" over time
for objects not in the generation being collected, and the scheme was
checking for an exact match on the negative value originally assigned.
As a result, objects in generations older than the one being collected
could get scanned too, and yanked back into a younger generation.  Doing
so doesn't lead to an error, but doesn't do any good, and can burn an
unbounded amount of time doing useless work.

A test case is simple (thanks to Kevin Jacobs for finding it!):

x = []
for i in xrange(200000):
    x.append((1,))

Without the patch, this ends up scanning all of x on every gen0 collection,
scans all of x twice on every gen1 collection, and x gets yanked back into
gen1 on every gen0 collection.  With the patch, once x gets to gen2, it's
never scanned again until another gen2 collection, and stays in gen2.

Bugfix candidate, although the code has changed enough that I think I'll
need to port it by hand.  2.2.1 also has a different bug that causes
bound method objects not to get tracked at all (so the test case doesn't
burn absurd amounts of time in 2.2.1, but *should* <wink>).
2002-06-30 17:56:40 +00:00
Neil Schemenauer c9051640f8 Fix small bug. The count of objects in all generations younger then the
collected one should be zeroed.
2002-06-28 19:16:04 +00:00
Martin v. Löwis 14f8b4cfcb Patch #568124: Add doc string macros. 2002-06-13 20:33:02 +00:00
Jeremy Hylton 8a13518d25 Remove casts to PyObject * when declaration is for PyObject * 2002-06-06 23:23:55 +00:00
Neil Schemenauer a2b11ecb08 Add IS_TRACKED and IS_MOVED macros. This makes the logic a little more clear. 2002-05-21 15:53:24 +00:00
Neil Schemenauer 2880ae53e6 Move all data for a single generation into a structure. The set of
generations is now an array.  This cleans up some code and makes it easy
to change the number of generations.  Also, implemented a
gc_list_is_empty() function.  This makes the logic a little clearer in
places.  The performance impact of these changes should be negligible.

One functional change is that allocation/collection counters are always
zeroed at the start of a collection.  This should fix SF bug #551915.
This change is too big for back-porting but the minimal patch on SF
looks good for a bugfix release.
2002-05-04 05:35:20 +00:00
Tim Peters fa8efab30f _PyObject_GC_New: Could call PyObject_INIT with a NULL 1st argument.
_PyObject_GC_NewVar:  Could call PyObject_INIT_VAR likewise.

Bugfix candidate.
2002-04-28 01:57:25 +00:00
Neil Schemenauer fec4eb1be1 Allow PyObject_Del to be used as a function designator. Provide binary
compatibility function.

Make PyObject_GC_Track and PyObject_GC_UnTrack functions instead of
trivial macros wrapping functions.  Provide binary compatibility
functions.
2002-04-12 02:41:03 +00:00
Neil Schemenauer b883310d59 Make _PyObject_GC_UnTrack do nothing if WITH_CYCLE_GC is not defined. 2002-03-29 03:04:25 +00:00
Guido van Rossum ff413af605 This is Neil's fix for SF bug 535905 (Evil Trashcan and GC interaction).
The fix makes it possible to call PyObject_GC_UnTrack() more than once
on the same object, and then move the PyObject_GC_UnTrack() call to
*before* the trashcan code is invoked.

BUGFIX CANDIDATE!
2002-03-28 20:34:59 +00:00
Neil Schemenauer 1b0e4fcc29 Use pymalloc for realloc() as well. 2002-03-22 15:41:03 +00:00
Neil Schemenauer dcc819a5c9 Use pymalloc if it's enabled. 2002-03-22 15:33:15 +00:00
Neal Norwitz 2a47c0fa23 Fix spelling mistakes. Bugfix candidates. 2002-01-29 00:53:41 +00:00
Martin v. Löwis f8a6f241b3 Check for NULL return value of PyList_New (follow-up to patch #486743). 2001-12-02 18:31:02 +00:00
Martin v. Löwis 155aad17be Patch #486743: remove bad INCREF, propagate exception in append_objects. 2001-12-02 12:21:34 +00:00
Martin v. Löwis c8fe77bd4c Use identity instead of equality when looking for referrers. Fixes #485781. 2001-11-29 18:08:31 +00:00
Martin v. Löwis 560da62fc7 Rename get_referents to get_referrers. Fixes #483815. 2001-11-24 09:24:51 +00:00
Tim Peters db8656118a has_finalizer(): simplified "if (complicated_bool) 1 else 0" to
"complicated_bool".
2001-11-01 19:35:45 +00:00
Neil Schemenauer a765c120f6 Add has_finalizer predictate function. Use it when deciding which
objects to save in gc.garbage.  This should be the last change needed to
fix SF bug 477059: "__del__ on new classes vs. GC".

Note that this change slightly changes the behavior of the collector.
Before, if a cycle was found that contained instances with __del__
methods then all instance objects in that cycle were saved in
gc.garbage.  Now, only objects with __del__ methods are saved in
gc.garbage.
2001-11-01 17:35:23 +00:00
Guido van Rossum 8cc705eabc SF bug #477059 (my own): __del__ on new classes vs. GC.
When moving objects with a __del__ attribute to a special list, look
for __del__ on new-style classes with the HEAPTYPE flag set as well.
(HEAPTYPE means the class was created by a class statement.)
2001-11-01 14:23:28 +00:00
Neil Schemenauer e8c40cb722 Make the gc.collect() function respect the collection lock. This fixes
SF bug 476129: "gc.collect sometimes hangs".
2001-10-31 23:09:35 +00:00
Guido van Rossum bca8c2ebea Use double curly braces for the generation0/1/2 initializers, to shut
up GCC warnings.
2001-10-12 20:52:48 +00:00
Tim Peters 9e4ca10ce4 SF bug [#467145] Python 2.2a4 build problem on HPUX 11.0.
The platform requires 8-byte alignment for doubles, but the GC header
was 12 bytes and that threw off the natural alignment of the double
members of a subtype of complex.  The fix puts the GC header into a
union with a double as the other member, to force no-looser-than
double alignment of GC headers.  On boxes that require 8-byte alignment
for doubles, this may add pad bytes to the GC header accordingly; ditto
for platforms that *prefer* 8-byte alignment for doubles.  On platforms
that don't care, it shouldn't change the memory layout (because the
size of the old GC header is certainly greater than the size of a double
on all platforms, so unioning with a double shouldn't change size or
alignment on such boxes).
2001-10-11 18:31:31 +00:00
Tim Peters f2a67daca2 Guido suggests, and I agree, to insist that SIZEOF_VOID_P be a power of 2.
This simplifies the rounding in _PyObject_VAR_SIZE, allows to restore the
pre-rounding calling sequence, and allows some nice little simplifications
in its callers.  I'm still making it return a size_t, though.
2001-10-07 03:54:51 +00:00
Tim Peters 6d483d3477 _PyObject_VAR_SIZE: always round up to a multiple-of-pointer-size value.
As Guido suggested, this makes the new subclassing code substantially
simpler.  But the mechanics of doing it w/ C macro semantics are a mess,
and _PyObject_VAR_SIZE has a new calling sequence now.

Question:  The PyObject_NEW_VAR macro appears to be part of the public API.
Regardless of what it expands to, the notion that it has to round up the
memory it allocates is new, and extensions containing the old
PyObject_NEW_VAR macro expansion (which was embedded in the
PyObject_NEW_VAR expansion) won't do this rounding.  But the rounding
isn't actually *needed* except for new-style instances with dict pointers
after a variable-length blob of embedded data.  So my guess is that we do
not need to bump the API version for this (as the rounding isn't needed
for anything an extension can do unless it's recompiled anyway).  What's
your guess?
2001-10-06 21:27:34 +00:00
Tim Peters 406fe3b1c0 Repaired the debug Windows deaths in test_descr, by allocating enough
pad memory to properly align the __dict__ pointer in all cases.

gcmodule.c/objimpl.h, _PyObject_GC_Malloc:
+ Added a "padding" argument so that this flavor of malloc can allocate
  enough bytes for alignment padding (it can't know this is needed, but
  its callers do).

typeobject.c, PyType_GenericAlloc:
+ Allocated enough bytes to align the __dict__ pointer.
+ Sped and simplified the round-up-to-PTRSIZE logic.
+ Added blank lines so I could parse the if/else blocks <0.7 wink>.
2001-10-06 19:04:01 +00:00
Tim Peters 8c18f25850 _PyObject_GC_Malloc(): split a complicated line in two. As is, there was
no way to talk the debugger into showing me how many bytes were being
allocated.
2001-10-06 08:03:20 +00:00
Neil Schemenauer 43411b5683 Make more things internal to this file. Remove
visit_finalizer_reachable since it's the same as visit_reachable.
Rename visit_reachable to visit_move.  Objects can now have the GC type
flag set, reachable by tp_traverse and not be in a GC linked list.  This
should make the collector more robust and easier to use by extension
module writers.  Add memory management functions for container objects
(new, del, resize).
2001-08-30 00:05:51 +00:00
Neil Schemenauer 17e7be60b4 Remove "referents" structure (it's not needed). Check return value
of PyList_Append.
2001-08-10 14:46:47 +00:00
Neil Schemenauer c7c8d8e32d Add get_objects function. This is a low level function (like
get_referents, and is not yet documented in the library manual).
Suggestions for a better name welcome.
2001-08-09 15:58:59 +00:00
Neil Schemenauer 48c7034454 Add get_referents function. Closes SF patch #402925. 2001-08-09 15:38:31 +00:00
Neil Schemenauer b2c2c9e977 - update Neil's email address 2000-10-04 16:34:09 +00:00
Neil Schemenauer 97d723bd62 - do not start collection during processing of an exception 2000-10-04 16:25:07 +00:00
Neil Schemenauer 7760cff294 Fix some long/"l" int/"i" mismatches. Fixes bug #113779. 2000-09-22 22:35:36 +00:00
Neil Schemenauer 544de1effb - Add DEBUG_SAVEALL option. When enabled all garbage objects found by the
collector will be saved in gc.garbage.  This is useful for debugging a
  program that creates reference cycles.

- Fix else statements in gcmodule.c to conform to Python coding standards.
2000-09-22 15:22:38 +00:00
Jeremy Hylton 3263dc2b15 compromise value for threshold0: not too high, not too low 2000-09-05 15:44:50 +00:00
Jeremy Hylton 045946d4ee set the default threshold much higher
we don't need to run gc frequently
2000-09-01 04:01:55 +00:00
Jeremy Hylton b709df3810 refactor __del__ exception handler into PyErr_WriteUnraisable
add sanity check to gc: if an exception occurs during GC, call
PyErr_WriteUnraisable and then call Py_FatalEror.
2000-09-01 02:47:25 +00:00
Jeremy Hylton 0625777b53 apply patch #101362 by Vladimir Marangozov
also initial static debug variable to 0
2000-08-31 15:10:24 +00:00
Vladimir Marangozov f9d20c3786 Neil Schemenauer: GC enable(), disable(), isenabled() interface.
Small stylistic changes by VM:
- is_enabled() -> isenabled()
- static ... Py_<func> -> static ... gc_<func>
2000-08-06 22:45:31 +00:00
Barry Warsaw 35e459c3eb debug_instance(): Use the same %p format directive as with
debug_cycle(), and don't cast the pointer to a long.  Neither needs
the literal `0x' prefix as %p automatically inserts this (on Linux at
least).
2000-07-12 05:18:36 +00:00
Fred Drake cc1be2401e Always use the :funcname part of the format specifier for PyArg_ParseTuple()
so we get better error messages.
2000-07-12 04:42:23 +00:00
Fred Drake b35de5b78a Neil Schemenauer <nascheme@enme.ucalgary.ca>:
Change a cast, intialize a local, and make some sprintf() format strings
type-appropriate (add the "l" to "%d").

Closes SourceForge patch #100737.
2000-07-11 14:37:41 +00:00
Peter Schneider-Kamp 8bc8f0d036 ANSI-fication 2000-07-10 17:15:07 +00:00
Vladimir Marangozov b16714b4d0 Initialize the return value in collect_generations() since it is updated
conditionally in the code.
2000-07-10 05:37:39 +00:00
Jeremy Hylton c5007aa5c3 final patches from Neil Schemenauer for garbage collection 2000-06-30 05:02:53 +00:00