cpython

Commit Graph

Author	SHA1	Message	Date
Jeremy Hylton	7083bb744a	Oops. Return -1 to distinguish error from empty dict. This change probably isn't worth a bug fix. It's unlikely that anyone was calling this method without passing it a real dict.	2004-02-17 20:10:11 +00:00
Raymond Hettinger	0c66967e3d	Simplify previous checkin -- a new function was not needed.	2003-12-13 13:31:55 +00:00
Raymond Hettinger	8f5cdaa784	* Added a new method flag, METH_COEXIST. * Used the flag to optimize set.__contains__(), dict.__contains__(), dict.__getitem__(), and list.__getitem__().	2003-12-13 11:26:12 +00:00
Raymond Hettinger	bc0f2ab9bb	Expose dict_contains() and PyDict_Contains(), which is about 10% faster than PySequence_Contains() and more clearly applicable to dicts. Apply the new function in setobject.c where __contains__ checking is ubiquitous.	2003-11-25 21:12:14 +00:00
Raymond Hettinger	574aa32578	SF patch #798467 : Update docstring of has_key for bool changes (Contributed by George Yoshida.)	2003-09-01 22:12:08 +00:00
Raymond Hettinger	c8d2290c8c	SF patch #729395 : Dictionary tuning Adjust resize argument for dict.update() and dict.copy(). Extends the previous change to dict.__setitem__().	2003-05-07 00:49:40 +00:00
Raymond Hettinger	3539f6b895	SF patch #729395 : Dictionary tuning * Increase dictionary growth rate resulting in more sparse dictionaries, fewer lookup collisions, increased memory use, and better cache performance. For dicts with over 50k entries, keep the current growth rate in case an application is suffering from tight memory constraints. * Set the most common case (no resize) to fall-through the test.	2003-05-05 22:22:10 +00:00
Raymond Hettinger	930427b892	Add a reference to dictnotes.txt. It does no good if you don't know it's there or where to find it.	2003-05-03 06:51:59 +00:00
Raymond Hettinger	1da1dbf458	Renamed PyObject_GenericGetIter to PyObject_SelfIter to more accurately describe what the function does. Suggested by Thomas Wouters.	2003-03-17 19:46:11 +00:00
Raymond Hettinger	0153826964	Created PyObject_GenericGetIter(). Factors out the common case of returning self.	2003-03-17 08:24:35 +00:00
Raymond Hettinger	a3e1e4cd79	SF patch #693753 : fix for bug 639806: default for dict.pop (contributed by Michael Stone.)	2003-03-06 23:54:28 +00:00
Neal Norwitz	0732301738	Add closing ) in comment	2003-02-15 14:45:12 +00:00
Tim Peters	080c88b912	cPickle.c, load_build(): Taught cPickle how to pick apart the optional proto 2 slot state. pickle.py, load_build(): CAUTION: Noted that cPickle's load_build and pickle's load_build really don't do the same things with the state, and didn't before this patch either. cPickle never tries to do .update(), and has no backoff if instance.__dict__ can't be retrieved. There are no tests that can tell the difference, and part of what cPickle's load_build() did looked accidental to me, so I don't know what the true intent is here. pickletester.py, test_pickle.py: Got rid of the hack for exempting cPickle from running some of the proto 2 tests. dictobject.c, PyDict_Next(): documented intended use.	2003-02-15 03:01:11 +00:00
Raymond Hettinger	ea3fdf44a2	SF patch #659536 : Use PyArg_UnpackTuple where possible. Obtain cleaner coding and a system wide performance boost by using the fast, pre-parsed PyArg_Unpack function instead of PyArg_ParseTuple function which is driven by a format string.	2002-12-29 16:33:45 +00:00
Martin v. Löwis	32b4a1ba62	Constify char* API. Fixes #651363 . 2.2 candidate.	2002-12-11 13:21:12 +00:00
Tim Peters	bca1cbc6f8	SF 548651: Fix the METH_CLASS implementation. Most of these patches are from Thomas Heller, with long lines folded by Tim. The change to test_descr.py is from Guido. See the bug report. Not a bugfix candidate -- METH_CLASS is new in 2.3.	2002-12-09 22:56:13 +00:00
Raymond Hettinger	e03e5b1f91	Remove assumption that cls is a subclass of dict. Simplifies the code and gets Just van Rossum's example to work.	2002-12-07 08:10:51 +00:00
Raymond Hettinger	b02bb5ed0a	Replace BadInternalCall with TypeError. Add a test case. Fix whitespace. Just van Rossum showed a weird, but clever way for pure python code to trigger the BadInternalCall. The C code had assumed that calling a class constructor would return an instance of that class; however, classes that abuse __new__ can invalidate that assumption.	2002-12-04 07:32:25 +00:00
Neal Norwitz	ef786ae1a5	Add missing decref	2002-11-27 19:38:00 +00:00
Raymond Hettinger	e33d3df030	SF Patch 643443. Added dict.fromkeys(iterable, value=None), a class method for constructing new dictionaries from sequences of keys.	2002-11-27 07:29:33 +00:00
Just van Rossum	a797d8150d	Patch #642500 with slight modifications: allow keyword arguments in dict() constructor. Example: >>> dict(a=1, b=2) {'a': 1, 'b': 2} >>>	2002-11-23 09:45:04 +00:00
Guido van Rossum	efae8862fe	In doc strings, use 'k in D' rather than D.has_key(k).	2002-09-04 11:29:45 +00:00
Guido van Rossum	45ec02aed1	SF patch 576101, by Oren Tirosh: alternative implementation of interning. I modified Oren's patch significantly, but the basic idea and most of the implementation is unchanged. Interned strings created with PyString_InternInPlace() are now mortal, and you must keep a reference to the resulting string around; use the new function PyString_InternImmortal() to create immortal interned strings.	2002-08-19 21:43:18 +00:00
Jeremy Hylton	938ace69a0	staticforward bites the dust. The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.) I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it. XXX I haven't updated the documentation.	2002-07-17 16:30:39 +00:00
Guido van Rossum	2147df748f	Make StopIteration a sink state. This is done by clearing out the di_dict field when the end of the list is reached. Also make the error ("dictionary changed size during iteration") a sticky state. Also remove the next() method -- one is supplied automatically by PyType_Ready() because the tp_iternext slot is set. That's a good thing, because the implementation given here was buggy (it never raised StopIteration).	2002-07-16 20:30:22 +00:00
Martin v. Löwis	14f8b4cfcb	Patch #568124 : Add doc string macros.	2002-06-13 20:33:02 +00:00
Guido van Rossum	e027d9818f	Add Raymond Hettinger's d.pop(). See SF patch 539949.	2002-04-12 15:11:59 +00:00
Neil Schemenauer	6189b89cc5	PyObject_GC_Del and PyObject_Del can now be used as a function designators. Remove PyMalloc_New.	2002-04-12 02:43:00 +00:00
Guido van Rossum	77f6a65eb0	Add the 'bool' type and its values 'False' and 'True', as described in PEP 285. Everything described in the PEP is here, and there is even some documentation. I had to fix 12 unit tests; all but one of these were printing Boolean outcomes that changed from 0/1 to False/True. (The exception is test_unicode.py, which did a type(x) == type(y) style comparison. I could've fixed that with a single line using issubtype(x, type(y)), but instead chose to be explicit about those places where a bool is expected.) Still to do: perhaps more documentation; change standard library modules to return False/True from predicates.	2002-04-03 22:41:51 +00:00
Tim Peters	1f7df3595a	Remove the CACHE_HASH and INTERN_STRINGS preprocessor symbols.	2002-03-29 03:29:08 +00:00
Guido van Rossum	ff413af605	This is Neil's fix for SF bug 535905 (Evil Trashcan and GC interaction). The fix makes it possible to call PyObject_GC_UnTrack() more than once on the same object, and then move the PyObject_GC_UnTrack() call to before the trashcan code is invoked. BUGFIX CANDIDATE!	2002-03-28 20:34:59 +00:00
Neil Schemenauer	dcc819a5c9	Use pymalloc if it's enabled.	2002-03-22 15:33:15 +00:00
Tim Peters	f582b82fe9	SF bug #491415 PyDict_UpdateFromSeq2() unused PyDict_UpdateFromSeq2(): removed it. PyDict_MergeFromSeq2(): made it public and documented it. PyDict_Merge() docs: updated to reveal <wink> that the second argument can be any mapping object.	2001-12-11 18:51:08 +00:00
Guido van Rossum	dbb53d9918	Fix of SF bug #475877 (Mutable subtype instances are hashable). Rather than tweaking the inheritance of type object slots (which turns out to be too messy to try), this fix adds a __hash__ to the list and dict types (the only mutable types I'm aware of) that explicitly raises an error. This has the advantage that list.__hash__([]) also raises an error (previously, this would invoke object.__hash__([]), returning the argument's address); ditto for dict.__hash__. The disadvantage for this fix is that 3rd party mutable types aren't automatically fixed. This should be added to the rules for creating subclassable extension types: if you don't want your object to be hashable, add a tp_hash function that raises an exception. Also, it's possible that I've forgotten about other mutable types for which this should be done.	2001-12-03 16:32:18 +00:00
Tim Peters	a427a2b8d0	Rename "dictionary" (type and constructor) to "dict".	2001-10-29 22:25:45 +00:00
Tim Peters	4d85953fe6	dictionary() constructor: + Change keyword arg name from "x" to "items". People passing a mapping object can stretch their imaginations <wink>. + Simplify the docstring text.	2001-10-27 18:27:48 +00:00
Tim Peters	1fc240e851	Generalize dictionary() to accept a sequence of 2-sequences. At the outer level, the iterator protocol is used for memory-efficiency (the outer sequence may be very large if fully materialized); at the inner level, PySequence_Fast() is used for time-efficiency (these should always be sequences of length 2). dictobject.c, new functions PyDict_{Merge,Update}FromSeq2. These are wholly analogous to PyDict_{Merge,Update}, but process a sequence-of-2- sequences argument instead of a mapping object. For now, I left these functions file static, so no corresponding doc changes. It's tempting to change dict.update() to allow a sequence-of-2-seqs argument too. Also changed the name of dictionary's keyword argument from "mapping" to "x". Got a better name? "mapping_or_sequence_of_pairs" isn't attractive, although more so than "mosop" <wink>. abstract.h, abstract.tex: Added new PySequence_Fast_GET_SIZE function, much faster than going thru the all-purpose PySequence_Size. libfuncs.tex: - Document dictionary(). - Fiddle tuple() and list() to admit that their argument is optional. - The long-winded repetitions of "a sequence, a container that supports iteration, or an iterator object" is getting to be a PITA. Many months ago I suggested factoring this out into "iterable object", where the definition of that could include being explicit about generators too (as is, I'm not sure a reader outside of PythonLabs could guess that "an iterator object" includes a generator call). - Please check my curly braces -- I'm going blind <0.9 wink>. abstract.c, PySequence_Tuple(): When PyObject_GetIter() fails, leave its error msg alone now (the msg it produces has improved since PySequence_Tuple was generalized to accept iterable objects, and PySequence_Tuple was also stomping on the msg in cases it shouldn't have even before PyObject_GetIter grew a better msg).	2001-10-26 05:06:50 +00:00
Guido van Rossum	9475a2310d	Enable GC for new-style instances. This touches lots of files, since many types were subclassable but had a xxx_dealloc function that called PyObject_DEL(self) directly instead of deferring to self->ob_type->tp_free(self). It is permissible to set tp_free in the type object directly to _PyObject_Del, for non-GC types, or to _PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster, so I'm fearing that our pystone rating is going down again. I'm not sure if doing something like void xxx_dealloc(PyObject *self) { if (PyXxxCheckExact(self)) PyObject_DEL(self); else self->ob_type->tp_free(self); } is any faster than always calling the else branch, so I haven't attempted that -- however those types whose own dealloc is fancier (int, float, unicode) do use this pattern.	2001-10-05 20:51:39 +00:00
Tim Peters	0ab085c4cb	Changed the dict implementation to take "string shortcuts" only when keys are true strings -- no subclasses need apply. This may be debatable. The problem is that a str subclass may very well want to override __eq__ and/or __hash__ (see the new example of case-insensitive strings in test_descr), but go-fast shortcuts for strings are ubiquitous in our dicts (and subclass overrides aren't even looked for then). Another go-fast reason for the change is that PyCheck_StringExact() is a quicker test than PyCheck_String(), and we make such a test on virtually every access to every dict. OTOH, a str subclass may also be perfectly happy using the base str eq and hash, and this change slows them a lot. But those cases are still hypothetical, while Python's own reliance on true-string dicts is not.	2001-09-14 00:25:33 +00:00
Tim Peters	b95ec09a44	Repair typo in comment.	2001-09-02 18:35:54 +00:00
Tim Peters	25786c0851	Make dictionary() a real constructor. Accepts at most one argument, "a mapping object", in the same sense dict.update(x) requires of x (that x has a keys() method and a getitem). Questionable: The other type constructors accept a keyword argument, so I did that here too (e.g., dictionary(mapping={1:2}) works). But type_call doesn't pass the keyword args to the tp_new slot (it passes NULL), it only passes them to the tp_init slot, so getting at them required adding a tp_init slot to dicts. Looks like that makes the normal case (i.e., no args at all) a little slower (the time it takes to call dict.tp_init and have it figure out there's nothing to do).	2001-09-02 08:22:48 +00:00
Neil Schemenauer	e83c00efd0	Use new GC API.	2001-08-29 23:54:21 +00:00
Martin v. Löwis	e3eb1f2b23	Patch #427190 : Implement and use METH_NOARGS and METH_O.	2001-08-16 13:15:00 +00:00
Guido van Rossum	05ac6de2d5	Add PyDict_Merge(a, b, override): PyDict_Merge(a, b, 1) is the same as PyDict_Update(a, b). PyDict_Merge(a, b, 0) does something similar but leaves existing items unchanged.	2001-08-10 20:28:28 +00:00
Tim Peters	6d6c1a35e0	Merge of descr-branch back into trunk.	2001-08-02 04:15:00 +00:00
Barry Warsaw	66a0d1d9b9	dict_update(): Generalize this method so {}.update() accepts any "mapping" object, specifically one that supports PyMapping_Keys() and PyObject_GetItem(). This allows you to say e.g. {}.update(UserDict()) We keep the special case for concrete dict objects, although that seems moderately questionable. OTOH, the code exists and works, so why change that? .update()'s docstring already claims that D.update(E) implies calling E.keys() so it's appropriate not to transform AttributeErrors in PyMapping_Keys() to TypeErrors. Patch eyeballed by Tim.	2001-06-26 20:08:32 +00:00
Tim Peters	c605784174	dict_repr: Reuse one of the int vars (minor code simplification).	2001-06-16 07:52:53 +00:00
Tim Peters	a7259597f1	SF bug 433228: repr(list) woes when len(list) big. Gave Python linear-time repr() implementations for dicts, lists, strings. This means, e.g., that repr(range(50000)) is no longer 50x slower than pprint.pprint() in 2.2 <wink>. I don't consider this a bugfix candidate, as it's a performance boost. Added _PyString_Join() to the internal string API. If we want that in the public API, fine, but then it requires runtime error checks instead of asserts.	2001-06-16 05:11:17 +00:00
Tim Peters	afb6ae8452	Store the mask instead of the size in dictobjects. The mask is more frequently used, and in particular this allows to drop the last remaining obvious time-waster in the crucial lookdict() and lookdict_string() functions. Other changes consist mostly of changing "i < ma_size" to "i <= ma_mask" everywhere.	2001-06-04 21:00:21 +00:00
Tim Peters	453163d842	lookdict: stop more insane core-dump mutating comparison cases. Should be possible to provoke unbounded recursion now, but leaving that to someone else to provoke and repair. Bugfix candidate -- although this is getting harder to backstitch, and the cases it's protecting against are mondo contrived.	2001-06-03 04:54:32 +00:00
Tim Peters	7b5d0afb1e	lookdict: Reduce obfuscating code duplication with a judicious goto. This code is likely to get even hairier to squash core dumps due to mutating comparisons, and it's hard enough to follow without that.	2001-06-03 04:14:43 +00:00
Tim Peters	19b77cfc4b	Finish the dict->string coredump fix. Need sleep. Bugfix candidate.	2001-06-02 08:27:39 +00:00
Tim Peters	23cf6be23c	Coredumpers from Michael Hudson, mutating dicts while printing or converting to string. Critical bugfix candidate -- if you take this seriously <wink>.	2001-06-02 08:02:56 +00:00
Tim Peters	f4b33f61fb	dict_popitem(): Repaired last-second 2.1 comment, which misidentified the true reason for allocating the tuple before checking the dict size.	2001-06-02 05:42:29 +00:00
Tim Peters	eb28ef209e	New collision resolution scheme: no polynomials, simpler, faster, less code, less memory. Tests have uncovered no drawbacks. Christian and Vladimir are the other two people who have burned many brain cells on the dict code in recent years, and they like the approach too, so I'm checking it in without further ado.	2001-06-02 05:27:19 +00:00
Tim Peters	15d4929ae4	Implement an old idea of Christian Tismer's: use polynomial division instead of multiplication to generate the probe sequence. The idea is recorded in Python-Dev for Dec 2000, but that version is prone to rare infinite loops. The value is in getting all the bits of the hash code to participate; and, e.g., this speeds up querying every key in a dict with keys [i << 16 for i in range(20000)] by a factor of 500. Should be equally valuable in any bad case where the high-order hash bits were getting ignored. Also wrote up some of the motivations behind Python's ever-more-subtle hash table strategy.	2001-05-27 07:39:22 +00:00
Martin v. Löwis	cd35306a25	Patch #424335 : Implement string_richcompare, remove string_compare. Use new _PyString_Eq in lookdict_string.	2001-05-24 16:56:35 +00:00
Tim Peters	f8a548c23c	dictresize(): Rebuild small tables if there are any dummies, not just if they're entirely full. Not a question of correctness, but of temporarily misplaced common sense.	2001-05-24 16:26:40 +00:00
Tim Peters	0c6010be75	Jack Jansen hit a bug in the new dict code, reported on python-dev. dictresize() was too aggressive about never ever resizing small dicts. If a small dict is entirely full, it needs to rebuild it despite that it won't actually resize it, in order to purge old dummy entries thus creating at least one virgin slot (lookdict assumes at least one such exists). Also took the opportunity to add some high-level comments to dictresize.	2001-05-23 23:33:57 +00:00
Fred Drake	0c23231f6e	Remove unused variable.	2001-05-22 22:36:52 +00:00
Tim Peters	dea48ec581	SF patch #425242 : Patch which "inlines" small dictionaries. The idea is Marc-Andre Lemburg's, the implementation is Tim's. Add a new ma_smalltable member to dictobjects, an embedded vector of MINSIZE (8) dictentry structs. Short course is that this lets us avoid additional malloc(s) for dicts with no more than 5 entries. The changes are widespread but mostly small. Long course: WRT speed, all scalar operations (getitem, setitem, delitem) on non-empty dicts benefit from no longer needing NULL-pointer checks (ma_table is never NULL anymore). Bulk operations (copy, update, resize, clearing slots during dealloc) benefit in some cases from now looping on the ma_fill count rather than on ma_size, but that was an unexpected benefit: the original reason to loop on ma_fill was to let bulk operations on empty dicts end quickly (since the NULL-pointer checks went away, empty dicts aren't special-cased any more). Special considerations: For dicts that remain empty, this change is a lose on two counts: the dict object contains 8 new dictentry slots now that weren't needed before, and dict object creation also spends time memset'ing these doomed-to-be-unused slots to NULLs. For dicts with one or two entries that never get larger than 2, it's a mix: a malloc()/free() pair is no longer needed, and the 2-entry case gets to use 8 slots (instead of 4) thus decreasing the chance of collision. Against that, dict object creation spends time memset'ing 4 slots that aren't strictly needed in this case. For dicts with 3 through 5 entries that never get larger than 5, it's a pure win: the dict is created with all the space they need, and they never need to resize. Before they suffered two malloc()/free() calls, plus 1 dict resize, to get enough space. In addition, the 8-slot table they ended with consumed more memory overall, because of the hidden overhead due to the additional malloc.
For dicts with 6 or more entries, the ma_smalltable member is wasted space, but then these are large(r) dicts so 8 slots more or less doesn't make much difference. They still benefit all the time from removing ubiquitous dynamic null-pointer checks, and get a small benefit (but relatively smaller the larger the dict) from not having to do two mallocs, two frees, and a resize on the way to getting their sixth entry. All in all it appears a small but definite general win, with larger benefits in specific cases. It's especially nice that it allowed to get rid of several branches, gotos and labels, and overall made the code smaller.	2001-05-22 20:40:22 +00:00
Tim Peters	91a364df17	Bugfix candidate. Two exceedingly unlikely errors in dictresize(): 1. The loop for finding the new size had an off-by-one error at the end (could over-index the polys[] vector). 2. The polys[] vector ended with a 0, apparently intended as a sentinel value but never used as such; i.e., it was never checked, so 0 could have been used as a polynomial. Neither bug could trigger unless a dict grew to 2**30 slots; since that would consume at least 12GB of memory just to hold the dict pointers, I'm betting it's not the cause of the bug Fred's tracking down <wink>.	2001-05-19 07:04:38 +00:00
Tim Peters	1928314ef4	Speed dictresize by collapsing its two passes into one; the reason given in the comments for using two passes was bogus, as the only object that can get decref'ed due to the copy is the dummy key, and decref'ing dummy can't have side effects (for one thing, dummy is immortal! for another, it's a string object, not a potentially dangerous user-defined object).	2001-05-17 22:25:34 +00:00
Tim Peters	342c65e19a	Aggressive reordering of dict comparisons. In case of collision, it stands to reason that me_key is much more likely to match the key we're looking for than to match dummy, and if the key is absent me_key is much more likely to be NULL than dummy: most dicts don't even have a dummy entry. Running instrumented dict code over the test suite and some apps confirmed that matching dummy was 200-300x less frequent than matching key in practice. So this reorders the tests to try the common case first. It can lose if a large dict with many collisions is mostly deleted, not resized, and then frequently searched, but that's hardly a case we should be favoring.	2001-05-13 06:43:53 +00:00
Tim Peters	2f228e75e4	Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask". The comment following used to say: /* We use ~hash instead of hash, as degenerate hash functions, such as for ints <sigh>, can have lots of leading zeros. It's not really a performance risk, but better safe than sorry. 12-Dec-00 tim: so ~hash produces lots of leading ones instead -- what's the gain? */ That is, there was never a good reason for doing it. And to the contrary, as explained on Python-Dev last December, it tended to make the *sum* (i + incr) & mask (which is the first table index examined in case of collision) the same "too often" across distinct hashes. Changing to the simpler "i = hash & mask" reduced the number of string-dict collisions (== # number of times we go around the lookup for-loop) from about 6 million to 5 million during a full run of the test suite (these are approximate because the test suite does some random stuff from run to run). The number of collisions in non-string dicts also decreased, but not as dramatically. Note that this may, for a given dict, change the order (wrt previous releases) of entries exposed by .keys(), .values() and .items(). A number of std tests suffered bogus failures as a result. For dicts keyed by small ints, or (less so) by characters, the order is much more likely to be in increasing order of key now; e.g., >>> d = {} >>> for i in range(10): ... d[i] = i ... >>> d {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9} >>> Unfortunately, people may latch on to that in small examples and draw a bogus conclusion. test_support.py Moved test_extcall's sortdict() into test_support, made it stronger, and imported sortdict into other std tests that needed it. test_unicode.py Excluded cp875 from the "roundtrip over range(128)" test, because cp875 doesn't have a well-defined inverse for unicode("?", "cp875"). See Python-Dev for excruciating details.
Cookie.py Changed various output functions to sort dicts before building strings from them. test_extcall Fiddled the expected-result file. This remains sensitive to native dict ordering, because, e.g., if there are multiple errors in a keyword-arg dict (and test_extcall sets up many cases like that), the specific error Python complains about first depends on native dict ordering.	2001-05-13 00:19:31 +00:00
Tim Peters	4fa58bfac2	Restore dicts' tp_compare slot, and change dict_richcompare to say it doesn't know how to do LE, LT, GE, GT. dict_richcompare can't do the latter any faster than dict_compare can. More importantly, for cmp(dict1, dict2), Python first tries rich compares with EQ, LT, and GT one at a time, even if the tp_compare slot is defined, and dict_richcompare called dict_compare for the latter two because it couldn't do them itself. The result was a lot of wasted calls to dict_compare. Now dict_richcompare gives up at once the times Python calls it with LT and GT from try_rich_to_3way_compare(), and dict_compare is called only once (when Python gets around to trying the tp_compare slot). Continued mystery: despite that this cut the number of calls to dict_compare approximately in half in test_mutants.py, the latter still runs amazingly slowly. Running under the debugger doesn't show excessive activity in the dict comparison code anymore, so I'm guessing the culprit is somewhere else -- but where? Perhaps in the element (key/value) comparison code? We clearly spend a lot of time figuring out how to compare things.	2001-05-10 21:45:19 +00:00
Tim Peters	3918fb2549	Repair typo in comment.	2001-05-10 18:58:31 +00:00
Tim Peters	95bf9390a4	SF bug #422121 Insecurities in dict comparison. Fixed a half dozen ways in which general dict comparison could crash Python (even cause Win98SE to reboot) in the presence of key and/or value comparison routines that mutate the dict during dict comparison. Bugfix candidate.	2001-05-10 08:32:44 +00:00
Tim Peters	e63415ead8	SF patch #421922 : Implement rich comparison for dicts. d1 == d2 and d1 != d2 now work even if the keys and values in d1 and d2 don't support comparisons other than ==, and testing dicts for equality is faster now (especially when inequality obtains).	2001-05-08 04:38:29 +00:00
Guido van Rossum	b1f35bffe5	Michael Hudson pointed out that the code for detecting changes in dictionary size was comparing ma_size, the hash table size, which is always a power of two, rather than ma_used, which changes on each insertion or deletion. Fixed this.	2001-05-02 15:13:44 +00:00
Guido van Rossum	09e563abb4	Add experimental iterkeys(), itervalues(), iteritems() to dict objects. Tests show that iteritems() is 5-10% faster than iterating over the dict and extracting the value with dict[key].	2001-05-01 12:10:21 +00:00
Guido van Rossum	213c7a6aa5	Mondo changes to the iterator stuff, without changing how Python code sees it (test_iter.py is unchanged). - Added a tp_iternext slot, which calls the iterator's next() method; this is much faster for built-in iterators over built-in types such as lists and dicts, speeding up pybench's ForLoop with about 25% compared to Python 2.1. (Now there's a good argument for iterators. ;-) - Renamed the built-in sequence iterator SeqIter, affecting the C API functions for it. (This frees up the PyIter prefix for generic iterator operations.) - Added PyIter_Check(obj), which checks that obj's type has a tp_iternext slot and that the proper feature flag is set. - Added PyIter_Next(obj) which calls the tp_iternext slot. It has a somewhat complex return condition due to the need for speed: when it returns NULL, it may not have set an exception condition, meaning the iterator is exhausted; when the exception StopIteration is set (or a derived exception class), it means the same thing; any other exception means some other error occurred.	2001-04-23 14:08:49 +00:00
Guido van Rossum	59d1d2b434	Iterators phase 1. This comprises: new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER new C API PyObject_GetIter(), calls tp_iter new builtin iter(), with two forms: iter(obj), and iter(function, sentinel) new internal object types iterobject and calliterobject new exception StopIteration new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py) new magic number for .pyc files new special method for instances: __iter__() returns an iterator iteration over dictionaries: "for x in dict" iterates over the keys iteration over files: "for x in file" iterates over lines TODO: documentation test suite decide whether to use a different way to spell iter(function, sentinel) decide whether "for key in dict" is a good idea use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?) speed tuning (make next() a slot tp_next???)	2001-04-20 19:13:02 +00:00
Guido van Rossum	55ad67d74d	Oops. Removed dictiter_new decl that wasn't supposed to go in yet.	2001-04-20 16:52:06 +00:00
Guido van Rossum	0dbb4fba4c	Implement, test and document "key in dict" and "key not in dict". I know some people don't like this -- if it's really controversial, I'll take it out again. (If it's only Alex Martelli who doesn't like it, that doesn't count as "real controversial" though. :-) That's why this is a separate checkin from the iterators stuff I'm about to check in next.	2001-04-20 16:50:40 +00:00
Guido van Rossum	e04eaec5b6	Tim pointed out a remaining vulnerability in popitem(): the PyTuple_New() could conceivably clear the dict, so move the test for an empty dict after the tuple allocation. It means that we waste time allocating and deallocating a 2-tuple when the dict is empty, but who cares. It also means that when the dict is empty and there's no memory to allocate a 2-tuple, we raise MemoryError, not KeyError -- but that may actually be a good idea: if there's no room for a lousy 2-tuple, what are the chances that there's room for a KeyError instance?	2001-04-16 00:02:32 +00:00
Guido van Rossum	a4dd011259	Tentative fix for a problem that Tim discovered at the last moment, and reported to python-dev: because we were calling dict_resize() in PyDict_Next(), and because GC's dict_traverse() uses PyDict_Next(), and because PyTuple_New() can cause GC, and because dict_items() calls PyTuple_New(), it was possible for dict_items() to have the dict resized right under its nose. The solution is convoluted, and touches several places: keys(), values(), items(), popitem(), PyDict_Next(), and PyDict_SetItem(). There are two parts to it. First, we no longer call dict_resize() in PyDict_Next(), which seems to solve the immediate problem. But then PyDict_SetItem() must have a different policy about when it calls dict_resize(), because we want to guarantee (e.g. for an algorithm that Jeremy uses in the compiler) that you can loop over a dict using PyDict_Next() and make changes to the dict as long as those changes are only value replacements for existing keys using PyDict_SetItem(). This is done by resizing after the insertion instead of before, and by remembering the size before we insert the item, and if the size is still the same, we don't bother to even check if we might need to resize. An additional detail is that if the dict starts out empty, we must still resize it before the insertion. That was the first part. :-) The second part is to make keys(), values(), items(), and popitem() safe against side effects on the dict caused by allocations, under the assumption that if the GC can cause arbitrary Python code to run, it can cause other threads to run, and it's not inconceivable that our dict could be resized -- it would be insane to write code that relies on this, but not all code is sane. Now, I have this nagging feeling that the loops in lookdict probably are blissfully assuming that doing a simple key comparison does not change the dict's size. This is not necessarily true (the keys could be class instances after all). But that's a battle for another day.	2001-04-15 22:16:26 +00:00
Tim Peters	6783070ebf	Make PyDict_Next safe to use for loops that merely modify the values associated with existing dict keys. This is a variant of part of Michael Hudson's patch #409864 "lazy fix for Pings bizarre scoping crash".	2001-03-21 19:23:56 +00:00
Guido van Rossum	b932420cc7	Rich comparisons: - Use PyObject_RichCompareBool() when comparing keys; this makes the error handling cleaner. - There were two implementations for dictionary comparison, an old one (#ifdef'ed out) and a new one. Got rid of the old one, which was abandoned years ago. - In the characterize() function, part of dictionary comparison, use PyObject_RichCompareBool() to compare keys and values instead. But continue to use PyObject_Compare() for comparing the final (deciding) elements. - Align the comments in the type struct initializer. Note: I don't implement rich comparison for dictionaries -- there doesn't seem to be much to be gained. (The existing comparison already decides that shorter dicts are always smaller than longer dicts.)	2001-01-18 00:39:02 +00:00
Jeremy Hylton	1fb6088e86	dict_update has two boundary conditions: a.update(a) and a.update({}) Added test for second one.	2001-01-03 22:34:59 +00:00
Tim Peters	f7f88b11e4	Add long-overdue docstrings to dict methods.	2000-12-13 23:18:45 +00:00
Tim Peters	f1c7c884b3	Typo repair in comments. Fell for GregS's .popitem() poke.	2000-12-13 19:58:25 +00:00
Tim Peters	ea8f2bf9ca	Bring comments up to date (e.g., they still said the table had to be a prime size, which is in fact never true anymore ...).	2000-12-13 01:02:46 +00:00
Guido van Rossum	ba6ab84e73	Add popitem() -- SF patch #102733 .	2000-12-12 22:02:18 +00:00
Moshe Zadka	5725d1eb03	Backing out my changes. Improved version coming soon to a Source Forge near you!	2000-11-30 19:30:21 +00:00
Moshe Zadka	1a62750eda	Added .first{item,value,key}() to dictionaries. Complete with docos and tests. OKed by Guido.	2000-11-30 12:31:03 +00:00
Guido van Rossum	8586991099	REMOVED all CWI, CNRI and BeOpen copyright markings. This should match the situation in the 1.6b1 tree.	2000-09-01 23:29:29 +00:00
Fred Drake	1bff34ab96	Slight performance hack that also avoids requiring the existence of thread state for dictionaries that have only been indexed by string keys. See the comments in SourceForge for more. This closes SourceForge patch #101309.	2000-08-31 19:31:38 +00:00
Fred Drake	c88b99ce06	Clear errors raised by PyObject_Compare() without losing any existing exception context. This avoids improperly propagating errors raised by a user-defined __cmp__() by a subsequent lookup operation. This patch does not include the performance enhancement patch for dictionaries with string keys only; that will be checked in separately. This closes SourceForge patch #101277 and bug #112558.	2000-08-31 19:04:07 +00:00
Guido van Rossum	164452cec4	Barry's patch to implement the new setdefault() method.	2000-08-08 16:12:54 +00:00
Thomas Wouters	7889010731	Miscellaneous ANSIfications. I'm assuming here 'main' should take (int, char**) and return an int even on PC platforms. If not, please fix PC/utils/makesrc.c ;-P	2000-07-22 19:25:51 +00:00
Thomas Wouters	7e47402264	Spelling fixes supplied by Rob W. W. Hooft. All these are fixes in either comments, docstrings or error messages. I fixed two minor things in test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't"). There is a minor style issue involved: Guido seems to have preferred English grammar (behaviour, honour) in a couple places. This patch changes that to American, which is the more prominent style in the source. I prefer English myself, so if English is preferred, I'd be happy to supply a patch myself ;)	2000-07-16 12:04:32 +00:00
Tim Peters	1f5871e834	Removed Py_PROTO and switched to ANSI C declarations in the dict implementation. This was really to test whether my new CVS+SSH setup is more usable than the old one -- and turns out it is (for whatever reason, it was impossible to do a commit before that involved more than one directory).	2000-07-04 17:44:48 +00:00
Guido van Rossum	4cc6ac7b87	Neil Schemenauer: small fixes for GC	2000-07-01 01:00:38 +00:00
Guido van Rossum	ffcc3813d8	Change copyright notice - 2nd try.	2000-06-30 23:58:06 +00:00
Guido van Rossum	fd71b9e9d4	Change copyright notice.	2000-06-30 23:50:40 +00:00
Jeremy Hylton	c5007aa5c3	final patches from Neil Schemenauer for garbage collection	2000-06-30 05:02:53 +00:00
Jeremy Hylton	d08b4c4524	part 2 of Neil Schemenauer's GC patches: This patch modifies the type structures of objects that participate in GC. The object's tp_basicsize is increased when GC is enabled. GC information is prefixed to the object to maintain binary compatibility. GC objects also define the tp_flag Py_TPFLAGS_GC.	2000-06-23 19:37:02 +00:00
Jeremy Hylton	8caad49c30	Round 1 of Neil Schemenauer's GC patches: This patch adds the type methods traverse and clear necessary for GC implementation.	2000-06-23 14:18:11 +00:00
Guido van Rossum	b18618dab7	Vladimir Marangozov's long-awaited malloc restructuring. For more comments, read the patches@python.org archives. For documentation read the comments in mymalloc.h and objimpl.h. (This is not exactly what Vladimir posted to the patches list; I've made a few changes, and Vladimir sent me a fix in private email for a problem that only occurs in debug mode. I'm also holding back on his change to main.c, which seems unnecessary to me.)	2000-05-03 23:44:39 +00:00

201 Commits