* Removed the ifilter flag wart by splitting it into two simpler functions.
* Fixed comment tabbing in C code.
* Factored module start-up code into a loop.

Documentation:
* Re-wrote introduction.
* Added examples for quantifiers.
* Simplified python equivalent for islice().
* Documented split of ifilter().

Sets.py:
* Replace old ifilter() usage with new.
Raymond Hettinger 2003-02-09 06:40:58 +00:00
parent cb3319f61e
commit 60eca9331a
4 changed files with 354 additions and 193 deletions
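
As a rough illustration of the first item in the commit message, the split of the flagged `ifilter()` into two functions can be sketched in pure Python. This is a minimal sketch modelled on the "Equivalent to" listings in the documentation diff below, not the C implementation. The helper `is_even` and the use of `bool` as the default predicate are assumptions made for the example.

```python
# Pure-Python sketch of the two functions that replace the old
# ifilter(predicate, iterable, invert) signature.

def ifilter(predicate, iterable):
    """Yield the items for which predicate(item) is true."""
    if predicate is None:
        predicate = bool  # assumption: treat None as "test the item itself"
    for x in iterable:
        if predicate(x):
            yield x


def ifilterfalse(predicate, iterable):
    """Yield the items for which predicate(item) is false."""
    if predicate is None:
        predicate = bool
    for x in iterable:
        if not predicate(x):
            yield x


def is_even(x):
    # Hypothetical predicate, mirroring isEven in the test-suite diff.
    return x % 2 == 0


evens = list(ifilter(is_even, range(6)))
odds = list(ifilterfalse(is_even, range(6)))
truthy = list(ifilter(None, [0, 1, 0, 2, 0]))
falsy = list(ifilterfalse(None, [0, 1, 0, 2, 0]))
```

A call that used to read `ifilter(pred, seq, True)` becomes `ifilterfalse(pred, seq)`, which is the substitution applied to Sets.py in this commit.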


@ -12,45 +12,43 @@ This module implements a number of iterator building blocks inspired
by constructs from the Haskell and SML programming languages. Each
has been recast in a form suitable for Python.
With the advent of iterators and generators in Python 2.3, each of
these tools can be expressed easily and succinctly in pure python.
Rather than duplicating what can already be done, this module emphasizes
providing value in other ways:
The module standardizes a core set of fast, memory efficient tools
that are useful by themselves or in combination. Standardization helps
avoid the readability and reliability problems which arise when many
different individuals create their own slightly varying implementations,
each with their own quirks and naming conventions.
\begin{itemize}
The tools are designed to combine readily with one another. This makes
it easy to construct more specialized tools succinctly and efficiently
in pure Python.
\item Instead of constructing an over-specialized toolset, this module
provides basic building blocks that can be readily combined.
For instance, SML provides a tabulation tool: \code{tabulate(\var{f})}
which produces a sequence \code{f(0), f(1), ...}. This toolbox
provides \function{imap()} and \function{count()} which can be combined
to form \code{imap(\var{f}, count())} and produce an equivalent result.
For instance, SML provides a tabulation tool: \code{tabulate(\var{f})}
which produces a sequence \code{f(0), f(1), ...}. This toolbox
takes a different approach of providing \function{imap()} and
\function{count()} which can be combined to form
\code{imap(\var{f}, count())} and produce an equivalent result.
Whether cast in pure python form or C code, tools that use iterators
are more memory efficient (and faster) than their list based counterparts.
Adopting the principles of just-in-time manufacturing, they create
data when and where needed instead of consuming memory with the
computer equivalent of ``inventory''.
\item Some tools were dropped because they offer no advantage over their
pure python counterparts or because their behavior was too
surprising.
Some tools were omitted from the module because they offered no
advantage over their pure python counterparts or because their behavior
was too surprising.
For instance, SML provides a tool: \code{cycle(\var{seq})} which
loops over the sequence elements and then starts again when the
sequence is exhausted. The surprising behavior is the need for
significant auxiliary storage (unusual for iterators). Also, it
is trivially implemented in python with almost no performance
penalty.
For instance, SML provides a tool: \code{cycle(\var{seq})} which
loops over the sequence elements and then starts again when the
sequence is exhausted. The surprising behavior is the need for
significant auxiliary storage (which is unusual for an iterator).
If needed, the tool is readily constructible using pure Python.
\item Another source of value comes from standardizing a core set of tools
to avoid the readability and reliability problems that arise when many
different individuals create their own slightly varying implementations
each with their own quirks and naming conventions.
\item Whether cast in pure python form or C code, tools that use iterators
are more memory efficient (and faster) than their list based counterparts.
Adopting the principles of just-in-time manufacturing, they create
data when and where needed instead of consuming memory with the
computer equivalent of ``inventory''.
\end{itemize}
Other tools are being considered for inclusion in future versions of the
module. For instance, the function
\function{chain(\var{it0}, \var{it1}, ...)} would return elements from
the first iterator until it was exhausted and then move on to each
successive iterator. The module author welcomes suggestions for other
basic building blocks.
\begin{seealso}
\seetext{The Standard ML Basis Library,
@ -107,24 +105,36 @@ by functions or loops that truncate the stream.
\end{verbatim}
\end{funcdesc}
\begin{funcdesc}{ifilter}{predicate, iterable \optional{, invert}}
\begin{funcdesc}{ifilter}{predicate, iterable}
Make an iterator that filters elements from iterable returning only
those for which the predicate is \code{True}. If
\var{invert} is \code{True}, then reverse the process and pass through
only those elements for which the predicate is \code{False}.
If \var{predicate} is \code{None}, return the items that are true
(or false if \var{invert} has been set). Equivalent to:
those for which the predicate is \code{True}.
If \var{predicate} is \code{None}, return the items that are true.
Equivalent to:
\begin{verbatim}
def ifilter(predicate, iterable, invert=False):
iterable = iter(iterable)
while True:
x = iterable.next()
def ifilter(predicate, iterable):
if predicate is None:
b = bool(x)
else:
b = bool(predicate(x))
if not invert and b or invert and not b:
def predicate(x):
return x
for x in iterable:
if predicate(x):
yield x
\end{verbatim}
\end{funcdesc}
\begin{funcdesc}{ifilterfalse}{predicate, iterable}
Make an iterator that filters elements from iterable returning only
those for which the predicate is \code{False}.
If \var{predicate} is \code{None}, return the items that are false.
Equivalent to:
\begin{verbatim}
def ifilterfalse(predicate, iterable):
if predicate is None:
def predicate(x):
return x
for x in iterable:
if not predicate(x):
yield x
\end{verbatim}
\end{funcdesc}
@ -169,20 +179,16 @@ by functions or loops that truncate the stream.
\begin{verbatim}
def islice(iterable, *args):
iterable = iter(iterable)
s = slice(*args)
next = s.start or 0
stop = s.stop
step = s.step or 1
cnt = 0
while True:
while cnt < next:
dummy = iterable.next()
cnt += 1
for cnt, element in enumerate(iterable):
if cnt < next:
continue
if cnt >= stop:
break
yield iterable.next()
cnt += 1
yield element
next += step
\end{verbatim}
\end{funcdesc}
@ -324,6 +330,18 @@ from building blocks.
>>> def nth(iterable, n):
... "Returns the nth item"
... return islice(iterable, n, n+1).next()
... return list(islice(iterable, n, n+1))
>>> def all(pred, seq):
... "Returns True if pred(x) is True for every element in the iterable"
... return not nth(ifilterfalse(pred, seq), 0)
>>> def some(pred, seq):
... "Returns True if pred(x) is True for at least one element in the iterable"
... return bool(nth(ifilter(pred, seq), 0))
>>> def no(pred, seq):
... "Returns True if pred(x) is False for every element in the iterable"
... return not nth(ifilter(pred, seq), 0)
\end{verbatim}
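
The quantifier recipes above can be tried out with a short, self-contained sketch. It is written for Python 3, so `itertools.filterfalse` and the built-in lazy `filter` stand in for `ifilterfalse` and `ifilter`. The name `all_` (chosen to avoid shadowing the built-in `all`) and the predicate `is_even` are assumptions made for the example.

```python
from itertools import filterfalse, islice


def nth(iterable, n):
    "Return a one-element list holding the nth item, or [] if there is none."
    return list(islice(iterable, n, n + 1))


def all_(pred, seq):
    "True if pred(x) is true for every element."
    # No counterexample found means the list is empty, hence falsy.
    return not nth(filterfalse(pred, seq), 0)


def some(pred, seq):
    "True if pred(x) is true for at least one element."
    return bool(nth(filter(pred, seq), 0))


def no(pred, seq):
    "True if pred(x) is false for every element."
    return not nth(filter(pred, seq), 0)


def is_even(x):
    # Hypothetical predicate used only for this illustration.
    return x % 2 == 0
```

Because `nth` returns a list, each quantifier tests whether a matching element exists rather than whether that element is itself truthy. For example, `some(is_even, [0])` is true even though the matching element is `0`.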


@ -57,7 +57,7 @@ what's tested is actually `z in y'.
__all__ = ['BaseSet', 'Set', 'ImmutableSet']
from itertools import ifilter
from itertools import ifilter, ifilterfalse
class BaseSet(object):
"""Common base class for mutable and immutable sets."""
@ -204,9 +204,9 @@ class BaseSet(object):
value = True
selfdata = self._data
otherdata = other._data
for elt in ifilter(otherdata.has_key, selfdata, True):
for elt in ifilterfalse(otherdata.has_key, selfdata):
data[elt] = value
for elt in ifilter(selfdata.has_key, otherdata, True):
for elt in ifilterfalse(selfdata.has_key, otherdata):
data[elt] = value
return result
@ -227,7 +227,7 @@ class BaseSet(object):
result = self.__class__()
data = result._data
value = True
for elt in ifilter(other._data.has_key, self, True):
for elt in ifilterfalse(other._data.has_key, self):
data[elt] = value
return result
@ -260,7 +260,7 @@ class BaseSet(object):
self._binary_sanity_check(other)
if len(self) > len(other): # Fast check for obvious cases
return False
for elt in ifilter(other._data.has_key, self, True):
for elt in ifilterfalse(other._data.has_key, self):
return False
return True
@ -269,7 +269,7 @@ class BaseSet(object):
self._binary_sanity_check(other)
if len(self) < len(other): # Fast check for obvious cases
return False
for elt in ifilter(self._data.has_key, other, True):
for elt in ifilterfalse(self._data.has_key, other):
return False
return True


@ -13,12 +13,19 @@ class TestBasicOps(unittest.TestCase):
def isEven(x):
return x%2==0
self.assertEqual(list(ifilter(isEven, range(6))), [0,2,4])
self.assertEqual(list(ifilter(isEven, range(6), True)), [1,3,5])
self.assertEqual(list(ifilter(None, [0,1,0,2,0])), [1,2])
self.assertRaises(TypeError, ifilter)
self.assertRaises(TypeError, ifilter, 3)
self.assertRaises(TypeError, ifilter, isEven, 3)
self.assertRaises(TypeError, ifilter, isEven, [3], True, 4)
def test_ifilterfalse(self):
def isEven(x):
return x%2==0
self.assertEqual(list(ifilterfalse(isEven, range(6))), [1,3,5])
self.assertEqual(list(ifilterfalse(None, [0,1,0,2,0])), [0,0,0])
self.assertRaises(TypeError, ifilterfalse)
self.assertRaises(TypeError, ifilterfalse, 3)
self.assertRaises(TypeError, ifilterfalse, isEven, 3)
def test_izip(self):
ans = [(x,y) for x, y in izip('abc',count())]
@ -133,7 +140,19 @@ Samuele
>>> def nth(iterable, n):
... "Returns the nth item"
... return islice(iterable, n, n+1).next()
... return list(islice(iterable, n, n+1))
>>> def all(pred, seq):
... "Returns True if pred(x) is True for every element in the iterable"
... return not nth(ifilterfalse(pred, seq), 0)
>>> def some(pred, seq):
... "Returns True if pred(x) is True for at least one element in the iterable"
... return bool(nth(ifilter(pred, seq), 0))
>>> def no(pred, seq):
... "Returns True if pred(x) is False for every element in the iterable"
... return not nth(ifilter(pred, seq), 0)
"""


@ -957,7 +957,6 @@ typedef struct {
PyObject_HEAD
PyObject *func;
PyObject *it;
long invert;
} ifilterobject;
PyTypeObject ifilter_type;
@ -965,17 +964,13 @@ PyTypeObject ifilter_type;
static PyObject *
ifilter_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
PyObject *func, *seq, *invert=NULL;
PyObject *func, *seq;
PyObject *it;
ifilterobject *lz;
long inv=0;
if (!PyArg_UnpackTuple(args, "ifilter", 2, 3, &func, &seq, &invert))
if (!PyArg_UnpackTuple(args, "ifilter", 2, 2, &func, &seq))
return NULL;
if (invert != NULL && PyObject_IsTrue(invert))
inv = 1;
/* Get iterator. */
it = PyObject_GetIter(seq);
if (it == NULL)
@ -990,7 +985,6 @@ ifilter_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
Py_INCREF(func);
lz->func = func;
lz->it = it;
lz->invert = inv;
return (PyObject *)lz;
}
@ -1046,7 +1040,7 @@ ifilter_next(ifilterobject *lz)
ok = PyObject_IsTrue(good);
Py_DECREF(good);
}
if (ok ^ lz->invert)
if (ok)
return item;
Py_DECREF(item);
}
@ -1060,11 +1054,10 @@ ifilter_getiter(PyObject *lz)
}
PyDoc_STRVAR(ifilter_doc,
"ifilter(function or None, sequence [, invert]) --> ifilter object\n\
"ifilter(function or None, sequence) --> ifilter object\n\
\n\
Return those items of sequence for which function(item) is true. If\n\
invert is set to True, return items for which function(item) is False.\n\
If function is None, return the items that are true (unless invert is set).");
Return those items of sequence for which function(item) is true.\n\
If function is None, return the items that are true.");
PyTypeObject ifilter_type = {
PyObject_HEAD_INIT(NULL)
@ -1112,6 +1105,160 @@ PyTypeObject ifilter_type = {
};
/* ifilterfalse object ************************************************************/
typedef struct {
PyObject_HEAD
PyObject *func;
PyObject *it;
} ifilterfalseobject;
PyTypeObject ifilterfalse_type;
static PyObject *
ifilterfalse_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
PyObject *func, *seq;
PyObject *it;
ifilterfalseobject *lz;
if (!PyArg_UnpackTuple(args, "ifilterfalse", 2, 2, &func, &seq))
return NULL;
/* Get iterator. */
it = PyObject_GetIter(seq);
if (it == NULL)
return NULL;
/* create ifilterfalseobject structure */
lz = (ifilterfalseobject *)type->tp_alloc(type, 0);
if (lz == NULL) {
Py_DECREF(it);
return NULL;
}
Py_INCREF(func);
lz->func = func;
lz->it = it;
return (PyObject *)lz;
}
static void
ifilterfalse_dealloc(ifilterfalseobject *lz)
{
PyObject_GC_UnTrack(lz);
Py_XDECREF(lz->func);
Py_XDECREF(lz->it);
lz->ob_type->tp_free(lz);
}
static int
ifilterfalse_traverse(ifilterfalseobject *lz, visitproc visit, void *arg)
{
int err;
if (lz->it) {
err = visit(lz->it, arg);
if (err)
return err;
}
if (lz->func) {
err = visit(lz->func, arg);
if (err)
return err;
}
return 0;
}
static PyObject *
ifilterfalse_next(ifilterfalseobject *lz)
{
PyObject *item;
long ok;
for (;;) {
item = PyIter_Next(lz->it);
if (item == NULL)
return NULL;
if (lz->func == Py_None) {
ok = PyObject_IsTrue(item);
} else {
PyObject *good;
good = PyObject_CallFunctionObjArgs(lz->func,
item, NULL);
if (good == NULL) {
Py_DECREF(item);
return NULL;
}
ok = PyObject_IsTrue(good);
Py_DECREF(good);
}
if (!ok)
return item;
Py_DECREF(item);
}
}
static PyObject *
ifilterfalse_getiter(PyObject *lz)
{
Py_INCREF(lz);
return lz;
}
PyDoc_STRVAR(ifilterfalse_doc,
"ifilterfalse(function or None, sequence) --> ifilterfalse object\n\
\n\
Return those items of sequence for which function(item) is false.\n\
If function is None, return the items that are false.");
PyTypeObject ifilterfalse_type = {
PyObject_HEAD_INIT(NULL)
0, /* ob_size */
"itertools.ifilterfalse", /* tp_name */
sizeof(ifilterfalseobject), /* tp_basicsize */
0, /* tp_itemsize */
/* methods */
(destructor)ifilterfalse_dealloc, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
0, /* tp_compare */
0, /* tp_repr */
0, /* tp_as_number */
0, /* tp_as_sequence */
0, /* tp_as_mapping */
0, /* tp_hash */
0, /* tp_call */
0, /* tp_str */
PyObject_GenericGetAttr, /* tp_getattro */
0, /* tp_setattro */
0, /* tp_as_buffer */
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
Py_TPFLAGS_BASETYPE, /* tp_flags */
ifilterfalse_doc, /* tp_doc */
(traverseproc)ifilterfalse_traverse, /* tp_traverse */
0, /* tp_clear */
0, /* tp_richcompare */
0, /* tp_weaklistoffset */
(getiterfunc)ifilterfalse_getiter, /* tp_iter */
(iternextfunc)ifilterfalse_next, /* tp_iternext */
0, /* tp_methods */
0, /* tp_members */
0, /* tp_getset */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
0, /* tp_dictoffset */
0, /* tp_init */
PyType_GenericAlloc, /* tp_alloc */
ifilterfalse_new, /* tp_new */
PyObject_GC_Del, /* tp_free */
};
/* count object ************************************************************/
typedef struct {
@ -1508,8 +1655,8 @@ repeat(elem) --> elem, elem, elem, ...\n\
\n\
Iterators terminating on the shortest input sequence:\n\
izip(p, q, ...) --> (p[0], q[0]), (p[1], q[1]), ... \n\
ifilter(pred, seq, invert=False) --> elements of seq where\n\
pred(elem) is True (or False if invert is set)\n\
ifilter(pred, seq) --> elements of seq where pred(elem) is True\n\
ifilterfalse(pred, seq) --> elements of seq where pred(elem) is False\n\
islice(seq, [start,] stop [, step]) --> elements from\n\
seq[start:stop:step]\n\
imap(fun, p, q, ...) --> fun(p0, q0), fun(p1, q1), ...\n\
@ -1523,56 +1670,33 @@ dropwhile(pred, seq) --> seq[n], seq[n+1], starting when pred fails\n\
PyMODINIT_FUNC
inititertools(void)
{
int i;
PyObject *m;
char *name;
PyTypeObject *typelist[] = {
&dropwhile_type,
&takewhile_type,
&islice_type,
&starmap_type,
&imap_type,
&times_type,
&ifilter_type,
&ifilterfalse_type,
&count_type,
&izip_type,
&repeat_type,
NULL
};
m = Py_InitModule3("itertools", NULL, module_doc);
PyModule_AddObject(m, "dropwhile", (PyObject *)&dropwhile_type);
if (PyType_Ready(&dropwhile_type) < 0)
for (i=0 ; typelist[i] != NULL ; i++) {
if (PyType_Ready(typelist[i]) < 0)
return;
Py_INCREF(&dropwhile_type);
PyModule_AddObject(m, "takewhile", (PyObject *)&takewhile_type);
if (PyType_Ready(&takewhile_type) < 0)
/* Check the strchr() result before stepping past the dot. */
name = strchr(typelist[i]->tp_name, '.');
if (name == NULL)
return;
name++;
Py_INCREF(&takewhile_type);
PyModule_AddObject(m, "islice", (PyObject *)&islice_type);
if (PyType_Ready(&islice_type) < 0)
return;
Py_INCREF(&islice_type);
PyModule_AddObject(m, "starmap", (PyObject *)&starmap_type);
if (PyType_Ready(&starmap_type) < 0)
return;
Py_INCREF(&starmap_type);
PyModule_AddObject(m, "imap", (PyObject *)&imap_type);
if (PyType_Ready(&imap_type) < 0)
return;
Py_INCREF(&imap_type);
PyModule_AddObject(m, "times", (PyObject *)&times_type);
if (PyType_Ready(&times_type) < 0)
return;
Py_INCREF(&times_type);
if (PyType_Ready(&ifilter_type) < 0)
return;
Py_INCREF(&ifilter_type);
PyModule_AddObject(m, "ifilter", (PyObject *)&ifilter_type);
if (PyType_Ready(&count_type) < 0)
return;
Py_INCREF(&count_type);
PyModule_AddObject(m, "count", (PyObject *)&count_type);
if (PyType_Ready(&izip_type) < 0)
return;
Py_INCREF(&izip_type);
PyModule_AddObject(m, "izip", (PyObject *)&izip_type);
if (PyType_Ready(&repeat_type) < 0)
return;
Py_INCREF(&repeat_type);
PyModule_AddObject(m, "repeat", (PyObject *)&repeat_type);
Py_INCREF(typelist[i]);
PyModule_AddObject(m, name, (PyObject *)typelist[i]);
}
}