decode(): While writing tests for uu.py, Nick Mathewson discovered
that the 'Truncated input file' exception could never get raised,
because its "if not str:" test was actually testing the builtin
function "str", not the local string vrbl "s" as intended.
Bugfix candidate.
fixed. Regrettably, this must be run manually -- somehow the I/O
redirection of the regression test breaks the test. When run under
the regression test, this raises ImportError with a warning to that
effect.
Bugfix candidate!
Fix various serious problems:
- The ThreadingTCPServer class and its derived classes were completely
broken because the main thread would close the request before the
handler thread had time to look at it. This was introduced by
Ping's close_request() patch. The fix moves the close_request()
calls to after the handler has run to completion in the BaseServer
class and the ForkingMixIn class; when using the ThreadingMixIn,
closing the request is the handler's responsibility.
- The ForkingUDPServer class has always been been broken because the
socket was closed in the child before calling the handler. I fixed
this by simply not calling server_close() in the child at all.
- I cannot get the UnixDatagramServer class to work at all. The
recvfrom() call doesn't return a meaningful client address. I added
a comment to this effect. Maybe it works on other Unix versions.
- The __all__ variable was missing ThreadingMixIn and ForkingMixIn.
- Bumped __version__ to "0.4".
- Added a note about the test suite (to be checked in shortly).
solver. In conjunction, they easily found a tour of a 200x200 board:
that's 200**2 == 40,000 levels of backtracking. Explicitly resumable
generators allow that to be coded as easily as a recursive solver (easier,
actually, because different levels can use level-customized algorithms
without pain), but without blowing the stack. Indeed, I've never written
an exhaustive Tour solver in any language before that can handle boards so
large ("exhaustive" == guaranteed to find a solution if one exists, as
opposed to probabilistic heuristic approaches; of course, the age of the
universe may be a blip in the time needed!).
We should not depend on two spaces between words, so use the white
space after the to-be-encoded word only as lookahead and don't
actually consume it in the regular expression.
committed.
tokenize.py: I like these changes, and have tested them extensively
without even realizing it, so I just updated the docstring and the docs.
tabnanny.py: Also liked this, but did a little code fiddling. I should
really rewrite this to *exploit* generators, but that's near the bottom
of my effort/benefit scale so doubt I'll get to it anytime soon (it
would be most useful as a non-trivial example of ideal use of generators;
but test_generators.py has already grown plenty of food-for-thought
examples).
inspect.py: I'm sure Ping intended for this to continue running even
under 1.5.2, so I reverted this to the last pre-gen-branch version. The
"bugfix" I checked in in-between was actually repairing a bug *introduced*
by the conversion to generators, so it's OK that the reverted version
doesn't reflect that checkin.
class FieldStorage: this patch changes read_lines() and co. to use a
StringIO() instead of a real file. The write() calls are redirected
to a private method that replaces it with a real, external file only
when it gets too big (> 1000 bytes).
This avoids problems in forms using the multipart/form-data encoding
with many fields. The original code created a temporary file for
*every* field (not just for file upload fields), thereby sometimes
exceeding the open file limit of some systems.
Note that the simpler solution "use a real file only for file uploads"
can't be used because the form field parser has no way to tell which
fields correspond to file uploads.
It's *possible* but extremely unlikely that this would break someone's
code; they would have to be stepping way outside the documented
interface for FieldStorage and use f.file.fileno(), or depend on
overriding make_file() to return a file-like object with additional
known properties.
examples of use. These poke stuff not specifically targeted before, incl.
recursive local generators relying on nested scopes, ditto but also
inside class methods and rebinding instance vars, and anonymous
partially-evaluated generators (the N-Queens solver creates a different
column-generator for each row -- AFAIK this is my invention, and it's
really pretty <wink>). No problems, not even a new leak.
"return expr" instances in generators (which latter may be generators
due to otherwise invisible "yield" stmts hiding in "if 0" blocks).
This was fun the first time, but this has gotten truly ugly now.
that required explicitly calling LazyList.clear() in the two tests that
use LazyList (I added a LazyList Fibonacci generator too).
A real bitch: the extremely inefficient first version of the 2-3-5 test
*looked* like a slow leak on Win98SE, but it wasn't "really": it generated
so many results that the heap grew over 4Mb (tons of frames! the number
of frames grows exponentially in that test). Then Win98SE malloc() starts
fragmenting address space allocating more and more heaps, and the visible
memory use grew very slowly while the disk was thrashing like mad.
Printing fewer results (i.e., keeping the heap burden under 4Mb) made
that illusion vanish.
Looks like there's no hope for plugging the LazyList leaks automatically
short of adding frameobjects and genobjects to gc. OTOH, they're very
easy to break by hand, and they're the only *kind* of plausibly realistic
leaks I've been able to provoke.
Dilemma.
Implement sys.maxunicode.
Explicitly wrap around upper/lower computations for wide Py_UNICODE.
When decoding large characters with UTF-8, represent expected test
results using the \U notation.
break cycles, which are a special problem when running generator tests
that provoke exceptions by invoking the .next() method of a named
generator-iterator: then the iterator is named in globs, and the
iterator's frame gets a tracekback object pointing back to globs, and
gc doesn't chase these types so the cycle leaks.
Also changed _run_examples() to make a copy of globs itself, so its
callers (direct and indirect) don't have to (and changed the callers
to stop making their own copies); *that* much is a change I've been
meaning to make for a long time (it's more robust the new way).
Here's a way to provoke the symptom without doctest; it leaks at a
prodigious rate; if the last two "source" lines are replaced with
g().next()
the iterator isn't named and then there's no leak:
source = """\
def g():
yield 1/0
k = g()
k.next()
"""
code = compile(source, "<source>", "exec")
def f(globs):
try:
exec code in globs
except ZeroDivisionError:
pass
while 1:
f(globals().copy())
After this change, running test_generators in an infinite loop still leaks,
but reduced from a flood to a trickle.
Good news: Some of this stuff is pretty sophisticated (read nuts), and
I haven't bumped into a bug yet.
Bad news: If I run the doctest in an infinite loop, memory is clearly
leaking.