(reported by François Pinard).
Added some missing "_" characters in the same cluster of productions.
Added the missing floor division operator to the m_expr production, and
mentioned floor division in the relevant portion of the text.
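For reference (not part of the change itself), floor division rounds toward
negative infinity for ints and floats alike:

    print 7 // 2     # 3
    print -7 // 2    # -4 (rounds down, not toward zero)
    print 7.0 // 2   # 3.0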
Unicode objects are currently taken as binary data by the write()
method. This is not what Unicode users expect, nor what the
StringIO.py code does. Until somebody adds a way to specify binary or
text mode for cStringIO objects, change the format string to use "t#"
instead of "s#", so that it will request the "text buffer" version.
This will try the default encoding for Unicode objects.
This is *not* a 2.2 bugfix (since it *is* a semantic change).
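At the Python level the intended effect is roughly this (a sketch; the
failing call assumes the usual ASCII default encoding):

    import cStringIO
    buf = cStringIO.StringIO()
    buf.write(u'hello')    # encoded with the default encoding (the "t#" text-buffer path)
    print buf.getvalue()   # 'hello' (a plain string)
    buf.write(u'caf\xe9')  # raises UnicodeError under the ASCII default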
This now does a dynamic analysis of which elements are so frequently
repeated as to constitute noise. The primary benefit is an enormous
speedup in find_longest_match, whose innermost loop can have hundreds of
times fewer potential matches to worry about when the sequences contain
many duplicate elements.  In effect, this now zooms in on the runs of
non-ubiquitous elements.
While I like what I've seen of the effects so far, I still consider
this experimental. Please give it a try!
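For example, on sequences dominated by a single repeated element (the
heuristic itself is internal; this just exercises the public API):

    import difflib

    # 'x' is frequent enough to be treated as noise by the new analysis,
    # so the matching effort concentrates on the rarer trailing elements.
    a = list('x' * 500 + 'hello')
    b = list('x' * 500 + 'help')
    sm = difflib.SequenceMatcher(None, a, b)
    print sm.find_longest_match(0, len(a), 0, len(b))
    print sm.ratio()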
As threatened, PyMem_{Free, FREE} also invoke the object deallocator now
when pymalloc is enabled (well, they do when pymalloc isn't enabled too,
but in that case "the object deallocator" is plain free()).
This is maximally backward-compatible, but it leaves a bitter aftertaste.
Also massive reworking of comments.
don't understand how this function works; also beefed up the docs.  The
most common usage error is of this form (often spread out across gotos):
    if (_PyString_Resize(&s, n) < 0) {
        Py_DECREF(s);
        s = NULL;
        goto outtahere;
    }
The error is that if _PyString_Resize runs out of memory, it automatically
decrefs the input string object s (which also deallocates it, since its
refcount must be 1 upon entry), and sets s to NULL. So if the "if"
branch ever triggers, it's an error to call Py_DECREF(s): s is already
NULL! A correct way to write the above is the simpler (and intended)
    if (_PyString_Resize(&s, n) < 0)
        goto outtahere;
Bugfix candidate.
This implements ideas from Marc-Andre, Martin, Guido and me on Python-Dev.
"Short" Unicode strings are encoded into a "big enough" stack buffer,
then exactly as much string space as they turn out to need is allocated
at the end. This should have speed benefits akin to Martin's "measure
once, allocate once" strategy, but without needing a distinct measuring
pass.
"Long" Unicode strings allocate as much heap space as they could possibly
need (4 x # Unicode chars), and do a realloc at the end to return the
untouched excess. Since the overallocation is likely to be substantial,
this shouldn't burden the platform realloc with unusably small excess
blocks.
Also simplified uses of the PyString_xyz functions, and added a release-build
check that 4*size doesn't overflow a C int.  Sooner or later, that's going
to happen.
the left and right operands were of the same type and not classic instances.
This shortcut is dangerous for proxy types, because it means that
coerce(Proxy(1), Proxy(2.1)) leaves Proxy(1) unchanged rather than
turning it into Proxy(1.0).
In an ever-so-slight change of semantics, I now only take the shortcut
when the left and right operands are of the same type and that type doesn't
have the CHECKTYPES feature.  It so happens that classic instances have this
flag, so the shortcut is still skipped in this case (i.e. nothing
changes for classic instances). Proxies also have this flag set
(otherwise implementing numeric operations on proxies would become
nightmarish) and this means that the shortcut is also skipped there,
as desired. It so happens that int, long and float also have this
flag set; that means that e.g. coerce(1, 1) will now invoke
int_coerce(). This is fine: int_coerce() can deal with this, and I'm
not worried about the performance; int_coerce() is only invoked when
the user explicitly calls coerce(), which should be rarer than rare.
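At the Python level the results are unchanged; e.g. (Python 2):

    print coerce(1, 2.1)   # (1.0, 2.1), mixed types still converted as before
    print coerce(1, 1)     # (1, 1), now routed through int_coerce() with the same result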
This fixes the problem that Barry reported on python-dev:
>>> 23000 .__class__ = bool
crashes in the deallocator. This was because int inherited tp_free
from object, which uses the default allocator.
2.2. Bugfix candidate.
better auto-recognition of a Jython file vs. a CPython (or agnostic)
file by looking at the #! line more closely, and inspecting the import
statements in the first 20000 bytes (configurable). Specifically,
(py-import-check-point-max): New variable, controlling how far into
the buffer it will search for import statements.
(py-jpython-packages): List of package names that are Jython-ish.
(py-shell-alist): List of #! line programs and the modes associated
with them.
(jpython-mode-hook): Extra hook that runs when entering jpython-mode
(what about Jython mode? <20k wink>).
(py-choose-shell-by-shebang, py-choose-shell-by-import,
py-choose-shell): New functions.
(python-mode): Use py-choose-shell.
(jpython-mode): New command.
(py-execute-region): Don't use my previous hacky attempt at doing this;
use the new py-choose-shell function.
One other thing this file now does: it attempts to add the proper
hooks to interpreter-mode-alist and auto-mode-alist if they aren't
already there.  Might help Emacs users, since that editor doesn't come
with python-mode by default.
Allows for some customization of the underlying comint buffer.
(py-shell): Call the new hook.
(info-lookup-maybe-add-help): A new call suggested by Milan Zamazal to
make lookups in the Info documentation easier.
On Win2K it thought 'foo' started at byte offset 0 instead of at the
pagesize, and on Win98 it thought 'foo' didn't exist at all. Somehow
or other this is related to the new "in memory file" gimmicks in
bsddb, but the old bsddb we use on Windows sucks so bad anyway I don't
want to bother digging deeper. Flushing the file in test_mmap after
writing to it makes the problem go away, so good enough.
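The workaround amounts to something like this (a sketch, not the exact
test_mmap code; the file name is arbitrary):

    import mmap

    PAGESIZE = mmap.PAGESIZE
    f = open('mmap_test_file', 'wb+')
    f.write('\0' * PAGESIZE)
    f.write('foo')
    f.write('\0' * (PAGESIZE - 3))
    f.flush()                              # the fix: flush before mapping
    m = mmap.mmap(f.fileno(), 2 * PAGESIZE)
    print m.find('foo')                    # expect PAGESIZE, not 0 or -1
    m.close()
    f.close()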