- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disable unicode literals and the related builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
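For illustration, a typical use of the new types.StringTypes list (a hedged sketch; the interactive output assumes a Unicode-enabled build of this era):
>>> import types
>>> type("abc") in types.StringTypes
1
>>> type(u"abc") in types.StringTypes   # the Unicode entry exists only on Unicode builds
1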
__dict__ attribute. Deleting it, or setting it to a non-dictionary
results in a TypeError. Note that getting it the first time magically
initializes it to an empty dict so that func.__dict__ will always
appear to be a dictionary (never None).
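Roughly, at the interactive prompt (a sketch; exact error text elided):
>>> def f(): pass
...
>>> f.__dict__                # first access initializes it to an empty dict
{}
>>> f.answer = 42
>>> f.__dict__
{'answer': 42}
>>> del f.__dict__            # deleting it (or setting it to a non-dict) fails
Traceback (most recent call last):
  ...
TypeError: ...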
Closes SF bug #446645.
LettError, Erik van Blokland, http://www.letterror.com/
the Python Windows installer finally has an attractive Pythonic bitmap
to delight the senses and dampen the fears of the millions and millions of
eager new Windows users anticipating their first Python programming joy.
Always knew Mac users secretly wanted to switch to Windows <wink>.
list.
Present the URLs at the bottom in a consistent manner, conforming to the
style guide.
Remove the lone use of "e.g.", which the style guide does not allow.
(Tim & I should agree on where to add new additions: I add them at the
top, Tim adds them at the bottom. I like the top better because folks
who occasionally check out the NEWS file will see the latest news
first.)
additional offset is only applied to continuation lines for block
opening statements.
(py-compute-indentation): Only add py-continuation-offset if
py-statement-opens-block-p is true.
This completes the q/Q project.
longobject.c _PyLong_AsByteArray: The original code had a gross bug:
the most-significant Python digit doesn't necessarily have SHIFT
significant bits, and you really need to count how many copies of the sign
bit it has, else spurious overflow errors result.
test_struct.py: This now does exhaustive std q/Q testing at, and on both
sides of, all relevant power-of-2 boundaries, both positive and negative.
NEWS: Added brief dict news while I was at it.
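For reference, a quick interactive sketch of the completed std-mode 'q'/'Q' support (illustrative only):
>>> import struct
>>> struct.calcsize('>q')            # standard mode: always 8 bytes
8
>>> struct.pack('>q', -1)
'\xff\xff\xff\xff\xff\xff\xff\xff'
>>> struct.unpack('>Q', '\xff' * 8)
(18446744073709551615L,)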
native mode, and only when config #defines HAVE_LONG_LONG. Standard mode
will eventually treat them as 8-byte ints across all platforms, but that
likely requires a new set of routines in longobject.c first (while
sizeof(long) >= 4 is guaranteed by C, there's nothing in C we can rely
on x-platform to hold 8 bytes of int, so we'll have to roll our own;
I'm thinking of a simple pair of conversion functions, Python long
to/from sized vector of unsigned bytes; that may be useful for GMP
conversions too; std q/Q would call them with size fixed at 8).
test_struct.py: In addition to adding some native-mode 'q' and 'Q' tests,
got rid of unused code, and repaired a non-portable assumption about
native sizeof(short) (it isn't 2 on some Cray boxes).
libstruct.tex: In addition to adding a bit of 'q'/'Q' docs (more needed
later), removed an erroneous footnote about 'I' behavior.
Armin Rigo pointed out that the way the line-# table got built didn't work
for lines generating more than 255 bytes of bytecode. Fixed as he
suggested, plus corresponding changes to pyassem.py, plus added some
long overdue docs about this subtle table to compile.c.
Bugfix candidate (line numbers may be off in tracebacks under -O).
code, less memory. Tests have uncovered no drawbacks. Christian and
Vladimir are the other two people who have burned many brain cells on the
dict code in recent years, and they like the approach too, so I'm checking
it in without further ado.
instead of multiplication to generate the probe sequence. The idea is
recorded in Python-Dev for Dec 2000, but that version is prone to rare
infinite loops.
The value is in getting *all* the bits of the hash code to participate;
and, e.g., this speeds up querying every key in a dict with keys
[i << 16 for i in range(20000)] by a factor of 500. Should be equally
valuable in any bad case where the high-order hash bits were getting
ignored.
Also wrote up some of the motivations behind Python's ever-more-subtle
hash table strategy.
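A hedged Python sketch of a perturb-style probe sequence of the kind described (names and the constant are illustrative; the real code is the C in dictobject.c and differs in detail):

    PERTURB_SHIFT = 5    # illustrative constant

    def probe_sequence(h, mask, n):
        # Return the first n slot indices examined for hash value h in a
        # table of size mask+1 (mask == 2**k - 1).  Every bit of h
        # eventually feeds into the sequence via perturb.  The C code uses
        # unsigned arithmetic; this is only a model.
        i = h & mask
        perturb = h
        result = []
        for step in range(n):
            result.append(i)
            i = ((i << 2) + i + perturb + 1) & mask   # i.e. 5*i + perturb + 1
            perturb = perturb >> PERTURB_SHIFT
        return result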
*are* obsolete; three variables and the maketrans() function are not
(yet) obsolete.
Add a compensating warnings.filterwarnings() call to test_strop.py.
Add this to the NEWS.
elements when crunching a list, dict or tuple. Now takes linear time
instead -- huge speedup for even moderately large containers, and the
code is notably simpler too.
Added some basic "is the output correct?" tests to test_pprint.
The comment following used to say:
/* We use ~hash instead of hash, as degenerate hash functions, such
as for ints <sigh>, can have lots of leading zeros. It's not
really a performance risk, but better safe than sorry.
12-Dec-00 tim: so ~hash produces lots of leading ones instead --
what's the gain? */
That is, there was never a good reason for doing it. And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collision) the same "too often" across distinct hashes.
Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.
Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items(). A number
of std tests suffered bogus failures as a result. For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,
>>> d = {}
>>> for i in range(10):
... d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>
Unfortunately, people may latch on to that in small examples and draw a
bogus conclusion.
test_support.py
Moved test_extcall's sortdict() into test_support, made it stronger,
and imported sortdict into other std tests that needed it.
test_unicode.py
Excluded cp875 from the "roundtrip over range(128)" test, because
cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
See Python-Dev for excruciating details.
Cookie.py
Changed various output functions to sort dicts before building
strings from them.
test_extcall
Fiddled the expected-result file. This remains sensitive to native
dict ordering, because, e.g., if there are multiple errors in a
keyword-arg dict (and test_extcall sets up many cases like that), the
specific error Python complains about first depends on native dict
ordering.
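For reference, a sortdict() helper of the kind mentioned in the test_support.py item above can be as simple as this (a hypothetical sketch, not necessarily the exact test_support version):

    def sortdict(d):
        # Build a dict repr with the keys in sorted order, so test output
        # does not depend on native dict ordering.
        keys = d.keys()
        keys.sort()
        reprs = []
        for k in keys:
            reprs.append("%r: %r" % (k, d[k]))
        return "{" + ", ".join(reprs) + "}"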
Allow module getattr and setattr to exploit string interning, via the
previously null module object tp_getattro and tp_setattro slots. Yields
a very nice speedup for things like random.random and os.path etc.
Fixed a half dozen ways in which general dict comparison could crash
Python (even cause Win98SE to reboot) in the presence of key and/or
value comparison routines that mutate the dict during dict comparison.
Bugfix candidate.
d1 == d2 and d1 != d2 now work even if the keys and values in d1 and d2
don't support comparisons other than ==, and testing dicts for equality
is faster now (especially when inequality obtains).
NEEDS DOC CHANGES.
More AttributeErrors transmuted into TypeErrors, in test_b2.py, and,
again, this strikes me as a good thing.
This checkin completes the iterator generalization work that obviously
needed to be done. Can anyone think of others that should be changed?
NEEDS DOC CHANGES
A few more AttributeErrors turned into TypeErrors, but in test_contains
this time.
The full story for instance objects is pretty much unexplainable, because
instance_contains() tries its own flavor of iteration-based containment
testing first, and PySequence_Contains doesn't get a chance at it unless
instance_contains() blows up. A consequence is that
some_complex_number in some_instance
dies with a TypeError unless some_instance.__class__ defines __iter__ but
does not define __getitem__.
to string.join(), so that when the latter figures out in midstream that
it really needs unicode.join() instead, unicode.join() can actually get
all the sequence elements (i.e., there's no guarantee that the sequence
passed to string.join() can be iterated over *again* by unicode.join(),
so string.join() must not pass on the original sequence object anymore).
because PySequence_Fast() started working for free as soon as
PySequence_Tuple() learned how to work with iterators. For some reason
unicode.join() still doesn't work, though.
NEEDS DOC CHANGES.
This one surprised me! While I expected tuple() to be a no-brainer, turns
out it's actually dripping with consequences:
1. It will *allow* the popular PySequence_Fast() to work with any iterable
object (code for that not yet checked in, but should be trivial).
2. It caused two std tests to fail. This because some places used
PySequence_Tuple() (the C spelling of tuple()) as an indirect way to test
whether something *is* a sequence. But tuple() code only looked for the
existence of sq->item to determine that, and e.g. an instance passed
that test whether or not it supported the other operations tuple()
needed (e.g., __len__). So some things the tests *expected* to fail
with an AttributeError now fail with a TypeError instead. This looks
like an improvement to me; e.g., test_coercion used to produce 559
TypeErrors and 2 AttributeErrors, and now they're all TypeErrors. The
error details are more informative too, because the places calling this
were *looking* for TypeErrors in order to replace the generic tuple()
"not a sequence" msg with their own more specific text, and
AttributeErrors snuck by that.
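A small illustration of the generalized tuple() (a sketch; exact error text elided):
>>> tuple(iter([1, 2, 3]))     # an iterator, not a sequence, works now
(1, 2, 3)
>>> tuple(42)                  # non-iterables fail with TypeError, not AttributeError
Traceback (most recent call last):
  ...
TypeError: ...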
NEEDS DOC CHANGES.
Possibly contentious: The first time s.next() yields StopIteration (for
a given map argument s) is the last time map() *tries* s.next(). That
is, if other sequence args are longer, s will never again contribute
anything but None values to the result, even if trying s.next() again
could yield another result. This is the same behavior map() used to have
wrt IndexError, so it's the only way to be wholly backward-compatible.
I'm not a fan of letting StopIteration mean "try again later" anyway.
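Hedged illustration of that behavior (Python 2.x; once an iterator argument raises StopIteration, map() stops consulting it and pads with None):
>>> map(None, iter([1, 2]), [10, 20, 30])
[(1, 10), (2, 20), (None, 30)]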
to no longer insist that len(seq) be defined.
NEEDS DOC CHANGES.
This is meant to be a model for how other functions of this ilk (max,
filter, etc) can be generalized similarly. Feel encouraged to grab your
favorite and convert it!
Note some cute consequences:
list(file) == file.readlines() == list(file.xreadlines())
list(dict) == dict.keys()
list(dict.iteritems()) == dict.items()
list(xrange(i, j, k)) == range(i, j, k)
must now initialize the extra field used by the weak-ref machinery to
NULL themselves, to avoid having to require PyObject_INIT() to check
if the type supports weak references and do it there. This causes less
work to be done for all objects (the type object does not need to be
consulted to check for the Py_TPFLAGS_HAVE_WEAKREFS bit).
SF patch #103683: Alternative dll version resources.
Changes similar to the patch. MarkH should review.
File version and Product version text strings now 2.1a2.
64-bit file and product version numbers are now
PY_MAJOR_VERSION, PY_MINOR_VERSION, messy, PYTHON_API_VERSION
where
messy = PY_MICRO_VERSION*1000 + PY_RELEASE_LEVEL*10 + PY_RELEASE_SERIAL
Updated company name to "Digital Creations 2".
Copyright now lists Guido; "C in a circle" symbol used instead of (C).
Comments added so this is less likely to get flubbed again, and
#if/#error guys added to trigger if the version number manipulations
above overflow.
Add note about _symtable.
Add note that 'from ... import *' restriction may go away -- and move
the whole entry closer to the top, because it might bite people.
internal states. Put the old .seed() (which could only get at about
the square root of the # of possibilities) under the new name .whseed(),
for bit-level compatibility with older versions. This occurred to me
while reviewing effbot's book (he found himself stumbling over .seed()
more than once there ...).
- All constructors grow an optional argument `factory' which is a
callable used when new message instances are created by the next()
methods. Defaults to the rfc822.Message class.
- A new subclass of UnixMailbox is added, called PortableUnixMailbox.
It's identical to UnixMailbox, but uses a more portable test for
From_ delimiter lines. With PortableUnixMailbox, any line that
starts with "From " is considered a delimiter (this should really
check for two newlines before the F, but it doesn't).
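For example, roughly (a sketch; the file name is hypothetical, and the factory argument may be omitted to get rfc822.Message by default):

    import mailbox, rfc822

    fp = open("archive.mbox")                              # hypothetical mbox file
    mb = mailbox.PortableUnixMailbox(fp, rfc822.Message)   # factory is optional
    while 1:
        msg = mb.next()           # returns factory instances, None when done
        if msg is None:
            break
        print msg.getheader("subject")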
SF patch http://sourceforge.net/patch/?func=detailpatch&patch_id=103453&group_id=5470
PyMember_Set of T_CHAR always raises exception.
Unfortunately, this is a use of a C API function that Python itself never makes, so
there's no .py test I can check in to verify this stays fixed. But the fault in the
code is obvious, and Dave Cole's patch just as obviously fixes it.
got broken). Also added new method .jumpahead(N). This finally gives us
a semi-decent answer to how Python's RNGs can be used safely and efficiently
in multithreaded programs (although it requires the user to use the new
machinery!).
functionality of, whrandom.py. Also closes all the "XXX" todos in
random.py. New frequently-requested functions/methods getstate() and
setstate(). All exported functions are now bound methods of a hidden
instance. Killed all unintended exports. Updated the docs.
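For instance, the new state functions make reproducible sequences straightforward (sketch):
>>> import random
>>> state = random.getstate()
>>> a = random.random()
>>> random.setstate(state)          # rewind the generator
>>> random.random() == a
1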
FRED: The more I fiddle the docs, the less I understand the exact
intended use of the \var, \code, \method tags. Please review critically.
GUIDO: See email. I updated NEWS as if whrandom were deprecated; I
think it should be.
ctime, gmtime and localtime optional, defaulting to 'the current time' in
all cases. Adjust docs, add news item. Also convert all argument-handling to
METH_VARARGS. Closes SF patch #103265.
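So, for example (a sketch; outputs omitted since they depend on the current time):

    import time

    print time.ctime()               # argument now optional: the current time
    print time.ctime(time.time())    # the equivalent old spelling
    print time.gmtime()[:3]          # (year, month, day) in UTC, also no argument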
- Changed description of rich comparisons to emphasize that < and >
(etc.) are each other's reflection. Also use this word in the note
about the demise of __rcmp__.
except that it always returns Unicode objects.
A new C API PyObject_Unicode() is also provided.
This closes patch #101664.
Written by Marc-Andre Lemburg. Copyright assigned to Guido van Rossum.
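Illustration (sketch):
>>> unicode(42)            # like str(), but always returns a Unicode object
u'42'
>>> type(unicode(3.14))
<type 'unicode'>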
Christmas present to myself: the bisect module didn't define what
happened if the new element was already in the list. It so happens
that it inserted the new element "to the right" of all equal elements.
Since it wasn't defined, among other bad implications it was a mystery
how to use bisect to determine whether an element was already in the
list (I've seen code that *assumed* "to the right" without justification).
Added new methods bisect_left and insort_left that insert "to the left"
instead; made the old names bisect and insort aliases for the new names
bisect_right and insort_right; beefed up docstrings to explain what
these actually do; and added a std test for the bisect module.
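With the new names the membership question has a defined answer, e.g. (sketch):
>>> import bisect
>>> a = [1, 2, 2, 3]
>>> bisect.bisect_left(a, 2)        # leftmost insertion point for 2
1
>>> bisect.bisect_right(a, 2)       # rightmost insertion point for 2
3
>>> i = bisect.bisect_left(a, 2)
>>> i < len(a) and a[i] == 2        # "is 2 already in the sorted list?"
1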
delimiter, watch out for backslash escaped delimiters. Also use =
instead of eq for character comparison (because a character is = to
its integer value, but not eq to it).
In the limits.h comment, noted that INT_MAX and LONG_MAX are guaranteed
to be defined.
Noted that Reliant UNIX now gets proper API support for extension modules.
reverse() didn't work at all due to bad arg check.
Fixed that.
Added Brad Chapman to ACKS file, as the proud new owner of two
implicitly copyrighted lines of Python source code <wink>.
Repaired buffer_info's total lack of arg-checking.
Replaced memmove by memcpy in reverse() guts, as memmove is
often slower and the memory areas are guaranteed disjoint.
Replaced poke-and-hope unchecked decl of tmp buffer size by
assert-checked larger tmp buffer.
Got rid of inconsistent spaces before open paren in docstrings.
Added reverse() sanity tests to test_array.py.
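Quick sanity sketch of the repaired reverse():
>>> from array import array
>>> a = array('i', [1, 2, 3, 4])
>>> a.reverse()
>>> a
array('i', [4, 3, 2, 1])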
XXX notes for now.
I could use help here!!!! Please mail me patches ASAP. We may have
to put some of this off to 2.0final, but it's best to have it in shape
now...
the Python Unicode implementation.
The internal buffer used for implementing the buffer protocol
is renamed to defenc to make this change visible. It now holds the
default encoded version of the Unicode object and is calculated
on demand (NULL otherwise).
Since the default encoding defaults to ASCII, this will mean that
Unicode objects which hold non-ASCII characters will no longer
work on C APIs using the "s" or "t" parser markers. C APIs must now
explicitly provide Unicode support via the "u", "U" or "es"/"es#"
parser markers in order to work with non-ASCII Unicode strings.
(Note: this patch will also have to be applied to the 1.6 branch
of the CVS tree.)
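At the Python level the analogous consequence of the ASCII default encoding looks roughly like this (a sketch; exact error text elided):
>>> str(u'abc')            # pure ASCII: the default encoding suffices
'abc'
>>> str(u'\xe9')           # non-ASCII: no longer silently accepted
Traceback (most recent call last):
  ...
UnicodeError: ...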
Montanaro, handle execution of indented regions by inserting an "if
1:" in front of the block. This better preserves things like triple
quoted strings and commented regions. This patch resolves PR#264.
1.5.2 was released, except those who contributed only to Doc files --
Fred has his own way of doing this.
This doesn't mean that I've got everyone who contributed *before*
1.5.2 was released in here... :-(
executive summary:
Instead of typing 'apply(f, args, kwargs)' you can type 'f(*args, **kwargs)'.
(A short example appears after the file-by-file details below.)
Some file-by-file details follow.
Grammar/Grammar:
simplify varargslist, replacing '*' '*' with '**'
add * & ** options to arglist
Include/opcode.h & Lib/dis.py:
define three new opcodes
CALL_FUNCTION_VAR
CALL_FUNCTION_KW
CALL_FUNCTION_VAR_KW
Python/ceval.c:
extend TypeError "keyword parameter redefined" message to include
the name of the offending keyword
reindent CALL_FUNCTION using four spaces
add handling of sequences and dictionaries using extend calls
fix function import_from to use PyErr_Format
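A minimal sketch of the new calling syntax described above (illustrative only):

    def f(a, b, c=0, **kw):
        print a, b, c, kw

    args = (1, 2)
    kwargs = {'c': 3, 'd': 4}

    apply(f, args, kwargs)    # the old spelling
    f(*args, **kwargs)        # the new spelling; prints the same thing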
The attached patch set includes a workaround to get Python with
Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause
is a bug in the BSDI wchar.h header file) and Python interfaces
for the MBCS codec donated by Mark Hammond.
Also included are some minor corrections with respect to the docs of
the new "es" and "es#" parser markers (use PyMem_Free() instead
of free(); thanks to Mark Hammond for finding these).
The unicodedata tests are now in a separate file
(test_unicodedata.py) to avoid problems if the module cannot
be found.
Attached you find the latest update of the Unicode implementation.
The patch is against the current CVS version.
It includes the fix I posted yesterday for the core dump problem
in codecs.c (was introduced by my previous patch set -- sorry),
adds more tests for the codecs and two new parser markers
"es" and "es#".
Attached you find an update of the Unicode implementation.
The patch is against the current CVS version. I would appreciate
if someone with CVS checkin permissions could check the changes
in.
The patch contains all bug fixes and patches sent this week and also
fixes a leak in the codecs code and a bug in the free list code
for Unicode objects (which only shows up when compiling Python
with Py_DEBUG; thanks to MarkH for spotting this one).
(python): Set defgroup :prefix to "py-" to make variable names cleaner.
(py-jpython-command, py-jpython-command-args): Set :tag for proper
capitalization of JPython in variable name display.
first time a py buffer is visited during the Emacs session. This
ensures that py-which-shells is initialized and also guarantees that
the mode lines reflect the correct shell. First bug found by GvR,
second one has long bugged :) me.
(py-toggle-shells): Programmatically, arg can also take the symbols
`cpython' or `jpython', which makes it easy to call with the value of
py-default-interpreter.
(py-shell): Don't need to initialize py-which-* variables since these
will guarantee to be initialized by python-mode when the first py
buffer is visited.
(py-default-interpreter): Update docstring.
casing when py-honor-comment-indentation is nil, but this could be a
religious issue with some. Seems to me we should still be dedenting
such comment lines one level.
buffer-syntactic-context -- just short circuit the TQS test by jumping
to point-min and doing the test from there. For long files, this will
be faster than looping with a re-search-backwards.
I don't know what its origins are but I think I've seen it
once in a NeXT dictionary application -- not sure whether
anyone owns copyright but I don't see why we should risk it.
py-newline-and-indent. These ought to get picked up by the mapcar
that follows; any existing binding to newline-and-indent gets shadowed
to py-newline-and-indent.
This will break some people who, e.g. bind C-m or C-j to newline but
still want these bound to py-newline-and-indent in Python mode. On
the other hand, the forced binding pisses off Emacs diehards. So
consider this experimental and see if any tall Dutch guys complain :-)
standard narrow-to-defun but works with Python classes and methods.
With no arg, narrows to most enclosing def/method. With C-u arg,
narrows to most enclosing class.
string we find ourselves in, based on the passed in delimiter.
(py-compute-indentation): Fixes for indentation errors when we land
inside a triple quoted string. For example:
    def foo():
        if os.path.isfile(o_pri_mbox_file) and os.path.isfile(o_pub_mbox_file):
            print """\
    I found both a private and a public mbox archive file
    private: %s
    public : %s
    I won't move either file, but you should choose one and move it to
    %s
    You may want to merge them manually, but be careful about exposing private
    correspondences to the public.""" % (
                o_pri_mbox_file, o_pub_mbox_file, mbox_file)
    *----indentation would be wrong on this line.
#simple things. First step: rename the Imenu supportive variables and
#functions in this file to py-imenu-* so I can grok what is part of
#python-mode and what is part of Imenu.
(py-imenu-create-index-engine): Fixed problem with two classes in a
single file, caused by new semantics of py-beginning-of-def-or-class
when called programmatically.
#Note, there are still some problems with Imenu when arguments to
#functions are funky, but it should be much better now.
string in the argument to execfile() so a Windows temp directory
named, e.g. c:\\tmp doesn't get interpreted as a file name with an
embedded tab! (given by C. Waldman).
this string should not end with whitespace.
(py-compute-indentation): Append whitespace regexp to
py-block-comment-prefix so that any combination of intervening
whitespace will be recognized.
change error messages to be a little more straightforward
change definition of FULL_PATH so that an error is raised if the
setuid wrapper is used un-edited
shell buffers.
(py-shell): Moved the require of comint to the top level. Also
use-local-map py-shell-map instead of hacking on the comint-mode-map.
This eliminates breakage of other comint-mode buffers (e.g. shell).
interactions with newer Emacsen, I've rewritten the way all the
process filters work in the *Python* buffer. We use more of the
comint infrastructure, specifically the default process filter. This
means that scrolling is now handled by the default comint variables
including comint-scroll-to-bottom-on-output. Note that this is
a somewhat experimental change!
(py-comint-output-filter-function): Moved here, from the obsolete
py-process-filter function, the logic that pops and execs the next queued
file waiting to be executed.
(py-execute-file): Don't bind comint-scroll-to-bottom-on-output to t,
and save the excursion when inserting the "working on" message. This
lets the standard comint scrolling variables, as set by the user,
continue to work.
(python-mode, py-shell, py-describe-mode): Remove description of
py-scroll-process-buffer. Also in py-shell, make
comint-output-filter-functions buffer-local, and add
py-comint-output-filter-function to this hook (instead of setting the
process filter).
(py-scroll-process-buffer): Deleted this variable. See comint
variables including comint-scroll-to-bottom-on-output.
(py-execute-region): When exec files are being queued, push the next
temp file on the end of the list.
(py-submit-bug-report): Removed reporting of py-scroll-process-buffer.
"3.67 fixes Imenu as far as classes are concerned, but some default
values for function arguments are still not supported."
This ought to fix that problem.
indentation of normal statements: The regular expression that searches
for indenting comment lines has been changed to not require a
space/tab after the first `#'. We then explicitly look for
py-block-comment-prefix depending on the value of
py-honor-comment-indentation.
I think this more accurately reflects the documentation for
py-honor-comment-indentation.
conformations, etc., etc. inspired and given by Michael Ernst. These
include error string fixes, moving of comments to docstrings, some
other non-related typos, terminology standardizing (b/w TP and myself,
and b/w myself and myself :-) although more can still be done.
E.g. "outdenting" => "dedenting".
serial number isn't enough to uniquify the temp file name -- what if
two users are on the same machine? Add in the (emacs-pid) to help
further. Should never be tickled on Emacs 20, XEmacs 20, 21.
py-mark-def-or-class): Integrated Michael Ernst's latest patches.
Primarily, it allows functions that search or mark defs/classes based
on programmatic specification, to take an 'either flag value which
allows searching for both classes and defs (stopping at the nearest
construct).
Also clean up some docstrings.
comment-indent-function's default lambda value (in simple.el), this
version finally kills this nit: auto-filling a comment that starts in
column zero with filladapt turned off would cascade the #'s to the
right.
Now auto-filling seems to work with or without filladapt, and with the
comment starting in any column.
(python-mode): Set comment-indent-function.
(py-execute-import-or-reload): Cool new command that imports or
reloads the current file as a module, so as not to clutter the global
namespace. Bound to C-c C-m.
(py-execute-def-or-class): New command that sends the current def or
class to the interpreter. Bound to C-M-x.
(py-execute-string): New command that sends arbitrary string to the
interpreter. Not bound by default.
(py-describe-mode): Doco updates.
py-beginning-of-def-or-class, and defaliased for backwards
compatibility. ME patch to add optional second argument, count.
(end-of-python-def-or-class): Renamed to py-end-of-def-or-class, and
defaliased for backwards compatibility. ME patch to add optional
second argument, count.
(py-shell): Recognize the Python debugger prompt
(py-jump-to-exception): Force into python-mode any buffer that gets
jumped to on exception. Cope with py-exception-buffer possibly a
cons.
JPython interpreters. This implementation may suck.
(py-jpython-command, py-jpython-command-args): New variables.
(py-mode-map): py-toggle-shells bound to C-c C-t
(py-toggle-shells): Command to toggle between using CPython (the
default) and JPython. This is buffer local, and notice the mode-name
change.
(py-shell): Use either CPython or JPython. Note that py-execute-*
still needs to be modified.
an open paren, do a better job of reindenting the line. For example:
    def foo():
        print 'hello %s, %d' % (
    a, b)
Hit TAB on the line starting with `a'. Without this patch this line
will never be reindented.
(py-compute-indentation): int-to-char isn't defined in Emacs, but we
don't really need it anyway, so just remove this conversion. XEmacs
is happy either way.
(py-parse-state): The Emacs branch (i.e. w/o buffer-syntactic-context)
wasn't adjusting point correctly.
otherwise return nil.
(py-execute-region): When executing the buffer asynchronously in a
subprocess, if an exception occurred, show both the output buffer and
the file containing the exception, leaving point on the source line
containing bottom-most error in the traceback. If no exception
occurred, jump to the output buffer (no change).
from CC Mode.
(py-guess-indent-offset): Teach it about colons in `literals'
(e.g. comments and strings). Don't false hit colons in literals; keep
searching for a real block introducing line.
of a line in py-tab-face to aid in seeing mixed tab/space indentation.
This face defaults to the `default' face so it is unobtrusive until
you `M-x customize-face' py-tab-face to something obnoxious like
"Yellow".