iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
(Code contributed by Jiwon Seo.)
The documentation portion of the patch is being re-worked and will be
checked-in soon. Likewise, PEP 289 will be updated to reflect Guido's
rationale for the design decisions on binding behavior (as described in
in his patch comments and in discussions on python-dev).
The test file, test_genexps.py, is written in doctest format and is
meant to exercise all aspects of the the patch. Further additions are
welcome from everyone. Please stress test this new feature as much as
possible before the alpha release.
close() calls would attempt to free() the buffer already free()ed on
the first close(). [bug introduced with patch #788249]
Making sure that the buffer is free()ed in file object deallocation is
a belt-n-braces bit of insurance against a memory leak.
array.extend() now accepts iterable arguments implements as a series
of appends. Besides being a user convenience and matching the behavior
for lists, this the saves memory and cycles that would be used to
create a temporary array object.
lists. Speeds append() operations and reduces memory requirements
(because of more conservative overallocation).
Paves the way for the feature request for array.extend() to support
arbitrary iterable arguments.
The writelines() method now accepts any iterable argument and writes
the lines one at a time rather than using ''.join(lines) followed by
a single write. Results in considerable memory savings and makes the
method suitable for use with generator expressions.
(Championed by Bob Ippolito.)
The update() method for mappings now accepts all the same argument forms
as the dict() constructor. This includes item lists and/or keyword
arguments.
are within proper boundaries as specified in the docs.
This can break possible code (datetime module needed changing, for instance)
that uses 0 for values that need to be greater 1 or greater (month, day, and
day of year).
Fixes bug #897625.
recent gcc on Linux/x86)
[ 899109 ] 1==float('nan')
by implementing rich comparisons for floats.
Seems to make comparisons involving NaNs somewhat less surprising
when the underlying C compiler actually implements C99 semantics.
to list_init.
* Replaced the code in list_extend with the superior code from list_fill.
* Eliminated list_fill.
Results:
* list.extend() no longer creates an intermediate tuple except to handle
the special case of x.extend(x). The saves memory and time.
* list.extend(x) runs
about the same x is a list or tuple,
a little faster when x is an iterable not defining __len__, and
twice as fast when x is an iterable defining __len__.
* the code is about 15 lines shorter and no longer duplicates
functionality.
The Py2.3 approach overallocated small lists by up to 8 elements.
The last checkin would limited this to one but slowed down (by 20 to 30%)
the creation of small lists between 3 to 8 elements.
This tune-up balances the two, limiting overallocation to 3 elements
(significantly reducing space consumption from Py2.3) and running faster
than the previous checkin.
The first part of the growth pattern (0, 4, 8, 16) neatly meshes with
allocators that trigger data movement only when crossing a power of two
boundary. Also, then even numbers mesh well with common data alignments.
realloc(). This is achieved by tracking the overallocation size in a new
field and using that information to skip calls to realloc() whenever
possible.
* Simplified and tightened the amount of overallocation. For larger lists,
this overallocates by 1/8th (compared to the previous scheme which ranged
between 1/4th to 1/32nd over-allocation). For smaller lists (n<6), the
maximum overallocation is one byte (formerly it could be upto eight bytes).
This saves memory in applications with large numbers of small lists.
* Eliminated the NRESIZE macro in favor of a new, static list_resize function
that encapsulates the resizing logic. Coverting this back to macro would
give a small (under 1%) speed-up. This was too small to warrant the loss
of readability, maintainability, and de-coupling.
* Some functions using NRESIZE had grown unnecessarily complex in their
efforts to bend to the macro's calling pattern. With the new list_resize
function in place, those other functions could be simplified. That is
being saved for a separate patch.
* The ob_item==NULL check could be eliminated from the new list_resize
function. This would entail finding each piece of code that sets ob_item
to NULL and adding a new line to invalidate the overallocation tracking
field. Rather than impose a new requirement on other pieces of list code,
it was preferred to leave the NULL check in place and retain the benefits
of decoupling, maintainability and information hiding (only PyList_New()
and list_sort() need to know about the new field). This approach also
reduces the odds of breaking an extension module.
(Collaborative effort by Raymond Hettinger, Hye-Shik Chang, Tim Peters,
and Armin Rigo.)
which can be reviewed via
http://coding.derkeiler.com/Archive/Python/comp.lang.python/2003-12/1011.html
Duncan Booth investigated, and discovered that an "optimisation" was
in fact a pessimisation for small numbers of elements in a source list,
compared to not having the optimisation, although with large numbers
of elements in the source list the optimisation was quite beneficial.
He posted his change to comp.lang.python (but not to SF).
Further research has confirmed his assessment that the optimisation only
becomes a net win when the source list has more than 100 elements.
I also found that the optimisation could apply to tuples as well,
but the gains only arrive with source tuples larger than about 320
elements and are nowhere near as significant as the gains with lists,
(~95% gain @ 10000 elements for lists, ~20% gain @ 10000 elements for
tuples) so I haven't proceeded with this.
The code as it was applied the optimisation to list subclasses as
well, and this also appears to be a net loss for all reasonable sized
sources (~80-100% for up to 100 elements, ~20% for more than 500
elements; I tested up to 10000 elements).
Duncan also suggested special casing empty lists, which I've extended
to all empty sequences.
On the basis that list_fill() is only ever called with a list for the
result argument, testing for the source being the destination has
now happens before testing source types.
Original idea by Guido van Rossum.
Idea for skipable inner iterators by Raymond Hettinger.
Idea for argument order and identity function default by Alex Martelli.
Implementation by Hye-Shik Chang (with tweaks by Raymond Hettinger).
comments about why both calls to cyclic gc here can cause problems.
I'll backport to 2.3 maint. Since the calls were introduced in 2.3,
that will be the end of it.
and left shifts. (Thanks to Kalle Svensson for SF patch 849227.)
This addresses most of the remaining semantic changes promised by
PEP 237, except for repr() of a long, which still shows the trailing
'L'. The PEP appears to promise warnings for operations that
changed semantics compared to Python 2.3, but this is not
implemented; we've suffered through enough warnings related to
hex/oct literals and I think it's best to be silent now.
by the function object or by the method object, the function
object's attribute usually wins. Christian Tismer pointed out that
that this is really a mistake, because this only happens for special
methods (like __reduce__) where the method object's version is
really more appropriate than the function's attribute. So from now
on, all method attributes will have precedence over function
attributes with the same name.
Also SF patch 843455.
This is a critical bugfix.
I'll backport to 2.3 maint, but not beyond that. The bugs this fixes
have been there since weakrefs were introduced.
* Install the unittests, docs, newsitem, include file, and makefile update.
* Exercise the new functions whereever sets.py was being used.
Includes the docs for libfuncs.tex. Separate docs for the types are
forthcoming.
subtype_dealloc(): This left the dying object exposed to gc, so that
if cyclic gc triggered during the weakref callback, gc tried to delete
the dying object a second time. That's a disaster. subtype_dealloc()
had a (I hope!) unique problem here, as every normal dealloc routine
untracks the object (from gc) before fiddling with weakrefs etc. But
subtype_dealloc has obscure technical reasons for re-registering the
dying object with gc (already explained in a large comment block at
the bottom of the function).
The fix amounts to simply refraining from reregistering the dying object
with gc until after the weakref callback (if any) has been called.
This is a critical bug (hard to predict, and causes seemingly random
memory corruption when it occurs). I'll backport it to 2.3 later.
It works like the pure python verion except:
* it stops storing data after of the iterators gets deallocated
* the data queue is implemented with two stacks instead of one dictionary.
key provides C support for the decorate-sort-undecorate pattern.
reverse provide a stable sort of the list with the comparisions reversed.
* Amended the docs to guarantee sort stability.
* Added C coded getrandbits(k) method that runs in linear time.
* Call the new method from randrange() for ranges >= 2**53.
* Adds a warning for generators not defining getrandbits() whenever they
have a call to randrange() with too large of a population.
If a length-1 Unicode string was in the freelist and it was
uninitialized or pointed to a very large (magnitude) negative number,
the check
unicode_latin1[unicode->str[0]] == unicode
could cause a segmentation violation, e.g. unicode->str[0] is 0xcbcbcbcb.
Fix this in two ways:
1. Change guard befor unicode_latin1[] to test against 256U. If I
understand correctly, the unsigned long used to store UCS4 on my
box was getting converted to a signed long to compare with the
signed constant 256.
2. Change _PyUnicode_New() to make sure the first element of str is
always initialized to zero. There are several places in the code
where the caller can exit with an error before initializing any
of str, which would leave junk in str[0].
Also, silence a compiler warning on pointer vs. int arithmetic.
Bug fix candidate.
Add support for the iterator and mapping protocols.
For Py2.3, this was done for shelve, dumbdbm and other mapping objects, but
not for bsddb and dbhash which were inadvertently missed.
file_truncate(): C doesn't define what fflush(fp) does if fp is open
for update, and the preceding I/O operation on fp was input. On Windows,
fflush() actually changes the current file position then. Because
Windows doesn't support ftruncate() directly, this not only caused
Python's file.truncate() to change the file position (contra our docs),
it also caused the file not to change size.
Repaired by getting the initial file position at the start, restoring
it at the end, and tossing all the complicated micro-efficiency checks
trying to avoid "provably unnecessary" seeks. file.truncate() can't
be a frequent operation, and seeking to the current file position has
got to be cheap anyway.
Bugfix candidate.
* Relaxed the argument restrictions for non-operator methods. They now
allow any iterable instead of requiring a set. This makes the module
a little easier to use and paves the way for an efficient C
implementation which can take better advantage of iterable arguments
while screening out immutables.
* Deprecated Set.update() because it now duplicates Set.union_update()
* Adapted the tests and docs to include the above changes.
* Added more test coverage including testing identities and checking
to make sure non-restartable generators work as arguments.
Will backport to Py2.3.1 so that the interface remains consistent
across versions. The deprecation of update() will be changed to
a FutureWarning.
number. This accounts for the 2 refcount leaks per test_complex run
Michael Hudson discovered (I figured only I would have the stomach to
look for leaks in floating-point code <wink>).
The default seed is time.time().
Multiplied by 256 before truncating so that fractional seconds are used.
This way, two successive calls to random.seed() are much more likely
to produce different sequences.
The fix is confined to the Windows installer.
Not a bugfix candidate: the need for the new -n switch added here was
introduced by moving to the idlefork IDLE (so this change isn't needed
or helpful before 2.3).
arbitrary bytes before the actual zip compatible archive. Zipfiles
containing comments at the end of the file are still not supported.
Add a testcase to test_zipimport, and update NEWS.
This closes sf #775637 and sf #669036.
are satisfied in a case-insensitive manner, the attempt to import (the
non-existent) fcntl gets satisfied by FCNTL.py instead, and the tempfile
module defines a Unix-specific _set_cloexec() function in that case. As
a result, temp files can't be created then (blows up with an AttributeError
trying to reference fcntl.fcntl). This just popped up in the spambayes
project, where there is no apparent workaround (which is why I'm pushing
this in now).
New Plan (releases to be made off the head, ongoing random 2.4 stuff
to be done on a short-lived branch, provided anyone is motivated enough
to create one).
skip over functions with private names (as indicated by the underscore
naming convention). The old default created too much of a risk that
user tests were being skipped inadvertently. Note, this change could
break code in the unlikely case that someone had intentionally put
failing tests in the docstrings of private functions. The breakage
is easily fixable by specifying the old behavior when calling testmod()
or Tester(). The more likely case is that the silent failure was
unintended and that the user needed to be informed so the test could be
fixed.
Related to SF patch 723231 (which pointed out the problem, but didn't
fix it, just shut up the warning msg -- which was pointing out a dead-
serious bug!).
Bugfix candidate.
behavior, creating many threads very quickly. A long debugging session
revealed that the Windows implementation of PyThread_start_new_thread()
was choked with "laziness" errors:
1. It checked MS _beginthread() for a failure return, but when that
happened it returned heap trash as the function result, instead of
an id of -1 (the proper error-return value).
2. It didn't consider that the Win32 CreateSemaphore() can fail.
3. When creating a great many threads very quickly, it's quite possible
that any particular bootstrap call can take virtually any amount of
time to return. But the code waited for a maximum of 5 seconds, and
didn't check to see whether the semaphore it was waiting for got
signaled. If it in fact timed out, the function could again return
heap trash as the function result. This is actually what confused
the test program, as the heap trash usually turned out to be 0, and
then multiple threads all got id 0 simultaneously, confusing the
hell out of threading.py's _active dict (mapping id to thread
object). A variety of baffling behaviors followed from that.
WRT #1 and #2, error returns are checked now, and "thread.error: can't
start new thread" gets raised now if a new thread (or new semaphore)
can't be created. WRT #3, we now wait for the semaphore without a
timeout.
Also removed useless local vrbls, folded long lines, and changed callobj
to a stack auto (it was going thru malloc/free instead, for no discernible
reason).
Bugfix candidate.
I won't have time to write real docs, but spent a lot of time adding
comments to his code and fleshing out the exported functions' docstrings.
There's probably opportunity to consolidate how docstrings get extracted
too, and the new code for that is probably better than the old code for
that (which strained mightily to recover from 2.2's new class/type
gimmicks).
SF bug #760703: SocketHandler and LogRecord don't work well together
SF bug #757821: logging module docs
Applied Vinay Sajip's patch with a few minor fixups and a NEWS item.
Patched __init__.py - added new function
makeLogRecord (for bug report 760703).
Patched handlers.py - updated some docstrings and
deleted some old commented-out code.
Patched test_logging.py to make use of makeLogRecord.
Patched liblogging.tex to fill documentation gaps (both
760703 and bug 757821).
now accepts "True" when a test expects "1", and similarly for "False"
versus "0". This is un-doctest-like, but on balance makes it much
more pleasant to write doctests that pass under 2.2 and 2.3. I expect
it to go away again, when 2.2 is forgotten. In the meantime, there's
a new doctest module constant that can be passed to a new optional
argument, if you want to turn this behavior off.
Note that this substitution is very simple-minded: the expected and
actual outputs have to consist of single tokens. No attempt is made,
e.g., to accept [True, False] when a test expects [1, 0]. This is a
simple hack for simple tests, and I intend to keep it that way.
fix the hangs on Win98SE when starting IDLE via "python" from a DOS box,
but did appear to make them harder to provoke. I closed that bug report
as being hopeless (and if someone wants to open it again, don't dare
assign it to me again <0.1 wink>).
have to insert it in front of other classes, nor do dirty tricks like
inserting a "dummy" HTTPHandler after a ProxyHandler when building an
opener with proxy support.
Python-Dev. Fixed typos in test comments. Added some trivial new test
guts to show the parallelism (now) among __delitem__, __setitem__ and
__getitem__ wrt error conditions.
Still a bugfix candidate for 2.2.3 final, but waiting for Fred to get a
chance to chime in.
Someone review this, please! Final releases are getting close, Fred
(the weakref guy) won't be around until Tuesday, and the pre-patch
code can indeed raise spurious RuntimeErrors in the presence of
threads or mutating comparison functions.
See the bug report for my confusions: I can't see any reason for why
__delitem__ iterated over the keys. The new one-liner implementation
is much faster, can't raise RuntimeError, and should be better-behaved
in all respects wrt threads.
New tests test_weak_keyed_bad_delitem and
test_weak_keyed_cascading_deletes fail before this patch.
Bugfix candidate for 2.2.3 too, if someone else agrees with this patch.
float_pow(): Don't let the platform pow() raise -1.0 to an integer power
anymore; at least glibc gets it wrong in some cases. Note that
math.pow() will continue to deliver wrong (but platform-native) results
in such cases.
tp_free is NULL or PyObject_Del at the end. Because it's a base type
it must call tp_free in its dealloc function, and because it's gc'able
it must not call PyObject_Del.
inherit_slots(): Don't inherit tp_free unless the type and its base
agree about whether they're gc'able. If the type is gc'able and the
base is not, and the base uses the default PyObject_Del for its
tp_free, give the type PyObject_GC_Del for its tp_free (the appropriate
default for a gc'able type).
cPickle.c: The Pickler and Unpickler types claim to be base classes
and gc'able, but their dealloc functions didn't call tp_free.
Repaired that. Also call PyType_Ready() on these typeobjects, so
that the correct (PyObject_GC_Del) default memory-freeing function
gets plugged into these types' tp_free slots.
one good use: a subclass adding a method to express the duration as
a number of hours (or minutes, or whatever else you want to add). The
native breakdown into days+seconds+us is often clumsy. Incidentally
moved a large chunk of object-initialization code closer to the top of
the file, to avoid worse forward-reference trickery.
- When redirecting, always use GET. This is common practice and
more-or-less sanctioned by the HTTP standard.
- Add a handler for 307 redirection, which becomes an error for POST,
but a regular redirect for GET and HEAD.
After removing that, two testers on machines where C: is not the system
drive reported that the installer suggested their system drive instead
of C:, and that's what they wanted it to do.
Reverted a Py2.3b1 change to iterator in subclasses of list and tuple.
They had been changed to use __getitem__ whenever it had been overriden
in the subclass.
This caused some usabilty and performance problems. Also, it was
inconsistent with the rest of python where many container methods
access the underlying object directly without first checking for
an overridden getter. Users needing a change in iterator behavior
should override it directly.
- The socket module now provides the functions inet_pton and inet_ntop
for converting between string and packed representation of IP addresses.
See SF patch #658327.
This still needs a bit of work in the doc area, because it is not
available on all platforms (especially not on Windows).
Revised netrc.py to include the additional ascii punctuation
characters. Omitted the other logic changes. See
Lib/netrc.py 1.17.
Since this is more of a feature request than a bug,
including in Py2.3 but not recommending for backporting.
docs here are best-guess: the MS docs I could find weren't clear, and
some even claimed _commit() has no effect on Win32 systems (which is
easily shown to be false just by trying it).
* Can now test for basic blocks.
* Optimize inverted comparisions.
* Optimize unary_not followed by a conditional jump.
* Added a new opcode, NOP, to keep code size constant.
* Applied NOP to previous transformations where appropriate.
Note, the NOP would not be necessary if other functions were
added to re-target jump addresses and update the co_lnotab mapping.
That would yield slightly faster and cleaner bytecode at the
expense of optimizer simplicity and of keeping it decoupled
from the line-numbering structure.
raising an exception. This is consistent with calling the
constructors for the other builtin types -- called without argument
they all return the false value of that type. (SF patch #724135)
Thanks to Alex Martelli.
I'm finding some pretty baffling output, like reprs consisting entirely
of three left parens. At least this will let us know what type the object
is (it's not str -- there's no quote character in the repr).
New tool combinerefs.py, to combine the two output blocks produced via
PYTHONDUMPREFS.
even farther down, to just before the call to
_PyObject_DebugMallocStats(). This required the following changes:
- pystate.c, PyThreadState_GetDict(): changed not to raise an
exception or issue a fatal error when no current thread state is
available, but simply return NULL without raising an exception
(ever).
- object.c, Py_ReprEnter(): when PyThreadState_GetDict() returns NULL,
don't raise an exception but return 0. This means that when
printing a container that's recursive, printing will go on and on
and on. But that shouldn't happen in the case we care about (see
first bullet).
- Updated Misc/NEWS and Doc/api/init.tex to reflect changes to
PyThreadState_GetDict() definition.
interpreted by slicing, so negative values count from the end of the
list. This was the only place where such an interpretation was not
placed on a list index.
A small fix for bug #545855 and Greg Chapman's
addition of op code SRE_OP_MIN_REPEAT_ONE for
eliminating recursion on simple uses of pattern '*?' on a
long string.
- range() now works even if the arguments are longs with magnitude
larger than sys.maxint, as long as the total length of the sequence
fits. E.g., range(2**100, 2**101, 2**100) is the following list:
[1267650600228229401496703205376L]. (SF patch #707427.)
of PyObject_HasAttr(); the former promises never to execute
arbitrary Python code. Undid many of the changes recently made to
worm around the worst consequences of that PyObject_HasAttr() could
execute arbitrary Python code.
Compatibility is hard to discuss, because the dangerous cases are
so perverse, and much of this appears to rely on implementation
accidents.
To start with, using hasattr() to check for __del__ wasn't only
dangerous, in some cases it was wrong: if an instance of an old-
style class didn't have "__del__" in its instance dict or in any
base class dict, but a getattr hook said __del__ existed, then
hasattr() said "yes, this object has a __del__". But
instance_dealloc() ignores the possibility of getattr hooks when
looking for a __del__, so while object.__del__ succeeds, no
__del__ method is called when the object is deleted. gc was
therefore incorrect in believing that the object had a finalizer.
The new method doesn't suffer that problem (like instance_dealloc(),
_PyObject_Lookup() doesn't believe __del__ exists in that case), but
does suffer a somewhat opposite-- and even more obscure --oddity:
if an instance of an old-style class doesn't have "__del__" in its
instance dict, and a base class does have "__del__" in its dict,
and the first base class with a "__del__" associates it with a
descriptor (an object with a __get__ method), *and* if that
descriptor raises an exception when __get__ is called, then
(a) the current method believes the instance does have a __del__,
but (b) hasattr() does not believe the instance has a __del__.
While these disagree, I believe the new method is "more correct":
because the descriptor *will* be called when the object is
destructed, it can execute arbitrary Python code at the time the
object is destructed, and that's really what gc means by "has a
finalizer": not specifically a __del__ method, but more generally
the possibility of executing arbitrary Python code at object
destruction time. Code in a descriptor's __get__() executed at
destruction time can be just as problematic as code in a
__del__() executed then.
So I believe the new method is better on all counts.
Bugfix candidate, but it's unclear to me how all this differs in
the 2.2 branch (e.g., new-style and old-style classes already
took different gc paths in 2.3 before this last round of patches,
but don't in the 2.2 branch).
platforms which have dup(2). The makefile() method is built directly on top
of the socket without duplicating the file descriptor, allowing timeouts to
work properly. Includes a new test case (urllibnet) which requires the
network resource.
Closes bug 707074.
pack_float, pack_double, save_float: All the routines for creating
IEEE-format packed representations of floats and doubles simply ignored
that rounding can (in rare cases) propagate out of a long string of
1 bits. At worst, the end-off carry can (by mistake) interfere with
the exponent value, and then unpacking yields a result wrong by a factor
of 2. In less severe cases, it can end up losing more low-order bits
than intended, or fail to catch overflow *caused* by rounding.
Bugfix candidate, but I already backported this to 2.2.
In 2.3, this code remains in severe need of refactoring.
variables to store internal data. As a result, any atempts to use the
unicode system with multiple active interpreters, or successive
interpreter executions, would fail.
Now that information is stored into members of the PyInterpreterState
structure.
- Implement the behavior as specified in PEP 277, meaning os.listdir()
will only return unicode strings if it is _called_ with a unicode
argument.
- And then return only unicode, don't attempt to convert to ASCII.
- Don't switch on Py_FileSystemDefaultEncoding, but simply use the
default encoding if Py_FileSystemDefaultEncoding is NULL. This means
os.listdir() can now raise UnicodeDecodeError if the default encoding
can't represent the directory entry. (This seems better than silcencing
the error and fall back to a byte string.)
- Attempted to decribe the above in Doc/lib/libos.tex.
- Reworded the Misc/NEWS items to reflect the current situation.
This checkin also fixes bug #696261, which was due to os.listdir() not
using Py_FileSystemDefaultEncoding, like all file system calls are
supposed to.
[ 555817 ] Flawed fcntl.ioctl implementation.
with my patch that allows for an array to be mutated when passed
as the buffer argument to ioctl() (details complicated by
backwards compatibility considerations -- read the docs!).
Allow mixed-type __eq__ and __ne__ for Set objects. This is messier than
I'd like because Set *also* implements __cmp__. I know of one glitch now:
cmp(s, t) returns 0 now when s and t are both Sets and s == t, despite
that Set.__cmp__ unconditionally raises TypeError (and by intent). The
rub is that __eq__ gets tried first, and the x.__eq__(y) True result
convinces Python that cmp(x, y) is 0 without even calling Set.__cmp__.
rarely needed, but can sometimes be useful to release objects
referenced by the traceback held in sys.exc_info()[2]. (SF patch
#693195.) Thanks to Kevin Jacobs!
test_linuxaudiodev.py) are no longer run by default. This is
because they don't always work, depending on your hardware and
software. To run these tests, you must use an invocation like
./python Lib/test/regrtest.py -u audio test_ossaudiodev
with an indented code block but no newline would raise SyntaxError.
This would have been a four-line change in parsetok.c... Except
codeop.py depends on this behavior, so a compilation flag had to be
invented that causes the tokenizer to revert to the old behavior;
this required extra changes to 2 .h files, 2 .c files, and 2 .py
files. (Fixes SF bug #501622.)
This changes the default __new__ to refuse arguments iff tp_init is the
default __init__ implementation -- thus making it a TypeError when you
try to pass arguments to a constructor if the class doesn't override at
least __init__ or __new__.
mostly from SF patch #683257, but I had to change unlock_import() to
return an error value to avoid fatal error.
Should this be backported? The patch requested this, but it's a new
feature.
folded; this will change in Python 2.4. On a 32-bit machine, this
happens for 0x80000000 through 0xffffffff, and for octal constants in
the same value range. No warning is issued if an explicit base is
given, *or* if the string contains a sign (since in those cases no
sign folding ever happens).
"Unsigned" (i.e., positive-looking, but really negative) hex/oct
constants with a leading minus sign are once again properly negated.
The micro-optimization for negated numeric constants did the wrong
thing for such hex/oct constants. The patch avoids the optimization
for all hex/oct constants.
This needs to be backported to Python 2.2!
__ne__ no longer complain if they don't know how to compare to the other
thing. If no meaningful way to compare is known, saying "not equal" is
sensible. This allows things like
if adatetime in some_sequence:
and
somedict[adatetime] = whatever
to work as expected even if some_sequence contains non-datetime objects,
or somedict non-datetime keys, because they only call __eq__.
It still complains (raises TypeError) for mixed-type comparisons in
contexts that require a total ordering, such as list.sort(), use as a
key in a BTree-based data structure, and cmp().
extension implemented flush() was fixed. Scott also rewrite the
zlib test suite using the unittest module. (SF bug #640230 and
patch #678531.)
Backport candidate I think.
anymore either, so don't. This also allows to get rid of obscure code
making __getnewargs__ identical to __getstate__ (hmm ... hope there
wasn't more to this than I realize!).
(pickling no longer needs them, and immutable objects shouldn't have
visible __setstate__() methods regardless). Rearranged the code to
put the internal setstate functions in the constructor sections.
Repaired the timedelta reduce() method, which was still producing
stuff that required a public timedelta.__setstate__() when unpickling.
compare against "the other" argument, we raise TypeError,
in order to prevent comparison from falling back to the
default (and worse than useless, in this case) comparison
by object address.
That's fine so far as it goes, but leaves no way for
another date/datetime object to make itself comparable
to our objects. For example, it leaves Marc-Andre no way
to teach mxDateTime dates how to compare against Python
dates.
Discussion on Python-Dev raised a number of impractical
ideas, and the simple one implemented here: when we don't
know how to compare against "the other" argument, we raise
TypeError *unless* the other object has a timetuple attr.
In that case, we return NotImplemented instead, and Python
will give the other object a shot at handling the
comparison then.
Note that comparisons of time and timedelta objects still
suffer the original problem, though.
This gives much the same treatment to datetime.fromtimestamp(stamp, tz) as
the last batch of checkins gave to datetime.now(tz): do "the obvious"
thing with the tz argument instead of a senseless thing.
tzinfo.fromutc() method. The C code doesn't implement any of this
yet (well, not the C code on the machine I'm using now), nor does
the test suite reflect it. The Python datetime.py implementation and
test suite in the sandbox do match these doc changes. The C
implementation probably won't catch up before Thursday (Wednesday is
a scheduled "black hole" day this week <0.4 wink>).
is not supported on sets. (Unfortunately, sorting a list of sets may
still return random results because it uses < exclusively, but for
sets that inly implements a partial ordering. Oh well.)
into time. This is little more than *exporting* the datetimetz object
under the name "datetime", and similarly for timetz. A good implementation
of this change requires more work, but this is fully functional if you
don't stare too hard at the internals (e.g., right now a type named
"datetime" shows up as a base class of the type named "datetime"). The
docs also need extensive revision, not part of this checkin.
cases, plus even tougher tests of that. This implementation follows
the correctness proof very closely, and should also be quicker (yes,
I wrote the proof before the code, and the code proves the proof <wink>).
(or None) now. In 2.3a1 they could also return an int or long, but that
was an unhelpfully redundant leftover from an earlier version wherein
they couldn't return a timedelta. TOOWTDI.
Added the logging package. In the meantime, Neal Norwitz added a
test_logging.py to the std test suite, which would have caught this
oversight in the Windows installer.
compiler flags which are necessary to get a clean compile. The former is
for user-specified optimizer, debug, trace fiddling. See patch 640843.
Add /sw/lib and /sw/include to setup.py search paths on Darwin to take
advantage of fink goodies.
Add scriptsinstall target to Makefile to install certain scripts from
Tools/scripts directory.
env.
This adds @CFLAGS@ and @CPPFLAGS@ to the end of the respective
variable definitions. It also adds $(LDFLAGS) to the $(CC) invocation
to build $(PGEN).
module.
The code is shorter, more readable, faster, and dramatically increases the
range of acceptable dates.
Also, used the floor division operator in leapdays().
[ 643835 ] Set Next Statement for Python debuggers
with a few tweaks by me: adding an unsigned or two, mentioning that
not all jumps are allowed in the doc for pdb, adding a NEWS item and
a note to whatsnew, and AuCTeX doing something cosmetic to libpdb.tex.
[#521782] unreliable file.read() error handling
* Objects/fileobject.c
(file_read): Clear errors before leaving the loop in all situations,
and also check if some data was read before exiting the loop with an
EWOULDBLOCK exception.
* Doc/lib/libstdtypes.tex
* Objects/fileobject.c
Document that sometimes a read() operation can return less data than
what the user asked, if running in non-blocking mode.
* Misc/NEWS
Document the fix.
[#448679] Left to right
* Python/compile.c
(com_dictmaker): Reordered evaluation of dictionaries to follow strict
LTR evaluation.
* Lib/compiler/pycodegen.py
(CodeGenerator.visitDict): Reordered evaluation of dictionaries to
follow strict LTR evaluation.
* Doc/ref/ref5.tex
Documented the general LTR evaluation order idea.
* Misc/NEWS
Documented change in evaluation order of dictionaries.
[#636769] Fix for major rexec bugs
* Lib/rexec.py
(FileBase): Added 'xreadlines' and '__iter__' to allowed file methods.
(FileWrapper.__init__): Removed unnecessary self.f variable, which gave
direct access to the file object.
(RExec): Added 'xreadlines' and '_weakref' to allowed modules.
(RExec.r_open): Convert string subclasses to a real string classes
before doing comparisons with mode parameter.
* Lib/ihooks.py
(BasicModuleImporter.import_module/reload/unload): Convert the module
name to a real string before working with it.
(ModuleImporter.import_module/import_it/reload): Convert the module
name to a real strings before working with it.
* Misc/NEWS
Document the change.
supported as the second argument. This has the same meaning as
for isinstance(), i.e. issubclass(X, (A, B)) is equivalent
to issubclass(X, A) or issubclass(X, B). Compared to isinstance(),
this patch does not search the tuple recursively for classes, i.e.
any entry in the tuple that is not a class, will result in a
TypeError.
This closes SF patch #649608.
this can result in significantly smaller files. All classes as well as the
open function now accept an optional binary parameter, which defaults to
False for backward compatibility. Added a small test suite, updated the
libref documentation (including documenting the exported classes and fixing
a few other nits) and added a note about the change to Misc/NEWS.
[#495695] webbrowser.py: selection of browser
* Lib/webbrowser.py
Only include graphic browsers in _tryorder if DISPLAY is set. Also,
included skipstone support, as suggested by Fred in the mentioned bug.
* Misc/NEWS
Mention fix and skipstone inclusion.
whether this is a correct thing to do:
+ There are linker warnings (see PCbuild\readme.txt).
+ test_bsddb passes, in both release and debug builds now.
+ test_bsddb3 has several failures, but it did before too.
Also made pythoncore a dependency of the _bsddb project, updated
build instructions, added database conversion XXX to NEWS, and fiddled
the Windows installer accordingly.
+ News blurb, but as much XXX as news.
+ Updated installer (install the new bsddb package, and the Berkeley DLL;
still don't know how to fold that into _bsddb.pyd).
+ Fleshed out build instructions.
+ Debug Python still blows up.
Armin Rigo's Draconian but effective fix for
SF bug 453523: list.sort crasher
slightly fiddled to catch more cases of list mutation. The dreaded
internal "immutable list type" is gone! OTOH, if you look at a list
*while* it's being sorted now, it will appear to be empty. Better
than a core dump.
This bug happened because: 1) the scanner_search and scanner_match methods
were not checking the buffer limits before increasing the current pointer;
and 2) SRE_SEARCH was using "if (ptr == end)" as a loop break, instead of
"if (ptr >= end)".
* Modules/_sre.c
(SRE_SEARCH): Check for "ptr >= end" to break loops, so that we don't
hang forever if a pointer passing the buffer limit is used.
(scanner_search,scanner_match): Don't increment the current pointer
if we're going to pass the buffer limit.
* Misc/NEWS
Mention the fix.
* Lib/distutils/command/bdist_rpm.py
(bdist_rpm.initialize_options): Included verify_script attribute.
(bdist_rpm.finalize_package_data): Ensure that verify_script is a filename.
(bdist_rpm._make_spec_file): Included verify_script in script_options
tuple.
* Misc/NEWS
Mention change.
from Greg Chapman.
* Modules/_sre.c
(lastmark_restore): New function, implementing algorithm to restore
a state to a given lastmark. In addition to the similar algorithm used
in a few places of SRE_MATCH, restore lastindex when restoring lastmark.
(SRE_MATCH): Replace lastmark inline restoring by lastmark_restore(),
function. Also include it where missing. In SRE_OP_MARK, set lastindex
only if i > lastmark.
* Lib/test/re_tests.py
* Lib/test/test_sre.py
Included regression tests for the fixed bugs.
* Misc/NEWS
Mention fixes.
The last round boosted "the limit" from 2GB to 4GB. This round gets
rid of the 4GB limit. For files > 4GB, gzip stores just the last 32
bits of the file size, and now we play along with that too. Tested
by hand (on a 6+GB file) on Win2K.
Boosting from 2GB to 4GB was arguably enough "a bugfix". Going beyond
that smells more like "new feature" to me.
Fixed the signed/unsigned confusions when dealing with files >= 2GB.
4GB is still a hard limitation of the gzip file format, though.
Testing this was a bitch on Win98SE due to frequent system freezes. It
didn't freeze while running gzip, it kept freezing while trying to *create*
a > 2GB test file! This wasn't Python's doing. I don't know of a
reasonable way to test this functionality in regrtest.py, so I'm not
checking in a test case (a test case would necessarily require creating
a 2GB+ file first, using gzip to zip it, using gzip to unzip it again,
and then compare before-and-after; so >4GB free space would be required,
and a loooong time; I did all this "by hand" once).
Bugfix candidate, I guess.
sys.getwindowsversion() on Windows (new enahanced Tim-proof <wink>
version), and fix test_pep277.py in a few minor ways.
Including doc and NEWS entries.
fairly large, most are caused by reformatting section and subsection
headings. The changes fall into the following categories:
* reformatted section and subsection headers.
* escaped isolated asterisks which would be interpreted as starting bold
or italic text (e.g. "void (*)(PyObject \*)").
* quoted stuff that looks like internal references but isn't
(e.g. ``PyCmp_``).
* changed visually balanced quotes to just use apostrophes
(e.g. "'string'" instead of "`string'").
* introduced and indenting multiline chunks of code.
* created one table (search for "New codecs").
from SF patch http://www.python.org/sf/554192
This adds two new functions to mimetypes:
guess_all_extensions() which returns a list of all known
extensions for a mime type, and add_type() which adds one
mapping between a mime type and an extension.
interning. I modified Oren's patch significantly, but the basic idea
and most of the implementation is unchanged. Interned strings created
with PyString_InternInPlace() are now mortal, and you must keep a
reference to the resulting string around; use the new function
PyString_InternImmortal() to create immortal interned strings.
[ 587993 ] SET_LINENO killer
Remove SET_LINENO. Tracing is now supported by inspecting co_lnotab.
Many sundry changes to document and adapt to this change.
k_mul() when inputs have vastly different sizes, and a little more
efficient when they're close to a factor of 2 out of whack.
I consider this done now, although I'll set up some more correctness
tests to run overnight.
correct now, so added some final comments, did some cleanup, and enabled
it for all long-int multiplies. The KARAT envar no longer matters,
although I left some #if 0'ed code in there for my own use (temporary).
k_mul() is still much slower than x_mul() if the inputs have very
differenent sizes, and that still needs to be addressed.
SF 560379: Karatsuba multiplication.
Lots of things were changed from that. This needs a lot more testing,
for correctness and speed, the latter especially when bit lengths are
unbalanced. For now, the Karatsuba code gets invoked if and only if
envar KARAT exists.
PyErr_SetExcFromWindowsErr(), PyErr_SetExcFromWindowsErrWithFilename().
Similar to PyErr_SetFromWindowsErrWithFilename() and
PyErr_SetFromWindowsErr(), but they allow to specify
the exception type to raise. Available on Windows.
See SF patch #576458.
more trivial lexical helper macros so that uses of these guys expand
to nothing at all when they're not enabled. This should help sub-
standard compilers that can't do a good job of optimizing away the
previous "(void)0" expressions.
Py_DECREF: There's only one definition of this now. Yay! That
was that last one in the family defined multiple times in an #ifdef
maze.
Py_FatalError(): Changed the char* signature to const char*.
_Py_NegativeRefcount(): New helper function for the Py_REF_DEBUG
expansion of Py_DECREF. Calling an external function cuts down on
the volume of generated code. The previous inline expansion of abort()
didn't work as intended on Windows (the program often kept going, and
the error msg scrolled off the screen unseen). _Py_NegativeRefcount
calls Py_FatalError instead, which captures our best knowledge of
how to abort effectively across platforms.
Repair segfaults and infinite loops in COUNT_ALLOCS builds in the
presence of new-style (heap-allocated) classes/types.
Bugfix candidate. I'll backport this to 2.2. It's irrelevant in 2.1.
This patch enhances Python/import.c/find_module() so
that unicode objects found in sys.path will be treated
as legal directory names (The current code ignores
anything that is not a str). The unicode name is
converted to str using Py_FileSystemDefaultEncoding.
library. Since multiple versions can be installed simultaneously, it's
crucial that you only select libraries and header files which are compatible
with each other. Version checking is done from highest version to lowest.
Building using version 1 of Berkeley DB is disabled by default because of
the hash file bugs people keep rediscovering. It can be enabled by
uncommenting a few lines in setup.py. Closes patch 553108.
This was a simple typo. Strange that the compiler didn't catch it!
Instead of WHY_CONTINUE, two tests used CONTINUE_LOOP, which isn't a
why_code at all, but an opcode; but even though 'why' is declared as
an enum, comparing it to an int is apparently not even worth a
warning -- not in gcc, and not in VC++. :-(
Will fix in 2.2 too.
[ 400998 ] experimental support for extended slicing on lists
somewhat spruced up and better tested than it was when I wrote it.
Includes docs & tests. The whatsnew section needs expanding, and arrays
should support extended slices -- later.
While I was at it, I added a tp_clear handler and changed the
tp_dealloc handler to use the clear_slots helper for the tp_clear
handler.
Also tightened the rules for slot names: they must now be proper
identifiers (ignoring the dirty little fact that <ctype.h> is locale
sensitive).
Also set mp->flags = READONLY for the __weakref__ pseudo-slot.
Most of this is a 2.2 bugfix candidate; I'll apply it there myself.
BOM_UTF32, BOM_UTF32_LE and BOM_UTF32_BE that represent the Byte
Order Mark in UTF-8, UTF-16 and UTF-32 encodings for little and
big endian systems.
The old names BOM32_* and BOM64_* were off by a factor of 2.
This closes SF bug http://www.python.org/sf/555360
Change the module constructor (module_init) to have the signature
__init__(name:str, doc=None); this prevents the call from type_new()
to succeed. While we're at it, prevent repeated calling of
module_init for the same module from leaking the dict, changing the
semantics so that __dict__ is only initialized if NULL.
Also adding a unittest, test_module.py.
This is an incompatibility with 2.2, if anybody was instantiating the
module class before, their argument list was probably empty; so this
can't be backported to 2.2.x.
[ 559250 ] more POSIX signal stuff
Adds support (and docs and tests and autoconfery) for posix signal
mask handling -- sigpending, sigprocmask and sigsuspend.
for 'str' and 'unicode', and can be used instead of
types.StringTypes, e.g. to test whether something is "a string":
isinstance(x, string) is True for Unicode and 8-bit strings. This
is an abstract base class and cannot be instantiated directly.
and the .seed() and .whseed() methods failed to reset it. In other
words, setting the seed didn't completely determine the sequence of
results produced by random.gauss(). It does now. Programs repeatedly
mixing calls to a seed method with calls to gauss() may see different
results now.
Bugfix candidate (random.gauss() has always been broken in this way),
despite that it may change results.
This now does a dynamic analysis of which elements are so frequently
repeated as to constitute noise. The primary benefit is an enormous
speedup in find_longest_match, as the innermost loop can have factors
of 100s less potential matches to worry about, in cases where the
sequences have many duplicate elements. In effect, this zooms in on
sequences of non-ubiquitous elements now.
While I like what I've seen of the effects so far, I still consider
this experimental. Please give it a try!
left and right type were of the same type and not classic instances.
This shortcut is dangerous for proxy types, because it means that
coerce(Proxy(1), Proxy(2.1)) leaves Proxy(1) unchanged rather than
turning it into Proxy(1.0).
In an ever-so-slight change of semantics, I now only take the shortcut
when the left and right types are of the same type and don't have the
CHECKTYPES feature. It so happens that classic instances have this
flag, so the shortcut is still skipped in this case (i.e. nothing
changes for classic instances). Proxies also have this flag set
(otherwise implementing numeric operations on proxies would become
nightmarish) and this means that the shortcut is also skipped there,
as desired. It so happens that int, long and float also have this
flag set; that means that e.g. coerce(1, 1) will now invoke
int_coerce(). This is fine: int_coerce() can deal with this, and I'm
not worried about the performance; int_coerce() is only invoked when
the user explicitly calls coerce(), which should be rarer than rare.
closes SF #514433
can now pass 'None' as the filename for the bsddb.*open functions,
and you'll get an in-memory temporary store.
docs are ripped out of the bsddb dbopen man page. Fred may want to
clean them up.
Considering this for 2.2, but not 2.1.
option. It was the cause of at least one way UNWISE.EXE could vanish
(install a python; uninstall it; install it again; reboot the machine;
abracadabra the uinstaller is gone).
Bugfix candidate, but I'll backport it myself.
Add a method zfill to str, unicode and UserString and change
Lib/string.py accordingly.
This activates the zfill version in unicodeobject.c that was
commented out and implements the same in stringobject.c. It also
adds the test for unicode support in Lib/string.py back in and
uses repr() instead() of str() (as it was before Lib/string.py 1.62)
when PyType_Ready() was called, if ob_type was found to be NULL, it
was always set to &PyType_Type; now it is set to base->ob_type,
where base is tp_base, defaulting to &PyObject_Type.
- PyType_Ready() accidentally did not inherit tp_is_gc; now it does.
Bugfix candidate.
This patch makes it possible to pass Warning instances as the first
argument to warnings.warn. In this case the category argument
will be ignored. The message text used will be str(warninginstance).
As promised in my response to the bug report, I'm not really fixing
it; in fact, one could argule over what the proper fix should do.
Instead, I'm adding a little magic that raises TypeError if you try to
pickle an instance of a class that has __slots__ but doesn't define or
override __getstate__. This is done by adding a bozo __getstate__
that always raises TypeError.
Bugfix candidate (also the checkin to typeobject.c, of course).
dropping MS's inadequate _chsize() function. This was inspired by
SF patch 498109 ("fileobject truncate support for win32"), which I
rejected.
libstdtypes.tex: Someone who knows should update the availability
blurb. For example, if it's available on Linux, it would be good to
say so.
test_largefile: Uncommented the file.truncate() tests, and reworked to
do more. The old comment about "permission errors" in the truncation
tests under Windows was almost certainly due to that the file wasn't open
for *write* access at this point, so of course MS wouldn't let you
truncate it. I'd be appalled if a Unixish system did.
CAUTION: Someone should run this test on Linux (etc) too. The
truncation part was commented out before. Note that test_largefile isn't
run by default.
helper module _ssl.
The support for the RAND_* APIs in _ssl is now only enabled
for OpenSSL 0.9.5 and up since they were added in that
release.
Note that socketmodule.* should really be renamed to _socket.* --
unfortunately, this seems to lose the CVS history of the file.
Please review and test... I was only able to test the header file
chaos in socketmodule.c/h on Linux. The test run through fine
and compiles don't give errors or warnings.
WARNING: This patch does *not* include changes to the various
non-Unix build process files.
where their capabilities intersect. Would be nice if people using non-
MSVC compilers (Borland etc) took a whack at doing something similar for
them (this code relies on the MS _cwait function).
Instead of sending the real user and host, use "anonymous@" (i.e. no
host name at all!) as the default anonymous FTP password. This avoids
privacy violations.
Removed the ancient "#define ANY void".
Bugfix candidate? Hard call. The bug report claims the existence of
this #define creates conflicts with other packages, which is easy to
believe. OTOH, some extension authors may still be relying on its
presence. I'm afraid you can't win on this one.
PyDict_UpdateFromSeq2(): removed it.
PyDict_MergeFromSeq2(): made it public and documented it.
PyDict_Merge() docs: updated to reveal <wink> that the second
argument can be any mapping object.
Anthony Roach.
Release the global interpreter lock around platform spawn calls.
Bugfix candidate? Hard to say; I favor "yes, bugfix".
These clearly *should* have been releasing the GIL all along, if for no
other reason than compatibility with the similar os.system(). But it's
possible some program out there is (a) multithreaded, (b) calling a spawn
function with P_WAIT, and (c) relying on the spawn call to block all their
threads until the spawned program completes. I think it's very unlikely
anyone is doing that on purpose, but someone may be doing so by accident.
Big Hammer to implement -Qnew as PEP 238 says it should work (a global
option affecting all instances of "/").
pydebug.h, main.c, pythonrun.c: define a private _Py_QnewFlag flag, true
iff -Qnew is passed on the command line. This should go away (as the
comments say) when true division becomes The Rule. This is
deliberately not exposed to runtime inspection or modification: it's
a one-way one-shot switch to pretend you're using Python 3.
ceval.c: when _Py_QnewFlag is set, treat BINARY_DIVIDE as
BINARY_TRUE_DIVIDE.
test_{descr, generators, zipfile}.py: fiddle so these pass under
-Qnew too. This was just a matter of s!/!//! in test_generators and
test_zipfile. test_descr was trickier, as testbinop() is passed
assumptions that "/" is the same as calling a "__div__" method; put
a temporary hack there to call "__truediv__" instead when the method
name is "__div__" and 1/2 evaluates to 0.5.
Three standard tests still fail under -Qnew (on Windows; somebody
please try the Linux tests with -Qnew too! Linux runs a whole bunch
of tests Windows doesn't):
test_augassign
test_class
test_coercion
I can't stay awake longer to stare at this (be my guest). Offhand
cures weren't obvious, nor was it even obvious that cures are possible
without major hackery.
Question: when -Qnew is in effect, should calls to __div__ magically
change into calls to __truediv__? See "major hackery" at tail end of
last paragraph <wink>.
It was easier than I thought, assuming that no other things contribute
to the instance size besides slots -- a pretty good bet. With a test
suite, no less!
happy if one could delete the __dict__ attribute of an instance. I
love to make Jim happy, so here goes...
- New-style objects now support deleting their __dict__. This is for
all intents and purposes equivalent to assigning a brand new empty
dictionary, but saves space if the object is not used further.
slot_tp_descr_set(): When deleting an attribute described by a
descriptor implemented in Python, the descriptor's __del__ method is
called by the slot_tp_descr_set dispatch function. This is bogus --
__del__ already has a different meaning. Renaming this use of __del__
is renamed to __delete__.
vgetargskeywords(): Now that this routine is checking for bad input
(rather than dump core in some cases), some bad calls are raising errors
that previously "worked". This patch makes the error strings more
revealing, and changes the exceptions from SystemError to RuntimeError
(under the theory that SystemError is more of a "can't happen!" assert-
like thing, and so inappropriate for bad arguments to a public C API
function).
special-cases classic classes, it doesn't do anything about other
cases where different metaclasses are involved (except for the trivial
case where one metaclass is a subclass of the others). Also note that
it's metaclass, not metatype.
This gives mmap() on Windows the ability to create read-only, write-
through and copy-on-write mmaps. A new keyword argument is introduced
because the mmap() signatures diverged between Windows and Unix, so
while they (now) both support this functionality, there wasn't a way to
spell it in a common way without introducing a new spelling gimmick.
The old spellings are still accepted, so there isn't a backward-
compatibility issue here.
and NEWS. Bugfix candidate? That's a dilemma for Anthony <wink>: /F
did fix a longstanding bug here, but the fix can cause code to raise an
exception that previously worked by accident.
XXX Remaining problems:
- The GC module doesn't know about these; I think it has its reasons
to disallow calling __del__, but for now, __del__ on new-style
objects is called when the GC module discards an object, for better
or for worse.
- The code to call a __del__ handler is really ridiculously
complicated, due to all the different debug #ifdefs. I've copied
this from the similar code in classobject.c, so I'm pretty sure I
did it right, but it's not pretty. :-(
- No tests yet.
outer level, the iterator protocol is used for memory-efficiency (the
outer sequence may be very large if fully materialized); at the inner
level, PySequence_Fast() is used for time-efficiency (these should
always be sequences of length 2).
dictobject.c, new functions PyDict_{Merge,Update}FromSeq2. These are
wholly analogous to PyDict_{Merge,Update}, but process a sequence-of-2-
sequences argument instead of a mapping object. For now, I left these
functions file static, so no corresponding doc changes. It's tempting
to change dict.update() to allow a sequence-of-2-seqs argument too.
Also changed the name of dictionary's keyword argument from "mapping"
to "x". Got a better name? "mapping_or_sequence_of_pairs" isn't
attractive, although more so than "mosop" <wink>.
abstract.h, abstract.tex: Added new PySequence_Fast_GET_SIZE function,
much faster than going thru the all-purpose PySequence_Size.
libfuncs.tex:
- Document dictionary().
- Fiddle tuple() and list() to admit that their argument is optional.
- The long-winded repetitions of "a sequence, a container that supports
iteration, or an iterator object" is getting to be a PITA. Many
months ago I suggested factoring this out into "iterable object",
where the definition of that could include being explicit about
generators too (as is, I'm not sure a reader outside of PythonLabs
could guess that "an iterator object" includes a generator call).
- Please check my curly braces -- I'm going blind <0.9 wink>.
abstract.c, PySequence_Tuple(): When PyObject_GetIter() fails, leave
its error msg alone now (the msg it produces has improved since
PySequence_Tuple was generalized to accept iterable objects, and
PySequence_Tuple was also stomping on the msg in cases it shouldn't
have even before PyObject_GetIter grew a better msg).
This adds unsetenv to posix, and uses it in the __delitem__ method of
os.environ.
(XXX Should we change the preferred name for putenv to setenv, for
consistency?)
This is a big one, touching lots of files. Some of the platforms
aren't tested yet. Briefly, this changes the return value of the
os/posix functions stat(), fstat(), statvfs(), fstatvfs(), and the
time functions localtime(), gmtime(), and strptime() from tuples into
pseudo-sequences. When accessed as a sequence, they behave exactly as
before. But they also have attributes like st_mtime or tm_year. The
stat return value, moreover, has a few platform-specific attributes
that are not available through the sequence interface (because
everybody expects the sequence to have a fixed length, these couldn't
be added there). If your platform's struct stat doesn't define
st_blksize, st_blocks or st_rdev, they won't be accessible from Python
either.
(Still missing is a documentation update.)
This changes Pythread_start_thread() to return the thread ID, or -1
for an error. (It's technically an incompatible API change, but I
doubt anyone calls it.)
call, or via setting an instance or class vrbl.
Rewrote the calibration docs.
Modern boxes are so friggin' fast, and a profiler event does so much work
anyway, that the cost of looking up an instance vrbl (the bias constant)
per profile event just isn't a big deal.
actual run of the profiler, instead of timing a simplified simulation of
part of what the profiler does. It computes a constant about 60% higher
on my Win98SE box than the old method, and the new constant appears much
more realistic. Deleted the undocumented simple(), instrumented(), and
profiler_simulation() methods (which existed only to support the previous
calibration method).
from Tim Hochberg. Also mucho fiddling to change the way doctest
determines whether a thing is a function, module or class. Under 2.2,
this really requires the functions in inspect.py (e.g., types.ClassType
is close to meaningless now, if not outright misleading).
Generalize PyLong_AsLongLong to accept int arguments too. The real point
is so that PyArg_ParseTuple's 'L' code does too. That code was
undocumented (AFAICT), so documented it.
- property() now takes 4 keyword arguments: fget, fset, fdel, doc.
Note that the real purpose of the 'f' prefix is to make fdel fit in
('del' is a keyword, so can't used as a keyword argument name).
- These map to visible readonly attributes 'fget', 'fset', 'fdel',
and '__doc__' in the property object.
- fget/fset/fdel weren't discoverable from Python before.
- __doc__ is new, and allows to associate a docstring with a property.
iterable object. I'm not sure how that got overlooked before!
Got rid of the internal _PySequence_IterContains, introduced a new
internal _PySequence_IterSearch, and rewrote all the iteration-based
"count of", "index of", and "is the object in it or not?" routines to
just call the new function. I suppose it's slower this way, but the
code duplication was getting depressing.
Curious: the MS docs say stati64 etc are supported even on Win95, but
Win95 doesn't support a filesystem that allows partitions > 2 Gb.
test_largefile: This was opening its test file in text mode. I have no
idea how that worked under Win64, but it sure needs binary mode on Win98.
BTW, on Win98 test_largefile runs quickly (under a second).
requires that errno ever get set, and it looks like glibc is already
playing that game. New rules:
+ Never use HUGE_VAL. Use the new Py_HUGE_VAL instead.
+ Never believe errno. If overflow is the only thing you're interested in,
use the new Py_OVERFLOWED(x) macro. If you're interested in any libm
errors, use the new Py_SET_ERANGE_IF_OVERFLOW(x) macro, which attempts
to set errno the way C89 said it worked.
Unfortunately, none of these are reliable, but they work on Windows and I
*expect* under glibc too.
getting Infs, NaNs, or nonsense in 2.1 and before; in yesterday's CVS we
were getting OverflowError; but these functions always make good sense
for positive arguments, no matter how large).
the fiddling is simply due to that no caller of PyLong_AsDouble ever
checked for failure (so that's fixing old bugs). PyLong_AsDouble is much
faster for big inputs now too, but that's more of a happy consequence
than a design goal.
bag. It's clearly wrong for classic classes, at heart because a classic
class doesn't have a __class__ attribute, and I'm unclear on whether
that's feature or bug. I'll repair this once I find out (in the
meantime, dir() applied to classic classes won't find the base classes,
while dir() applied to a classic-class instance *will* find the base
classes but not *their* base classes).
Please give the new dir() a try and see whether you love it or hate it.
The new dir([]) behavior is something I could come to love. Here's
something to hate:
>>> class C:
... pass
...
>>> c = C()
>>> dir(c)
['__doc__', '__module__']
>>>
The idea that an instance has a __doc__ attribute is jarring (of course
it's really c.__class__.__doc__ == C.__doc__; likewise for __module__).
OTOH, the code already has too many special cases, and dir(x) doesn't
have a compelling or clear purpose when x isn't a module.
Stephen Hansen reported via email that he didn't finish the port to
Borland C, so remove the old item saying it worked and add a new item
saying what I know; I've asked Stephen for more details.
- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disables unicode literals, and the builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
__dict__ attribute. Deleting it, or setting it to a non-dictionary
result in a TypeError. Note that getting it the first time magically
initializes it to an empty dict so that func.__dict__ will always
appear to be a dictionary (never None).
Closes SF bug #446645.
(Tim & I should agree on where to add new additions: I add them at the
top, Tim adds them at the bottom. I like the top better because folks
who occasionally check out the NEWS file will see the latest news
first.)
This completes the q/Q project.
longobject.c _PyLong_AsByteArray: The original code had a gross bug:
the most-significant Python digit doesn't necessarily have SHIFT
significant bits, and you really need to count how many copies of the sign
bit it has else spurious overflow errors result.
test_struct.py: This now does exhaustive std q/Q testing at, and on both
sides of, all relevant power-of-2 boundaries, both positive and negative.
NEWS: Added brief dict news while I was at it.
native mode, and only when config #defines HAVE_LONG_LONG. Standard mode
will eventually treat them as 8-byte ints across all platforms, but that
likely requires a new set of routines in longobject.c first (while
sizeof(long) >= 4 is guaranteed by C, there's nothing in C we can rely
on x-platform to hold 8 bytes of int, so we'll have to roll our own;
I'm thinking of a simple pair of conversion functions, Python long
to/from sized vector of unsigned bytes; that may be useful for GMP
conversions too; std q/Q would call them with size fixed at 8).
test_struct.py: In addition to adding some native-mode 'q' and 'Q' tests,
got rid of unused code, and repaired a non-portable assumption about
native sizeof(short) (it isn't 2 on some Cray boxes).
libstruct.tex: In addition to adding a bit of 'q'/'Q' docs (more needed
later), removed an erroneous footnote about 'I' behavior.
code, less memory. Tests have uncovered no drawbacks. Christian and
Vladimir are the other two people who have burned many brain cells on the
dict code in recent years, and they like the approach too, so I'm checking
it in without further ado.
instead of multiplication to generate the probe sequence. The idea is
recorded in Python-Dev for Dec 2000, but that version is prone to rare
infinite loops.
The value is in getting *all* the bits of the hash code to participate;
and, e.g., this speeds up querying every key in a dict with keys
[i << 16 for i in range(20000)] by a factor of 500. Should be equally
valuable in any bad case where the high-order hash bits were getting
ignored.
Also wrote up some of the motivations behind Python's ever-more-subtle
hash table strategy.
*are* obsolete; three variables and the maketrans() function are not
(yet) obsolete.
Add a compensating warnings.filterwarnings() call to test_strop.py.
Add this to the NEWS.
elements when crunching a list, dict or tuple. Now takes linear time
instead -- huge speedup for even moderately large containers, and the
code is notably simpler too.
Added some basic "is the output correct?" tests to test_pprint.
The comment following used to say:
/* We use ~hash instead of hash, as degenerate hash functions, such
as for ints <sigh>, can have lots of leading zeros. It's not
really a performance risk, but better safe than sorry.
12-Dec-00 tim: so ~hash produces lots of leading ones instead --
what's the gain? */
That is, there was never a good reason for doing it. And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collison) the same "too often" across distinct hashes.
Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== # number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.
Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items(). A number
of std tests suffered bogus failures as a result. For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,
>>> d = {}
>>> for i in range(10):
... d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>
Unfortunately. people may latch on to that in small examples and draw a
bogus conclusion.
test_support.py
Moved test_extcall's sortdict() into test_support, made it stronger,
and imported sortdict into other std tests that needed it.
test_unicode.py
Excluced cp875 from the "roundtrip over range(128)" test, because
cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
See Python-Dev for excruciating details.
Cookie.py
Chaged various output functions to sort dicts before building
strings from them.
test_extcall
Fiddled the expected-result file. This remains sensitive to native
dict ordering, because, e.g., if there are multiple errors in a
keyword-arg dict (and test_extcall sets up many cases like that), the
specific error Python complains about first depends on native dict
ordering.
Allow module getattr and setattr to exploit string interning, via the
previously null module object tp_getattro and tp_setattro slots. Yields
a very nice speedup for things like random.random and os.path etc.
Fixed a half dozen ways in which general dict comparison could crash
Python (even cause Win98SE to reboot) in the presence of kay and/or
value comparison routines that mutate the dict during dict comparison.
Bugfix candidate.
d1 == d2 and d1 != d2 now work even if the keys and values in d1 and d2
don't support comparisons other than ==, and testing dicts for equality
is faster now (especially when inequality obtains).
NEEDS DOC CHANGES.
More AttributeErrors transmuted into TypeErrors, in test_b2.py, and,
again, this strikes me as a good thing.
This checkin completes the iterator generalization work that obviously
needed to be done. Can anyone think of others that should be changed?
NEEDS DOC CHANGES
A few more AttributeErrors turned into TypeErrors, but in test_contains
this time.
The full story for instance objects is pretty much unexplainable, because
instance_contains() tries its own flavor of iteration-based containment
testing first, and PySequence_Contains doesn't get a chance at it unless
instance_contains() blows up. A consequence is that
some_complex_number in some_instance
dies with a TypeError unless some_instance.__class__ defines __iter__ but
does not define __getitem__.
to string.join(), so that when the latter figures out in midstream that
it really needs unicode.join() instead, unicode.join() can actually get
all the sequence elements (i.e., there's no guarantee that the sequence
passed to string.join() can be iterated over *again* by unicode.join(),
so string.join() must not pass on the original sequence object anymore).
because PySequence_Fast() started working for free as soon as
PySequence_Tuple() learned how to work with iterators. For some reason
unicode.join() still doesn't work, though.
NEEDS DOC CHANGES.
This one surprised me! While I expected tuple() to be a no-brainer, turns
out it's actually dripping with consequences:
1. It will *allow* the popular PySequence_Fast() to work with any iterable
object (code for that not yet checked in, but should be trivial).
2. It caused two std tests to fail. This because some places used
PyTuple_Sequence() (the C spelling of tuple()) as an indirect way to test
whether something *is* a sequence. But tuple() code only looked for the
existence of sq->item to determine that, and e.g. an instance passed
that test whether or not it supported the other operations tuple()
needed (e.g., __len__). So some things the tests *expected* to fail
with an AttributeError now fail with a TypeError instead. This looks
like an improvement to me; e.g., test_coercion used to produce 559
TypeErrors and 2 AttributeErrors, and now they're all TypeErrors. The
error details are more informative too, because the places calling this
were *looking* for TypeErrors in order to replace the generic tuple()
"not a sequence" msg with their own more specific text, and
AttributeErrors snuck by that.
NEEDS DOC CHANGES.
Possibly contentious: The first time s.next() yields StopIteration (for
a given map argument s) is the last time map() *tries* s.next(). That
is, if other sequence args are longer, s will never again contribute
anything but None values to the result, even if trying s.next() again
could yield another result. This is the same behavior map() used to have
wrt IndexError, so it's the only way to be wholly backward-compatible.
I'm not a fan of letting StopIteration mean "try again later" anyway.
to no longer insist that len(seq) be defined.
NEEDS DOC CHANGES.
This is meant to be a model for how other functions of this ilk (max,
filter, etc) can be generalized similarly. Feel encouraged to grab your
favorite and convert it!
Note some cute consequences:
list(file) == file.readlines() == list(file.xreadlines())
list(dict) == dict.keys()
list(dict.iteritems()) = dict.items()
list(xrange(i, j, k)) == range(i, j, k)
must now initialize the extra field used by the weak-ref machinery to
NULL themselves, to avoid having to require PyObject_INIT() to check
if the type supports weak references and do it there. This causes less
work to be done for all objects (the type object does not need to be
consulted to check for the Py_TPFLAGS_HAVE_WEAKREFS bit).
Add note about _symtable.
Add note that 'from ... import *' restriction may go away -- and move
the whole entry closer to the top, because it might bite people.
internal states. Put the old .seed() (which could only get at about
the square root of the # of possibilities) under the new name .whseed(),
for bit-level compatibility with older versions. This occurred to me
while reviewing effbot's book (he found himself stumbling over .seed()
more than once there ...).
- All constructors grow an optional argument `factory' which is a
callable used when new message instances are created by the next()
methods. Defaults to the rfc822.Message class.
- A new subclass of UnixMailbox is added, called PortableUnixMailbox.
It's identical to UnixMailbox, but uses a more portable test for
From_ delimiter lines. With PortableUnixMailbox, any line that
starts with "From " is considered a delimiter (this should really
check for two newlines before the F, but it doesn't.
got broken). Also added new method .jumpahead(N). This finally gives us
a semi-decent answer to how Python's RNGs can be used safely and efficiently
in multithreaded programs (although it requires the user to use the new
machinery!).
functionality of, whrandom.py. Also closes all the "XXX" todos in
random.py. New frequently-requested functions/methods getstate() and
setstate(). All exported functions are now bound methods of a hidden
instance. Killed all unintended exports. Updated the docs.
FRED: The more I fiddle the docs, the less I understand the exact
intended use of the \var, \code, \method tags. Please review critically.
GUIDO: See email. I updated NEWS as if whrandom were deprecated; I
think it should be.
ctime, gmtime and localtime optional, defaulting to 'the current time' in
all cases. Adjust docs, add news item. Also convert all argument-handling to
METH_VARARGS. Closes SF patch #103265.
- Changed description of rich comparisons to emphasize that < and >
(etc.) are each other's reflection. Also use this word in the note
about the demise of __rcmp__.
except that it always returns Unicode objects.
A new C API PyObject_Unicode() is also provided.
This closes patch #101664.
Written by Marc-Andre Lemburg. Copyright assigned to Guido van Rossum.
Christmas present to myself: the bisect module didn't define what
happened if the new element was already in the list. It so happens
that it inserted the new element "to the right" of all equal elements.
Since it wasn't defined, among other bad implications it was a mystery
how to use bisect to determine whether an element was already in the
list (I've seen code that *assumed* "to the right" without justification).
Added new methods bisect_left and insort_left that insert "to the left"
instead; made the old names bisect and insort aliases for the new names
bisect_right and insort_right; beefed up docstrings to explain what
these actually do; and added a std test for the bisect module.
In the limits.h comment, noted that INT_MAX and LONG_MAX are guaranteed
to be defined.
Noted that Reliant UNIX now gets proper API support for extension modules.
XXX notes for now.
I could use help here!!!! Please mail me patches ASAP. We may have
to put some of this off to 2.0final, but it's best to have it in shape
now...