- The copy module now "copies" function objects (as atomic objects).
- dict.__getitem__ now looks for a __missing__ hook before raising
KeyError.
- Added a new type, defaultdict, to the collections module.
This uses the new __missing__ hook behavior added to dict (see above).
On a box where sizeof(size_t) == 4, C doesn't define
what happens when a size_t value is shifted right by
32 bits, and this caused test_mmap to fail on Windows
in a debug build. So use different code to break
the size apart depending on how large size_t actually
is.
This looks like an illusion, since lots of code in this
module still appears to assume sizes can't be more
than 32 bits (e.g., the internal _GetMapSize() still
returns an int), but at least test_mmap passes again.
has been applied fairly arbitrarily in this module (nsmallest uses
Py_ssize_t, nlargest does not) and it probably deserves a more complete
review. Fixes heapq.nsmallest() always returning the empty list (on
platforms with 64-bit ssize_t/long)
Based on lsprof (patch #1212837) by Brett Rosen and Ted Czotter.
With further editing by Michael Hudson and myself.
History in svn repo: http://codespeak.net/svn/user/arigo/hack/misc/lsprof
* Module/_lsprof.c is the internal C module, Lib/cProfile.py a wrapper.
* pstats.py updated to display cProfile's caller/callee timings if available.
* setup.py and NEWS updated.
* documentation updates in the profiler section:
- explain the differences between the three profilers that we have now
- profile and cProfile can use a unified documentation, like (c)Pickle
- mention that hotshot is "for specialized usage" now
- removed references to the "old profiler" that no longer exists
* test updates:
- extended test_profile to cover delicate cases like recursion
- added tests for the caller/callee displays
- added test_cProfile, performing the same tests for cProfile
* TO-DO:
- cProfile gives a nicer name to built-in, particularly built-in methods,
which could be backported to profile.
- not tested on Windows recently!
is larger than FD_SETSIZE.
This can only be acheived with ulimit -n SOME_NUMBER_BIGGER_THAN_FD_SETSIZE
which is typically only available to root. Since this wouldn't normally
be run in a test (ie, run as root), it doesn't seem too worthwhile to
add a normal test. The bug report has one version of a test. I've
written another. Not sure what the best thing to do is.
Do the check before calling internal_select() because we can't set
an error in between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS.
This seemed the clearest solution, ie handle before calling internal_select()
rather than inside. Plus there is at least one place outside
of internal_select() that needed to be handled.
Will backport.
on both Unix (SVR4 and BSD) and Windows. Restores behaviour of passing -1
for anonymous memory on Unix. Use MAP_ANONYMOUS instead of _ANON since
the latter is deprecated according to Linux (gentoo) man pages.
Should we continue to allow mmap.mmap(0, length) to work on Windows?
0 is a valid fd.
Will backport bugfix portions.
loses information:
OverflowError: regular expression code size limit exceeded
Otherwise the compiled code is gibberish, possibly leading at
least to wrong results or (as reported on c.l.py) internal
sre errors at match time.
I'm not sure how to test this. SRE_CODE is a 2-byte type on
my box, and it's easy to create a regexp that causes the new
exception to trigger here. But it may be a 4-byte type on
other boxes, and creating a regexp large enough to trigger
problems there would be pretty crazy.
Bugfix candidate.
cast first PyUnicode_Decode argument to proper type (why is
"char *" used for encoded byte streams, btw? shouldn't that
be "void *" or, if necessary, "unsigned char *"?)
Subversion revision number.
First, in an svn export, there will be no .svn directory, so use an in-file
$Revision$ keyword string with the keyword chrome stripped off.
Also, use $(srcdir) in the Makefile.pre.in to handle the case where Python is
build outside the source tree.
Add C API function Py_GetBuildNumber(), add it to the interactive prompt
banner (i.e. Py_GetBuildInfo()), and add it as the sys.build_number
attribute. The build number is a string instead of an int because it may
contain a trailing 'M' if there are local modifications.
In C++, it's an error to pass a string literal to a char* function
without a const_cast(). Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.
I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc. Predictably, there were a large set of functions that
needed to be fixed as a result of these changes. The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKewords() to be a const char *kwlist[].
One cast was required as a result of the changes: A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *.
accepts strings only for unpickling reasons. This check prevents the honest
mistake of passing a string like '2:59.0' to time() and getting an insane
object.
This change implements a new bytecode compiler, based on a
transformation of the parse tree to an abstract syntax defined in
Parser/Python.asdl.
The compiler implementation is not complete, but it is in stable
enough shape to run the entire test suite excepting two disabled
tests.
Since I can't test this, I'm just adding a comment. If we get access
to AIX boxes, we can test this and really resolve. Anyone from IBM
want to offer help?
Backport candidate I suppose.
Fix segfault. I tried to write a test, but it wouldn't crash
when running regrtest. This really should have some sort of test.
Should definitely be backported.
about illegal code points. The codec now supports PEP 293 style error handlers.
(This is a variant of the Nik Haldimann's patch that detects truncated data)
VC++6 doesn't accept them.
This *will* result in tons of the following warning from gcc 3.x:
(gcc "2.96ish" doesn't issue this warning)
warning: integer constant is too large for "long" type
the code compiles fine regardless. squashing the gcc warnings
is the next task.
Would someone on windows please confirm that this does or does not
compile and if it does or does not pass the test_hashlib.py unit
tests.
A new hashlib module to replace the md5 and sha modules. It adds
support for additional secure hashes such as SHA-256 and SHA-512. The
hashlib module uses OpenSSL for fast platform optimized
implementations of algorithms when available. The old md5 and sha
modules still exist as wrappers around hashlib to preserve backwards
compatibility.
[ 1232517 ] OverflowError in time.utime() causes strange traceback
A needed error check was missing.
(Actually, this error check may only have become necessary in fairly
recent Python, not sure).
Backport candidate.
* the has_key() method was not raising a DBError when a database error
had occurred. [SF patch id 1212590]
* added a wrapper for the DBEnv.set_lg_regionmax method [SF patch id 1212590]
* DBKeyEmptyError now derives from KeyError just like DBNotFoundError.
* internally everywhere DB_NOTFOUND was checked for has been updated
to also check for DB_KEYEMPTY. This fixes the semantics of a couple
operations on recno and queue databases to be more intuitive and results
in less unexpected DBKeyEmptyError exceptions being raised.
because (essentially) I didn't realise that PY_BEGIN/END_ALLOW_THREADS
actually expanded to nothing under a no-threads build, so if you somehow
NULLed out the threadstate (e.g. by calling PyThread_SaveThread) it would
stay NULLed when you return to Python. Argh!
Backport candidate.
[ 1166660 ] The readline module can cause python to segfault
It seems to me that the code I'm rewriting here attempted to call any
user-supplied hook functions using the thread state of the thread that
called the hook-setting function, as opposed to that of the thread
that is currently executing. This doesn't work, in general.
Fix this by using the PyGILState API (It wouldn't be that hard to
define a dummy version of said API when #ifndef WITH_THREAD, would
it?).
Also, check the conversion to integer of the return value of a hook
function for errors (this problem was mentioned in the ipython bug
report linked to in the above bug).
PyNumber_Check, rather than trying to convert to a float. Reimplemented
writer - now raises exceptions when it sees a quotechar but neither
doublequote or escapechar are set. Doublequote results are now more
consistent (eg, single quote should generate """", rather than "",
which is ambiguous).
when this limit is reached. Limit defaults to 128k, and is changed
by module set_field_limit() method. Previously, an unmatched quote
character could result in the entire file being read into the field
buffer, potentially exhausting virtual memory.
only contains instances of the dialect type, we can refer directly to the
dialect instances rather than creating new ones. In other words, if the
dialect comes from the registry, and we apply no further modifications,
the reader/writer can use the dialect object directly.
was done because we were previously performing validation of the dialect
from python, but this is now down within the C module. Also, the method
we were using to detect classes did not work with new-style classes.
regrtest.py: skip rgbimg and imageop as they are not built on 64-bit systems.
_tkinter.c: replace %.8x with %p for printing pointers.
setup.py: add lib64 into the library directories.
memset() wrote one past the end of the buffer, which was likely to be unused padding or a yet-to-be-initialized local variable. This routine is already tested by test_socket.
nothing in gc currently cares, the original coding could screw up if,
e.g., you tried to move a node to the list it's already in, and the node
was already the last in its list.
Introduced gc_list_move(), which captures the common gc_list_remove() +
gc_list_append() sequence. In fact, no uses of gc_list_append() remained
(they were all in a gc_list_move() sequence), so commented that one out.
gc_list_merge(): assert that `from` != `to`; that was an implicit
precondition, now verified in a debug build.
Others: added comments about their purpose.
In cyclic gc, clear weakrefs to unreachable objects before allowing any
Python code (weakref callbacks or __del__ methods) to run.
This is a critical bugfix, affecting all versions of Python since weakrefs
were introduced. I'll backport to 2.3.
deque_item(): a performance bug: the linked list of blocks was followed
from the left in most cases, because the test (i < (deque->len >> 1)) was
after "i %= BLOCKLEN".
deque_clear(): replaced a call to deque_len() with deque->len; not sure what
this call was here for, nor if all compilers under the sun would inline it.
deque_traverse(): I belive that it could be called by the GC when the deque
has leftblock==rightblock==NULL, because it is tracked before the first block
is allocated (though closely before). Still, a C extension module subclassing
deque could provide its own tp_alloc that could trigger a GC collection after
the PyObject_GC_Track()...
deque_richcompare(): rewrote to cleanly check for end-of-iterations instead of
relying on deque.__iter__().next() to succeed exactly len(deque) times -- an
assumption which can break if deques are subclassed. Added a test.
I wonder if the length should be explicitely bounded to INT_MAX, with
OverflowErrors, as in listobject.c. On 64-bit machines, adding more than
INT_MAX in the deque will result in trouble. (Note to anyone/me fixing
this: carefully check for overflows if len is close to INT_MAX in the
following functions: deque_rotate(), deque_item(), deque_ass_item())
The previous approach was too easily fooled (a rotate() sufficed).
* Use it->counter to determine when iteration is complete. The
previous approach was too complex.
* Strengthen an assertion and add a comment here or there.
* Change the centering by one to make it possible to test the module
with BLOCKLEN's as low as two. Testing small blocks makes end-point
errors surface more readily.
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
Several functions adopted the strategy of altering a full lengthed
string copy and resizing afterwards. That would fail if the initial
string was short enough (0 or 1) to be interned. Interning precluded
the subsequent resizing operation.
The solution was to make sure the initial string was at least two
characters long.
Added tests to verify that all binascii functions do not crater when
given an empty string argument.
truncate() left the stream position unchanged, which meant the
"truncated" data didn't go away:
>>> io.write('abc')
>>> io.truncate(0)
>>> io.write('xyz')
>>> io.getvalue()
'abcxyz'
Patch by Dima Dorfman.
[ 1009560 ] Fix @decorator evaluation order
From the description:
Changes in this patch:
- Change Grammar/Grammar to require
newlines between adjacent decorators.
- Fix order of evaluation of decorators
in the C (compile.c) and python
(Lib/compiler/pycodegen.py) compilers
- Add better order of evaluation check
to test_decorators.py (test_eval_order)
- Update the decorator documentation in
the reference manual (improve description
of evaluation order and update syntax
description)
and the comment:
Used Brett's evaluation order (see
http://mail.python.org/pipermail/python-dev/2004-August/047835.html)
(I'm checking this in for Anthony who was having problems getting SF to
talk to him)
That's the title of the report, but the hole was probably plugged since
Python 2.0. See corresponding checkin to PC/getpathp.c: a crucial
precondition for joinpath() was neither documented nor verified, and there
are so many callers with so many conditional paths that no "eyeball
analysis" is satisfactory. Now Python dies with a fatal error if the
precondition isn't satisfied, instead of allowing a buffer overrun.
NOT TESTED! The Windows version of the patch was, but not this one. I
don't feel like waiting for someone to notice the patch I attached to the
bug report. If it doesn't compile, sorry, but fix it <wink>. If it
does compile, it's "obviously correct".
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
def width(u):
w = 0
for c in unicodedata.normalize('NFC', u):
cwidth = unicodedata.east_asian_width(c)
if cwidth in ('W', 'F'): w += 2
else: w += 1
return w
discussed recently in python-dev:
In _locale module:
- bind_textdomain_codeset() binding
In gettext module:
- bind_textdomain_codeset() function
- lgettext(), lngettext(), ldgettext(), ldngettext(),
which return translated strings encoded in
preferred system encoding, if
bind_textdomain_codeset() was not used.
- Added equivalent functionality in translate()
function and catalog classes.
Every change was also documented.
and installed layouts to make maintenance simple and easy. And it
also adds four new codecs; big5hkscs, euc-jis-2004, shift-jis-2004
and iso2022-jp-2004.
incorrect declaration for ypall_callback in /usr/include/rpcsvc/ypcInt.h .
Shouldn't hurt any code since the differences are unsigned long instead of int and
void * instead of char *. Removes warning about improper function pointer
assignment during compilation.
[ 960406 ] unblock signals in threads
although the changes do not correspond exactly to any patch attached to
that report.
Non-main threads no longer have all signals masked.
A different interface to readline is used.
The handling of signals inside calls to PyOS_Readline is now rather
different.
These changes are all a bit scary! Review and cross-platform testing
much appreciated.
- weakref.ref and weakref.ReferenceType will become aliases for each
other
- weakref.ref will be a modern, new-style class with proper __new__
and __init__ methods
- weakref.WeakValueDictionary will have a lighter memory footprint,
using a new weakref.ref subclass to associate the key with the
value, allowing us to have only a single object of overhead for each
dictionary entry (currently, there are 3 objects of overhead per
entry: a weakref to the value, a weakref to the dictionary, and a
function object used as a weakref callback; the weakref to the
dictionary could be avoided without this change)
- a new macro, PyWeakref_CheckRefExact(), will be added
- PyWeakref_CheckRef() will check for subclasses of weakref.ref
This closes SF patch #983019.
Fix memory leaks revealed by valgrind and ensuing code inspection.
In the existing test suite valgrind revealed two memory leaks (DB_get
and DBC_set_range). Code inspection revealed that there were many other
potential similar leaks (many on odd code error paths such as passing
something other than a DBTxn object for a txn= parameter or in the face
of an out of memory error). The most common case that would cause a
leak was when using recno or queue format databases with integer keys,
sometimes only with an exception exit.
The LaTeX is untested (well, so is the new API, for that matter).
Note that I also changed NULL to get spelled consistently in concrete.tex.
If that was a wrong thing to do, Fred should yell at me.
New include file timefuncs.h exports private API function
_PyTime_DoubleToTimet() from timemodule.c. timemodule should export
some other functions too (look for painful bits in datetimemodule.c).
Added insane-argument checking to datetime's assorted fromtimestamp()
and utcfromtimestamp() methods. Added insane-argument tests of these
to test_datetime, and insane-argument tests for ctime(), localtime()
and gmtime() to test_time.
more than a second of precision. Primarily affects ctime, localtime, and
gmtime.
Closes bug #919012 thanks to Tim Peters' code.
Tim suggests that the new funciton being introduced, _PyTime_DoubletoTimet(),
should be added to the internal C API and then used in datetime where
appropriate. Not being done now for lack of time.
Beardsley.
If the seconds are different, we still need to calculate the differences
between milliseconds.
Also, on a Gentoo Linux (2.6.5) dual Athlon MP box with glibc 2.3,
time can go backwards. This probably happens when the process switches
the CPU it's running on. Time can also go backwards when running NTP.
If we detect a negative time delta (ie, time went backwards), return
a delta of 0. This prevents an illegal array access elsewhere.
I think it's safest to *not* update prev_timeofday in this case, so we
return without updating.
Backport candidate.
This fixes the problem and the test passes. I'm not sure
the test is really correct though. It seems like it would
be better to raise an exception. I think that wasn't done
for backwards compatability.
Bugfix candidate.
#!-scripts, only the filename part, and this can lead to incorrect
initialization of sys.path and sys.executable if there is another python
on $PATH before the one used in #!.
The fix was picked up from the darwinports crowd, thanks!
Added setbdaddr and makebdaddr.
Extended makesockaddr to understand Bluetooth addresses.
Changed getsockaddr to expect the Bluetooth addresses as a string,
not a six element tuple.
Reformatted some of the Bluetooth code to be more consistent with PEP 7.
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
[ 728330 ] Don't define _SGAPI on IRIX
The Right Thing would be nice, for now this'll do. At least it isn't
going to break anything *other* than IRIX...
(Code contributed by Jiwon Seo.)
The documentation portion of the patch is being re-worked and will be
checked-in soon. Likewise, PEP 289 will be updated to reflect Guido's
rationale for the design decisions on binding behavior (as described in
in his patch comments and in discussions on python-dev).
The test file, test_genexps.py, is written in doctest format and is
meant to exercise all aspects of the the patch. Further additions are
welcome from everyone. Please stress test this new feature as much as
possible before the alpha release.
same method that implements __setitem__ also implements __delitem__.
Also, there were several good use cases (removing items from a queue
and implementing Forth style stack ops).
binascii_a2b_qp() and binascii_b2a_qp() with calls to PyMem_Malloc() and
PyMem_Free(). These won't return NULL unless the allocations actually fail,
so it won't trigger a bogus memory error on some platforms <cough>AIX</cough>
when passed a length of zero.
- return the full size of the sockaddr_un structure, without which
bind() fails with EINVAL;
- set test_socketserver to use a socket name that meets the form
required by the underlying implementation;
- don't bother exercising the forking AF_UNIX tests on EMX - its
fork() can't handle the stress.
with major C compilers (VACPP, EMX+gcc and [Open]Watcom).
Also tidy up the export of spawn*() symbols in the os module to match what
is found/implemented.
It's possible to create insane datetime objects by using the constructor
"backdoor" inserted for fast unpickling. Doing extensive range checking
would eliminate the backdoor's purpose (speed), but at least a little
checking can stop honest mistakes.
Bugfix candidate.
* The default __reversed__ performed badly, so reintroduced a custom
reverse iterator.
* Added length transparency to improve speed with map(), list(), etc.
array.extend() now accepts iterable arguments implements as a series
of appends. Besides being a user convenience and matching the behavior
for lists, this the saves memory and cycles that would be used to
create a temporary array object.
lists. Speeds append() operations and reduces memory requirements
(because of more conservative overallocation).
Paves the way for the feature request for array.extend() to support
arbitrary iterable arguments.
The writelines() method now accepts any iterable argument and writes
the lines one at a time rather than using ''.join(lines) followed by
a single write. Results in considerable memory savings and makes
the method suitable for use with generator expressions.
are within proper boundaries as specified in the docs.
This can break possible code (datetime module needed changing, for instance)
that uses 0 for values that need to be greater 1 or greater (month, day, and
day of year).
Fixes bug #897625.
__getitem__() and __setitem__().
Simplifies the API, reduces the code size, adds flexibility, and makes
deques work with bisect.bisect(), random.shuffle(), and random.sample().
* Add doctests for the examples in the library reference.
* Add two methods, left() and right(), modeled after deques in C++ STL.
* Apply the new method to asynchat.py.
* Add comparison operators to make deques more substitutable for lists.
* Replace the LookupErrors with IndexErrors to more closely match lists.
* Fixed a bug in the compatibility interface set_location() method
where it would not properly search to the next nearest key when
used on BTree databases. [SF bug id 788421]
* Fixed a bug in the compatibility interface set_location() method
where it could crash when looking up keys in a hash or recno
format database due to an incorrect free().
Allow the user to create Tkinter.Tcl objects which are
just like Tkinter.Tk objects except that they do not
initialize Tk. This is useful in circumstances where the
script is being run on machines that do not have an X
server running -- in those cases, Tk initialization fails,
even if no window is ever created.
Includes documentation change and tests.
Tested on Linux, Solaris and Windows.
Reviewed by Martin von Loewis.
as "This is the same format as used by gl.lrectwrite() and the imgfile
module." This implies a certain byte order in multi-byte pixel
formats. However, the code was originally written on an SGI
(big-endian) and *uses* the fact that bytes are stored in a particular
order in ints. This means that the code uses and produces different
byte order on little-endian systems.
This fix adds a module-level flag "backward_compatible" (default not
set, and if not set, behaves as if set to 1--i.e. backward compatible)
that can be used on a little-endian system to use the same byte order
as the SGI. Using this flag it is then possible to prepare
SGI-compatible images on a little-endian system.
This patch is the result of a (small) discussion on python-dev and was
submitted to SourceForge as patch #874358.
Original idea by Guido van Rossum.
Idea for skipable inner iterators by Raymond Hettinger.
Idea for argument order and identity function default by Alex Martelli.
Implementation by Hye-Shik Chang (with tweaks by Raymond Hettinger).
Also SF patch 843455.
This is a critical bugfix.
I'll backport to 2.3 maint, but not beyond that. The bugs this fixes
have been there since weakrefs were introduced.