This was a fair amount of rework of the patch. Refactored test_fork1 so it
could be reused by the new tests for wait3/4. Also made them into new style
unittests (derive from unittest.TestCase).
This patch adds a-LAW encoding to audioop and replaces the old
u-LAW encoding/decoding code with the current code from sox.
Possible issues: the code from sox uses int16_t.
Code by Lars Immisch
This will hopefully get rid of some Coverity warnings, be a hint to
developers, and be marginally faster.
Some asserts were added when the type is currently known, but depends
on values from another function.
"""
The attached patch fixes all the ctypes tests so they pass on amd64.
It also fixes several warnings. I'm not sure what else to do with the
patch. Let me know how you want to handle these in the future.
I'm not sure the patch is 100% correct. You will need to decide what
can be 64 bits and what can't. I believe
sq_{item,slice,ass_item,ass_slice} all need to use Py_ssize_t. The
types in ctypes.h may not require all the changes I made. I don't
know how you want to support older version, so I unconditionally
changed the types to Py_ssize_t.
"""
The patch is also in the ctypes SVN repository now, after small
changes to add compatibility with older Python versions.
PyObject_Unicode(). This problem was originally reported from Coverity
and addresses mail on python-dev "checkin r43015".
This inlines the conversion of the string to unicode and cleans
up/simplifies some code at the end of the PyObject_Unicode().
We really need a complete C API test module for all public APIs
and passing good and bad parameter values.
Will backport.
Anyway, this is the changes to the with-statement
so that __exit__ must return a true value in order
for a pending exception to be ignored.
The PEP (343) is already updated.
missing PyObject_Del()'s, simplify some code by using Py_BuildValue()
instead of creating a tuple with items manually, stop clobbering builtin
exceptions in a few places, and guard against NULL-returning functions some
more.
This fixes 117 of the 780 (!?!#%@#$!!) reference leaks in test_bsddb3. I
ain't not done yet, although this review of 5kloc was just the easy part.
an error code, this let `self` leak. This is a disaster
on Windows, since `self` already points to a newly-opened
file object, and it was impossible for Python code to
close the thing since the only reference to it was in a
blob of leaked C memory.
test_hotshot test_bad_sys_path(): This new test provoked
the C bug above. This test passed, but left an open
"@test" file behind, which caused a massive cascade of
bogus test failures in later, unrelated tests on Windows.
Changed the test code to remove the @test file it leaves
behind, which relies on the change above to close that
file first.
to an unsigned int (and back again) on 64-bit machines, even though the
actual value of the Py_ssize_t variable is way below 31 bits. I suspect
compiler-error.
the sentinel value in the main function, rather than the helper. This
function could possibly do with an early-out if any of the helper calls ends
up with a len of 0, but I doubt it really matters (how common are malformed
hangul syllables, really?)
posix__getfullpathname().
In partial answer to the now-deleted XXX comment:
/* XXX(twouters) Why use 'et#' here at all? insize isn't used */
`insize` is an input parameter too, and it was left uninitialized,
leading to seemingly random failures.
non-32bit platforms. Will still only allow 32 bits in a timestamp on Win64,
but at least it won't crash, and it'll work right on platforms where longs
are big enough to contain time_t's.
(A better-working, although conceptually less-right fix would have been to
use Py_ssize_t here, but Martin and Tim won't let me.)
- New semantics for __exit__() -- it must re-raise the exception
if type is not None; the with-statement itself doesn't do this.
(See the updated PEP for motivation.)
- Added context managers to:
- file
- thread.LockType
- threading.{Lock,RLock,Condition,Semaphore,BoundedSemaphore}
- decimal.Context
- Added contextlib.py, which defines @contextmanager, nested(), closing().
- Unit tests all around; bot no docs yet.
- IMPORT_NAME takes an extra argument from the stack: the relativeness of
the import. Only passed to __import__ when it's not -1.
- __import__() takes an optional 5th argument for the same thing; it
__defaults to -1 (old semantics: try relative, then absolute)
- 'from . import name' imports name (be it module or regular attribute)
from the current module's *package*. Likewise, 'from .module import name'
will import name from a sibling to the current module.
- Importing from outside a package is not allowed; 'from . import sys' in a
toplevel module will not work, nor will 'from .. import sys' in a
(single-level) package.
- 'from __future__ import absolute_import' will turn on the new semantics
for import and from-import: imports will be absolute, except for
from-import with dots.
Includes tests for regular imports and importhooks, parser changes and a
NEWS item, but no compiler-package changes or documentation changes.
- The copy module now "copies" function objects (as atomic objects).
- dict.__getitem__ now looks for a __missing__ hook before raising
KeyError.
- Added a new type, defaultdict, to the collections module.
This uses the new __missing__ hook behavior added to dict (see above).
On a box where sizeof(size_t) == 4, C doesn't define
what happens when a size_t value is shifted right by
32 bits, and this caused test_mmap to fail on Windows
in a debug build. So use different code to break
the size apart depending on how large size_t actually
is.
This looks like an illusion, since lots of code in this
module still appears to assume sizes can't be more
than 32 bits (e.g., the internal _GetMapSize() still
returns an int), but at least test_mmap passes again.
has been applied fairly arbitrarily in this module (nsmallest uses
Py_ssize_t, nlargest does not) and it probably deserves a more complete
review. Fixes heapq.nsmallest() always returning the empty list (on
platforms with 64-bit ssize_t/long)
Based on lsprof (patch #1212837) by Brett Rosen and Ted Czotter.
With further editing by Michael Hudson and myself.
History in svn repo: http://codespeak.net/svn/user/arigo/hack/misc/lsprof
* Module/_lsprof.c is the internal C module, Lib/cProfile.py a wrapper.
* pstats.py updated to display cProfile's caller/callee timings if available.
* setup.py and NEWS updated.
* documentation updates in the profiler section:
- explain the differences between the three profilers that we have now
- profile and cProfile can use a unified documentation, like (c)Pickle
- mention that hotshot is "for specialized usage" now
- removed references to the "old profiler" that no longer exists
* test updates:
- extended test_profile to cover delicate cases like recursion
- added tests for the caller/callee displays
- added test_cProfile, performing the same tests for cProfile
* TO-DO:
- cProfile gives a nicer name to built-in, particularly built-in methods,
which could be backported to profile.
- not tested on Windows recently!
is larger than FD_SETSIZE.
This can only be acheived with ulimit -n SOME_NUMBER_BIGGER_THAN_FD_SETSIZE
which is typically only available to root. Since this wouldn't normally
be run in a test (ie, run as root), it doesn't seem too worthwhile to
add a normal test. The bug report has one version of a test. I've
written another. Not sure what the best thing to do is.
Do the check before calling internal_select() because we can't set
an error in between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS.
This seemed the clearest solution, ie handle before calling internal_select()
rather than inside. Plus there is at least one place outside
of internal_select() that needed to be handled.
Will backport.
on both Unix (SVR4 and BSD) and Windows. Restores behaviour of passing -1
for anonymous memory on Unix. Use MAP_ANONYMOUS instead of _ANON since
the latter is deprecated according to Linux (gentoo) man pages.
Should we continue to allow mmap.mmap(0, length) to work on Windows?
0 is a valid fd.
Will backport bugfix portions.
loses information:
OverflowError: regular expression code size limit exceeded
Otherwise the compiled code is gibberish, possibly leading at
least to wrong results or (as reported on c.l.py) internal
sre errors at match time.
I'm not sure how to test this. SRE_CODE is a 2-byte type on
my box, and it's easy to create a regexp that causes the new
exception to trigger here. But it may be a 4-byte type on
other boxes, and creating a regexp large enough to trigger
problems there would be pretty crazy.
Bugfix candidate.
cast first PyUnicode_Decode argument to proper type (why is
"char *" used for encoded byte streams, btw? shouldn't that
be "void *" or, if necessary, "unsigned char *"?)
Subversion revision number.
First, in an svn export, there will be no .svn directory, so use an in-file
$Revision$ keyword string with the keyword chrome stripped off.
Also, use $(srcdir) in the Makefile.pre.in to handle the case where Python is
build outside the source tree.
Add C API function Py_GetBuildNumber(), add it to the interactive prompt
banner (i.e. Py_GetBuildInfo()), and add it as the sys.build_number
attribute. The build number is a string instead of an int because it may
contain a trailing 'M' if there are local modifications.
In C++, it's an error to pass a string literal to a char* function
without a const_cast(). Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.
I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc. Predictably, there were a large set of functions that
needed to be fixed as a result of these changes. The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKewords() to be a const char *kwlist[].
One cast was required as a result of the changes: A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *.
accepts strings only for unpickling reasons. This check prevents the honest
mistake of passing a string like '2:59.0' to time() and getting an insane
object.
This change implements a new bytecode compiler, based on a
transformation of the parse tree to an abstract syntax defined in
Parser/Python.asdl.
The compiler implementation is not complete, but it is in stable
enough shape to run the entire test suite excepting two disabled
tests.
Since I can't test this, I'm just adding a comment. If we get access
to AIX boxes, we can test this and really resolve. Anyone from IBM
want to offer help?
Backport candidate I suppose.
Fix segfault. I tried to write a test, but it wouldn't crash
when running regrtest. This really should have some sort of test.
Should definitely be backported.
about illegal code points. The codec now supports PEP 293 style error handlers.
(This is a variant of the Nik Haldimann's patch that detects truncated data)
VC++6 doesn't accept them.
This *will* result in tons of the following warning from gcc 3.x:
(gcc "2.96ish" doesn't issue this warning)
warning: integer constant is too large for "long" type
the code compiles fine regardless. squashing the gcc warnings
is the next task.
Would someone on windows please confirm that this does or does not
compile and if it does or does not pass the test_hashlib.py unit
tests.
A new hashlib module to replace the md5 and sha modules. It adds
support for additional secure hashes such as SHA-256 and SHA-512. The
hashlib module uses OpenSSL for fast platform optimized
implementations of algorithms when available. The old md5 and sha
modules still exist as wrappers around hashlib to preserve backwards
compatibility.
[ 1232517 ] OverflowError in time.utime() causes strange traceback
A needed error check was missing.
(Actually, this error check may only have become necessary in fairly
recent Python, not sure).
Backport candidate.
* the has_key() method was not raising a DBError when a database error
had occurred. [SF patch id 1212590]
* added a wrapper for the DBEnv.set_lg_regionmax method [SF patch id 1212590]
* DBKeyEmptyError now derives from KeyError just like DBNotFoundError.
* internally everywhere DB_NOTFOUND was checked for has been updated
to also check for DB_KEYEMPTY. This fixes the semantics of a couple
operations on recno and queue databases to be more intuitive and results
in less unexpected DBKeyEmptyError exceptions being raised.
because (essentially) I didn't realise that PY_BEGIN/END_ALLOW_THREADS
actually expanded to nothing under a no-threads build, so if you somehow
NULLed out the threadstate (e.g. by calling PyThread_SaveThread) it would
stay NULLed when you return to Python. Argh!
Backport candidate.
[ 1166660 ] The readline module can cause python to segfault
It seems to me that the code I'm rewriting here attempted to call any
user-supplied hook functions using the thread state of the thread that
called the hook-setting function, as opposed to that of the thread
that is currently executing. This doesn't work, in general.
Fix this by using the PyGILState API (It wouldn't be that hard to
define a dummy version of said API when #ifndef WITH_THREAD, would
it?).
Also, check the conversion to integer of the return value of a hook
function for errors (this problem was mentioned in the ipython bug
report linked to in the above bug).
PyNumber_Check, rather than trying to convert to a float. Reimplemented
writer - now raises exceptions when it sees a quotechar but neither
doublequote or escapechar are set. Doublequote results are now more
consistent (eg, single quote should generate """", rather than "",
which is ambiguous).
when this limit is reached. Limit defaults to 128k, and is changed
by module set_field_limit() method. Previously, an unmatched quote
character could result in the entire file being read into the field
buffer, potentially exhausting virtual memory.
only contains instances of the dialect type, we can refer directly to the
dialect instances rather than creating new ones. In other words, if the
dialect comes from the registry, and we apply no further modifications,
the reader/writer can use the dialect object directly.
was done because we were previously performing validation of the dialect
from python, but this is now down within the C module. Also, the method
we were using to detect classes did not work with new-style classes.
regrtest.py: skip rgbimg and imageop as they are not built on 64-bit systems.
_tkinter.c: replace %.8x with %p for printing pointers.
setup.py: add lib64 into the library directories.
memset() wrote one past the end of the buffer, which was likely to be unused padding or a yet-to-be-initialized local variable. This routine is already tested by test_socket.
nothing in gc currently cares, the original coding could screw up if,
e.g., you tried to move a node to the list it's already in, and the node
was already the last in its list.
Introduced gc_list_move(), which captures the common gc_list_remove() +
gc_list_append() sequence. In fact, no uses of gc_list_append() remained
(they were all in a gc_list_move() sequence), so commented that one out.
gc_list_merge(): assert that `from` != `to`; that was an implicit
precondition, now verified in a debug build.
Others: added comments about their purpose.
In cyclic gc, clear weakrefs to unreachable objects before allowing any
Python code (weakref callbacks or __del__ methods) to run.
This is a critical bugfix, affecting all versions of Python since weakrefs
were introduced. I'll backport to 2.3.
deque_item(): a performance bug: the linked list of blocks was followed
from the left in most cases, because the test (i < (deque->len >> 1)) was
after "i %= BLOCKLEN".
deque_clear(): replaced a call to deque_len() with deque->len; not sure what
this call was here for, nor if all compilers under the sun would inline it.
deque_traverse(): I belive that it could be called by the GC when the deque
has leftblock==rightblock==NULL, because it is tracked before the first block
is allocated (though closely before). Still, a C extension module subclassing
deque could provide its own tp_alloc that could trigger a GC collection after
the PyObject_GC_Track()...
deque_richcompare(): rewrote to cleanly check for end-of-iterations instead of
relying on deque.__iter__().next() to succeed exactly len(deque) times -- an
assumption which can break if deques are subclassed. Added a test.
I wonder if the length should be explicitely bounded to INT_MAX, with
OverflowErrors, as in listobject.c. On 64-bit machines, adding more than
INT_MAX in the deque will result in trouble. (Note to anyone/me fixing
this: carefully check for overflows if len is close to INT_MAX in the
following functions: deque_rotate(), deque_item(), deque_ass_item())
The previous approach was too easily fooled (a rotate() sufficed).
* Use it->counter to determine when iteration is complete. The
previous approach was too complex.
* Strengthen an assertion and add a comment here or there.
* Change the centering by one to make it possible to test the module
with BLOCKLEN's as low as two. Testing small blocks makes end-point
errors surface more readily.
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
Several functions adopted the strategy of altering a full lengthed
string copy and resizing afterwards. That would fail if the initial
string was short enough (0 or 1) to be interned. Interning precluded
the subsequent resizing operation.
The solution was to make sure the initial string was at least two
characters long.
Added tests to verify that all binascii functions do not crater when
given an empty string argument.
truncate() left the stream position unchanged, which meant the
"truncated" data didn't go away:
>>> io.write('abc')
>>> io.truncate(0)
>>> io.write('xyz')
>>> io.getvalue()
'abcxyz'
Patch by Dima Dorfman.
[ 1009560 ] Fix @decorator evaluation order
From the description:
Changes in this patch:
- Change Grammar/Grammar to require
newlines between adjacent decorators.
- Fix order of evaluation of decorators
in the C (compile.c) and python
(Lib/compiler/pycodegen.py) compilers
- Add better order of evaluation check
to test_decorators.py (test_eval_order)
- Update the decorator documentation in
the reference manual (improve description
of evaluation order and update syntax
description)
and the comment:
Used Brett's evaluation order (see
http://mail.python.org/pipermail/python-dev/2004-August/047835.html)
(I'm checking this in for Anthony who was having problems getting SF to
talk to him)
That's the title of the report, but the hole was probably plugged since
Python 2.0. See corresponding checkin to PC/getpathp.c: a crucial
precondition for joinpath() was neither documented nor verified, and there
are so many callers with so many conditional paths that no "eyeball
analysis" is satisfactory. Now Python dies with a fatal error if the
precondition isn't satisfied, instead of allowing a buffer overrun.
NOT TESTED! The Windows version of the patch was, but not this one. I
don't feel like waiting for someone to notice the patch I attached to the
bug report. If it doesn't compile, sorry, but fix it <wink>. If it
does compile, it's "obviously correct".
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
def width(u):
w = 0
for c in unicodedata.normalize('NFC', u):
cwidth = unicodedata.east_asian_width(c)
if cwidth in ('W', 'F'): w += 2
else: w += 1
return w
discussed recently in python-dev:
In _locale module:
- bind_textdomain_codeset() binding
In gettext module:
- bind_textdomain_codeset() function
- lgettext(), lngettext(), ldgettext(), ldngettext(),
which return translated strings encoded in
preferred system encoding, if
bind_textdomain_codeset() was not used.
- Added equivalent functionality in translate()
function and catalog classes.
Every change was also documented.
and installed layouts to make maintenance simple and easy. And it
also adds four new codecs; big5hkscs, euc-jis-2004, shift-jis-2004
and iso2022-jp-2004.
incorrect declaration for ypall_callback in /usr/include/rpcsvc/ypcInt.h .
Shouldn't hurt any code since the differences are unsigned long instead of int and
void * instead of char *. Removes warning about improper function pointer
assignment during compilation.
[ 960406 ] unblock signals in threads
although the changes do not correspond exactly to any patch attached to
that report.
Non-main threads no longer have all signals masked.
A different interface to readline is used.
The handling of signals inside calls to PyOS_Readline is now rather
different.
These changes are all a bit scary! Review and cross-platform testing
much appreciated.