This problem is related to wrong behavior in mark_save/restore(),
which didn't restore the mark_stack_base before restoring the marks.
Greg's suggestion was to change the asserts, which happen to be
the only recursive ops that can continue the loop, but the problem would
occur with any operation showing the same behavior. So, rather than
hardcoding this into the asserts, I have changed mark_save/restore() to
always restore the stack base before restoring the marks.
Both solutions should fix these two cases, presented by Greg:
>>> re.match('(a)(?:(?=(b)*)c)*', 'abb').groups()
('b', None)
>>> re.match('(a)((?!(b)*))*', 'abb').groups()
('b', None, None)
The rest of the bug and patch in #725149 must be discussed further.
within repeats of alternatives. The only change to the original
patch was to convert the tests to the new test_re.py file.
This patch fixes cases like:
>>> re.match('((a)|b)*', 'abc').groups()
('b', '')
This is wrong (it's impossible for the group to have matched the
empty string), and incompatible with other regex systems, as the
following examples show:
% perl -e '"abc" =~ /^((a)|b)*/; print "$1 $2\n";'
b a
% echo "abc" | sed -r -e "s/^((a)|b)*/\1 \2|/"
b a|c
- The socket module now provides the functions inet_pton and inet_ntop
for converting between string and packed representation of IP addresses.
See SF patch #658327.
This still needs a bit of work in the doc area, because it is not
available on all platforms (especially not on Windows).
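A quick interactive sketch of the new functions (illustrative; requires
a platform that provides them):
>>> import socket
>>> packed = socket.inet_pton(socket.AF_INET, '1.2.3.4')
>>> socket.inet_ntop(socket.AF_INET, packed)
'1.2.3.4'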
(contributed by logistix; substantially reworked by rhettinger).
To create a representation of non-string arrays, array_repr() was
starting with a base Python string object and repeatedly using +=
to concatenate the representation of individual objects.
Logistix had the idea to convert to an intermediate tuple form and
then join it all at once. I took advantage of existing tools and
formed a list with array_tolist() and got its representation through
PyObject_Repr(v) which already has a fast implementation for lists.
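The same idea expressed in Python, as a sketch of the approach (not the
actual C code):
import array
a = array.array('i', range(5))
# old approach, in spirit: quadratic repeated concatenation
s = ''
for x in a:
    s += repr(x)
# new approach: convert once, then use the list's fast repr
s = repr(a.tolist())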
The docs here are best-guess: the MS docs I could find weren't clear, and
some even claimed _commit() has no effect on Win32 systems (which is
easily shown to be false just by trying it).
* UINT_MAX -> ULONG_MAX since we are dealing with longs
* PyArg_ParseTuple needs an int* for 'i' and a long* for 'l'
There may be a better way to do this, but this works.
string does what is expected (i.e. unset [BEGIN|END]LIBPATH)
- set the size of the DosQuerySysInfo buffer correctly; it was safe,
  but incorrect (allowing a 1-element overrun)
I've applied a modified version of Greg Chapman's patch. I've included
the fixes without introducing the reorganization mentioned, for the sake
of stability. Also, the second fix mentioned in the patch doesn't fix the
mentioned problem anymore, because of the change introduced by patch
#720991 (by Greg as well). The new fix wasn't complicated, though, and is
included as well.
As a note: it seems that there are other places that require the
"protection" of LASTMARK_SAVE()/LASTMARK_RESTORE(), and are just waiting
for someone to find out how to break them. In particular, I believe that
every recursion of SRE_MATCH() should be protected by these macros. I won't
do that right now since I'm not completely sure about this, and we don't
have much time for testing before the next release.
to be compliant with previous Python versions, by backing out the changes
made in revision 2.84 which affected this. The bugfix for backtracking is
still maintained.
New functions:
unsigned long PyInt_AsUnsignedLongMask(PyObject *);
unsigned PY_LONG_LONG PyInt_AsUnsignedLongLongMask(PyObject *);
unsigned long PyLong_AsUnsignedLongMask(PyObject *);
unsigned PY_LONG_LONG PyLong_AsUnsignedLongLongMask(PyObject *);
New and changed format codes:
b    unsigned char       0..UCHAR_MAX
B    unsigned char       none **
h    unsigned short      0..USHRT_MAX
H    unsigned short      none **
i    int                 INT_MIN..INT_MAX
I *  unsigned int        0..UINT_MAX
l    long                LONG_MIN..LONG_MAX
k *  unsigned long       none
L    long long           LLONG_MIN..LLONG_MAX
K *  unsigned long long  none
Notes:
* New format codes.
** Changed from previous "range-and-a-half" to "none"; the
range-and-a-half checking wasn't particularly useful.
New test test_getargs2.py, to verify all this.
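The masking semantics, expressed in Python for illustration (assuming a
32-bit unsigned long): out-of-range and negative values are truncated
instead of raising OverflowError.
def as_unsigned_long_mask(value):
    # no range check; e.g. -1 -> 4294967295
    return value & 0xFFFFFFFFL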
A small fix for bug #545855 and Greg Chapman's
addition of op code SRE_OP_MIN_REPEAT_ONE for
eliminating recursion on simple uses of pattern '*?' on a
long string.
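A sketch of the kind of pattern affected; previously each character
consumed by the non-greedy repeat could cost a level of C recursion:
>>> import re
>>> re.match('a*?b', 'a' * 100000 + 'b').end()
100001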
of PyObject_HasAttr(); the former promises never to execute
arbitrary Python code. Undid many of the changes recently made to
worm around the worst consequences of the fact that PyObject_HasAttr()
could execute arbitrary Python code.
Compatibility is hard to discuss, because the dangerous cases are
so perverse, and much of this appears to rely on implementation
accidents.
To start with, using hasattr() to check for __del__ wasn't only
dangerous, in some cases it was wrong: if an instance of an old-
style class didn't have "__del__" in its instance dict or in any
base class dict, but a getattr hook said __del__ existed, then
hasattr() said "yes, this object has a __del__". But
instance_dealloc() ignores the possibility of getattr hooks when
looking for a __del__, so while object.__del__ succeeds, no
__del__ method is called when the object is deleted. gc was
therefore incorrect in believing that the object had a finalizer.
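A hypothetical sketch of that case:
class C:                          # classic class
    def __getattr__(self, name):
        if name == '__del__':
            return lambda: None   # getattr hook claims a finalizer
        raise AttributeError(name)

print hasattr(C(), '__del__')     # True, yet instance_dealloc() calls nothing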
The new method doesn't suffer that problem (like instance_dealloc(),
_PyObject_Lookup() doesn't believe __del__ exists in that case), but
does suffer a somewhat opposite, and even more obscure, oddity:
if an instance of an old-style class doesn't have "__del__" in its
instance dict, and a base class does have "__del__" in its dict,
and the first base class with a "__del__" associates it with a
descriptor (an object with a __get__ method), *and* if that
descriptor raises an exception when __get__ is called, then
(a) the current method believes the instance does have a __del__,
but (b) hasattr() does not believe the instance has a __del__.
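A hypothetical reproduction of that oddity:
class Descr(object):
    def __get__(self, obj, cls):
        raise RuntimeError("no __del__ for you")

class Base:                       # classic class
    __del__ = Descr()

class C(Base):
    pass

print hasattr(C(), '__del__')     # False: the descriptor raised
# ...while the new check finds '__del__' in Base's dict and believes it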
While these disagree, I believe the new method is "more correct":
because the descriptor *will* be called when the object is
destructed, it can execute arbitrary Python code at the time the
object is destructed, and that's really what gc means by "has a
finalizer": not specifically a __del__ method, but more generally
the possibility of executing arbitrary Python code at object
destruction time. Code in a descriptor's __get__() executed at
destruction time can be just as problematic as code in a
__del__() executed then.
So I believe the new method is better on all counts.
Bugfix candidate, but it's unclear to me how all this differs in
the 2.2 branch (e.g., new-style and old-style classes already
took different gc paths in 2.3 before this last round of patches,
but don't in the 2.2 branch).
instead of looping. Smaller and clearer. Faster, too, when we're not
appending to gc.garbage: gc_list_merge() takes constant time, regardless
of the lists' sizes.
append_objects(): Moved up to live with the other list manipulation
utilities.
externally unreachable objects with finalizers, and externally unreachable
objects without finalizers reachable from such objects. This allows us
to call has_finalizer() at most once per object, and so limit the pain of
nasty getattr hooks. This fixes the failing "boom 2" example Jeremy
posted (a non-printing variant of which is now part of test_gc), via never
triggering the nasty part of its __getattr__ method.
to special-case classic classes, or to worry about refcounts;
has_finalizer() deleted the current object iff the first entry in
the unreachable list has changed. I don't believe it was correct
to check for ob_refcnt == 1, either: the dealloc routine would get
called by Py_DECREF then, but there's nothing to stop the dealloc
routine from resurrecting the object, and then gc would remain at
the head of the unreachable list even though its refcount temporarily
fell to 0 (and that would lead to an infinite loop in move_finalizers()).
I'm still worried about has_finalizer() resurrecting other objects
in the unreachable list: what's to stop them from getting collected?
delstr from initgc() into collect(). initgc() isn't called unless the
user explicitly imports gc, so can be used only for initialization of
user-visible module features; delstr needs to be initialized for proper
internal operation, whether or not gc is explicitly imported.
Bugfix candidate? I don't know whether the new bug was backported to
2.2 already.
pack_float, pack_double, save_float: All the routines for creating
IEEE-format packed representations of floats and doubles simply ignored
that rounding can (in rare cases) propagate out of a long string of
1 bits. At worst, the end-off carry can (by mistake) interfere with
the exponent value, and then unpacking yields a result wrong by a factor
of 2. In less severe cases, it can end up losing more low-order bits
than intended, or fail to catch overflow *caused* by rounding.
Bugfix candidate, but I already backported this to 2.2.
In 2.3, this code remains in severe need of refactoring.
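The failure mode can be seen from Python (illustrative): a double whose
mantissa is a long string of 1 bits rounds up when packed into 4 bytes,
and the carry has to propagate into the exponent field.
>>> import struct
>>> struct.unpack('<f', struct.pack('<f', 1.9999999999999998))[0]
2.0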
for specific platforms. Use this to add plat-mac and
plat-mac/lib-scriptpackages on MacOSX. Also tested for not having adverse
effects on Linux, and I think this code isn't used on Windows anyway.
Fixes #661521.
invalid, rather than returning a string of random garbage of the
estimated result length. Closes SF patch #703471 by Hye-Shik Chang.
Will backport to 2.2-maint (consider it done).
it instead of the OS-specific <linux/soundcard.h> or <machine/soundcard.h>.
Mixer devices have an ioctl-only interface, no read/write, so the
flags passed to open() don't really matter. Thus, drop the 'mode'
parameter to openmixer() (i.e. the second arg to newossmixerobject()) and
always open mixers with O_RDWR.
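So opening a mixer is now simply (sketch):
import ossaudiodev
mixer = ossaudiodev.openmixer()   # default device, opened O_RDWR internally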
- The applet logic has been replaced by bundlebuilder's bootstrap script
- Due to Apple being extremely strict about argv[0], we need a way to
  specify the actual executable name for use with sys.executable. See
the comment embedded in the code.
- Implement the behavior as specified in PEP 277, meaning os.listdir()
will only return unicode strings if it is _called_ with a unicode
argument.
- And then return only unicode, don't attempt to convert to ASCII.
- Don't switch on Py_FileSystemDefaultEncoding, but simply use the
default encoding if Py_FileSystemDefaultEncoding is NULL. This means
os.listdir() can now raise UnicodeDecodeError if the default encoding
can't represent the directory entry. (This seems better than silencing
the error and falling back to a byte string.)
- Attempted to describe the above in Doc/lib/libos.tex.
- Reworded the Misc/NEWS items to reflect the current situation.
This checkin also fixes bug #696261, which was due to os.listdir() not
using Py_FileSystemDefaultEncoding, as all file system calls are
supposed to.
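In short (illustrative session; the names returned depend on the
directory and the default encoding):
>>> import os
>>> os.listdir('.')           # byte string argument -> byte strings
['spam.py']
>>> os.listdir(u'.')          # unicode argument -> unicode strings
[u'spam.py']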
[ 555817 ] Flawed fcntl.ioctl implementation,
with my patch that allows for an array to be mutated when passed
as the buffer argument to ioctl() (details complicated by
backwards compatibility considerations -- read the docs!).
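An illustrative use of the new mutate flag, on a platform with termios:
ioctl() fills the mutable array in place.
import array, fcntl, termios
buf = array.array('h', [0, 0, 0, 0])
fcntl.ioctl(0, termios.TIOCGWINSZ, buf, 1)   # final 1: mutate buf in place
rows, cols = buf[0], buf[1]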
Change setup.py to catch all exceptions.
- Rename module if the exception was an ImportError
- Only warn if the exception was any other error
Revert _iconv_codec to raising a RuntimeError.
(see SF bug #690309) and raise ImportErrors instead of
RuntimeErrors, so building Python continues even
if importing iconv_codecs fails.
This is a temporary fix until we get proper configure
support for "broken" iconv implementations.
is required for the chosen internal encoding in the init function,
as this seems to have a better chance of working under Irix and
Solaris.
Also change the test character from '\x01' to '0'.
This might fix SF bug #690309.
Rather than trying to second-guess the various error returns
of a second connect(), use select() to determine whether the
socket becomes writable (which means connected).
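The idea, sketched in Python (the actual change is in the C code; the
peer address here is hypothetical):
import errno, select, socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setblocking(0)
err = s.connect_ex(('127.0.0.1', 8080))
if err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
    # the socket selects writable once the connect has finished
    select.select([], [s], [], 30.0)
    err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)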
reasons: importing the module can fail, or the attribute lookup
module.name can fail. We were giving the same error msg for
both cases, making it needlessly hard to guess what went wrong.
These cases give different error msgs now.
Fix off-by-1 error in normalize_line_endings():
when *p == '\0' the NUL was copied into q and q was auto-incremented,
the loop was broken out of,
then a newline was appended followed by a NUL.
So the function, in effect, was strcpy() but added two extra chars,
which was caught by obmalloc in debug mode, since there was only
room for 1 additional newline.
Get test working under regrtest (added test_main).
the optional proto 2 slot state.
pickle.py, load_build(): CAUTION: Noted that cPickle's
load_build and pickle's load_build really don't do the same
things with the state, and didn't before this patch either.
cPickle never tries to do .update(), and has no backoff if
instance.__dict__ can't be retrieved. There are no tests
that can tell the difference, and part of what cPickle's
load_build() did looked accidental to me, so I don't know
what the true intent is here.
pickletester.py, test_pickle.py: Got rid of the hack for
exempting cPickle from running some of the proto 2 tests.
dictobject.c, PyDict_Next(): documented intended use.
exercised by the test suite before cPickle knows how to create NEWOBJ
too. For now, it was just tried once by hand (via loading a NEWOBJ
pickle created by pickle.py).
inet_aton() rather than inet_addr() -- the latter is obsolete because
it has a problem: "255.255.255.255" is a valid address but
indistinguishable from an error.
(I'm not sure if inet_aton() exists everywhere -- in case it doesn't,
I've left the old code in with an #ifdef.)
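From Python the difference is visible with the all-ones address (sketch):
>>> import socket
>>> socket.inet_aton('255.255.255.255')   # accepted; errors raise instead
'\xff\xff\xff\xff'
inet_addr() would have returned INADDR_NONE (0xFFFFFFFF) here, i.e. its
error value, so this valid address was indistinguishable from a bogus
string.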
* Modules/bz2module.c
(BZ2FileObject): Now the structure includes a pointer to a file object,
instead of "inheriting" one. Also, some members were copied from the
PyFileObject structure to avoid dealing with the internals of that
structure from outside fileobject.c.
(Util_GetLine,Util_DropReadAhead,Util_ReadAhead,Util_ReadAheadGetLineSkip,
BZ2File_write,BZ2File_writelines,BZ2File_init,BZ2File_dealloc,
BZ2Comp_dealloc,BZ2Decomp_dealloc):
These functions were adapted to the change above.
(BZ2File_seek,BZ2File_close): Use PyObject_CallMethod instead of
getting the function attribute locally.
(BZ2File_notsup): Removed, since it's no longer necessary to overload
truncate() and readinto() with dummy functions.
(BZ2File_methods): Added xreadlines() as an alias to BZ2File_getiter,
and removed truncate() and readinto().
(BZ2File_get_newlines,BZ2File_get_closed,BZ2File_get_mode,BZ2File_get_name,
BZ2File_getset):
Implemented getters for "newlines", "mode", and "name".
(BZ2File_members): Implemented "softspace" member.
(BZ2File_init): Reworked to create a file instance instead of initializing
itself as a file subclass. Also, pass "name" object untouched to the
file constructor, and use PyObject_CallFunction instead of building the
argument tuple locally.
(BZ2File_Type): Set tp_new to PyType_GenericNew, tp_members to
BZ2File_members, and tp_getset to BZ2File_getset.
(initbz2): Do not set BZ2File_Type.tp_base nor BZ2File_Type.tp_new.
* Doc/lib/libbz2.tex
Do not mention that BZ2File inherits from the file type.
* Removed the ifilter flag wart by splitting it into two simpler functions.
* Fixed comment tabbing in C code.
* Factored module start-up code into a loop.
Documentation:
* Re-wrote introduction.
* Added examples for quantifiers.
* Simplified Python equivalent for islice().
* Documented split of ifilter().
Sets.py:
* Replaced old ifilter() usage with the new one.
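The resulting pair, for illustration:
>>> from itertools import ifilter, ifilterfalse
>>> list(ifilter(lambda x: x % 2, range(10)))
[1, 3, 5, 7, 9]
>>> list(ifilterfalse(lambda x: x % 2, range(10)))
[0, 2, 4, 6, 8]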
__ne__ no longer complain if they don't know how to compare to the other
thing. If no meaningful way to compare is known, saying "not equal" is
sensible. This allows things like
if adatetime in some_sequence:
and
somedict[adatetime] = whatever
to work as expected even if some_sequence contains non-datetime objects,
or somedict non-datetime keys, because they only call __eq__.
It still complains (raises TypeError) for mixed-type comparisons in
contexts that require a total ordering, such as list.sort(), use as a
key in a BTree-based data structure, and cmp().
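For example:
>>> import datetime
>>> d = datetime.datetime(2003, 5, 1)
>>> d == 'not a datetime'     # mixed-type equality is now simply False
False
>>> d in ['not a datetime', d]
True
while cmp(d, 'not a datetime') still raises TypeError.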