sentences are separated by two spaces.
Improve _fix_sentence_endings() a bit -- look for ".!?" instead of just
".", and factor out the list of sentence-ending punctuation characters
to a class attribute.
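A minimal sketch of the effect, using the public textwrap API as it exists in current Python (the sample text is made up for illustration). With fix_sentence_endings enabled, "?", "!" and "." at the end of a lowercase word are each followed by two spaces:

```python
import textwrap

# Hypothetical sample text; every sentence is separated by a single space.
text = "Is it done? Yes! Ship it. Then rest."

# fix_sentence_endings asks the wrapper to put two spaces after
# sentence-ending punctuation that follows a lowercase letter.
fixed = textwrap.fill(text, width=70, fix_sentence_endings=True)
print(fixed)
```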
I've made considerable changes to Michael's code, specifically to use
the select() system call directly and to store the timeout as a C
double instead of a Python object; internally, -1.0 (or anything
negative) represents the None from the API.
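For reference, a small sketch of the Python-level API this describes, as it behaves in current Python: None means "no timeout", a non-negative float is a timeout in seconds. The negative value is only the internal C representation; passing one from Python is rejected.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    sock.settimeout(2.5)              # timeout in seconds, stored as a float
    float_timeout = sock.gettimeout()

    sock.settimeout(None)             # None switches back to blocking mode
    blocking_timeout = sock.gettimeout()

    try:
        sock.settimeout(-1.0)         # negative values are not part of the API
    except ValueError:
        negative_rejected = True
    else:
        negative_rejected = False
finally:
    sock.close()
```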
I'm not 100% sure that all corner cases are covered correctly, so
please keep an eye on this. Next I'm going to try it on Windows before
Tim complains.
No way is this a bugfix candidate. :-)
Straightforward fix. Will backport to 2.2. If there's ever a new 2.1
release, this could be backported there too (since it's an issue with
anything that's got both a __reduce__ and a __setstate__).
This is a conservative version of SF patch 504889. It uses the log
module instead of calling print in various places, and it ignores the
verbose argument passed to many functions and set as an attribute on
some objects. Instead, it uses the verbosity set on the logger via
the command line.
The log module is now preferred over announce() and warn() methods
that exist only for backwards compatibility.
XXX This checkin changes a lot of modules that have no test suite and
aren't exercised by the Python build process. It will need
substantial testing.
While I was at it, I added a tp_clear handler and changed the
tp_dealloc handler to use the clear_slots helper for the tp_clear
handler.
Also tightened the rules for slot names: they must now be proper
identifiers (ignoring the dirty little fact that <ctype.h> is locale
sensitive).
Also set mp->flags = READONLY for the __weakref__ pseudo-slot.
Most of this is a 2.2 bugfix candidate; I'll apply it there myself.
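A quick sketch of the tightened slot-name rule as seen from Python (class names here are made up; the exact error message is not asserted because it may differ between versions):

```python
class Point:
    # Valid: every slot name is a proper identifier.
    __slots__ = ("x", "y")


def make_bad_class():
    # Hypothetical class whose slot name contains spaces.
    return type("Bad", (), {"__slots__": ("not an identifier",)})


try:
    make_bad_class()
except TypeError as exc:
    slot_error = exc
else:
    slot_error = None

p = Point()
p.x = 1
try:
    p.z = 3          # not a declared slot, and there is no __dict__
except AttributeError:
    undeclared_rejected = True
else:
    undeclared_rejected = False
```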
# XXX this isn't used anywhere, and worse, it has the same name as a method
# in Command with subtly different semantics. (This one just has one
# source -> one dest; that one has many sources -> one dest.) Nuke it?
Yes. Nuke it.
modules, distutils does not understand that the build version of the
source tree is needed.
This patch fixes distutils.sysconfig to understand that the running
Python is part of the build tree and needs to use the appropriate
"shape" of the tree. This does not assume anything about the current
directory, so can be used to build 3rd-party modules using Python's
build tree as well.
This is useful since it allows us to use a non-installed debug-mode
Python with 3rd-party modules for testing. It has the side effect that
set_python_build() is no longer needed (the hack which was added to
allow distutils to be used to build the "standard" extension modules).
This closes SF patch #547734.
BOM_UTF32, BOM_UTF32_LE and BOM_UTF32_BE that represent the Byte
Order Mark in UTF-8, UTF-16 and UTF-32 encodings for little and
big endian systems.
The old names BOM32_* and BOM64_* were off by a factor of 2.
This closes SF bug http://www.python.org/sf/555360
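A short sketch showing why the old names were off by a factor of 2, using the constants as they exist in the codecs module of current Python (the old BOM32_* and BOM64_* names are still present there as aliases):

```python
import codecs

# Length in bytes of each byte order mark.
bom_lengths = {
    "BOM_UTF8": len(codecs.BOM_UTF8),
    "BOM_UTF16_LE": len(codecs.BOM_UTF16_LE),
    "BOM_UTF16_BE": len(codecs.BOM_UTF16_BE),
    "BOM_UTF32_LE": len(codecs.BOM_UTF32_LE),
    "BOM_UTF32_BE": len(codecs.BOM_UTF32_BE),
}

# The old names count bits at twice the real code unit size:
# BOM32_* is really the UTF-16 mark, BOM64_* the UTF-32 mark.
old_names_are_aliases = (
    codecs.BOM32_LE == codecs.BOM_UTF16_LE
    and codecs.BOM64_BE == codecs.BOM_UTF32_BE
)

# The generic "utf-16" codec writes the native-order BOM first.
starts_with_bom = "abc".encode("utf-16").startswith(codecs.BOM_UTF16)
```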
Change the module constructor (module_init) to have the signature
__init__(name:str, doc=None); this prevents the call from type_new()
from succeeding. While we're at it, prevent repeated calling of
module_init for the same module from leaking the dict, changing the
semantics so that __dict__ is only initialized if NULL.
Also adding a unittest, test_module.py.
This is an incompatibility with 2.2. If anybody was instantiating the
module class before, their argument list was probably empty, so this
can't be backported to 2.2.x.
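A sketch of the resulting behaviour as seen from Python, written against types.ModuleType in current Python (the module and attribute names are made up). The second __init__ call illustrates the "only initialized if NULL" rule: the existing namespace is kept.

```python
import types

# name is required, doc is optional
mod = types.ModuleType("demo", "Docstring for the demo module.")
mod.answer = 42

try:
    types.ModuleType()              # empty argument list is now an error
except TypeError:
    name_required = True
else:
    name_required = False

# Calling __init__ again reuses the existing __dict__ instead of
# replacing it, so attributes set earlier survive.
mod.__init__("demo_renamed")
```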
The HTTPError class tries to act as a regular response object for
HTTP protocol errors that include full responses. If the failure is
more basic, like no valid response, the __init__ choked when it tried
to initialize its superclasses in the addinfourl hierarchy, which
requires a valid response.
The solution isn't elegant but seems to be effective. Do not
initialize the base classes if there isn't a file object containing
the response. In this case, user code expecting to use the addinfourl
methods will fail; but it was going to fail anyway.
It might be cleaner to factor out HTTPError into two classes, only one
of which inherits from addinfourl. Not sure that the extra complexity
would lead to any improved functionality, though.
Partial fix for SF bug #563665.
Bug fix candidate for 2.1 and 2.2.
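A minimal sketch of the case being fixed, written against urllib.error.HTTPError in current Python rather than the urllib2 module this checkin touched (the URL is a placeholder, and how the missing file object is handled internally differs between versions):

```python
from email.message import Message
from urllib.error import HTTPError

# No file object is available (fp=None), e.g. because there was no
# valid response to wrap. Construction should still succeed.
err = HTTPError("http://example.invalid/", 404, "Not Found", Message(), None)

# The error-specific attributes work without a response body.
summary = (err.code, err.msg, str(err))
```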
-f/--fromfile <filename>
option. This runs all and only the tests named in the file, in the
order given (although -x may weed that list, and -r may shuffle it).
Lines starting with '#' are ignored.
This goes a long way toward helping to automate the binary-search-like
procedure I keep reinventing by hand when a test fails due to interaction
among tests (no failure in isolation, and some unknown number of
predecessor tests need to run first -- now you can stick all the test
names in a file, and comment/uncomment blocks of lines until finding a
minimal set of predecessors).
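The file format is simple enough to sketch. The helper below is hypothetical, not regrtest's actual code, and it assumes that blank lines are skipped as well as '#' lines:

```python
def read_test_names(lines):
    """Return test names in file order, ignoring comment lines.

    Hypothetical helper for illustration only. Skipping blank lines
    is an assumption; the checkin only promises that lines starting
    with '#' are ignored.
    """
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(line)
    return names


# Comment out blocks of predecessors until the failure disappears.
sample = [
    "test_grammar",
    "#test_socket",
    "#test_mmap",
    "test_bsddb",
    "",
    "test_failing",
]
selected = read_test_names(sample)
```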
__call__() can be 2-3x slower than the equivalent normal method.
_handle_message(): The structure of message/rfc822 messages has
changed. Now the parent's payload is a list of length 1, and the zeroth
element is the Message sub-object. Adjust the printing of such
message trees to reflect this change.
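A small sketch of the structure described above, using the email package in current Python (the message text is made up):

```python
import email

# Hypothetical message/rfc822 container wrapping one inner message.
raw = (
    "Content-Type: message/rfc822\n"
    "\n"
    "Subject: inner message\n"
    "\n"
    "inner body\n"
)

outer = email.message_from_string(raw)
payload = outer.get_payload()   # a list of length 1
inner = payload[0]              # the Message sub-object
```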
There's some weirdness here, but the test ran before and not after,
so I'm just hacking the change out. Someone more motivated than
me can work out what's really happening.
Raymond: *PLEASE* run the test suite before checking things like
this in!
subclasses.
MIMENonMultipart: Base class for non-multipart/* content type subclass
specializations, e.g. image/gif. This class overrides attach() which
raises an exception, since it makes no sense to attach a subpart to
e.g. an image/gif message.
MIMEMultipart: Base class for multipart/* content type subclass
specializations, e.g. multipart/mixed. Does little more than provide
a useful constructor.
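A sketch of the difference between the two base classes, using their concrete subclasses under the email.mime.* module paths of current Python (an assumption relative to this checkin, where the module layout was different):

```python
from email import errors
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# A multipart container accepts subparts.
container = MIMEMultipart("mixed")
container.attach(MIMEText("hello"))

# A non-multipart message refuses them.
leaf = MIMEText("leaf")
try:
    leaf.attach(MIMEText("nope"))
except errors.MessageError as exc:
    attach_error = exc
else:
    attach_error = None
```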
If a rexec instance allows writing in the current directory (a common
thing to do), there's a way to execute bogus bytecode. Fix this by
not allowing imports from .pyc files (in a way that allows a site to
configure things so that .pyc files *are* allowed, if writing is not
allowed).
I'll apply this to 2.2 and 2.1 too.
- Add comment explaining the structure of the stack.
- Minor optimization: make stack tuple directly usable as part of return
value for enter/exit events.
(or how do I "mark" something to be a candidate?)
fixed an old buglet that caused bdb to be unable to
continue in the botframe, after a breakpoint was set.
the key idea is not to set botframe to the bottom level frame,
but its f_back, which actually might be None.
Additional changes: migrated old exception trick to use
sys._getframe(), which exists both in 2.1 and 2.2.
Note: I believe Mark Hammond needs to look over his code now.
F5 correctly starts up in the debugger, but later on doesn't stop at a given
breakpoint any longer.
kind regards - chris
[ 559250 ] more POSIX signal stuff
Adds support (and docs and tests and autoconfery) for posix signal
mask handling -- sigpending, sigprocmask and sigsuspend.
A MemoryError is now raised when the list cannot be created.
There is a test, but as the comment says, it really only
works for 32 bit systems. I don't know how to improve
the test for other systems (i.e., 64 bit or systems
where the data size != addressable size,
e.g. 64 bit data, but 48 bit addressable memory)
instead of calling the getaddrlist() method, since the latter doesn't
work with multiple calls (it will return the empty list for the second
and subsequent calls).
Closes SF bug #555035. Include a unittest.
of the PyUNIT version of the same file. This helps people understand that
this version is the same as the version from the independent PyUNIT
release (confusion was indicated on the PyUNIT mailing list).
email package's Parser to handle the three common line endings.
Certain protocols such as IMAP define CRLF line endings and it doesn't
make sense for the client app to have to normalize the line endings
before handing the message off to the Parser.
_parsebody(): Be more flexible in the matching of line endings for
finding the MIME separators. Accept any of \r, \n and \r\n. Note
that we do /not/ change the line endings in the payloads, we just
accept any of those three around MIME boundaries.
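A sketch of the intended behaviour using the email package in current Python. The message is made up, and only LF and CRLF are exercised here to keep the example short:

```python
import email

# Hypothetical two-part message, one header or body line per entry.
LINES = [
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "first part",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "second part",
    "--XYZ--",
    "",
]


def count_parts(eol):
    """Parse the message joined with the given line ending."""
    msg = email.message_from_string(eol.join(LINES))
    return len(msg.get_payload()) if msg.is_multipart() else 0


lf_parts = count_parts("\n")
crlf_parts = count_parts("\r\n")
```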
single byte character sets. Also fixed a semantic problem with the
constructor's default arguments. Specifically,
__init__(): Change the maxlinelen argument default to None instead of
MAXLINELEN. The semantics should have been (and now are) that if
maxlinelen is given it is always honored. If it isn't given, but
header_name is given, then the maximum line length is calculated. If
neither is given then the default of 76 characters is used.
_split(): If the character set is a single byte character set then we
can split the line at the maxlinelen because we know that encoding the
header won't increase its length. If the charset isn't a single byte
charset then we use the quicker divide-and-conquer line splitting
algorithm as before.
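A sketch of the three ways of calling the constructor, using email.header.Header in current Python. The splitting algorithm has been rewritten since this checkin, so the exact line breaks are not shown; the header text is made up.

```python
from email.header import Header

# Hypothetical long ASCII header value (149 characters).
words = " ".join(["word"] * 30)

# maxlinelen given: it is honored.
explicit = Header(words, maxlinelen=40).encode()

# header_name given: room is left for "Subject: " on the first line.
named = Header(words, header_name="Subject").encode()

# Neither given: the default maximum line length applies.
default = Header(words).encode()

explicit_lines = explicit.splitlines()
default_lines = default.splitlines()
```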
for the email package. The former is now just a shell project that
has some extra files for packaging for independent use (e.g. setup.py
and README).
Added a compatibility layer so that the same API can be used in Python
2.1 and 2.2/2.3 with the major differences shuffled off into helper
modules (_compat21.py and _compat22.py).
Also bumped the package version number to 2.0.3 for some fixes to be
checked in momentarily.
Scot Stevenson. Could be a bug fix candidate, but probably doesn't
matter much unless a certain blue-nosed cat suddenly becomes corporeal
and starts emailing some stmp.py (sic) fronted mailer.
returned a proxy for __class__ whose __bases__ was also a proxy. The
merge_class_dict() helper for dir() assumed incorrectly that __bases__
would always be a tuple and used the in-line tuple API on the proxy.
I will backport this to 2.2 as well.
test if 'callable' has not been supplied is to test for None instead of
False. The previous correction to 'if callable()' was wrong because an unusable
callback would be ignored rather than raising an exception.
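The distinction can be sketched as follows. The function and argument names are hypothetical and not taken from the module being fixed:

```python
def run(callback=None):
    """Call callback if one was supplied.

    Hypothetical function for illustration. Testing "is None" means
    that a supplied but unusable value (0, "", an empty list) still
    reaches the call and raises, instead of being silently ignored
    the way "if not callback" would ignore it.
    """
    if callback is None:
        return "no callback"
    return callback()


not_supplied = run()
supplied = run(lambda: "called")

try:
    run(0)                      # falsy, supplied, and not callable
except TypeError:
    unusable_raises = True
else:
    unusable_raises = False
```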
and the .seed() and .whseed() methods failed to reset it. In other
words, setting the seed didn't completely determine the sequence of
results produced by random.gauss(). It does now. Programs repeatedly
mixing calls to a seed method with calls to gauss() may see different
results now.
Bugfix candidate (random.gauss() has always been broken in this way),
despite that it may change results.
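A sketch of the fixed behaviour using random.Random in current Python (the seed value is arbitrary). An odd number of gauss() calls leaves a cached value behind, which seeding now discards:

```python
import random

rng = random.Random()

rng.seed(12345)
first = [rng.gauss(0.0, 1.0) for _ in range(3)]   # leaves a cached value

rng.seed(12345)                                    # must also drop the cache
second = [rng.gauss(0.0, 1.0) for _ in range(3)]
```

With the fix, first and second are identical lists.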
This now does a dynamic analysis of which elements are so frequently
repeated as to constitute noise. The primary benefit is an enormous
speedup in find_longest_match, as the innermost loop can have hundreds
of times fewer potential matches to worry about, in cases where the
sequences have many duplicate elements. In effect, this zooms in on
sequences of non-ubiquitous elements now.
While I like what I've seen of the effects so far, I still consider
this experimental. Please give it a try!
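A sketch of the effect using difflib.SequenceMatcher in current Python, where this heuristic is on by default and the set of "popular" elements is exposed as the bpopular attribute (the sequences are made up):

```python
from difflib import SequenceMatcher

# Hypothetical sequences: b is dominated by one repeated element.
a = "abc"
b = "x" * 300 + "abc"

matcher = SequenceMatcher(None, a, b)

# Elements repeated often enough in b are treated as noise.
popular = matcher.bpopular

# The search only has to consider the non-ubiquitous elements.
match = matcher.find_longest_match(0, len(a), 0, len(b))
```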
On Win2K it thought 'foo' started at byte offset 0 instead of at the
pagesize, and on Win98 it thought 'foo' didn't exist at all. Somehow
or other this is related to the new "in memory file" gimmicks in
bsddb, but the old bsddb we use on Windows sucks so bad anyway I don't
want to bother digging deeper. Flushing the file in test_mmap after
writing to it makes the problem go away, so good enough.
build's "undetected error" problems were originally detected with
extension types, but we can whitebox test the same situations with
new-style classes.
Also add a test that Python doesn't die with SIGXFSZ if it exceeds the
file rlimit. (Assuming this will also test the behavior when the 2GB
limit is exceeded on a platform that doesn't have large file support.)
closes SF #514433
can now pass 'None' as the filename for the bsddb.*open functions,
and you'll get an in-memory temporary store.
docs are ripped out of the bsddb dbopen man page. Fred may want to
clean them up.
Considering this for 2.2, but not 2.1.
http://www.python.org/sf/444708
This adds the optional argument for str.strip
to unicode.strip too and makes it possible
to call str.strip with a unicode argument
and unicode.strip with a str argument.
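For illustration, the optional argument in current Python, where str is the only text type (sample strings are made up). The argument is a set of characters to remove, not a prefix or suffix:

```python
# Default: strip whitespace.
plain = "  hello  ".strip()

# With an argument: strip any of the given characters from both ends.
dashes = "--hello--".strip("-")
mixed = "abchelloabc".strip("abc")

# lstrip and rstrip take the same optional argument.
left_only = "--hello--".lstrip("-")
```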
- islink() now returns true for alias files
- walk() no longer follows aliases while traversing
- realpath() implemented, returning an alias-free pathname.
As this could conceivably break existing code I think it isn't a bugfix candidate.
test data: this test fails on Windows now if universal newlines are
enabled (which they aren't yet, by default). I don't know whether the
test will also fail on Linux now.
SF bug #522264 reported by Evelyn Mitchell.
The code included a comment about "STAR STAR" which was translated
into the code as the bogus attribute token.STARSTAR. This name never
caused an attribute error because it was never retrieved. The code
was based on an old version of the grammar that specified kwargs as
two tokens ('*' '*'). I checked as far back as 2.1 and didn't find
this production.
The fix is simple, because token.DOUBLESTAR is the only token
allowed. Also update the grammar fragment in com_arglist().
XXX I'll bet lots of other grammar fragments in comments are out of
date, probably in this module and in compile.c.
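A quick check of the token names, using the token and tokenize modules in current Python (the source line is made up):

```python
import io
import token
import tokenize

source = "f(*args, **kwargs)\n"

# Collect the exact operator token types in the call.
ops = [
    tok.exact_type
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == token.OP
]

has_doublestar = token.DOUBLESTAR in ops
has_bogus_name = hasattr(token, "STARSTAR")
```

The tokenizer emits one DOUBLESTAR token for "**", not two STAR tokens, and the token module has no STARSTAR attribute.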
Close a file before trying to unlink it, and apparently Cygwin needs
writes to an mmap'ed file to get flushed before they're visible.
Bugfix candidate, but I think only for the 2.2 line (it's testing
features that I think were new in 2.2).
Change type_get_doc (the get function for __doc__) to look in tp_dict
more often, and if it finds a descriptor in tp_dict, to call it (with
a NULL instance). This means you can add a __doc__ descriptor to a
new-style class that returns instance docs when called on an instance,
and class docs when called on a class -- or the same docs in either
case, but lazily computed.
I'll also check this into the 2.2 maintenance branch.
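A minimal sketch of such a descriptor, run against current Python (the class names and doc strings are made up):

```python
class LazyDoc:
    """Hypothetical __doc__ descriptor, computed on each access."""

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class: no instance is passed.
            return "Docs for class %s" % owner.__name__
        return "Docs for instance %r" % (instance.name,)


class Widget:
    __doc__ = LazyDoc()

    def __init__(self, name):
        self.name = name


class_doc = Widget.__doc__
instance_doc = Widget("knob").__doc__
```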
If a str or unicode method returns the original object,
make sure that for str and unicode subclasses the original
will not be returned.
This should prevent SF bug http://www.python.org/sf/460020
from reappearing.
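A sketch of the rule as it behaves in current Python, where str is the only text type (the subclass name is made up). A method that has nothing to change still must not hand back the subclass instance:

```python
class Tagged(str):
    """Hypothetical str subclass."""


original = Tagged("already stripped")

# strip() has nothing to remove, but the result is a plain str
# and not the original subclass instance.
stripped = original.strip()

same_value = stripped == original
same_object = stripped is original
result_type = type(stripped)
```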