gzip shouldn't raise ValueError on corrupt files
Currently the gzip module will raise a ValueError if the file was
corrupt (bad crc or bad size). I can't see how that applies to
reading a corrupt file. IOError seems better, and it's what code
will likely be looking for.
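A minimal sketch of the calling pattern this change is meant to support, assuming a modern Python 3 in which IOError is an alias of OSError and gzip reports a CRC mismatch through an OSError subclass. The helper name read_gzip_or_none and the way the payload is corrupted are illustrative only.

```python
import gzip


def read_gzip_or_none(blob):
    """Return the decompressed bytes, or None if the gzip data is corrupt."""
    try:
        return gzip.decompress(blob)
    except OSError:
        # A CRC or size mismatch surfaces here as an I/O-style error.
        return None


good = gzip.compress(b"hello world")
# Flip one byte of the CRC-32 stored in the 8-byte trailer, leaving the
# compressed stream itself intact.
bad = good[:-8] + bytes([good[-8] ^ 0xFF]) + good[-7:]
```

Code that already guards file reads with "except IOError" then handles a damaged archive without a separate ValueError clause.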
guarantee to keep valid pointers in its slots.
tests: Moved ExtensionSaver from test_copy_reg into pickletester, and
use it both places. Once extension codes get assigned, it won't be
safe to overwrite them willy nilly in test suites, and ExtensionSaver
does a thorough job of undoing any possible damage.
Beefed up the EXT[124] tests a bit, to check the smallest and largest
codes in each opcode's range too.
blindly assumed that tp_as_sequence->sq_item always returns
a str or unicode object. This might fail with str or unicode
subclasses.
This patch checks whether the object returned from __getitem__
is a str/unicode object and raises a TypeError if not (and
the filter function returned true).
Furthermore the result for __getitem__ can be more than one
character long, so checks for enough memory have to be done.
this clarifies that they are part of an internal API (albeit shared
between pickle.py, copy_reg.py and cPickle.c).
I'd like to do the same for copy_reg.dispatch_table, but worry that it
might be used by existing code. This risk doesn't exist for the
extension registry.
module. This increases code coverage of Python/sysmodule.c
from 68% to 77% (on Linux).
The script doesn't exercise the error branch that handles an evil
or lost sys.excepthook in Python/pythonrun.c::PyErr_PrintEx().
Also this script might not work on Jython in its current form.
From SF patch #662807.
extension implemented flush() was fixed. Scott also rewrote the
zlib test suite using the unittest module. (SF bug #640230 and
patch #678531.)
Backport candidate I think.
outcome as __slotnames__ on the class. (Like __slots__, it's not safe
to ask for this as an attribute -- you must look for it in the
specific class's __dict__. But it must be set using attribute
notation, because __dict__ is a read-only proxy.)
* Treat major, minor numbers of HTTP version as separate integers
* Fix errors if version string is "HTTP/1.2.3" or even simply "BLAH".
* send_error() checks whether self.command is 'HEAD'. However, if
there's an error parsing the first line of the HTTP request,
self.command hasn't been set yet; force self.command to be
initialized to None.
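A rough sketch of the version handling described in the first two items, not the actual BaseHTTPServer code. The function name parse_http_version is hypothetical; the point is that comparing (major, minor) tuples of integers orders versions correctly where a float comparison would not.

```python
def parse_http_version(version):
    """Parse 'HTTP/<major>.<minor>' into a (major, minor) tuple of ints."""
    prefix = "HTTP/"
    if not version.startswith(prefix):
        raise ValueError("bad request version: %r" % version)
    parts = version[len(prefix):].split(".")
    # Exactly two numeric components: rejects "HTTP/1.2.3" and "HTTP/1".
    if len(parts) != 2 or not all(part.isdigit() for part in parts):
        raise ValueError("bad request version: %r" % version)
    return int(parts[0]), int(parts[1])
```

With this representation HTTP/1.10 sorts after HTTP/1.9, which would not hold if the version were read as the float 1.10.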
are actually getting generated. Add helper method
ensure_opcode_in_pickle to do a correct job checking for that. Changed
test_long1(), test_long4(), and test_short_tuples() to use it.
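One way such a check can be written today, sketched with pickletools.genops from Python 3 rather than copied from the test suite. The name opcode_in_pickle is an assumption; walking the opcode stream avoids false positives from an opcode byte that merely appears inside an argument.

```python
import pickle
import pickletools


def opcode_in_pickle(opcode, data):
    """Return True if `opcode` (a one-byte bytes object) is really emitted."""
    code = opcode.decode("latin-1")
    # genops() yields (opcode_info, argument, position) for each opcode.
    return any(op.code == code for op, arg, pos in pickletools.genops(data))
```

For instance, a large integer pickled with protocol 2 should contain LONG1, while the same value pickled with protocol 1 should not.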
huge. On older Linux systems, the C library's strtod() apparently
gives up before seeing the end of the string when it sees so many
digits that it thinks the result must be Infinity. (It is wrong, BTW
-- there could be an "e-10000" hiding behind 10,000 digits.) The
shorter shuge still tests what it's testing, without relying on
strtod() doing a super job.
exception, ResourceDenied. This is used to distinguish tests that
are skipped for other reasons (platform support, missing data, etc.) from
those that are skipped because a "resource" has not been enabled. This
prevents those tests from being reported as unexpected skips for the
platform; those should only be considered unexpected skips if the resource
were enabled.
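A simplified sketch of the idea, with hypothetical helper names (requires, classify_skip) and a stand-in TestSkipped base class; it is not the regrtest implementation. Making ResourceDenied a subclass lets existing "skipped" handling keep working while the reporter tells the two cases apart.

```python
class TestSkipped(Exception):
    """A test was skipped for any reason (platform, missing data, ...)."""


class ResourceDenied(TestSkipped):
    """A test was skipped because a required resource was not enabled."""


def requires(resource, enabled_resources):
    """Raise ResourceDenied unless `resource` has been enabled."""
    if resource not in enabled_resources:
        raise ResourceDenied("resource %r is not enabled" % resource)


def classify_skip(exc, expected_on_platform):
    """Decide how a skip should be reported."""
    if isinstance(exc, ResourceDenied):
        # Never counted as an unexpected skip while the resource is off.
        return "resource denied"
    return "expected skip" if expected_on_platform else "unexpected skip"
```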
[ 676342 ] after using pdb readline does not work correctly
using Michael Stone's patch so the completer functionality of
cmd is only set up between preloop and postloop.
loops. Renamed DATA and BINDATA to DATA0 and DATA1. Included
disassemblies, but noted why we can't test them. Added XXX comment to
cPickle about a mysterious comment, where pickle and cPickle diverge
in how they number PUT indices.
Assorted code cleanups; e.g., sizeof(char) is 1 by definition, so there's
no need to do things like multiply by sizeof(char) in hairy malloc
arguments. Fixed an undetected-overflow bug in readline_file().
longobject.c: Fixed a really stupid bug in the new _PyLong_NumBits.
pickle.py: Fixed stupid bug in save_long(): When proto is 2, it
wrote LONG1 or LONG4, but forgot to return then -- it went on to
append the proto 1 LONG opcode too.
Fixed equally stupid cancelling bugs in load_long1() and
load_long4(): they *returned* the unpickled long instead of pushing
it on the stack. The return values were ignored. Tests passed
before only because save_long() pickled the long twice.
Fixed bugs in encode_long().
Noted that decode_long() is quadratic-time despite our hopes,
because long(string, 16) is still quadratic-time in len(string).
It's hex() that's linear-time. I don't know a way to make decode_long()
linear-time in Python, short of maybe transforming the 256's-complement
bytes into marshal's funky internal format, and letting marshal decode
that. It would be more valuable to make long(string, 16) linear time.
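For reference, the little-endian 256's-complement format can be sketched in current Python with int.to_bytes() and int.from_bytes(), which did not exist when this was written. The code below is an illustration of the byte format only, not the 2.3 implementation, and says nothing about its running time.

```python
def encode_long(n):
    """Encode an int as little-endian two's-complement bytes (b'' for 0)."""
    if n == 0:
        return b""
    nbytes = (n.bit_length() >> 3) + 1
    data = n.to_bytes(nbytes, "little", signed=True)
    # A negative value may carry one redundant 0xff sign byte; drop it.
    if n < 0 and nbytes > 1 and data[-1] == 0xFF and data[-2] & 0x80:
        data = data[:-1]
    return data


def decode_long(data):
    """Inverse of encode_long(); the empty string decodes to 0."""
    return int.from_bytes(data, "little", signed=True)
```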
pickletester.py: Added a global "protocols" vector so tests can try
all the protocols in a sane way. Changed test_ints() and test_unicode()
to do so. Added a new test_long(), but the tail end of it is disabled
because it "takes forever" under pickle.py (but runs very quickly under
cPickle: cPickle proto 2 for longs is linear-time).
anymore either, so don't. This also makes it possible to get rid of obscure code
making __getnewargs__ identical to __getstate__ (hmm ... hope there
wasn't more to this than I realize!).
longer needs to be public, and shouldn't be public because all datetime
objects are immutable. The Python implementation has changed
accordingly, but the C implementation still needs to change.
The 4th item can be None or an iterator yielding list items, which are
used to append() or extend() the object. The 5th item can be None or
an iterator yielding a dict's (key, value) pairs, which are stuffed
into the object using __setitem__.
Also (as a separate, though related, feature) add "batching" for list
and dict items. If you pickled a dict or list with a million items in
the past, it would push a million items onto the stack. It now pushes
only 1000 items at a time on the stack, using repeated APPENDS or
SETITEMS opcodes. (For lists, I hope that using many short extend()
calls doesn't exhibit quadratic behavior.)
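A small sketch of both features as they can be observed in Python 3. The classes Countdown and Squares are hypothetical; they reduce to plain list and dict objects so the example does not depend on how the classes themselves are pickled. The count_opcode helper is also an assumption, built on pickletools.genops.

```python
import pickle
import pickletools


class Countdown:
    """Hypothetical object that rebuilds itself as a plain list."""

    def __init__(self, start):
        self.start = start

    def __reduce__(self):
        # (callable, args, state, iterator of list items)
        return (list, (), None, iter(range(self.start, 0, -1)))


class Squares:
    """Hypothetical object that rebuilds itself as a plain dict."""

    def __init__(self, n):
        self.n = n

    def __reduce__(self):
        # (callable, args, state, list items, iterator of (key, value) pairs)
        return (dict, (), None, None, ((i, i * i) for i in range(self.n)))


def count_opcode(name, data):
    """Count how many times the named opcode occurs in a pickle."""
    return sum(1 for op, arg, pos in pickletools.genops(data) if op.name == name)


as_list = pickle.loads(pickle.dumps(Countdown(3), 2))
as_dict = pickle.loads(pickle.dumps(Squares(3), 2))
# A 2500-item list is written as batches of at most 1000 items each.
batches = count_opcode("APPENDS", pickle.dumps(list(range(2500)), 2))
```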
__module__ is the string name of the module the function was defined
in, just like __module__ of classes. In some cases, particularly for
C functions, the __module__ may be None.
Change PyCFunction_New() from a function to a macro, but keep an
unused copy of the function around so that we don't change the binary
API.
Change pickle's save_global() to use whichmodule() if __module__ is
None, but add the __module__ logic to whichmodule() since it might be
used outside of pickle.
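The lookup order described above can be sketched roughly as follows. The helper name module_name_of and the "__main__" fallback are assumptions for illustration; pickle's own whichmodule() may differ in detail.

```python
import math
import sys


def module_name_of(func):
    """Best-effort name of the module that defines `func`."""
    name = getattr(func, "__module__", None)
    if name is not None:
        return name
    # __module__ is None (possible for some C functions): fall back to
    # scanning loaded modules for one that exposes this exact object.
    func_name = getattr(func, "__name__", None)
    if func_name:
        for mod_name, module in list(sys.modules.items()):
            try:
                if module is not None and getattr(module, func_name, None) is func:
                    return mod_name
            except Exception:
                continue  # some modules misbehave on attribute access
    return "__main__"
```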
error handlers in the Unicode codecs: Negative
positions are treated as being relative to the end of
the input and out of bounds positions result in an
IndexError.
Also update the PEP and include an explanation of
this in the documentation for codecs.register_error.
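A short sketch of the position rules, using handler names invented for this example and the behavior of the built-in ASCII encoder in Python 3. The first handler resumes two characters before the end of the input; the second returns a position past the end.

```python
import codecs


def skip_to_last_two(exc):
    # Replacement text plus a negative resume position: -2 means
    # "two characters before the end of the input".
    return ("~", -2)


def out_of_bounds(exc):
    # One past the end of the input is still legal; two past is not.
    return ("~", len(exc.object) + 1)


codecs.register_error("demo.skip_to_last_two", skip_to_last_two)
codecs.register_error("demo.out_of_bounds", out_of_bounds)

# The error is at index 2; encoding resumes at index 7 - 2 = 5 ("fg").
encoded = "ab\xe9cdefg".encode("ascii", "demo.skip_to_last_two")
```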
Fixes a small bug in iconv_codecs: if the position
from the callback is negative, *add* it to the size
instead of subtracting it.
From SF patch #677429.
M rpc.py
SF Bug 676398 Doesn't handle non-built-in exceptions
1. Move exception formatting to the subprocess; allows subclassing of
exceptions, including subclasses created in the shell without
introducing excessive complexity in the RPC mechanism.
2. Provide access to linecache from subprocess to support this.
classes have a __reduce__ that returns (self.__class__,
self.__getstate__()). tzinfo.__reduce__() is a bit smarter, calling
__getinitargs__ and __getstate__ if they exist, and falling back to
__dict__ if it exists and isn't empty.
for this iconv() implementation in the init function.
For encoding: use a byteswapped version of the input if
necessary.
For decoding: byteswap every piece returned by iconv()
if necessary (but not those pieces returned from the
callback)
Comment out test_sane() in the test script, because
whether this works depends on whether byte swapping
is necessary or not (and on Py_UNICODE_SIZE)
on the type instead of self.save(t). This defeated the purpose of
NEWOBJ, because it didn't generate a BINGET opcode when t was already
memoized; but moreover, it would generate multiple BINPUT opcodes for
the same type! pickletools.dis() doesn't like this.
How did I find this? I was playing with picklesize.py in the datetime
sandbox, and noticed that protocol 2 pickles for multiple objects were
in fact larger than protocol 1 pickles! That was suspicious, so I
decided to disassemble one of the pickles.
This really needs a unit test, but I'm exhausted. I'll be late for
work as it is. :-(
the same function, don't save the state or write a BUILD opcode. This
is so that a type (e.g. datetime :-) can support protocol 2 using
__getnewargs__ while also supporting protocol 0 and 1 using
__getstate__. (Without this, the state would be pickled twice with
protocol 2, unless __getstate__ is defined to return None, which
breaks protocol 0 and 1.)
popped a MARK, but without stack emulation the disassembler couldn't
know that, and subsequent indentation got hosed.
Now the disassembler does do enough stack emulation to catch this. While
I was at it, also added lots of sanity checks for other stack operations,
and correct use of the memo. This goes (I think) a long way toward being
a "pickle verifier" now too.
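A minimal sketch of using the disassembler that way in Python 3. The helper name passes_disassembler_checks is hypothetical, and passing this check does not mean a pickle is safe to load.

```python
import io
import pickle
import pickletools


def passes_disassembler_checks(data):
    """Return True if pickletools.dis() accepts the pickle's structure."""
    try:
        # Send the listing to a throwaway buffer; only the checks matter.
        pickletools.dis(data, out=io.StringIO())
    except ValueError:
        return False
    return True


well_formed = pickle.dumps([1, (2, 3), {"k": None}], 2)
```

A lone STOP opcode (b".") fails because it tries to pop from an empty stack, and a pickle with its final STOP removed fails because the stream ends early.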
types. The special handling for these can now be removed from save_newobj().
Add some testing for this.
Also add support for setting the 'fast' flag on the Python Pickler class,
which suppresses use of the memo.
only get run by test_pickle.py now (& not by test_cpickle.py). This
should be undone when protocol 2 is implemented in cPickle too.
test_cpickle should pass again.
object.__reduce__, do a getattr() on the class so we can explicitly
test for it. The reduce()-calling code becomes a bit more regular as
a result.
Also add support for slots: if an object has slots, the default state is
(dict, slots) where dict is the __dict__ or None, and slots is a dict
mapping slot names to slot values. We do a best-effort approach to
find slot names, assuming the __slots__ fields of classes aren't
modified after class definition time to misrepresent the actual list
of slots defined by a class.
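The (dict, slots) shape can still be seen in Python 3's default object reduction. The Point class and the apply_state helper below are hypothetical, and the sketch inspects the state tuple directly instead of going through pickle.py.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def apply_state(obj, state):
    """Restore a (dict, slots) state pair onto a freshly created object."""
    dict_state, slot_state = state
    if dict_state:
        obj.__dict__.update(dict_state)
    for name, value in slot_state.items():
        setattr(obj, name, value)
    return obj


# The third element of the reduce tuple is the default state.
state = Point(1, 2).__reduce_ex__(2)[2]
clone = apply_state(Point.__new__(Point), state)
```

Here the first element of the state is None because Point instances have no __dict__.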
of the opcode character instead (but stripping the quotes).
Added a proto 2 test section for the canonical recursive-tuple case.
Note that since pickle's save_tuple() takes different paths depending on
tuple length now, beefier tests are really needed (but not in pickletools);
the "short tuple" case tried here was actually broken yesterday, and it's
subtle stuff so needs to be tested.
be one of 0, 1 or 2).
I should note that the previous checkin also added NEWOBJ support to
the unpickler -- but there's nothing yet that generates this.
some notion of low-level efficiency. Undid that, but left one routine
alone: save_inst() claims it has a reason for not using memoize().
I don't understand that comment, so added an XXX comment there.
then the embedded argument consumes at least 256 bytes. The difference
between a 3-byte prefix (LONG2 + 2 bytes) and a 5-byte prefix (LONG4 +
4 bytes) is at worst less than 1%. Note that binary strings and binary
Unicode strings also have only "size is 1 byte, or size is 4 bytes?"
flavors, and I expect for the same reason. The only place a 2-byte
thingie was used was in BININT2, where the 2 bytes make up the *entire*
embedded argument (and now EXT2 also does this); that's a large savings
over 4 bytes, because the total opcode+argument size is so small in
the BININT2/EXT2 case.
Removed the TAKEN_FROM_ARGUMENT "number of bytes" code, and bifurcated it
into TAKEN_FROM_ARGUMENT1 and TAKEN_FROM_ARGUMENT4. Now there's enough
info in ArgumentDescriptor objects to deduce the # of bytes consumed by
each opcode.
Rearranged the order in which proto2 opcodes are listed in pickle.py.
component strings by a blank instead of a period. Guido pointed
out that the component strings (at least the first one) can be
dotted already. find_class() is overridable too, so only God knows
all the possibilities that make sense to someone.
Add support for the DOM Level 3 (draft) DOMImplementationSource interface
to the xml.dom and xml.dom.minidom modules. Note API issue: the draft spec
says to return null when there is no suitable implementation, while the
Python getDOMImplementation() function raises ImportError (minor).
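Callers who prefer the draft spec's "return null" behavior can wrap the call, as in this sketch. The wrapper name find_dom_implementation and the bogus feature name are assumptions; the sketch also assumes the PYTHON_DOM environment variable is not set.

```python
from xml.dom import getDOMImplementation


def find_dom_implementation(features):
    """Return a DOM implementation supporting `features`, or None."""
    try:
        return getDOMImplementation(features=features)
    except ImportError:
        # Python signals "no suitable implementation" with an exception.
        return None


impl = find_dom_implementation([("core", "2.0")])
missing = find_dom_implementation([("no-such-feature", "9.9")])
```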
Re-arrange the imports into "Python normal form."
Add test of the getUserData() / setUserData() methods, including the
NODE_CLONED callback.
Added support for renameNode() and getInterface().
Changed Node.unlink() so an unlinked node is not rendered completely
unusable by setting childNodes to None.
Element.removeAttributeNode() is slightly less destructive.
Added test for the wholeText attribute.
Added a test for Text.replaceWholeText().
Fixed to properly create Element in test of user data
Rename a local variable so it makes sense when viewed as a sequence.
Unlink a few documents when we're done with them.
Added tests to define the behavior of the cloneNode() and importNode()
methods, especially in the "difficult" cases of document and
document-type nodes.
Filled in a few more of the other cloneNode() tests.
NodeList.item() does not exist before Python 2.2, since it requires being
able to create subtypes of list. Use the subscript syntax instead.
Added a test that minidom documents can be pickled and unpickled.
Closes SF bug #609641.
Fill in an empty test, making sure we get the whitespace right for the
data attribute of a processing instruction.
Added checks for a few more invariants for processing instructions.
testProcessingInstruction(): The length attribute of the NodeList
interface is not implemented for Python 2.0, 2.1, so only use
len() to test the length.
testSchemaType(): New test, testing just the minimum of schemaType
support; this is different from the test_xmlbuilder version of the
test since it doesn't rely on using a specific builder, and the
builders support different levels of DTD support.
Add tests for the removeNamedItem() and removeNamedItemNS() methods of
the NamedNodeMap instances found on Element nodes.
These do not pass; the fix will be committed shortly.
Added support for the DOM Level 3 (draft) Element.setIdAttribute*() methods.
Do more to avoid creating new Attr nodes, so that attributes do not lose
their ID-ness when set using setIdAttribute*().
M RemoteDebugger.py
M rpc.py
Fix the incorrect shell exception tracebacks generated when running
under debugger control:
1. Use rpc.SocketIO.asynccall() instead of remotecall() to handle the
IdbProxy.run() command.
2. Add a 'shell' attribute to RemoteDebugger.IdbProxy to allow setting
of ModifiedInterpreter's active_seq attribute from RemoteDebugger code.
3. Cleanup PyShell.ModifiedInterpreter.runcode() and remove ambiguity
regarding use of begin/endexecuting().
4. In runcode() and cleanup_traceback() use 'console' instead of 'file' to
denote the entity to which the exception traceback is printed.
5. Enhance cleanup_traceback() so if the traceback is pruned entirely away
(the error is in IDLE internals) it will be displayed in its entirety
instead.
6. ModifiedInterpreter.runcode() now prints ERROR RPC returns to both
console and __stderr__.
7. Make a small tweak to the rpc.py debug messages.
Wrap a lot of long lines.
Clean up a handler for expat.error.
If a lexical handler is set, make sure we call the startDTD() and
endDTD(). If the lexical handler is unset (by setting it to None),
remove the handlers from the underlying pyexpat parser object.
Closes SF bug #485584.
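A sketch of how a lexical handler is attached and what it receives, written against Python 3's xml.sax with the default expat-based parser. The RecordingLexicalHandler class and the lexical_events helper are hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import property_lexical_handler


class RecordingLexicalHandler:
    """Collects the lexical events reported by the parser."""

    def __init__(self):
        self.events = []

    def startDTD(self, name, public_id, system_id):
        self.events.append(("startDTD", name))

    def endDTD(self):
        self.events.append(("endDTD",))

    def comment(self, text):
        self.events.append(("comment", text))

    def startCDATA(self):
        self.events.append(("startCDATA",))

    def endCDATA(self):
        self.events.append(("endCDATA",))


def lexical_events(document):
    """Parse `document` (bytes) and return the recorded lexical events."""
    parser = xml.sax.make_parser()
    handler = RecordingLexicalHandler()
    parser.setProperty(property_lexical_handler, handler)
    parser.parse(io.BytesIO(document))
    return handler.events


events = lexical_events(
    b"<!DOCTYPE root [<!ELEMENT root ANY>]><root><!--hi--></root>")
```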
In namespaces mode, make sure we set up the qnames dictionary
correctly for the AttributesNSImpl instance passed to the
start-element-handler.
Closes SF bug #563399.
Support skippedEntity. Fixes #665486.
Basic minidom changes to support the new higher-performance builder, as
described: http://mail.python.org/pipermail/xml-sig/2002-February/007217.html
Use True/False where appropriate.
isSupported(): Implemented from DOM Level 2.
Support a variety of things from the DOM Level 3 draft, integrate with
the xml.dom.xmlbuilder module for the new Document and
DOMImplementation methods.
Support the NODE_CLONED callback for the UserDataHandler set using
setUserData().
Add Entity and Notation nodes to minidom.
Add __getitem__() to ReadOnlySequentialNamedNodeMap to match NamedNodeMap.
TupleType was used without being defined; rename to _TupleType and define.
Add magic so that instances of the NamedNodeMap (and its read-only cousin)
take a bit less memory in the new-style world of Python 2.2/2.3. Now, the
assignments to __slots__ actually work. ;-)
Add support for the Text.wholeText attribute.
Document.createCDATASection(): Do not pass unsupported arg to CDATASection
constructor.
Implemented Text.replaceWholeText().
Updated minidom interfaces to work better with current 4Suite XPath and Xslt.
* Added childNodes to class Attr
* Added localName and prefix to all Nodes
* Added specified on class Attr
* Changed DOMImplementation.createDocument to allow creating a document with no document element and a null doctype
* Changed CharacterData.__setattr__ to keep nodeValue and data in synch
* fixed typo of ownerDoc in createDocumentFragment
* Changed Comment to inherit from CharacterData
* Allowed mutation of name on PIs
* Added importNode and rewrote cloneNode so both use same code base
* Changed EmptyNodeList to be a list not a tuple
Use a table-driven DOMImplementation.hasFeature().
Shorten lines longer than 80 characters.
Rename CloneNode to _clone_node (better naming consistency within the
module).
When defining localName as a property, the defproperty() call is
needed for each class that defined _get_localName(), otherwise only
the first version is used for Python 2.2 and newer.
Node.insertBefore(): When the reference node is not found, raise the
exception defined by the DOM specification.
Attr._set_value(): Added setter that does the right thing.
Childless.removeChild(): Raise the exception defined by the
specification, even though it seems less than intuitive.
_clone_node(): Access nodeType constants so we actually find them.
Add support for document fragments.
Node.removeChild(), .replaceChild():
Fix exception raised when a reference node is not found.
CharacterData._set_data(): Update the nodeValue attribute as well as
the data attribute.
Entity.attributes, .childNodes: Added these attributes.
Document.removeChild(): Raise the right exception when the node being
removed is not a child of this node.
Element.removeAttributeNode(): Raise the right exception when the
node isn't present on this element. Don't unlink the node unless
it is present.
Added support for the following methods and accessors:
Node._get_childNodes(), Attr._get_specified(), Attr._set_prefix(),
NamedNodeMap.has_key(), .getNamedItem(), .getNamedItemNS(),
.removeNamedItem(), .removeNamedItemNS(),
ProcessingInstruction._get_data(), ._get_target(), ._set_data(),
._set_target(), CharacterData.__len__(),
Document.getElementById().
Add many more of the _get_*() accessors.
Convert internal helpers to use a more consistent naming convention.
Remove unused definition of _nssplit(); there can be only one!
Move the Identified mixin up so it can be used by one more class.
Remove comment about NamedNodeMap.__getitem__(); the API won't be
changing now! Way too late for that.
Preliminary support for getElementById() for DOMs built with
xml.dom.expatbuilder.
Not necessarily very efficient, but it works, and is still fast for Document
instances that do not have the ID information.
DOMImplementation.createDocument(): Don't forget to add the
DocumentType node to the tree. This apparently was lost in the
previous release.
DocumentType.writexml(): New function.
Implement the final determination on the behaviors of importNode() and
cloneNode() with regard to Document and DocumentType nodes.
When cloning and importing, call the UserDataHandler with the right
operation, not just blindly use NODE_CLONED.
parse(), parseString(): When called with parser=None, use
xml.dom.expatbuilder instead of xml.dom.pulldom, to get a performance
boost (the main point of expatbuilder).
Fix for calling parse / parseString with a given parser instance;
the else-paths were ignored when refactoring the function signatures;
pychecker found that error instantly, BTW (hint, hint)
Added pickle support for NamedNodeMap, ReadOnlySequentialNamedNodeMap,
and ElementInfo. Closes SF bug #609641.
In _clone_node for elements, fixed arguments for getAttributeNodeNS
At least make sure the DOM API won't allow you to modify the child
node list of an entity node (since entity nodes are supposed to be
readonly).
Add support for the DOM Level 3 (draft) DOMImplementationSource interface
to the xml.dom and xml.dom.minidom modules. Note API issue: the draft spec
says to return null when there is no suitable implementation, while the
Python getDOMImplementation() function raises ImportError (minor).
Implement the DOM Level 3 Attr.isId property.
Refactor the lookup of the ElementInfo objects.
Implement the schemaType attribute for Element and Attr nodes.
Defined by the (draft) DOM Level 3 specification.
getElementById(): Support caching of IDs found. This implementation is
sufficient for DOM Level 2 compliance, but additional changes will be
needed to support the setIdAttribute() and setIdAttributeNS() methods
in DOM Level 3.
Add support for Text.isWhitespaceInElementContent (draft Level 3).
NamedNodeMap.removeNamedItem(), .removeNamedItemNS():
Pass the new tests: Return the removed node, or raise NotFoundErr
if there was no matching node.
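The resulting behavior, sketched against minidom as shipped with Python 3; the sample document is made up for the example.

```python
import xml.dom
from xml.dom.minidom import parseString

doc = parseString('<root a="1" b="2"/>')
element = doc.documentElement

# A successful removal hands back the Attr node that was removed.
removed = element.attributes.removeNamedItem("a")

try:
    element.attributes.removeNamedItem("a")  # already gone
    second_removal = "no error"
except xml.dom.NotFoundErr:
    second_removal = "NotFoundErr"
```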
When changing attributes via a NamedNodeMap, update the ID-cache
appropriately.
Added support for the DOM Level 3 (draft) Element.setIdAttribute*() methods.
setAttributeNode(): Be more careful about not calling
removeAttributeNode() twice for a single node.
Do more to avoid creating new Attr nodes, so that attributes do not lose
their ID-ness when set using setIdAttribute*().
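A usage sketch based on minidom in Python 3; the document and the attribute name "key" are invented for the example. Without DTD information an attribute is not an ID until setIdAttribute() marks it as one.

```python
from xml.dom.minidom import parseString

doc = parseString('<root><item key="x1"/><item key="x2"/></root>')
first, second = doc.getElementsByTagName("item")

before = doc.getElementById("x1")   # nothing is marked as an ID yet
first.setIdAttribute("key")         # mark key="x1" as an ID attribute
after = doc.getElementById("x1")
other = doc.getElementById("x2")    # the second item was never marked
```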
Work harder to avoid calls to Attr.__setattr__() and
CharacterData.__setattr__().
Attr.unlink():
Implement everything directly instead of calling to the base
class, which does several things that aren't needed for Attr
nodes.
Change some remaining assignments that caused __setattr__() to be
called when it can be avoided. expatbuilder can now perform DOM
construction without __setattr__() interference in common cases.
Remove unused _make_parent_nodes logic.
compare against "the other" argument, we raise TypeError,
in order to prevent comparison from falling back to the
default (and worse than useless, in this case) comparison
by object address.
That's fine so far as it goes, but leaves no way for
another date/datetime object to make itself comparable
to our objects. For example, it leaves Marc-Andre no way
to teach mxDateTime dates how to compare against Python
dates.
Discussion on Python-Dev raised a number of impractical
ideas, and the simple one implemented here: when we don't
know how to compare against "the other" argument, we raise
TypeError *unless* the other object has a timetuple attr.
In that case, we return NotImplemented instead, and Python
will give the other object a shot at handling the
comparison then.
Note that comparisons of time and timedelta objects still
suffer the original problem, though.
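A sketch of how a foreign date type can take part in comparisons once date steps aside. ForeignDate is a hypothetical stand-in for something like an mxDateTime date. In current Python 3, date returns NotImplemented for any unrecognized operand, so the timetuple attribute is kept here only to mirror the rule described above.

```python
from datetime import date


class ForeignDate:
    """Hypothetical third-party date type."""

    def __init__(self, year, month, day):
        self.ymd = (year, month, day)

    def timetuple(self):
        # Present so that date objects know this is "date-like".
        return self.ymd + (0, 0, 0, 0, 0, -1)

    @staticmethod
    def _key(other):
        if isinstance(other, ForeignDate):
            return other.ymd
        if isinstance(other, date):
            return (other.year, other.month, other.day)
        return None

    def __eq__(self, other):
        key = self._key(other)
        return NotImplemented if key is None else self.ymd == key

    def __lt__(self, other):
        key = self._key(other)
        return NotImplemented if key is None else self.ymd < key

    def __gt__(self, other):
        key = self._key(other)
        return NotImplemented if key is None else self.ymd > key
```

When date(2003, 1, 1) < ForeignDate(2003, 1, 2) is evaluated, date declines and Python calls the reflected ForeignDate.__gt__ instead.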
add memoize() helper function to update the memo.
The first element of the tuple returned by __reduce__() must be a
callable. If it isn't, the Unpickler will raise an error. Catch this
error in the pickler and raise the error there.
The memoize() helper also has a comment explaining how the memo
works. Some methods can't use memoize() because they write funny codes.
This gives much the same treatment to datetime.fromtimestamp(stamp, tz) as
the last batch of checkins gave to datetime.now(tz): do "the obvious"
thing with the tz argument instead of a senseless thing.
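A brief illustration, assuming "the obvious thing" means converting the timestamp to local time in the given zone, which is what current Python does. The datetime.timezone class used here is a Python 3 convenience that did not exist at the time.

```python
from datetime import datetime, timedelta, timezone

stamp = 1000000000  # 2001-09-09 01:46:40 UTC
eastern = timezone(timedelta(hours=-5), "EST")

# The result is an aware datetime expressed in the zone's local time.
local = datetime.fromtimestamp(stamp, eastern)
```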
checked in two days ago:
Refactoring of, and new rules for, dt.astimezone(tz).
dt must be aware now, and tz.utcoffset() and tz.dst() must not return None.
The old dt.astimezone(None) no longer works to change an aware datetime
into a naive datetime; use dt.replace(tzinfo=None) instead.
The tzinfo base class now supplies a new fromutc(self, dt) method, and
datetime.astimezone(tz) invokes tz.fromutc(). The default implementation
of fromutc() reproduces the same results as the old astimezone()
implementation, but tzinfo subclasses can override fromutc() if the
default implementation isn't strong enough to get the correct results
in all cases (for example, this may be necessary if a tzinfo subclass
models a time zone whose "standard offset" (wrt UTC) changed in some
year(s), or in some variations of double-daylight time -- the creativity
of time zone politics can't be captured in a single default implementation).
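A sketch of the new rules with hypothetical tzinfo subclasses (FixedOffset, TracingOffset). It assumes current Python 3 behavior, where astimezone() hands fromutc() a datetime in UTC whose tzinfo is the target zone.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical fixed-offset zone with no daylight time."""

    def __init__(self, hours, name):
        self._offset = timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)  # must not be None for astimezone()

    def tzname(self, dt):
        return self._name


class TracingOffset(FixedOffset):
    """Overrides fromutc() to show that astimezone() goes through it."""

    calls = 0

    def fromutc(self, dt):
        # dt carries UTC clock values with tzinfo already set to self.
        TracingOffset.calls += 1
        return dt + self._offset


utc = FixedOffset(0, "UTC")
est = TracingOffset(-5, "EST")

noon_utc = datetime(2003, 1, 1, 12, 0, tzinfo=utc)
local = noon_utc.astimezone(est)      # invokes est.fromutc()
naive = local.replace(tzinfo=None)    # replaces the old astimezone(None)
```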
Sebastien Keim pointed out that iterkeys and __contains__ require
their own definitions so their behavior will update when the
underlying method is subclassed.
M PyShell.py
M config-keys.def
M configHandler.py
1. Clear any un-entered characters from input line before printing the
restart boundary.
2. Restore the Debug menu: There are now both Shell and Debug menus.
3. Add Control-F6 keybinding to Restart Shell.
4. Clarify PyShell.cancel_check() comment.
5. Update doc string for Bindings.py and re-format the file slightly.
(Loewis) which uses 'SRCDIR' (if available) in package dir path.
2. Merge Python IDLE setup.py Rev 1.5 (Loewis) to allow installation
from the build directory. IDLEfork SF Patch 668998 (Loewis)
When daylight time ends, an hour repeats on the local clock (for example,
in US Eastern, the clock jumps from 1:59 back to 1:00 again). Times in
the repeated hour are ambiguous. A tzinfo subclass that wants to play
with astimezone() needs to treat times in the repeated hour as being
standard time. astimezone() previously required that such times be
treated as daylight time. There seems no killer argument either way,
but Guido wants the standard-time version, and it does seem easier the
new way to code both American (local-time based) and European (UTC-based)
switch rules, and the astimezone() implementation is simpler.
port the tests to PyUnit and add many tests for error
cases. This increases code coverage in Python/bltinmodule.c
from 75% to 92%. (From SF patch #662807, with
assert_(not fcmp(x, y)) replaced with assertAlmostEqual(x, y)
where possible)
The Py2.3 updates to the pyclbr module return both Class and Function
objects. The IDLE ClassBrowser module only knew about Class and could
not handle objects which did not define "super".
Fixed by adding a guard.
Patch from Brett Cannon:
First, the 'y' directive now handles [00, 68] as a suffix for the
21st century while [69, 99] is treated as the suffix for the 20th
century (this is for Open Group compatibility).
strptime now returns default values that make it a valid date ...
the ability to pass in a regex object to use instead of a format
string (and the inverse ability to have strptime return a regex object)
has been removed. This is in preparation for a future patch that will
add some caching internally to get a speed boost.
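The two-digit-year rule from the first paragraph can be checked with a few lines, assuming the pivot is unchanged in the Python version at hand. The helper name two_digit_year is made up for the example.

```python
import time


def two_digit_year(text):
    """Return the four-digit year strptime() infers from a '%y' value."""
    return time.strptime(text, "%y").tm_year


# 00-68 map to 2000-2068, 69-99 map to 1969-1999.
pivot = {text: two_digit_year(text) for text in ("00", "68", "69", "99")}
```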