Basic minidom changes to support the new higher-performance builder, as
described: http://mail.python.org/pipermail/xml-sig/2002-February/007217.html
Use True/False where appropriate.
isSupported(): Implemented from DOM Level 2.
Support a variety of things from the DOM Level 3 draft, integrate with
the xml.dom.xmlbuilder module for the new Document and
DOMImplementation methods.
Support the NODE_CLONED callback for the UserDataHandler set using
setUserData().
Add Entity and Notation nodes to minidom.
Add __getitem__() to ReadOnlySequentialNamedNodeMap to match NamedNodeMap.
TupleType was used without being defined; rename to _TupleType and define.
Add magic so that instances of the NamedNodeMap (and its read-only cousin)
take a bit less memory in the new-style world of Python 2.2/2.3. Now, the
assignments to __slots__ actually work. ;-)
Add support for the Text.wholeText attribute.
Document.createCDATASection(): Do not pass unsupported arg to CDATASection
constructor.
Implemented Text.replaceWholeText().
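A minimal sketch of the Text.wholeText and Text.replaceWholeText() entries above, assuming a current xml.dom.minidom; the sample document and variable names are illustrative only.

```python
from xml.dom import minidom

doc = minidom.parseString("<p/>")
p = doc.documentElement

# Two adjacent Text nodes, as left behind by separate appendChild() calls.
first = p.appendChild(doc.createTextNode("Hello, "))
second = p.appendChild(doc.createTextNode("world"))

# wholeText joins the data of all logically adjacent Text nodes.
whole_before = first.wholeText

# replaceWholeText() collapses the run into the node it was called on.
kept = first.replaceWholeText("Goodbye")
remaining = [node.data for node in p.childNodes]
```

After the call only one Text node should be left under the element, and it is the node the method was called on.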
Updated minidom interfaces to work better with current 4Suite XPath and Xslt.
* Added childNodes to class Attr
* Added localName and prefix to all Nodes
* Added specified on class Attr
* Changed DOMImplementation.createDocument to allow creating a document with no document element and a null doctype
* Changed CharacterData.__setattr__ to keep nodeValue and data in sync
* Fixed typo of ownerDoc in createDocumentFragment
* Changed Comment to inherit from CharacterData
* Allowed mutation of name on PIs
* Added importNode and rewrote cloneNode so both use same code base
* Changed EmptyNodeList to be a list not a tuple
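The createDocument change in the list above can be sketched as follows, assuming a current xml.dom.minidom; passing None for all three arguments is taken here to be the "no document element, null doctype" case.

```python
from xml.dom import minidom

impl = minidom.getDOMImplementation()

# No namespace, no qualified name, no doctype: an empty Document.
empty_doc = impl.createDocument(None, None, None)

has_root = empty_doc.documentElement is not None
has_doctype = empty_doc.doctype is not None
child_count = len(empty_doc.childNodes)
```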
Use a table-driven DOMImplementation.hasFeature().
Shorten lines longer than 80 characters.
Rename CloneNode to _clone_node (better naming consistency within the
module).
When defining localName as a property, the defproperty() call is
needed for each class that defined _get_localName(), otherwise only
the first version is used for Python 2.2 and newer.
Node.insertBefore(): When the reference node is not found, raise the
exception defined by the DOM specification.
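A short sketch of this behaviour, assuming the exception in question is xml.dom.NotFoundErr (the DOM's NOT_FOUND_ERR); the element names are made up for the example.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root><a/></root>")
root = doc.documentElement

stranger = doc.createElement("stranger")  # never attached to root
new_node = doc.createElement("b")

try:
    root.insertBefore(new_node, stranger)
    outcome = "inserted"
except xml.dom.NotFoundErr:
    outcome = "NotFoundErr"

children_after = len(root.childNodes)  # the failed insert adds nothing
```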
Attr._set_value(): Added setter that does the right thing.
Childless.removeChild(): Raise the exception defined by the
specification, even though it seems less than intuitive.
_clone_node(): Access nodeType constants so we actually find them.
Add support for document fragments.
Node.removeChild(), .replaceChild():
Fix exception raised when a reference node is not found.
CharacterData._set_data(): Update the nodeValue attribute as well as
the data attribute.
Entity.attributes, .childNodes: Added these attributes.
Document.removeChild(): Raise the right exception when the node being
removed is not a child of this node.
Element.removeAttributeNode(): Raise the right exception when the
node isn't present on this element. Don't unlink the node unless
it is present.
Added support for the following methods and accessors:
Node._get_childNodes(), Attr._get_specified(), Attr._set_prefix(),
NamedNodeMap.has_key(), .getNamedItem(), .getNamedItemNS(),
.removeNamedItem(), .removeNamedItemNS(),
ProcessingInstruction._get_data(), ._get_target(), ._set_data(),
._set_target(), CharacterData.__len__(),
Document.getElementById().
Add many more of the _get_*() accessors.
Convert internal helpers to use a more consistent naming convention.
Remove unused definition of _nssplit(); there can be only one!
Move the Identified mixin up so it can be used by one more class.
Remove comment about NamedNodeMap.__getitem__(); the API won't be
changing now! Way too late for that.
Preliminary support for getElementById() for DOMs built with
xml.dom.expatbuilder.
Not necessarily very efficient, but it works, and is still fast for Document
instances that do not have the ID information.
DOMImplementation.createDocument(): Don't forget to add the
DocumentType node to the tree. This apparently was lost in the
previous release.
DocumentType.writexml(): New function.
Implement the final determination on the behaviors of importNode() and
cloneNode() with regard to Document and DocumentType nodes.
When cloning and importing, call the UserDataHandler with the right
operation, not just blindly use NODE_CLONED.
parse(), parseString(): When called with parser=None, use
xml.dom.expatbuilder instead of xml.dom.pulldom, to get a performance
boost (the main point of expatbuilder).
Fix for calling parse / parseString with a given parser instance;
the else-paths were ignored when refactoring the function signatures;
pychecker found that error instantly, BTW (hint, hint)
Added pickle support for NamedNodeMap, ReadOnlySequentialNamedNodeMap,
and ElementInfo. Closes SF bug #609641.
In _clone_node for elements, fixed arguments for getAttributeNodeNS
At least make sure the DOM API won't allow you to modify the child
node list of an entity node (since entity nodes are supposed to be
readonly).
Add support for the DOM Level 3 (draft) DOMImplementationSource interface
to the xml.dom and xml.dom.minidom modules. Note API issue: the draft spec
says to return null when there is no suitable implementation, while the
Python getDOMImplementation() function raises ImportError (minor).
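The API difference noted above might look like this in practice. This is a sketch that assumes the PYTHON_DOM environment variable is unset and no extra implementations are registered; "no-such-feature" is a made-up feature name.

```python
import xml.dom

# Ask for a feature that no implementation provides.
try:
    xml.dom.getDOMImplementation(features=[("no-such-feature", "1.0")])
    outcome = "found"
except ImportError:
    # Python signals "nothing suitable" with an exception, not None.
    outcome = "ImportError"

# With no feature requirements an implementation is returned as usual.
impl = xml.dom.getDOMImplementation()
supports_core = impl.hasFeature("core", "2.0")
```

Callers that want the draft spec's null-returning behaviour would need to catch ImportError themselves.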
Implement the DOM Level 3 Attr.isId property.
Refactor the lookup of the ElementInfo objects.
Implement the schemaType attribute for Element and Attr nodes.
Defined by the (draft) DOM Level 3 specification.
getElementById(): Support caching of IDs found. This implementation is
sufficient for DOM Level 2 compliance, but additional changes will be
needed to support the setIdAttribute() and setIdAttributeNS() methods
in DOM Level 3.
Add support for Text.isWhitespaceInElementContent (draft Level 3).
NamedNodeMap.removeNamedItem(), .removeNamedItemNS():
Pass the new tests: Return the removed node, or raise NotFoundErr
if there was no matching node.
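A minimal sketch of the removeNamedItem() contract described above, assuming a current xml.dom.minidom; the attribute names are illustrative.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString('<item id="42" lang="en"/>')
attrs = doc.documentElement.attributes  # a NamedNodeMap

# The removed Attr node is handed back to the caller.
removed = attrs.removeNamedItem("lang")
removed_value = removed.value

# A second removal has nothing to match and raises NotFoundErr.
try:
    attrs.removeNamedItem("lang")
    second_attempt = "removed"
except xml.dom.NotFoundErr:
    second_attempt = "NotFoundErr"

left = attrs.length
```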
When changing attributes via a NamedNodeMap, update the ID-cache
appropriately.
Added support for the DOM Level 3 (draft) Element.setIdAttribute*() methods.
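How setIdAttribute() interacts with getElementById() can be sketched as below, assuming a current xml.dom.minidom and a document parsed without any DTD-supplied ID information; the "key" attribute name is an arbitrary choice for the example.

```python
from xml.dom import minidom

doc = minidom.parseString('<root><item key="a1"/><item key="b2"/></root>')

# Without ID information the lookup finds nothing.
before = doc.getElementById("b2")

# Mark the "key" attribute as an ID on each element.
for item in doc.getElementsByTagName("item"):
    item.setIdAttribute("key")

found = doc.getElementById("b2")
found_key = found.getAttribute("key")
is_id = found.getAttributeNode("key").isId
```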
setAttributeNode(): Be more careful about not calling
removeAttributeNode() twice for a single node.
Do more to avoid creating new Attr nodes, so that attributes do not lose
their ID-ness when set using setIdAttribute*().
Work harder to avoid calls to Attr.__setattr__() and
CharacterData.__setattr__().
Attr.unlink():
Implement everything directly instead of calling to the base
class, which does several things that aren't needed for Attr
nodes.
Change some remaining assignments that caused __setattr__() to be
called when it can be avoided. expatbuilder can now perform DOM
construction without __setattr__() interference in common cases.
Remove unused _make_parent_nodes logic.
[1.3] Added documentation of the namespace URI for elements with no namespace.
[1.4] New property http://www.python.org/sax/properties/encoding.
[1.5] Support optional string interning in pyexpat.
[1.15]
Added understanding of the feature_validation, feature_external_pes,
and feature_string_interning features.
Added support for the feature_external_ges feature.
Added support for the property_xml_string property.
[1.16]
Made it recognize the namespace prefixes feature.
[1.17]
removed erroneous first line
[1.19]
Support optional string interning in pyexpat.
[1.21]
Restore compatibility with versions of Python that did not support weak
references. These do not get the cyclic reference fix, but they will
continue to work as they did before.
[1.22]
Activate entity processing unless standalone.
ContentHandler. While GC will eventually clean up, it can take longer than
normal for applications that create a lot of strings (or other immutables)
without creating many containers.
This closes SF bug #535474.
This is probably a little bit faster, but mostly is just cleaner code.
The old-style support is still used for Python versions < 2.2 so this
source file can be shared with PyXML.
attribute values. Just using escape() can lead (and always has led) to broken
XML being generated. This makes sure it always produces the right thing.
This actually closes SF bug #440351.
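To illustrate why escape() alone is not enough for attribute values, here is a small sketch. It assumes the helper meant by this entry is xml.sax.saxutils.quoteattr(), which the text above does not name; the sample value and the "note" element are made up.

```python
from xml.sax.saxutils import escape, quoteattr

value = 'say "hi" & wave'

# escape() handles &, < and > but leaves double quotes alone, so
# wrapping its result in double quotes gives a malformed attribute.
naive = '<note text="%s"/>' % escape(value)

# quoteattr() escapes the value and picks a safe quote character,
# returning the value with the surrounding quotes already applied.
safe = "<note text=%s/>" % quoteattr(value)
```

Note that the quoteattr() result is substituted without adding quotes around it.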
If pyexpat is not available and more than one attempt is made to load
an expat-based xml parser, an empty xml.parser.expat module will be
created. This empty module will confuse xml.sax.expatreader into
thinking that pyexpat is available.
The ugly fix is to verify that the expat module actually defines the
names that are imported from pyexpat.
createAttributeNS(), use the parallel setAttributeNode() or
setAttributeNodeNS() to add the node to the document -- do not assume
that setAttributeNode() will operate properly for both.
- addition of a DocumentFragment implementation and createDocumentFragment method
- proper setting of ownerDocument for all nodes
- setting of namespaceURI to None in Element as a class attribute
- addition of setAttributeNodeNS and removeAttributeNodeNS as aliases
for setAttributeNode and removeAttributeNode
- support for inheriting from DOMImplementation to extend it with
additional features (to override the Document class)
in pulldom:
- support for nodes (comment and PI) that occur before the document element;
that became necessary as pulldom now delays creation of the document
until it has the document element.
NamedNodeMap.setNamedItem(). Martin, should I sync the PyXML tree, too,
or do you want to do it? (I don't know if you're wrapping the 0.6.4
release right now.)
New method; this is the "alternate" access to the exception code.
(Useful for Python DOM implementations that support the accessor
method approach to retrieving attribute values.)
This will make it incompatible with the version found in Python 2.0.
Does this need to be done to PyXML too?
Changes that might break existing code are marked with (!) below.
- Formatting nit: no spaces inside parentheses: foo( a ) -> foo(a).
- Break long lines.
- (!) Fix getAttribute() and getAttributeNS() to return "" instead of
raising KeyError when the attribute is not found.
- (!) Fix getAttributeNodeNS() to return None instead of raising
KeyError. (Curiously, getAttributeNode() already did this.)
- Added hasAttributes(), which returns true iff the node has any
attributes. (This is DOM Level 3.)
- (!) In createDocument(), if the qualified name is not empty,
actually create and insert the first element with that name (this
will become doc.documentElement). MvL believes that it should be an
error to specify an empty qualified name; I'm not going there today,
since it would require making a matching change to pulldom. Maybe
MvL will do this.
- In Document.writexml(), insert an xml declaration at the top. (This
doesn't include the encoding since there's no way to specify the
encoding. If that's preferred, all writexml() methods should be
fixed to support an optional encoding argument that they pass to
each other -- and they should use it to encode all text they write,
too. Later.)
- implement hasAttribute and hasAttributeNS (1.7)
- Node.replaceChild(): Update the sibling nodes to point to newChild. Set
the .nextSibling attribute on oldChild instead of adding a .newChild
attribute (1.9).
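Several of the behaviour changes listed above can be sketched together, assuming a current xml.dom.minidom; the element and attribute names are illustrative.

```python
from xml.dom import minidom

impl = minidom.getDOMImplementation()

# A non-empty qualified name creates and inserts the document element.
doc = impl.createDocument(None, "root", None)
root = doc.documentElement

missing_value = root.getAttribute("nope")             # "" rather than KeyError
missing_node = root.getAttributeNodeNS(None, "nope")  # None rather than KeyError

had_attrs = root.hasAttributes()
root.setAttribute("id", "r1")
has_attrs_now = root.hasAttributes()

# toxml() goes through Document.writexml(), which emits the declaration.
serialized = doc.toxml()
```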
give minidom.py behaviour that complies with the DOM Level 1 REC,
which says that when a node newChild is added to the tree, "if the
newChild is already in the tree, it is first removed."
pulldom.py is patched to use the public minidom interface instead
of setting .parentNode itself. Possibly this reduces pulldom's
efficiency; someone else will have to pronounce on that.
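The "already in the tree, it is first removed" rule can be seen in a small sketch, assuming a current xml.dom.minidom; the document is made up for the example.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><a><x/></a><b/></root>")
a, b = doc.documentElement.childNodes
x = a.firstChild

# x is already in the tree under <a>; appending it to <b> moves it.
b.appendChild(x)

left_in_a = len(a.childNodes)
new_parent_is_b = x.parentNode is b
result = doc.documentElement.toxml()
```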
Make Node inherit from xml.dom.Node to pick up the NodeType values
defined by the W3C recommendation.
When raising AttributeError, be sure to provide the name of the attribute
that does not exist.
Node.normalize(): Make sure we do not allow an empty text node to survive
as the first child; update the sibling links properly.
_getElementsByTagNameNSHelper(): Make recursive calls using the right
number of parameters.
Attr.__setattr__(): Be sure to update name and nodeName at the same time
since they are synonyms for this node type.
AttributeList: Renamed to NamedNodeMap (AttributeList maintained as an
alias). Compute the length attribute dynamically to allow
the underlying structures to mutate.
AttributeList.item(): Call .keys() on the dictionary rather than using
self.keys() for performance.
AttributeList.setNamedItem(), .setNamedItemNS():
Added methods.
Text.splitText():
Added method.
DocumentType:
Added implementation class.
DOMImplementation:
Added implementation class.
Document.appendChild(): Do not allow a second document element to be added.
Document.documentElement: Find this dynamically, so that one can be
removed and another added.
Document.unlink(): Clear the doctype attribute.
_get_StringIO(): Only use the StringIO module; cStringIO does not support
Unicode.
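Two of the entries above, Text.splitText() and the Document.appendChild() restriction, can be sketched as follows. This assumes a current xml.dom.minidom, and that the restriction surfaces as xml.dom.HierarchyRequestErr, which the text above does not state.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root>HelloWorld</root>")
text = doc.documentElement.firstChild

# The original node keeps the first 5 characters; the rest moves
# into a new sibling Text node, which is returned.
tail = text.splitText(5)
pieces = [node.data for node in doc.documentElement.childNodes]

# A Document may hold only one document element.
try:
    doc.appendChild(doc.createElement("second_root"))
    outcome = "appended"
except xml.dom.HierarchyRequestErr:
    outcome = "HierarchyRequestErr"
```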
objects; uses minidom if one is not provided to the constructor.
parse(): Pick up the default_bufsize default value dynamically so that
the value in the module may be (meaningfully) changed at runtime.
This (partially) closes patch #102477.
behavior.
Added support for the Attr.ownerElement attribute.
Everywhere: Define constant object attributes in the classes rather than
on the instances during object construction. This reduces the amount of
work needed for object construction and destruction; these need to be
lightweight operations on a DOM.
Node._get_firstChild(),
Node._get_lastChild(): Return None if there are no children (required for
compliance with DOM level 1).
Node.insertBefore(): If refChild is None, append the new node instead of
failing (required for compliance). Also, update the sibling
relationships. Return the inserted node (required for compliance).
Node.appendChild(): Update the parent of the appended node.
Node.replaceChild(): Actually replace the old child! Update the parent
and sibling relationships of both the old and new children. Return
the replaced child (required for compliance).
Node.normalize(): Implemented the normalize() method. Required for
compliance, but missing from the release. Useful for joining
adjacent Text nodes into a single node for easier processing.
Node.cloneNode(): Actually make this work. Don't let the new node share
the instance __dict__ with the original. Do proper recursion if
doing a "deep" clone. Move the attribute cloning out of the base
class, since only Element is supposed to have attributes.
Node.unlink(): Simplify handling of child nodes for efficiency, and
remove the attribute handling since only Element nodes support
attributes.
Attr.cloneNode(): Extend this to clear the ownerElement attribute in
the clone.
AttributeList.items(),
AttributeList.itemsNS(): Slight performance improvement (avoid lambda).
Element.cloneNode(): Extend Node.cloneNode() with support for the
attributes. Clone the Attr objects after creating the underlying
clone.
Element.unlink(): Clean out the attributes here instead of in the base
class, since this is the only class that will have them.
Element.toxml(): Adjust to create only one AttributeList instance; minor
efficiency improvement.
_nssplit(): No need to re-import string.
Document.__init__(): No longer needed once constant attributes are
initialized in the class itself.
Document.createElementNS(),
Document.createAttributeNS(): Use the defined constructors rather than
directly access the classes.
_get_StringIO(): New function. Create an output StringIO using the most
efficient available flavor.
parse(),
parseString(): Import pulldom here instead of in the public namespace of
the module.
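The Node.insertBefore() and Node.normalize() entries in this check-in can be sketched like this, assuming a current xml.dom.minidom; the element names and text are illustrative.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><a/></root>")
root = doc.documentElement

# With refChild set to None the new node is appended and returned.
b = doc.createElement("b")
returned = root.insertBefore(b, None)
order = [node.tagName for node in root.childNodes]
linked = root.firstChild.nextSibling is b and b.previousSibling is root.firstChild

# normalize() merges adjacent Text nodes and drops empty ones.
b.appendChild(doc.createTextNode("one "))
b.appendChild(doc.createTextNode(""))
b.appendChild(doc.createTextNode("two"))
count_before = len(b.childNodes)
b.normalize()
merged = [node.data for node in b.childNodes]
```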
correct order of constructor args in createAttributeNS
pulldom: use symbolic names for uri and localnames
correct usage of createAttribute and setAttributeNode signatures.
callers of feed will get a SAXException.
In close, feed the last chunk first before calling endDocument, so that
the parser may report errors before the end of the document. Don't do
anything in a nested parser.
Don't call endDocument in parse; that will be called in close.
Use self._source for finding the SystemID; XML_GetBase will be cleared in
case of an error.
caused the drive letter to cause urlopen() to think it was an unrecognized
URL scheme. This only passes system ids to urlopen() if the file does not
exist. It works on Windows & Unix.
It should work everywhere else as well.
xml.sax: Fix parse and parseString not to rely on ExpatParser
Greatly simplify import logic by using __import__
saxutils: Support Unicode strings and files as parameters to
prepare_input_source
Add support for parsing already-opened files. Make sure the parse()
method closes exactly those files that it opens.
Modified by FLD for better conformance to the Python style guide.
This closes SourceForge patch #101512.
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)