attribute values. Just using escape() can lead (and always has led) to broken
XML being generated. This makes sure it always produces the right thing.
This actually closes SF bug #440351.
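
A minimal sketch of the problem, assuming the fix relies on something like xml.sax.saxutils.quoteattr() (the text above does not name the replacement); the element name and sample value are made up for illustration:

```python
from xml.dom import minidom
from xml.sax.saxutils import escape, quoteattr

value = 'say "hi" & <wave>'

# escape() only handles &, < and >, so the embedded double quote ends the
# attribute early and the resulting markup is not well-formed.
broken = '<note text="%s"/>' % escape(value)

# quoteattr() escapes the value and chooses a safe quote character; the
# returned string already includes the surrounding quotes.
safe = '<note text=%s/>' % quoteattr(value)

# The safe form parses, and the original value round-trips.
parsed = minidom.parseString(safe).documentElement.getAttribute("text")
```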
If pyexpat is not available and more than one attempt is made to load
an expat-based XML parser, an empty xml.parsers.expat module will be
created. This empty module will confuse xml.sax.expatreader into
thinking that pyexpat is available.
The ugly fix is to verify that the expat module actually defines the
names that are imported from pyexpat.
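
The check described above might look roughly like this. It is only a sketch: load_if_complete() is a hypothetical helper, not the code that was committed.

```python
import importlib

def load_if_complete(module_name, required_names):
    """Return the module only if it imports and defines every required name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # An empty placeholder module imports fine but lacks the expected names.
    if all(hasattr(module, name) for name in required_names):
        return module
    return None

# None here would mean "treat expat as unavailable".
expat = load_if_complete("xml.parsers.expat", ["ParserCreate"])
```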
createAttributeNS(), use the parallel setAttributeNode() or
setAttributeNodeNS() to add the node to the document -- do not assume
that setAttributeNode() will operate properly for both.
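
A short illustration of pairing each creation method with its parallel setter, using the minidom shipped with current Python; the namespace URI and attribute names are made up:

```python
from xml.dom import minidom

doc = minidom.getDOMImplementation().createDocument(None, "root", None)
root = doc.documentElement

# Non-namespace attribute: createAttribute() goes with setAttributeNode().
plain = doc.createAttribute("id")
plain.value = "a1"
root.setAttributeNode(plain)

# Namespace-aware attribute: createAttributeNS() goes with setAttributeNodeNS().
ns_uri = "http://example.com/ns"
qualified = doc.createAttributeNS(ns_uri, "ex:kind")
qualified.value = "demo"
root.setAttributeNodeNS(qualified)
```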
- addition of a DocumentFragment implementation and createDocumentFragment method
- proper setting of ownerDocument for all nodes
- setting of namespaceURI to None in Element as a class attribute
- addition of setAttributeNodeNS and removeAttributeNodeNS as aliases
for setAttributeNode and removeAttributeNode
- support for inheriting from DOMImplementation to extend it with
additional features (to override the Document class)
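
As a rough illustration of the DocumentFragment support, written against current minidom (the element names are arbitrary):

```python
from xml.dom import minidom

doc = minidom.parseString("<root/>")
fragment = doc.createDocumentFragment()
for name in ("a", "b"):
    fragment.appendChild(doc.createElement(name))

# Appending a fragment moves its children into the target node;
# the fragment itself is left empty.
doc.documentElement.appendChild(fragment)
result = doc.documentElement.toxml()
```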
in pulldom:
- support for nodes (comment and PI) that occur before the document element;
that became necessary as pulldom now delays creation of the document
until it has the document element.
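
A small sketch of the resulting event order with current pulldom. It uses only a processing instruction, and the PI target "render" is invented for the example:

```python
from xml.dom import pulldom

events = list(pulldom.parseString('<?render mode="fast"?><root/>'))
kinds = [event for event, node in events]

# The PI seen before the document element is still delivered as a node,
# ahead of the START_ELEMENT event for <root>.
pi_node = next(node for event, node in events
               if event == pulldom.PROCESSING_INSTRUCTION)
```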
NamedNodeMap.setNamedItem(). Martin, should I sync the PyXML tree, too,
or do you want to do it? (I don't know if you're wrapping the 0.6.4
release right now.)
New method; this is the "alternate" access to the exception code.
(Useful for Python DOM implementations that support the accessor
method approach to retrieving attribute values.)
This will make it incompatible with the version found in Python 2.0.
Does this need to be done to PyXML too?
Changes that might break existing code are marked with (!) below.
- Formatting nit: no spaces inside parentheses: foo( a ) -> foo(a).
- Break long lines.
- (!) Fix getAttribute() and getAttributeNS() to return "" instead of
raising KeyError when the attribute is not found.
- (!) Fix getAttributeNodeNS() to return None instead of raising
KeyError. (Curiously, getAttributeNode() already did this.)
- Added hasAttributes(), which returns true iff the node has any
attributes. (This is DOM level 3.)
- (!) In createDocument(), if the qualified name is not empty,
actually create and insert the first element with that name (this
will become doc.documentElement). MvL believes that it should be an
error to specify an empty qualified name; I'm not going there today,
since it would require making a matching change to pulldom. Maybe
MvL will do this.
- In Document.writexml(), insert an xml declaration at the top. (This
doesn't include the encoding since there's no way to specify the
encoding. If that's preferred, all writexml() methods should be
fixed to support an optional encoding argument that they pass to
each other -- and they should use it to encode all text they write,
too. Later.)
- implement hasAttribute and hasAttributeNS (1.7)
- Node.replaceChild(): Update the sibling nodes to point to newChild. Set
the .nextSibling attribute on oldChild instead of adding a .newChild
attribute (1.9).
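
The behaviour changes listed above can be checked with a few lines against current minidom. This is a sketch; the element and attribute names are illustrative only:

```python
import io
from xml.dom import minidom

impl = minidom.getDOMImplementation()

# A non-empty qualified name yields a document that already has its
# document element.
doc = impl.createDocument(None, "config", None)
root = doc.documentElement

# Missing attributes no longer raise KeyError.
missing_value = root.getAttribute("absent")
missing_node = root.getAttributeNodeNS("http://example.com/ns", "absent")

had_attributes = root.hasAttributes()
root.setAttribute("mode", "fast")

# Document.writexml() starts its output with an XML declaration.
buffer = io.StringIO()
doc.writexml(buffer)
output = buffer.getvalue()
```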
give minidom.py behaviour that complies with the DOM Level 1 REC,
which says that when a node newChild is added to the tree, "if the
newChild is already in the tree, it is first removed."
pulldom.py is patched to use the public minidom interface instead
of setting .parentNode itself. Possibly this reduces pulldom's
efficiency; someone else will have to pronounce on that.
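
A brief example of the "first removed" rule as it behaves in current minidom:

```python
from xml.dom import minidom

doc = minidom.parseString("<root><a><x/></a><b/></root>")
a, b = doc.documentElement.childNodes
x = a.firstChild

# x is already in the tree under <a>; appending it to <b> moves it
# rather than leaving it attached in two places.
b.appendChild(x)
result = doc.documentElement.toxml()
```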
Make Node inherit from xml.dom.Node to pick up the NodeType values
defined by the W3C recommendation.
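
For illustration, the inherited constants can be compared directly (checked against current minidom):

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root>text</root>")
root = doc.documentElement

# nodeType values on minidom nodes are the constants from xml.dom.Node.
kinds = (doc.nodeType, root.nodeType, root.firstChild.nodeType)
expected = (xml.dom.Node.DOCUMENT_NODE,
            xml.dom.Node.ELEMENT_NODE,
            xml.dom.Node.TEXT_NODE)
```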
When raising AttributeError, be sure to provide the name of the attribute
that does not exist.
Node.normalize(): Make sure we do not allow an empty text node to survive
as the first child; update the sibling links properly.
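
A minimal check of the normalize() case described above, written against current minidom:

```python
from xml.dom import minidom

doc = minidom.parseString("<p/>")
p = doc.documentElement
p.appendChild(doc.createTextNode(""))        # empty node in first position
p.appendChild(doc.createTextNode("Hello, "))
p.appendChild(doc.createTextNode("world"))

# normalize() drops the empty node and joins the adjacent text nodes.
p.normalize()
text = p.firstChild
```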
_getElementsByTagNameNSHelper(): Make recursive calls using the right
number of parameters.
Attr.__setattr__(): Be sure to update name and nodeName at the same time
since they are synonyms for this node type.
AttributeList: Renamed to NamedNodeMap (AttributeList maintained as an
alias). Compute the length attribute dynamically to allow
the underlying structures to mutate.
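
A sketch of why the dynamic length matters, using current minidom; a map obtained before a mutation still reports the right size afterwards:

```python
from xml.dom import minidom

doc = minidom.parseString('<item id="1"/>')
item = doc.documentElement

attrs = item.attributes            # NamedNodeMap over the element's attributes
length_before = attrs.length
item.setAttribute("color", "red")  # mutate the underlying structures
length_after = attrs.length        # computed on access, so it stays current
```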
AttributeList.item(): Call .keys() on the dictionary rather than using
self.keys() for performance.
AttributeList.setNamedItem(), .setNamedItemNS():
Added methods.
Text.splitText():
Added method.
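
For illustration, splitText() as it behaves in current minidom:

```python
from xml.dom import minidom

doc = minidom.parseString("<p>Hello world</p>")
text = doc.documentElement.firstChild

# Split at offset 5: the original node keeps the head, and the returned
# node holds the tail and is inserted as the next sibling.
tail = text.splitText(5)
```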
DocumentType:
Added implementation class.
DOMImplementation:
Added implementation class.
Document.appendChild(): Do not allow a second document element to be added.
Document.documentElement: Find this dynamically, so that one can be
removed and another added.
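
The two Document changes above can be exercised together. This is a sketch against current minidom with arbitrary element names:

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<first/>")
first = doc.documentElement

# A second document element is rejected.
try:
    doc.appendChild(doc.createElement("second"))
    rejected = False
except xml.dom.HierarchyRequestErr:
    rejected = True

# documentElement is looked up dynamically, so the element can be swapped.
doc.removeChild(first)
after_removal = doc.documentElement
replacement = doc.appendChild(doc.createElement("second"))
```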
Document.unlink(): Clear the doctype attribute.
_get_StringIO(): Only use the StringIO module; cStringIO does not support
Unicode.
objects; uses minidom if one is not provided to the constructor.
parse(): Pick up the default_bufsize default value dynamically so that
the value in the module may be (meaningfully) changed at runtime.
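
The underlying pattern is to resolve the default at call time rather than in the function signature. A generic sketch follows; read_chunks() is a hypothetical stand-in, not pulldom's parse():

```python
import io

default_bufsize = 8192  # module-level default, meant to be adjustable

def read_chunks(stream, bufsize=None):
    """Read a stream in chunks, resolving the default on each call."""
    if bufsize is None:
        # Looked up now, so a later change to the module value is honoured.
        bufsize = default_bufsize
    chunks = []
    while True:
        chunk = stream.read(bufsize)
        if not chunk:
            return chunks
        chunks.append(chunk)

before = read_chunks(io.StringIO("abcdef"))
default_bufsize = 2  # change the module-level default at runtime
after = read_chunks(io.StringIO("abcdef"))
```

Had the signature been written as bufsize=default_bufsize, the value would have been frozen when the function was defined.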
This (partially) closes patch #102477.
behavior.
Added support for the Attr.ownerElement attribute.
Everywhere: Define constant object attributes in the classes rather than
on the instances during object construction. This reduces the amount of
work needed for object construction and destruction; these need to be
lightweight operations on a DOM.
Node._get_firstChild(),
Node._get_lastChild(): Return None if there are no children (required for
compliance with DOM level 1).
Node.insertBefore(): If refChild is None, append the new node instead of
failing (required for compliance). Also, update the sibling
relationships. Return the inserted node (required for compliance).
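
A quick check of the insertBefore() behaviour described above, using current minidom:

```python
from xml.dom import minidom

doc = minidom.parseString("<list><b/></list>")
items = doc.documentElement
b = items.firstChild
a = doc.createElement("a")
c = doc.createElement("c")

returned_a = items.insertBefore(a, b)     # insert ahead of an existing child
returned_c = items.insertBefore(c, None)  # refChild of None means append

order = [node.tagName for node in items.childNodes]
```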
Node.appendChild(): Update the parent of the appended node.
Node.replaceChild(): Actually replace the old child! Update the parent
and sibling relationships of both the old and new children. Return
the replaced child (required for compliance).
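
Similarly for replaceChild(), as a sketch against current minidom:

```python
from xml.dom import minidom

doc = minidom.parseString("<list><a/><old/><c/></list>")
items = doc.documentElement
a, old, c = items.childNodes
new = doc.createElement("new")

# The replaced child is returned and detached; its neighbours now
# point at the new child instead.
replaced = items.replaceChild(new, old)
result = items.toxml()
```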
Node.normalize(): Implemented the normalize() method. Required for
compliance, but missing from the release. Useful for joining
adjacent Text nodes into a single node for easier processing.
Node.cloneNode(): Actually make this work. Don't let the new node share
the instance __dict__ with the original. Do proper recursion if
doing a "deep" clone. Move the attribute cloning out of the base
class, since only Element is supposed to have attributes.
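
To illustrate the intended cloneNode() semantics (checked against current minidom; the markup is invented):

```python
from xml.dom import minidom

doc = minidom.parseString('<item id="1"><name>pen</name></item>')
item = doc.documentElement

shallow = item.cloneNode(False)  # element and attributes only
deep = item.cloneNode(True)      # element, attributes and descendants

# Changes to the clone must not leak back into the original.
deep.setAttribute("id", "2")
deep.firstChild.firstChild.data = "ink"
```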
Node.unlink(): Simplify handling of child nodes for efficiency, and
remove the attribute handling since only Element nodes support
attributes.
Attr.cloneNode(): Extend this to clear the ownerElement attribute in
the clone.
AttributeList.items(),
AttributeList.itemsNS(): Slight performance improvement (avoid lambda).
Element.cloneNode(): Extend Node.cloneNode() with support for the
attributes. Clone the Attr objects after creating the underlying
clone.
Element.unlink(): Clean out the attributes here instead of in the base
class, since this is the only class that will have them.
Element.toxml(): Adjust to create only one AttributeList instance; minor
efficiency improvement.
_nssplit(): No need to re-import string.
Document.__init__(): No longer needed once constant attributes are
initialized in the class itself.
Document.createElementNS(),
Document.createAttributeNS(): Use the defined constructors rather than
directly accessing the classes.
_get_StringIO(): New function. Create an output StringIO using the most
efficient available flavor.
parse(),
parseString(): Import pulldom here instead of in the public namespace of
the module.
correct order of constructor args in createAttributeNS
pulldom: use symbolic names for uri and localnames
correct usage of createAttribute and setAttributeNode signatures.
callers of feed will get a SAXException.
In close, feed the last chunk first before calling endDocument, so that
the parser may report errors before the end of the document. Don't do
anything in a nested parser.
Don't call endDocument in parse; that will be called in close.
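
From the caller's side, incremental parsing looks roughly like this with the current xml.sax API. TagCollector and parse_chunks() are illustrative names, and the example does not depend on whether an error surfaces in feed() or in close():

```python
import xml.sax

class TagCollector(xml.sax.ContentHandler):
    """Record the names of the elements seen."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)

def parse_chunks(chunks):
    handler = TagCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()  # flushes the last chunk and ends the document
    return handler.tags

tags = parse_chunks(["<a><b/>", "<c/></a>"])

# Malformed input is reported as a SAXException subclass.
try:
    parse_chunks(["<a><b>", "</a>"])
    error = None
except xml.sax.SAXException as exc:
    error = exc
```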
Use self._source for finding the SystemID; XML_GetBase will be cleared in
case of an error.
caused the drive letter to cause urlopen() to think it was an unrecognized
URL scheme. This only passes system ids to urlopen() if the file does not
exist. It works on Windows & Unix.
It should work everywhere else as well.
xml.sax: Fix parse and parseString not to rely on ExpatParser
Greatly simplify import logic by using __import__
saxutils: Support Unicode strings and files as parameters to
prepare_input_source
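
A short sketch of both input kinds with the current saxutils.prepare_input_source(); the temporary file exists only for the example:

```python
import io
import os
import tempfile
from xml.sax import saxutils

# An already-open file-like object is wrapped as-is.
stream = io.BytesIO(b"<a/>")
from_stream = saxutils.prepare_input_source(stream)

# A string is treated as a system id; an existing local file is opened.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "doc.xml")
    with open(path, "wb") as f:
        f.write(b"<a/>")
    from_path = saxutils.prepare_input_source(path)
    system_id = from_path.getSystemId()
    data = from_path.getByteStream().read()
    from_path.getByteStream().close()
```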
Add support for parsing already-opened files. Make sure the parse()
method closes exactly those files that it opens.
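
The ownership rule can be sketched as follows; root_tag() is a hypothetical helper that stands in for the parse() method, not its actual code:

```python
import io
import os
import tempfile
from xml.dom import minidom

def root_tag(source):
    """Accept a path or an open binary file; close only what we opened."""
    opened_here = isinstance(source, (str, os.PathLike))
    stream = open(source, "rb") if opened_here else source
    try:
        return minidom.parseString(stream.read()).documentElement.tagName
    finally:
        if opened_here:
            stream.close()

# The caller's own file object is left open after parsing.
caller_file = io.BytesIO(b"<a/>")
tag_from_file = root_tag(caller_file)
still_open = not caller_file.closed

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "doc.xml")
    with open(path, "wb") as f:
        f.write(b"<b/>")
    tag_from_path = root_tag(path)
```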
Modified by FLD for better conformance to the Python style guide.
This closes SourceForge patch #101512.
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
spelling (behaviour, honour) in a couple of places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)