Issue #6472: The xml.etree package is updated to ElementTree 1.3. The cElementTree module is updated too.

Florent Xicluna 2010-03-11 14:36:19 +00:00
parent 4478662f83
commit 3e8c189faa
11 changed files with 3323 additions and 1207 deletions


@ -26,7 +26,8 @@ Each element has a number of properties associated with it:
* a number of child elements, stored in a Python sequence * a number of child elements, stored in a Python sequence
To create an element instance, use the Element or SubElement factory functions. To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.
The :class:`ElementTree` class can be used to wrap an element structure, and The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML. convert it from and to XML.
@ -46,9 +47,10 @@ Functions
.. function:: Comment([text]) .. function:: Comment([text])
Comment element factory. This factory function creates a special element that Comment element factory. This factory function creates a special element that
will be serialized as an XML comment. The comment string can be either an 8-bit will be serialized as an XML comment by the standard serializer. The comment
ASCII string or a Unicode string. *text* is a string containing the comment string can be either an 8-bit ASCII string or a Unicode string. *text* is a
string. Returns an element instance representing a comment. string containing the comment string. Returns an element instance representing
a comment.
.. function:: dump(elem) .. function:: dump(elem)
@ -62,37 +64,36 @@ Functions
*elem* is an element tree or an individual element. *elem* is an element tree or an individual element.
.. function:: Element(tag[, attrib][, **extra])
Element factory. This function returns an object implementing the standard
Element interface. The exact class or type of that object is implementation
dependent, but it will always be compatible with the _ElementInterface class in
this module.
The element name, attribute names, and attribute values can be either 8-bit
ASCII strings or Unicode strings. *tag* is the element name. *attrib* is an
optional dictionary, containing element attributes. *extra* contains additional
attributes, given as keyword arguments. Returns an element instance.
.. function:: fromstring(text) .. function:: fromstring(text)
Parses an XML section from a string constant. Same as XML. *text* is a string Parses an XML section from a string constant. Same as XML. *text* is a string
containing XML data. Returns an Element instance. containing XML data. Returns an Element instance.
.. function:: fromstringlist(sequence[, parser])
Parses an XML document from a sequence of string fragments. *sequence* is a list
or other sequence containing XML data fragments. *parser* is an optional parser
instance. If not given, the standard :class:`XMLParser` parser is used.
Returns an Element instance.
.. versionadded:: 2.7
.. function:: iselement(element) .. function:: iselement(element)
Checks if an object appears to be a valid element object. *element* is an Checks if an object appears to be a valid element object. *element* is an
element instance. Returns a true value if this is an element object. element instance. Returns a true value if this is an element object.
.. function:: iterparse(source[, events]) .. function:: iterparse(source[, events[, parser]])
Parses an XML section into an element tree incrementally, and reports what's Parses an XML section into an element tree incrementally, and reports what's
going on to the user. *source* is a filename or file object containing XML data. going on to the user. *source* is a filename or file object containing XML data.
*events* is a list of events to report back. If omitted, only "end" events are *events* is a list of events to report back. If omitted, only "end" events are
reported. Returns an :term:`iterator` providing ``(event, elem)`` pairs. reported. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns an :term:`iterator`
providing ``(event, elem)`` pairs.
.. note:: .. note::
@ -109,8 +110,8 @@ Functions
Parses an XML section into an element tree. *source* is a filename or file Parses an XML section into an element tree. *source* is a filename or file
object containing XML data. *parser* is an optional parser instance. If not object containing XML data. *parser* is an optional parser instance. If not
given, the standard XMLTreeBuilder parser is used. Returns an ElementTree given, the standard :class:`XMLParser` parser is used. Returns an
instance. :class:`ElementTree` instance.
.. function:: ProcessingInstruction(target[, text]) .. function:: ProcessingInstruction(target[, text])
@ -121,6 +122,16 @@ Functions
an element instance, representing a processing instruction. an element instance, representing a processing instruction.
.. function:: register_namespace(prefix, uri)
Registers a namespace prefix. The registry is global, and any existing mapping
for either the given prefix or the namespace URI will be removed. *prefix* is a
namespace prefix. *uri* is a namespace uri. Tags and attributes in this namespace
will be serialized with the given prefix, if at all possible.
.. versionadded:: 2.7
.. function:: SubElement(parent, tag[, attrib[, **extra]]) .. function:: SubElement(parent, tag[, attrib[, **extra]])
Subelement factory. This function creates an element instance, and appends it Subelement factory. This function creates an element instance, and appends it
@ -140,155 +151,193 @@ Functions
US-ASCII). Returns an encoded string containing the XML data. US-ASCII). Returns an encoded string containing the XML data.
.. function:: XML(text) .. function:: tostringlist(element[, encoding])
Generates a string representation of an XML element, including all subelements.
*element* is an Element instance. *encoding* is the output encoding (default is
US-ASCII). Returns a sequence object containing the XML data.
.. versionadded:: 2.7
.. function:: XML(text[, parser])
Parses an XML section from a string constant. This function can be used to Parses an XML section from a string constant. This function can be used to
embed "XML literals" in Python code. *text* is a string containing XML data. embed "XML literals" in Python code. *text* is a string containing XML data.
Returns an Element instance. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns an Element instance.
.. function:: XMLID(text) .. function:: XMLID(text[, parser])
Parses an XML section from a string constant, and also returns a dictionary Parses an XML section from a string constant, and also returns a dictionary
which maps from element id:s to elements. *text* is a string containing XML which maps from element id:s to elements. *text* is a string containing XML
data. Returns a tuple containing an Element instance and a dictionary. data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns a tuple containing an Element
instance and a dictionary.
.. _elementtree-element-interface: .. _elementtree-element-objects:
The Element Interface Element Objects
--------------------- ---------------
Element objects returned by Element or SubElement have the following methods
and attributes.
.. attribute:: Element.tag .. class:: Element(tag[, attrib[, **extra]])
Element class. This class defines the Element interface, and provides a
reference implementation of this interface.
The element name, attribute names, and attribute values can be either 8-bit
ASCII strings or Unicode strings. *tag* is the element name. *attrib* is an
optional dictionary, containing element attributes. *extra* contains additional
attributes, given as keyword arguments.
.. attribute:: tag
A string identifying what kind of data this element represents (the element A string identifying what kind of data this element represents (the element
type, in other words). type, in other words).
.. attribute:: Element.text .. attribute:: text
The *text* attribute can be used to hold additional data associated with the The *text* attribute can be used to hold additional data associated with the
element. As the name implies this attribute is usually a string but may be any element. As the name implies this attribute is usually a string but may be
application-specific object. If the element is created from an XML file the any application-specific object. If the element is created from an XML file
attribute will contain any text found between the element tags. the attribute will contain any text found between the element tags.
.. attribute:: Element.tail .. attribute:: tail
The *tail* attribute can be used to hold additional data associated with the The *tail* attribute can be used to hold additional data associated with the
element. This attribute is usually a string but may be any application-specific element. This attribute is usually a string but may be any
object. If the element is created from an XML file the attribute will contain application-specific object. If the element is created from an XML file the
any text found after the element's end tag and before the next tag. attribute will contain any text found after the element's end tag and before
the next tag.
.. attribute:: Element.attrib .. attribute:: attrib
A dictionary containing the element's attributes. Note that while the *attrib* A dictionary containing the element's attributes. Note that while the
value is always a real mutable Python dictionary, an ElementTree implementation *attrib* value is always a real mutable Python dictionary, an ElementTree
may choose to use another internal representation, and create the dictionary implementation may choose to use another internal representation, and create
only if someone asks for it. To take advantage of such implementations, use the the dictionary only if someone asks for it. To take advantage of such
dictionary methods below whenever possible. implementations, use the dictionary methods below whenever possible.
The following dictionary-like methods work on the element attributes. The following dictionary-like methods work on the element attributes.
.. method:: Element.clear() .. method:: clear()
Resets an element. This function removes all subelements, clears all Resets an element. This function removes all subelements, clears all
attributes, and sets the text and tail attributes to None. attributes, and sets the text and tail attributes to None.
.. method:: Element.get(key[, default=None]) .. method:: get(key[, default])
Gets the element attribute named *key*. Gets the element attribute named *key*.
Returns the attribute value, or *default* if the attribute was not found. Returns the attribute value, or *default* if the attribute was not found.
.. method:: Element.items() .. method:: items()
Returns the element attributes as a sequence of (name, value) pairs. The Returns the element attributes as a sequence of (name, value) pairs. The
attributes are returned in an arbitrary order. attributes are returned in an arbitrary order.
.. method:: Element.keys() .. method:: keys()
Returns the elements attribute names as a list. The names are returned in an Returns the elements attribute names as a list. The names are returned in an
arbitrary order. arbitrary order.
.. method:: Element.set(key, value) .. method:: set(key, value)
Set the attribute *key* on the element to *value*. Set the attribute *key* on the element to *value*.
The following methods work on the element's children (subelements). The following methods work on the element's children (subelements).
.. method:: Element.append(subelement) .. method:: append(subelement)
Adds the element *subelement* to the end of this elements internal list of Adds the element *subelement* to the end of this elements internal list of
subelements. subelements.
.. method:: Element.find(match) .. method:: extend(subelements)
Appends *subelements* from a sequence object with zero or more elements.
Raises :exc:`AssertionError` if a subelement is not a valid object.
.. versionadded:: 2.7
.. method:: find(match)
Finds the first subelement matching *match*. *match* may be a tag name or path. Finds the first subelement matching *match*. *match* may be a tag name or path.
Returns an element instance or ``None``. Returns an element instance or ``None``.
.. method:: Element.findall(match) .. method:: findall(match)
Finds all subelements matching *match*. *match* may be a tag name or path. Finds all subelements matching *match*. *match* may be a tag name or path.
Returns an iterable yielding all matching elements in document order. Returns an iterable yielding all matching elements in document order.
.. method:: Element.findtext(condition[, default=None]) .. method:: findtext(condition[, default])
Finds text for the first subelement matching *condition*. *condition* may be a Finds text for the first subelement matching *condition*. *condition* may be
tag name or path. Returns the text content of the first matching element, or a tag name or path. Returns the text content of the first matching element,
*default* if no element was found. Note that if the matching element has no or *default* if no element was found. Note that if the matching element has
text content an empty string is returned. no text content an empty string is returned.
.. method:: Element.getchildren() .. method:: getchildren()
Returns all subelements. The elements are returned in document order. .. deprecated:: 2.7
Use ``list(elem)`` or iteration.
.. method:: Element.getiterator([tag=None]) .. method:: getiterator([tag])
Creates a tree iterator with the current element as the root. The iterator .. deprecated:: 2.7
iterates over this element and all elements below it, in document (depth first) Use method :meth:`Element.iter` instead.
order. If *tag* is not ``None`` or ``'*'``, only elements whose tag equals
*tag* are returned from the iterator.
.. method:: Element.insert(index, element) .. method:: insert(index, element)
Inserts a subelement at the given position in this element. Inserts a subelement at the given position in this element.
.. method:: Element.makeelement(tag, attrib) .. method:: iter([tag])
Creates a new element object of the same type as this element. Do not call this Creates a tree iterator with the current element as the root. The iterator
method, use the SubElement factory function instead. iterates over this element and all elements below it, in document (depth
first) order. If *tag* is not ``None`` or ``'*'``, only elements whose tag
equals *tag* are returned from the iterator. If the tree structure is
modified during iteration, the result is undefined.
.. method:: Element.remove(subelement) .. method:: makeelement(tag, attrib)
Removes *subelement* from the element. Unlike the findXYZ methods this method Creates a new element object of the same type as this element. Do not call
compares elements based on the instance identity, not on tag value or contents. this method, use the SubElement factory function instead.
Element objects also support the following sequence type methods for working
with subelements: :meth:`__delitem__`, :meth:`__getitem__`, :meth:`__setitem__`,
:meth:`__len__`.
Caution: Because Element objects do not define a :meth:`__nonzero__` method, .. method:: remove(subelement)
elements with no subelements will test as ``False``. ::
Removes *subelement* from the element. Unlike the findXYZ methods this
method compares elements based on the instance identity, not on tag value
or contents.
Element objects also support the following sequence type methods for working
with subelements: :meth:`__delitem__`, :meth:`__getitem__`, :meth:`__setitem__`,
:meth:`__len__`.
Caution: Because Element objects do not define a :meth:`__nonzero__` method,
elements with no subelements will test as ``False``. ::
element = root.find('foo') element = root.find('foo')
@ -348,9 +397,8 @@ ElementTree Objects
.. method:: getiterator([tag]) .. method:: getiterator([tag])
Creates and returns a tree iterator for the root element. The iterator .. deprecated:: 2.7
loops over all elements in this tree, in section order. *tag* is the tag Use method :meth:`ElementTree.iter` instead.
to look for (default is to return all elements)
.. method:: getroot() .. method:: getroot()
@ -358,19 +406,28 @@ ElementTree Objects
Returns the root element for this tree. Returns the root element for this tree.
.. method:: iter([tag])
Creates and returns a tree iterator for the root element. The iterator
loops over all elements in this tree, in section order. *tag* is the tag
to look for (default is to return all elements)
.. method:: parse(source[, parser]) .. method:: parse(source[, parser])
Loads an external XML section into this element tree. *source* is a file Loads an external XML section into this element tree. *source* is a file
name or file object. *parser* is an optional parser instance. If not name or file object. *parser* is an optional parser instance. If not
given, the standard XMLTreeBuilder parser is used. Returns the section given, the standard XMLParser parser is used. Returns the section
root element. root element.
.. method:: write(file[, encoding]) .. method:: write(file[, encoding[, xml_declaration]])
Writes the element tree to a file, as XML. *file* is a file name, or a Writes the element tree to a file, as XML. *file* is a file name, or a
file object opened for writing. *encoding* [1]_ is the output encoding file object opened for writing. *encoding* [1]_ is the output encoding
(default is US-ASCII). (default is US-ASCII). *xml_declaration* controls if an XML declaration
should be added to the file. Use False for never, True for always, None
for only if not US-ASCII or UTF-8. None is default.
This is the XML file that is going to be manipulated:: This is the XML file that is going to be manipulated::
@ -389,13 +446,13 @@ Example of changing the attribute "target" of every link in first paragraph::
>>> from xml.etree.ElementTree import ElementTree >>> from xml.etree.ElementTree import ElementTree
>>> tree = ElementTree() >>> tree = ElementTree()
>>> tree.parse("index.xhtml") >>> tree.parse("index.xhtml")
<Element html at b7d3f1ec> <Element 'html' at b7d3f1ec>
>>> p = tree.find("body/p") # Finds first occurrence of tag p in body >>> p = tree.find("body/p") # Finds first occurrence of tag p in body
>>> p >>> p
<Element p at 8416e0c> <Element 'p' at 8416e0c>
>>> links = p.getiterator("a") # Returns list of all links >>> links = list(p.iter("a")) # Returns list of all links
>>> links >>> links
[<Element a at b7d4f9ec>, <Element a at b7d4fb0c>] [<Element 'a' at b7d4f9ec>, <Element 'a' at b7d4fb0c>]
>>> for i in links: # Iterates through all found links >>> for i in links: # Iterates through all found links
... i.attrib["target"] = "blank" ... i.attrib["target"] = "blank"
>>> tree.write("output.xhtml") >>> tree.write("output.xhtml")
@ -433,7 +490,7 @@ TreeBuilder Objects
.. method:: close() .. method:: close()
Flushes the parser buffers, and returns the toplevel document Flushes the builder buffers, and returns the toplevel document
element. Returns an Element instance. element. Returns an Element instance.
@ -455,18 +512,31 @@ TreeBuilder Objects
containing element attributes. Returns the opened element. containing element attributes. Returns the opened element.
.. _elementtree-xmltreebuilder-objects: In addition, a custom :class:`TreeBuilder` object can provide the
following method:
XMLTreeBuilder Objects .. method:: doctype(name, pubid, system)
----------------------
Handles a doctype declaration. *name* is the doctype name. *pubid* is the
public identifier. *system* is the system identifier. This method does not
exist on the default :class:`TreeBuilder` class.
.. versionadded:: 2.7
.. class:: XMLTreeBuilder([html,] [target]) .. _elementtree-xmlparser-objects:
XMLParser Objects
-----------------
.. class:: XMLParser([html [, target[, encoding]]])
Element structure builder for XML source data, based on the expat parser. *html* Element structure builder for XML source data, based on the expat parser. *html*
are predefined HTML entities. This flag is not supported by the current are predefined HTML entities. This flag is not supported by the current
implementation. *target* is the target object. If omitted, the builder uses an implementation. *target* is the target object. If omitted, the builder uses an
instance of the standard TreeBuilder class. instance of the standard TreeBuilder class. *encoding* [1]_ is optional.
If given, the value overrides the encoding specified in the XML file.
.. method:: close() .. method:: close()
@ -476,22 +546,23 @@ XMLTreeBuilder Objects
.. method:: doctype(name, pubid, system) .. method:: doctype(name, pubid, system)
Handles a doctype declaration. *name* is the doctype name. *pubid* is the .. deprecated:: 2.7
public identifier. *system* is the system identifier. Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
target.
.. method:: feed(data) .. method:: feed(data)
Feeds data to the parser. *data* is encoded data. Feeds data to the parser. *data* is encoded data.
:meth:`XMLTreeBuilder.feed` calls *target*\'s :meth:`start` method :meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag, for each opening tag, its :meth:`end` method for each closing tag,
and data is processed by method :meth:`data`. :meth:`XMLTreeBuilder.close` and data is processed by method :meth:`data`. :meth:`XMLParser.close`
calls *target*\'s method :meth:`close`. calls *target*\'s method :meth:`close`.
:class:`XMLTreeBuilder` can be used not only for building a tree structure. :class:`XMLParser` can be used not only for building a tree structure.
This is an example of counting the maximum depth of an XML file:: This is an example of counting the maximum depth of an XML file::
>>> from xml.etree.ElementTree import XMLTreeBuilder >>> from xml.etree.ElementTree import XMLParser
>>> class MaxDepth: # The target object of the parser >>> class MaxDepth: # The target object of the parser
... maxDepth = 0 ... maxDepth = 0
... depth = 0 ... depth = 0
@ -507,7 +578,7 @@ This is an example of counting the maximum depth of an XML file::
... return self.maxDepth ... return self.maxDepth
... ...
>>> target = MaxDepth() >>> target = MaxDepth()
>>> parser = XMLTreeBuilder(target=target) >>> parser = XMLParser(target=target)
>>> exampleXml = """ >>> exampleXml = """
... <a> ... <a>
... <b> ... <b>
@ -530,4 +601,3 @@ This is an example of counting the maximum depth of an XML file::
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets. and http://www.iana.org/assignments/character-sets.

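The documentation changes above introduce several ElementTree 1.3 additions (`fromstringlist`, `tostringlist`, `register_namespace`, `Element.extend`, `Element.iter`). The following is a minimal sketch of how they fit together, not part of the commit itself. The fragment contents, the `ex` prefix, and the `http://example.com/ns` URI are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# fromstringlist: parse a document supplied as a sequence of string fragments.
root = ET.fromstringlist(["<root><item>a</item>", "<item>b</item></root>"])

# Element.extend: append several subelements in one call.
root.extend([ET.Element("item"), ET.Element("other")])

# Element.iter: the replacement for the deprecated getiterator().
item_tags = [elem.tag for elem in root.iter("item")]

# tostringlist: serialize to a sequence of fragments; joining them
# yields the same data that tostring() would return.
fragments = ET.tostringlist(root)
serialized = b"".join(fragments)

# register_namespace: the registry is global, so this affects every
# later serialization in the process (hypothetical prefix and URI).
ET.register_namespace("ex", "http://example.com/ns")
qualified = ET.tostring(ET.Element("{http://example.com/ns}thing"))
```

Because `register_namespace` mutates a global registry, it is usually safest to call it once at start-up rather than inside library code.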

@ -0,0 +1,7 @@
<?pi data?>
<!-- comment -->
<root xmlns='namespace'>
<element key='value'>text</element>
<element>text</element>tail
<empty-element/>
</root>

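The sample file above declares a default namespace on its root element. As a hedged illustration of what that means for lookups, the sketch below embeds the same content in a string (the variable names are assumptions) and shows that tags must be qualified with the `{namespace}` prefix to match.

```python
import xml.etree.ElementTree as ET

# Same content as the test data file above, inlined for illustration.
SAMPLE = """<?pi data?>
<!-- comment -->
<root xmlns='namespace'>
   <element key='value'>text</element>
   <element>text</element>tail
   <empty-element/>
</root>
"""

root = ET.fromstring(SAMPLE)

# Every tag under the default namespace is reported in {uri}local form.
child_tags = [child.tag for child in root]

# An unqualified name does not match namespaced elements.
unqualified = root.findall("element")
qualified = root.findall("{namespace}element")

# Text after an element's end tag is stored in its tail attribute.
second_tail = qualified[1].tail
```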

@ -0,0 +1,6 @@
<!-- comment -->
<root>
<element key='value'>text</element>
<element>text</element>tail
<empty-element/>
</root>

File diff suppressed because it is too large


@ -1,30 +1,11 @@
# xml.etree test for cElementTree # xml.etree test for cElementTree
import sys
from test import test_support from test import test_support
ET = test_support.import_module('xml.etree.cElementTree') cET = test_support.import_module('xml.etree.cElementTree')
SAMPLE_XML = """
<body>
<tag>text</tag>
<tag />
<section>
<tag>subtext</tag>
</section>
</body>
"""
SAMPLE_XML_NS = """ # cElementTree specific tests
<body xmlns="http://effbot.org/ns">
<tag>text</tag>
<tag />
<section>
<tag>subtext</tag>
</section>
</body>
"""
def sanity(): def sanity():
""" """
@ -33,191 +14,21 @@ def sanity():
>>> from xml.etree import cElementTree >>> from xml.etree import cElementTree
""" """
def check_method(method):
if not hasattr(method, '__call__'):
print method, "not callable"
def serialize(ET, elem, encoding=None):
import StringIO
file = StringIO.StringIO()
tree = ET.ElementTree(elem)
if encoding:
tree.write(file, encoding)
else:
tree.write(file)
return file.getvalue()
def summarize(elem):
return elem.tag
def summarize_list(seq):
return map(summarize, seq)
def interface():
"""
Test element tree interface.
>>> element = ET.Element("tag", key="value")
>>> tree = ET.ElementTree(element)
Make sure all standard element methods exist.
>>> check_method(element.append)
>>> check_method(element.insert)
>>> check_method(element.remove)
>>> check_method(element.getchildren)
>>> check_method(element.find)
>>> check_method(element.findall)
>>> check_method(element.findtext)
>>> check_method(element.clear)
>>> check_method(element.get)
>>> check_method(element.set)
>>> check_method(element.keys)
>>> check_method(element.items)
>>> check_method(element.getiterator)
Basic method sanity checks.
>>> serialize(ET, element) # 1
'<tag key="value" />'
>>> subelement = ET.Element("subtag")
>>> element.append(subelement)
>>> serialize(ET, element) # 2
'<tag key="value"><subtag /></tag>'
>>> element.insert(0, subelement)
>>> serialize(ET, element) # 3
'<tag key="value"><subtag /><subtag /></tag>'
>>> element.remove(subelement)
>>> serialize(ET, element) # 4
'<tag key="value"><subtag /></tag>'
>>> element.remove(subelement)
>>> serialize(ET, element) # 5
'<tag key="value" />'
>>> element.remove(subelement)
Traceback (most recent call last):
ValueError: list.remove(x): x not in list
>>> serialize(ET, element) # 6
'<tag key="value" />'
"""
def find():
"""
Test find methods (including xpath syntax).
>>> elem = ET.XML(SAMPLE_XML)
>>> elem.find("tag").tag
'tag'
>>> ET.ElementTree(elem).find("tag").tag
'tag'
>>> elem.find("section/tag").tag
'tag'
>>> ET.ElementTree(elem).find("section/tag").tag
'tag'
>>> elem.findtext("tag")
'text'
>>> elem.findtext("tog")
>>> elem.findtext("tog", "default")
'default'
>>> ET.ElementTree(elem).findtext("tag")
'text'
>>> elem.findtext("section/tag")
'subtext'
>>> ET.ElementTree(elem).findtext("section/tag")
'subtext'
>>> summarize_list(elem.findall("tag"))
['tag', 'tag']
>>> summarize_list(elem.findall("*"))
['tag', 'tag', 'section']
>>> summarize_list(elem.findall(".//tag"))
['tag', 'tag', 'tag']
>>> summarize_list(elem.findall("section/tag"))
['tag']
>>> summarize_list(elem.findall("section//tag"))
['tag']
>>> summarize_list(elem.findall("section/*"))
['tag']
>>> summarize_list(elem.findall("section//*"))
['tag']
>>> summarize_list(elem.findall("section/.//*"))
['tag']
>>> summarize_list(elem.findall("*/*"))
['tag']
>>> summarize_list(elem.findall("*//*"))
['tag']
>>> summarize_list(elem.findall("*/tag"))
['tag']
>>> summarize_list(elem.findall("*/./tag"))
['tag']
>>> summarize_list(elem.findall("./tag"))
['tag', 'tag']
>>> summarize_list(elem.findall(".//tag"))
['tag', 'tag', 'tag']
>>> summarize_list(elem.findall("././tag"))
['tag', 'tag']
>>> summarize_list(ET.ElementTree(elem).findall("/tag"))
['tag', 'tag']
>>> summarize_list(ET.ElementTree(elem).findall("./tag"))
['tag', 'tag']
>>> elem = ET.XML(SAMPLE_XML_NS)
>>> summarize_list(elem.findall("tag"))
[]
>>> summarize_list(elem.findall("{http://effbot.org/ns}tag"))
['{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag']
>>> summarize_list(elem.findall(".//{http://effbot.org/ns}tag"))
['{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag']
"""
def parseliteral():
r"""
>>> element = ET.XML("<html><body>text</body></html>")
>>> ET.ElementTree(element).write(sys.stdout)
<html><body>text</body></html>
>>> element = ET.fromstring("<html><body>text</body></html>")
>>> ET.ElementTree(element).write(sys.stdout)
<html><body>text</body></html>
>>> print ET.tostring(element)
<html><body>text</body></html>
>>> print ET.tostring(element, "ascii")
<?xml version='1.0' encoding='ascii'?>
<html><body>text</body></html>
>>> _, ids = ET.XMLID("<html><body>text</body></html>")
>>> len(ids)
0
>>> _, ids = ET.XMLID("<html><body id='body'>text</body></html>")
>>> len(ids)
1
>>> ids["body"].tag
'body'
"""
def check_encoding(encoding):
"""
>>> check_encoding("ascii")
>>> check_encoding("us-ascii")
>>> check_encoding("iso-8859-1")
>>> check_encoding("iso-8859-15")
>>> check_encoding("cp437")
>>> check_encoding("mac-roman")
"""
ET.XML(
"<?xml version='1.0' encoding='%s'?><xml />" % encoding
)
def bug_1534630():
"""
>>> bob = ET.TreeBuilder()
>>> e = bob.data("data")
>>> e = bob.start("tag", {})
>>> e = bob.end("tag")
>>> e = bob.close()
>>> serialize(ET, e)
'<tag />'
"""
def test_main(): def test_main():
from test import test_xml_etree_c from test import test_xml_etree, test_xml_etree_c
# Run the tests specific to the C implementation
test_support.run_doctest(test_xml_etree_c, verbosity=True) test_support.run_doctest(test_xml_etree_c, verbosity=True)
# Assign the C implementation before running the doctests
pyET = test_xml_etree.ET
test_xml_etree.ET = cET
try:
# Run the same test suite as xml.etree.ElementTree
test_xml_etree.test_main(module_name='xml.etree.cElementTree')
finally:
test_xml_etree.ET = pyET
if __name__ == '__main__': if __name__ == '__main__':
test_main() test_main()

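The new `test_main` above reuses the pure-Python test suite by temporarily pointing its `ET` attribute at the C implementation and restoring it in a `finally` clause. The sketch below shows that swap-and-restore pattern in isolation. `suite`, `run_suite`, and `run_with` are hypothetical stand-ins, not names from the commit.

```python
import types

# Hypothetical stand-in for the test_xml_etree module and its ET attribute.
suite = types.SimpleNamespace(ET="python-impl")


def run_suite():
    # Stand-in for running the doctests: report which implementation is active.
    return suite.ET


def run_with(implementation):
    """Run the suite against another implementation, then restore the original."""
    original = suite.ET
    suite.ET = implementation
    try:
        return run_suite()
    finally:
        # Restore even if the suite raises, so later tests see the original.
        suite.ET = original


result = run_with("c-impl")
```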

@ -1,6 +1,6 @@
# #
# ElementTree # ElementTree
# $Id: ElementInclude.py 1862 2004-06-18 07:31:02Z Fredrik $ # $Id: ElementInclude.py 3375 2008-02-13 08:05:08Z fredrik $
# #
# limited xinclude support for element trees # limited xinclude support for element trees
# #
@ -16,7 +16,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2008 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -42,14 +42,14 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.
## ##
# Limited XInclude support for the ElementTree package. # Limited XInclude support for the ElementTree package.
## ##
import copy import copy
import ElementTree from . import ElementTree
XINCLUDE = "{http://www.w3.org/2001/XInclude}" XINCLUDE = "{http://www.w3.org/2001/XInclude}"


@ -1,6 +1,6 @@
# #
# ElementTree # ElementTree
# $Id: ElementPath.py 1858 2004-06-17 21:31:41Z Fredrik $ # $Id: ElementPath.py 3375 2008-02-13 08:05:08Z fredrik $
# #
# limited xpath support for element trees # limited xpath support for element trees
# #
@ -8,8 +8,13 @@
# 2003-05-23 fl created # 2003-05-23 fl created
# 2003-05-28 fl added support for // etc # 2003-05-28 fl added support for // etc
# 2003-08-27 fl fixed parsing of periods in element names # 2003-08-27 fl fixed parsing of periods in element names
# 2007-09-10 fl new selection engine
# 2007-09-12 fl fixed parent selector
# 2007-09-13 fl added iterfind; changed findall to return a list
# 2007-11-30 fl added namespaces support
# 2009-10-30 fl added child element value filter
# #
# Copyright (c) 2003-2004 by Fredrik Lundh. All rights reserved. # Copyright (c) 2003-2009 by Fredrik Lundh. All rights reserved.
# #
# fredrik@pythonware.com # fredrik@pythonware.com
# http://www.pythonware.com # http://www.pythonware.com
@ -17,7 +22,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2009 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -43,7 +48,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.
## ##
# Implementation module for XPath support. There's usually no reason # Implementation module for XPath support. There's usually no reason
@ -53,146 +58,246 @@
import re import re
xpath_tokenizer = re.compile( xpath_tokenizer_re = re.compile(
"(::|\.\.|\(\)|[/.*:\[\]\(\)@=])|((?:\{[^}]+\})?[^/:\[\]\(\)@=\s]+)|\s+" "("
).findall "'[^']*'|\"[^\"]*\"|"
"::|"
class xpath_descendant_or_self: "//?|"
pass "\.\.|"
"\(\)|"
## "[/.*:\[\]\(\)@=])|"
# Wrapper for a compiled XPath. "((?:\{[^}]+\})?[^/\[\]\(\)@=\s]+)|"
"\s+"
class Path:
##
# Create an Path instance from an XPath expression.
def __init__(self, path):
tokens = xpath_tokenizer(path)
# the current version supports 'path/path'-style expressions only
self.path = []
self.tag = None
if tokens and tokens[0][0] == "/":
raise SyntaxError("cannot use absolute path on element")
while tokens:
op, tag = tokens.pop(0)
if tag or op == "*":
self.path.append(tag or op)
elif op == ".":
pass
elif op == "/":
self.path.append(xpath_descendant_or_self())
continue
else:
raise SyntaxError("unsupported path syntax (%s)" % op)
if tokens:
op, tag = tokens.pop(0)
if op != "/":
raise SyntaxError(
"expected path separator (%s)" % (op or tag)
) )
if self.path and isinstance(self.path[-1], xpath_descendant_or_self):
raise SyntaxError("path cannot end with //")
if len(self.path) == 1 and isinstance(self.path[0], type("")):
self.tag = self.path[0]
## def xpath_tokenizer(pattern, namespaces=None):
# Find first matching object. for token in xpath_tokenizer_re.findall(pattern):
tag = token[1]
if tag and tag[0] != "{" and ":" in tag:
try:
prefix, uri = tag.split(":", 1)
if not namespaces:
raise KeyError
yield token[0], "{%s}%s" % (namespaces[prefix], uri)
except KeyError:
raise SyntaxError("prefix %r not found in prefix map" % prefix)
else:
yield token
def find(self, element): def get_parent_map(context):
tag = self.tag parent_map = context.parent_map
if tag is None: if parent_map is None:
nodeset = self.findall(element) context.parent_map = parent_map = {}
if not nodeset: for p in context.root.iter():
return None for e in p:
return nodeset[0] parent_map[e] = p
for elem in element: return parent_map
if elem.tag == tag:
return elem
return None
## def prepare_child(next, token):
# Find text for first matching object. tag = token[1]
def select(context, result):
for elem in result:
for e in elem:
if e.tag == tag:
yield e
return select
def findtext(self, element, default=None): def prepare_star(next, token):
tag = self.tag def select(context, result):
if tag is None: for elem in result:
nodeset = self.findall(element) for e in elem:
if not nodeset: yield e
return default return select
return nodeset[0].text or ""
for elem in element:
if elem.tag == tag:
return elem.text or ""
return default
## def prepare_self(next, token):
# Find all matching objects. def select(context, result):
for elem in result:
yield elem
return select
def findall(self, element): def prepare_descendant(next, token):
nodeset = [element] token = next()
index = 0 if token[0] == "*":
tag = "*"
elif not token[0]:
tag = token[1]
else:
raise SyntaxError("invalid descendant")
def select(context, result):
for elem in result:
for e in elem.iter(tag):
if e is not elem:
yield e
return select
def prepare_parent(next, token):
def select(context, result):
# FIXME: raise error if .. is applied at toplevel?
parent_map = get_parent_map(context)
result_map = {}
for elem in result:
if elem in parent_map:
parent = parent_map[elem]
if parent not in result_map:
result_map[parent] = None
yield parent
return select
def prepare_predicate(next, token):
# FIXME: replace with real parser!!! refs:
# http://effbot.org/zone/simple-iterator-parser.htm
# http://javascript.crockford.com/tdop/tdop.html
signature = []
predicate = []
while 1: while 1:
token = next()
if token[0] == "]":
break
if token[0] and token[0][:1] in "'\"":
token = "'", token[0][1:-1]
signature.append(token[0] or "-")
predicate.append(token[1])
signature = "".join(signature)
# use signature to determine predicate type
if signature == "@-":
# [@attribute] predicate
key = predicate[1]
def select(context, result):
for elem in result:
if elem.get(key) is not None:
yield elem
return select
if signature == "@-='":
# [@attribute='value']
key = predicate[1]
value = predicate[-1]
def select(context, result):
for elem in result:
if elem.get(key) == value:
yield elem
return select
if signature == "-" and not re.match("\d+$", predicate[0]):
# [tag]
tag = predicate[0]
def select(context, result):
for elem in result:
if elem.find(tag) is not None:
yield elem
return select
if signature == "-='" and not re.match("\d+$", predicate[0]):
# [tag='value']
tag = predicate[0]
value = predicate[-1]
def select(context, result):
for elem in result:
for e in elem.findall(tag):
if "".join(e.itertext()) == value:
yield elem
break
return select
if signature == "-" or signature == "-()" or signature == "-()-":
# [index] or [last()] or [last()-index]
if signature == "-":
index = int(predicate[0]) - 1
else:
if predicate[0] != "last":
raise SyntaxError("unsupported function")
if signature == "-()-":
try: try:
path = self.path[index] index = int(predicate[2]) - 1
index = index + 1 except ValueError:
except IndexError: raise SyntaxError("unsupported expression")
return nodeset else:
set = [] index = -1
if isinstance(path, xpath_descendant_or_self): def select(context, result):
parent_map = get_parent_map(context)
for elem in result:
try: try:
tag = self.path[index] parent = parent_map[elem]
if not isinstance(tag, type("")): # FIXME: what if the selector is "*" ?
tag = None elems = list(parent.findall(elem.tag))
else: if elems[index] is elem:
index = index + 1 yield elem
except IndexError: except (IndexError, KeyError):
tag = None # invalid path pass
for node in nodeset: return select
new = list(node.getiterator(tag)) raise SyntaxError("invalid predicate")
if new and new[0] is node:
set.extend(new[1:]) ops = {
else: "": prepare_child,
set.extend(new) "*": prepare_star,
else: ".": prepare_self,
for node in nodeset: "..": prepare_parent,
for node in node: "//": prepare_descendant,
if path == "*" or node.tag == path: "[": prepare_predicate,
set.append(node) }
if not set:
return []
nodeset = set
_cache = {} _cache = {}
## class _SelectorContext:
# (Internal) Compile path. parent_map = None
def __init__(self, root):
self.root = root
def _compile(path): # --------------------------------------------------------------------
p = _cache.get(path)
if p is not None: ##
return p # Generate all matching objects.
p = Path(path)
if len(_cache) >= 100: def iterfind(elem, path, namespaces=None):
# compile selector pattern
if path[-1:] == "/":
path = path + "*" # implicit all (FIXME: keep this?)
try:
selector = _cache[path]
except KeyError:
if len(_cache) > 100:
_cache.clear() _cache.clear()
_cache[path] = p if path[:1] == "/":
return p raise SyntaxError("cannot use absolute path on element")
next = iter(xpath_tokenizer(path, namespaces)).next
token = next()
selector = []
while 1:
try:
selector.append(ops[token[0]](next, token))
except StopIteration:
raise SyntaxError("invalid path")
try:
token = next()
if token[0] == "/":
token = next()
except StopIteration:
break
_cache[path] = selector
# execute selector pattern
result = [elem]
context = _SelectorContext(elem)
for select in selector:
result = select(context, result)
return result
## ##
# Find first matching object. # Find first matching object.
def find(element, path): def find(elem, path, namespaces=None):
return _compile(path).find(element) try:
return iterfind(elem, path, namespaces).next()
## except StopIteration:
# Find text for first matching object. return None
def findtext(element, path, default=None):
return _compile(path).findtext(element, default)
## ##
# Find all matching objects. # Find all matching objects.
def findall(element, path): def findall(elem, path, namespaces=None):
return _compile(path).findall(element) return list(iterfind(elem, path, namespaces))
##
# Find text for first matching object.
def findtext(elem, path, default=None, namespaces=None):
try:
elem = iterfind(elem, path, namespaces).next()
return elem.text or ""
except StopIteration:
return default
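
The new selection engine above adds predicates (`[@attr]`, `[@attr='value']`, `[tag]`, `[tag='value']`, `[index]`, `[last()]`), the `..` parent selector, and an optional `namespaces` prefix map. The following is a minimal usage sketch, not part of the commit. It assumes the public `xml.etree.ElementTree` API of a current Python 3 release, which delegates to these ElementPath functions, and the `library`/`book` document is hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = """\
<library xmlns:dc="http://purl.org/dc/elements/1.1/">
  <book id="b1" lang="en"><dc:title>First</dc:title><year>2008</year></book>
  <book id="b2"><dc:title>Second</dc:title><year>2009</year></book>
  <book id="b3" lang="en"><dc:title>Third</dc:title><year>2009</year></book>
</library>
"""

root = ET.fromstring(XML)

# Prefix map handed to the tokenizer; "dc:title" is expanded to
# "{http://purl.org/dc/elements/1.1/}title" before matching.
ns = {"dc": "http://purl.org/dc/elements/1.1/"}

# [@attribute]: keep elements that carry the attribute at all.
with_lang = [b.get("id") for b in root.findall("book[@lang]")]

# [@attribute='value']: keep elements whose attribute equals the value.
second = root.find("book[@id='b2']")

# [tag='value']: keep elements that have a child with that complete text.
from_2009 = [b.get("id") for b in root.findall("book[year='2009']")]

# [index] is 1-based; [last()] selects the final matching sibling.
first = root.find("book[1]")
last = root.find("book[last()]")

# iterfind yields matches lazily; findall wraps the same selector in a list.
titles = [t.text for t in root.iterfind("book/dc:title", ns)]

# ".." steps back to the parent; each parent is reported once.
parents = [p.get("id") for p in root.findall("book/year/..")]

# findtext returns the text of the first match, or the default if none.
year_b3 = root.findtext("book[@id='b3']/year")
missing = root.findtext("magazine", default="n/a")
```

Predicates are matched against a small set of token signatures rather than a full XPath grammar, so anything outside the forms listed above is rejected with `SyntaxError("invalid predicate")`.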

File diff suppressed because it is too large


@ -1,10 +1,10 @@
# $Id: __init__.py 1821 2004-06-03 16:57:49Z fredrik $ # $Id: __init__.py 3375 2008-02-13 08:05:08Z fredrik $
# elementtree package # elementtree package
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2008 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -30,4 +30,4 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.


@ -24,6 +24,9 @@ Core and Builtins
Library Library
------- -------
- Issue #6472: The xml.etree package is updated to ElementTree 1.3. The
cElementTree module is updated too.
- Issue #7880: Fix sysconfig when the python executable is a symbolic link. - Issue #7880: Fix sysconfig when the python executable is a symbolic link.
- Issue #7624: Fix isinstance(foo(), collections.Callable) for old-style - Issue #7624: Fix isinstance(foo(), collections.Callable) for old-style

File diff suppressed because it is too large