Merged revisions 78838-78839,78917,78919,78934,78937 via svnmerge from

svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78838 | florent.xicluna | 2010-03-11 15:36:19 +0100 (Thu, 11 Mar 2010) | 2 lines

  Issue #6472: The xml.etree package is updated to ElementTree 1.3.  The cElementTree module is updated too.
........
  r78839 | florent.xicluna | 2010-03-11 16:55:11 +0100 (Thu, 11 Mar 2010) | 2 lines

  Fix repr of tree Element on Windows.
........
  r78917 | florent.xicluna | 2010-03-13 12:18:49 +0100 (Sat, 13 Mar 2010) | 2 lines

  Move the xml test data to their own directory.
........
  r78919 | florent.xicluna | 2010-03-13 13:41:48 +0100 (Sat, 13 Mar 2010) | 2 lines

  Do not chdir when running test_xml_etree, and enhance the findfile helper.
........
  r78934 | florent.xicluna | 2010-03-13 18:56:19 +0100 (Sat, 13 Mar 2010) | 2 lines

  Update some parts of the xml.etree documentation.
........
  r78937 | florent.xicluna | 2010-03-13 21:30:15 +0100 (Sat, 13 Mar 2010) | 3 lines

  Add the keyword argument "method=None" to the .write() method and the tostring/tostringlist functions.
  Update the function, class and method signatures, according to the new convention.
........
Florent Xicluna 2010-03-13 23:24:31 +00:00
parent 9451a1c6ae
commit f15351d938
19 changed files with 3534 additions and 1327 deletions

@ -6,9 +6,9 @@
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com> .. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>
The Element type is a flexible container object, designed to store hierarchical The :class:`Element` type is a flexible container object, designed to store
data structures in memory. The type can be described as a cross between a list hierarchical data structures in memory. The type can be described as a cross
and a dictionary. between a list and a dictionary.
Each element has a number of properties associated with it: Each element has a number of properties associated with it:
@ -23,7 +23,8 @@ Each element has a number of properties associated with it:
* a number of child elements, stored in a Python sequence * a number of child elements, stored in a Python sequence
To create an element instance, use the Element or SubElement factory functions. To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.
The :class:`ElementTree` class can be used to wrap an element structure, and The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML. convert it from and to XML.
@ -31,8 +32,14 @@ convert it from and to XML.
A C implementation of this API is available as :mod:`xml.etree.cElementTree`. A C implementation of this API is available as :mod:`xml.etree.cElementTree`.
See http://effbot.org/zone/element-index.htm for tutorials and links to other See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of the docs. Fredrik Lundh's page is also the location of the development version of
xml.etree.ElementTree. the xml.etree.ElementTree.
.. versionchanged:: 2.7
The ElementTree API is updated to 1.3. For more information, see
`Introducing ElementTree 1.3
<http://effbot.org/zone/elementtree-13-intro.htm>`_.
.. _elementtree-functions: .. _elementtree-functions:
@ -43,16 +50,16 @@ Functions
.. function:: Comment(text=None) .. function:: Comment(text=None)
Comment element factory. This factory function creates a special element Comment element factory. This factory function creates a special element
that will be serialized as an XML comment. The comment string can be either that will be serialized as an XML comment by the standard serializer. The
an ASCII-only :class:`bytes` object or a :class:`str` object. *text* is a comment string can be either a bytestring or a Unicode string. *text* is a
string containing the comment string. Returns an element instance string containing the comment string. Returns an element instance
representing a comment. representing a comment.
.. function:: dump(elem) .. function:: dump(elem)
Writes an element tree or element structure to sys.stdout. This function should Writes an element tree or element structure to sys.stdout. This function
be used for debugging only. should be used for debugging only.
The exact output format is implementation dependent. In this version, it's The exact output format is implementation dependent. In this version, it's
written as an ordinary XML file. written as an ordinary XML file.
@ -60,38 +67,36 @@ Functions
*elem* is an element tree or an individual element. *elem* is an element tree or an individual element.
.. function:: Element(tag, attrib={}, **extra)
Element factory. This function returns an object implementing the standard
Element interface. The exact class or type of that object is implementation
dependent, but it will always be compatible with the _ElementInterface class in
this module.
The element name, attribute names, and attribute values can be either an
ASCII-only :class:`bytes` object or a :class:`str` object. *tag* is the
element name. *attrib* is an optional dictionary, containing element
attributes. *extra* contains additional attributes, given as keyword
arguments. Returns an element instance.
.. function:: fromstring(text) .. function:: fromstring(text)
Parses an XML section from a string constant. Same as XML. *text* is a string Parses an XML section from a string constant. Same as XML. *text* is a
containing XML data. Returns an Element instance. string containing XML data. Returns an :class:`Element` instance.
.. function:: fromstringlist(sequence, parser=None)
Parses an XML document from a sequence of string fragments. *sequence* is a
list or other sequence containing XML data fragments. *parser* is an
optional parser instance. If not given, the standard :class:`XMLParser`
parser is used. Returns an :class:`Element` instance.
.. versionadded:: 2.7
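A minimal sketch of how the new `fromstringlist` function might be used, assuming a current Python 3 interpreter rather than the 2.7 tree this commit targets. The fragment contents and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments, e.g. chunks read from a socket or a generator.
fragments = ["<root>", "<item>one</item>", "<item>two</item>", "</root>"]

# The fragments are fed to the parser in order and yield a single root Element.
root = ET.fromstringlist(fragments)
item_texts = [item.text for item in root.findall("item")]
print(root.tag, item_texts)
```

The fragments do not have to be split on element boundaries; they only need to form a well-formed document when concatenated.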
.. function:: iselement(element) .. function:: iselement(element)
Checks if an object appears to be a valid element object. *element* is an Checks if an object appears to be a valid element object. *element* is an
element instance. Returns a true value if this is an element object. element instance. Returns a true value if this is an element object.
.. function:: iterparse(source, events=None) .. function:: iterparse(source, events=None, parser=None)
Parses an XML section into an element tree incrementally, and reports what's Parses an XML section into an element tree incrementally, and reports what's
going on to the user. *source* is a filename or file object containing XML data. going on to the user. *source* is a filename or file object containing XML
*events* is a list of events to report back. If omitted, only "end" events are data. *events* is a list of events to report back. If omitted, only "end"
reported. Returns an :term:`iterator` providing ``(event, elem)`` pairs. events are reported. *parser* is an optional parser instance. If not
given, the standard :class:`XMLParser` parser is used. Returns an
:term:`iterator` providing ``(event, elem)`` pairs.
.. note:: .. note::
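The incremental behaviour of `iterparse` can be illustrated with a short sketch. It assumes Python 3 and uses an in-memory `io.BytesIO` object in place of a real file; the sample XML is invented. The *parser* argument is left out, since it may not be accepted by every version of the library.

```python
import io
import xml.etree.ElementTree as ET

# In-memory stand-in for a file object containing XML data (illustrative only).
source = io.BytesIO(b"<log><entry id='1'/><entry id='2'/></log>")

seen = []
# Request both "start" and "end" events; by default only "end" is reported.
for event, elem in ET.iterparse(source, events=("start", "end")):
    seen.append((event, elem.tag))

print(seen)
```

An element is only guaranteed to be fully populated when its "end" event arrives, so attribute and text access is safest there.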
@ -106,196 +111,267 @@ Functions
.. function:: parse(source, parser=None) .. function:: parse(source, parser=None)
Parses an XML section into an element tree. *source* is a filename or file Parses an XML section into an element tree. *source* is a filename or file
object containing XML data. *parser* is an optional parser instance. If not object containing XML data. *parser* is an optional parser instance. If
given, the standard XMLTreeBuilder parser is used. Returns an ElementTree not given, the standard :class:`XMLParser` parser is used. Returns an
instance. :class:`ElementTree` instance.
.. function:: ProcessingInstruction(target, text=None) .. function:: ProcessingInstruction(target, text=None)
PI element factory. This factory function creates a special element that will PI element factory. This factory function creates a special element that
be serialized as an XML processing instruction. *target* is a string containing will be serialized as an XML processing instruction. *target* is a string
the PI target. *text* is a string containing the PI contents, if given. Returns containing the PI target. *text* is a string containing the PI contents, if
an element instance, representing a processing instruction. given. Returns an element instance, representing a processing instruction.
.. function:: register_namespace(prefix, uri)
Registers a namespace prefix. The registry is global, and any existing
mapping for either the given prefix or the namespace URI will be removed.
*prefix* is a namespace prefix. *uri* is a namespace uri. Tags and
attributes in this namespace will be serialized with the given prefix, if at
all possible.
.. versionadded:: 2.7
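As a rough illustration of `register_namespace`, the sketch below registers a prefix and then serializes an element in that namespace. The prefix `ex` and the URI `http://example.com/ns` are placeholders, and `encoding="unicode"` is a Python 3 convention that this commit's 2.7 documentation does not describe.

```python
import xml.etree.ElementTree as ET

# Placeholder prefix and URI; note that the registry is global to the process.
ET.register_namespace("ex", "http://example.com/ns")

root = ET.Element("{http://example.com/ns}root")
ET.SubElement(root, "{http://example.com/ns}child")

# Ask for a text (str) result; this spelling is specific to Python 3.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Without the registration, the serializer would fall back to an auto-generated prefix for the same URI.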
.. function:: SubElement(parent, tag, attrib={}, **extra) .. function:: SubElement(parent, tag, attrib={}, **extra)
Subelement factory. This function creates an element instance, and appends it Subelement factory. This function creates an element instance, and appends
to an existing element. it to an existing element.
The element name, attribute names, and attribute values can be an ASCII-only The element name, attribute names, and attribute values can be either
:class:`bytes` object or a :class:`str` object. *parent* is the parent bytestrings or Unicode strings. *parent* is the parent element. *tag* is
element. *tag* is the subelement name. *attrib* is an optional dictionary, the subelement name. *attrib* is an optional dictionary, containing element
containing element attributes. *extra* contains additional attributes, given attributes. *extra* contains additional attributes, given as keyword
as keyword arguments. Returns an element instance. arguments. Returns an element instance.
.. function:: tostring(element, encoding=None) .. function:: tostring(element, encoding=None, method=None)
Generates a string representation of an XML element, including all subelements. Generates a string representation of an XML element, including all
*element* is an Element instance. *encoding* is the output encoding (default is subelements. *element* is an :class:`Element` instance. *encoding* is the
US-ASCII). Returns an encoded string containing the XML data. output encoding (default is None). *method* is either ``"xml"``,
``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
encoded string containing the XML data.
.. function:: XML(text) .. function:: tostringlist(element, encoding=None, method=None)
Generates a string representation of an XML element, including all
subelements. *element* is an :class:`Element` instance. *encoding* is the
output encoding (default is None). *method* is either ``"xml"``,
``"html"`` or ``"text"`` (default is ``"xml"``). Returns a sequence object
containing the XML data.
.. versionadded:: 2.7
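The effect of the new *method* argument can be sketched as follows, assuming Python 3 (where `encoding="unicode"` requests a `str` result). The sample markup is invented.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring("<p>Hello <b>world</b>!</p>")

# "xml" keeps the markup, "text" keeps only the character data.
as_xml = ET.tostring(para, encoding="unicode", method="xml")
as_text = ET.tostring(para, encoding="unicode", method="text")

# tostringlist returns the same serialization as a sequence of pieces.
pieces = ET.tostringlist(para, encoding="unicode", method="xml")

print(as_xml)
print(as_text)
```

Joining the pieces from `tostringlist` should give the same result as the corresponding `tostring` call.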
.. function:: XML(text, parser=None)
Parses an XML section from a string constant. This function can be used to Parses an XML section from a string constant. This function can be used to
embed "XML literals" in Python code. *text* is a string containing XML data. embed "XML literals" in Python code. *text* is a string containing XML
Returns an Element instance. data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns an :class:`Element` instance.
.. function:: XMLID(text) .. function:: XMLID(text, parser=None)
Parses an XML section from a string constant, and also returns a dictionary Parses an XML section from a string constant, and also returns a dictionary
which maps from element id:s to elements. *text* is a string containing XML which maps from element id:s to elements. *text* is a string containing XML
data. Returns a tuple containing an Element instance and a dictionary. data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns a tuple containing an
:class:`Element` instance and a dictionary.
.. _elementtree-element-interface: .. _elementtree-element-objects:
The Element Interface Element Objects
--------------------- ---------------
Element objects returned by Element or SubElement have the following methods
and attributes.
.. class:: Element(tag, attrib={}, **extra)
.. attribute:: Element.tag Element class. This class defines the Element interface, and provides a
reference implementation of this interface.
A string identifying what kind of data this element represents (the element The element name, attribute names, and attribute values can be either
type, in other words). bytestrings or Unicode strings. *tag* is the element name. *attrib* is
an optional dictionary, containing element attributes. *extra* contains
additional attributes, given as keyword arguments.
.. attribute:: Element.text .. attribute:: tag
The *text* attribute can be used to hold additional data associated with the A string identifying what kind of data this element represents (the
element. As the name implies this attribute is usually a string but may be any element type, in other words).
application-specific object. If the element is created from an XML file the
attribute will contain any text found between the element tags.
.. attribute:: Element.tail .. attribute:: text
The *tail* attribute can be used to hold additional data associated with the The *text* attribute can be used to hold additional data associated with
element. This attribute is usually a string but may be any application-specific the element. As the name implies this attribute is usually a string but
object. If the element is created from an XML file the attribute will contain may be any application-specific object. If the element is created from
any text found after the element's end tag and before the next tag. an XML file the attribute will contain any text found between the element
tags.
.. attribute:: Element.attrib .. attribute:: tail
A dictionary containing the element's attributes. Note that while the *attrib* The *tail* attribute can be used to hold additional data associated with
value is always a real mutable Python dictionary, an ElementTree implementation the element. This attribute is usually a string but may be any
may choose to use another internal representation, and create the dictionary application-specific object. If the element is created from an XML file
only if someone asks for it. To take advantage of such implementations, use the the attribute will contain any text found after the element's end tag and
dictionary methods below whenever possible. before the next tag.
The following dictionary-like methods work on the element attributes.
.. attribute:: attrib
.. method:: Element.clear() A dictionary containing the element's attributes. Note that while the
*attrib* value is always a real mutable Python dictionary, an ElementTree
implementation may choose to use another internal representation, and
create the dictionary only if someone asks for it. To take advantage of
such implementations, use the dictionary methods below whenever possible.
Resets an element. This function removes all subelements, clears all The following dictionary-like methods work on the element attributes.
attributes, and sets the text and tail attributes to None.
.. method:: Element.get(key, default=None) .. method:: clear()
Gets the element attribute named *key*. Resets an element. This function removes all subelements, clears all
attributes, and sets the text and tail attributes to None.
Returns the attribute value, or *default* if the attribute was not found.
.. method:: get(key, default=None)
.. method:: Element.items() Gets the element attribute named *key*.
Returns the element attributes as a sequence of (name, value) pairs. The Returns the attribute value, or *default* if the attribute was not found.
attributes are returned in an arbitrary order.
.. method:: Element.keys() .. method:: items()
Returns the elements attribute names as a list. The names are returned in an Returns the element attributes as a sequence of (name, value) pairs. The
arbitrary order. attributes are returned in an arbitrary order.
.. method:: Element.set(key, value) .. method:: keys()
Set the attribute *key* on the element to *value*. Returns the elements attribute names as a list. The names are returned
in an arbitrary order.
The following methods work on the element's children (subelements).
.. method:: set(key, value)
.. method:: Element.append(subelement) Set the attribute *key* on the element to *value*.
Adds the element *subelement* to the end of this elements internal list of The following methods work on the element's children (subelements).
subelements.
.. method:: Element.find(match) .. method:: append(subelement)
Finds the first subelement matching *match*. *match* may be a tag name or path. Adds the element *subelement* to the end of this elements internal list
Returns an element instance or ``None``. of subelements.
.. method:: Element.findall(match) .. method:: extend(subelements)
Finds all subelements matching *match*. *match* may be a tag name or path. Appends *subelements* from a sequence object with zero or more elements.
Returns an iterable yielding all matching elements in document order. Raises :exc:`AssertionError` if a subelement is not a valid object.
.. versionadded:: 2.7
.. method:: Element.findtext(condition, default=None)
Finds text for the first subelement matching *condition*. *condition* may be a .. method:: find(match)
tag name or path. Returns the text content of the first matching element, or
*default* if no element was found. Note that if the matching element has no
text content an empty string is returned.
Finds the first subelement matching *match*. *match* may be a tag name
or path. Returns an element instance or ``None``.
.. method:: Element.getchildren()
Returns all subelements. The elements are returned in document order. .. method:: findall(match)
Finds all matching subelements, by tag name or path. Returns a list
containing all matching elements in document order.
.. method:: Element.getiterator(tag=None)
Creates a tree iterator with the current element as the root. The iterator .. method:: findtext(match, default=None)
iterates over this element and all elements below it, in document (depth first)
order. If *tag* is not ``None`` or ``'*'``, only elements whose tag equals
*tag* are returned from the iterator.
Finds text for the first subelement matching *match*. *match* may be
a tag name or path. Returns the text content of the first matching
element, or *default* if no element was found. Note that if the matching
element has no text content an empty string is returned.
.. method:: Element.insert(index, element)
Inserts a subelement at the given position in this element. .. method:: getchildren()
.. deprecated:: 2.7
Use ``list(elem)`` or iteration.
.. method:: Element.makeelement(tag, attrib)
Creates a new element object of the same type as this element. Do not call this .. method:: getiterator(tag=None)
method, use the SubElement factory function instead.
.. deprecated:: 2.7
Use method :meth:`Element.iter` instead.
.. method:: Element.remove(subelement)
Removes *subelement* from the element. Unlike the findXYZ methods this method .. method:: insert(index, element)
compares elements based on the instance identity, not on tag value or contents.
Element objects also support the following sequence type methods for working Inserts a subelement at the given position in this element.
with subelements: :meth:`__delitem__`, :meth:`__getitem__`, :meth:`__setitem__`,
:meth:`__len__`.
Caution: Because Element objects do not define a :meth:`__bool__` method,
elements with no subelements will test as ``False``. ::
element = root.find('foo') .. method:: iter(tag=None)
if not element: # careful! Creates a tree :term:`iterator` with the current element as the root.
print("element not found, or element has no subelements") The iterator iterates over this element and all elements below it, in
document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
elements whose tag equals *tag* are returned from the iterator. If the
tree structure is modified during iteration, the result is undefined.
if element is None:
print("element not found") .. method:: iterfind(match)
Finds all matching subelements, by tag name or path. Returns an iterable
yielding all matching elements in document order.
.. versionadded:: 2.7
.. method:: itertext()
Creates a text iterator. The iterator loops over this element and all
subelements, in document order, and returns all inner text.
.. versionadded:: 2.7
.. method:: makeelement(tag, attrib)
Creates a new element object of the same type as this element. Do not
call this method, use the :func:`SubElement` factory function instead.
.. method:: remove(subelement)
Removes *subelement* from the element. Unlike the find\* methods this
method compares elements based on the instance identity, not on tag value
or contents.
:class:`Element` objects also support the following sequence type methods
for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
:meth:`__setitem__`, :meth:`__len__`.
Caution: Elements with no subelements will test as ``False``. This behavior
will change in future versions. Use specific ``len(elem)`` or ``elem is
None`` test instead. ::
element = root.find('foo')
if not element: # careful!
print("element not found, or element has no subelements")
if element is None:
print("element not found")
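The methods added to `Element` in this change (`extend`, `iter`, `iterfind`, `itertext`) and the explicit tests recommended in the caution above might be combined as in this sketch. It assumes Python 3, and the tag and attribute names are illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.Element("doc")
# extend() appends several subelements at once.
root.extend([ET.Element("sec", name="a"), ET.Element("sec", name="b")])
root[0].text = "first"
root[1].text = "second"

# iter() walks this element and everything below it, in document order.
tags = [elem.tag for elem in root.iter()]
# iterfind() yields matches lazily instead of building a list.
names = [sec.get("name") for sec in root.iterfind("sec")]
# itertext() yields the inner text of the element and its subelements.
text = "".join(root.itertext())

# Explicit tests instead of relying on the truth value of an element.
missing = root.find("nope")
leaf = root.find("sec")
found_leaf = leaf is not None
leaf_has_children = len(leaf) > 0
print(tags, names, text, missing, found_leaf, leaf_has_children)
```

Here `leaf` exists but has no subelements, which is exactly the case where a bare truth test would be misleading.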
.. _elementtree-elementtree-objects: .. _elementtree-elementtree-objects:
@ -306,70 +382,88 @@ ElementTree Objects
.. class:: ElementTree(element=None, file=None) .. class:: ElementTree(element=None, file=None)
ElementTree wrapper class. This class represents an entire element hierarchy, ElementTree wrapper class. This class represents an entire element
and adds some extra support for serialization to and from standard XML. hierarchy, and adds some extra support for serialization to and from
standard XML.
*element* is the root element. The tree is initialized with the contents of the *element* is the root element. The tree is initialized with the contents
XML *file* if given. of the XML *file* if given.
.. method:: _setroot(element) .. method:: _setroot(element)
Replaces the root element for this tree. This discards the current Replaces the root element for this tree. This discards the current
contents of the tree, and replaces it with the given element. Use with contents of the tree, and replaces it with the given element. Use with
care. *element* is an element instance. care. *element* is an element instance.
.. method:: find(path) .. method:: find(match)
Finds the first toplevel element with given tag. Same as Finds the first toplevel element matching *match*. *match* may be a tag
getroot().find(path). *path* is the element to look for. Returns the name or path. Same as getroot().find(match). Returns the first matching
first matching element, or ``None`` if no element was found. element, or ``None`` if no element was found.
.. method:: findall(path) .. method:: findall(match)
Finds all toplevel elements with the given tag. Same as Finds all matching subelements, by tag name or path. Same as
getroot().findall(path). *path* is the element to look for. Returns a getroot().findall(match). *match* may be a tag name or path. Returns a
list or :term:`iterator` containing all matching elements, in document list containing all matching elements, in document order.
order.
.. method:: findtext(path, default=None) .. method:: findtext(match, default=None)
Finds the element text for the first toplevel element with given tag. Finds the element text for the first toplevel element with given tag.
Same as getroot().findtext(path). *path* is the toplevel element to look Same as getroot().findtext(match). *match* may be a tag name or path.
for. *default* is the value to return if the element was not *default* is the value to return if the element was not found. Returns
found. Returns the text content of the first matching element, or the the text content of the first matching element, or the default value no
default value no element was found. Note that if the element has is element was found. Note that if the element is found, but has no text
found, but has no text content, this method returns an empty string. content, this method returns an empty string.
.. method:: getiterator(tag=None) .. method:: getiterator(tag=None)
Creates and returns a tree iterator for the root element. The iterator .. deprecated:: 2.7
loops over all elements in this tree, in section order. *tag* is the tag Use method :meth:`ElementTree.iter` instead.
to look for (default is to return all elements)
.. method:: getroot() .. method:: getroot()
Returns the root element for this tree. Returns the root element for this tree.
.. method:: iter(tag=None)
Creates and returns a tree iterator for the root element. The iterator
loops over all elements in this tree, in section order. *tag* is the tag
to look for (default is to return all elements)
.. method:: iterfind(match)
Finds all matching subelements, by tag name or path. Same as
getroot().iterfind(match). Returns an iterable yielding all matching
elements in document order.
.. versionadded:: 2.7
.. method:: parse(source, parser=None) .. method:: parse(source, parser=None)
Loads an external XML section into this element tree. *source* is a file Loads an external XML section into this element tree. *source* is a file
name or file object. *parser* is an optional parser instance. If not name or file object. *parser* is an optional parser instance. If not
given, the standard XMLTreeBuilder parser is used. Returns the section given, the standard XMLParser parser is used. Returns the section
root element. root element.
.. method:: write(file, encoding=None) .. method:: write(file, encoding=None, xml_declaration=None, method=None)
Writes the element tree to a file, as XML. *file* is a file name, or a Writes the element tree to a file, as XML. *file* is a file name, or a
file object opened for writing. *encoding* [1]_ is the output encoding file object opened for writing. *encoding* [1]_ is the output encoding
(default is US-ASCII). (default is None). *xml_declaration* controls if an XML declaration
should be added to the file. Use False for never, True for always, None
for only if not US-ASCII or UTF-8 (default is None). *method* is either
``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an
(optionally) encoded string.
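A small sketch of the extended `write()` signature, assuming Python 3 and writing to an in-memory buffer instead of a file on disk. The document content is invented.

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.fromstring("<greeting>hi</greeting>"))

# A binary buffer stands in for a file object opened for writing.
buffer = io.BytesIO()
result = tree.write(buffer, encoding="utf-8", xml_declaration=True, method="xml")
data = buffer.getvalue()
print(data)
```

In the CPython implementation, as far as I am aware, `write()` itself returns `None` and the serialized data ends up only in the file, so the "Returns an (optionally) encoded string" wording above is best treated with some caution.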
This is the XML file that is going to be manipulated:: This is the XML file that is going to be manipulated::
@ -388,13 +482,13 @@ Example of changing the attribute "target" of every link in first paragraph::
>>> from xml.etree.ElementTree import ElementTree >>> from xml.etree.ElementTree import ElementTree
>>> tree = ElementTree() >>> tree = ElementTree()
>>> tree.parse("index.xhtml") >>> tree.parse("index.xhtml")
<Element html at b7d3f1ec> <Element 'html' at 0xb77e6fac>
>>> p = tree.find("body/p") # Finds first occurrence of tag p in body >>> p = tree.find("body/p") # Finds first occurrence of tag p in body
>>> p >>> p
<Element p at 8416e0c> <Element 'p' at 0xb77ec26c>
>>> links = p.getiterator("a") # Returns list of all links >>> links = list(p.iter("a")) # Returns list of all links
>>> links >>> links
[<Element a at b7d4f9ec>, <Element a at b7d4fb0c>] [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
>>> for i in links: # Iterates through all found links >>> for i in links: # Iterates through all found links
... i.attrib["target"] = "blank" ... i.attrib["target"] = "blank"
>>> tree.write("output.xhtml") >>> tree.write("output.xhtml")
@ -407,12 +501,12 @@ QName Objects
.. class:: QName(text_or_uri, tag=None) .. class:: QName(text_or_uri, tag=None)
QName wrapper. This can be used to wrap a QName attribute value, in order to QName wrapper. This can be used to wrap a QName attribute value, in order
get proper namespace handling on output. *text_or_uri* is a string containing to get proper namespace handling on output. *text_or_uri* is a string
the QName value, in the form {uri}local, or, if the tag argument is given, the containing the QName value, in the form {uri}local, or, if the tag argument
URI part of a QName. If *tag* is given, the first argument is interpreted as an is given, the URI part of a QName. If *tag* is given, the first argument is
URI, and this argument is interpreted as a local name. :class:`QName` instances interpreted as an URI, and this argument is interpreted as a local name.
are opaque. :class:`QName` instances are opaque.
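The two ways of constructing a `QName` described above can be compared with a short sketch. The URI and local name are placeholders, Python 3 is assumed, and relying on `str()` of a `QName` reflects current CPython behaviour rather than anything this document promises.

```python
import xml.etree.ElementTree as ET

# One-argument form: the full name in {uri}local notation.
q1 = ET.QName("{http://example.com/ns}item")
# Two-argument form: the URI first, then the local name.
q2 = ET.QName("http://example.com/ns", "item")

same_name = str(q1) == str(q2)

# A QName can be used as a tag; the serializer resolves the namespace on output.
xml_text = ET.tostring(ET.Element(q2), encoding="unicode")
print(same_name, xml_text)
```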
.. _elementtree-treebuilder-objects: .. _elementtree-treebuilder-objects:
@ -423,74 +517,89 @@ TreeBuilder Objects
.. class:: TreeBuilder(element_factory=None) .. class:: TreeBuilder(element_factory=None)
Generic element structure builder. This builder converts a sequence of start, Generic element structure builder. This builder converts a sequence of
data, and end method calls to a well-formed element structure. You can use this start, data, and end method calls to a well-formed element structure. You
class to build an element structure using a custom XML parser, or a parser for can use this class to build an element structure using a custom XML parser,
some other XML-like format. The *element_factory* is called to create new or a parser for some other XML-like format. The *element_factory* is called
Element instances when given. to create new :class:`Element` instances when given.
.. method:: close() .. method:: close()
Flushes the parser buffers, and returns the toplevel document Flushes the builder buffers, and returns the toplevel document
element. Returns an Element instance. element. Returns an :class:`Element` instance.
.. method:: data(data) .. method:: data(data)
Adds text to the current element. *data* is a string. This should be Adds text to the current element. *data* is a string. This should be
either an ASCII-only :class:`bytes` object or a :class:`str` object. either a bytestring, or a Unicode string.
.. method:: end(tag) .. method:: end(tag)
Closes the current element. *tag* is the element name. Returns the closed Closes the current element. *tag* is the element name. Returns the
element. closed element.
.. method:: start(tag, attrs) .. method:: start(tag, attrs)
Opens a new element. *tag* is the element name. *attrs* is a dictionary Opens a new element. *tag* is the element name. *attrs* is a dictionary
containing element attributes. Returns the opened element. containing element attributes. Returns the opened element.
.. _elementtree-xmltreebuilder-objects: In addition, a custom :class:`TreeBuilder` object can provide the
following method:
XMLTreeBuilder Objects .. method:: doctype(name, pubid, system)
----------------------
Handles a doctype declaration. *name* is the doctype name. *pubid* is
the public identifier. *system* is the system identifier. This method
does not exist on the default :class:`TreeBuilder` class.
.. versionadded:: 2.7
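To make the start/data/end/close protocol concrete, here is a minimal sketch that drives a default `TreeBuilder` by hand, the way a custom parser would. Tag names and attribute values are made up, and Python 3 is assumed.

```python
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()

# Equivalent to parsing: <root><child id="1">hello</child></root>
builder.start("root", {})
builder.start("child", {"id": "1"})
builder.data("hello")
builder.end("child")
builder.end("root")

# close() returns the toplevel element that was built.
root = builder.close()
child = root[0]
print(root.tag, child.tag, child.get("id"), child.text)
```

A custom target object only needs to provide the same methods; `doctype` is optional and is called only if the target defines it.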
.. class:: XMLTreeBuilder(html=0, target=None) .. _elementtree-xmlparser-objects:
Element structure builder for XML source data, based on the expat parser. *html* XMLParser Objects
are predefined HTML entities. This flag is not supported by the current -----------------
implementation. *target* is the target object. If omitted, the builder uses an
instance of the standard TreeBuilder class.
.. class:: XMLParser(html=0, target=None, encoding=None)
:class:`Element` structure builder for XML source data, based on the expat
parser. *html* are predefined HTML entities. This flag is not supported by
the current implementation. *target* is the target object. If omitted, the
builder uses an instance of the standard TreeBuilder class. *encoding* [1]_
is optional. If given, the value overrides the encoding specified in the
XML file.
.. method:: close() .. method:: close()
Finishes feeding data to the parser. Returns an element structure. Finishes feeding data to the parser. Returns an element structure.
.. method:: doctype(name, pubid, system) .. method:: doctype(name, pubid, system)
Handles a doctype declaration. *name* is the doctype name. *pubid* is the .. deprecated:: 2.7
public identifier. *system* is the system identifier. Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
target.
.. method:: feed(data) .. method:: feed(data)
Feeds data to the parser. *data* is encoded data. Feeds data to the parser. *data* is encoded data.
:meth:`XMLTreeBuilder.feed` calls *target*\'s :meth:`start` method :meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag, for each opening tag, its :meth:`end` method for each closing tag,
and data is processed by method :meth:`data`. :meth:`XMLTreeBuilder.close` and data is processed by method :meth:`data`. :meth:`XMLParser.close`
calls *target*\'s method :meth:`close`. calls *target*\'s method :meth:`close`.
:class:`XMLTreeBuilder` can be used not only for building a tree structure. :class:`XMLParser` can be used not only for building a tree structure.
This is an example of counting the maximum depth of an XML file:: This is an example of counting the maximum depth of an XML file::
>>> from xml.etree.ElementTree import XMLTreeBuilder >>> from xml.etree.ElementTree import XMLParser
>>> class MaxDepth: # The target object of the parser >>> class MaxDepth: # The target object of the parser
... maxDepth = 0 ... maxDepth = 0
... depth = 0 ... depth = 0
@ -506,7 +615,7 @@ This is an example of counting the maximum depth of an XML file::
... return self.maxDepth ... return self.maxDepth
... ...
>>> target = MaxDepth() >>> target = MaxDepth()
>>> parser = XMLTreeBuilder(target=target) >>> parser = XMLParser(target=target)
>>> exampleXml = """ >>> exampleXml = """
... <a> ... <a>
... <b> ... <b>
@ -526,7 +635,6 @@ This is an example of counting the maximum depth of an XML file::
.. rubric:: Footnotes .. rubric:: Footnotes
.. [#] The encoding string included in XML output should conform to the .. [#] The encoding string included in XML output should conform to the
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets. and http://www.iana.org/assignments/character-sets.
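
The target protocol described in this documentation hunk (``start``, ``end``, ``data``, ``close``, plus the optional ``doctype`` method on a custom target) can be illustrated with a small sketch. The class name ``DoctypeRecorder`` and the sample document are hypothetical and are not part of the commit; the example only assumes the ``XMLParser(target=...)`` behaviour documented above.

```python
from xml.etree.ElementTree import XMLParser


class DoctypeRecorder:
    """Hypothetical parser target: remembers the doctype, counts elements."""

    def __init__(self):
        self.doctype_info = None
        self.count = 0

    def doctype(self, name, pubid, system):
        # Called by the parser when a doctype declaration is seen.
        self.doctype_info = (name, pubid, system)

    def start(self, tag, attrs):
        # Called for each opening tag.
        self.count += 1

    def end(self, tag):
        # Called for each closing tag.
        pass

    def data(self, data):
        # Called for character data; ignored in this sketch.
        pass

    def close(self):
        # Whatever is returned here is what parser.close() returns.
        return self.doctype_info, self.count


parser = XMLParser(target=DoctypeRecorder())
parser.feed('<!DOCTYPE root SYSTEM "root.dtd"><root><child/></root>')
info, count = parser.close()
print(info, count)
```

Because the target is an ordinary object, no tree is built at all here; the parser only drives the callbacks.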


@ -1,23 +0,0 @@
:mod:`xml.etree` --- The ElementTree API for XML
================================================
.. module:: xml.etree
:synopsis: Package containing common ElementTree modules.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>
The ElementTree package is a simple, efficient, and quite popular library for
XML manipulation in Python. The :mod:`xml.etree` package contains the most
common components from the ElementTree API library. In the current release,
this package contains the :mod:`ElementTree`, :mod:`ElementPath`, and
:mod:`ElementInclude` modules from the full ElementTree distribution.
.. XXX To be continued!
.. seealso::
`ElementTree Overview <http://effbot.org/tag/elementtree>`_
The home page for :mod:`ElementTree`. This includes links to additional
documentation, alternative implementations, and other add-ons.


@ -402,12 +402,14 @@ def temp_cwd(name='tempcwd', quiet=False):
rmtree(name) rmtree(name)
def findfile(file, here=__file__): def findfile(file, here=__file__, subdir=None):
"""Try to find a file on sys.path and the working directory. If it is not """Try to find a file on sys.path and the working directory. If it is not
found the argument passed to the function is returned (this does not found the argument passed to the function is returned (this does not
necessarily signal failure; could still be the legitimate path).""" necessarily signal failure; could still be the legitimate path)."""
if os.path.isabs(file): if os.path.isabs(file):
return file return file
if subdir is not None:
file = os.path.join(subdir, file)
path = sys.path path = sys.path
path = [os.path.dirname(here)] + path path = [os.path.dirname(here)] + path
for dn in path: for dn in path:
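
The effect of the new *subdir* argument can be sketched with a self-contained stand-in. The hunk above stops at the ``for`` loop, so the loop body below (join each directory with the name, return the first path that exists) is an assumption, and ``find_data_file`` is a hypothetical name rather than the real helper.

```python
import os
import sys
import tempfile


def find_data_file(name, here, subdir=None):
    """Hypothetical stand-in for a findfile-style helper (not the real one)."""
    if os.path.isabs(name):
        return name
    if subdir is not None:
        # New behaviour in this commit: look inside a data subdirectory.
        name = os.path.join(subdir, name)
    # Search the directory containing `here` first, then sys.path.
    for directory in [os.path.dirname(here)] + sys.path:
        candidate = os.path.join(directory, name)  # assumed loop body
        if os.path.exists(candidate):
            return candidate
    # Not found: return the (possibly subdir-prefixed) name unchanged.
    return name


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "xmltestdata"))
    target = os.path.join(root, "xmltestdata", "test.xml")
    open(target, "w").close()
    fake_module = os.path.join(root, "test_example.py")

    found = find_data_file("test.xml", here=fake_module, subdir="xmltestdata")
    found_matches = found == target
    missing = find_data_file("absent.xml", here=fake_module, subdir="xmltestdata")
```

This mirrors how the tests in this commit call ``findfile("test.xml", subdir="xmltestdata")`` once the XML fixtures have moved to their own directory.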


@ -1,9 +1,7 @@
# test for xml.dom.minidom # test for xml.dom.minidom
import os
import sys
import pickle import pickle
from test.support import verbose, run_unittest from test.support import verbose, run_unittest, findfile
import unittest import unittest
import xml.dom import xml.dom
@ -14,12 +12,8 @@ from xml.dom.minidom import parse, Node, Document, parseString
from xml.dom.minidom import getDOMImplementation from xml.dom.minidom import getDOMImplementation
if __name__ == "__main__": tstfile = findfile("test.xml", subdir="xmltestdata")
base = sys.argv[0]
else:
base = __file__
tstfile = os.path.join(os.path.dirname(base), "test.xml")
del base
# The tests of DocumentType importing use these helpers to construct # The tests of DocumentType importing use these helpers to construct
# the documents to work with, since not all DOM builders actually # the documents to work with, since not all DOM builders actually


@ -15,7 +15,9 @@ from xml.sax.xmlreader import InputSource, AttributesImpl, AttributesNSImpl
from io import StringIO from io import StringIO
from test.support import findfile, run_unittest from test.support import findfile, run_unittest
import unittest import unittest
import os
TEST_XMLFILE = findfile("test.xml", subdir="xmltestdata")
TEST_XMLFILE_OUT = findfile("test.xml.out", subdir="xmltestdata")
ns_uri = "http://www.python.org/xml-ns/saxtest/" ns_uri = "http://www.python.org/xml-ns/saxtest/"
@ -311,7 +313,7 @@ class XMLFilterBaseTest(unittest.TestCase):
# #
# =========================================================================== # ===========================================================================
xml_test_out = open(findfile("test.xml.out")).read() xml_test_out = open(TEST_XMLFILE_OUT).read()
class ExpatReaderTest(XmlTestBase): class ExpatReaderTest(XmlTestBase):
@ -323,7 +325,7 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(open(findfile("test.xml"))) parser.parse(open(TEST_XMLFILE))
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -452,7 +454,7 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(findfile("test.xml")) parser.parse(TEST_XMLFILE)
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -462,7 +464,7 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(InputSource(findfile("test.xml"))) parser.parse(InputSource(TEST_XMLFILE))
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -473,7 +475,7 @@ class ExpatReaderTest(XmlTestBase):
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
inpsrc = InputSource() inpsrc = InputSource()
inpsrc.setByteStream(open(findfile("test.xml"))) inpsrc.setByteStream(open(TEST_XMLFILE))
parser.parse(inpsrc) parser.parse(inpsrc)
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -534,9 +536,9 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser = create_parser() parser = create_parser()
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(findfile("test.xml")) parser.parse(TEST_XMLFILE)
self.assertEquals(parser.getSystemId(), findfile("test.xml")) self.assertEquals(parser.getSystemId(), TEST_XMLFILE)
self.assertEquals(parser.getPublicId(), None) self.assertEquals(parser.getPublicId(), None)

File diff suppressed because it is too large


@ -1,31 +1,11 @@
# xml.etree test for cElementTree # xml.etree test for cElementTree
import doctest
import sys
from test import support from test import support
ET = support.import_module('xml.etree.cElementTree') cET = support.import_module('xml.etree.cElementTree')
SAMPLE_XML = """
<body>
<tag>text</tag>
<tag />
<section>
<tag>subtext</tag>
</section>
</body>
"""
SAMPLE_XML_NS = """ # cElementTree specific tests
<body xmlns="http://effbot.org/ns">
<tag>text</tag>
<tag />
<section>
<tag>subtext</tag>
</section>
</body>
"""
def sanity(): def sanity():
""" """
@ -34,187 +14,26 @@ def sanity():
>>> from xml.etree import cElementTree >>> from xml.etree import cElementTree
""" """
def check_method(method):
if not hasattr(method, '__call__'):
print(method, "not callable")
def serialize(ET, elem):
import io
file = io.StringIO()
tree = ET.ElementTree(elem)
tree.write(file)
return file.getvalue()
def summarize(elem):
return elem.tag
def summarize_list(seq):
return list(map(summarize, seq))
def interface():
"""
Test element tree interface.
>>> element = ET.Element("tag", key="value")
>>> tree = ET.ElementTree(element)
Make sure all standard element methods exist.
>>> check_method(element.append)
>>> check_method(element.insert)
>>> check_method(element.remove)
>>> check_method(element.getchildren)
>>> check_method(element.find)
>>> check_method(element.findall)
>>> check_method(element.findtext)
>>> check_method(element.clear)
>>> check_method(element.get)
>>> check_method(element.set)
>>> check_method(element.keys)
>>> check_method(element.items)
>>> check_method(element.getiterator)
Basic method sanity checks.
>>> serialize(ET, element) # 1
'<tag key="value" />'
>>> subelement = ET.Element("subtag")
>>> element.append(subelement)
>>> serialize(ET, element) # 2
'<tag key="value"><subtag /></tag>'
>>> element.insert(0, subelement)
>>> serialize(ET, element) # 3
'<tag key="value"><subtag /><subtag /></tag>'
>>> element.remove(subelement)
>>> serialize(ET, element) # 4
'<tag key="value"><subtag /></tag>'
>>> element.remove(subelement)
>>> serialize(ET, element) # 5
'<tag key="value" />'
>>> element.remove(subelement)
Traceback (most recent call last):
ValueError: list.remove(x): x not in list
>>> serialize(ET, element) # 6
'<tag key="value" />'
"""
def find():
"""
Test find methods (including xpath syntax).
>>> elem = ET.XML(SAMPLE_XML)
>>> elem.find("tag").tag
'tag'
>>> ET.ElementTree(elem).find("tag").tag
'tag'
>>> elem.find("section/tag").tag
'tag'
>>> ET.ElementTree(elem).find("section/tag").tag
'tag'
>>> elem.findtext("tag")
'text'
>>> elem.findtext("tog")
>>> elem.findtext("tog", "default")
'default'
>>> ET.ElementTree(elem).findtext("tag")
'text'
>>> elem.findtext("section/tag")
'subtext'
>>> ET.ElementTree(elem).findtext("section/tag")
'subtext'
>>> summarize_list(elem.findall("tag"))
['tag', 'tag']
>>> summarize_list(elem.findall("*"))
['tag', 'tag', 'section']
>>> summarize_list(elem.findall(".//tag"))
['tag', 'tag', 'tag']
>>> summarize_list(elem.findall("section/tag"))
['tag']
>>> summarize_list(elem.findall("section//tag"))
['tag']
>>> summarize_list(elem.findall("section/*"))
['tag']
>>> summarize_list(elem.findall("section//*"))
['tag']
>>> summarize_list(elem.findall("section/.//*"))
['tag']
>>> summarize_list(elem.findall("*/*"))
['tag']
>>> summarize_list(elem.findall("*//*"))
['tag']
>>> summarize_list(elem.findall("*/tag"))
['tag']
>>> summarize_list(elem.findall("*/./tag"))
['tag']
>>> summarize_list(elem.findall("./tag"))
['tag', 'tag']
>>> summarize_list(elem.findall(".//tag"))
['tag', 'tag', 'tag']
>>> summarize_list(elem.findall("././tag"))
['tag', 'tag']
>>> summarize_list(ET.ElementTree(elem).findall("/tag"))
['tag', 'tag']
>>> summarize_list(ET.ElementTree(elem).findall("./tag"))
['tag', 'tag']
>>> elem = ET.XML(SAMPLE_XML_NS)
>>> summarize_list(elem.findall("tag"))
[]
>>> summarize_list(elem.findall("{http://effbot.org/ns}tag"))
['{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag']
>>> summarize_list(elem.findall(".//{http://effbot.org/ns}tag"))
['{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag']
"""
def parseliteral():
r"""
>>> element = ET.XML("<html><body>text</body></html>")
>>> ET.ElementTree(element).write(sys.stdout)
<html><body>text</body></html>
>>> element = ET.fromstring("<html><body>text</body></html>")
>>> ET.ElementTree(element).write(sys.stdout)
<html><body>text</body></html>
>>> print(ET.tostring(element))
<html><body>text</body></html>
>>> print(repr(ET.tostring(element, "ascii")))
b"<?xml version='1.0' encoding='ascii'?>\n<html><body>text</body></html>"
>>> _, ids = ET.XMLID("<html><body>text</body></html>")
>>> len(ids)
0
>>> _, ids = ET.XMLID("<html><body id='body'>text</body></html>")
>>> len(ids)
1
>>> ids["body"].tag
'body'
"""
def check_encoding(encoding):
"""
>>> check_encoding("ascii")
>>> check_encoding("us-ascii")
>>> check_encoding("iso-8859-1")
>>> check_encoding("iso-8859-15")
>>> check_encoding("cp437")
>>> check_encoding("mac-roman")
"""
ET.XML(
"<?xml version='1.0' encoding='%s'?><xml />" % encoding
)
def bug_1534630():
"""
>>> bob = ET.TreeBuilder()
>>> e = bob.data("data")
>>> e = bob.start("tag", {})
>>> e = bob.end("tag")
>>> e = bob.close()
>>> serialize(ET, e)
'<tag />'
"""
def test_main(): def test_main():
from test import test_xml_etree_c from test import test_xml_etree, test_xml_etree_c
# Run the tests specific to the C implementation
support.run_doctest(test_xml_etree_c, verbosity=True) support.run_doctest(test_xml_etree_c, verbosity=True)
# Assign the C implementation before running the doctests
# Patch the __name__, to prevent confusion with the pure Python test
pyET = test_xml_etree.ET
py__name__ = test_xml_etree.__name__
test_xml_etree.ET = cET
if __name__ != '__main__':
test_xml_etree.__name__ = __name__
try:
# Run the same test suite as xml.etree.ElementTree
test_xml_etree.test_main(module_name='xml.etree.cElementTree')
finally:
test_xml_etree.ET = pyET
test_xml_etree.__name__ = py__name__
if __name__ == '__main__': if __name__ == '__main__':
test_main() test_main()
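
The new ``test_main`` above temporarily rebinds ``test_xml_etree.ET`` to the C implementation and restores it in a ``finally`` clause. A minimal sketch of that swap-and-restore pattern, using a throwaway module object so it runs anywhere; the names ``swapped_attr`` and ``suite`` are hypothetical and do not appear in the commit.

```python
import types
from contextlib import contextmanager


@contextmanager
def swapped_attr(module, name, replacement):
    """Temporarily rebind module.<name>, restoring it even on error."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    try:
        yield
    finally:
        setattr(module, name, original)


# Stand-in for a test module that holds a reference to an implementation.
suite = types.ModuleType("suite")
suite.ET = "pure-Python implementation"

seen = []
with swapped_attr(suite, "ET", "C implementation"):
    seen.append(suite.ET)   # code run here sees the replacement
seen.append(suite.ET)       # the original binding is back afterwards
print(seen)
```

The ``try``/``finally`` matters because a failing test run would otherwise leave the shared module pointing at the wrong implementation for later tests.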


@ -0,0 +1,7 @@
<?pi data?>
<!-- comment -->
<root xmlns='namespace'>
<element key='value'>text</element>
<element>text</element>tail
<empty-element/>
</root>


@ -0,0 +1,6 @@
<!-- comment -->
<root>
<element key='value'>text</element>
<element>text</element>tail
<empty-element/>
</root>

View File

@ -1,6 +1,6 @@
# #
# ElementTree # ElementTree
# $Id: ElementInclude.py 1862 2004-06-18 07:31:02Z Fredrik $ # $Id: ElementInclude.py 3375 2008-02-13 08:05:08Z fredrik $
# #
# limited xinclude support for element trees # limited xinclude support for element trees
# #
@ -16,7 +16,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2008 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -42,7 +42,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.
## ##
# Limited XInclude support for the ElementTree package. # Limited XInclude support for the ElementTree package.


@ -1,6 +1,6 @@
# #
# ElementTree # ElementTree
# $Id: ElementPath.py 1858 2004-06-17 21:31:41Z Fredrik $ # $Id: ElementPath.py 3375 2008-02-13 08:05:08Z fredrik $
# #
# limited xpath support for element trees # limited xpath support for element trees
# #
@ -8,8 +8,13 @@
# 2003-05-23 fl created # 2003-05-23 fl created
# 2003-05-28 fl added support for // etc # 2003-05-28 fl added support for // etc
# 2003-08-27 fl fixed parsing of periods in element names # 2003-08-27 fl fixed parsing of periods in element names
# 2007-09-10 fl new selection engine
# 2007-09-12 fl fixed parent selector
# 2007-09-13 fl added iterfind; changed findall to return a list
# 2007-11-30 fl added namespaces support
# 2009-10-30 fl added child element value filter
# #
# Copyright (c) 2003-2004 by Fredrik Lundh. All rights reserved. # Copyright (c) 2003-2009 by Fredrik Lundh. All rights reserved.
# #
# fredrik@pythonware.com # fredrik@pythonware.com
# http://www.pythonware.com # http://www.pythonware.com
@ -17,7 +22,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2009 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -43,7 +48,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.
## ##
# Implementation module for XPath support. There's usually no reason # Implementation module for XPath support. There's usually no reason
@ -53,146 +58,246 @@
import re import re
xpath_tokenizer = re.compile( xpath_tokenizer_re = re.compile(
"(::|\.\.|\(\)|[/.*:\[\]\(\)@=])|((?:\{[^}]+\})?[^/:\[\]\(\)@=\s]+)|\s+" "("
).findall "'[^']*'|\"[^\"]*\"|"
"::|"
"//?|"
"\.\.|"
"\(\)|"
"[/.*:\[\]\(\)@=])|"
"((?:\{[^}]+\})?[^/\[\]\(\)@=\s]+)|"
"\s+"
)
class xpath_descendant_or_self: def xpath_tokenizer(pattern, namespaces=None):
pass for token in xpath_tokenizer_re.findall(pattern):
tag = token[1]
## if tag and tag[0] != "{" and ":" in tag:
# Wrapper for a compiled XPath.
class Path:
##
# Create an Path instance from an XPath expression.
def __init__(self, path):
tokens = xpath_tokenizer(path)
# the current version supports 'path/path'-style expressions only
self.path = []
self.tag = None
if tokens and tokens[0][0] == "/":
raise SyntaxError("cannot use absolute path on element")
while tokens:
op, tag = tokens.pop(0)
if tag or op == "*":
self.path.append(tag or op)
elif op == ".":
pass
elif op == "/":
self.path.append(xpath_descendant_or_self())
continue
else:
raise SyntaxError("unsupported path syntax (%s)" % op)
if tokens:
op, tag = tokens.pop(0)
if op != "/":
raise SyntaxError(
"expected path separator (%s)" % (op or tag)
)
if self.path and isinstance(self.path[-1], xpath_descendant_or_self):
raise SyntaxError("path cannot end with //")
if len(self.path) == 1 and isinstance(self.path[0], type("")):
self.tag = self.path[0]
##
# Find first matching object.
def find(self, element):
tag = self.tag
if tag is None:
nodeset = self.findall(element)
if not nodeset:
return None
return nodeset[0]
for elem in element:
if elem.tag == tag:
return elem
return None
##
# Find text for first matching object.
def findtext(self, element, default=None):
tag = self.tag
if tag is None:
nodeset = self.findall(element)
if not nodeset:
return default
return nodeset[0].text or ""
for elem in element:
if elem.tag == tag:
return elem.text or ""
return default
##
# Find all matching objects.
def findall(self, element):
nodeset = [element]
index = 0
while 1:
try: try:
path = self.path[index] prefix, uri = tag.split(":", 1)
index = index + 1 if not namespaces:
except IndexError: raise KeyError
return nodeset yield token[0], "{%s}%s" % (namespaces[prefix], uri)
set = [] except KeyError:
if isinstance(path, xpath_descendant_or_self): raise SyntaxError("prefix %r not found in prefix map" % prefix)
else:
yield token
def get_parent_map(context):
parent_map = context.parent_map
if parent_map is None:
context.parent_map = parent_map = {}
for p in context.root.iter():
for e in p:
parent_map[e] = p
return parent_map
def prepare_child(next, token):
tag = token[1]
def select(context, result):
for elem in result:
for e in elem:
if e.tag == tag:
yield e
return select
def prepare_star(next, token):
def select(context, result):
for elem in result:
for e in elem:
yield e
return select
def prepare_self(next, token):
def select(context, result):
for elem in result:
yield elem
return select
def prepare_descendant(next, token):
token = next()
if token[0] == "*":
tag = "*"
elif not token[0]:
tag = token[1]
else:
raise SyntaxError("invalid descendant")
def select(context, result):
for elem in result:
for e in elem.iter(tag):
if e is not elem:
yield e
return select
def prepare_parent(next, token):
def select(context, result):
# FIXME: raise error if .. is applied at toplevel?
parent_map = get_parent_map(context)
result_map = {}
for elem in result:
if elem in parent_map:
parent = parent_map[elem]
if parent not in result_map:
result_map[parent] = None
yield parent
return select
def prepare_predicate(next, token):
# FIXME: replace with real parser!!! refs:
# http://effbot.org/zone/simple-iterator-parser.htm
# http://javascript.crockford.com/tdop/tdop.html
signature = []
predicate = []
while 1:
token = next()
if token[0] == "]":
break
if token[0] and token[0][:1] in "'\"":
token = "'", token[0][1:-1]
signature.append(token[0] or "-")
predicate.append(token[1])
signature = "".join(signature)
# use signature to determine predicate type
if signature == "@-":
# [@attribute] predicate
key = predicate[1]
def select(context, result):
for elem in result:
if elem.get(key) is not None:
yield elem
return select
if signature == "@-='":
# [@attribute='value']
key = predicate[1]
value = predicate[-1]
def select(context, result):
for elem in result:
if elem.get(key) == value:
yield elem
return select
if signature == "-" and not re.match("\d+$", predicate[0]):
# [tag]
tag = predicate[0]
def select(context, result):
for elem in result:
if elem.find(tag) is not None:
yield elem
return select
if signature == "-='" and not re.match("\d+$", predicate[0]):
# [tag='value']
tag = predicate[0]
value = predicate[-1]
def select(context, result):
for elem in result:
for e in elem.findall(tag):
if "".join(e.itertext()) == value:
yield elem
break
return select
if signature == "-" or signature == "-()" or signature == "-()-":
# [index] or [last()] or [last()-index]
if signature == "-":
index = int(predicate[0]) - 1
else:
if predicate[0] != "last":
raise SyntaxError("unsupported function")
if signature == "-()-":
try: try:
tag = self.path[index] index = int(predicate[2]) - 1
if not isinstance(tag, type("")): except ValueError:
tag = None raise SyntaxError("unsupported expression")
else:
index = index + 1
except IndexError:
tag = None # invalid path
for node in nodeset:
new = list(node.getiterator(tag))
if new and new[0] is node:
set.extend(new[1:])
else:
set.extend(new)
else: else:
for node in nodeset: index = -1
for node in node: def select(context, result):
if path == "*" or node.tag == path: parent_map = get_parent_map(context)
set.append(node) for elem in result:
if not set: try:
return [] parent = parent_map[elem]
nodeset = set # FIXME: what if the selector is "*" ?
elems = list(parent.findall(elem.tag))
if elems[index] is elem:
yield elem
except (IndexError, KeyError):
pass
return select
raise SyntaxError("invalid predicate")
ops = {
"": prepare_child,
"*": prepare_star,
".": prepare_self,
"..": prepare_parent,
"//": prepare_descendant,
"[": prepare_predicate,
}
_cache = {} _cache = {}
## class _SelectorContext:
# (Internal) Compile path. parent_map = None
def __init__(self, root):
self.root = root
def _compile(path): # --------------------------------------------------------------------
p = _cache.get(path)
if p is not None: ##
return p # Generate all matching objects.
p = Path(path)
if len(_cache) >= 100: def iterfind(elem, path, namespaces=None):
_cache.clear() # compile selector pattern
_cache[path] = p if path[-1:] == "/":
return p path = path + "*" # implicit all (FIXME: keep this?)
try:
selector = _cache[path]
except KeyError:
if len(_cache) > 100:
_cache.clear()
if path[:1] == "/":
raise SyntaxError("cannot use absolute path on element")
next = iter(xpath_tokenizer(path, namespaces)).__next__
token = next()
selector = []
while 1:
try:
selector.append(ops[token[0]](next, token))
except StopIteration:
raise SyntaxError("invalid path")
try:
token = next()
if token[0] == "/":
token = next()
except StopIteration:
break
_cache[path] = selector
# execute selector pattern
result = [elem]
context = _SelectorContext(elem)
for select in selector:
result = select(context, result)
return result
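
The new selection engine compiles a path into a list of ``select(context, result)`` generator functions and threads the result of each into the next. A simplified sketch of that chaining idea follows; ``make_child_selector``, ``make_attr_filter`` and ``run`` are hypothetical names, and the sketch omits the tokenizer, the cache and the parent map used by the real module.

```python
import xml.etree.ElementTree as ET


def make_child_selector(tag):
    # Analogous to a child step: yield direct children with this tag.
    def select(context, result):
        for elem in result:
            for child in elem:
                if child.tag == tag:
                    yield child
    return select


def make_attr_filter(key):
    # Analogous to an [@attribute] predicate: keep elements that have it.
    def select(context, result):
        for elem in result:
            if elem.get(key) is not None:
                yield elem
    return select


def run(selectors, root):
    # Each stage lazily consumes the previous stage's generator.
    result = [root]
    for select in selectors:
        result = select(None, result)
    return result


root = ET.fromstring('<root><item id="a"/><item/><item id="c"/></root>')
pipeline = [make_child_selector("item"), make_attr_filter("id")]
ids = [elem.get("id") for elem in run(pipeline, root)]
print(ids)
```

Since every stage is a generator, nothing is evaluated until the caller iterates, which is what lets a ``find``-style function stop after the first match.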
## ##
# Find first matching object. # Find first matching object.
def find(element, path): def find(elem, path, namespaces=None):
return _compile(path).find(element) try:
return next(iterfind(elem, path, namespaces))
## except StopIteration:
# Find text for first matching object. return None
def findtext(element, path, default=None):
return _compile(path).findtext(element, default)
## ##
# Find all matching objects. # Find all matching objects.
def findall(element, path): def findall(elem, path, namespaces=None):
return _compile(path).findall(element) return list(iterfind(elem, path, namespaces))
##
# Find text for first matching object.
def findtext(elem, path, default=None, namespaces=None):
try:
elem = next(iterfind(elem, path, namespaces))
return elem.text or ""
except StopIteration:
return default
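
The predicate forms handled by the new engine (``[@attribute]``, ``[@attribute='value']``, ``[tag='value']``, ``[index]``, ``[last()]``, the ``..`` parent step and prefix maps) can be exercised through the public ``findall`` method. The sample documents below are made up for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root>"
    "<item id='a'><name>first</name></item>"
    "<item><name>second</name></item>"
    "<item id='c'><name>third</name></item>"
    "</root>"
)

with_id = root.findall("item[@id]")                 # attribute present
only_c = root.findall("item[@id='c']")              # attribute equals value
by_child = root.findall("item[name='second']")      # child element text
second = root.findall("item[2]")                    # 1-based position
last = root.findall("item[last()]")                 # last sibling
parents = root.findall(".//name/..")                # parent step

# Namespace prefixes are resolved through an explicit prefix map.
doc = ET.fromstring("<r xmlns:x='urn:x'><x:a/></r>")
namespaced = doc.findall("x:a", {"x": "urn:x"})

print(len(with_id), only_c[0].get("id"), second[0].findtext("name"))
```

An unknown prefix, or a predicate the engine does not recognise, raises ``SyntaxError`` rather than silently returning no matches.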

File diff suppressed because it is too large


@ -1,10 +1,10 @@
# $Id: __init__.py 1821 2004-06-03 16:57:49Z fredrik $ # $Id: __init__.py 3375 2008-02-13 08:05:08Z fredrik $
# elementtree package # elementtree package
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2008 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -30,4 +30,4 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.


@ -836,7 +836,7 @@ EXTRAPLATDIR= @EXTRAPLATDIR@
MACHDEPS= $(PLATDIR) $(EXTRAPLATDIR) MACHDEPS= $(PLATDIR) $(EXTRAPLATDIR)
XMLLIBSUBDIRS= xml xml/dom xml/etree xml/parsers xml/sax XMLLIBSUBDIRS= xml xml/dom xml/etree xml/parsers xml/sax
LIBSUBDIRS= tkinter site-packages test test/output test/data \ LIBSUBDIRS= tkinter site-packages test test/output test/data \
test/decimaltestdata \ test/decimaltestdata test/xmltestdata \
encodings \ encodings \
email email/mime email/test email/test/data \ email email/mime email/test email/test/data \
html json json/tests http dbm xmlrpc \ html json json/tests http dbm xmlrpc \


@ -283,6 +283,9 @@ C-API
Library Library
------- -------
- Issue #6472: The xml.etree package is updated to ElementTree 1.3. The
cElementTree module is updated too.
- Issue #7774: Set sys.executable to an empty string if argv[0] has been set to - Issue #7774: Set sys.executable to an empty string if argv[0] has been set to
a non-existent program name and Python is unable to retrieve the real a non-existent program name and Python is unable to retrieve the real
program name program name

File diff suppressed because it is too large


@ -1006,8 +1006,6 @@ def add_files(db):
lib.add_file("audiotest.au") lib.add_file("audiotest.au")
lib.add_file("cfgparser.1") lib.add_file("cfgparser.1")
lib.add_file("sgml_input.html") lib.add_file("sgml_input.html")
lib.add_file("test.xml")
lib.add_file("test.xml.out")
lib.add_file("testtar.tar") lib.add_file("testtar.tar")
lib.add_file("test_difflib_expect.html") lib.add_file("test_difflib_expect.html")
lib.add_file("check_soundcard.vbs") lib.add_file("check_soundcard.vbs")
@ -1019,6 +1017,9 @@ def add_files(db):
lib.add_file("zipdir.zip") lib.add_file("zipdir.zip")
if dir=='decimaltestdata': if dir=='decimaltestdata':
lib.glob("*.decTest") lib.glob("*.decTest")
if dir=='xmltestdata':
lib.glob("*.xml")
lib.add_file("test.xml.out")
if dir=='output': if dir=='output':
lib.glob("test_*") lib.glob("test_*")
if dir=='idlelib': if dir=='idlelib':