Merged revisions 78838-78839,78917,78919,78934,78937 via svnmerge from

svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78838 | florent.xicluna | 2010-03-11 15:36:19 +0100 (Thu, 11 Mar 2010) | 2 lines

  Issue #6472: The xml.etree package is updated to ElementTree 1.3.  The cElementTree module is updated too.
........
  r78839 | florent.xicluna | 2010-03-11 16:55:11 +0100 (Thu, 11 Mar 2010) | 2 lines

  Fix repr of tree Element on windows.
........
  r78917 | florent.xicluna | 2010-03-13 12:18:49 +0100 (Sat, 13 Mar 2010) | 2 lines

  Move the xml test data to their own directory.
........
  r78919 | florent.xicluna | 2010-03-13 13:41:48 +0100 (Sat, 13 Mar 2010) | 2 lines

  Do not chdir when running test_xml_etree, and enhance the findfile helper.
........
  r78934 | florent.xicluna | 2010-03-13 18:56:19 +0100 (Sat, 13 Mar 2010) | 2 lines

  Update some parts of the xml.etree documentation.
........
  r78937 | florent.xicluna | 2010-03-13 21:30:15 +0100 (Sat, 13 Mar 2010) | 3 lines

  Add the keyword argument "method=None" to the .write() method and the tostring/tostringlist functions.
  Update the function, class and method signatures, according to the new convention.
........
Florent Xicluna 2010-03-13 23:24:31 +00:00
parent 9451a1c6ae
commit f15351d938
19 changed files with 3534 additions and 1327 deletions


@ -6,9 +6,9 @@
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com> .. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>
The Element type is a flexible container object, designed to store hierarchical The :class:`Element` type is a flexible container object, designed to store
data structures in memory. The type can be described as a cross between a list hierarchical data structures in memory. The type can be described as a cross
and a dictionary. between a list and a dictionary.
Each element has a number of properties associated with it: Each element has a number of properties associated with it:
@ -23,7 +23,8 @@ Each element has a number of properties associated with it:
* a number of child elements, stored in a Python sequence * a number of child elements, stored in a Python sequence
To create an element instance, use the Element or SubElement factory functions. To create an element instance, use the :class:`Element` constructor or the
:func:`SubElement` factory function.
The :class:`ElementTree` class can be used to wrap an element structure, and The :class:`ElementTree` class can be used to wrap an element structure, and
convert it from and to XML. convert it from and to XML.
@ -31,8 +32,14 @@ convert it from and to XML.
A C implementation of this API is available as :mod:`xml.etree.cElementTree`. A C implementation of this API is available as :mod:`xml.etree.cElementTree`.
See http://effbot.org/zone/element-index.htm for tutorials and links to other See http://effbot.org/zone/element-index.htm for tutorials and links to other
docs. Fredrik Lundh's page is also the location of the development version of the docs. Fredrik Lundh's page is also the location of the development version of
xml.etree.ElementTree. the xml.etree.ElementTree.
.. versionchanged:: 2.7
The ElementTree API is updated to 1.3. For more information, see
`Introducing ElementTree 1.3
<http://effbot.org/zone/elementtree-13-intro.htm>`_.
.. _elementtree-functions: .. _elementtree-functions:
@ -43,16 +50,16 @@ Functions
.. function:: Comment(text=None) .. function:: Comment(text=None)
Comment element factory. This factory function creates a special element Comment element factory. This factory function creates a special element
that will be serialized as an XML comment. The comment string can be either that will be serialized as an XML comment by the standard serializer. The
an ASCII-only :class:`bytes` object or a :class:`str` object. *text* is a comment string can be either a bytestring or a Unicode string. *text* is a
string containing the comment string. Returns an element instance string containing the comment string. Returns an element instance
representing a comment. representing a comment.
.. function:: dump(elem) .. function:: dump(elem)
Writes an element tree or element structure to sys.stdout. This function should Writes an element tree or element structure to sys.stdout. This function
be used for debugging only. should be used for debugging only.
The exact output format is implementation dependent. In this version, it's The exact output format is implementation dependent. In this version, it's
written as an ordinary XML file. written as an ordinary XML file.
@ -60,24 +67,20 @@ Functions
*elem* is an element tree or an individual element. *elem* is an element tree or an individual element.
.. function:: Element(tag, attrib={}, **extra)
Element factory. This function returns an object implementing the standard
Element interface. The exact class or type of that object is implementation
dependent, but it will always be compatible with the _ElementInterface class in
this module.
The element name, attribute names, and attribute values can be either an
ASCII-only :class:`bytes` object or a :class:`str` object. *tag* is the
element name. *attrib* is an optional dictionary, containing element
attributes. *extra* contains additional attributes, given as keyword
arguments. Returns an element instance.
.. function:: fromstring(text) .. function:: fromstring(text)
Parses an XML section from a string constant. Same as XML. *text* is a string Parses an XML section from a string constant. Same as XML. *text* is a
containing XML data. Returns an Element instance. string containing XML data. Returns an :class:`Element` instance.
.. function:: fromstringlist(sequence, parser=None)
Parses an XML document from a sequence of string fragments. *sequence* is a
list or other sequence containing XML data fragments. *parser* is an
optional parser instance. If not given, the standard :class:`XMLParser`
parser is used. Returns an :class:`Element` instance.
.. versionadded:: 2.7
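As an illustration of ``fromstringlist`` described above, here is a minimal sketch, assuming a current Python 3 interpreter rather than the 2.7 tree this commit targets. The fragment contents and split points are invented for the example; the point is only that the pieces are fed to the parser in order and may break in the middle of a tag.

```python
from xml.etree.ElementTree import fromstring, fromstringlist

# Hypothetical fragments, e.g. chunks that arrived separately.
# The split points are arbitrary and fall mid-text and mid-tag.
fragments = [
    "<root><item id='1'>fi",
    "rst</item><item id=",
    "'2'>second</item></root>",
]

# Parse the sequence directly...
root = fromstringlist(fragments)

# ...which should give the same tree as joining first and using fromstring().
same = fromstring("".join(fragments))

print([item.text for item in root])
print([item.get("id") for item in same])
```

Passing the fragments as a sequence avoids building one large string first, which may matter when the data is already chunked.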
.. function:: iselement(element) .. function:: iselement(element)
@ -86,12 +89,14 @@ Functions
element instance. Returns a true value if this is an element object. element instance. Returns a true value if this is an element object.
.. function:: iterparse(source, events=None) .. function:: iterparse(source, events=None, parser=None)
Parses an XML section into an element tree incrementally, and reports what's Parses an XML section into an element tree incrementally, and reports what's
going on to the user. *source* is a filename or file object containing XML data. going on to the user. *source* is a filename or file object containing XML
*events* is a list of events to report back. If omitted, only "end" events are data. *events* is a list of events to report back. If omitted, only "end"
reported. Returns an :term:`iterator` providing ``(event, elem)`` pairs. events are reported. *parser* is an optional parser instance. If not
given, the standard :class:`XMLParser` parser is used. Returns an
:term:`iterator` providing ``(event, elem)`` pairs.
.. note:: .. note::
@ -107,187 +112,258 @@ Functions
.. function:: parse(source, parser=None) .. function:: parse(source, parser=None)
Parses an XML section into an element tree. *source* is a filename or file Parses an XML section into an element tree. *source* is a filename or file
object containing XML data. *parser* is an optional parser instance. If not object containing XML data. *parser* is an optional parser instance. If
given, the standard XMLTreeBuilder parser is used. Returns an ElementTree not given, the standard :class:`XMLParser` parser is used. Returns an
instance. :class:`ElementTree` instance.
.. function:: ProcessingInstruction(target, text=None) .. function:: ProcessingInstruction(target, text=None)
PI element factory. This factory function creates a special element that will PI element factory. This factory function creates a special element that
be serialized as an XML processing instruction. *target* is a string containing will be serialized as an XML processing instruction. *target* is a string
the PI target. *text* is a string containing the PI contents, if given. Returns containing the PI target. *text* is a string containing the PI contents, if
an element instance, representing a processing instruction. given. Returns an element instance, representing a processing instruction.
.. function:: register_namespace(prefix, uri)
Registers a namespace prefix. The registry is global, and any existing
mapping for either the given prefix or the namespace URI will be removed.
*prefix* is a namespace prefix. *uri* is a namespace uri. Tags and
attributes in this namespace will be serialized with the given prefix, if at
all possible.
.. versionadded:: 2.7
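A short sketch of how ``register_namespace`` affects serialization, assuming a current Python 3 interpreter. The namespace URI and the ``inv`` prefix are hypothetical. Note that the registry is global, so this call affects every later serialization in the same process.

```python
from xml.etree.ElementTree import (
    Element,
    SubElement,
    register_namespace,
    tostring,
)

# Hypothetical namespace URI and prefix, chosen only for this example.
NS = "http://example.com/ns/inventory"
register_namespace("inv", NS)  # global side effect

# Tags in the namespace are written in the {uri}local form.
root = Element("{%s}stock" % NS)
SubElement(root, "{%s}item" % NS, name="widget")

# encoding="unicode" is the Python 3 way to get a str back.
xml_text = tostring(root, encoding="unicode")
print(xml_text)
```

Without the registration, the serializer would still produce valid XML but with an automatically generated prefix.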
.. function:: SubElement(parent, tag, attrib={}, **extra) .. function:: SubElement(parent, tag, attrib={}, **extra)
Subelement factory. This function creates an element instance, and appends it Subelement factory. This function creates an element instance, and appends
to an existing element. it to an existing element.
The element name, attribute names, and attribute values can be an ASCII-only The element name, attribute names, and attribute values can be either
:class:`bytes` object or a :class:`str` object. *parent* is the parent bytestrings or Unicode strings. *parent* is the parent element. *tag* is
element. *tag* is the subelement name. *attrib* is an optional dictionary, the subelement name. *attrib* is an optional dictionary, containing element
containing element attributes. *extra* contains additional attributes, given attributes. *extra* contains additional attributes, given as keyword
as keyword arguments. Returns an element instance. arguments. Returns an element instance.
.. function:: tostring(element, encoding=None) .. function:: tostring(element, encoding=None, method=None)
Generates a string representation of an XML element, including all subelements. Generates a string representation of an XML element, including all
*element* is an Element instance. *encoding* is the output encoding (default is subelements. *element* is an :class:`Element` instance. *encoding* is the
US-ASCII). Returns an encoded string containing the XML data. output encoding (default is None). *method* is either ``"xml"``,
``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally)
encoded string containing the XML data.
.. function:: XML(text) .. function:: tostringlist(element, encoding=None, method=None)
Generates a string representation of an XML element, including all
subelements. *element* is an :class:`Element` instance. *encoding* is the
output encoding (default is None). *method* is either ``"xml"``,
``"html"`` or ``"text"`` (default is ``"xml"``). Returns a sequence object
containing the XML data.
.. versionadded:: 2.7
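The three *method* values can be compared side by side. The following is a sketch under the assumption of a current Python 3 interpreter, where ``encoding="unicode"`` requests a ``str`` result (the diff above documents the 2.7 signature with ``encoding=None``). The sample markup is made up.

```python
from xml.etree.ElementTree import fromstring, tostring, tostringlist

# Made-up sample element containing mixed content and an empty tag.
elem = fromstring("<p>Hello <b>world</b>!<br/></p>")

as_xml = tostring(elem, encoding="unicode", method="xml")
as_html = tostring(elem, encoding="unicode", method="html")
as_text = tostring(elem, encoding="unicode", method="text")

# tostringlist() returns the same data as a sequence of fragments.
pieces = tostringlist(elem, encoding="unicode")

print(as_xml)
print(as_html)
print(as_text)
print("".join(pieces) == as_xml)
```

The only documented guarantee about the ``tostringlist`` result is that joining it reproduces the ``tostring`` output, so code should not rely on where the fragments are split.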
.. function:: XML(text, parser=None)
Parses an XML section from a string constant. This function can be used to Parses an XML section from a string constant. This function can be used to
embed "XML literals" in Python code. *text* is a string containing XML data. embed "XML literals" in Python code. *text* is a string containing XML
Returns an Element instance. data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns an :class:`Element` instance.
.. function:: XMLID(text) .. function:: XMLID(text, parser=None)
Parses an XML section from a string constant, and also returns a dictionary Parses an XML section from a string constant, and also returns a dictionary
which maps from element id:s to elements. *text* is a string containing XML which maps from element id:s to elements. *text* is a string containing XML
data. Returns a tuple containing an Element instance and a dictionary. data. *parser* is an optional parser instance. If not given, the standard
:class:`XMLParser` parser is used. Returns a tuple containing an
:class:`Element` instance and a dictionary.
.. _elementtree-element-interface: .. _elementtree-element-objects:
The Element Interface Element Objects
--------------------- ---------------
Element objects returned by Element or SubElement have the following methods
and attributes.
.. attribute:: Element.tag .. class:: Element(tag, attrib={}, **extra)
A string identifying what kind of data this element represents (the element Element class. This class defines the Element interface, and provides a
type, in other words). reference implementation of this interface.
The element name, attribute names, and attribute values can be either
bytestrings or Unicode strings. *tag* is the element name. *attrib* is
an optional dictionary, containing element attributes. *extra* contains
additional attributes, given as keyword arguments.
.. attribute:: Element.text .. attribute:: tag
The *text* attribute can be used to hold additional data associated with the A string identifying what kind of data this element represents (the
element. As the name implies this attribute is usually a string but may be any element type, in other words).
application-specific object. If the element is created from an XML file the
attribute will contain any text found between the element tags.
.. attribute:: Element.tail .. attribute:: text
The *tail* attribute can be used to hold additional data associated with the The *text* attribute can be used to hold additional data associated with
element. This attribute is usually a string but may be any application-specific the element. As the name implies this attribute is usually a string but
object. If the element is created from an XML file the attribute will contain may be any application-specific object. If the element is created from
any text found after the element's end tag and before the next tag. an XML file the attribute will contain any text found between the element
tags.
.. attribute:: Element.attrib .. attribute:: tail
A dictionary containing the element's attributes. Note that while the *attrib* The *tail* attribute can be used to hold additional data associated with
value is always a real mutable Python dictionary, an ElementTree implementation the element. This attribute is usually a string but may be any
may choose to use another internal representation, and create the dictionary application-specific object. If the element is created from an XML file
only if someone asks for it. To take advantage of such implementations, use the the attribute will contain any text found after the element's end tag and
dictionary methods below whenever possible. before the next tag.
.. attribute:: attrib
A dictionary containing the element's attributes. Note that while the
*attrib* value is always a real mutable Python dictionary, an ElementTree
implementation may choose to use another internal representation, and
create the dictionary only if someone asks for it. To take advantage of
such implementations, use the dictionary methods below whenever possible.
The following dictionary-like methods work on the element attributes. The following dictionary-like methods work on the element attributes.
.. method:: Element.clear() .. method:: clear()
Resets an element. This function removes all subelements, clears all Resets an element. This function removes all subelements, clears all
attributes, and sets the text and tail attributes to None. attributes, and sets the text and tail attributes to None.
.. method:: Element.get(key, default=None) .. method:: get(key, default=None)
Gets the element attribute named *key*. Gets the element attribute named *key*.
Returns the attribute value, or *default* if the attribute was not found. Returns the attribute value, or *default* if the attribute was not found.
.. method:: Element.items() .. method:: items()
Returns the element attributes as a sequence of (name, value) pairs. The Returns the element attributes as a sequence of (name, value) pairs. The
attributes are returned in an arbitrary order. attributes are returned in an arbitrary order.
.. method:: Element.keys() .. method:: keys()
Returns the elements attribute names as a list. The names are returned in an Returns the elements attribute names as a list. The names are returned
arbitrary order. in an arbitrary order.
.. method:: Element.set(key, value) .. method:: set(key, value)
Set the attribute *key* on the element to *value*. Set the attribute *key* on the element to *value*.
The following methods work on the element's children (subelements). The following methods work on the element's children (subelements).
.. method:: Element.append(subelement) .. method:: append(subelement)
Adds the element *subelement* to the end of this elements internal list of Adds the element *subelement* to the end of this elements internal list
subelements. of subelements.
.. method:: Element.find(match) .. method:: extend(subelements)
Finds the first subelement matching *match*. *match* may be a tag name or path. Appends *subelements* from a sequence object with zero or more elements.
Returns an element instance or ``None``. Raises :exc:`AssertionError` if a subelement is not a valid object.
.. versionadded:: 2.7
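A minimal sketch contrasting ``append`` with the new ``extend``, assuming a current Python 3 interpreter. The tag and attribute names are invented. The exception raised for an invalid subelement differs between versions, so the sketch does not rely on it.

```python
from xml.etree.ElementTree import Element

root = Element("list")

# append() adds one subelement at a time.
root.append(Element("item", id="0"))

# extend() adds every element of a sequence in one call.
more = [Element("item", id=str(n)) for n in range(1, 4)]
root.extend(more)

ids = [child.get("id") for child in root]
print(len(root), ids)
```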
.. method:: Element.findall(match) .. method:: find(match)
Finds all subelements matching *match*. *match* may be a tag name or path. Finds the first subelement matching *match*. *match* may be a tag name
Returns an iterable yielding all matching elements in document order. or path. Returns an element instance or ``None``.
.. method:: Element.findtext(condition, default=None) .. method:: findall(match)
Finds text for the first subelement matching *condition*. *condition* may be a Finds all matching subelements, by tag name or path. Returns a list
tag name or path. Returns the text content of the first matching element, or containing all matching elements in document order.
*default* if no element was found. Note that if the matching element has no
text content an empty string is returned.
.. method:: Element.getchildren() .. method:: findtext(match, default=None)
Returns all subelements. The elements are returned in document order. Finds text for the first subelement matching *match*. *match* may be
a tag name or path. Returns the text content of the first matching
element, or *default* if no element was found. Note that if the matching
element has no text content an empty string is returned.
.. method:: Element.getiterator(tag=None) .. method:: getchildren()
Creates a tree iterator with the current element as the root. The iterator .. deprecated:: 2.7
iterates over this element and all elements below it, in document (depth first) Use ``list(elem)`` or iteration.
order. If *tag* is not ``None`` or ``'*'``, only elements whose tag equals
*tag* are returned from the iterator.
.. method:: Element.insert(index, element) .. method:: getiterator(tag=None)
.. deprecated:: 2.7
Use method :meth:`Element.iter` instead.
.. method:: insert(index, element)
Inserts a subelement at the given position in this element. Inserts a subelement at the given position in this element.
.. method:: Element.makeelement(tag, attrib) .. method:: iter(tag=None)
Creates a new element object of the same type as this element. Do not call this Creates a tree :term:`iterator` with the current element as the root.
method, use the SubElement factory function instead. The iterator iterates over this element and all elements below it, in
document (depth first) order. If *tag* is not ``None`` or ``'*'``, only
elements whose tag equals *tag* are returned from the iterator. If the
tree structure is modified during iteration, the result is undefined.
.. method:: Element.remove(subelement) .. method:: iterfind(match)
Removes *subelement* from the element. Unlike the findXYZ methods this method Finds all matching subelements, by tag name or path. Returns an iterable
compares elements based on the instance identity, not on tag value or contents. yielding all matching elements in document order.
Element objects also support the following sequence type methods for working .. versionadded:: 2.7
with subelements: :meth:`__delitem__`, :meth:`__getitem__`, :meth:`__setitem__`,
:meth:`__len__`.
Caution: Because Element objects do not define a :meth:`__bool__` method,
elements with no subelements will test as ``False``. :: .. method:: itertext()
Creates a text iterator. The iterator loops over this element and all
subelements, in document order, and returns all inner text.
.. versionadded:: 2.7
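The difference between ``iter``, ``iterfind`` and ``itertext`` is easiest to see on a small tree. This is a sketch assuming a current Python 3 interpreter; the document structure is made up for the example.

```python
from xml.etree.ElementTree import fromstring

# Made-up document with one nested and one top-level <p>.
root = fromstring(
    "<doc><sec><p>one <em>two</em> three</p></sec><p>four</p></doc>"
)

# iter() walks the element itself and everything below it, depth first.
all_tags = [el.tag for el in root.iter()]

# With a tag argument, iter() filters at any depth.
p_texts = [p.text for p in root.iter("p")]

# iterfind() matches by tag name or path, relative to the element,
# so a bare tag name only sees direct children.
direct_p = [p.text for p in root.iterfind("p")]
nested_p = [p.text for p in root.iterfind("sec/p")]

# itertext() yields all inner text (text and tail) in document order.
inner_text = "".join(root.itertext())

print(all_tags)
print(p_texts, direct_p, nested_p)
print(inner_text)
```

Because these methods return lazy iterators, wrap them in ``list()`` when the tree will be modified during processing.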
.. method:: makeelement(tag, attrib)
Creates a new element object of the same type as this element. Do not
call this method, use the :func:`SubElement` factory function instead.
.. method:: remove(subelement)
Removes *subelement* from the element. Unlike the find\* methods this
method compares elements based on the instance identity, not on tag value
or contents.
:class:`Element` objects also support the following sequence type methods
for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`,
:meth:`__setitem__`, :meth:`__len__`.
Caution: Elements with no subelements will test as ``False``. This behavior
will change in future versions. Use specific ``len(elem)`` or ``elem is
None`` test instead. ::
element = root.find('foo') element = root.find('foo')
@ -306,11 +382,12 @@ ElementTree Objects
.. class:: ElementTree(element=None, file=None) .. class:: ElementTree(element=None, file=None)
ElementTree wrapper class. This class represents an entire element hierarchy, ElementTree wrapper class. This class represents an entire element
and adds some extra support for serialization to and from standard XML. hierarchy, and adds some extra support for serialization to and from
standard XML.
*element* is the root element. The tree is initialized with the contents of the *element* is the root element. The tree is initialized with the contents
XML *file* if given. of the XML *file* if given.
.. method:: _setroot(element) .. method:: _setroot(element)
@ -320,56 +397,73 @@ ElementTree Objects
care. *element* is an element instance. care. *element* is an element instance.
.. method:: find(path) .. method:: find(match)
Finds the first toplevel element with given tag. Same as Finds the first toplevel element matching *match*. *match* may be a tag
getroot().find(path). *path* is the element to look for. Returns the name or path. Same as getroot().find(match). Returns the first matching
first matching element, or ``None`` if no element was found. element, or ``None`` if no element was found.
.. method:: findall(path) .. method:: findall(match)
Finds all toplevel elements with the given tag. Same as Finds all matching subelements, by tag name or path. Same as
getroot().findall(path). *path* is the element to look for. Returns a getroot().findall(match). *match* may be a tag name or path. Returns a
list or :term:`iterator` containing all matching elements, in document list containing all matching elements, in document order.
order.
.. method:: findtext(path, default=None) .. method:: findtext(match, default=None)
Finds the element text for the first toplevel element with given tag. Finds the element text for the first toplevel element with given tag.
Same as getroot().findtext(path). *path* is the toplevel element to look Same as getroot().findtext(match). *match* may be a tag name or path.
for. *default* is the value to return if the element was not *default* is the value to return if the element was not found. Returns
found. Returns the text content of the first matching element, or the the text content of the first matching element, or the default value no
default value no element was found. Note that if the element has is element was found. Note that if the element is found, but has no text
found, but has no text content, this method returns an empty string. content, this method returns an empty string.
.. method:: getiterator(tag=None) .. method:: getiterator(tag=None)
.. deprecated:: 2.7
Use method :meth:`ElementTree.iter` instead.
.. method:: getroot()
Returns the root element for this tree.
.. method:: iter(tag=None)
Creates and returns a tree iterator for the root element. The iterator Creates and returns a tree iterator for the root element. The iterator
loops over all elements in this tree, in section order. *tag* is the tag loops over all elements in this tree, in section order. *tag* is the tag
to look for (default is to return all elements) to look for (default is to return all elements)
.. method:: getroot() .. method:: iterfind(match)
Returns the root element for this tree. Finds all matching subelements, by tag name or path. Same as
getroot().iterfind(match). Returns an iterable yielding all matching
elements in document order.
.. versionadded:: 2.7
.. method:: parse(source, parser=None) .. method:: parse(source, parser=None)
Loads an external XML section into this element tree. *source* is a file Loads an external XML section into this element tree. *source* is a file
name or file object. *parser* is an optional parser instance. If not name or file object. *parser* is an optional parser instance. If not
given, the standard XMLTreeBuilder parser is used. Returns the section given, the standard XMLParser parser is used. Returns the section
root element. root element.
.. method:: write(file, encoding=None) .. method:: write(file, encoding=None, xml_declaration=None, method=None)
Writes the element tree to a file, as XML. *file* is a file name, or a Writes the element tree to a file, as XML. *file* is a file name, or a
file object opened for writing. *encoding* [1]_ is the output encoding file object opened for writing. *encoding* [1]_ is the output encoding
(default is US-ASCII). (default is None). *xml_declaration* controls if an XML declaration
should be added to the file. Use False for never, True for always, None
for only if not US-ASCII or UTF-8 (default is None). *method* is either
``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an
(optionally) encoded string.
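The effect of *xml_declaration* can be checked by writing to an in-memory buffer. This is a sketch assuming a current Python 3 interpreter; the ``serialize`` helper and the element names are hypothetical and exist only for this example.

```python
import io
from xml.etree.ElementTree import Element, ElementTree, SubElement

root = Element("greeting")
SubElement(root, "to").text = "caf\u00e9"  # non-ASCII text on purpose
tree = ElementTree(root)


def serialize(**kwargs):
    """Hypothetical helper: write the tree to bytes with the given options."""
    buf = io.BytesIO()
    tree.write(buf, **kwargs)
    return buf.getvalue()


# True: always emit the declaration. False: never emit it.
with_decl = serialize(encoding="utf-8", xml_declaration=True)
without_decl = serialize(encoding="utf-8", xml_declaration=False)

# None (the default): emit it only if the encoding is not US-ASCII or UTF-8.
auto_utf8 = serialize(encoding="utf-8")
auto_latin1 = serialize(encoding="iso-8859-1")

for data in (with_decl, without_decl, auto_utf8, auto_latin1):
    print(data.startswith(b"<?xml"))
```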
This is the XML file that is going to be manipulated:: This is the XML file that is going to be manipulated::
@ -388,13 +482,13 @@ Example of changing the attribute "target" of every link in first paragraph::
>>> from xml.etree.ElementTree import ElementTree >>> from xml.etree.ElementTree import ElementTree
>>> tree = ElementTree() >>> tree = ElementTree()
>>> tree.parse("index.xhtml") >>> tree.parse("index.xhtml")
<Element html at b7d3f1ec> <Element 'html' at 0xb77e6fac>
>>> p = tree.find("body/p") # Finds first occurrence of tag p in body >>> p = tree.find("body/p") # Finds first occurrence of tag p in body
>>> p >>> p
<Element p at 8416e0c> <Element 'p' at 0xb77ec26c>
>>> links = p.getiterator("a") # Returns list of all links >>> links = list(p.iter("a")) # Returns list of all links
>>> links >>> links
[<Element a at b7d4f9ec>, <Element a at b7d4fb0c>] [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
>>> for i in links: # Iterates through all found links >>> for i in links: # Iterates through all found links
... i.attrib["target"] = "blank" ... i.attrib["target"] = "blank"
>>> tree.write("output.xhtml") >>> tree.write("output.xhtml")
@ -407,12 +501,12 @@ QName Objects
.. class:: QName(text_or_uri, tag=None) .. class:: QName(text_or_uri, tag=None)
QName wrapper. This can be used to wrap a QName attribute value, in order to QName wrapper. This can be used to wrap a QName attribute value, in order
get proper namespace handling on output. *text_or_uri* is a string containing to get proper namespace handling on output. *text_or_uri* is a string
the QName value, in the form {uri}local, or, if the tag argument is given, the containing the QName value, in the form {uri}local, or, if the tag argument
URI part of a QName. If *tag* is given, the first argument is interpreted as an is given, the URI part of a QName. If *tag* is given, the first argument is
URI, and this argument is interpreted as a local name. :class:`QName` instances interpreted as an URI, and this argument is interpreted as a local name.
are opaque. :class:`QName` instances are opaque.
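A small sketch of the two ``QName`` constructor forms and of wrapping an attribute value, assuming a current Python 3 interpreter. The namespace URI and the names ``entry``, ``kind`` and ``special`` are hypothetical. The generated prefix depends on the global namespace registry, so the sketch does not depend on a particular one.

```python
from xml.etree.ElementTree import Element, QName, tostring

NS = "http://example.com/ns/types"  # hypothetical namespace URI

# Both forms describe the same qualified name.
q1 = QName("{%s}special" % NS)
q2 = QName(NS, "special")

# Wrapping an attribute value lets the serializer emit a prefixed name
# together with the matching xmlns declaration.
elem = Element("entry")
elem.set("kind", q2)

xml_text = tostring(elem, encoding="unicode")
print(q1.text)
print(xml_text)
```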
.. _elementtree-treebuilder-objects: .. _elementtree-treebuilder-objects:
@ -423,29 +517,29 @@ TreeBuilder Objects
.. class:: TreeBuilder(element_factory=None) .. class:: TreeBuilder(element_factory=None)
Generic element structure builder. This builder converts a sequence of start, Generic element structure builder. This builder converts a sequence of
data, and end method calls to a well-formed element structure. You can use this start, data, and end method calls to a well-formed element structure. You
class to build an element structure using a custom XML parser, or a parser for can use this class to build an element structure using a custom XML parser,
some other XML-like format. The *element_factory* is called to create new or a parser for some other XML-like format. The *element_factory* is called
Element instances when given. to create new :class:`Element` instances when given.
.. method:: close() .. method:: close()
Flushes the parser buffers, and returns the toplevel document Flushes the builder buffers, and returns the toplevel document
element. Returns an Element instance. element. Returns an :class:`Element` instance.
.. method:: data(data) .. method:: data(data)
Adds text to the current element. *data* is a string. This should be Adds text to the current element. *data* is a string. This should be
either an ASCII-only :class:`bytes` object or a :class:`str` object. either a bytestring, or a Unicode string.
.. method:: end(tag) .. method:: end(tag)
Closes the current element. *tag* is the element name. Returns the closed Closes the current element. *tag* is the element name. Returns the
element. closed element.
.. method:: start(tag, attrs) .. method:: start(tag, attrs)
@ -454,18 +548,32 @@ TreeBuilder Objects
containing element attributes. Returns the opened element. containing element attributes. Returns the opened element.
.. _elementtree-xmltreebuilder-objects: In addition, a custom :class:`TreeBuilder` object can provide the
following method:
XMLTreeBuilder Objects .. method:: doctype(name, pubid, system)
----------------------
Handles a doctype declaration. *name* is the doctype name. *pubid* is
the public identifier. *system* is the system identifier. This method
does not exist on the default :class:`TreeBuilder` class.
.. versionadded:: 2.7
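The start, data, end and close sequence described above can be driven by hand, without any parser. This is a minimal sketch assuming a current Python 3 interpreter and the default element factory; the tag and attribute names are made up.

```python
from xml.etree.ElementTree import TreeBuilder, tostring

builder = TreeBuilder()

# Open the root element and one child, mirroring what a parser would call.
builder.start("config", {"version": "1"})
builder.start("option", {"name": "debug"})

# Text may arrive in several pieces; the builder joins them.
builder.data("tr")
builder.data("ue")

closed = builder.end("option")  # returns the element that was just closed
builder.end("config")

# close() returns the toplevel element.
root = builder.close()

print(tostring(root, encoding="unicode"))
```

A custom parser for an XML-like format only needs to make these calls in document order to obtain a regular element tree.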
.. class:: XMLTreeBuilder(html=0, target=None) .. _elementtree-xmlparser-objects:
Element structure builder for XML source data, based on the expat parser. *html* XMLParser Objects
are predefined HTML entities. This flag is not supported by the current -----------------
implementation. *target* is the target object. If omitted, the builder uses an
instance of the standard TreeBuilder class.
.. class:: XMLParser(html=0, target=None, encoding=None)
:class:`Element` structure builder for XML source data, based on the expat
parser. *html* are predefined HTML entities. This flag is not supported by
the current implementation. *target* is the target object. If omitted, the
builder uses an instance of the standard TreeBuilder class. *encoding* [1]_
is optional. If given, the value overrides the encoding specified in the
XML file.
.. method:: close() .. method:: close()
@ -475,22 +583,23 @@ XMLTreeBuilder Objects
.. method:: doctype(name, pubid, system) .. method:: doctype(name, pubid, system)
Handles a doctype declaration. *name* is the doctype name. *pubid* is the .. deprecated:: 2.7
public identifier. *system* is the system identifier. Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder
target.
.. method:: feed(data) .. method:: feed(data)
Feeds data to the parser. *data* is encoded data. Feeds data to the parser. *data* is encoded data.
:meth:`XMLTreeBuilder.feed` calls *target*\'s :meth:`start` method :meth:`XMLParser.feed` calls *target*\'s :meth:`start` method
for each opening tag, its :meth:`end` method for each closing tag, for each opening tag, its :meth:`end` method for each closing tag,
and data is processed by method :meth:`data`. :meth:`XMLTreeBuilder.close` and data is processed by method :meth:`data`. :meth:`XMLParser.close`
calls *target*\'s method :meth:`close`. calls *target*\'s method :meth:`close`.
:class:`XMLTreeBuilder` can be used not only for building a tree structure. :class:`XMLParser` can be used not only for building a tree structure.
This is an example of counting the maximum depth of an XML file:: This is an example of counting the maximum depth of an XML file::
>>> from xml.etree.ElementTree import XMLTreeBuilder >>> from xml.etree.ElementTree import XMLParser
>>> class MaxDepth: # The target object of the parser >>> class MaxDepth: # The target object of the parser
... maxDepth = 0 ... maxDepth = 0
... depth = 0 ... depth = 0
@ -506,7 +615,7 @@ This is an example of counting the maximum depth of an XML file::
... return self.maxDepth ... return self.maxDepth
... ...
>>> target = MaxDepth() >>> target = MaxDepth()
>>> parser = XMLTreeBuilder(target=target) >>> parser = XMLParser(target=target)
>>> exampleXml = """ >>> exampleXml = """
... <a> ... <a>
... <b> ... <b>
@ -529,4 +638,3 @@ This is an example of counting the maximum depth of an XML file::
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and http://www.iana.org/assignments/character-sets. and http://www.iana.org/assignments/character-sets.
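
The deprecation note above moves doctype handling from the parser to the target object. A minimal sketch of that arrangement follows, assuming the parser calls ``doctype(name, pubid, system)`` on any target that defines it. The ``DoctypeRecorder`` class and the sample document are hypothetical and only serve the illustration.

```python
from xml.etree.ElementTree import XMLParser


class DoctypeRecorder:
    """Hypothetical target: records the doctype and every opening tag."""

    def __init__(self):
        self.doctype_info = None
        self.tags = []

    def doctype(self, name, pubid, system):
        # Called by the parser when it meets a <!DOCTYPE ...> declaration.
        self.doctype_info = (name, pubid, system)

    def start(self, tag, attrib):
        self.tags.append(tag)

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        # Whatever close() returns is handed back by XMLParser.close().
        return self.doctype_info, self.tags


parser = XMLParser(target=DoctypeRecorder())
parser.feed(
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    '<html><body>text</body></html>'
)
doctype_info, tags = parser.close()
```

Because the target decides what ``close()`` returns, the same parser can build a tree, collect statistics, or record the doctype as shown here, without subclassing ``XMLParser``.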


@ -1,23 +0,0 @@
:mod:`xml.etree` --- The ElementTree API for XML
================================================
.. module:: xml.etree
:synopsis: Package containing common ElementTree modules.
.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>
The ElementTree package is a simple, efficient, and quite popular library for
XML manipulation in Python. The :mod:`xml.etree` package contains the most
common components from the ElementTree API library. In the current release,
this package contains the :mod:`ElementTree`, :mod:`ElementPath`, and
:mod:`ElementInclude` modules from the full ElementTree distribution.
.. XXX To be continued!
.. seealso::
`ElementTree Overview <http://effbot.org/tag/elementtree>`_
The home page for :mod:`ElementTree`. This includes links to additional
documentation, alternative implementations, and other add-ons.


@ -402,12 +402,14 @@ def temp_cwd(name='tempcwd', quiet=False):
rmtree(name) rmtree(name)
def findfile(file, here=__file__): def findfile(file, here=__file__, subdir=None):
"""Try to find a file on sys.path and the working directory. If it is not """Try to find a file on sys.path and the working directory. If it is not
found the argument passed to the function is returned (this does not found the argument passed to the function is returned (this does not
necessarily signal failure; could still be the legitimate path).""" necessarily signal failure; could still be the legitimate path)."""
if os.path.isabs(file): if os.path.isabs(file):
return file return file
if subdir is not None:
file = os.path.join(subdir, file)
path = sys.path path = sys.path
path = [os.path.dirname(here)] + path path = [os.path.dirname(here)] + path
for dn in path: for dn in path:


@ -1,9 +1,7 @@
# test for xml.dom.minidom # test for xml.dom.minidom
import os
import sys
import pickle import pickle
from test.support import verbose, run_unittest from test.support import verbose, run_unittest, findfile
import unittest import unittest
import xml.dom import xml.dom
@ -14,12 +12,8 @@ from xml.dom.minidom import parse, Node, Document, parseString
from xml.dom.minidom import getDOMImplementation from xml.dom.minidom import getDOMImplementation
if __name__ == "__main__": tstfile = findfile("test.xml", subdir="xmltestdata")
base = sys.argv[0]
else:
base = __file__
tstfile = os.path.join(os.path.dirname(base), "test.xml")
del base
# The tests of DocumentType importing use these helpers to construct # The tests of DocumentType importing use these helpers to construct
# the documents to work with, since not all DOM builders actually # the documents to work with, since not all DOM builders actually


@ -15,7 +15,9 @@ from xml.sax.xmlreader import InputSource, AttributesImpl, AttributesNSImpl
from io import StringIO from io import StringIO
from test.support import findfile, run_unittest from test.support import findfile, run_unittest
import unittest import unittest
import os
TEST_XMLFILE = findfile("test.xml", subdir="xmltestdata")
TEST_XMLFILE_OUT = findfile("test.xml.out", subdir="xmltestdata")
ns_uri = "http://www.python.org/xml-ns/saxtest/" ns_uri = "http://www.python.org/xml-ns/saxtest/"
@ -311,7 +313,7 @@ class XMLFilterBaseTest(unittest.TestCase):
# #
# =========================================================================== # ===========================================================================
xml_test_out = open(findfile("test.xml.out")).read() xml_test_out = open(TEST_XMLFILE_OUT).read()
class ExpatReaderTest(XmlTestBase): class ExpatReaderTest(XmlTestBase):
@ -323,7 +325,7 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(open(findfile("test.xml"))) parser.parse(open(TEST_XMLFILE))
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -452,7 +454,7 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(findfile("test.xml")) parser.parse(TEST_XMLFILE)
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -462,7 +464,7 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(InputSource(findfile("test.xml"))) parser.parse(InputSource(TEST_XMLFILE))
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -473,7 +475,7 @@ class ExpatReaderTest(XmlTestBase):
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
inpsrc = InputSource() inpsrc = InputSource()
inpsrc.setByteStream(open(findfile("test.xml"))) inpsrc.setByteStream(open(TEST_XMLFILE))
parser.parse(inpsrc) parser.parse(inpsrc)
self.assertEquals(result.getvalue(), xml_test_out) self.assertEquals(result.getvalue(), xml_test_out)
@ -534,9 +536,9 @@ class ExpatReaderTest(XmlTestBase):
xmlgen = XMLGenerator(result) xmlgen = XMLGenerator(result)
parser = create_parser() parser = create_parser()
parser.setContentHandler(xmlgen) parser.setContentHandler(xmlgen)
parser.parse(findfile("test.xml")) parser.parse(TEST_XMLFILE)
self.assertEquals(parser.getSystemId(), findfile("test.xml")) self.assertEquals(parser.getSystemId(), TEST_XMLFILE)
self.assertEquals(parser.getPublicId(), None) self.assertEquals(parser.getPublicId(), None)

File diff suppressed because it is too large


@ -1,31 +1,11 @@
# xml.etree test for cElementTree # xml.etree test for cElementTree
import doctest
import sys
from test import support from test import support
ET = support.import_module('xml.etree.cElementTree') cET = support.import_module('xml.etree.cElementTree')
SAMPLE_XML = """
<body>
<tag>text</tag>
<tag />
<section>
<tag>subtext</tag>
</section>
</body>
"""
SAMPLE_XML_NS = """ # cElementTree specific tests
<body xmlns="http://effbot.org/ns">
<tag>text</tag>
<tag />
<section>
<tag>subtext</tag>
</section>
</body>
"""
def sanity(): def sanity():
""" """
@ -34,187 +14,26 @@ def sanity():
>>> from xml.etree import cElementTree >>> from xml.etree import cElementTree
""" """
def check_method(method):
if not hasattr(method, '__call__'):
print(method, "not callable")
def serialize(ET, elem):
import io
file = io.StringIO()
tree = ET.ElementTree(elem)
tree.write(file)
return file.getvalue()
def summarize(elem):
return elem.tag
def summarize_list(seq):
return list(map(summarize, seq))
def interface():
"""
Test element tree interface.
>>> element = ET.Element("tag", key="value")
>>> tree = ET.ElementTree(element)
Make sure all standard element methods exist.
>>> check_method(element.append)
>>> check_method(element.insert)
>>> check_method(element.remove)
>>> check_method(element.getchildren)
>>> check_method(element.find)
>>> check_method(element.findall)
>>> check_method(element.findtext)
>>> check_method(element.clear)
>>> check_method(element.get)
>>> check_method(element.set)
>>> check_method(element.keys)
>>> check_method(element.items)
>>> check_method(element.getiterator)
Basic method sanity checks.
>>> serialize(ET, element) # 1
'<tag key="value" />'
>>> subelement = ET.Element("subtag")
>>> element.append(subelement)
>>> serialize(ET, element) # 2
'<tag key="value"><subtag /></tag>'
>>> element.insert(0, subelement)
>>> serialize(ET, element) # 3
'<tag key="value"><subtag /><subtag /></tag>'
>>> element.remove(subelement)
>>> serialize(ET, element) # 4
'<tag key="value"><subtag /></tag>'
>>> element.remove(subelement)
>>> serialize(ET, element) # 5
'<tag key="value" />'
>>> element.remove(subelement)
Traceback (most recent call last):
ValueError: list.remove(x): x not in list
>>> serialize(ET, element) # 6
'<tag key="value" />'
"""
def find():
"""
Test find methods (including xpath syntax).
>>> elem = ET.XML(SAMPLE_XML)
>>> elem.find("tag").tag
'tag'
>>> ET.ElementTree(elem).find("tag").tag
'tag'
>>> elem.find("section/tag").tag
'tag'
>>> ET.ElementTree(elem).find("section/tag").tag
'tag'
>>> elem.findtext("tag")
'text'
>>> elem.findtext("tog")
>>> elem.findtext("tog", "default")
'default'
>>> ET.ElementTree(elem).findtext("tag")
'text'
>>> elem.findtext("section/tag")
'subtext'
>>> ET.ElementTree(elem).findtext("section/tag")
'subtext'
>>> summarize_list(elem.findall("tag"))
['tag', 'tag']
>>> summarize_list(elem.findall("*"))
['tag', 'tag', 'section']
>>> summarize_list(elem.findall(".//tag"))
['tag', 'tag', 'tag']
>>> summarize_list(elem.findall("section/tag"))
['tag']
>>> summarize_list(elem.findall("section//tag"))
['tag']
>>> summarize_list(elem.findall("section/*"))
['tag']
>>> summarize_list(elem.findall("section//*"))
['tag']
>>> summarize_list(elem.findall("section/.//*"))
['tag']
>>> summarize_list(elem.findall("*/*"))
['tag']
>>> summarize_list(elem.findall("*//*"))
['tag']
>>> summarize_list(elem.findall("*/tag"))
['tag']
>>> summarize_list(elem.findall("*/./tag"))
['tag']
>>> summarize_list(elem.findall("./tag"))
['tag', 'tag']
>>> summarize_list(elem.findall(".//tag"))
['tag', 'tag', 'tag']
>>> summarize_list(elem.findall("././tag"))
['tag', 'tag']
>>> summarize_list(ET.ElementTree(elem).findall("/tag"))
['tag', 'tag']
>>> summarize_list(ET.ElementTree(elem).findall("./tag"))
['tag', 'tag']
>>> elem = ET.XML(SAMPLE_XML_NS)
>>> summarize_list(elem.findall("tag"))
[]
>>> summarize_list(elem.findall("{http://effbot.org/ns}tag"))
['{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag']
>>> summarize_list(elem.findall(".//{http://effbot.org/ns}tag"))
['{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag', '{http://effbot.org/ns}tag']
"""
def parseliteral():
r"""
>>> element = ET.XML("<html><body>text</body></html>")
>>> ET.ElementTree(element).write(sys.stdout)
<html><body>text</body></html>
>>> element = ET.fromstring("<html><body>text</body></html>")
>>> ET.ElementTree(element).write(sys.stdout)
<html><body>text</body></html>
>>> print(ET.tostring(element))
<html><body>text</body></html>
>>> print(repr(ET.tostring(element, "ascii")))
b"<?xml version='1.0' encoding='ascii'?>\n<html><body>text</body></html>"
>>> _, ids = ET.XMLID("<html><body>text</body></html>")
>>> len(ids)
0
>>> _, ids = ET.XMLID("<html><body id='body'>text</body></html>")
>>> len(ids)
1
>>> ids["body"].tag
'body'
"""
def check_encoding(encoding):
"""
>>> check_encoding("ascii")
>>> check_encoding("us-ascii")
>>> check_encoding("iso-8859-1")
>>> check_encoding("iso-8859-15")
>>> check_encoding("cp437")
>>> check_encoding("mac-roman")
"""
ET.XML(
"<?xml version='1.0' encoding='%s'?><xml />" % encoding
)
def bug_1534630():
"""
>>> bob = ET.TreeBuilder()
>>> e = bob.data("data")
>>> e = bob.start("tag", {})
>>> e = bob.end("tag")
>>> e = bob.close()
>>> serialize(ET, e)
'<tag />'
"""
def test_main(): def test_main():
from test import test_xml_etree_c from test import test_xml_etree, test_xml_etree_c
# Run the tests specific to the C implementation
support.run_doctest(test_xml_etree_c, verbosity=True) support.run_doctest(test_xml_etree_c, verbosity=True)
# Assign the C implementation before running the doctests
# Patch the __name__, to prevent confusion with the pure Python test
pyET = test_xml_etree.ET
py__name__ = test_xml_etree.__name__
test_xml_etree.ET = cET
if __name__ != '__main__':
test_xml_etree.__name__ = __name__
try:
# Run the same test suite as xml.etree.ElementTree
test_xml_etree.test_main(module_name='xml.etree.cElementTree')
finally:
test_xml_etree.ET = pyET
test_xml_etree.__name__ = py__name__
if __name__ == '__main__': if __name__ == '__main__':
test_main() test_main()


@ -0,0 +1,7 @@
<?pi data?>
<!-- comment -->
<root xmlns='namespace'>
<element key='value'>text</element>
<element>text</element>tail
<empty-element/>
</root>


@ -0,0 +1,6 @@
<!-- comment -->
<root>
<element key='value'>text</element>
<element>text</element>tail
<empty-element/>
</root>


@ -1,6 +1,6 @@
# #
# ElementTree # ElementTree
# $Id: ElementInclude.py 1862 2004-06-18 07:31:02Z Fredrik $ # $Id: ElementInclude.py 3375 2008-02-13 08:05:08Z fredrik $
# #
# limited xinclude support for element trees # limited xinclude support for element trees
# #
@ -16,7 +16,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2008 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -42,7 +42,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.
## ##
# Limited XInclude support for the ElementTree package. # Limited XInclude support for the ElementTree package.


@ -1,6 +1,6 @@
# #
# ElementTree # ElementTree
# $Id: ElementPath.py 1858 2004-06-17 21:31:41Z Fredrik $ # $Id: ElementPath.py 3375 2008-02-13 08:05:08Z fredrik $
# #
# limited xpath support for element trees # limited xpath support for element trees
# #
@ -8,8 +8,13 @@
# 2003-05-23 fl created # 2003-05-23 fl created
# 2003-05-28 fl added support for // etc # 2003-05-28 fl added support for // etc
# 2003-08-27 fl fixed parsing of periods in element names # 2003-08-27 fl fixed parsing of periods in element names
# 2007-09-10 fl new selection engine
# 2007-09-12 fl fixed parent selector
# 2007-09-13 fl added iterfind; changed findall to return a list
# 2007-11-30 fl added namespaces support
# 2009-10-30 fl added child element value filter
# #
# Copyright (c) 2003-2004 by Fredrik Lundh. All rights reserved. # Copyright (c) 2003-2009 by Fredrik Lundh. All rights reserved.
# #
# fredrik@pythonware.com # fredrik@pythonware.com
# http://www.pythonware.com # http://www.pythonware.com
@ -17,7 +22,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2009 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -43,7 +48,7 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.
## ##
# Implementation module for XPath support. There's usually no reason # Implementation module for XPath support. There's usually no reason
@ -53,146 +58,246 @@
import re import re
xpath_tokenizer = re.compile( xpath_tokenizer_re = re.compile(
"(::|\.\.|\(\)|[/.*:\[\]\(\)@=])|((?:\{[^}]+\})?[^/:\[\]\(\)@=\s]+)|\s+" "("
).findall "'[^']*'|\"[^\"]*\"|"
"::|"
class xpath_descendant_or_self: "//?|"
pass "\.\.|"
"\(\)|"
## "[/.*:\[\]\(\)@=])|"
# Wrapper for a compiled XPath. "((?:\{[^}]+\})?[^/\[\]\(\)@=\s]+)|"
"\s+"
class Path:
##
# Create an Path instance from an XPath expression.
def __init__(self, path):
tokens = xpath_tokenizer(path)
# the current version supports 'path/path'-style expressions only
self.path = []
self.tag = None
if tokens and tokens[0][0] == "/":
raise SyntaxError("cannot use absolute path on element")
while tokens:
op, tag = tokens.pop(0)
if tag or op == "*":
self.path.append(tag or op)
elif op == ".":
pass
elif op == "/":
self.path.append(xpath_descendant_or_self())
continue
else:
raise SyntaxError("unsupported path syntax (%s)" % op)
if tokens:
op, tag = tokens.pop(0)
if op != "/":
raise SyntaxError(
"expected path separator (%s)" % (op or tag)
) )
if self.path and isinstance(self.path[-1], xpath_descendant_or_self):
raise SyntaxError("path cannot end with //")
if len(self.path) == 1 and isinstance(self.path[0], type("")):
self.tag = self.path[0]
## def xpath_tokenizer(pattern, namespaces=None):
# Find first matching object. for token in xpath_tokenizer_re.findall(pattern):
tag = token[1]
if tag and tag[0] != "{" and ":" in tag:
try:
prefix, uri = tag.split(":", 1)
if not namespaces:
raise KeyError
yield token[0], "{%s}%s" % (namespaces[prefix], uri)
except KeyError:
raise SyntaxError("prefix %r not found in prefix map" % prefix)
else:
yield token
def find(self, element): def get_parent_map(context):
tag = self.tag parent_map = context.parent_map
if tag is None: if parent_map is None:
nodeset = self.findall(element) context.parent_map = parent_map = {}
if not nodeset: for p in context.root.iter():
return None for e in p:
return nodeset[0] parent_map[e] = p
for elem in element: return parent_map
if elem.tag == tag:
return elem
return None
## def prepare_child(next, token):
# Find text for first matching object. tag = token[1]
def select(context, result):
for elem in result:
for e in elem:
if e.tag == tag:
yield e
return select
def findtext(self, element, default=None): def prepare_star(next, token):
tag = self.tag def select(context, result):
if tag is None: for elem in result:
nodeset = self.findall(element) for e in elem:
if not nodeset: yield e
return default return select
return nodeset[0].text or ""
for elem in element:
if elem.tag == tag:
return elem.text or ""
return default
## def prepare_self(next, token):
# Find all matching objects. def select(context, result):
for elem in result:
yield elem
return select
def findall(self, element): def prepare_descendant(next, token):
nodeset = [element] token = next()
index = 0 if token[0] == "*":
tag = "*"
elif not token[0]:
tag = token[1]
else:
raise SyntaxError("invalid descendant")
def select(context, result):
for elem in result:
for e in elem.iter(tag):
if e is not elem:
yield e
return select
def prepare_parent(next, token):
def select(context, result):
# FIXME: raise error if .. is applied at toplevel?
parent_map = get_parent_map(context)
result_map = {}
for elem in result:
if elem in parent_map:
parent = parent_map[elem]
if parent not in result_map:
result_map[parent] = None
yield parent
return select
def prepare_predicate(next, token):
# FIXME: replace with real parser!!! refs:
# http://effbot.org/zone/simple-iterator-parser.htm
# http://javascript.crockford.com/tdop/tdop.html
signature = []
predicate = []
while 1: while 1:
token = next()
if token[0] == "]":
break
if token[0] and token[0][:1] in "'\"":
token = "'", token[0][1:-1]
signature.append(token[0] or "-")
predicate.append(token[1])
signature = "".join(signature)
# use signature to determine predicate type
if signature == "@-":
# [@attribute] predicate
key = predicate[1]
def select(context, result):
for elem in result:
if elem.get(key) is not None:
yield elem
return select
if signature == "@-='":
# [@attribute='value']
key = predicate[1]
value = predicate[-1]
def select(context, result):
for elem in result:
if elem.get(key) == value:
yield elem
return select
if signature == "-" and not re.match("\d+$", predicate[0]):
# [tag]
tag = predicate[0]
def select(context, result):
for elem in result:
if elem.find(tag) is not None:
yield elem
return select
if signature == "-='" and not re.match("\d+$", predicate[0]):
# [tag='value']
tag = predicate[0]
value = predicate[-1]
def select(context, result):
for elem in result:
for e in elem.findall(tag):
if "".join(e.itertext()) == value:
yield elem
break
return select
if signature == "-" or signature == "-()" or signature == "-()-":
# [index] or [last()] or [last()-index]
if signature == "-":
index = int(predicate[0]) - 1
else:
if predicate[0] != "last":
raise SyntaxError("unsupported function")
if signature == "-()-":
try: try:
path = self.path[index] index = int(predicate[2]) - 1
index = index + 1 except ValueError:
except IndexError: raise SyntaxError("unsupported expression")
return nodeset else:
set = [] index = -1
if isinstance(path, xpath_descendant_or_self): def select(context, result):
parent_map = get_parent_map(context)
for elem in result:
try: try:
tag = self.path[index] parent = parent_map[elem]
if not isinstance(tag, type("")): # FIXME: what if the selector is "*" ?
tag = None elems = list(parent.findall(elem.tag))
else: if elems[index] is elem:
index = index + 1 yield elem
except IndexError: except (IndexError, KeyError):
tag = None # invalid path pass
for node in nodeset: return select
new = list(node.getiterator(tag)) raise SyntaxError("invalid predicate")
if new and new[0] is node:
set.extend(new[1:]) ops = {
else: "": prepare_child,
set.extend(new) "*": prepare_star,
else: ".": prepare_self,
for node in nodeset: "..": prepare_parent,
for node in node: "//": prepare_descendant,
if path == "*" or node.tag == path: "[": prepare_predicate,
set.append(node) }
if not set:
return []
nodeset = set
_cache = {} _cache = {}
## class _SelectorContext:
# (Internal) Compile path. parent_map = None
def __init__(self, root):
self.root = root
def _compile(path): # --------------------------------------------------------------------
p = _cache.get(path)
if p is not None: ##
return p # Generate all matching objects.
p = Path(path)
if len(_cache) >= 100: def iterfind(elem, path, namespaces=None):
# compile selector pattern
if path[-1:] == "/":
path = path + "*" # implicit all (FIXME: keep this?)
try:
selector = _cache[path]
except KeyError:
if len(_cache) > 100:
_cache.clear() _cache.clear()
_cache[path] = p if path[:1] == "/":
return p raise SyntaxError("cannot use absolute path on element")
next = iter(xpath_tokenizer(path, namespaces)).__next__
token = next()
selector = []
while 1:
try:
selector.append(ops[token[0]](next, token))
except StopIteration:
raise SyntaxError("invalid path")
try:
token = next()
if token[0] == "/":
token = next()
except StopIteration:
break
_cache[path] = selector
# execute selector pattern
result = [elem]
context = _SelectorContext(elem)
for select in selector:
result = select(context, result)
return result
## ##
# Find first matching object. # Find first matching object.
def find(element, path): def find(elem, path, namespaces=None):
return _compile(path).find(element) try:
return next(iterfind(elem, path, namespaces))
## except StopIteration:
# Find text for first matching object. return None
def findtext(element, path, default=None):
return _compile(path).findtext(element, default)
## ##
# Find all matching objects. # Find all matching objects.
def findall(element, path): def findall(elem, path, namespaces=None):
return _compile(path).findall(element) return list(iterfind(elem, path, namespaces))
##
# Find text for first matching object.
def findtext(elem, path, default=None, namespaces=None):
try:
elem = next(iterfind(elem, path, namespaces))
return elem.text or ""
except StopIteration:
return default
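
The rewritten selection engine adds predicates, the parent selector and a prefix map. A short sketch of how these surface through the public ``xml.etree.ElementTree`` API follows; the sample document and the ``http://example.com/ns`` namespace are made up for the illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<root xmlns:x="http://example.com/ns">'
    '<item id="a"><name>first</name></item>'
    '<item><name>second</name></item>'
    '<x:item id="c"/>'
    '</root>'
)

# [@attribute]: children named "item" that carry an "id" attribute.
with_id = [e.get("id") for e in doc.findall("item[@id]")]

# [tag='value']: children whose <name> child has the given text.
by_value = doc.findall("item[name='second']")

# [last()]: the last "item" child of its parent.
last = doc.find("item[last()]")

# Prefix map: "x:item" is expanded to "{http://example.com/ns}item".
ns_items = doc.findall("x:item", namespaces={"x": "http://example.com/ns"})

# Parent selector: every element that has a <name> descendant step below it.
parents = doc.findall(".//name/..")

# iterfind() yields matches lazily; findall() is list(iterfind(...)).
first_name = next(doc.iterfind(".//name")).text
```

An unprefixed ``item`` does not match the namespaced element, so ``with_id`` only sees the first child; the namespaced one is reachable through the prefix map or the ``{uri}tag`` form.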

File diff suppressed because it is too large


@ -1,10 +1,10 @@
# $Id: __init__.py 1821 2004-06-03 16:57:49Z fredrik $ # $Id: __init__.py 3375 2008-02-13 08:05:08Z fredrik $
# elementtree package # elementtree package
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# The ElementTree toolkit is # The ElementTree toolkit is
# #
# Copyright (c) 1999-2004 by Fredrik Lundh # Copyright (c) 1999-2008 by Fredrik Lundh
# #
# By obtaining, using, and/or copying this software and/or its # By obtaining, using, and/or copying this software and/or its
# associated documentation, you agree that you have read, understood, # associated documentation, you agree that you have read, understood,
@ -30,4 +30,4 @@
# -------------------------------------------------------------------- # --------------------------------------------------------------------
# Licensed to PSF under a Contributor Agreement. # Licensed to PSF under a Contributor Agreement.
# See http://www.python.org/2.4/license for licensing details. # See http://www.python.org/psf/license for licensing details.


@ -836,7 +836,7 @@ EXTRAPLATDIR= @EXTRAPLATDIR@
MACHDEPS= $(PLATDIR) $(EXTRAPLATDIR) MACHDEPS= $(PLATDIR) $(EXTRAPLATDIR)
XMLLIBSUBDIRS= xml xml/dom xml/etree xml/parsers xml/sax XMLLIBSUBDIRS= xml xml/dom xml/etree xml/parsers xml/sax
LIBSUBDIRS= tkinter site-packages test test/output test/data \ LIBSUBDIRS= tkinter site-packages test test/output test/data \
test/decimaltestdata \ test/decimaltestdata test/xmltestdata \
encodings \ encodings \
email email/mime email/test email/test/data \ email email/mime email/test email/test/data \
html json json/tests http dbm xmlrpc \ html json json/tests http dbm xmlrpc \


@ -283,6 +283,9 @@ C-API
Library Library
------- -------
- Issue #6472: The xml.etree package is updated to ElementTree 1.3. The
cElementTree module is updated too.
- Issue #7774: Set sys.executable to an empty string if argv[0] has been set to - Issue #7774: Set sys.executable to an empty string if argv[0] has been set to
an non existent program name and Python is unable to retrieve the real an non existent program name and Python is unable to retrieve the real
program name program name

File diff suppressed because it is too large


@ -1006,8 +1006,6 @@ def add_files(db):
lib.add_file("audiotest.au") lib.add_file("audiotest.au")
lib.add_file("cfgparser.1") lib.add_file("cfgparser.1")
lib.add_file("sgml_input.html") lib.add_file("sgml_input.html")
lib.add_file("test.xml")
lib.add_file("test.xml.out")
lib.add_file("testtar.tar") lib.add_file("testtar.tar")
lib.add_file("test_difflib_expect.html") lib.add_file("test_difflib_expect.html")
lib.add_file("check_soundcard.vbs") lib.add_file("check_soundcard.vbs")
@ -1019,6 +1017,9 @@ def add_files(db):
lib.add_file("zipdir.zip") lib.add_file("zipdir.zip")
if dir=='decimaltestdata': if dir=='decimaltestdata':
lib.glob("*.decTest") lib.glob("*.decTest")
if dir=='xmltestdata':
lib.glob("*.xml")
lib.add_file("test.xml.out")
if dir=='output': if dir=='output':
lib.glob("test_*") lib.glob("test_*")
if dir=='idlelib': if dir=='idlelib':