Issue 23729: Document ElementTree namespace handling and fix an omission in the XPath predicate table.
parent 936da2a796
commit f6e31b79a8
@@ -284,6 +284,71 @@ sub-elements for a given element::
    >>> ET.dump(a)
    <a><b /><c><d /></c></a>

Parsing XML with Namespaces
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If the XML input has `namespaces
<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
with prefixes in the form ``prefix:sometag`` get expanded to
``{uri}sometag`` where the *prefix* is replaced by the full *URI*. Also,
if there is a `default namespace
<http://www.w3.org/TR/2006/REC-xml-names-20060816/#defaulting>`__,
that full URI gets prepended to all of the non-prefixed tags.

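The expansion described above can be observed directly. The following is a minimal sketch, assuming a cut-down document invented for illustration; the ``fictional:id`` and ``rank`` attributes are hypothetical and do not appear in the example used in this section.

```python
import xml.etree.ElementTree as ET

# Hypothetical cut-down document; the two attributes on <actor> are
# invented here to show how attribute names are treated.
xml_text = """<actors xmlns:fictional="http://characters.example.com"
        xmlns="http://people.example.com">
    <actor fictional:id="a1" rank="knight">
        <fictional:character>Lancelot</fictional:character>
    </actor>
</actors>"""

root = ET.fromstring(xml_text)

# iter() walks the tree in document order, starting with the root.
tags = [elem.tag for elem in root.iter()]
for tag in tags:
    print(tag)
# {http://people.example.com}actors
# {http://people.example.com}actor
# {http://characters.example.com}character

# Prefixed attribute names are expanded as well; an attribute without
# a prefix keeps its plain name and does not pick up the default namespace.
attrs = root[0].attrib
print(sorted(attrs))
# ['rank', '{http://characters.example.com}id']
```

Note that the prefix itself (``fictional``) does not survive parsing; only the URI is kept in the tag name.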
Here is an XML example that incorporates two namespaces, one with the
prefix "fictional" and the other serving as the default namespace:

.. code-block:: xml

    <?xml version="1.0"?>
    <actors xmlns:fictional="http://characters.example.com"
            xmlns="http://people.example.com">
        <actor>
            <name>John Cleese</name>
            <fictional:character>Lancelot</fictional:character>
            <fictional:character>Archie Leach</fictional:character>
        </actor>
        <actor>
            <name>Eric Idle</name>
            <fictional:character>Sir Robin</fictional:character>
            <fictional:character>Gunther</fictional:character>
            <fictional:character>Commander Clement</fictional:character>
        </actor>
    </actors>

One way to search and explore this XML example is to manually add the
URI to every tag or attribute in the XPath of a
:meth:`~Element.find` or :meth:`~Element.findall` call::

    root = ET.fromstring(xml_text)
    for actor in root.findall('{http://people.example.com}actor'):
        name = actor.find('{http://people.example.com}name')
        print(name.text)
        for char in actor.findall('{http://characters.example.com}character'):
            print(' |-->', char.text)

Another way to search the namespaced XML example is to create a
dictionary with your own prefixes and use those in the search::

    ns = {'real_person': 'http://people.example.com',
          'role': 'http://characters.example.com'}

    for actor in root.findall('real_person:actor', ns):
        name = actor.find('real_person:name', ns)
        print(name.text)
        for char in actor.findall('role:character', ns):
            print(' |-->', char.text)

These two approaches both output::

    John Cleese
     |--> Lancelot
     |--> Archie Leach
    Eric Idle
     |--> Sir Robin
     |--> Gunther
     |--> Commander Clement

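The snippets in this section assume that ``xml_text`` and ``root`` already exist. As a self-contained check, the sketch below parses the example document and runs both search styles through one helper. The helper name ``roles_by_actor`` is hypothetical, introduced only for this illustration; the prefixes in ``ns`` are the reader's own choice and need not match the prefixes used inside the XML.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<actors xmlns:fictional="http://characters.example.com"
        xmlns="http://people.example.com">
    <actor>
        <name>John Cleese</name>
        <fictional:character>Lancelot</fictional:character>
        <fictional:character>Archie Leach</fictional:character>
    </actor>
    <actor>
        <name>Eric Idle</name>
        <fictional:character>Sir Robin</fictional:character>
        <fictional:character>Gunther</fictional:character>
        <fictional:character>Commander Clement</fictional:character>
    </actor>
</actors>"""

root = ET.fromstring(xml_text)


def roles_by_actor(root, actor_path, name_path, char_path, ns=None):
    """Hypothetical helper: map each actor's name to their characters."""
    result = {}
    for actor in root.findall(actor_path, ns):
        name = actor.findtext(name_path, namespaces=ns)
        result[name] = [c.text for c in actor.findall(char_path, ns)]
    return result


# Style 1: spell out the URI in every path step.
via_uris = roles_by_actor(
    root,
    '{http://people.example.com}actor',
    '{http://people.example.com}name',
    '{http://characters.example.com}character',
)

# Style 2: pass a prefix-to-URI dictionary with self-chosen prefixes.
ns = {'real_person': 'http://people.example.com',
      'role': 'http://characters.example.com'}
via_prefixes = roles_by_actor(
    root, 'real_person:actor', 'real_person:name', 'role:character', ns)

print(via_uris == via_prefixes)   # True
print(via_prefixes['Eric Idle'])  # ['Sir Robin', 'Gunther', 'Commander Clement']
```

The dictionary style keeps the paths short and puts the URIs in one place, which tends to matter once a search uses more than one or two path steps.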
Additional resources
^^^^^^^^^^^^^^^^^^^^

@@ -366,6 +431,9 @@ Supported XPath syntax
| ``[tag]``             | Selects all elements that have a child named         |
|                       | ``tag``. Only immediate children are supported.      |
+-----------------------+------------------------------------------------------+
| ``[tag='text']``      | Selects all elements that have a child named         |
|                       | ``tag`` whose complete text equals ``text``.         |
+-----------------------+------------------------------------------------------+
| ``[position]``        | Selects all elements that are located at the given   |
|                       | position. The position can be either an integer      |
|                       | (1 is the first position), the expression ``last()`` |
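The predicates in the table fragment above can be tried on a small tree. This is a minimal sketch, assuming a namespace-free document invented for illustration (the ``role`` element is hypothetical). It writes the text predicate in its quoted form, ``[tag='text']``, and shows that in CPython's ``xml.etree.ElementTree`` the comparison is an exact match on the child's text, not a substring test.

```python
import xml.etree.ElementTree as ET

# Hypothetical document without namespaces, to keep the paths short.
doc = ET.fromstring(
    "<actors>"
    "<actor><name>John Cleese</name><role>Lancelot</role></actor>"
    "<actor><name>Eric Idle</name><role>Sir Robin</role></actor>"
    "</actors>"
)

# [tag]: actors that have a <role> child.
with_role = doc.findall("actor[role]")

# [tag='text']: the child's complete text must equal the quoted value.
exact = doc.findall("actor[name='Eric Idle']")
partial = doc.findall("actor[name='Eric']")

# [position]: 1-based index, or last().
first = doc.findall("actor[1]")
last = doc.findall("actor[last()]")

print(len(with_role))                        # 2
print([a.findtext("role") for a in exact])   # ['Sir Robin']
print(len(partial))                          # 0
print(first[0].findtext("name"))             # John Cleese
print(last[0].findtext("name"))              # Eric Idle
```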