Issue 17538: Document XML vulnerabilities
parent
4b394db41f
commit
23790b4be0
|
@ -25,6 +25,7 @@ definition of the Python bindings for the DOM and SAX interfaces.
|
||||||
htmlparser.rst
|
htmlparser.rst
|
||||||
sgmllib.rst
|
sgmllib.rst
|
||||||
htmllib.rst
|
htmllib.rst
|
||||||
|
xml.rst
|
||||||
xml.etree.elementtree.rst
|
xml.etree.elementtree.rst
|
||||||
xml.dom.rst
|
xml.dom.rst
|
||||||
xml.dom.minidom.rst
|
xml.dom.minidom.rst
|
||||||
|
|
|
@ -14,6 +14,14 @@
|
||||||
directive. Since they are attributes which are set by client code, in-text
|
directive. Since they are attributes which are set by client code, in-text
|
||||||
references to these attributes should be marked using the :member: role.
|
references to these attributes should be marked using the :member: role.
|
||||||
|
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
The :mod:`pyexpat` module is not secure against maliciously
|
||||||
|
constructed data. If you need to parse untrusted or unauthenticated data, see
|
||||||
|
:ref:`xml-vulnerabilities`.
|
||||||
|
|
||||||
|
|
||||||
.. versionadded:: 2.0
|
.. versionadded:: 2.0
|
||||||
|
|
||||||
.. index:: single: Expat
|
.. index:: single: Expat
|
||||||
|
|
|
@ -20,6 +20,14 @@ to be simpler than the full DOM and also significantly smaller. Users who are
|
||||||
not already proficient with the DOM should consider using the
|
not already proficient with the DOM should consider using the
|
||||||
:mod:`xml.etree.ElementTree` module for their XML processing instead
|
:mod:`xml.etree.ElementTree` module for their XML processing instead
|
||||||
|
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
The :mod:`xml.dom.minidom` module is not secure against
|
||||||
|
maliciously constructed data. If you need to parse untrusted or
|
||||||
|
unauthenticated data, see :ref:`xml-vulnerabilities`.
|
||||||
|
|
||||||
|
|
||||||
DOM applications typically start by parsing some XML into a DOM. With
|
DOM applications typically start by parsing some XML into a DOM. With
|
||||||
:mod:`xml.dom.minidom`, this is done through the parse functions::
|
:mod:`xml.dom.minidom`, this is done through the parse functions::
|
||||||
|
|
||||||
|
|
|
@ -16,6 +16,13 @@
|
||||||
Object Model representation of a document from SAX events.
|
Object Model representation of a document from SAX events.
|
||||||
|
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
The :mod:`xml.dom.pulldom` module is not secure against
|
||||||
|
maliciously constructed data. If you need to parse untrusted or
|
||||||
|
unauthenticated data, see :ref:`xml-vulnerabilities`.
|
||||||
|
|
||||||
|
|
||||||
.. class:: PullDOM([documentFactory])
|
.. class:: PullDOM([documentFactory])
|
||||||
|
|
||||||
:class:`xml.sax.handler.ContentHandler` implementation that ...
|
:class:`xml.sax.handler.ContentHandler` implementation that ...
|
||||||
|
|
|
@ -16,6 +16,14 @@ The :class:`Element` type is a flexible container object, designed to store
|
||||||
hierarchical data structures in memory. The type can be described as a cross
|
hierarchical data structures in memory. The type can be described as a cross
|
||||||
between a list and a dictionary.
|
between a list and a dictionary.
|
||||||
|
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
The :mod:`xml.etree.ElementTree` module is not secure against
|
||||||
|
maliciously constructed data. If you need to parse untrusted or
|
||||||
|
unauthenticated data, see :ref:`xml-vulnerabilities`.
|
||||||
|
|
||||||
|
|
||||||
Each element has a number of properties associated with it:
|
Each element has a number of properties associated with it:
|
||||||
|
|
||||||
* a tag which is a string identifying what kind of data this element represents
|
* a tag which is a string identifying what kind of data this element represents
|
||||||
|
|
|
@ -0,0 +1,131 @@
|
||||||
|
.. _xml:
|
||||||
|
|
||||||
|
XML Processing Modules
|
||||||
|
======================
|
||||||
|
|
||||||
|
.. module:: xml
|
||||||
|
:synopsis: Package containing XML processing modules
|
||||||
|
.. sectionauthor:: Christian Heimes <christian@python.org>
|
||||||
|
.. sectionauthor:: Georg Brandl <georg@python.org>
|
||||||
|
|
||||||
|
|
||||||
|
Python's interfaces for processing XML are grouped in the ``xml`` package.
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
The XML modules are not secure against erroneous or maliciously
|
||||||
|
constructed data. If you need to parse untrusted or unauthenticated data, see
|
||||||
|
:ref:`xml-vulnerabilities`.
|
||||||
|
|
||||||
|
It is important to note that modules in the :mod:`xml` package require that
|
||||||
|
there be at least one SAX-compliant XML parser available. The Expat parser is
|
||||||
|
included with Python, so the :mod:`xml.parsers.expat` module will always be
|
||||||
|
available.
|
||||||
|
|
||||||
|
The documentation for the :mod:`xml.dom` and :mod:`xml.sax` packages is the
|
||||||
|
definition of the Python bindings for the DOM and SAX interfaces.
|
||||||
|
|
||||||
|
The XML handling submodules are:
|
||||||
|
|
||||||
|
* :mod:`xml.etree.ElementTree`: the ElementTree API, a simple and lightweight
|
||||||
|
|
||||||
|
..
|
||||||
|
|
||||||
|
* :mod:`xml.dom`: the DOM API definition
|
||||||
|
* :mod:`xml.dom.minidom`: a lightweight DOM implementation
|
||||||
|
* :mod:`xml.dom.pulldom`: support for building partial DOM trees
|
||||||
|
|
||||||
|
..
|
||||||
|
|
||||||
|
* :mod:`xml.sax`: SAX2 base classes and convenience functions
|
||||||
|
* :mod:`xml.parsers.expat`: the Expat parser binding
|
||||||
|
|
||||||
|
|
||||||
|
.. _xml-vulnerabilities:
|
||||||
|
|
||||||
|
XML vulnerabilities
|
||||||
|
===================
|
||||||
|
|
||||||
|
The XML processing modules are not secure against maliciously constructed data.
|
||||||
|
An attacker can abuse vulnerabilities for e.g. denial of service attacks, to
|
||||||
|
access local files, to generate network connections to other machines, or
|
||||||
|
to circumvent firewalls. The attacks on XML abuse unfamiliar features
|
||||||
|
like inline `DTD`_ (document type definition) with entities.
|
||||||
|
|
||||||
|
|
||||||
|
========================= ======== ========= ========= ======== =========
|
||||||
|
kind                      sax      etree     minidom   pulldom  xmlrpc
|
||||||
|
========================= ======== ========= ========= ======== =========
|
||||||
|
billion laughs            **True** **True**  **True**  **True** **True**
|
||||||
|
quadratic blowup          **True** **True**  **True**  **True** **True**
|
||||||
|
external entity expansion **True** False (1) False (2) **True** False (3)
|
||||||
|
DTD retrieval             **True** False     False     **True** False
|
||||||
|
decompression bomb        False    False     False     False    **True**
|
||||||
|
========================= ======== ========= ========= ======== =========
|
||||||
|
|
||||||
|
1. :mod:`xml.etree.ElementTree` doesn't expand external entities and raises a
|
||||||
|
:exc:`ParseError` when an entity occurs.
|
||||||
|
2. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns
|
||||||
|
the unexpanded entity verbatim.
|
||||||
|
3. :mod:`xmlrpclib` doesn't expand external entities and omits them.
|
||||||
|
|
||||||
|
|
||||||
|
billion laughs / exponential entity expansion
|
||||||
|
The `Billion Laughs`_ attack -- also known as exponential entity expansion --
|
||||||
|
uses multiple levels of nested entities. Each entity refers to another entity
|
||||||
|
several times, and the final entity definition contains a small string. Eventually
|
||||||
|
the small string is expanded to several gigabytes. The exponential expansion
|
||||||
|
consumes lots of CPU time, too.
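
The growth described above can be illustrated with a deliberately tiny document. This is a minimal sketch rather than an exploit: the helper name ``build_nested_entity_doc`` and its parameters ``levels``, ``fanout`` and ``payload`` are hypothetical, and the sizes are kept small so that parsing is harmless. It assumes Python 3 and the standard :mod:`xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET


def build_nested_entity_doc(levels, fanout, payload="ha"):
    """Return an XML string whose entity e<levels> expands to
    payload repeated fanout ** levels times (hypothetical helper)."""
    # e0 holds the small string; every later entity refers to the
    # previous one `fanout` times.
    decls = ['<!ENTITY e0 "%s">' % payload]
    for i in range(1, levels + 1):
        refs = ("&e%d;" % (i - 1)) * fanout
        decls.append('<!ENTITY e%d "%s">' % (i, refs))
    return "<!DOCTYPE r [%s]><r>&e%d;</r>" % ("".join(decls), levels)


doc = build_nested_entity_doc(levels=3, fanout=4)
root = ET.fromstring(doc)

# The expanded text grows as len(payload) * fanout ** levels, while the
# document itself grows only linearly with the number of levels.
expanded_length = len(root.text)
```

With ten levels and a fan-out of ten, the same construction would turn a two-byte payload into roughly 20 GB of text, which is the scale the description refers to.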
|
||||||
|
|
||||||
|
quadratic blowup entity expansion
|
||||||
|
A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses
|
||||||
|
entity expansion, too. Instead of nested entities it repeats one large entity
|
||||||
|
with a couple of thousand characters over and over again. The attack isn't as
|
||||||
|
efficient as the exponential case but it avoids triggering countermeasures of
|
||||||
|
parsers against heavily nested entities.
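
The size relationship can be sketched with a small, harmless document. The helper ``build_repeated_entity_doc`` and the parameters ``entity_length`` and ``repeats`` are hypothetical names chosen for illustration; the sketch assumes Python 3 and :mod:`xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def build_repeated_entity_doc(entity_length, repeats):
    """Return an XML string that references one large entity many times
    (hypothetical helper, no nesting involved)."""
    entity = "A" * entity_length
    body = "&a;" * repeats
    return '<!DOCTYPE r [<!ENTITY a "%s">]><r>%s</r>' % (entity, body)


doc = build_repeated_entity_doc(entity_length=200, repeats=200)
root = ET.fromstring(doc)

# The input grows roughly as entity_length + 4 * repeats,
# the expanded output as entity_length * repeats.
input_length = len(doc)
output_length = len(root.text)
```

Doubling both parameters roughly doubles the input but quadruples the output, hence the name of the attack.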
|
||||||
|
|
||||||
|
external entity expansion
|
||||||
|
Entity declarations can contain more than just text for replacement. They can
|
||||||
|
also point to external resources by public identifiers or system identifiers.
|
||||||
|
System identifiers are standard URIs or can refer to local files. The XML
|
||||||
|
parser retrieves the resource with e.g. HTTP or FTP requests and embeds the
|
||||||
|
content into the XML document.
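
As a rough illustration of the :mod:`xml.etree.ElementTree` behaviour listed in the table notes, the following sketch parses a document that declares an external entity. The system identifier is a made-up path used only for illustration, and the sketch assumes Python 3.

```python
import xml.etree.ElementTree as ET

# The system identifier is a made-up path; ElementTree never opens it.
doc = (
    '<!DOCTYPE r [<!ENTITY ext SYSTEM "file:///nonexistent/example.txt">]>'
    "<r>&ext;</r>"
)

try:
    ET.fromstring(doc)
    rejected = False
except ET.ParseError:
    # ElementTree does not expand the external entity and reports the
    # reference as a parse error instead.
    rejected = True
```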
|
||||||
|
|
||||||
|
DTD retrieval
|
||||||
|
Some XML libraries like Python's :mod:`xml.dom.pulldom` retrieve document type
|
||||||
|
definitions from remote or local locations. The feature has similar
|
||||||
|
implications to the external entity expansion issue.
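
For SAX-based code, one possible mitigation is to switch off the standard SAX feature for external general entities before parsing. The sketch below is only an illustration under stated assumptions: the ``TextCollector`` class and the system identifier are hypothetical, and the code assumes Python 3 with the default Expat-based SAX parser. Setting the feature explicitly is harmless if it is already off.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Hypothetical handler that records character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


doc = (
    b'<!DOCTYPE r [<!ENTITY ext SYSTEM "file:///nonexistent/example.txt">]>'
    b"<r>a&ext;b</r>"
)

handler = TextCollector()
parser = xml.sax.make_parser()
# Ask the parser not to fetch external general entities.
parser.setFeature(feature_external_ges, False)
parser.setContentHandler(handler)
parser.parse(io.BytesIO(doc))

# Only the literal text around the entity reference is reported.
text = "".join(handler.chunks)
```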
|
||||||
|
|
||||||
|
decompression bomb
|
||||||
|
The issue of decompression bombs (aka `ZIP bomb`_) applies to all XML libraries
|
||||||
|
that can parse compressed XML streams such as gzipped HTTP streams or LZMA-compressed
|
||||||
|
files. For an attacker it can reduce the amount of transmitted data by three
|
||||||
|
orders of magnitude or more.
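
One way to guard against this, independent of any XML library, is to cap the amount of data a decompressor may produce before the result is handed to a parser. The following is a minimal sketch assuming Python 3 and gzip-compressed input; the function name ``bounded_gunzip`` and the ``limit`` parameter are hypothetical.

```python
import gzip
import zlib


def bounded_gunzip(data, limit):
    """Decompress gzip data, refusing to produce more than limit bytes
    (hypothetical helper)."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    # Ask for at most one byte more than allowed, so that an oversized
    # payload is detected without inflating all of it.
    output = decompressor.decompress(data, limit + 1)
    if len(output) > limit:
        raise ValueError("decompressed data exceeds %d bytes" % limit)
    return output


small = gzip.compress(b"<r/>")
bomb = gzip.compress(b"\0" * 1000000)  # highly compressible payload

try:
    bounded_gunzip(bomb, limit=10000)
    bomb_rejected = False
except ValueError:
    bomb_rejected = True
```

The compressed ``bomb`` is far smaller than the million bytes it expands to, which is why the cap has to be applied to the output rather than to the input.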
|
||||||
|
|
||||||
|
The documentation of `defusedxml`_ on PyPI has further information about
|
||||||
|
all known attack vectors with examples and references.
|
||||||
|
|
||||||
|
defused packages
|
||||||
|
----------------
|
||||||
|
|
||||||
|
`defusedxml`_ is a pure Python package with modified subclasses of all stdlib
|
||||||
|
XML parsers that prevent any potentially malicious operation. Use of this
|
||||||
|
package is recommended for any server code that parses untrusted XML data. The
|
||||||
|
package also ships with example exploits and extended documentation on more
|
||||||
|
XML exploits such as XPath injection.
|
||||||
|
|
||||||
|
`defusedexpat`_ provides a modified libexpat and a patched replacement
|
||||||
|
:mod:`pyexpat` extension module with countermeasures against entity expansion
|
||||||
|
DoS attacks. Defusedexpat still allows a sane and configurable amount of entity
|
||||||
|
expansions. The modifications will be merged into future releases of Python.
|
||||||
|
|
||||||
|
The workarounds and modifications are not included in patch releases as they
|
||||||
|
break backward compatibility. After all, inline DTD and entity expansion are
|
||||||
|
well-defined XML features.
|
||||||
|
|
||||||
|
|
||||||
|
.. _defusedxml: https://pypi.python.org/pypi/defusedxml/
|
||||||
|
.. _defusedexpat: https://pypi.python.org/pypi/defusedexpat/
|
||||||
|
.. _Billion Laughs: http://en.wikipedia.org/wiki/Billion_laughs
|
||||||
|
.. _ZIP bomb: http://en.wikipedia.org/wiki/Zip_bomb
|
||||||
|
.. _DTD: http://en.wikipedia.org/wiki/Document_Type_Definition
|
|
@ -16,6 +16,14 @@ Simple API for XML (SAX) interface for Python. The package itself provides the
|
||||||
SAX exceptions and the convenience functions which will be most used by users of
|
SAX exceptions and the convenience functions which will be most used by users of
|
||||||
the SAX API.
|
the SAX API.
|
||||||
|
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
The :mod:`xml.sax` module is not secure against maliciously
|
||||||
|
constructed data. If you need to parse untrusted or unauthenticated data, see
|
||||||
|
:ref:`xml-vulnerabilities`.
|
||||||
|
|
||||||
|
|
||||||
The convenience functions are:
|
The convenience functions are:
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -28,6 +28,13 @@ supports writing XML-RPC client code; it handles all the details of translating
|
||||||
between conformable Python objects and XML on the wire.
|
between conformable Python objects and XML on the wire.
|
||||||
|
|
||||||
|
|
||||||
|
.. warning::
|
||||||
|
|
||||||
|
The :mod:`xmlrpclib` module is not secure against maliciously
|
||||||
|
constructed data. If you need to parse untrusted or unauthenticated data, see
|
||||||
|
:ref:`xml-vulnerabilities`.
|
||||||
|
|
||||||
|
|
||||||
.. class:: ServerProxy(uri[, transport[, encoding[, verbose[, allow_none[, use_datetime]]]]])
|
.. class:: ServerProxy(uri[, transport[, encoding[, verbose[, allow_none[, use_datetime]]]]])
|
||||||
|
|
||||||
A :class:`ServerProxy` instance is an object that manages communication with a
|
A :class:`ServerProxy` instance is an object that manages communication with a
|
||||||
|
|
|
@ -36,6 +36,11 @@ Library
|
||||||
- Issue #17531: Fix tests that thought group and user ids were always the int
|
- Issue #17531: Fix tests that thought group and user ids were always the int
|
||||||
type. Also, always allow -1 as a valid group and user id.
|
type. Also, always allow -1 as a valid group and user id.
|
||||||
|
|
||||||
|
Documentation
|
||||||
|
-------------
|
||||||
|
|
||||||
|
- Issue #17538: Document XML vulnerabilities
|
||||||
|
|
||||||
|
|
||||||
What's New in Python 2.7.4 release candidate 1
|
What's New in Python 2.7.4 release candidate 1
|
||||||
==============================================
|
==============================================
|
||||||
|
|