mirror of https://github.com/python/cpython
Issue #14674: Add a discussion of the json module's standard compliance.
Patch by Chris Rebert.
parent
e991236b4d
commit
f3e0a69d88
|
@@ -7,8 +7,10 @@
|
||||||
.. sectionauthor:: Bob Ippolito <bob@redivi.com>
|
.. sectionauthor:: Bob Ippolito <bob@redivi.com>
|
||||||
.. versionadded:: 2.6
|
.. versionadded:: 2.6
|
||||||
|
|
||||||
`JSON (JavaScript Object Notation) <http://json.org>`_ is a subset of JavaScript
|
`JSON (JavaScript Object Notation) <http://json.org>`_, specified by
|
||||||
syntax (ECMA-262 3rd edition) used as a lightweight data interchange format.
|
:rfc:`4627`, is a lightweight data interchange format based on a subset of
|
||||||
|
`JavaScript <http://en.wikipedia.org/wiki/JavaScript>`_ syntax (`ECMA-262 3rd
|
||||||
|
edition <http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%203rd%20edition,%20December%201999.pdf>`_).
|
||||||
|
|
||||||
:mod:`json` exposes an API familiar to users of the standard library
|
:mod:`json` exposes an API familiar to users of the standard library
|
||||||
:mod:`marshal` and :mod:`pickle` modules.
|
:mod:`marshal` and :mod:`pickle` modules.
|
||||||
|
@@ -106,8 +108,10 @@ Using json.tool from the shell to validate and pretty-print::
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
The JSON produced by this module's default settings is a subset of
|
JSON is a subset of `YAML <http://yaml.org/>`_ 1.2. The JSON produced by
|
||||||
YAML, so it may be used as a serializer for that as well.
|
this module's default settings (in particular, the default *separators*
|
||||||
|
value) is also a subset of YAML 1.0 and 1.1. This module can thus also be
|
||||||
|
used as a YAML serializer.
|
||||||
|
|
||||||
|
|
||||||
Basic Usage
|
Basic Usage
|
||||||
|
@@ -193,7 +197,8 @@ Basic Usage
|
||||||
*object_hook* is an optional function that will be called with the result of
|
*object_hook* is an optional function that will be called with the result of
|
||||||
any object literal decoded (a :class:`dict`). The return value of
|
any object literal decoded (a :class:`dict`). The return value of
|
||||||
*object_hook* will be used instead of the :class:`dict`. This feature can be used
|
*object_hook* will be used instead of the :class:`dict`. This feature can be used
|
||||||
to implement custom decoders (e.g. JSON-RPC class hinting).
|
to implement custom decoders (e.g. `JSON-RPC <http://www.jsonrpc.org>`_
|
||||||
|
class hinting).
|
||||||
|
|
||||||
*object_pairs_hook* is an optional function that will be called with the
|
*object_pairs_hook* is an optional function that will be called with the
|
||||||
result of any object literal decoded with an ordered list of pairs. The
|
result of any object literal decoded with an ordered list of pairs. The
|
||||||
|
@@ -242,7 +247,7 @@ Basic Usage
|
||||||
The other arguments have the same meaning as in :func:`load`.
|
The other arguments have the same meaning as in :func:`load`.
|
||||||
|
|
||||||
|
|
||||||
Encoders and decoders
|
Encoders and Decoders
|
||||||
---------------------
|
---------------------
|
||||||
|
|
||||||
.. class:: JSONDecoder([encoding[, object_hook[, parse_float[, parse_int[, parse_constant[, strict[, object_pairs_hook]]]]]]])
|
.. class:: JSONDecoder([encoding[, object_hook[, parse_float[, parse_int[, parse_constant[, strict[, object_pairs_hook]]]]]]])
|
||||||
|
@@ -438,3 +443,108 @@ Encoders and decoders
|
||||||
|
|
||||||
for chunk in JSONEncoder().iterencode(bigobject):
|
for chunk in JSONEncoder().iterencode(bigobject):
|
||||||
mysocket.write(chunk)
|
mysocket.write(chunk)
|
||||||
|
|
||||||
|
|
||||||
|
Standard Compliance
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
The JSON format is specified by :rfc:`4627`. This section details this
|
||||||
|
module's level of compliance with the RFC. For simplicity,
|
||||||
|
:class:`JSONEncoder` and :class:`JSONDecoder` subclasses, and parameters other
|
||||||
|
than those explicitly mentioned, are not considered.
|
||||||
|
|
||||||
|
This module does not comply with the RFC in a strict fashion, implementing some
|
||||||
|
extensions that are valid JavaScript but not valid JSON. In particular:
|
||||||
|
|
||||||
|
- Top-level non-object, non-array values are accepted and output;
|
||||||
|
- Infinite and NaN number values are accepted and output;
|
||||||
|
- Repeated names within an object are accepted, and only the value of the last
|
||||||
|
name-value pair is used.
|
||||||
|
|
||||||
|
Since the RFC permits RFC-compliant parsers to accept input texts that are not
|
||||||
|
RFC-compliant, this module's deserializer is technically RFC-compliant under
|
||||||
|
default settings.
|
||||||
|
|
||||||
|
Character Encodings
|
||||||
|
^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The RFC recommends that JSON be represented using either UTF-8, UTF-16, or
|
||||||
|
UTF-32, with UTF-8 being the default. Accordingly, this module uses UTF-8 as
|
||||||
|
the default for its *encoding* parameter.
|
||||||
|
|
||||||
|
This module's deserializer only directly works with ASCII-compatible encodings;
|
||||||
|
UTF-16, UTF-32, and other ASCII-incompatible encodings require the use of
|
||||||
|
workarounds described in the documentation for the deserializer's *encoding*
|
||||||
|
parameter.
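One such workaround, sketched here under the assumption that the whole payload is already in memory, is to decode the bytes to text first and pass the result to ``json.loads``. The sample payload and the variable names are illustrative only:

```python
import json

# Illustrative payload: a JSON object encoded as UTF-16, which is not
# ASCII-compatible.
payload = u'{"name": "caf\u00e9"}'.encode("utf-16")

# Decode to text first, then deserialize the text.
decoded = json.loads(payload.decode("utf-16"))
```

Decoding explicitly keeps the choice of encoding in the caller's hands, which matters because the deserializer described here performs no encoding detection.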
|
||||||
|
|
||||||
|
The RFC also non-normatively describes a limited encoding detection technique
|
||||||
|
for JSON texts; this module's deserializer does not implement this or any other
|
||||||
|
kind of encoding detection.
|
||||||
|
|
||||||
|
As permitted, though not required, by the RFC, this module's serializer sets
|
||||||
|
*ensure_ascii=True* by default, thus escaping the output so that the resulting
|
||||||
|
strings only contain ASCII characters.
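A minimal sketch of the effect of *ensure_ascii*, using an arbitrary sample string:

```python
import json

text = u"caf\u00e9"

# Default (ensure_ascii=True): the non-ASCII character is written as a
# \uXXXX escape, so the output contains only ASCII characters.
escaped = json.dumps(text)

# ensure_ascii=False: the character is written to the output as-is.
raw = json.dumps(text, ensure_ascii=False)
```

Both outputs deserialize back to the same string; only the representation on the wire differs.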
|
||||||
|
|
||||||
|
|
||||||
|
Top-level Non-Object, Non-Array Values
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The RFC specifies that the top-level value of a JSON text must be either a
|
||||||
|
JSON object or array (Python :class:`dict` or :class:`list`). This module's
|
||||||
|
deserializer also accepts input texts consisting solely of a
|
||||||
|
JSON null, boolean, number, or string value::
|
||||||
|
|
||||||
|
>>> just_a_json_string = '"spam and eggs"' # Not by itself a valid JSON text
|
||||||
|
>>> json.loads(just_a_json_string)
|
||||||
|
u'spam and eggs'
|
||||||
|
|
||||||
|
This module itself does not include a way to request that such input texts be
|
||||||
|
regarded as illegal. Likewise, this module's serializer also accepts single
|
||||||
|
Python :data:`None`, :class:`bool`, numeric, and :class:`str`
|
||||||
|
values as input and will generate output texts consisting solely of a top-level
|
||||||
|
JSON null, boolean, number, or string value without raising an exception::
|
||||||
|
|
||||||
|
>>> neither_a_list_nor_a_dict = u"spam and eggs"
|
||||||
|
>>> json.dumps(neither_a_list_nor_a_dict) # The result is not a valid JSON text
|
||||||
|
'"spam and eggs"'
|
||||||
|
|
||||||
|
This module's serializer does not itself include a way to enforce the
|
||||||
|
aforementioned constraint.
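Callers who need the constraint can check it themselves. The following is a sketch only: ``dumps_text`` and ``loads_text`` are hypothetical wrapper names, not part of the module, and the check assumes no *object_hook* or custom encoder is in use:

```python
import json

def dumps_text(obj):
    # Hypothetical wrapper: refuse top-level values that would not
    # serialize to a JSON object or array.
    if not isinstance(obj, (dict, list, tuple)):
        raise TypeError("top-level value must be a dict, list, or tuple")
    return json.dumps(obj)

def loads_text(text):
    # Hypothetical wrapper: refuse input whose top-level value is not
    # a JSON object or array.
    value = json.loads(text)
    if not isinstance(value, (dict, list)):
        raise ValueError("top-level JSON value must be an object or array")
    return value
```

With these wrappers, ``dumps_text(u"spam and eggs")`` raises :exc:`TypeError` and ``loads_text('"spam and eggs"')`` raises :exc:`ValueError`.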
|
||||||
|
|
||||||
|
|
||||||
|
Infinite and NaN Number Values
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The RFC does not permit the representation of infinite or NaN number values.
|
||||||
|
Despite that, by default, this module accepts and outputs ``Infinity``,
|
||||||
|
``-Infinity``, and ``NaN`` as if they were valid JSON number literal values::
|
||||||
|
|
||||||
|
>>> # Neither of these calls raises an exception, but the results are not valid JSON
|
||||||
|
>>> json.dumps(float('-inf'))
|
||||||
|
'-Infinity'
|
||||||
|
>>> json.dumps(float('nan'))
|
||||||
|
'NaN'
|
||||||
|
>>> # Same when deserializing
|
||||||
|
>>> json.loads('-Infinity')
|
||||||
|
-inf
|
||||||
|
>>> json.loads('NaN')
|
||||||
|
nan
|
||||||
|
|
||||||
|
Passing ``allow_nan=False`` to the serializer makes it raise a
|
||||||
|
:exc:`ValueError` for such values instead. In the deserializer, the
|
||||||
|
*parse_constant* parameter can be used to alter this behavior.
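As an illustration, both directions can be made to reject these values. The helper names below (``reject_constant``, ``strict_dumps``, ``strict_loads``) are hypothetical and chosen for this sketch:

```python
import json

def reject_constant(name):
    # Called by the decoder with 'NaN', 'Infinity' or '-Infinity'.
    raise ValueError("non-standard JSON constant: %s" % name)

def strict_dumps(obj):
    # allow_nan=False makes the serializer raise ValueError for
    # infinite and NaN floats instead of emitting them.
    return json.dumps(obj, allow_nan=False)

def strict_loads(text):
    # parse_constant is invoked only for the three non-standard names.
    return json.loads(text, parse_constant=reject_constant)
```

Ordinary numbers are unaffected, since *parse_constant* is never called for them.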
|
||||||
|
|
||||||
|
|
||||||
|
Repeated Names Within an Object
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The RFC specifies that the names within a JSON object should be unique, but
|
||||||
|
does not specify how repeated names in JSON objects should be handled. By
|
||||||
|
default, this module does not raise an exception; instead, it ignores all but
|
||||||
|
the last name-value pair for a given name::
|
||||||
|
|
||||||
|
>>> weird_json = '{"x": 1, "x": 2, "x": 3}'
|
||||||
|
>>> json.loads(weird_json)
|
||||||
|
{u'x': 3}
|
||||||
|
|
||||||
|
The *object_pairs_hook* parameter can be used to alter this behavior, for example to keep every pair or to reject repeated names.
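For instance, a hook can refuse repeated names outright. This is a sketch; ``unique_pairs`` and ``loads_unique`` are hypothetical names, not part of the module:

```python
import json

def unique_pairs(pairs):
    # pairs is the ordered list of (name, value) tuples for one object.
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError("repeated name in JSON object: %r" % (name,))
        result[name] = value
    return result

def loads_unique(text):
    # The hook runs for every object in the text, including nested ones.
    return json.loads(text, object_pairs_hook=unique_pairs)
```

Calling ``loads_unique`` on the ``weird_json`` text above raises :exc:`ValueError` instead of silently keeping the last value.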
|
||||||
|
|
|
@@ -373,6 +373,9 @@ Build
|
||||||
Documentation
|
Documentation
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
|
- Issue #14674: Add a discussion of the json module's standard compliance.
|
||||||
|
Patch by Chris Rebert.
|
||||||
|
|
||||||
- Issue #15630: Add an example for "continue" stmt in the tutorial. Patch by
|
- Issue #15630: Add an example for "continue" stmt in the tutorial. Patch by
|
||||||
Daniel Ellis.
|
Daniel Ellis.
|
||||||
|
|
||||||
|
|