Issue #18840: Introduce the json module in the tutorial, and deemphasize the pickle module.

Antoine Pitrou 2013-12-05 23:46:32 +01:00
parent 9c72ebc96b
commit dd799d2e32
3 changed files with 69 additions and 33 deletions


@@ -78,6 +78,13 @@ Glossary
Benevolent Dictator For Life, a.k.a. `Guido van Rossum
<http://www.python.org/~guido/>`_, Python's creator.
binary file
A :term:`file object` able to read and write
:term:`bytes-like objects <bytes-like object>`.
.. seealso::
A :term:`text file` reads and writes :class:`str` objects.
bytes-like object
An object that supports the :ref:`bufferobjects`, like :class:`bytes`,
:class:`bytearray` or :class:`memoryview`. Bytes-like objects can
@@ -225,10 +232,11 @@ Glossary
etc.). File objects are also called :dfn:`file-like objects` or
:dfn:`streams`.
There are actually three categories of file objects: raw binary files,
buffered binary files and text files. Their interfaces are defined in the
:mod:`io` module. The canonical way to create a file object is by using
the :func:`open` function.
There are actually three categories of file objects: raw
:term:`binary files <binary file>`, buffered
:term:`binary files <binary file>` and :term:`text files <text file>`.
Their interfaces are defined in the :mod:`io` module. The canonical
way to create a file object is by using the :func:`open` function.
file-like object
A synonym for :term:`file object`.
@@ -780,6 +788,14 @@ Glossary
:meth:`~collections.somenamedtuple._asdict`. Examples of struct sequences
include :data:`sys.float_info` and the return value of :func:`os.stat`.
text file
A :term:`file object` able to read and write :class:`str` objects.
Often, a text file actually accesses a byte-oriented datastream
and handles the text encoding automatically.
.. seealso::
A :term:`binary file` reads and writes :class:`bytes` objects.
triple-quoted string
A string which is bound by three instances of either a quotation mark
(") or an apostrophe ('). While they don't provide any functionality


@@ -377,47 +377,64 @@ File objects have some additional methods, such as :meth:`~file.isatty` and
Reference for a complete guide to file objects.
.. _tut-pickle:
.. _tut-json:
The :mod:`pickle` Module
------------------------
Saving structured data with :mod:`json`
---------------------------------------
.. index:: module: pickle
.. index:: module: json
Strings can easily be written to and read from a file. Numbers take a bit more
Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.
and returns its numeric value 123. When you want to save more complex data
types like nested lists and dictionaries, parsing and serializing by hand
becomes complicated.
Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or data, or
Rather than having users constantly write and debug code to save
complicated data types to files, Python allows you to use the popular data
interchange format called `JSON (JavaScript Object Notation)
<http://json.org>`_. The standard module called :mod:`json` can take Python
data hierarchies, and convert them to string representations; this process is
called :dfn:`serializing`. Reconstructing the data from the string representation
is called :dfn:`deserializing`. Between serializing and deserializing, the
string representing the object may have been stored in a file or data, or
sent over a network connection to some distant machine.
If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::
.. note::
The JSON format is commonly used by modern applications to allow for data
exchange. Many programmers are already familiar with it, which makes
it a good choice for interoperability.
pickle.dump(x, f)
If you have an object ``x``, you can view its JSON string representation with a
simple line of code::
To unpickle the object again, if ``f`` is a file object which has been opened
for reading::
>>> json.dumps([1, 'simple', 'list'])
'[1, "simple", "list"]'
x = pickle.load(f)
Another variant of the :func:`~json.dumps` function, called :func:`~json.dump`,
simply serializes the object to a :term:`text file`. So if ``f`` is a
:term:`text file` object opened for writing, we can do this::
(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)
json.dump(x, f)
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.
To decode the object again, if ``f`` is a :term:`text file` object which has
been opened for reading::
x = json.load(f)
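The two calls above can be combined into a round trip. The following is a minimal sketch, assuming a made-up sample value and an arbitrary file name (``data.json``) inside a temporary directory; neither comes from the tutorial text itself:

```python
import json
import os
import tempfile

# Hypothetical sample data: nested lists and dictionaries holding strings,
# numbers, booleans and None, all of which JSON can represent directly.
data = {
    "name": "example",
    "values": [1, 2.5, "three"],
    "nested": {"ok": True, "missing": None},
}

# The file name is an assumption for illustration; a temporary directory
# keeps the sketch self-contained and cleans up after itself.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.json")

    # Serialize to a text file opened for writing.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)

    # Deserialize from the same file, opened for reading.
    with open(path, "r", encoding="utf-8") as f:
        restored = json.load(f)

# The restored object compares equal to the one that was saved.
print(restored == data)
```

Opening the file in text mode matches the requirement that ``f`` be a :term:`text file`; passing an explicit encoding is a choice made here, not something the tutorial prescribes.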
This simple serialization technique can handle lists and dictionaries, but
serializing arbitrary class instances in JSON requires a bit of extra effort.
The reference for the :mod:`json` module contains an explanation of this.
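One possible shape of that extra effort is sketched below, using the ``default`` parameter of :func:`json.dumps` and the ``object_hook`` parameter of :func:`json.loads`. The ``Point`` class, the helper functions and the ``"__point__"`` marker key are hypothetical names chosen for illustration, not a convention defined by the :mod:`json` module:

```python
import json


class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def encode_point(obj):
    # Called by json.dumps() only for objects it cannot serialize itself.
    if isinstance(obj, Point):
        # The "__point__" marker is an arbitrary tag invented for this sketch.
        return {"__point__": True, "x": obj.x, "y": obj.y}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def decode_point(d):
    # Called by json.loads() for every JSON object (dict) that it decodes.
    if d.get("__point__"):
        return Point(d["x"], d["y"])
    return d


text = json.dumps([Point(1, 2), "plain"], default=encode_point)
restored = json.loads(text, object_hook=decode_point)

print(text)
print(type(restored[0]).__name__, restored[0].x, restored[0].y)
```

The resulting JSON text contains only dictionaries, numbers and strings, so other languages can still read it; only the Python side knows how to turn the tagged dictionary back into a ``Point``.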
.. seealso::
:mod:`pickle` - the pickle module
Contrary to :ref:`JSON <tut-json>`, *pickle* is a protocol which allows
the serialization of arbitrarily complex Python objects. As such, it is
specific to Python and cannot be used to communicate with applications
written in other languages. It is also insecure by default:
deserializing pickle data coming from an untrusted source can execute
arbitrary code, if the data was crafted by a skilled attacker.
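A minimal sketch of that difference, assuming a made-up sample value that mixes types with no direct JSON equivalent (a set, a tuple and a bytes object):

```python
import json
import pickle

# Hypothetical sample value: sets, tuples and bytes have no direct JSON form.
value = {"tags": {"a", "b"}, "pair": (1, 2), "raw": b"\x00\x01"}

# json.dumps() raises TypeError when it meets a type it cannot represent.
try:
    json.dumps(value)
    json_ok = True
except TypeError:
    json_ok = False

# pickle preserves the Python types exactly. Only unpickle data that you
# created yourself or otherwise trust.
restored = pickle.loads(pickle.dumps(value))

print(json_ok)
print(restored == value)
```

The pickled bytes are only meaningful to Python, which is the trade-off described above.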


@@ -113,6 +113,9 @@ Tests
Documentation
-------------
- Issue #18840: Introduce the json module in the tutorial, and deemphasize
the pickle module.
- Issue #19845: Updated the Compiling Python on Windows section.
- Issue #19795: Improved markup of True/False constants.