Issue #9715: improve documentation of the io module
parent 4a656ebe05
commit b530e1438b
|
@ -11,44 +11,90 @@
|
||||||
.. moduleauthor:: Benjamin Peterson <benjamin@python.org>
|
.. moduleauthor:: Benjamin Peterson <benjamin@python.org>
|
||||||
.. sectionauthor:: Benjamin Peterson <benjamin@python.org>
|
.. sectionauthor:: Benjamin Peterson <benjamin@python.org>
|
||||||
|
|
||||||
The :mod:`io` module provides the Python interfaces to stream handling. The
|
Overview
|
||||||
built-in :func:`open` function is defined in this module.
|
--------
|
||||||
|
|
||||||
At the top of the I/O hierarchy is the abstract base class :class:`IOBase`. It
|
The :mod:`io` module provides Python 3's main facilities for dealing with
|
||||||
defines the basic interface to a stream. Note, however, that there is no
|
various types of I/O. Three main types of I/O are defined: *text I/O*,
|
||||||
separation between reading and writing to streams; implementations are allowed
|
*binary I/O* and *raw I/O*. These are generic categories,
|
||||||
to raise an :exc:`IOError` if they do not support a given operation.
|
and various backing stores can be used for each of them. Concrete objects
|
||||||
|
belonging to any of these categories will often be called *streams*; another
|
||||||
|
common term is *file-like objects*.
|
||||||
|
|
||||||
Extending :class:`IOBase` is :class:`RawIOBase` which deals simply with the
|
Independently of its category, each concrete stream object will also have
|
||||||
reading and writing of raw bytes to a stream. :class:`FileIO` subclasses
|
various capabilities: it can be read-only, write-only, or read-write; it
|
||||||
:class:`RawIOBase` to provide an interface to files in the machine's
|
can also allow arbitrary random access (seeking forwards or backwards to
|
||||||
file system.
|
any location), or only sequential access (for example in the case of a
|
||||||
|
socket or pipe).
|
||||||
|
|
||||||
:class:`BufferedIOBase` deals with buffering on a raw byte stream
|
All streams are careful about the type of data you give to them. For example
|
||||||
(:class:`RawIOBase`). Its subclasses, :class:`BufferedWriter`,
|
giving a :class:`str` object to the ``write()`` method of a binary stream
|
||||||
:class:`BufferedReader`, and :class:`BufferedRWPair` buffer streams that are
|
will raise a ``TypeError``. So will giving a :class:`bytes` object to the
|
||||||
readable, writable, and both readable and writable.
|
``write()`` method of a text stream.
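
The type checking described above can be observed with in-memory streams. The following is a minimal sketch, not part of the original commit; the helper name ``try_write`` is hypothetical and exists only for illustration.

```python
import io

def try_write(stream, data):
    """Return the name of the exception raised by write(), or None on success.

    Hypothetical helper used only for this illustration.
    """
    try:
        stream.write(data)
    except TypeError as exc:
        return type(exc).__name__
    return None

binary_stream = io.BytesIO()
text_stream = io.StringIO()

results = {
    # Mismatched types are rejected.
    "str to binary": try_write(binary_stream, "text"),
    "bytes to text": try_write(text_stream, b"bytes"),
    # Matching types are accepted.
    "bytes to binary": try_write(binary_stream, b"bytes"),
    "str to text": try_write(text_stream, "text"),
}

for case, outcome in results.items():
    print(case, "->", outcome)
```

Only the two mismatched writes fail; the streams are left holding just the data that was accepted.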
|
||||||
:class:`BufferedRandom` provides a buffered interface to random access
|
|
||||||
streams. :class:`BytesIO` is a simple stream of in-memory bytes.
|
|
||||||
|
|
||||||
Another :class:`IOBase` subclass, :class:`TextIOBase`, deals with
|
Text I/O
|
||||||
streams whose bytes represent text, and handles encoding and decoding
|
^^^^^^^^
|
||||||
from and to strings. :class:`TextIOWrapper`, which extends it, is a
|
|
||||||
buffered text interface to a buffered raw stream
|
|
||||||
(:class:`BufferedIOBase`). Finally, :class:`StringIO` is an in-memory
|
|
||||||
stream for text.
|
|
||||||
|
|
||||||
Argument names are not part of the specification, and only the arguments of
|
Text I/O expects and produces :class:`str` objects. This means that,
|
||||||
:func:`.open` are intended to be used as keyword arguments.
|
whenever the backing store is natively made of bytes (such as in the case
|
||||||
|
of a file), encoding and decoding of data is done transparently, as well as,
|
||||||
|
optionally, translation of platform-specific newline characters.
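
As an illustration of the transparent encoding and decoding described above, the sketch below layers a :class:`TextIOWrapper` over a :class:`BytesIO` standing in for a bytes-based backing store. The variable names and the choice of UTF-8 are assumptions made for the example.

```python
import io

# A bytes-based backing store (stands in for a file on disk).
backing_store = io.BytesIO()

# newline="\n" disables newline translation so the stored bytes are predictable.
text_stream = io.TextIOWrapper(backing_store, encoding="utf-8", newline="\n")

text_stream.write("café\n")      # str goes in ...
text_stream.flush()              # push buffered text down to the backing store
raw_bytes = backing_store.getvalue()
print(raw_bytes)                 # ... encoded bytes are what is actually stored

text_stream.seek(0)
decoded = text_stream.read()     # bytes are decoded back to str on the way out
print(decoded)
```

The caller only ever handles :class:`str` objects; the non-ASCII character is stored as two bytes in the backing store.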
|
||||||
|
|
||||||
.. seealso::
|
One way to create a text stream is to :func:`open` a file in text mode,
|
||||||
:mod:`sys`
|
optionally specifying an encoding::
|
||||||
contains the standard IO streams: :data:`sys.stdin`, :data:`sys.stdout`,
|
|
||||||
and :data:`sys.stderr`.
|
f = open("myfile.txt", "r", encoding="utf-8")
|
||||||
|
|
||||||
|
In-memory text streams are also available as :class:`StringIO` objects::
|
||||||
|
|
||||||
|
f = io.StringIO("some initial text data")
|
||||||
|
|
||||||
|
The detailed API of text streams is described by the :class:`TextIOBase`
|
||||||
|
class.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
Text I/O over a binary storage (such as a file) is significantly
|
||||||
|
slower than binary I/O over the same storage. This can become noticeable
|
||||||
|
if you handle huge amounts of text data (for example very large log files).
|
||||||
|
|
||||||
|
Binary I/O
|
||||||
|
^^^^^^^^^^
|
||||||
|
|
||||||
|
Binary I/O (also called *buffered I/O*) expects and produces
|
||||||
|
:class:`bytes` objects. No encoding, decoding or character translation
|
||||||
|
is performed. This is the category of streams used for all kinds of non-text
|
||||||
|
data, and also when manual control over the handling of text data is desired.
|
||||||
|
|
||||||
|
One way to create a binary stream is to :func:`open` a file in binary mode::
|
||||||
|
|
||||||
|
f = open("myfile.jpg", "rb")
|
||||||
|
|
||||||
|
In-memory binary streams are also available as :class:`BytesIO` objects::
|
||||||
|
|
||||||
|
f = io.BytesIO(b"some initial binary data: \x00\x01")
|
||||||
|
|
||||||
|
The detailed API of binary streams is described by the :class:`BufferedIOBase`
|
||||||
|
class.
|
||||||
|
|
||||||
|
Other library modules may provide additional ways to create text or binary
|
||||||
|
streams. See for example :meth:`socket.socket.makefile`.
|
||||||
|
|
||||||
|
Raw I/O
|
||||||
|
^^^^^^^
|
||||||
|
|
||||||
|
Raw I/O (also called *unbuffered I/O*) is generally used as a low-level
|
||||||
|
building-block for binary and text streams; it is rarely useful to directly
|
||||||
|
manipulate a raw stream from user code. Nevertheless, you can for example
|
||||||
|
create a raw stream by opening a file in binary mode with buffering disabled::
|
||||||
|
|
||||||
|
f = open("myfile.jpg", "rb", buffering=0)
|
||||||
|
|
||||||
|
The detailed API of raw streams is described by the :class:`RawIOBase`
|
||||||
|
class.
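
To make the relationship between the three categories concrete, the sketch below opens the same temporary file in raw, binary and text mode and inspects the returned objects. The file name ``example.bin`` and the ``latin-1`` encoding are arbitrary choices for the example; the class names shown are those returned by CPython's :func:`open`.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"\x00\x01\x02")

    # Raw I/O: binary mode with buffering disabled.
    with open(path, "rb", buffering=0) as raw:
        raw_type = type(raw)
        raw_is_raw = isinstance(raw, io.RawIOBase)

    # Binary (buffered) I/O: a buffer wrapped around a raw stream.
    with open(path, "rb") as buffered:
        buffered_type = type(buffered)
        inner_raw_type = type(buffered.raw)

    # Text I/O: a text layer wrapped around a buffered stream.
    with open(path, "r", encoding="latin-1") as text:
        text_type = type(text)
        inner_buffer_type = type(text.buffer)

print(raw_type.__name__, buffered_type.__name__, text_type.__name__)
```

Each layer exposes the one beneath it (``.buffer`` on the text stream, ``.raw`` on the buffered stream), which is why raw streams are described as building blocks.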
|
||||||
|
|
||||||
|
|
||||||
Module Interface
|
High-level Module Interface
|
||||||
----------------
|
---------------------------
|
||||||
|
|
||||||
.. data:: DEFAULT_BUFFER_SIZE
|
.. data:: DEFAULT_BUFFER_SIZE
|
||||||
|
|
||||||
|
@ -89,17 +135,22 @@ Module Interface
|
||||||
not be used in new code)
|
not be used in new code)
|
||||||
========= ===============================================================
|
========= ===============================================================
|
||||||
|
|
||||||
The default mode is ``'rt'`` (open for reading text). For binary random
|
The default mode is ``'r'`` (open for reading text, synonym of ``'rt'``).
|
||||||
access, the mode ``'w+b'`` opens and truncates the file to 0 bytes, while
|
For binary read-write access, the mode ``'w+b'`` opens and truncates the
|
||||||
``'r+b'`` opens the file without truncation.
|
file to 0 bytes, while ``'r+b'`` opens the file without truncation.
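
The difference between ``'r+b'`` and ``'w+b'`` can be checked with a small experiment. This is a sketch using a temporary file; the file name ``data.bin`` and the sample contents are assumptions made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"0123456789")

    # 'r+b': read-write access, existing contents are preserved.
    with open(path, "r+b") as f:
        before = f.read()
        f.seek(0)
        f.write(b"AB")          # overwrites the first two bytes in place

    with open(path, "rb") as f:
        after_r_plus = f.read()

    # 'w+b': read-write access, but the file is truncated on opening.
    with open(path, "w+b") as f:
        after_w_plus = f.read()

print(before, after_r_plus, after_w_plus)
```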
|
||||||
|
|
||||||
Python distinguishes between files opened in binary and text modes, even when
|
As mentioned in the `overview`_, Python distinguishes between binary
|
||||||
the underlying operating system doesn't. Files opened in binary mode
|
and text I/O. Files opened in binary mode (including ``'b'`` in the
|
||||||
(including ``'b'`` in the *mode* argument) return contents as ``bytes``
|
*mode* argument) return contents as :class:`bytes` objects without
|
||||||
objects without any decoding. In text mode (the default, or when ``'t'`` is
|
any decoding. In text mode (the default, or when ``'t'``
|
||||||
included in the *mode* argument), the contents of the file are returned as
|
is included in the *mode* argument), the contents of the file are
|
||||||
strings, the bytes having been first decoded using a platform-dependent
|
returned as strings, the bytes having been first decoded using a
|
||||||
encoding or using the specified *encoding* if given.
|
platform-dependent encoding or using the specified *encoding* if given.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
Python doesn't depend on the underlying operating system's notion
|
||||||
|
of text files; all the processing is done by Python itself, and
|
||||||
|
is therefore platform-independent.
|
||||||
|
|
||||||
*buffering* is an optional integer used to set the buffering policy.
|
*buffering* is an optional integer used to set the buffering policy.
|
||||||
Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
|
Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
|
||||||
|
@ -168,11 +219,6 @@ Module Interface
|
||||||
:class:`BufferedRandom`. When buffering is disabled, the raw stream, a
|
:class:`BufferedRandom`. When buffering is disabled, the raw stream, a
|
||||||
subclass of :class:`RawIOBase`, :class:`FileIO`, is returned.
|
subclass of :class:`RawIOBase`, :class:`FileIO`, is returned.
|
||||||
|
|
||||||
It is also possible to use a string or bytearray as a file for both reading
|
|
||||||
and writing. For strings :class:`StringIO` can be used like a file opened in
|
|
||||||
a text mode, and for bytearrays a :class:`BytesIO` can be used like a
|
|
||||||
file opened in a binary mode.
|
|
||||||
|
|
||||||
|
|
||||||
.. exception:: BlockingIOError
|
.. exception:: BlockingIOError
|
||||||
|
|
||||||
|
@ -194,8 +240,67 @@ Module Interface
|
||||||
when an unsupported operation is called on a stream.
|
when an unsupported operation is called on a stream.
|
||||||
|
|
||||||
|
|
||||||
|
In-memory streams
|
||||||
|
^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
It is also possible to use a :class:`str` or :class:`bytes`-like object as a
|
||||||
|
file for both reading and writing. For strings :class:`StringIO` can be
|
||||||
|
used like a file opened in text mode, and :class:`BytesIO` can be used like
|
||||||
|
a file opened in binary mode. Both provide full read-write capabilities
|
||||||
|
with random access.
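
A brief sketch of the read-write and random-access capabilities mentioned above, using only :class:`StringIO` and :class:`BytesIO`; the sample data is arbitrary.

```python
import io

# Text: append at the end, then jump back and read from the middle.
text_stream = io.StringIO("some initial text data")
text_stream.seek(0, io.SEEK_END)
text_stream.write(" plus more")
text_stream.seek(5)
word = text_stream.read(7)
text_value = text_stream.getvalue()

# Bytes: overwrite in the middle, then read the tail relative to the end.
byte_stream = io.BytesIO(b"abcdef")
byte_stream.seek(2)
byte_stream.write(b"XY")
byte_stream.seek(-2, io.SEEK_END)
tail = byte_stream.read()
byte_value = byte_stream.getvalue()

print(word, text_value)
print(tail, byte_value)
```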
|
||||||
|
|
||||||
|
|
||||||
|
.. seealso::
|
||||||
|
:mod:`sys`
|
||||||
|
contains the standard IO streams: :data:`sys.stdin`, :data:`sys.stdout`,
|
||||||
|
and :data:`sys.stderr`.
|
||||||
|
|
||||||
|
|
||||||
|
Class hierarchy
|
||||||
|
---------------
|
||||||
|
|
||||||
|
The implementation of I/O streams is organized as a hierarchy of classes.
|
||||||
|
First come :term:`abstract base classes <abstract base class>` (ABCs), which are used to specify the
|
||||||
|
various categories of streams, then concrete classes providing the standard
|
||||||
|
stream implementations.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
The abstract base classes also provide default implementations of
|
||||||
|
some methods in order to help implementation of concrete stream
|
||||||
|
classes. For example, :class:`BufferedIOBase` provides
|
||||||
|
unoptimized implementations of ``readinto()`` and ``readline()``.
|
||||||
|
|
||||||
|
At the top of the I/O hierarchy is the abstract base class :class:`IOBase`. It
|
||||||
|
defines the basic interface to a stream. Note, however, that there is no
|
||||||
|
separation between reading and writing to streams; implementations are allowed
|
||||||
|
to raise an :exc:`UnsupportedOperation` if they do not support a given
|
||||||
|
operation.
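
Because reading and writing are not separated in the interface, a stream's capabilities are best queried at run time. The sketch below wraps an in-memory buffer in a :class:`BufferedReader` (a read-only stream) and shows the unsupported ``write()`` call failing; the variable names are illustrative only.

```python
import io

# A read-only buffered stream over an in-memory raw source.
reader = io.BufferedReader(io.BytesIO(b"data"))

can_read = reader.readable()
can_write = reader.writable()

try:
    reader.write(b"x")
    write_error = None
except io.UnsupportedOperation as exc:
    # The method exists on the interface, but this stream does not support it.
    write_error = exc

print(can_read, can_write, type(write_error).__name__)
```

Checking ``readable()``, ``writable()`` or ``seekable()`` first avoids relying on the exception.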
|
||||||
|
|
||||||
|
Extending :class:`IOBase` is the :class:`RawIOBase` ABC which deals simply
|
||||||
|
with the reading and writing of raw bytes to a stream. :class:`FileIO`
|
||||||
|
subclasses :class:`RawIOBase` to provide an interface to files in the
|
||||||
|
machine's file system.
|
||||||
|
|
||||||
|
The :class:`BufferedIOBase` ABC deals with buffering on a raw byte stream
|
||||||
|
(:class:`RawIOBase`). Its subclasses, :class:`BufferedWriter`,
|
||||||
|
:class:`BufferedReader`, and :class:`BufferedRWPair` buffer streams that are
|
||||||
|
writable, readable, and both readable and writable, respectively.
|
||||||
|
:class:`BufferedRandom` provides a buffered interface to random access
|
||||||
|
streams. :class:`BytesIO` is a simple stream of in-memory bytes.
|
||||||
|
|
||||||
|
Another :class:`IOBase` subclass, the :class:`TextIOBase` ABC, deals with
|
||||||
|
streams whose bytes represent text, and handles encoding and decoding
|
||||||
|
from and to strings. :class:`TextIOWrapper`, which extends it, is a
|
||||||
|
buffered text interface to a buffered raw stream
|
||||||
|
(:class:`BufferedIOBase`). Finally, :class:`StringIO` is an in-memory
|
||||||
|
stream for text.
|
||||||
|
|
||||||
|
Argument names are not part of the specification, and only the arguments of
|
||||||
|
:func:`.open` are intended to be used as keyword arguments.
|
||||||
|
|
||||||
|
|
||||||
I/O Base Classes
|
I/O Base Classes
|
||||||
----------------
|
^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
.. class:: IOBase
|
.. class:: IOBase
|
||||||
|
|
||||||
|
@ -467,7 +572,7 @@ I/O Base Classes
|
||||||
|
|
||||||
|
|
||||||
Raw File I/O
|
Raw File I/O
|
||||||
------------
|
^^^^^^^^^^^^
|
||||||
|
|
||||||
.. class:: FileIO(name, mode='r', closefd=True)
|
.. class:: FileIO(name, mode='r', closefd=True)
|
||||||
|
|
||||||
|
@ -505,7 +610,7 @@ Raw File I/O
|
||||||
|
|
||||||
|
|
||||||
Buffered Streams
|
Buffered Streams
|
||||||
----------------
|
^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
In many situations, buffered I/O streams will provide higher performance
|
In many situations, buffered I/O streams will provide higher performance
|
||||||
(bandwidth and latency) than raw I/O streams. Their API is also more usable.
|
(bandwidth and latency) than raw I/O streams. Their API is also more usable.
|
||||||
|
@ -515,7 +620,7 @@ In many situations, buffered I/O streams will provide higher performance
|
||||||
A stream implementation using an in-memory bytes buffer. It inherits
|
A stream implementation using an in-memory bytes buffer. It inherits
|
||||||
:class:`BufferedIOBase`.
|
:class:`BufferedIOBase`.
|
||||||
|
|
||||||
The argument *initial_bytes* is an optional initial bytearray.
|
The argument *initial_bytes* contains optional initial :class:`bytes` data.
|
||||||
|
|
||||||
:class:`BytesIO` provides or overrides these methods in addition to those
|
:class:`BytesIO` provides or overrides these methods in addition to those
|
||||||
from :class:`BufferedIOBase` and :class:`IOBase`:
|
from :class:`BufferedIOBase` and :class:`IOBase`:
|
||||||
|
@ -632,7 +737,7 @@ In many situations, buffered I/O streams will provide higher performance
|
||||||
|
|
||||||
|
|
||||||
Text I/O
|
Text I/O
|
||||||
--------
|
^^^^^^^^
|
||||||
|
|
||||||
.. class:: TextIOBase
|
.. class:: TextIOBase
|
||||||
|
|
||||||
|
@ -736,14 +841,14 @@ Text I/O
|
||||||
|
|
||||||
.. class:: StringIO(initial_value='', newline=None)
|
.. class:: StringIO(initial_value='', newline=None)
|
||||||
|
|
||||||
An in-memory stream for text. It inherits :class:`TextIOWrapper`.
|
An in-memory stream for text I/O.
|
||||||
|
|
||||||
The initial value of the buffer (an empty string by default) can be set by
|
The initial value of the buffer (an empty string by default) can be set by
|
||||||
providing *initial_value*. The *newline* argument works like that of
|
providing *initial_value*. The *newline* argument works like that of
|
||||||
:class:`TextIOWrapper`. The default is to do no newline translation.
|
:class:`TextIOWrapper`. The default is to do no newline translation.
|
||||||
|
|
||||||
:class:`StringIO` provides this method in addition to those from
|
:class:`StringIO` provides this method in addition to those from
|
||||||
:class:`TextIOWrapper` and its parents:
|
:class:`TextIOBase` and its parents:
|
||||||
|
|
||||||
.. method:: getvalue()
|
.. method:: getvalue()
|
||||||
|
|
||||||
|
@ -767,6 +872,11 @@ Text I/O
|
||||||
# .getvalue() will now raise an exception.
|
# .getvalue() will now raise an exception.
|
||||||
output.close()
|
output.close()
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
:class:`StringIO` uses a native text storage and doesn't suffer from
|
||||||
|
the performance issues of other text streams, such as those based on
|
||||||
|
:class:`TextIOWrapper`.
|
||||||
|
|
||||||
.. class:: IncrementalNewlineDecoder
|
.. class:: IncrementalNewlineDecoder
|
||||||
|
|
||||||
A helper codec that decodes newlines for universal newlines mode. It
|
A helper codec that decodes newlines for universal newlines mode. It
|
||||||
|
|