rewrite and move open() docs only to functions.rst

Benjamin Peterson 2010-08-30 13:19:53 +00:00
parent 4e4ffb1181
commit 6b4fa776ac
2 changed files with 108 additions and 212 deletions

@ -712,36 +712,37 @@ are always available. They are listed here in alphabetical order.
========= =============================================================== ========= ===============================================================
The default mode is ``'r'`` (open for reading text, synonym of ``'rt'``). The default mode is ``'r'`` (open for reading text, synonym of ``'rt'``).
For binary read-write access, the mode ``'w+b'`` opens and truncates the For binary read-write access, the mode ``'w+b'`` opens and truncates the file
file to 0 bytes, while ``'r+b'`` opens the file without truncation. to 0 bytes. ``'r+b'`` opens the file without truncation.
As mentioned in the `overview`_, Python distinguishes between binary As mentioned in the :ref:`io-overview`, Python distinguishes between binary
and text I/O. Files opened in binary mode (including ``'b'`` in the and text I/O. Files opened in binary mode (including ``'b'`` in the *mode*
*mode* argument) return contents as :class:`bytes` objects without argument) return contents as :class:`bytes` objects without any decoding. In
any decoding. In text mode (the default, or when ``'t'`` text mode (the default, or when ``'t'`` is included in the *mode* argument),
is included in the *mode* argument), the contents of the file are the contents of the file are returned as :class:`str`, the bytes having been
returned as strings, the bytes having been first decoded using a first decoded using a platform-dependent encoding or using the specified
platform-dependent encoding or using the specified *encoding* if given. *encoding* if given.
.. note:: .. note::
Python doesn't depend on the underlying operating system's notion
of text files; all the processing is done by Python itself, and
is therefore platform-independent.
*buffering* is an optional integer used to set the buffering policy. Python doesn't depend on the underlying operating system's notion of text
Pass 0 to switch buffering off (only allowed in binary mode), 1 to select files; all the processing is done by Python itself, and is therefore
line buffering (only usable in text mode), and an integer > 1 to indicate platform-independent.
the size of a fixed-size chunk buffer. When no *buffering* argument is
given, the default buffering policy works as follows:
* Binary files are buffered in fixed-size chunks; the size of the buffer *buffering* is an optional integer used to set the buffering policy. Pass 0
is chosen using a heuristic trying to determine the underlying device's to switch buffering off (only allowed in binary mode), 1 to select line
"block size" and falling back on :attr:`DEFAULT_BUFFER_SIZE`. buffering (only usable in text mode), and an integer > 1 to indicate the size
On many systems, the buffer will typically be 4096 or 8192 bytes long. of a fixed-size chunk buffer. When no *buffering* argument is given, the
default buffering policy works as follows:
* "Interactive" text files (files for which :meth:`isatty` returns True) * Binary files are buffered in fixed-size chunks; the size of the buffer is
use line buffering. Other text files use the policy described above chosen using a heuristic trying to determine the underlying device's "block
for binary files. size" and falling back on :attr:`io.DEFAULT_BUFFER_SIZE`. On many systems,
the buffer will typically be 4096 or 8192 bytes long.
* "Interactive" text files (files for which :meth:`isatty` returns True) use
line buffering. Other text files use the policy described above for binary
files.
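The buffering rules described in this hunk can be checked with a short sketch. The helper name `buffering_policy_demo` and the use of a temporary file are assumptions made for illustration; they are not part of the commit.

```python
import io
import os
import tempfile


def buffering_policy_demo():
    """Collect a few observable effects of the *buffering* argument."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    results = {}
    try:
        # The fallback buffer size is exposed as a plain integer.
        results["default_size_is_int"] = isinstance(io.DEFAULT_BUFFER_SIZE, int)

        # buffering=1 selects line buffering in text mode.
        with open(path, "w", buffering=1) as f:
            results["line_buffered"] = f.line_buffering

        # A regular (non-interactive) text file is not line buffered by default.
        with open(path, "w") as f:
            results["default_line_buffered"] = f.line_buffering

        # buffering=0 is only allowed in binary mode, so text mode rejects it.
        try:
            open(path, "r", buffering=0)
        except ValueError:
            results["unbuffered_text"] = "ValueError"
    finally:
        os.remove(path)
    return results
```

The temporary file is a regular file, so `isatty()` is false for it and the default policy falls back to fixed-size chunk buffering.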
*encoding* is the name of the encoding used to decode or encode the file. *encoding* is the name of the encoding used to decode or encode the file.
This should only be used in text mode. The default encoding is platform This should only be used in text mode. The default encoding is platform
@ -784,16 +785,17 @@ are always available. They are listed here in alphabetical order.
closed. If a filename is given *closefd* has no effect and must be ``True`` closed. If a filename is given *closefd* has no effect and must be ``True``
(the default). (the default).
The type of file object returned by the :func:`.open` function depends on the The type of file object returned by the :func:`open` function depends on the
mode. When :func:`.open` is used to open a file in a text mode (``'w'``, mode. When :func:`open` is used to open a file in a text mode (``'w'``,
``'r'``, ``'wt'``, ``'rt'``, etc.), it returns a subclass of ``'r'``, ``'wt'``, ``'rt'``, etc.), it returns a subclass of
:class:`TextIOBase` (specifically :class:`TextIOWrapper`). When used to open :class:`io.TextIOBase` (specifically :class:`io.TextIOWrapper`). When used
a file in a binary mode with buffering, the returned class is a subclass of to open a file in a binary mode with buffering, the returned class is a
:class:`BufferedIOBase`. The exact class varies: in read binary mode, it subclass of :class:`io.BufferedIOBase`. The exact class varies: in read
returns a :class:`BufferedReader`; in write binary and append binary modes, binary mode, it returns a :class:`io.BufferedReader`; in write binary and
it returns a :class:`BufferedWriter`, and in read/write mode, it returns a append binary modes, it returns a :class:`io.BufferedWriter`, and in
:class:`BufferedRandom`. When buffering is disabled, the raw stream, a read/write mode, it returns a :class:`io.BufferedRandom`. When buffering is
subclass of :class:`RawIOBase`, :class:`FileIO`, is returned. disabled, the raw stream, a subclass of :class:`io.RawIOBase`,
:class:`io.FileIO`, is returned.
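As an illustration of the paragraph above, the following sketch opens a scratch file in several modes and reports the class of the returned object. `stream_class_for` is a hypothetical helper, and the class names in the usage note are what CPython returns.

```python
import os
import tempfile


def stream_class_for(mode, buffering=-1):
    """Return the class name of the object open() yields for *mode*."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, mode, buffering=buffering) as f:
            return type(f).__name__
    finally:
        os.remove(path)
```

For example, `stream_class_for("r")` gives `TextIOWrapper`, `stream_class_for("rb")` gives `BufferedReader`, `stream_class_for("ab")` gives `BufferedWriter`, `stream_class_for("r+b")` gives `BufferedRandom`, and `stream_class_for("rb", buffering=0)` gives `FileIO`.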
.. index:: .. index::
single: line-buffered I/O single: line-buffered I/O

@ -11,37 +11,39 @@
.. moduleauthor:: Benjamin Peterson <benjamin@python.org> .. moduleauthor:: Benjamin Peterson <benjamin@python.org>
.. sectionauthor:: Benjamin Peterson <benjamin@python.org> .. sectionauthor:: Benjamin Peterson <benjamin@python.org>
.. _io-overview:
Overview Overview
-------- --------
The :mod:`io` module provides Python 3's main facilities for dealing with The :mod:`io` module provides Python's main facilities for dealing with various
various types of I/O. Three main types of I/O are defined: *text I/O*, types of I/O. There are three main types of I/O: *text I/O*, *binary I/O*, *raw
*binary I/O*, *raw I/O*. It should be noted that these are generic categories, I/O*. These are generic categories, and various backing stores can be used for
and various backing stores can be used for each of them. Concrete objects each of them. Concrete objects belonging to any of these categories will often
belonging to any of these categories will often be called *streams*; another be called *streams*; another common term is *file-like objects*.
common term is *file-like objects*.
Independently of its category, each concrete stream object will also have Independently of its category, each concrete stream object will also have
various capabilities: it can be read-only, write-only, or read-write; it various capabilities: it can be read-only, write-only, or read-write. It can
can also allow arbitrary random access (seeking forwards or backwards to also allow arbitrary random access (seeking forwards or backwards to any
any location), or only sequential access (for example in the case of a location), or only sequential access (for example in the case of a socket or
socket or pipe). pipe).
All streams are careful about the type of data you give to them. For example All streams are careful about the type of data you give to them. For example
giving a :class:`str` object to the ``write()`` method of a binary stream giving a :class:`str` object to the ``write()`` method of a binary stream
will raise a ``TypeError``. So will giving a :class:`bytes` object to the will raise a ``TypeError``. So will giving a :class:`bytes` object to the
``write()`` method of a text stream. ``write()`` method of a text stream.
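A minimal sketch of the type checking described above, using in-memory streams so that no file is needed. The function name `write_type_errors` is made up for this example.

```python
import io


def write_type_errors():
    """Record which mismatched writes raise TypeError."""
    rejected = []

    # A binary stream accepts bytes-like objects only.
    try:
        io.BytesIO().write("text")
    except TypeError:
        rejected.append("str -> binary stream")

    # A text stream accepts str only.
    try:
        io.StringIO().write(b"bytes")
    except TypeError:
        rejected.append("bytes -> text stream")

    return rejected
```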
Text I/O Text I/O
^^^^^^^^ ^^^^^^^^
Text I/O expects and produces :class:`str` objects. This means that, Text I/O expects and produces :class:`str` objects. This means that whenever
whenever the backing store is natively made of bytes (such as in the case the backing store is natively made of bytes (such as in the case of a file),
of a file), encoding and decoding of data is made transparently, as well as, encoding and decoding of data is made transparently as well as optional
optionally, translation of platform-specific newline characters. translation of platform-specific newline characters.
A way to create a text stream is to :meth:`open()` a file in text mode, The easiest way to create a text stream is with :meth:`open()`, optionally
optionally specifying an encoding:: specifying an encoding::
f = open("myfile.txt", "r", encoding="utf-8") f = open("myfile.txt", "r", encoding="utf-8")
@ -49,23 +51,26 @@ In-memory text streams are also available as :class:`StringIO` objects::
f = io.StringIO("some initial text data") f = io.StringIO("some initial text data")
The detailed API of text streams is described by the :class:`TextIOBase` The text stream API is described in detail in the documentation for the
class. :class:`TextIOBase`.
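To make the transparent encoding and decoding concrete, this sketch writes a string through a text stream, then reads the same file back once as bytes and once as text. The helper `text_roundtrip`, the sample string, and the explicit `newline="\n"` (chosen so the result does not vary by platform) are assumptions for illustration.

```python
import os
import tempfile


def text_roundtrip(text="na\u00efve caf\u00e9\n"):
    """Write *text* via a text stream; return (stored bytes, decoded text)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # The text layer encodes str to UTF-8 bytes on the way out.
        with open(path, "w", encoding="utf-8", newline="\n") as f:
            f.write(text)
        # Binary mode shows the bytes actually stored in the file.
        with open(path, "rb") as f:
            stored = f.read()
        # The text layer decodes the bytes back to str on the way in.
        with open(path, "r", encoding="utf-8", newline="\n") as f:
            decoded = f.read()
    finally:
        os.remove(path)
    return stored, decoded
```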
.. note:: .. note::
Text I/O over a binary storage (such as a file) is significantly
slower than binary I/O over the same storage. This can become noticeable Text I/O over a binary storage (such as a file) is significantly slower than
if you handle huge amounts of text data (for example very large log files). binary I/O over the same storage. This can become noticeable if you handle
huge amounts of text data (for example very large log files).
Binary I/O Binary I/O
^^^^^^^^^^ ^^^^^^^^^^
Binary I/O (also called *buffered I/O*) expects and produces Binary I/O (also called *buffered I/O*) expects and produces :class:`bytes`
:class:`bytes` objects. No encoding, decoding or character translation objects. No encoding, decoding, or newline translation is performed. This
is performed. This is the category of streams used for all kinds of non-text category of streams can be used for all kinds of non-text data, and also when
data, and also when manual control over the handling of text data is desired. manual control over the handling of text data is desired.
A way to create a binary stream is to :meth:`open()` a file in binary mode:: The easiest way to create a binary stream is with :meth:`open()` with ``'b'`` in
the mode string::
f = open("myfile.jpg", "rb") f = open("myfile.jpg", "rb")
@ -73,24 +78,24 @@ In-memory binary streams are also available as :class:`BytesIO` objects::
f = io.BytesIO(b"some initial binary data: \x00\x01") f = io.BytesIO(b"some initial binary data: \x00\x01")
The detailed API of binary streams is described by the :class:`BufferedIOBase` The binary stream API is described in detail in the docs of
class. :class:`BufferedIOBase`.
Other library modules may provide additional ways to create text or binary Other library modules may provide additional ways to create text or binary
streams. See for example :meth:`socket.socket.makefile`. streams. See :meth:`socket.socket.makefile` for example.
Raw I/O Raw I/O
^^^^^^^ ^^^^^^^
Raw I/O (also called *unbuffered I/O*) is generally used as a low-level Raw I/O (also called *unbuffered I/O*) is generally used as a low-level
building-block for binary and text streams; it is rarely useful to directly building-block for binary and text streams; it is rarely useful to directly
manipulate a raw stream from user code. Nevertheless, you can for example manipulate a raw stream from user code. Nevertheless, you can create a raw
create a raw stream by opening a file in binary mode with buffering disabled:: stream by opening a file in binary mode with buffering disabled::
f = open("myfile.jpg", "rb", buffering=0) f = open("myfile.jpg", "rb", buffering=0)
The detailed API of raw streams is described by the :class:`RawIOBase` The raw stream API is described in detail in the docs of :class:`RawIOBase`.
class.
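The "building block" role of raw streams can be sketched by wrapping one in a buffered reader by hand, which is roughly what the buffered binary modes arrange automatically. The helper `layered_read` and its sample payload are hypothetical.

```python
import io
import os
import tempfile


def layered_read(payload=b"\x00\x01raw bytes"):
    """Read *payload* back through a manually buffered raw stream."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(payload)

        # buffering=0 in binary mode returns the raw stream itself.
        raw = open(path, "rb", buffering=0)
        is_raw = isinstance(raw, io.RawIOBase)

        # Closing the buffered wrapper also closes the raw stream under it.
        with io.BufferedReader(raw) as buffered:
            data = buffered.read()
    finally:
        os.remove(path)
    return is_raw, data
```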
High-level Module Interface High-level Module Interface
@ -99,125 +104,13 @@ High-level Module Interface
.. data:: DEFAULT_BUFFER_SIZE .. data:: DEFAULT_BUFFER_SIZE
An int containing the default buffer size used by the module's buffered I/O An int containing the default buffer size used by the module's buffered I/O
classes. :func:`.open` uses the file's blksize (as obtained by classes. :func:`open` uses the file's blksize (as obtained by
:func:`os.stat`) if possible. :func:`os.stat`) if possible.
.. function:: open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True) .. function:: open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True)
Open *file* and return a corresponding stream. If the file cannot be opened, This is an alias for the builtin :func:`open` function.
an :exc:`IOError` is raised.
*file* is either a string or bytes object giving the pathname (absolute or
relative to the current working directory) of the file to be opened or
an integer file descriptor of the file to be wrapped. (If a file descriptor
is given, it is closed when the returned I/O object is closed, unless
*closefd* is set to ``False``.)
*mode* is an optional string that specifies the mode in which the file is
opened. It defaults to ``'r'`` which means open for reading in text mode.
Other common values are ``'w'`` for writing (truncating the file if it
already exists), and ``'a'`` for appending (which on *some* Unix systems,
means that *all* writes append to the end of the file regardless of the
current seek position). In text mode, if *encoding* is not specified the
encoding used is platform dependent. (For reading and writing raw bytes use
binary mode and leave *encoding* unspecified.) The available modes are:
========= ===============================================================
Character Meaning
--------- ---------------------------------------------------------------
``'r'`` open for reading (default)
``'w'`` open for writing, truncating the file first
``'a'`` open for writing, appending to the end of the file if it exists
``'b'`` binary mode
``'t'`` text mode (default)
``'+'`` open a disk file for updating (reading and writing)
``'U'`` universal newline mode (for backwards compatibility; should
not be used in new code)
========= ===============================================================
The default mode is ``'r'`` (open for reading text, synonym of ``'rt'``).
For binary read-write access, the mode ``'w+b'`` opens and truncates the
file to 0 bytes, while ``'r+b'`` opens the file without truncation.
As mentioned in the `overview`_, Python distinguishes between binary
and text I/O. Files opened in binary mode (including ``'b'`` in the
*mode* argument) return contents as :class:`bytes` objects without
any decoding. In text mode (the default, or when ``'t'``
is included in the *mode* argument), the contents of the file are
returned as strings, the bytes having been first decoded using a
platform-dependent encoding or using the specified *encoding* if given.
.. note::
Python doesn't depend on the underlying operating system's notion
of text files; all the the processing is done by Python itself, and
is therefore platform-independent.
*buffering* is an optional integer used to set the buffering policy.
Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
line buffering (only usable in text mode), and an integer > 1 to indicate
the size of a fixed-size chunk buffer. When no *buffering* argument is
given, the default buffering policy works as follows:
* Binary files are buffered in fixed-size chunks; the size of the buffer
is chosen using a heuristic trying to determine the underlying device's
"block size" and falling back on :attr:`DEFAULT_BUFFER_SIZE`.
On many systems, the buffer will typically be 4096 or 8192 bytes long.
* "Interactive" text files (files for which :meth:`isatty` returns True)
use line buffering. Other text files use the policy described above
for binary files.
*encoding* is the name of the encoding used to decode or encode the file.
This should only be used in text mode. The default encoding is platform
dependent (whatever :func:`locale.getpreferredencoding` returns), but any
encoding supported by Python can be used. See the :mod:`codecs` module for
the list of supported encodings.
*errors* is an optional string that specifies how encoding and decoding
errors are to be handled--this cannot be used in binary mode. Pass
``'strict'`` to raise a :exc:`ValueError` exception if there is an encoding
error (the default of ``None`` has the same effect), or pass ``'ignore'`` to
ignore errors. (Note that ignoring encoding errors can lead to data loss.)
``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
where there is malformed data. When writing, ``'xmlcharrefreplace'``
(replace with the appropriate XML character reference) or
``'backslashreplace'`` (replace with backslashed escape sequences) can be
used. Any other error handling name that has been registered with
:func:`codecs.register_error` is also valid.
*newline* controls how universal newlines works (it only applies to text
mode). It can be ``None``, ``''``, ``'\n'``, ``'\r'``, and ``'\r\n'``. It
works as follows:
* On input, if *newline* is ``None``, universal newlines mode is enabled.
Lines in the input can end in ``'\n'``, ``'\r'``, or ``'\r\n'``, and these
are translated into ``'\n'`` before being returned to the caller. If it is
``''``, universal newline mode is enabled, but line endings are returned to
the caller untranslated. If it has any of the other legal values, input
lines are only terminated by the given string, and the line ending is
returned to the caller untranslated.
* On output, if *newline* is ``None``, any ``'\n'`` characters written are
translated to the system default line separator, :data:`os.linesep`. If
*newline* is ``''``, no translation takes place. If *newline* is any of
the other legal values, any ``'\n'`` characters written are translated to
the given string.
If *closefd* is ``False`` and a file descriptor rather than a filename was
given, the underlying file descriptor will be kept open when the file is
closed. If a filename is given *closefd* has no effect and must be ``True``
(the default).
The type of file object returned by the :func:`.open` function depends on the
mode. When :func:`.open` is used to open a file in a text mode (``'w'``,
``'r'``, ``'wt'``, ``'rt'``, etc.), it returns a subclass of
:class:`TextIOBase` (specifically :class:`TextIOWrapper`). When used to open
a file in a binary mode with buffering, the returned class is a subclass of
:class:`BufferedIOBase`. The exact class varies: in read binary mode, it
returns a :class:`BufferedReader`; in write binary and append binary modes,
it returns a :class:`BufferedWriter`, and in read/write mode, it returns a
:class:`BufferedRandom`. When buffering is disabled, the raw stream, a
subclass of :class:`RawIOBase`, :class:`FileIO`, is returned.
.. exception:: BlockingIOError .. exception:: BlockingIOError
@ -244,13 +137,14 @@ In-memory streams
^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
It is also possible to use a :class:`str` or :class:`bytes`-like object as a It is also possible to use a :class:`str` or :class:`bytes`-like object as a
file for both reading and writing. For strings :class:`StringIO` can be file for both reading and writing. For strings :class:`StringIO` can be used
used like a file opened in text mode, and :class:`BytesIO` can be used like like a file opened in text mode. :class:`BytesIO` can be used like a file
a file opened in binary mode. Both provide full read-write capabilities opened in binary mode. Both provide full read-write capabilities with random
with random access. access.
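A short sketch of the read-write and random-access behaviour mentioned above. The function name `in_memory_demo` and the sample data are assumptions for illustration.

```python
import io


def in_memory_demo():
    """Seek, read, and overwrite inside StringIO and BytesIO objects."""
    text = io.StringIO("hello world")
    text.seek(6)                 # jump past "hello "
    tail = text.read()           # read the rest of the buffer
    text.seek(0)
    text.write("HELLO")          # overwrite the first five characters

    data = io.BytesIO(b"\x00\x01\x02\x03")
    data.seek(2)
    data.write(b"\xff")          # overwrite a single byte in place

    return tail, text.getvalue(), data.getvalue()
```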
.. seealso:: .. seealso::
:mod:`sys` :mod:`sys`
contains the standard IO streams: :data:`sys.stdin`, :data:`sys.stdout`, contains the standard IO streams: :data:`sys.stdin`, :data:`sys.stdout`,
and :data:`sys.stderr`. and :data:`sys.stderr`.
@ -259,44 +153,43 @@ with random access.
Class hierarchy Class hierarchy
--------------- ---------------
The implementation of I/O streams is organized as a hierarchy of classes. The implementation of I/O streams is organized as a hierarchy of classes. First
First :term:`abstract base classes <abstract base class>` (ABCs), which are used to specify the :term:`abstract base classes <abstract base class>` (ABCs), which are used to
various categories of streams, then concrete classes providing the standard specify the various categories of streams, then concrete classes providing the
stream implementations. standard stream implementations.
.. note:: .. note::
The abstract base classes also provide default implementations of
some methods in order to help implementation of concrete stream The abstract base classes also provide default implementations of some
classes. For example, :class:`BufferedIOBase` provides methods in order to help implementation of concrete stream classes. For
unoptimized implementations of ``readinto()`` and ``readline()``. example, :class:`BufferedIOBase` provides unoptimized implementations of
``readinto()`` and ``readline()``.
At the top of the I/O hierarchy is the abstract base class :class:`IOBase`. It At the top of the I/O hierarchy is the abstract base class :class:`IOBase`. It
defines the basic interface to a stream. Note, however, that there is no defines the basic interface to a stream. Note, however, that there is no
separation between reading and writing to streams; implementations are allowed separation between reading and writing to streams; implementations are allowed
to raise an :exc:`UnsupportedOperation` if they do not support a given to raise :exc:`UnsupportedOperation` if they do not support a given operation.
operation.
Extending :class:`IOBase` is the :class:`RawIOBase` ABC which deals simply The :class:`RawIOBase` ABC extends :class:`IOBase`. It deals with the reading
with the reading and writing of raw bytes to a stream. :class:`FileIO` and writing of bytes to a stream. :class:`FileIO` subclasses :class:`RawIOBase`
subclasses :class:`RawIOBase` to provide an interface to files in the to provide an interface to files in the machine's file system.
machine's file system.
The :class:`BufferedIOBase` ABC deals with buffering on a raw byte stream The :class:`BufferedIOBase` ABC deals with buffering on a raw byte stream
(:class:`RawIOBase`). Its subclasses, :class:`BufferedWriter`, (:class:`RawIOBase`). Its subclasses, :class:`BufferedWriter`,
:class:`BufferedReader`, and :class:`BufferedRWPair` buffer streams that are :class:`BufferedReader`, and :class:`BufferedRWPair` buffer streams that are
readable, writable, and both readable and writable. readable, writable, and both readable and writable. :class:`BufferedRandom`
:class:`BufferedRandom` provides a buffered interface to random access provides a buffered interface to random access streams. Another
streams. :class:`BytesIO` is a simple stream of in-memory bytes. :class:`BufferedIOBase` subclass, :class:`BytesIO`, is a stream of in-memory
bytes.
Another :class:`IOBase` subclass, the :class:`TextIOBase` ABC, deals with The :class:`TextIOBase` ABC, another subclass of :class:`IOBase`, deals with
streams whose bytes represent text, and handles encoding and decoding streams whose bytes represent text, and handles encoding and decoding to and
from and to strings. :class:`TextIOWrapper`, which extends it, is a from strings. :class:`TextIOWrapper`, which extends it, is a buffered text
buffered text interface to a buffered raw stream interface to a buffered raw stream (:class:`BufferedIOBase`). Finally,
(:class:`BufferedIOBase`). Finally, :class:`StringIO` is an in-memory :class:`StringIO` is an in-memory stream for text.
stream for text.
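The relationships described in this section can be verified with `issubclass`. The `HIERARCHY` table and both helper functions below are written for this example only; they restate the prose rather than any data structure in the `io` module.

```python
import io

# Parent ABC -> classes the text above describes as its subclasses.
HIERARCHY = {
    io.IOBase: [io.RawIOBase, io.BufferedIOBase, io.TextIOBase],
    io.RawIOBase: [io.FileIO],
    io.BufferedIOBase: [
        io.BufferedReader,
        io.BufferedWriter,
        io.BufferedRWPair,
        io.BufferedRandom,
        io.BytesIO,
    ],
    io.TextIOBase: [io.TextIOWrapper, io.StringIO],
}


def hierarchy_holds():
    """Return True if every listed class is a subclass of its parent ABC."""
    return all(
        issubclass(child, parent)
        for parent, children in HIERARCHY.items()
        for child in children
    )


def unsupported_fileno():
    """In-memory streams have no file descriptor, so fileno() is refused."""
    try:
        io.StringIO().fileno()
    except io.UnsupportedOperation:
        return True
    return False
```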
Argument names are not part of the specification, and only the arguments of Argument names are not part of the specification, and only the arguments of
:func:`.open` are intended to be used as keyword arguments. :func:`open` are intended to be used as keyword arguments.
I/O Base Classes I/O Base Classes
@ -381,7 +274,7 @@ I/O Base Classes
most *limit* bytes will be read. most *limit* bytes will be read.
The line terminator is always ``b'\n'`` for binary files; for text files, The line terminator is always ``b'\n'`` for binary files; for text files,
the *newline* argument to :func:`.open` can be used to select the line the *newline* argument to :func:`open` can be used to select the line
terminator(s) recognized. terminator(s) recognized.
.. method:: readlines(hint=-1) .. method:: readlines(hint=-1)
@ -873,8 +766,9 @@ Text I/O
output.close() output.close()
.. note:: .. note::
:class:`StringIO` uses a native text storage and doesn't suffer from
the performance issues of other text streams, such as those based on :class:`StringIO` uses a native text storage and doesn't suffer from the
performance issues of other text streams, such as those based on
:class:`TextIOWrapper`. :class:`TextIOWrapper`.
.. class:: IncrementalNewlineDecoder .. class:: IncrementalNewlineDecoder