gh-85679: Recommend `encoding="utf-8"` in tutorial (GH-91778)

This commit is contained in:
Inada Naoki 2022-05-02 17:25:05 +09:00 committed by GitHub
parent d414f7ece8
commit 614420df97
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 18 additions and 10 deletions

View File

@ -279,11 +279,12 @@ Reading and Writing Files
object: file
:func:`open` returns a :term:`file object`, and is most commonly used with
two arguments: ``open(filename, mode)``.
two positional arguments and one keyword argument:
``open(filename, mode, encoding=None)``
::
>>> f = open('workfile', 'w')
>>> f = open('workfile', 'w', encoding="utf-8")
.. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>
@ -300,11 +301,14 @@ writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.
Normally, files are opened in :dfn:`text mode`, that means, you read and write
strings from and to the file, which are encoded in a specific encoding. If
encoding is not specified, the default is platform dependent (see
:func:`open`). ``'b'`` appended to the mode opens the file in
:dfn:`binary mode`: now the data is read and written in the form of bytes
objects. This mode should be used for all files that don't contain text.
strings from and to the file, which are encoded in a specific *encoding*.
If *encoding* is not specified, the default is platform dependent
(see :func:`open`).
Because UTF-8 is the modern de-facto standard, ``encoding="utf-8"`` is
recommended unless you know that you need to use a different encoding.
Appending a ``'b'`` to the mode opens the file in :dfn:`binary mode`.
Binary mode data is read and written as :class:`bytes` objects.
You can not specify *encoding* when opening file in binary mode.
In text mode, the default when reading is to convert platform-specific line
endings (``\n`` on Unix, ``\r\n`` on Windows) to just ``\n``. When writing in
@ -320,7 +324,7 @@ after its suite finishes, even if an exception is raised at some
point. Using :keyword:`!with` is also much shorter than writing
equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::
>>> with open('workfile') as f:
>>> with open('workfile', encoding="utf-8") as f:
... read_data = f.read()
>>> # We can check that the file has been automatically closed.
@ -490,11 +494,15 @@ simply serializes the object to a :term:`text file`. So if ``f`` is a
json.dump(x, f)
To decode the object again, if ``f`` is a :term:`text file` object which has
been opened for reading::
To decode the object again, if ``f`` is a :term:`binary file` or
:term:`text file` object which has been opened for reading::
x = json.load(f)
.. note::
JSON files must be encoded in UTF-8. Use ``encoding="utf-8"`` when opening
JSON file as a :term:`text file` for both of reading and writing.
This simple serialization technique can handle lists and dictionaries, but
serializing arbitrary class instances in JSON requires a bit of extra effort.
The reference for the :mod:`json` module contains an explanation of this.