bpo-37330: open() no longer accepts 'U' in file mode (GH-28118)

open(), io.open(), codecs.open() and fileinput.FileInput no longer
accept "U" ("universal newline") in the file mode. This flag has been
deprecated since Python 3.3.
Victor Stinner 2021-09-02 12:58:00 +02:00 committed by GitHub
parent a806608705
commit 19ba2122ac
13 changed files with 61 additions and 104 deletions
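
For callers that still pass a legacy mode string, a minimal migration sketch is shown below. The helper name `normalize_legacy_mode` is hypothetical and not part of CPython; it only strips the removed flag, on the assumption that the caller wanted plain reading, since universal newlines are already the default in text mode.

```python
def normalize_legacy_mode(mode):
    """Return *mode* without the removed 'U' flag (hypothetical helper)."""
    if "U" not in mode:
        return mode
    stripped = mode.replace("U", "")
    # 'U' used to imply reading, so keep an explicit 'r' in the result.
    if "r" not in stripped:
        stripped = "r" + stripped
    return stripped


for legacy in ("U", "rU", "r"):
    print(legacy, "->", normalize_legacy_mode(legacy))
```

The helper does not validate the result; a combination that was already invalid (for example `'wU'`) is still left for `open()` to reject.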


@@ -204,6 +204,9 @@ wider range of codecs when working with binary files:
    *buffering* has the same meaning as for the built-in :func:`open` function.
    It defaults to -1 which means that the default buffer size will be used.
+   .. versionchanged:: 3.11
+      The ``'U'`` mode has been removed.
 .. function:: EncodedFile(file, data_encoding, file_encoding=None, errors='strict')


@@ -153,7 +153,7 @@ available for subclassing as well:
    and :meth:`~io.TextIOBase.readline` cannot be mixed.
    With *mode* you can specify which file mode will be passed to :func:`open`. It
-   must be one of ``'r'``, ``'rU'``, ``'U'`` and ``'rb'``.
+   must be one of ``'r'`` and ``'rb'``.
    The *openhook*, when given, must be a function that takes two arguments,
    *filename* and *mode*, and returns an accordingly opened file-like object. You
@@ -171,9 +171,6 @@ available for subclassing as well:
    .. versionchanged:: 3.2
       Can be used as a context manager.
-   .. deprecated:: 3.4
-      The ``'rU'`` and ``'U'`` modes.
    .. deprecated:: 3.8
       Support for :meth:`__getitem__` method is deprecated.
@@ -183,6 +180,9 @@ available for subclassing as well:
    .. versionchanged:: 3.10
       The keyword-only parameter *encoding* and *errors* are added.
+   .. versionchanged:: 3.11
+      The ``'rU'`` and ``'U'`` modes have been removed.
 **Optional in-place filtering:** if the keyword argument ``inplace=True`` is
 passed to :func:`fileinput.input` or to the :class:`FileInput` constructor, the


@@ -1156,12 +1156,6 @@ are always available. They are listed here in alphabetical order.
    first decoded using a platform-dependent encoding or using the specified
    *encoding* if given.
-   There is an additional mode character permitted, ``'U'``, which no longer
-   has any effect, and is considered deprecated. It previously enabled
-   :term:`universal newlines` in text mode, which became the default behavior
-   in Python 3.0. Refer to the documentation of the
-   :ref:`newline <open-newline-parameter>` parameter for further details.
    .. note::
       Python doesn't depend on the underlying operating system's notion of text
@@ -1304,8 +1298,7 @@ are always available. They are listed here in alphabetical order.
    The ``mode`` and ``flags`` arguments may have been modified or inferred from
    the original call.
-   .. versionchanged::
-      3.3
+   .. versionchanged:: 3.3
          * The *opener* parameter was added.
          * The ``'x'`` mode was added.
@@ -1313,30 +1306,26 @@ are always available. They are listed here in alphabetical order.
          * :exc:`FileExistsError` is now raised if the file opened in exclusive
            creation mode (``'x'``) already exists.
-   .. versionchanged::
-      3.4
+   .. versionchanged:: 3.4
          * The file is now non-inheritable.
-   .. deprecated-removed:: 3.4 3.10
-      The ``'U'`` mode.
-   .. versionchanged::
-      3.5
+   .. versionchanged:: 3.5
          * If the system call is interrupted and the signal handler does not raise an
            exception, the function now retries the system call instead of raising an
            :exc:`InterruptedError` exception (see :pep:`475` for the rationale).
          * The ``'namereplace'`` error handler was added.
-   .. versionchanged::
-      3.6
+   .. versionchanged:: 3.6
          * Support added to accept objects implementing :class:`os.PathLike`.
          * On Windows, opening a console buffer may return a subclass of
            :class:`io.RawIOBase` other than :class:`io.FileIO`.
+   .. versionchanged:: 3.11
+      The ``'U'`` mode has been removed.
 .. function:: ord(c)
    Given a string representing one Unicode character, return an integer


@@ -321,6 +321,14 @@ Changes in the Python API
   Python 3.8.
   (Contributed by Illia Volochii in :issue:`43234`.)
+* :func:`open`, :func:`io.open`, :func:`codecs.open` and
+  :class:`fileinput.FileInput` no longer accept ``'U'`` ("universal newline")
+  in the file mode. This flag was deprecated since Python 3.3. In Python 3, the
+  "universal newline" is used by default when a file is open in text mode. The
+  :ref:`newline parameter <open-newline-parameter>` of :func:`open` controls
+  how universal newlines works.
+  (Contributed by Victor Stinner in :issue:`37330`.)
 C API Changes
 =============
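
The What's New entry in this hunk points to the *newline* parameter of open() as the replacement for the removed flag. The sketch below is illustrative only: the helper `read_lines` and the temporary file are assumptions made for the example, and the sample bytes are borrowed from the removed fileinput test. It contrasts the default translation with `newline=""`, which still splits on every line ending but returns it untranslated.

```python
import os
import tempfile


def read_lines(path, newline=None):
    """Read all lines in text mode with the given newline policy."""
    with open(path, "r", encoding="utf-8", newline=newline) as f:
        return list(f)


# Write a file that mixes the three common line endings.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"A\nB\r\nC\rD")
    translated = read_lines(path)                # default: newline=None
    untranslated = read_lines(path, newline="")  # endings kept as written
finally:
    os.unlink(path)

print(translated)    # ['A\n', 'B\n', 'C\n', 'D']
print(untranslated)  # ['A\n', 'B\r\n', 'C\r', 'D']
```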


@@ -101,7 +101,6 @@ def open(file, mode="r", buffering=-1, encoding=None, errors=None,
     'b'       binary mode
     't'       text mode (default)
     '+'       open a disk file for updating (reading and writing)
-    'U'       universal newline mode (deprecated)
     ========= ===============================================================
     The default mode is 'rt' (open for reading text). For binary random
@@ -117,10 +116,6 @@ def open(file, mode="r", buffering=-1, encoding=None, errors=None,
     returned as strings, the bytes having been first decoded using a
     platform-dependent encoding or using the specified encoding if given.
-    'U' mode is deprecated and will raise an exception in future versions
-    of Python. It has no effect in Python 3. Use newline to control
-    universal newlines mode.
     buffering is an optional integer used to set the buffering policy.
     Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
     line buffering (only usable in text mode), and an integer > 1 to indicate
@@ -206,7 +201,7 @@ def open(file, mode="r", buffering=-1, encoding=None, errors=None,
     if errors is not None and not isinstance(errors, str):
         raise TypeError("invalid errors: %r" % errors)
     modes = set(mode)
-    if modes - set("axrwb+tU") or len(mode) > len(modes):
+    if modes - set("axrwb+t") or len(mode) > len(modes):
         raise ValueError("invalid mode: %r" % mode)
     creating = "x" in modes
     reading = "r" in modes
@@ -215,13 +210,6 @@ def open(file, mode="r", buffering=-1, encoding=None, errors=None,
     updating = "+" in modes
     text = "t" in modes
     binary = "b" in modes
-    if "U" in modes:
-        if creating or writing or appending or updating:
-            raise ValueError("mode U cannot be combined with 'x', 'w', 'a', or '+'")
-        import warnings
-        warnings.warn("'U' mode is deprecated",
-                      DeprecationWarning, 2)
-        reading = True
     if text and binary:
         raise ValueError("can't have text and binary mode at once")
     if creating + reading + writing + appending > 1:


@@ -217,15 +217,10 @@ class FileInput:
                               EncodingWarning, 2)
         # restrict mode argument to reading modes
-        if mode not in ('r', 'rU', 'U', 'rb'):
-            raise ValueError("FileInput opening mode must be one of "
-                             "'r', 'rU', 'U' and 'rb'")
-        if 'U' in mode:
-            import warnings
-            warnings.warn("'U' mode is deprecated",
-                          DeprecationWarning, 2)
+        if mode not in ('r', 'rb'):
+            raise ValueError("FileInput opening mode must be 'r' or 'rb'")
         self._mode = mode
-        self._write_mode = mode.replace('r', 'w') if 'U' not in mode else 'w'
+        self._write_mode = mode.replace('r', 'w')
         if openhook:
             if inplace:
                 raise ValueError("FileInput cannot use an opening hook in inplace mode")
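
The mode check in this hunk can be exercised with a short, hedged usage sketch. The temporary file and the variable names are assumptions made for the example. It reads in mode 'r', where mixed line endings are already translated, and then shows the constructor rejecting a writing mode; with this change applied, 'U' and 'rU' fail the same membership test.

```python
import fileinput
import os
import tempfile

# Create a small input file with mixed line endings.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"A\nB\r\nC\rD")
    # Mode 'r' opens the file in text mode, so '\r\n' and '\r' become '\n'.
    with fileinput.FileInput(files=[path], mode="r") as fi:
        lines = list(fi)
finally:
    os.unlink(path)

# Any mode other than the accepted reading modes raises ValueError.
try:
    fileinput.FileInput(files=[], mode="w")
except ValueError as exc:
    rejected = str(exc)
else:
    rejected = None

print(lines)
print(rejected)
```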


@@ -226,7 +226,7 @@ def load_module(name, file, filename, details):
     """
     suffix, mode, type_ = details
-    if mode and (not mode.startswith(('r', 'U')) or '+' in mode):
+    if mode and (not mode.startswith('r') or '+' in mode):
         raise ValueError('invalid file open mode {!r}'.format(mode))
     elif file is None and type_ in {PY_SOURCE, PY_COMPILED}:
         msg = 'file object required for import (type code {})'.format(type_)


@@ -714,11 +714,23 @@ class UTF16Test(ReadTest, unittest.TestCase):
         self.addCleanup(os_helper.unlink, os_helper.TESTFN)
         with open(os_helper.TESTFN, 'wb') as fp:
             fp.write(s)
-        with warnings_helper.check_warnings(('', DeprecationWarning)):
-            reader = codecs.open(os_helper.TESTFN, 'U', encoding=self.encoding)
-        with reader:
+        with codecs.open(os_helper.TESTFN, 'r',
+                         encoding=self.encoding) as reader:
             self.assertEqual(reader.read(), s1)
+    def test_invalid_modes(self):
+        for mode in ('U', 'rU', 'r+U'):
+            with self.assertRaises(ValueError) as cm:
+                codecs.open(os_helper.TESTFN, mode, encoding=self.encoding)
+            self.assertIn('invalid mode', str(cm.exception))
+        for mode in ('rt', 'wt', 'at', 'r+t'):
+            with self.assertRaises(ValueError) as cm:
+                codecs.open(os_helper.TESTFN, mode, encoding=self.encoding)
+            self.assertIn("can't have text and binary mode at once",
+                          str(cm.exception))
 class UTF16LETest(ReadTest, unittest.TestCase):
     encoding = "utf-16-le"
     ill_formed_sequence = b"\x80\xdc"


@@ -230,20 +230,11 @@ class FileInputTests(BaseTests, unittest.TestCase):
         line = list(fi)
         self.assertEqual(fi.fileno(), -1)
-    def test_opening_mode(self):
-        try:
-            # invalid mode, should raise ValueError
-            fi = FileInput(mode="w", encoding="utf-8")
-            self.fail("FileInput should reject invalid mode argument")
-        except ValueError:
-            pass
-        # try opening in universal newline mode
-        t1 = self.writeTmp(b"A\nB\r\nC\rD", mode="wb")
-        with warnings_helper.check_warnings(('', DeprecationWarning)):
-            fi = FileInput(files=t1, mode="U", encoding="utf-8")
-        with warnings_helper.check_warnings(('', DeprecationWarning)):
-            lines = list(fi)
-        self.assertEqual(lines, ["A\n", "B\n", "C\n", "D"])
+    def test_invalid_opening_mode(self):
+        for mode in ('w', 'rU', 'U'):
+            with self.subTest(mode=mode):
+                with self.assertRaises(ValueError):
+                    FileInput(mode=mode)
     def test_stdin_binary_mode(self):
         with mock.patch('sys.stdin') as m_stdin:
@@ -1015,10 +1006,6 @@ class Test_hook_encoded(unittest.TestCase):
             self.assertEqual(lines, expected_lines)
         check('r', ['A\n', 'B\n', 'C\n', 'D\u20ac'])
-        with self.assertWarns(DeprecationWarning):
-            check('rU', ['A\n', 'B\n', 'C\n', 'D\u20ac'])
-        with self.assertWarns(DeprecationWarning):
-            check('U', ['A\n', 'B\n', 'C\n', 'D\u20ac'])
         with self.assertRaises(ValueError):
             check('rb', ['A\n', 'B\r\n', 'C\r', 'D\u20ac'])


@@ -3954,16 +3954,6 @@ class MiscIOTest(unittest.TestCase):
         self.assertEqual(f.mode, "wb")
         f.close()
-        with warnings_helper.check_warnings(('', DeprecationWarning)):
-            f = self.open(os_helper.TESTFN, "U", encoding="utf-8")
-        self.assertEqual(f.name, os_helper.TESTFN)
-        self.assertEqual(f.buffer.name, os_helper.TESTFN)
-        self.assertEqual(f.buffer.raw.name, os_helper.TESTFN)
-        self.assertEqual(f.mode, "U")
-        self.assertEqual(f.buffer.mode, "rb")
-        self.assertEqual(f.buffer.raw.mode, "rb")
-        f.close()
         f = self.open(os_helper.TESTFN, "w+", encoding="utf-8")
         self.assertEqual(f.mode, "w+")
         self.assertEqual(f.buffer.mode, "rb+") # Does it really matter?
@@ -3977,6 +3967,13 @@ class MiscIOTest(unittest.TestCase):
         f.close()
         g.close()
+    def test_removed_u_mode(self):
+        # bpo-37330: The "U" mode has been removed in Python 3.11
+        for mode in ("U", "rU", "r+U"):
+            with self.assertRaises(ValueError) as cm:
+                self.open(os_helper.TESTFN, mode)
+            self.assertIn('invalid mode', str(cm.exception))
     def test_open_pipe_with_append(self):
         # bpo-27805: Ignore ESPIPE from lseek() in open().
         r, w = os.pipe()


@@ -0,0 +1,4 @@
+:func:`open`, :func:`io.open`, :func:`codecs.open` and
+:class:`fileinput.FileInput` no longer accept ``'U'`` ("universal newline")
+in the file mode. This flag was deprecated since Python 3.3. Patch by Victor
+Stinner.


@@ -138,7 +138,6 @@ Character Meaning
 'b'       binary mode
 't'       text mode (default)
 '+'       open a disk file for updating (reading and writing)
-'U'       universal newline mode (deprecated)
 ========= ===============================================================
 The default mode is 'rt' (open for reading text). For binary random
@@ -154,10 +153,6 @@ bytes objects without any decoding. In text mode (the default, or when
 returned as strings, the bytes having been first decoded using a
 platform-dependent encoding or using the specified encoding if given.
-'U' mode is deprecated and will raise an exception in future versions
-of Python. It has no effect in Python 3. Use newline to control
-universal newlines mode.
 buffering is an optional integer used to set the buffering policy.
 Pass 0 to switch buffering off (only allowed in binary mode), 1 to select
 line buffering (only usable in text mode), and an integer > 1 to indicate
@@ -233,12 +228,12 @@ static PyObject *
 _io_open_impl(PyObject *module, PyObject *file, const char *mode,
               int buffering, const char *encoding, const char *errors,
               const char *newline, int closefd, PyObject *opener)
-/*[clinic end generated code: output=aefafc4ce2b46dc0 input=7295902222e6b311]*/
+/*[clinic end generated code: output=aefafc4ce2b46dc0 input=1543f4511d2356a5]*/
 {
     unsigned i;
     int creating = 0, reading = 0, writing = 0, appending = 0, updating = 0;
-    int text = 0, binary = 0, universal = 0;
+    int text = 0, binary = 0;
     char rawmode[6], *m;
     int line_buffering, is_number;
@@ -296,10 +291,6 @@ _io_open_impl(PyObject *module, PyObject *file, const char *mode,
         case 'b':
             binary = 1;
             break;
-        case 'U':
-            universal = 1;
-            reading = 1;
-            break;
         default:
             goto invalid_mode;
         }
@@ -322,18 +313,6 @@ _io_open_impl(PyObject *module, PyObject *file, const char *mode,
     *m = '\0';
     /* Parameters validation */
-    if (universal) {
-        if (creating || writing || appending || updating) {
-            PyErr_SetString(PyExc_ValueError,
-                            "mode U cannot be combined with 'x', 'w', 'a', or '+'");
-            goto error;
-        }
-        if (PyErr_WarnEx(PyExc_DeprecationWarning,
-                         "'U' mode is deprecated", 1) < 0)
-            goto error;
-        reading = 1;
-    }
     if (text && binary) {
         PyErr_SetString(PyExc_ValueError,
                         "can't have text and binary mode at once");


@@ -36,7 +36,6 @@ PyDoc_STRVAR(_io_open__doc__,
 "\'b\'       binary mode\n"
 "\'t\'       text mode (default)\n"
 "\'+\'       open a disk file for updating (reading and writing)\n"
-"\'U\'       universal newline mode (deprecated)\n"
 "========= ===============================================================\n"
 "\n"
 "The default mode is \'rt\' (open for reading text). For binary random\n"
@@ -52,10 +51,6 @@ PyDoc_STRVAR(_io_open__doc__,
 "returned as strings, the bytes having been first decoded using a\n"
 "platform-dependent encoding or using the specified encoding if given.\n"
 "\n"
-"\'U\' mode is deprecated and will raise an exception in future versions\n"
-"of Python. It has no effect in Python 3. Use newline to control\n"
-"universal newlines mode.\n"
-"\n"
 "buffering is an optional integer used to set the buffering policy.\n"
 "Pass 0 to switch buffering off (only allowed in binary mode), 1 to select\n"
 "line buffering (only usable in text mode), and an integer > 1 to indicate\n"
@@ -359,4 +354,4 @@ _io_open_code(PyObject *module, PyObject *const *args, Py_ssize_t nargs, PyObjec
 exit:
     return return_value;
 }
-/*[clinic end generated code: output=06e055d1d80b835d input=a9049054013a1b77]*/
+/*[clinic end generated code: output=6ea315343f6a94ba input=a9049054013a1b77]*/