gh-98401: Invalid escape sequences emit SyntaxWarning (#99011)

A backslash-character pair that is not a valid escape sequence now
generates a SyntaxWarning instead of a DeprecationWarning. For
example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an
invalid escape sequence). Use raw strings for regular expressions:
re.compile(r"\d+\.\d+"). In a future Python version, a SyntaxError
will be raised instead of a SyntaxWarning.
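
A minimal sketch of the behaviour described above: it compiles two small pieces of source text and records which warning categories are emitted. The helper name `warning_categories` and the `<example>` filename are illustrative and not part of the commit. The category seen for the non-raw literal depends on the interpreter (DeprecationWarning before 3.12, SyntaxWarning from 3.12 on), so the snippet prints it rather than assuming one.

```python
import warnings


def warning_categories(source):
    """Compile source text and return the categories of all warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is filtered out
        compile(source, "<example>", "exec")
    return [w.category for w in caught]


# The outer raw strings keep the backslashes intact, so the compiled source
# contains the literals exactly as they would appear in a file.
non_raw = warning_categories(r'pattern = "\d+\.\d+"')  # "\d" is not a valid escape
raw = warning_categories(r'pattern = r"\d+\.\d+"')     # raw literal: no escapes processed

print("non-raw literal:", [c.__name__ for c in non_raw])
print("raw literal:    ", [c.__name__ for c in raw])
```

The raw-string variant produces no warning on any version, which is why it is the recommended fix for regular expressions.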

Octal escapes with a value larger than 0o377 (e.g. "\477"), deprecated
in Python 3.11, now produce a SyntaxWarning instead of a
DeprecationWarning. In a future Python version they will eventually
become a SyntaxError.
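
The octal case can be sketched the same way. The helper `eval_recording` is a hypothetical name introduced for illustration. The resulting values follow the assertions in the commit's own tests (`chr(i)` for str literals, `bytes([i & 0o377])` for bytes literals). Interpreters older than 3.11 emit no warning for these escapes at all.

```python
import warnings


def eval_recording(source):
    """Evaluate an expression and return (value, warning categories)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        value = eval(source)
    return value, [w.category for w in caught]


# 0o477 is larger than 0o377, so these literals are the deprecated form.
str_value, str_cats = eval_recording(r"'\477'")
bytes_value, bytes_cats = eval_recording(r"b'\477'")

# 0o377 itself is still a valid octal escape and should not warn.
ok_value, ok_cats = eval_recording(r"'\377'")

print(repr(str_value), [c.__name__ for c in str_cats])
print(repr(bytes_value), [c.__name__ for c in bytes_cats])
print(repr(ok_value), [c.__name__ for c in ok_cats])
```

In a str literal the escape keeps its full value, while in a bytes literal it is truncated to the low 8 bits.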

codecs.escape_decode() and codecs.unicode_escape_decode() are left
unchanged: they still emit DeprecationWarning.
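
As a rough illustration of the paragraph above, the sketch below calls both decoders on the two bytes `\d` and records the warning category. `decode_recording` is a hypothetical helper. Note that `codecs.escape_decode()` is an undocumented function, so treat its use here as an assumption about CPython rather than a stable API.

```python
import codecs
import warnings


def decode_recording(decoder, data):
    """Call a codecs decoder and return (result, warning categories)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = decoder(data)
    return result, [w.category for w in caught]


# b"\\d" is a backslash followed by "d": an invalid escape for both decoders.
bytes_result, bytes_cats = decode_recording(codecs.escape_decode, b"\\d")
str_result, str_cats = decode_recording(codecs.unicode_escape_decode, b"\\d")

print(bytes_result, [c.__name__ for c in bytes_cats])
print(str_result, [c.__name__ for c in str_cats])
```

Both decoders keep the backslash in the result, and the warning comes from the codec at run time rather than from the parser, which is why the category change does not apply to them.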

* The parser emits SyntaxWarning only when the feature version is
  3.12 or newer, and still emits DeprecationWarning for older feature
  versions.
* Fix SyntaxWarning by using raw strings in Tools/c-analyzer/ and
  wasm_build.py.
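
The feature-version point in the first bullet can be observed from Python through the `feature_version` argument of `ast.parse()`. This is a sketch under the assumption that the parser check shown in this commit (`p->feature_version >= 12`) is what selects the category. `parse_categories` and `SOURCE` are illustrative names.

```python
import ast
import warnings


def parse_categories(source, feature_version=None):
    """Parse source text and return the categories of all warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ast.parse(source, feature_version=feature_version)
    return [w.category for w in caught]


SOURCE = r'x = "\d"'  # a non-raw literal with an invalid escape

# Default feature version: whatever the running interpreter implements.
default_cats = parse_categories(SOURCE)

# Ask the parser to behave like Python 3.11.
old_cats = parse_categories(SOURCE, feature_version=(3, 11))

print("default:", [c.__name__ for c in default_cats])
print("3.11:   ", [c.__name__ for c in old_cats])
```

Tools that parse code for an older target version therefore keep seeing the older warning category.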
Victor Stinner 2022-11-03 17:53:25 +01:00 committed by GitHub
parent 916af11a97
commit a60ddd31be
11 changed files with 69 additions and 29 deletions

View File

@ -29,7 +29,7 @@ a literal backslash, one might have to write ``'\\\\'`` as the pattern
string, because the regular expression must be ``\\``, and each string, because the regular expression must be ``\\``, and each
backslash must be expressed as ``\\`` inside a regular Python string backslash must be expressed as ``\\`` inside a regular Python string
literal. Also, please note that any invalid escape sequences in Python's literal. Also, please note that any invalid escape sequences in Python's
usage of the backslash in string literals now generate a :exc:`DeprecationWarning` usage of the backslash in string literals now generate a :exc:`SyntaxWarning`
and in the future this will become a :exc:`SyntaxError`. This behaviour and in the future this will become a :exc:`SyntaxError`. This behaviour
will happen even if it is a valid escape sequence for a regular expression. will happen even if it is a valid escape sequence for a regular expression.

View File

@ -612,9 +612,13 @@ Notes:
As in Standard C, up to three octal digits are accepted. As in Standard C, up to three octal digits are accepted.
.. versionchanged:: 3.11 .. versionchanged:: 3.11
Octal escapes with value larger than ``0o377`` produce a :exc:`DeprecationWarning`. Octal escapes with value larger than ``0o377`` produce a
In a future Python version they will be a :exc:`SyntaxWarning` and :exc:`DeprecationWarning`.
eventually a :exc:`SyntaxError`.
.. versionchanged:: 3.12
Octal escapes with value larger than ``0o377`` produce a
:exc:`SyntaxWarning`. In a future Python version they will be eventually
a :exc:`SyntaxError`.
(3) (3)
Unlike in Standard C, exactly two hex digits are required. Unlike in Standard C, exactly two hex digits are required.
@ -646,9 +650,11 @@ escape sequences only recognized in string literals fall into the category of
unrecognized escapes for bytes literals. unrecognized escapes for bytes literals.
.. versionchanged:: 3.6 .. versionchanged:: 3.6
Unrecognized escape sequences produce a :exc:`DeprecationWarning`. In Unrecognized escape sequences produce a :exc:`DeprecationWarning`.
a future Python version they will be a :exc:`SyntaxWarning` and
eventually a :exc:`SyntaxError`. .. versionchanged:: 3.12
Unrecognized escape sequences produce a :exc:`SyntaxWarning`. In a future
Python version they will be eventually a :exc:`SyntaxError`.
Even in a raw literal, quotes can be escaped with a backslash, but the Even in a raw literal, quotes can be escaped with a backslash, but the
backslash remains in the result; for example, ``r"\""`` is a valid string backslash remains in the result; for example, ``r"\""`` is a valid string

View File

@ -121,6 +121,22 @@ Other Language Changes
chance to execute the GC periodically. (Contributed by Pablo Galindo in chance to execute the GC periodically. (Contributed by Pablo Galindo in
:gh:`97922`.) :gh:`97922`.)
* A backslash-character pair that is not a valid escape sequence now generates
a :exc:`SyntaxWarning`, instead of :exc:`DeprecationWarning`.
For example, ``re.compile("\d+\.\d+")`` now emits a :exc:`SyntaxWarning`
(``"\d"`` is an invalid escape sequence), use raw strings for regular
expression: ``re.compile(r"\d+\.\d+")``.
In a future Python version, :exc:`SyntaxError` will eventually be raised,
instead of :exc:`SyntaxWarning`.
(Contributed by Victor Stinner in :gh:`98401`.)
* Octal escapes with value larger than ``0o377`` (ex: ``"\477"``), deprecated
in Python 3.11, now produce a :exc:`SyntaxWarning`, instead of
:exc:`DeprecationWarning`.
In a future Python version they will be eventually a :exc:`SyntaxError`.
(Contributed by Victor Stinner in :gh:`98401`.)
New Modules New Modules
=========== ===========

View File

@ -310,8 +310,8 @@ class CodeopTests(unittest.TestCase):
def test_warning(self): def test_warning(self):
# Test that the warning is only returned once. # Test that the warning is only returned once.
with warnings_helper.check_warnings( with warnings_helper.check_warnings(
(".*literal", SyntaxWarning), ('"is" with a literal', SyntaxWarning),
(".*invalid", DeprecationWarning), ("invalid escape sequence", SyntaxWarning),
) as w: ) as w:
compile_command(r"'\e' is 0") compile_command(r"'\e' is 0")
self.assertEqual(len(w.warnings), 2) self.assertEqual(len(w.warnings), 2)
@ -321,9 +321,9 @@ class CodeopTests(unittest.TestCase):
warnings.simplefilter('error', SyntaxWarning) warnings.simplefilter('error', SyntaxWarning)
compile_command('1 is 1', symbol='exec') compile_command('1 is 1', symbol='exec')
# Check DeprecationWarning treated as an SyntaxError # Check SyntaxWarning treated as an SyntaxError
with warnings.catch_warnings(), self.assertRaises(SyntaxError): with warnings.catch_warnings(), self.assertRaises(SyntaxError):
warnings.simplefilter('error', DeprecationWarning) warnings.simplefilter('error', SyntaxWarning)
compile_command(r"'\e'", symbol='exec') compile_command(r"'\e'", symbol='exec')
def test_incomplete_warning(self): def test_incomplete_warning(self):
@ -337,7 +337,7 @@ class CodeopTests(unittest.TestCase):
warnings.simplefilter('always') warnings.simplefilter('always')
self.assertInvalid("'\\e' 1") self.assertInvalid("'\\e' 1")
self.assertEqual(len(w), 1) self.assertEqual(len(w), 1)
self.assertEqual(w[0].category, DeprecationWarning) self.assertEqual(w[0].category, SyntaxWarning)
self.assertRegex(str(w[0].message), 'invalid escape sequence') self.assertRegex(str(w[0].message), 'invalid escape sequence')
self.assertEqual(w[0].filename, '<input>') self.assertEqual(w[0].filename, '<input>')

View File

@ -776,7 +776,7 @@ x = (
self.assertEqual(f'2\x203', '2 3') self.assertEqual(f'2\x203', '2 3')
self.assertEqual(f'\x203', ' 3') self.assertEqual(f'\x203', ' 3')
with self.assertWarns(DeprecationWarning): # invalid escape sequence with self.assertWarns(SyntaxWarning): # invalid escape sequence
value = eval(r"f'\{6*7}'") value = eval(r"f'\{6*7}'")
self.assertEqual(value, '\\42') self.assertEqual(value, '\\42')
self.assertEqual(f'\\{6*7}', '\\42') self.assertEqual(f'\\{6*7}', '\\42')

View File

@ -109,11 +109,11 @@ class TestLiterals(unittest.TestCase):
for b in range(1, 128): for b in range(1, 128):
if b in b"""\n\r"'01234567NU\\abfnrtuvx""": if b in b"""\n\r"'01234567NU\\abfnrtuvx""":
continue continue
with self.assertWarns(DeprecationWarning): with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"'\%c'" % b), '\\' + chr(b)) self.assertEqual(eval(r"'\%c'" % b), '\\' + chr(b))
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning) warnings.simplefilter('always', category=SyntaxWarning)
eval("'''\n\\z'''") eval("'''\n\\z'''")
self.assertEqual(len(w), 1) self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'") self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'")
@ -121,7 +121,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1) self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning) warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm: with self.assertRaises(SyntaxError) as cm:
eval("'''\n\\z'''") eval("'''\n\\z'''")
exc = cm.exception exc = cm.exception
@ -133,11 +133,11 @@ class TestLiterals(unittest.TestCase):
def test_eval_str_invalid_octal_escape(self): def test_eval_str_invalid_octal_escape(self):
for i in range(0o400, 0o1000): for i in range(0o400, 0o1000):
with self.assertWarns(DeprecationWarning): with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"'\%o'" % i), chr(i)) self.assertEqual(eval(r"'\%o'" % i), chr(i))
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning) warnings.simplefilter('always', category=SyntaxWarning)
eval("'''\n\\407'''") eval("'''\n\\407'''")
self.assertEqual(len(w), 1) self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), self.assertEqual(str(w[0].message),
@ -146,7 +146,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1) self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning) warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm: with self.assertRaises(SyntaxError) as cm:
eval("'''\n\\407'''") eval("'''\n\\407'''")
exc = cm.exception exc = cm.exception
@ -186,11 +186,11 @@ class TestLiterals(unittest.TestCase):
for b in range(1, 128): for b in range(1, 128):
if b in b"""\n\r"'01234567\\abfnrtvx""": if b in b"""\n\r"'01234567\\abfnrtvx""":
continue continue
with self.assertWarns(DeprecationWarning): with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"b'\%c'" % b), b'\\' + bytes([b])) self.assertEqual(eval(r"b'\%c'" % b), b'\\' + bytes([b]))
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning) warnings.simplefilter('always', category=SyntaxWarning)
eval("b'''\n\\z'''") eval("b'''\n\\z'''")
self.assertEqual(len(w), 1) self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'") self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'")
@ -198,7 +198,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1) self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning) warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm: with self.assertRaises(SyntaxError) as cm:
eval("b'''\n\\z'''") eval("b'''\n\\z'''")
exc = cm.exception exc = cm.exception
@ -209,11 +209,11 @@ class TestLiterals(unittest.TestCase):
def test_eval_bytes_invalid_octal_escape(self): def test_eval_bytes_invalid_octal_escape(self):
for i in range(0o400, 0o1000): for i in range(0o400, 0o1000):
with self.assertWarns(DeprecationWarning): with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"b'\%o'" % i), bytes([i & 0o377])) self.assertEqual(eval(r"b'\%o'" % i), bytes([i & 0o377]))
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning) warnings.simplefilter('always', category=SyntaxWarning)
eval("b'''\n\\407'''") eval("b'''\n\\407'''")
self.assertEqual(len(w), 1) self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), self.assertEqual(str(w[0].message),
@ -222,7 +222,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1) self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w: with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning) warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm: with self.assertRaises(SyntaxError) as cm:
eval("b'''\n\\407'''") eval("b'''\n\\407'''")
exc = cm.exception exc = cm.exception

View File

@ -0,0 +1,7 @@
A backslash-character pair that is not a valid escape sequence now generates a
:exc:`SyntaxWarning`, instead of :exc:`DeprecationWarning`. For example,
``re.compile("\d+\.\d+")`` now emits a :exc:`SyntaxWarning` (``"\d"`` is an
invalid escape sequence), use raw strings for regular expression:
``re.compile(r"\d+\.\d+")``. In a future Python version, :exc:`SyntaxError`
will eventually be raised, instead of :exc:`SyntaxWarning`. Patch by Victor
Stinner.

View File

@ -0,0 +1,4 @@
Octal escapes with value larger than ``0o377`` (ex: ``"\477"``), deprecated
in Python 3.11, now produce a :exc:`SyntaxWarning`, instead of
:exc:`DeprecationWarning`. In a future Python version they will be
eventually a :exc:`SyntaxError`. Patch by Victor Stinner.

View File

@ -21,9 +21,16 @@ warn_invalid_escape_sequence(Parser *p, const char *first_invalid_escape, Token
if (msg == NULL) { if (msg == NULL) {
return -1; return -1;
} }
if (PyErr_WarnExplicitObject(PyExc_DeprecationWarning, msg, p->tok->filename, PyObject *category;
if (p->feature_version >= 12) {
category = PyExc_SyntaxWarning;
}
else {
category = PyExc_DeprecationWarning;
}
if (PyErr_WarnExplicitObject(category, msg, p->tok->filename,
t->lineno, NULL, NULL) < 0) { t->lineno, NULL, NULL) < 0) {
if (PyErr_ExceptionMatches(PyExc_DeprecationWarning)) { if (PyErr_ExceptionMatches(category)) {
/* Replace the DeprecationWarning exception with a SyntaxError /* Replace the DeprecationWarning exception with a SyntaxError
to get a more accurate error report */ to get a more accurate error report */
PyErr_Clear(); PyErr_Clear();

View File

@ -96,7 +96,7 @@ def parse(srclines):
# # end matched parens # # end matched parens
# ''') # ''')
''' r'''
# for loop # for loop
(?: (?:
\s* \b for \s* \b for

View File

@ -137,7 +137,7 @@ def read_python_version(configure: pathlib.Path = CONFIGURE) -> str:
configure and configure.ac are the canonical source for major and configure and configure.ac are the canonical source for major and
minor version number. minor version number.
""" """
version_re = re.compile("^PACKAGE_VERSION='(\d\.\d+)'") version_re = re.compile(r"^PACKAGE_VERSION='(\d\.\d+)'")
with configure.open(encoding="utf-8") as f: with configure.open(encoding="utf-8") as f:
for line in f: for line in f:
mo = version_re.match(line) mo = version_re.match(line)