gh-98401: Invalid escape sequences emit SyntaxWarning (#99011)

A backslash-character pair that is not a valid escape sequence now
generates a SyntaxWarning instead of a DeprecationWarning. For
example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an
invalid escape sequence). Use raw strings for regular expressions:
re.compile(r"\d+\.\d+"). In a future Python version, SyntaxError will
be raised instead of SyntaxWarning.
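
The difference between the two spellings can be checked with a small
sketch. The helper name compile_and_collect and the "<demo>" filename
are illustrative and not part of this change, and the category observed
depends on the Python version running the code.

```python
import re
import warnings


def compile_and_collect(source):
    """Compile source text and return (category name, message) pairs."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            compile(source, "<demo>", "exec")
        except SyntaxError as exc:
            # Possible on a future Python version where the warning
            # has been turned into an error.
            return [("SyntaxError", str(exc))]
    return [(w.category.__name__, str(w.message)) for w in caught]


# Source text with a regular string literal containing the invalid escape "\d".
bad_source = 'pattern = "\\d+\\.\\d+"'
# The same pattern written as a raw string literal.
good_source = 'pattern = r"\\d+\\.\\d+"'

bad_warnings = compile_and_collect(bad_source)
good_warnings = compile_and_collect(good_source)

# The raw string reaches the regex engine with its backslashes intact.
assert re.compile(r"\d+\.\d+").fullmatch("3.12")
```

With this change, bad_warnings should report SyntaxWarning on Python
3.12 and DeprecationWarning on older versions, while good_warnings stays
empty.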

Octal escapes with a value larger than 0o377 (e.g. "\477"), deprecated
in Python 3.11, now produce a SyntaxWarning instead of a
DeprecationWarning. In a future Python version they will eventually
become a SyntaxError.
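
A minimal sketch of what such an octal escape evaluates to today, based
on the expectations in the test suite changed by this commit. It assumes
a Python version where the escape is still accepted. Versions before
3.11 emit no warning at all, and a future version may raise SyntaxError
instead.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # str literal: the full value 0o477 is kept as a code point.
    text = eval(r"'\477'")
    # bytes literal: only the low 8 bits of the value are kept.
    data = eval(r"b'\477'")

octal_categories = [w.category.__name__ for w in caught]

assert text == chr(0o477)
assert data == bytes([0o477 & 0o377])
```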

codecs.escape_decode() and codecs.unicode_escape_decode() are left
unchanged: they still emit DeprecationWarning.
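
As a rough illustration of the codec behaviour, the sketch below decodes
the two-byte input \d with both functions and records the warning
categories. The variable names are illustrative only.

```python
import codecs
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Both codec functions return a (decoded value, bytes consumed) pair.
    text, consumed = codecs.unicode_escape_decode(rb"\d")
    raw, _ = codecs.escape_decode(rb"\d")

codec_categories = [w.category.__name__ for w in caught]

# The unrecognized escape is passed through unchanged.
assert text == "\\d" and consumed == 2
assert raw == b"\\d"
```

Here codec_categories is expected to contain only DeprecationWarning,
regardless of what the parser emits for string literals.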

* The parser emits SyntaxWarning only when the feature version is 3.12
  or newer, and still emits DeprecationWarning for older feature
  versions.
* Fix SyntaxWarning by using raw strings in Tools/c-analyzer/ and
  wasm_build.py.
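
The feature-version rule in the first bullet can be observed from Python
through ast.parse(), which accepts a feature_version argument. This is a
sketch under the assumption that the parser's check is reachable this
way. The helper name parse_categories is hypothetical, and on an
interpreter older than 3.12 both calls are expected to report
DeprecationWarning.

```python
import ast
import warnings


def parse_categories(feature_version):
    """Parse a literal with an invalid escape; return warning category names."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ast.parse(r"'\d'", feature_version=feature_version)
    return [w.category.__name__ for w in caught]


older = parse_categories((3, 11))  # grammar of an older Python version
newer = parse_categories((3, 12))  # grammar of Python 3.12
```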
Victor Stinner 2022-11-03 17:53:25 +01:00 committed by GitHub
parent 916af11a97
commit a60ddd31be
11 changed files with 69 additions and 29 deletions


@ -29,7 +29,7 @@ a literal backslash, one might have to write ``'\\\\'`` as the pattern
string, because the regular expression must be ``\\``, and each
backslash must be expressed as ``\\`` inside a regular Python string
literal. Also, please note that any invalid escape sequences in Python's
usage of the backslash in string literals now generate a :exc:`DeprecationWarning`
usage of the backslash in string literals now generate a :exc:`SyntaxWarning`
and in the future this will become a :exc:`SyntaxError`. This behaviour
will happen even if it is a valid escape sequence for a regular expression.


@ -612,9 +612,13 @@ Notes:
As in Standard C, up to three octal digits are accepted.
.. versionchanged:: 3.11
Octal escapes with value larger than ``0o377`` produce a :exc:`DeprecationWarning`.
In a future Python version they will be a :exc:`SyntaxWarning` and
eventually a :exc:`SyntaxError`.
Octal escapes with value larger than ``0o377`` produce a
:exc:`DeprecationWarning`.
.. versionchanged:: 3.12
Octal escapes with value larger than ``0o377`` produce a
:exc:`SyntaxWarning`. In a future Python version they will be eventually
a :exc:`SyntaxError`.
(3)
Unlike in Standard C, exactly two hex digits are required.
@ -646,9 +650,11 @@ escape sequences only recognized in string literals fall into the category of
unrecognized escapes for bytes literals.
.. versionchanged:: 3.6
Unrecognized escape sequences produce a :exc:`DeprecationWarning`. In
a future Python version they will be a :exc:`SyntaxWarning` and
eventually a :exc:`SyntaxError`.
Unrecognized escape sequences produce a :exc:`DeprecationWarning`.
.. versionchanged:: 3.12
Unrecognized escape sequences produce a :exc:`SyntaxWarning`. In a future
Python version they will be eventually a :exc:`SyntaxError`.
Even in a raw literal, quotes can be escaped with a backslash, but the
backslash remains in the result; for example, ``r"\""`` is a valid string


@ -121,6 +121,22 @@ Other Language Changes
chance to execute the GC periodically. (Contributed by Pablo Galindo in
:gh:`97922`.)
* A backslash-character pair that is not a valid escape sequence now generates
a :exc:`SyntaxWarning`, instead of :exc:`DeprecationWarning`.
For example, ``re.compile("\d+\.\d+")`` now emits a :exc:`SyntaxWarning`
(``"\d"`` is an invalid escape sequence), use raw strings for regular
expression: ``re.compile(r"\d+\.\d+")``.
In a future Python version, :exc:`SyntaxError` will eventually be raised,
instead of :exc:`SyntaxWarning`.
(Contributed by Victor Stinner in :gh:`98401`.)
* Octal escapes with value larger than ``0o377`` (ex: ``"\477"``), deprecated
in Python 3.11, now produce a :exc:`SyntaxWarning`, instead of
:exc:`DeprecationWarning`.
In a future Python version they will be eventually a :exc:`SyntaxError`.
(Contributed by Victor Stinner in :gh:`98401`.)
New Modules
===========


@ -310,8 +310,8 @@ class CodeopTests(unittest.TestCase):
def test_warning(self):
# Test that the warning is only returned once.
with warnings_helper.check_warnings(
(".*literal", SyntaxWarning),
(".*invalid", DeprecationWarning),
('"is" with a literal', SyntaxWarning),
("invalid escape sequence", SyntaxWarning),
) as w:
compile_command(r"'\e' is 0")
self.assertEqual(len(w.warnings), 2)
@ -321,9 +321,9 @@ class CodeopTests(unittest.TestCase):
warnings.simplefilter('error', SyntaxWarning)
compile_command('1 is 1', symbol='exec')
# Check DeprecationWarning treated as an SyntaxError
# Check SyntaxWarning treated as an SyntaxError
with warnings.catch_warnings(), self.assertRaises(SyntaxError):
warnings.simplefilter('error', DeprecationWarning)
warnings.simplefilter('error', SyntaxWarning)
compile_command(r"'\e'", symbol='exec')
def test_incomplete_warning(self):
@ -337,7 +337,7 @@ class CodeopTests(unittest.TestCase):
warnings.simplefilter('always')
self.assertInvalid("'\\e' 1")
self.assertEqual(len(w), 1)
self.assertEqual(w[0].category, DeprecationWarning)
self.assertEqual(w[0].category, SyntaxWarning)
self.assertRegex(str(w[0].message), 'invalid escape sequence')
self.assertEqual(w[0].filename, '<input>')


@ -776,7 +776,7 @@ x = (
self.assertEqual(f'2\x203', '2 3')
self.assertEqual(f'\x203', ' 3')
with self.assertWarns(DeprecationWarning): # invalid escape sequence
with self.assertWarns(SyntaxWarning): # invalid escape sequence
value = eval(r"f'\{6*7}'")
self.assertEqual(value, '\\42')
self.assertEqual(f'\\{6*7}', '\\42')


@ -109,11 +109,11 @@ class TestLiterals(unittest.TestCase):
for b in range(1, 128):
if b in b"""\n\r"'01234567NU\\abfnrtuvx""":
continue
with self.assertWarns(DeprecationWarning):
with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"'\%c'" % b), '\\' + chr(b))
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning)
warnings.simplefilter('always', category=SyntaxWarning)
eval("'''\n\\z'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'")
@ -121,7 +121,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning)
warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("'''\n\\z'''")
exc = cm.exception
@ -133,11 +133,11 @@ class TestLiterals(unittest.TestCase):
def test_eval_str_invalid_octal_escape(self):
for i in range(0o400, 0o1000):
with self.assertWarns(DeprecationWarning):
with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"'\%o'" % i), chr(i))
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning)
warnings.simplefilter('always', category=SyntaxWarning)
eval("'''\n\\407'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message),
@ -146,7 +146,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning)
warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("'''\n\\407'''")
exc = cm.exception
@ -186,11 +186,11 @@ class TestLiterals(unittest.TestCase):
for b in range(1, 128):
if b in b"""\n\r"'01234567\\abfnrtvx""":
continue
with self.assertWarns(DeprecationWarning):
with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"b'\%c'" % b), b'\\' + bytes([b]))
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning)
warnings.simplefilter('always', category=SyntaxWarning)
eval("b'''\n\\z'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'")
@ -198,7 +198,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning)
warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("b'''\n\\z'''")
exc = cm.exception
@ -209,11 +209,11 @@ class TestLiterals(unittest.TestCase):
def test_eval_bytes_invalid_octal_escape(self):
for i in range(0o400, 0o1000):
with self.assertWarns(DeprecationWarning):
with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"b'\%o'" % i), bytes([i & 0o377]))
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always', category=DeprecationWarning)
warnings.simplefilter('always', category=SyntaxWarning)
eval("b'''\n\\407'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message),
@ -222,7 +222,7 @@ class TestLiterals(unittest.TestCase):
self.assertEqual(w[0].lineno, 1)
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('error', category=DeprecationWarning)
warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("b'''\n\\407'''")
exc = cm.exception


@ -0,0 +1,7 @@
A backslash-character pair that is not a valid escape sequence now generates a
:exc:`SyntaxWarning`, instead of :exc:`DeprecationWarning`. For example,
``re.compile("\d+\.\d+")`` now emits a :exc:`SyntaxWarning` (``"\d"`` is an
invalid escape sequence), use raw strings for regular expression:
``re.compile(r"\d+\.\d+")``. In a future Python version, :exc:`SyntaxError`
will eventually be raised, instead of :exc:`SyntaxWarning`. Patch by Victor
Stinner.


@ -0,0 +1,4 @@
Octal escapes with value larger than ``0o377`` (ex: ``"\477"``), deprecated
in Python 3.11, now produce a :exc:`SyntaxWarning`, instead of
:exc:`DeprecationWarning`. In a future Python version they will be
eventually a :exc:`SyntaxError`. Patch by Victor Stinner.


@ -21,9 +21,16 @@ warn_invalid_escape_sequence(Parser *p, const char *first_invalid_escape, Token
if (msg == NULL) {
return -1;
}
if (PyErr_WarnExplicitObject(PyExc_DeprecationWarning, msg, p->tok->filename,
PyObject *category;
if (p->feature_version >= 12) {
category = PyExc_SyntaxWarning;
}
else {
category = PyExc_DeprecationWarning;
}
if (PyErr_WarnExplicitObject(category, msg, p->tok->filename,
t->lineno, NULL, NULL) < 0) {
if (PyErr_ExceptionMatches(PyExc_DeprecationWarning)) {
if (PyErr_ExceptionMatches(category)) {
/* Replace the DeprecationWarning exception with a SyntaxError
to get a more accurate error report */
PyErr_Clear();


@ -96,7 +96,7 @@ def parse(srclines):
# # end matched parens
# ''')
'''
r'''
# for loop
(?:
\s* \b for


@ -137,7 +137,7 @@ def read_python_version(configure: pathlib.Path = CONFIGURE) -> str:
configure and configure.ac are the canonical source for major and
minor version number.
"""
version_re = re.compile("^PACKAGE_VERSION='(\d\.\d+)'")
version_re = re.compile(r"^PACKAGE_VERSION='(\d\.\d+)'")
with configure.open(encoding="utf-8") as f:
for line in f:
mo = version_re.match(line)