mirror of https://github.com/python/cpython
bpo-32614: Modify re examples to use a raw string to prevent warning (GH-5265)
Modify RE examples in documentation to use raw strings to prevent DeprecationWarning. Add text to REGEX HOWTO to highlight the deprecation. Approved by Serhiy Storchaka.
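
The warning this commit avoids can be reproduced with a short sketch. It is not part of the commit. The helper name `warning_categories` and the `"<example>"` filename are illustrative assumptions, and the exact warning category depends on the Python version (`DeprecationWarning` at the time of this change, `SyntaxWarning` in more recent releases).

```python
import re
import warnings

# Source text of two string literals, kept inside raw strings so that this
# example does not itself contain an invalid escape sequence.
COOKED_LITERAL = r"'\d+'"   # a plain literal containing the invalid escape \d
RAW_LITERAL = r"r'\d+'"     # the same pattern written as a raw string literal


def warning_categories(literal_source):
    """Compile the source of a literal and return the warning categories raised.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(literal_source, "<example>", "eval")
    return [w.category for w in caught]


cooked = warning_categories(COOKED_LITERAL)
raw = warning_categories(RAW_LITERAL)

print("plain literal warnings:", [c.__name__ for c in cooked])
print("raw literal warnings:  ", [c.__name__ for c in raw])

# A raw string and a doubled backslash spell the same two-character escape.
same_pattern = r'\d+' == '\\d+'
matches = re.findall(r'\d+', '12 drummers drumming, 11 pipers piping')
print(same_pattern, matches)
```

If a future Python version turns the invalid escape into a `SyntaxError`, as the documentation text in this commit anticipates, the `compile()` call for the plain literal would raise instead of warning.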
parent bbbcf8693b
commit 66771422d0
@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.


+.. _the-backslash-plague:
+
 The Backslash Plague
 --------------------

@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.

+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
+
+
 +-------------------+------------------+
 | Regular String    | Raw string       |
 +===================+==================+
@@ -457,10 +466,16 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.Pattern.findall` returns a list of matching strings::

-   >>> p = re.compile('\d+')
+   >>> p = re.compile(r'\d+')
    >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
    ['12', '11', '10']

+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
+
 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
 result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::

-   >>> re.split('[\W]+', 'Words, words, words.')
+   >>> re.split(r'[\W]+', 'Words, words, words.')
    ['Words', 'words', 'words', '']
-   >>> re.split('([\W]+)', 'Words, words, words.')
+   >>> re.split(r'([\W]+)', 'Words, words, words.')
    ['Words', ', ', 'words', ', ', 'words', '.', '']
-   >>> re.split('[\W]+', 'Words, words, words.', 1)
+   >>> re.split(r'[\W]+', 'Words, words, words.', 1)
    ['Words', 'words, words.']


@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
 Arabic numerals::

    import re
-   p = re.compile('\d+')
+   p = re.compile(r'\d+')

    s = "Over \u0e55\u0e57 57 flavours"
    m = p.search(s)
@@ -345,7 +345,7 @@ The special characters are:

 This example looks for a word following a hyphen:

->>> m = re.search('(?<=-)\w+', 'spam-egg')
+>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
 >>> m.group(0)
 'egg'

@@ -0,0 +1,3 @@
+Modify RE examples in documentation to use raw strings to prevent
+:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
+deprecation.