bpo-32614: Modify re examples to use a raw string to prevent warning (GH-5265)

Modify RE examples in documentation to use raw strings to prevent DeprecationWarning.
Add text to REGEX HOWTO to highlight the deprecation.  Approved by Serhiy Storchaka.
Cheryl Sabella 2018-02-02 16:16:27 -05:00 committed by Terry Jan Reedy
parent bbbcf8693b
commit 66771422d0
4 changed files with 24 additions and 6 deletions
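
The warning named in the commit message can be reproduced with a small sketch like the one below. It is illustrative only and not part of the commit: the variable names and the `<cooked-example>` filename are made up here. The category reported is `DeprecationWarning` on the Python versions this change targeted; newer interpreters (3.12 and later) report `SyntaxWarning` for the same literal.

```python
import re
import warnings

# Source text of a statement that uses a non-raw ("cooked") literal.
# It is held in a raw string so that this example itself compiles cleanly.
cooked_source = r"p = '\d+'"

with warnings.catch_warnings(record=True) as caught:
    # Record every warning, including categories hidden by default.
    warnings.simplefilter("always")
    # Compiling the source is enough to trigger the invalid-escape warning.
    compile(cooked_source, "<cooked-example>", "exec")

# Names of the warning categories that were raised during compilation.
warning_names = [w.category.__name__ for w in caught]
print(warning_names)

# A raw literal and a doubled-backslash literal denote the same string.
assert r'\d+' == '\\d+'

# The raw-string form used by the updated documentation examples.
digits = re.compile(r'\d+').findall(
    '12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
print(digits)
```

The raw-string spelling and the escaped spelling behave identically at run time; the commit prefers the raw form because it matches the advice already given in the HOWTO.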

@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.
+.. _the-backslash-plague:
 The Backslash Plague
 --------------------
@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.
+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
 +-------------------+------------------+
 | Regular String | Raw string |
 +===================+==================+
@@ -457,10 +466,16 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.Pattern.findall` returns a list of matching strings::
->>> p = re.compile('\d+')
+>>> p = re.compile(r'\d+')
 >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
 ['12', '11', '10']
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
 result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::
->>> re.split('[\W]+', 'Words, words, words.')
+>>> re.split(r'[\W]+', 'Words, words, words.')
 ['Words', 'words', 'words', '']
->>> re.split('([\W]+)', 'Words, words, words.')
+>>> re.split(r'([\W]+)', 'Words, words, words.')
 ['Words', ', ', 'words', ', ', 'words', '.', '']
->>> re.split('[\W]+', 'Words, words, words.', 1)
+>>> re.split(r'[\W]+', 'Words, words, words.', 1)
 ['Words', 'words, words.']

@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
 Arabic numerals::
 import re
-p = re.compile('\d+')
+p = re.compile(r'\d+')
 s = "Over \u0e55\u0e57 57 flavours"
 m = p.search(s)

@@ -345,7 +345,7 @@ The special characters are:
 This example looks for a word following a hyphen:
->>> m = re.search('(?<=-)\w+', 'spam-egg')
+>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
 >>> m.group(0)
 'egg'

@@ -0,0 +1,3 @@
+Modify RE examples in documentation to use raw strings to prevent
+:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
+deprecation.