[3.6] bpo-32614: Modify re examples to use a raw string to prevent warning (GH-5265) (GH-5500)

Modify RE examples in documentation to use raw strings to prevent DeprecationWarning.
Add text to REGEX HOWTO to highlight the deprecation.  Approved by Serhiy Storchaka.

(cherry picked from commit 66771422d0)
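
The warning this change avoids can be reproduced without building the documentation. What follows is a minimal sketch, not part of the commit: the helper name `escape_diagnostics` is hypothetical, and the warning category is assumed to depend on the interpreter (`DeprecationWarning` on the 3.6 line this backport targets, `SyntaxWarning` from Python 3.12 on), so the sketch records whatever category is raised instead of hard-coding one.

```python
import warnings


def escape_diagnostics(source):
    """Compile `source` and return the categories of any diagnostics raised.

    Hypothetical helper for illustration only; it is not part of the commit.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        try:
            compile(source, "<example>", "exec")
        except SyntaxError:
            # Assumption: a future interpreter may reject the literal outright.
            return [SyntaxError]
    return [w.category for w in caught]


# The doubled backslashes below only exist so this file itself stays clean;
# the source text being compiled is  p = '\d+'  and  p = r'\d+'.
cooked_result = escape_diagnostics("p = '\\d+'")
raw_result = escape_diagnostics("p = r'\\d+'")

print("cooked literal:", cooked_result)
print("raw literal:   ", raw_result)
```

The raw literal should compile silently, while the cooked literal should report at least one diagnostic on any interpreter that implements the deprecation.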
Terry Jan Reedy 2018-02-02 17:37:30 -05:00 committed by GitHub
parent f61951b10c
commit fbf8e823c0
4 changed files with 26 additions and 8 deletions


@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.
+.. _the-backslash-plague:
 The Backslash Plague
 --------------------
@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.
+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
 +-------------------+------------------+
 | Regular String    | Raw string       |
 +===================+==================+
@@ -457,12 +466,18 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.pattern.findall` returns a list of matching strings::
->>> p = re.compile('\d+')
+>>> p = re.compile(r'\d+')
 >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
 ['12', '11', '10']
-:meth:`~re.pattern.findall` has to create the entire list before it can be returned as the
-result. The :meth:`~re.pattern.finditer` method returns a sequence of
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
+:meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
+result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::
 >>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::
->>> re.split('[\W]+', 'Words, words, words.')
+>>> re.split(r'[\W]+', 'Words, words, words.')
 ['Words', 'words', 'words', '']
->>> re.split('([\W]+)', 'Words, words, words.')
+>>> re.split(r'([\W]+)', 'Words, words, words.')
 ['Words', ', ', 'words', ', ', 'words', '.', '']
->>> re.split('[\W]+', 'Words, words, words.', 1)
+>>> re.split(r'[\W]+', 'Words, words, words.', 1)
 ['Words', 'words, words.']
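
The three raw-string calls in the hunk above can be run as a plain script to confirm the outputs it shows. This is a sketch assembled from the hunk's own lines; the variable names are illustrative, and passing `maxsplit=1` by keyword is a stylistic assumption for recent interpreters, not something the commit changes.

```python
import re

text = 'Words, words, words.'

# Split on runs of non-word characters; the trailing '.' leaves an empty tail.
plain = re.split(r'[\W]+', text)

# A capturing group keeps the separators in the result list.
with_separators = re.split(r'([\W]+)', text)

# Limit the number of splits to one.
first_only = re.split(r'[\W]+', text, maxsplit=1)

print(plain)
print(with_separators)
print(first_only)
```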


@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
 Arabic numerals::
 import re
-p = re.compile('\d+')
+p = re.compile(r'\d+')
 s = "Over \u0e55\u0e57 57 flavours"
 m = p.search(s)


@@ -315,7 +315,7 @@ The special characters are:
 This example looks for a word following a hyphen:
->>> m = re.search('(?<=-)\w+', 'spam-egg')
+>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
 >>> m.group(0)
 'egg'


@@ -0,0 +1,3 @@
+Modify RE examples in documentation to use raw strings to prevent
+:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
+deprecation.
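
The two remedies the HOWTO text names, raw string notation and escaping the backslashes, yield the same pattern string, so the examples behave identically after the change. The sketch below illustrates this with patterns taken from the hunks in this commit; the variable names are illustrative and not from the source.

```python
import re

# Escaping the backslash and using a raw literal produce the same string.
escaped = '\\d+'
raw = r'\d+'
same_pattern = escaped == raw

# Raw literals leave backslashes untouched: r"\n" is two characters,
# while "\n" is a single newline character.
lengths = (len(r"\n"), len("\n"))

# Patterns from the documentation examples, written as raw strings.
numbers = re.findall(raw, '12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
word_after_hyphen = re.search(r'(?<=-)\w+', 'spam-egg').group(0)

print(same_pattern, lengths, numbers, word_after_hyphen)
```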