Document the fact that '\U' and '\u' escapes are not treated specially in 3.0 (see issue 2541)

Benjamin Peterson 2008-04-28 21:05:10 +00:00
parent a288faef8e
commit a2f837f751
3 changed files with 11 additions and 11 deletions


@@ -423,8 +423,9 @@ characters that otherwise have a special meaning, such as newline, backslash
itself, or the quote character.
String literals may optionally be prefixed with a letter ``'r'`` or ``'R'``;
-such strings are called :dfn:`raw strings` and use different rules for
-interpreting backslash escape sequences.
+such strings are called :dfn:`raw strings` and treat backslashes as literal
+characters. As a result, ``'\U'`` and ``'\u'`` escapes in raw strings are not
+treated specially.
Bytes literals are always prefixed with ``'b'`` or ``'B'``; they produce an
instance of the :class:`bytes` type instead of the :class:`str` type. They
@@ -520,15 +521,6 @@ is more easily recognized as broken.) It is also important to note that the
escape sequences only recognized in string literals fall into the category of
unrecognized escapes for bytes literals.
-When an ``'r'`` or ``'R'`` prefix is used in a string literal, then the
-``\uXXXX`` and ``\UXXXXXXXX`` escape sequences are processed while *all other
-backslashes are left in the string*. For example, the string literal
-``r"\u0062\n"`` consists of three Unicode characters: 'LATIN SMALL LETTER B',
-'REVERSE SOLIDUS', and 'LATIN SMALL LETTER N'. Backslashes can be escaped with a
-preceding backslash; however, both remain in the string. As a result,
-``\uXXXX`` escape sequences are only recognized when there is an odd number of
-backslashes.
Even in a raw string, string quotes can be escaped with a backslash, but the
backslash remains in the string; for example, ``r"\""`` is a valid string
literal consisting of two characters: a backslash and a double quote; ``r"\"``
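
As a quick check of the behaviour this hunk documents, here is a minimal sketch, assuming a Python 3 interpreter. The variable names are illustrative and not part of the patch. It compares the raw literal from the removed paragraph with its non-raw counterpart:

```python
# Raw literal: every backslash is kept as-is, so \u0062 is NOT decoded.
raw = r"\u0062\n"

# Non-raw literal: \u0062 is decoded to 'b' and \n to a newline.
cooked = "\u0062\n"

print(len(raw), list(raw))
print(len(cooked), repr(cooked))

# A quote can still be escaped inside a raw literal,
# but the backslash itself remains in the resulting string.
quoted = r"\""
print(len(quoted), list(quoted))

# One way to combine literal backslashes with a Unicode escape:
# implicit concatenation of a raw and a non-raw literal.
mixed = r"\d+" "\u0062"
print(repr(mixed))
```

Under the 3.0 rules the raw literal keeps all eight characters, whereas the removed text describes the older behaviour in which it collapsed to three.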


@@ -167,6 +167,9 @@ Strings and Bytes
explicitly convert between them, using the :meth:`str.encode` (str -> bytes)
or :meth:`bytes.decode` (bytes -> str) methods.
+* All backslashes in raw strings are interpreted literally. This means that
+Unicode escapes are not treated specially.
.. XXX add bytearray
* PEP 3112: Bytes literals, e.g. ``b"abc"``, create :class:`bytes` instances.
@@ -183,6 +186,8 @@ Strings and Bytes
* The :mod:`StringIO` and :mod:`cStringIO` modules are gone. Instead, import
:class:`io.StringIO` or :class:`io.BytesIO`.
+* ``'\U'`` and ``'\u'`` escapes in raw strings are not treated specially.
PEP 3101: A New Approach to String Formatting
=============================================


@@ -26,6 +26,9 @@ Core and Builtins
through as unmodified as possible; as a consequence, the C API
related to command line arguments was changed to use wchar_t.
+- All backslashes in raw strings are interpreted literally. This means that
+'\u' and '\U' escapes are not treated specially.
Extension Modules
-----------------