mirror of https://github.com/python/cpython
Document the fact that '\U' and '\u' escapes are not treated specially in 3.0 (see issue 2541)
parent a288faef8e
commit a2f837f751
@@ -423,8 +423,9 @@ characters that otherwise have a special meaning, such as newline, backslash
 itself, or the quote character.
 
 String literals may optionally be prefixed with a letter ``'r'`` or ``'R'``;
-such strings are called :dfn:`raw strings` and use different rules for
-interpreting backslash escape sequences.
+such strings are called :dfn:`raw strings` and treat backslashes as literal
+characters. As a result, ``'\U'`` and ``'\u'`` escapes in raw strings are not
+treated specially.
 
 Bytes literals are always prefixed with ``'b'`` or ``'B'``; they produce an
 instance of the :class:`bytes` type instead of the :class:`str` type. They
@@ -520,15 +521,6 @@ is more easily recognized as broken.) It is also important to note that the
 escape sequences only recognized in string literals fall into the category of
 unrecognized escapes for bytes literals.
 
-When an ``'r'`` or ``'R'`` prefix is used in a string literal, then the
-``\uXXXX`` and ``\UXXXXXXXX`` escape sequences are processed while *all other
-backslashes are left in the string*. For example, the string literal
-``r"\u0062\n"`` consists of three Unicode characters: 'LATIN SMALL LETTER B',
-'REVERSE SOLIDUS', and 'LATIN SMALL LETTER N'. Backslashes can be escaped with a
-preceding backslash; however, both remain in the string. As a result,
-``\uXXXX`` escape sequences are only recognized when there is an odd number of
-backslashes.
-
 Even in a raw string, string quotes can be escaped with a backslash, but the
 backslash remains in the string; for example, ``r"\""`` is a valid string
 literal consisting of two characters: a backslash and a double quote; ``r"\"``
@@ -167,6 +167,9 @@ Strings and Bytes
 explicitly convert between them, using the :meth:`str.encode` (str -> bytes)
 or :meth:`bytes.decode` (bytes -> str) methods.
 
+* All backslashes in raw strings are interpreted literally. This means that
+Unicode escapes are not treated specially.
+
 .. XXX add bytearray
 
 * PEP 3112: Bytes literals, e.g. ``b"abc"``, create :class:`bytes` instances.
@@ -183,6 +186,8 @@ Strings and Bytes
 * The :mod:`StringIO` and :mod:`cStringIO` modules are gone. Instead, import
 :class:`io.StringIO` or :class:`io.BytesIO`.
 
+* ``'\U'`` and ``'\u'`` escapes in raw strings are not treated specially.
+
 
 PEP 3101: A New Approach to String Formatting
 =============================================
@@ -26,6 +26,9 @@ Core and Builtins
 through as unmodified as possible; as a consequence, the C API
 related to command line arguments was changed to use wchar_t.
 
+- All backslashes in raw strings are interpreted literally. This means that
+'\u' and '\U' escapes are not treated specially.
+
 Extension Modules
 -----------------
 
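The behaviour documented by this commit can be checked with a short Python 3 sketch. It is an illustration only, not part of the commit; the variable names `raw`, `cooked` and `quoted` are made up for the example. It reuses the `r"\u0062\n"` and `r"\""` literals from the documentation text in the diff.

```python
# Python 3: in a raw string every backslash is kept literally,
# so \u and \U sequences are not decoded.
raw = r"\u0062\n"
cooked = "\u0062\n"

# The raw literal keeps all eight characters: \ u 0 0 6 2 \ n
print(len(raw))      # 8
print(raw == "\\u0062\\n")  # True

# The ordinary literal decodes \u0062 to 'b' and \n to a newline.
print(len(cooked))   # 2
print(cooked == "b\n")      # True

# A quote can still be escaped in a raw string, but the backslash stays.
quoted = r"\""
print(len(quoted))   # 2: a backslash and a double quote
```

Under the pre-3.0 rule removed in the second hunk above, the same raw literal would have consisted of three characters, because `\u0062` was still processed.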