Document new and deprecated Unicode functions

This commit is contained in:
Victor Stinner 2011-11-20 18:27:55 +01:00
parent b4938aaf15
commit 46606ce870
1 changed files with 92 additions and 11 deletions

@ -209,10 +209,8 @@ the equality of the underlying sequences generated by those range objects.
(:issue:`13021`) (:issue:`13021`)
New, Improved, and Deprecated Modules New and Improved Modules
===================================== ========================
* Stub
array array
----- -----
@ -579,7 +577,11 @@ Optimizations
Major performance enhancements have been added: Major performance enhancements have been added:
* Stub * Thanks to :pep:`393`, some operations on Unicode strings have been optimized:
* the memory footprint is divided by 2 to 4 depending on the text
* getting a substring of a latin1 string is 4 times faster
* TODO
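The memory claim above can be checked informally from Python. The following is a minimal sketch, assuming CPython 3.3 or later (other implementations may store strings differently); the helper name ``bytes_per_char`` is hypothetical and not part of any API. It compares the size of two strings built from the same character, so that the fixed object header cancels out.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per code point for strings made only of `ch`.

    Hypothetical helper: the difference between two string sizes removes
    the fixed per-object overhead reported by sys.getsizeof().
    """
    small = ch * n
    large = ch * (2 * n)
    return (sys.getsizeof(large) - sys.getsizeof(small)) / n


# One sample character per PEP 393 storage width (assumption: CPython >= 3.3).
for label, ch in [("ASCII", "a"), ("BMP", "\u20ac"), ("non-BMP", "\U0001F600")]:
    print(label, bytes_per_char(ch))
```

On a CPython build with the :pep:`393` representation, the three samples are expected to need 1, 2 and 4 bytes per code point respectively, which is where the "2 to 4" factor comes from when compared with a fixed 4-byte representation.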
Build and C API Changes Build and C API Changes
@ -587,7 +589,27 @@ Build and C API Changes
Changes to Python's build process and to the C API include: Changes to Python's build process and to the C API include:
* Stub * :pep:`393` added new Unicode types, macros and functions:
* Py_UCS1, Py_UCS2, Py_UCS4 types
* PyASCIIObject and PyCompactUnicodeObject structures
* :c:func:`PyUnicode_New`
* :c:macro:`PyUnicode_READY`
* :c:func:`PyUnicode_FromKindAndData`
* :c:func:`PyUnicode_GetLength`, :c:macro:`PyUnicode_GET_LENGTH`
* :c:func:`PyUnicode_CopyCharacters`
* :c:func:`PyUnicode_ReadChar`, :c:func:`PyUnicode_WriteChar`
* :c:func:`PyUnicode_AsUCS4`, :c:func:`PyUnicode_AsUCS4Copy`
* :c:func:`PyUnicode_FindChar`
* :c:func:`PyUnicode_Substring`
* :c:macro:`PyUnicode_1BYTE_DATA`, :c:macro:`PyUnicode_2BYTE_DATA`,
:c:macro:`PyUnicode_4BYTE_DATA`
* :c:macro:`PyUnicode_KIND` with :c:type:`PyUnicode_Kind` enum:
:c:data:`PyUnicode_WCHAR_KIND`, :c:data:`PyUnicode_1BYTE_KIND`,
:c:data:`PyUnicode_2BYTE_KIND`, :c:data:`PyUnicode_4BYTE_KIND`
* :c:macro:`PyUnicode_DATA`
* :c:macro:`PyUnicode_READ`, :c:macro:`PyUnicode_READ_CHAR`, :c:macro:`PyUnicode_WRITE`
* :c:macro:`PyUnicode_MAX_CHAR_VALUE`
Unsupported Operating Systems Unsupported Operating Systems
@ -599,22 +621,81 @@ Windows 2000 and Windows platforms which set ``COMSPEC`` to ``command.com``
are no longer supported due to maintenance burden. are no longer supported due to maintenance burden.
Deprecated modules, functions and methods Deprecated Python modules, functions and methods
========================================= ================================================
* The :mod:`packaging` module replaces the :mod:`distutils` module * The :mod:`packaging` module replaces the :mod:`distutils` module
* The ``unicode_internal`` codec has been deprecated because of the * The ``unicode_internal`` codec has been deprecated because of the
:pep:`393`, use UTF-8, UTF-16 (``utf-16-le`` or ``utf-16-be``), or UTF-32 :pep:`393`, use UTF-8, UTF-16 (``utf-16-le`` or ``utf-16-be``), or UTF-32
(``utf-32-le`` or ``utf-32-be``) instead. (``utf-32-le`` or ``utf-32-be``)
* :meth:`ftplib.FTP.nlst` and :meth:`ftplib.FTP.dir`: use * :meth:`ftplib.FTP.nlst` and :meth:`ftplib.FTP.dir`: use
:meth:`ftplib.FTP.mlsd` instead. :meth:`ftplib.FTP.mlsd`
* :func:`platform.popen`: use the :mod:`subprocess` module. Check especially * :func:`platform.popen`: use the :mod:`subprocess` module. Check especially
the :ref:`subprocess-replacements` section. the :ref:`subprocess-replacements` section.
* :issue:`13374`: The Windows bytes API has been deprecated in the :mod:`os` * :issue:`13374`: The Windows bytes API has been deprecated in the :mod:`os`
module. Use Unicode filenames instead of bytes filenames to not depend on module. Use Unicode filenames, instead of bytes filenames, to not depend on
the ANSI code page anymore and to support any filename. the ANSI code page anymore and to support any filename.
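For the ``unicode_internal`` item in the list above, the replacement is to name an explicit encoding with a fixed byte order. This is a minimal sketch using only the built-in ``str.encode`` and ``bytes.decode`` methods; the sample text is an arbitrary assumption chosen to include a non-BMP character.

```python
# Sample text: Latin-1 letters, a space, and one non-BMP code point.
text = "caf\u00e9 \U0001F600"

# Explicit little-endian codecs: no byte order mark is written, and the
# result does not depend on how the interpreter stores strings internally.
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")

# Both encodings round-trip back to the original string.
assert utf16.decode("utf-16-le") == text
assert utf32.decode("utf-32-le") == text

# UTF-32 uses four bytes per code point; UTF-16 uses two bytes per BMP
# code point and four (a surrogate pair) for a non-BMP code point.
print(len(text), len(utf16), len(utf32))
```

The ``-le`` and ``-be`` variants are preferable to plain ``utf-16`` or ``utf-32`` when the data must have a predictable layout, because the plain names prepend a byte order mark when encoding.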
Deprecated functions and types of the C API
===========================================
The :c:type:`Py_UNICODE` type has been deprecated by :pep:`393` and will be
removed in Python 4. All functions using this type are deprecated:
Functions and macros manipulating Py_UNICODE* strings:
* :c:macro:`Py_UNICODE_strlen`: use :c:func:`PyUnicode_GetLength` or
:c:macro:`PyUnicode_GET_LENGTH`
* :c:macro:`Py_UNICODE_strcat`: use :c:func:`PyUnicode_CopyCharacters` or
:c:func:`PyUnicode_FromFormat`
* :c:macro:`Py_UNICODE_strcpy`, :c:macro:`Py_UNICODE_strncpy`,
:c:macro:`Py_UNICODE_COPY`: use :c:func:`PyUnicode_CopyCharacters` or
:c:func:`PyUnicode_Substring`
* :c:macro:`Py_UNICODE_strcmp`: use :c:func:`PyUnicode_Compare`
* :c:macro:`Py_UNICODE_strncmp`: use :c:func:`PyUnicode_Tailmatch`
* :c:macro:`Py_UNICODE_strchr`, :c:macro:`Py_UNICODE_strrchr`: use
:c:func:`PyUnicode_FindChar`
* :c:macro:`Py_UNICODE_FILL`
Unicode functions and methods using :c:type:`Py_UNICODE` and
:c:type:`Py_UNICODE*` types:
* :c:func:`PyUnicode_FromUnicode`: use :c:func:`PyUnicode_FromWideChar` or
:c:func:`PyUnicode_FromKindAndData`
* :c:macro:`PyUnicode_AS_UNICODE`, :c:func:`PyUnicode_AsUnicode`,
:c:func:`PyUnicode_AsUnicodeAndSize`: use :c:func:`PyUnicode_AsWideCharString`
* :c:macro:`PyUnicode_AS_DATA`: use :c:macro:`PyUnicode_DATA` with
:c:macro:`PyUnicode_READ` and :c:macro:`PyUnicode_WRITE`
* :c:macro:`PyUnicode_GET_SIZE`, :c:func:`PyUnicode_GetSize`: use
:c:macro:`PyUnicode_GET_LENGTH` or :c:func:`PyUnicode_GetLength`
* :c:macro:`PyUnicode_GET_DATA_SIZE`: use
``PyUnicode_GET_LENGTH(str) * PyUnicode_KIND(str)`` (only works on ready
strings)
* :c:func:`PyUnicode_AsUnicodeCopy`: use :c:func:`PyUnicode_AsUCS4Copy`,
:c:func:`PyUnicode_AsWideCharString` or :c:func:`PyUnicode_Copy`
Encoders:
* :c:func:`PyUnicode_Encode`: use :c:func:`PyUnicode_AsEncodedObject`
* :c:func:`PyUnicode_EncodeUTF7`
* :c:func:`PyUnicode_EncodeUTF8`: use :c:func:`PyUnicode_AsUTF8String`
* :c:func:`PyUnicode_EncodeUTF32`
* :c:func:`PyUnicode_EncodeUTF16`
* :c:func:`PyUnicode_EncodeUnicodeEscape`: use
:c:func:`PyUnicode_AsUnicodeEscapeString`
* :c:func:`PyUnicode_EncodeRawUnicodeEscape`: use
:c:func:`PyUnicode_AsRawUnicodeEscapeString`
* :c:func:`PyUnicode_EncodeLatin1`: use :c:func:`PyUnicode_AsLatin1String`
* :c:func:`PyUnicode_EncodeASCII`: use :c:func:`PyUnicode_AsASCIIString`
* :c:func:`PyUnicode_EncodeCharmap`
* :c:func:`PyUnicode_TranslateCharmap`
* :c:func:`PyUnicode_EncodeMBCS`: use :c:func:`PyUnicode_AsMBCSString` or
:c:func:`PyUnicode_EncodeCodePage` (with ``CP_ACP`` code_page)
* :c:func:`PyUnicode_EncodeDecimal`,
:c:func:`PyUnicode_TransformDecimalToASCII`
Porting to Python 3.3 Porting to Python 3.3
===================== =====================