Clarify concatenation behaviour of immutable strings, and remove explicit
mention of the CPython optimization hack.
Antoine Pitrou 2011-11-25 16:33:53 +01:00
parent 5a53f368e6
commit fd9ebd4a36
2 changed files with 37 additions and 8 deletions


@ -989,6 +989,32 @@ What does 'UnicodeDecodeError' or 'UnicodeEncodeError' error mean?
See the :ref:`unicode-howto`.
What is the most efficient way to concatenate many strings together?
--------------------------------------------------------------------
:class:`str` and :class:`bytes` objects are immutable, so concatenating
many strings together is inefficient: each concatenation creates a new
object. In the general case, the total runtime cost is quadratic in the
total string length.
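The quadratic claim can be illustrated with a simple cost model. The following is a minimal sketch, not a benchmark: it assumes every concatenation copies the whole intermediate result, and the helper names are hypothetical. Real interpreters may do better in some cases.

```python
def chars_copied_by_repeated_concat(pieces):
    """Count characters copied if each ``+`` rebuilds the whole string.

    Hypothetical cost model: it assumes no in-place optimization.
    """
    copied = 0
    length = 0
    for piece in pieces:
        length += len(piece)
        copied += length  # the new object holds everything accumulated so far
    return copied


def chars_copied_by_join(pieces):
    """Count characters copied if each one is written to the result once."""
    return sum(len(piece) for piece in pieces)


pieces = ["ab"] * 1000
naive = chars_copied_by_repeated_concat(pieces)
joined = chars_copied_by_join(pieces)
```

Under this model the repeated-concatenation count grows with the square of the number of pieces, while the join count grows linearly.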
To accumulate many :class:`str` objects, the recommended idiom is to place
them into a list and call :meth:`str.join` at the end::
   chunks = []
   for s in my_strings:
       chunks.append(s)
   result = ''.join(chunks)
(Another reasonably efficient idiom is to use :class:`io.StringIO`.)
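The :class:`io.StringIO` alternative might look like the following minimal sketch. The contents of ``my_strings`` are made-up sample data.

```python
import io

my_strings = ["spam", " ", "eggs"]  # hypothetical sample input

buffer = io.StringIO()
for s in my_strings:
    buffer.write(s)          # appends to the in-memory text buffer
result = buffer.getvalue()   # builds the final str once, at the end
```

The result is the same string that ``''.join(my_strings)`` would produce.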
To accumulate many :class:`bytes` objects, the recommended idiom is to extend
a :class:`bytearray` object using in-place concatenation (the ``+=`` operator)::
   result = bytearray()
   for b in my_bytes_objects:
       result += b
Sequences (Tuples/Lists)
========================


@ -964,15 +964,18 @@ Notes:
If *k* is ``None``, it is treated like ``1``.
(6)
.. impl-detail::
Concatenating immutable strings always results in a new object. This means
that building up a string by repeated concatenation will have a quadratic
runtime cost in the total string length. To get a linear runtime cost,
you must switch to one of the alternatives below:
If *s* and *t* are both strings, some Python implementations such as
CPython can usually perform an in-place optimization for assignments of
the form ``s = s + t`` or ``s += t``. When applicable, this optimization
makes quadratic runtime much less likely. This optimization is both
version and implementation dependent. For performance-sensitive code, it
is preferable to use the :meth:`str.join` method, which ensures consistent
linear concatenation performance across versions and implementations.
* if concatenating :class:`str` objects, you can build a list and use
:meth:`str.join` at the end;
* if concatenating :class:`bytes` objects, you can similarly use
:meth:`bytes.join`, or you can do in-place concatenation with a
:class:`bytearray` object. :class:`bytearray` objects are mutable and
have an efficient overallocation mechanism.
.. _string-methods: