Modernize the Sorting HowTo guide (gh-115479)

This commit is contained in:
Raymond Hettinger 2024-02-19 21:22:07 -06:00 committed by GitHub
parent c2cb31bbe1
commit 9f8a9e8ac7
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 54 additions and 6 deletions

View File

@ -4,7 +4,6 @@ Sorting HOW TO
**************
:Author: Andrew Dalke and Raymond Hettinger
:Release: 0.1
Python lists have a built-in :meth:`list.sort` method that modifies the list
@ -56,7 +55,7 @@ For example, here's a case-insensitive string comparison:
.. doctest::
>>> sorted("This is a test string from Andrew".split(), key=str.lower)
>>> sorted("This is a test string from Andrew".split(), key=str.casefold)
['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
The value of the *key* parameter should be a function (or other callable) that
@ -97,10 +96,14 @@ The same technique works for objects with named attributes. For example:
>>> sorted(student_objects, key=lambda student: student.age) # sort by age
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
Operator Module Functions
=========================
Objects with named attributes can be made by a regular class as shown
above, or they can be instances of :class:`~dataclasses.dataclass` or
a :term:`named tuple`.
The key-function patterns shown above are very common, so Python provides
Operator Module Functions and Partial Function Evaluation
=========================================================
The :term:`key function` patterns shown above are very common, so Python provides
convenience functions to make accessor functions easier and faster. The
:mod:`operator` module has :func:`~operator.itemgetter`,
:func:`~operator.attrgetter`, and a :func:`~operator.methodcaller` function.
@ -128,6 +131,24 @@ sort by *grade* then by *age*:
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
The :mod:`functools` module provides another helpful tool for making
key-functions. The :func:`~functools.partial` function can reduce the
`arity <https://en.wikipedia.org/wiki/Arity>`_ of a multi-argument
function making it suitable for use as a key-function.
.. doctest::
>>> from functools import partial
>>> from unicodedata import normalize
>>> names = 'Zoë Åbjørn Núñez Élana Zeke Abe Nubia Eloise'.split()
>>> sorted(names, key=partial(normalize, 'NFD'))
['Abe', 'Åbjørn', 'Eloise', 'Élana', 'Nubia', 'Núñez', 'Zeke', 'Zoë']
>>> sorted(names, key=partial(normalize, 'NFC'))
['Abe', 'Eloise', 'Nubia', 'Núñez', 'Zeke', 'Zoë', 'Åbjørn', 'Élana']
Ascending and Descending
========================
@ -200,6 +221,8 @@ This idiom is called Decorate-Sort-Undecorate after its three steps:
For example, to sort the student data by *grade* using the DSU approach:
.. doctest::
>>> decorated = [(student.grade, i, student) for i, student in enumerate(student_objects)]
>>> decorated.sort()
>>> [student for grade, i, student in decorated] # undecorate
@ -282,7 +305,11 @@ Odds and Ends
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
However, note that ``<`` can fall back to using :meth:`~object.__gt__` if
:meth:`~object.__lt__` is not implemented (see :func:`object.__lt__`).
:meth:`~object.__lt__` is not implemented (see :func:`object.__lt__`
for details on the mechanics). To avoid surprises, :pep:`8`
recommends that all six comparison methods be implemented.
The :func:`~functools.total_ordering` decorator is provided to make that
task easier.
* Key functions need not depend directly on the objects being sorted. A key
function can also access external resources. For instance, if the student grades
@ -295,3 +322,24 @@ Odds and Ends
>>> newgrades = {'john': 'F', 'jane':'A', 'dave': 'C'}
>>> sorted(students, key=newgrades.__getitem__)
['jane', 'dave', 'john']
Partial Sorts
=============
Some applications require only some of the data to be ordered. The standard
library provides several tools that do less work than a full sort:
* :func:`min` and :func:`max` return the smallest and largest values,
respectively. These functions make a single pass over the input data and
require almost no auxiliary memory.
* :func:`heapq.nsmallest` and :func:`heapq.nlargest` return
the *n* smallest and largest values, respectively. These functions
make a single pass over the data keeping only *n* elements in memory
at a time. For values of *n* that are small relative to the number of
inputs, these functions make far fewer comparisons than a full sort.
* :func:`heapq.heappush` and :func:`heapq.heappop` create and maintain a
partially sorted arrangement of data that keeps the smallest element
at position ``0``. These functions are suitable for implementing
priority queues which are commonly used for task scheduling.