Issue #13165: stringbench is now available in the Tools/stringbench folder.
It used to live in its own SVN project.
parent 75d9aca97a
commit 1584ae3987

@@ -57,6 +57,12 @@ Tests
- Issue #14355: Regrtest now supports the standard unittest test loading, and
  will use it if a test file contains no `test_main` method.

Tools/Demos
-----------

- Issue #13165: stringbench is now available in the Tools/stringbench folder.
  It used to live in its own SVN project.


What's New in Python 3.3.0 Alpha 2?
===================================

@@ -32,6 +32,9 @@ scripts      A number of useful single-file programs, e.g. tabnanny.py
             tabs and spaces, and 2to3, which converts Python 2 code
             to Python 3 code.

stringbench  A suite of micro-benchmarks for various operations on
             strings (both 8-bit and unicode).

test2to3     A demonstration of how to use 2to3 transparently in setup.py.

unicode      Tools for generating unicodedata and codecs from unicode.org

@@ -0,0 +1,68 @@
stringbench is a set of performance tests comparing byte string
operations with unicode operations.  The two string implementations
are loosely based on each other, and sometimes the algorithm for one
is faster than the other.

This test set was started at the Need For Speed sprint in Reykjavik
to identify which string methods could be sped up quickly and to
identify obvious places for improvement.

Here is an example of a benchmark:

@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    s1 = STR("Andrew")
    s2 = STR("A")
    s1_startswith = s1.startswith
    for x in _RANGE_1000:
        s1_startswith(s2)

The bench decorator takes three parameters.  The first is a short
description of how the code works.  In most cases this is a Python
code snippet.  It is not the code which is actually run, because the
real code is hand-optimized to focus on the method being tested.

The second parameter is a group title.  All benchmarks with the same
group title are listed together.  This lets you compare different
implementations of the same algorithm, such as "t in s"
vs. "s.find(t)".

The last is a count.  Each benchmark loops over the algorithm either
100 or 1000 times, depending on the algorithm's performance.  The
output time is the time per benchmark call, so the reader needs a way
to know how to scale the performance.

These parameters become function attributes.
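The attribute mechanism can be sketched roughly as follows.  This is a
hedged illustration, not the actual stringbench source; the attribute
names `comment`, `group`, and `repeat_count` are assumptions:

```python
# Sketch of a bench-style decorator that records its parameters as
# function attributes (attribute names are assumptions, not
# necessarily those stringbench itself uses).
_RANGE_1000 = range(1000)

def bench(comment, group, repeat_count):
    def decorator(func):
        func.comment = comment            # short description of the code
        func.group = group                # group title for the report
        func.repeat_count = repeat_count  # loop count inside the benchmark
        return func
    return decorator

@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    s1 = STR("Andrew")
    s2 = STR("A")
    s1_startswith = s1.startswith
    for x in _RANGE_1000:
        s1_startswith(s2)

print(startswith_single.group)         # -> startswith single character
print(startswith_single.repeat_count)  # -> 1000
```

A driver can then call the same function once with `bytes` and once
with `str` and group the results by these attributes.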

Here is an example of the output:

========== count newlines
38.54   41.60   92.7    ...text.with.2000.newlines.count("\n") (*100)
========== early match, single character
1.14    1.18    96.8    ("A"*1000).find("A") (*1000)
0.44    0.41    105.6   "A" in "A"*1000 (*1000)
1.15    1.17    98.1    ("A"*1000).index("A") (*1000)
|
||||
The first column is the run time in milliseconds for byte strings.
|
||||
The second is the run time for unicode strings. The third is a
|
||||
percentage; byte time / unicode time. It's the percentage by which
|
||||
unicode is faster than byte strings.
|
||||
|
||||
The last column contains the code snippet and the repeat count for the
|
||||
internal benchmark loop.
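As a quick sanity check, the third column can be recomputed from the
first two.  Note the displayed times are rounded, so the last digit of
the recomputed percentage may differ slightly from the report:

```python
# Recompute the percentage column for the first sample row above:
# percentage = (byte time / unicode time) * 100.
byte_ms, unicode_ms = 38.54, 41.60
pct = 100.0 * byte_ms / unicode_ms
print(f"{byte_ms:.2f}  {unicode_ms:.2f}  {pct:.1f}")
```

A value under 100 means the byte string version was faster for that
test; over 100 means the unicode version was faster.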
|
||||
|
||||
The times are computed with 'timeit.py' which repeats the test more
|
||||
and more times until the total time takes over 0.2 seconds, returning
|
||||
the best time for a single iteration.
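The same best-of-several-runs strategy can be reproduced with the
`timeit` module directly.  This is a sketch, not stringbench's own
driver code; `Timer.autorange` grows the loop count until one run
takes roughly 0.2 seconds or more:

```python
import timeit

# Time one of the sample snippets: grow the loop count until a run
# takes long enough, then report the best (minimum) time per call.
timer = timeit.Timer('("A"*1000).find("A")')
number, _ = timer.autorange()   # loop count whose total time > ~0.2 s
best = min(timer.repeat(repeat=3, number=number)) / number
print(f"best time per call: {best * 1000:.6f} ms")
```

Taking the minimum over several runs filters out interference from
other processes, which can only make a run slower, never faster.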
|
||||
|
||||
The final line of the output is the cumulative time for byte and
|
||||
unicode strings, and the overall performance of unicode relative to
|
||||
bytes. For example
|
||||
|
||||
4079.83 5432.25 75.1 TOTAL
|
||||
|
||||
However, this has no meaning as it evenly weights every test.
|
||||
|
File diff suppressed because it is too large