A new hashlib module to replace the md5 and sha modules. It adds
support for additional secure hashes such as SHA-256 and SHA-512. The
hashlib module uses OpenSSL for fast platform optimized
implementations of algorithms when available. The old md5 and sha
modules still exist as wrappers around hashlib to preserve backwards
compatibility.
Fix over-aggressive PyErr_Clear(). The same code fragment appears in
various guises in list.extend(), map(), filter(), zip(), and internally
in PySequence_Tuple().
a frozenset conversion when the initial search attempt fails with a
TypeError and the key is some type of set. Add a testcase.
* Eliminate a duplicate if-stmt.
s|=s, s&=s, s-=s, or s^=s). Add related tests.
* Improve names for several variables and functions.
* Provide alternate table access functions (next, contains, add, and discard)
that work with an entry argument instead of just a key. This improves
set-vs-set operations because we already have a hash value for each key
and can avoid unnecessary calls to PyObject_Hash(). Provides a 5% to 20%
speed-up for quick hashing elements like strings and integers. Provides
much more substantial improvements for slow hashing elements like tuples
or objects defining a custom __hash__() function.
* Have difference operations resize() when 1/5 of the elements are dummies.
Formerly, it was 1/6. The new ratio triggers less frequently and only
in cases that it can resize quicker and with greater benefit. The right
answer is probably either 1/4, 1/5, or 1/6. Picked the middle value for
an even trade-off between resize time and the space/time costs of dummy
entries.
- Handle both frozenset() and frozenset([]).
- Do not use singleton for frozenset subclasses.
- Finalize the singleton.
- Add test cases.
* Factor-out set_update_internal() from set_update(). Simplifies the
code for several internal callers.
* Factor constant expressions out of loop in set_merge_internal().
* Minor comment touch-ups.
[ 1229429 ] missing Py_DECREF in PyObject_CallMethod
Add a test in test_enumerate, which is a bit random, but suffices
(reversed_new calls PyObject_CallMethod under some circumstances).
Should significantly enhance the utility of the module by supporting
the creation of tools that modify the token stream and writeback the
modified result.
[ 1180995 ] binary formats for marshalling floats
Adds 2 new type codes for marshal (binary floats and binary complexes), a
new marshal version (2), updates MAGIC and fiddles the de-serializing of
code objects to be less likely to clobber the real reason for failing if
it fails.
[ 1181301 ] make float packing copy bytes when they can
which hasn't been reviewed, despite numerous threats to check it in
anyway if noone reviews it. Please read the diff on the checkin list,
at least!
The basic idea is to examine the bytes of some 'probe values' to see if
the current platform is a IEEE 754-ish platform, and if so
_PyFloat_{Pack,Unpack}{4,8} just copy bytes around.
The rest is hair for testing, and tests.
crashing, and indirectly on the fact that hash codes in
random.randrange(1000000000) were very unlikely to exhibit collisions.
To see the problem, replace this number with 500 and observe the crash on
either del target[key] or del keys[i].
The fix prevents recursive mutation, just as in the key insertion case.
conversion using the proper magic slot (e.g., __int__()). Also move conversion
code out of PyNumber_*() functions in the C API into the nb_* function.
Applied patch #1109424. Thanks Walter Doewald.
test_site often failed under "regrtest.py -r", because this xmlrpc test
left sys with a setdefaultencoding attribute, but loading site.py removes
that attribute and test_site.py verifies the attribute is gone. Changed
this test to get rid of sys.setdefaultencoding if it didn't exist when
this test started.
Don't know whether this is a bugfix (backport) candidate.
the last character read is "\r" (and size is None, i.e. we're allowed to
call read() multiple times), so that we can return the correct line ending
(this additional character might be a "\n").
If the stream is temporarily exhausted, we might return the wrong line ending
(if the last character read is "\r" and the next one (after the byte stream
provides more data) is "\n", but at least the atcr member ensure that we
get the correct number of lines (i.e. this "\n" will not be treated as
another line ending.)
[ 1165306 ] Property access with decorator makes interpreter crash
Don't allow the creation of unbound methods with NULL im_class, because
attempting to call such crashes.
Backport candidate.
instead of raising a TypeError. Allows other types to successfully
implement __radd__() style methods.
* Remove future division import from test suite.
* Remove test suite's shadowing of __builtin__.dir().
etc., had comments after the colon, and some other cases. This patch
take a simpler approach that doesn't rely on looking for a ':'. Thanks
Simon Percivall!
socket.gethostname() in the check for a valid return.
Also clarified docs (official and docstring) that the value from gethostname()
is returned if gethostbyaddr() doesn't do the job.
* Speed-up "x in y" where x has more than one character.
The existing code made excessive calls to the expensive memcmp() function.
The new code uses memchr() to rapidly find a start point for memcmp().
In addition to knowing that the first character is a match, the new code
also checks that the last character is a match. This significantly reduces
the incidence of false starts (saving memcmp() calls and making quadratic
behavior less likely).
Improves the timings on:
python -m timeit -r7 -s"x='a'*1000" "'ab' in x"
python -m timeit -r7 -s"x='a'*1000" "'bc' in x"
Once this code has proven itself, then string_find_internal() should refer
to it rather than running its own version. Also, something similar may
apply to unicode objects.
[ 1124295 ] Function's __name__ no longer accessible in restricted mode
which I introduced with a bit of mindless copy-paste when making
__name__ writable. You can't assign to __name__ in restricted mode,
which I'm going to pretend was intentional :)
PyNumber_Check, rather than trying to convert to a float. Reimplemented
writer - now raises exceptions when it sees a quotechar but neither
doublequote or escapechar are set. Doublequote results are now more
consistent (eg, single quote should generate """", rather than "",
which is ambiguous).
when this limit is reached. Limit defaults to 128k, and is changed
by module set_field_limit() method. Previously, an unmatched quote
character could result in the entire file being read into the field
buffer, potentially exhausting virtual memory.
was done because we were previously performing validation of the dialect
from python, but this is now down within the C module. Also, the method
we were using to detect classes did not work with new-style classes.
a delimiter. Previously, the 'network location' (<authority> in RFC 2396) would
become 'www.example.com?query=spam', while RFC 2396 does not allow a '?' in
<authority>. See bug #548176 for further discussion.
`glob.glob()` currently calls itself recursively to build a list of matches of
the dirname part of the pattern and then filters by the basename part. This is
effectively BFS. ``glob.glob('*/*/*/*/*/foo')`` will build a huge list of all
directories 5 levels deep even if only a handful of them contain a ``foo``
entry. A generator-based recusion would never have to store these list at once
by implementing DFS. This patch converts the `glob` function to an `iglob`
recursive generator . `glob()` now just returns ``list(iglob(pattern))``.
I also cleaned up the code a bit (reduced duplicate `has_magic()` checks and
created a second `glob0` helper func so that the main loop need not be
duplicated).
Thanks to Cherniavsky Beni for the patch!
test_threading.test_foreign_thread(): new test does a basic check that
"foreign" threads can using the threading module, and that they create
a _DummyThread instance in at least one use case. This isn't a very
good test, since a thread created by thread.start_new_thread() isn't
particularly "foreign".
trying to return a complete line even if a size parameter was given (see
http://www.python.org/sf/1076985). This leads to buffer overflows with long
source lines under Windows if e.g. cp1252 is used as the source encoding.
This patch reverts the behaviour of readline() to something that behaves more
like Python 2.3: If a size parameter is given, read() is called only once.
As a side effect of this, readline() now supports all types of linebreaks
supported by unicode.splitlines().
Note that the tokenizer is still broken and it's possible to provoke segfaults
(see http://www.python.org/sf/1089395).
more. Thanks to Simon Percivall!
The patch makes changes to inspect.py in two places:
* the pattern to match against functions at line 436 is
modified: lambdas should be matched even if not
preceded by whitespace, as long as "lambda" isn't part
of another word.
* the BlockFinder class is heavily modified. Changes are:
- checking for "def", "class" or "lambda" names
before setting self.started to True. Then checking the
same line for word characters after the colon (if the
colon is on that line). If so, and the line does not
end with a line continuation marker, raise EndOfBlock
immediately.
- adding self.passline to show that the line is to be
included and no more checking is necessary on that
line. Since a NEWLINE token is not generated when a
line continuation marker exists, this allows getsource
to continue with these functions even if the following
line would not be indented.
Also add a bunch of
'quite-unlikely-to-occur-in-real-life-but-working-anyway' tests.
is pointless.
Also add a note to the docs for the 'test' package that test cases should check
first that any conditions needed in the operating system are met before having
a test run.
Closes bug #1077302. THanks, Ian Holsman.
(http://www.cygwin.com/faq/faq_3.html#SEC41).
Also check whether onerror has actually been called so this test will
fail on assertion instead of on trying to chmod a non-existent file.
__getitem__() methods: compute only the new spellings needed to satisfy
the given indexing object. This is purely an optimization (it should
have no effect on visible semantics).
regrtest.py: skip rgbimg and imageop as they are not built on 64-bit systems.
_tkinter.c: replace %.8x with %p for printing pointers.
setup.py: add lib64 into the library directories.