Commit Graph

987 Commits

Author SHA1 Message Date
Marc-André Lemburg 2d9204199f This patch changes the way the string .encode() method works slightly
and introduces a new method .decode().

The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.

Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.

The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.

As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).

Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
2001-05-15 12:00:02 +00:00
Guido van Rossum 2e0a654f6e Add warnings to the strop module, for to those functions that really
*are* obsolete; three variables and the maketrans() function are not
(yet) obsolete.

Add a compensating warnings.filterwarnings() call to test_strop.py.

Add this to the NEWS.
2001-05-15 02:14:44 +00:00
Fred Drake 992d387540 Convert a couple of comments to docstrings -- PyUnit can use these when
the regression test is run in verbose mode.
2001-05-14 19:15:23 +00:00
Tim Peters 95b3f78622 pprint's workhorse _safe_repr() function took time quadratic in the # of
elements when crunching a list, dict or tuple.  Now takes linear time
instead -- huge speedup for even moderately large containers, and the
code is notably simpler too.
Added some basic "is the output correct?" tests to test_pprint.
2001-05-14 18:39:41 +00:00
Fred Drake 43913dd27c Convert the pprint test to use PyUnit. 2001-05-14 17:41:20 +00:00
Tim Peters a814db579d SF bug[ #423781: pprint.isrecursive() broken. 2001-05-14 07:05:58 +00:00
Mark Hammond ef8b654bbe Add support for Windows using "mbcs" as the default Unicode encoding when dealing with the file system. As discussed on python-dev and in patch 410465. 2001-05-13 08:04:26 +00:00
Tim Peters 2f228e75e4 Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask".
The comment following used to say:
	/* We use ~hash instead of hash, as degenerate hash functions, such
	   as for ints <sigh>, can have lots of leading zeros. It's not
	   really a performance risk, but better safe than sorry.
	   12-Dec-00 tim:  so ~hash produces lots of leading ones instead --
	   what's the gain? */
That is, there was never a good reason for doing it.  And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collison) the same "too often" across distinct hashes.

Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== # number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.

Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items().  A number
of std tests suffered bogus failures as a result.  For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,

>>> d = {}
>>> for i in range(10):
...    d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>

Unfortunately. people may latch on to that in small examples and draw a
bogus conclusion.

test_support.py
    Moved test_extcall's sortdict() into test_support, made it stronger,
    and imported sortdict into other std tests that needed it.
test_unicode.py
    Excluced cp875 from the "roundtrip over range(128)" test, because
    cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
    See Python-Dev for excruciating details.
Cookie.py
    Chaged various output functions to sort dicts before building
    strings from them.
test_extcall
    Fiddled the expected-result file.  This remains sensitive to native
    dict ordering, because, e.g., if there are multiple errors in a
    keyword-arg dict (and test_extcall sets up many cases like that), the
    specific error Python complains about first depends on native dict
    ordering.
2001-05-13 00:19:31 +00:00
Fred Drake 6278799f8e unlink() would normally be found in the "os" module, so use it from there.
Remove unused import of "sys".

If the file TESTFN exists before we start, try to remove it.

Add spaces around the = in some assignments.
2001-05-11 14:29:21 +00:00
Tim Peters 4c02fecf9c Make test_mutants stronger by also adding random keys during comparisons.
A Mystery:  test_mutants ran amazingly slowly even before dictobject.c
"got fixed".  I don't have a clue as to why.  dict comparison was and
remains linear-time in the size of the dicts, and test_mutants only tries
100 dict pairs, of size averaging just 50.  So "it should" run in less than
an eyeblink; but it takes at least a second on this 800MHz box.
2001-05-10 20:18:30 +00:00
Tim Peters fd69208b78 Change test_mmap.py to use test_support.TESTFN instead of hardcoded "foo",
and wrap the body in try/finally to ensure TESTFN gets cleaned up no
matter what.
2001-05-10 20:03:04 +00:00
Tim Peters 8c3e91efaf Repair typos in comments. 2001-05-10 19:40:30 +00:00
Fred Drake aaa48ff5c9 Extend the weakref test suite to cover the complete mapping interface for
both weakref.Weak*Dictionary classes.

This closes SF bug #416480.
2001-05-10 17:16:38 +00:00
Tim Peters 95bf9390a4 SF bug #422121 Insecurities in dict comparison.
Fixed a half dozen ways in which general dict comparison could crash
Python (even cause Win98SE to reboot) in the presence of kay and/or
value comparison routines that mutate the dict during dict comparison.
Bugfix candidate.
2001-05-10 08:32:44 +00:00
Tim Peters 1ee77d9b71 Guido has Spoken. Restore strop.replace()'s treatment of a 0 count as
meaning infinity -- but at least warn about it in the code!  I pissed
away a couple hours on this today, and don't wish the same on the next
in line.
Bugfix candidate.
2001-05-10 01:23:39 +00:00
Tim Peters da45d55a6e The strop module and test_strop.py believe replace() with a 0 count
means "replace everything".  But the string module, string.replace()
amd test_string.py believe a 0 count means "replace nothing".
"Nothing" wins, strop loses.
Bugfix candidate.
2001-05-10 00:59:45 +00:00
Tim Peters 1a7b3eee94 SF bug #422088: [OSF1 alpha] string.replace().
Platform blew up on "123".replace("123", "").  Michael Hudson pinned the
blame on platform malloc(0) returning NULL.
This is a candidate for all bugfix releases.
2001-05-09 23:00:26 +00:00
Fred Drake bc7809b529 Update the tests for the fcntl module to check passing in file objects,
and using the constants defined there instead of FCNTL.
2001-05-09 21:11:59 +00:00
Jeremy Hylton e3e61049a5 Trivial tests of urllib2 for recent SF bug 2001-05-09 15:50:25 +00:00
Tim Peters 72f98e9b83 SF bug #422177: Results from .pyc differs from .py
Store floats and doubles to full precision in marshal.
Test that floats read from .pyc/.pyo closely match those read from .py.
Declare PyFloat_AsString() in floatobject header file.
Add new PyFloat_AsReprString() API function.
Document the functions declared in floatobject.h.
2001-05-08 15:19:57 +00:00
Tim Peters e63415ead8 SF patch #421922: Implement rich comparison for dicts.
d1 == d2 and d1 != d2 now work even if the keys and values in d1 and d2
don't support comparisons other than ==, and testing dicts for equality
is faster now (especially when inequality obtains).
2001-05-08 04:38:29 +00:00
Jeremy Hylton 4c889011db SF patch 419176 from MvL; fixed bug 418977
Two errors in dict_to_map() helper used by PyFrame_LocalsToFast().
2001-05-08 04:08:59 +00:00
Tim Peters 7ae2229afb This is a test showing SF bug 422177. It won't trigger until I check in
another change (to test_import.py, which simply imports the new file).  I'm
checking this piece in now, though, to make it easier to distribute a patch
for x-platform checking.
2001-05-08 03:58:01 +00:00
Tim Peters 8572b4fedf Generalize zip() to work with iterators.
NEEDS DOC CHANGES.
More AttributeErrors transmuted into TypeErrors, in test_b2.py, and,
again, this strikes me as a good thing.
This checkin completes the iterator generalization work that obviously
needed to be done.  Can anyone think of others that should be changed?
2001-05-06 01:05:02 +00:00
Tim Peters ef0c42d4e5 Get rid of silly 5am "del" stmts. 2001-05-05 21:36:52 +00:00
Tim Peters cb8d368b82 Reimplement PySequence_Contains() and instance_contains(), so they work
safely together and don't duplicate logic (the common logic was factored
out into new private API function _PySequence_IterContains()).
Visible change:
    some_complex_number  in  some_instance
no longer blows up if some_instance has __getitem__ but neither
__contains__ nor __iter__.  test_iter changed to ensure that remains true.
2001-05-05 21:05:01 +00:00
Tim Peters 75f8e35ef4 Generalize PySequence_Count() (operator.countOf) to work with iterators. 2001-05-05 11:33:43 +00:00
Tim Peters de9725f135 Make 'x in y' and 'x not in y' (PySequence_Contains) play nice w/ iterators.
NEEDS DOC CHANGES
A few more AttributeErrors turned into TypeErrors, but in test_contains
this time.
The full story for instance objects is pretty much unexplainable, because
instance_contains() tries its own flavor of iteration-based containment
testing first, and PySequence_Contains doesn't get a chance at it unless
instance_contains() blows up.  A consequence is that
    some_complex_number in some_instance
dies with a TypeError unless some_instance.__class__ defines __iter__ but
does not define __getitem__.
2001-05-05 10:06:17 +00:00
Tim Peters 2cfe368283 Make unicode.join() work nice with iterators. This also required a change
to string.join(), so that when the latter figures out in midstream that
it really needs unicode.join() instead, unicode.join() can actually get
all the sequence elements (i.e., there's no guarantee that the sequence
passed to string.join() can be iterated over *again* by unicode.join(),
so string.join() must not pass on the original sequence object anymore).
2001-05-05 05:36:48 +00:00
Tim Peters 6912d4ddf0 Generalize tuple() to work nicely with iterators.
NEEDS DOC CHANGES.
This one surprised me!  While I expected tuple() to be a no-brainer, turns
out it's actually dripping with consequences:
1. It will *allow* the popular PySequence_Fast() to work with any iterable
   object (code for that not yet checked in, but should be trivial).
2. It caused two std tests to fail.  This because some places used
   PyTuple_Sequence() (the C spelling of tuple()) as an indirect way to test
   whether something *is* a sequence.  But tuple() code only looked for the
   existence of sq->item to determine that, and e.g. an instance passed
   that test whether or not it supported the other operations tuple()
   needed (e.g., __len__).  So some things the tests *expected* to fail
   with an AttributeError now fail with a TypeError instead.  This looks
   like an improvement to me; e.g., test_coercion used to produce 559
   TypeErrors and 2 AttributeErrors, and now they're all TypeErrors.  The
   error details are more informative too, because the places calling this
   were *looking* for TypeErrors in order to replace the generic tuple()
   "not a sequence" msg with their own more specific text, and
   AttributeErrors snuck by that.
2001-05-05 03:56:37 +00:00
Tim Peters 15d81efb8a Generalize reduce() to work with iterators.
NEEDS DOC CHANGES.
2001-05-04 04:39:21 +00:00
Tim Peters 8bc10b0c57 Purge redundant cut&paste line. 2001-05-03 23:58:47 +00:00
Tim Peters 4e9afdca39 Generalize map() to work with iterators.
NEEDS DOC CHANGES.
Possibly contentious:  The first time s.next() yields StopIteration (for
a given map argument s) is the last time map() *tries* s.next().  That
is, if other sequence args are longer, s will never again contribute
anything but None values to the result, even if trying s.next() again
could yield another result.  This is the same behavior map() used to have
wrt IndexError, so it's the only way to be wholly backward-compatible.
I'm not a fan of letting StopIteration mean "try again later" anyway.
2001-05-03 23:54:49 +00:00
Tim Peters efdae3939a Remove redundant copy+paste code. 2001-05-03 07:09:25 +00:00
Tim Peters c307453162 Generalize max(seq) and min(seq) to work with iterators.
NEEDS DOC CHANGES.
2001-05-03 07:00:32 +00:00
Marc-André Lemburg 542fe56cb9 Fix for bug #417030: "print '%*s' fails for unicode string" 2001-05-02 14:21:53 +00:00
Tim Peters 0e57abf0cd Generalize filter(f, seq) to work with iterators. This also generalizes
filter() to no longer insist that len(seq) be defined.
NEEDS DOC CHANGES.
2001-05-02 07:39:38 +00:00
Tim Peters 8ae2df483c Whitespace normalization. 2001-05-02 05:54:44 +00:00
Fred Drake 0e540c391f Added tests for Weak*Dictionary iterator support.
Refactored some object initialization to be more reusable.
2001-05-02 05:44:22 +00:00
Tim Peters f553f89d45 Generalize list(seq) to work with iterators. This also generalizes list()
to no longer insist that len(seq) be defined.
NEEDS DOC CHANGES.
This is meant to be a model for how other functions of this ilk (max,
filter, etc) can be generalized similarly.  Feel encouraged to grab your
favorite and convert it!
Note some cute consequences:
    list(file) == file.readlines() == list(file.xreadlines())
    list(dict) == dict.keys()
    list(dict.iteritems()) = dict.items()
    list(xrange(i, j, k)) == range(i, j, k)
2001-05-01 20:45:31 +00:00
Jeremy Hylton ddc4fd03b1 Fix 2.1 nested scopes crash reported by Evan Simpson
The new test case demonstrates the bug.  Be more careful in
symtable_resolve_free() to add a var to cells or frees only if it
won't be added under some other rule.

XXX Add new assertion that will catch this bug.
2001-04-27 02:29:40 +00:00
Fred Drake 8f42e2b1fa Update test to accomodate the change to the namespace_separator parameter
of ParserCreate().

Added assignment tests for the ordered_attributes and specified_attributes
values, similar to the checks for the returns_unicode attribute.
2001-04-25 16:03:54 +00:00
Guido van Rossum 8b48cf9016 Add test suite for iterators. 2001-04-21 13:33:54 +00:00
Tim Peters a3f98d6bac Give UserDict new __contains__ and __iter__ methods. 2001-04-21 09:13:15 +00:00
Guido van Rossum 0dbb4fba4c Implement, test and document "key in dict" and "key not in dict".
I know some people don't like this -- if it's really controversial,
I'll take it out again.  (If it's only Alex Martelli who doesn't like
it, that doesn't count as "real controversial" though. :-)

That's why this is a separate checkin from the iterators stuff I'm
about to check in next.
2001-04-20 16:50:40 +00:00
Jeremy Hylton 3090694068 Fix compileall.py so that it fails on SyntaxErrors
The changes cause compilation failures in any file in the Python
installation lib directory to cause the install to fail.  It looks
like compileall.py intended to behave this way, but a change to
py_compile.py and a separate bug defeated it.

Fixes SF bug #412436

This change affects the test suite, which contains several files that
contain intentional errors.  The solution is to extend compileall.py
with the ability to skip compilation of selected files.

In the test suite, rename nocaret.py and test_future[3..7].py to start
with badsyntax_nocaret.py and badsyntax_future[3..7].py.  Update the
makefile to skip compilation of these files.  Update the tests to use
the name names for imports.

NB compileall.py is changed so that compile_dir() returns success only
if all recursive calls to compile_dir() also check success.
2001-04-18 01:19:28 +00:00
Fred Drake a0a4ab1772 Add a test case for Weak*Dictionary.update() that would have caught a
recently reported bug; also exposed some other bugs in the implementation.
2001-04-16 17:37:27 +00:00
Guido van Rossum 42f92da307 Change the test data to ask for class C from module __main__ rather
than from module pickletester.  Using the latter turned out to cause
the test to break when invoked as "import test.test_pickle" or "import
test.autotest".
2001-04-16 00:28:21 +00:00
Guido van Rossum fc349862d4 In order to make this test work on Windows, the test locale has to be
set to 'en' there -- Windows does not understand the 'en_US' locale.
The test succeeds there.
2001-04-15 13:15:56 +00:00
Guido van Rossum f3ee46b82a Set the SO_REUSEADDR socket option in the server thread -- this seems
needed on some platforms (e.g. Solaris 8) when the test is run twice
in quick succession.
2001-04-15 00:42:13 +00:00