Commit Graph

647 Commits

Author SHA1 Message Date
Marc-André Lemburg 85cc4d8940 Fixed a couple of places where 'int' was used where 'long'
should have been used.
2000-07-06 19:43:31 +00:00
Jack Jansen 56cdce3070 Conditionally (currently on ifdef macintosh) break the large switch up
into 1000-case smaller ones.
2000-07-06 13:57:38 +00:00
Marc-André Lemburg 63f3d17418 Added new codec APIs and a new interface method .encode() which
works just like the Unicode one. The C APIs match the ones in the Unicode
implementation, but were extended to be able to reuse the existing
Unicode codecs for string purposes too.

Conversions from string to Unicode and back are done using the
default encoding.
2000-07-06 11:29:01 +00:00
Marc-André Lemburg 1f46860a29 Fix to bug #389:
Full_Name: Bastian Kleineidam
Version: 2.0b1 CVS 5.7.2000
OS: Debian Linux 2.2
Submission from: earth.cs.uni-sb.de (134.96.252.92)
2000-07-05 15:32:40 +00:00
Marc-André Lemburg a7acf425f6 Added new .isalpha() and .isalnum() methods which provide interfaces
to the new alphabetic lookup APIs in unicodectype.c.
2000-07-05 09:49:44 +00:00
Marc-André Lemburg f3938f55c7 Added new lookup API which matches all alphabetic Unicode characters,
i.e the ones with category 'Ll','Lu','Lt','Lo','Lm'.
2000-07-05 09:48:59 +00:00
Marc-André Lemburg 4027f8f4b3 Added new .isalpha() and .isalnum() methods to match the same
ones on the Unicode objects. Note that the string versions use
the (locale aware) C lib APIs isalpha() and isalnum().
2000-07-05 09:47:46 +00:00
Tim Peters 1f5871e834 Removed Py_PROTO and switched to ANSI C declarations in the dict
implementation.  This was really to test whether my new CVS+SSH
setup is more usable than the old one -- and turns out it is (for
whatever reason, it was impossible to do a commit before that
involved more than one directory).
2000-07-04 17:44:48 +00:00
Marc-André Lemburg 1e7205a62a Bill Tutt:
Make unicode_compare a true UTF-16 compare function (includes
support for surrogates).
2000-07-04 09:51:07 +00:00
Marc-André Lemburg 891bc65486 If auto-conversion fails, the Unicode codecs will return NULL.
This is now checked and the error passed on to the caller.
2000-07-03 09:57:53 +00:00
Fredrik Lundh efecc7d05b changed repr and str to always convert unicode strings
to 8-bit strings, using the default encoding.
2000-07-01 14:31:09 +00:00
Guido van Rossum 4cc6ac7b87 Neil Schemenauer: small fixes for GC 2000-07-01 01:00:38 +00:00
Guido van Rossum ffcc3813d8 Change copyright notice - 2nd try. 2000-06-30 23:58:06 +00:00
Guido van Rossum fd71b9e9d4 Change copyright notice. 2000-06-30 23:50:40 +00:00
Guido van Rossum 9a15c211cf Fix an error on AIX by using a proper cast. 2000-06-30 22:46:04 +00:00
Fred Drake a44d353e2b Trent Mick <trentm@activestate.com>:
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.

The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!

The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.


Notes on the patch:

There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)


Closes SourceForge patch #100505.
2000-06-30 15:01:00 +00:00
Marc-André Lemburg d49e5b4667 Marc-Andre Lemburg <mal@lemburg.com>:
A previous patch by Jack Jansen was accidently reverted.
2000-06-30 14:58:20 +00:00
Marc-André Lemburg f28dd83b86 Marc-Andre Lemburg <mal@lemburg.com>:
New buffer overflow checks for formatting strings.

By Trent Mick.
2000-06-30 10:29:57 +00:00
Jeremy Hylton c5007aa5c3 final patches from Neil Schemenauer for garbage collection 2000-06-30 05:02:53 +00:00
Fred Drake 13634cf7a4 This patch addresses two main issues: (1) There exist some non-fatal
errors in some of the hash algorithms. For exmaple, in float_hash and
complex_hash a certain part of the value is not included in the hash
calculation. See Tim's, Guido's, and my discussion of this on
python-dev in May under the title "fix float_hash and complex_hash for
64-bit *nix"

(2) The hash algorithms that use pointers (e.g. func_hash, code_hash)
are universally not correct on Win64 (they assume that sizeof(long) ==
sizeof(void*))

As well, this patch significantly cleans up the hash code. It adds the
two function _Py_HashDouble and _PyHash_VoidPtr that the various
hashing routine are changed to use.

These help maintain the hash function invariant: (a==b) =>
(hash(a)==hash(b))) I have added Lib/test/test_hash.py and
Lib/test/output/test_hash to test this for some cases.
2000-06-29 19:17:04 +00:00
Guido van Rossum 4f4b799b33 Jack Jansen: Use include "" instead of <>; and staticforward declarations 2000-06-29 00:06:39 +00:00
Guido van Rossum d7823f2645 Vladimir Marangozov:
Avoid calling the dealloc function, previously triggered with
DECREF(inst).  This caused a segfault in PyDict_GetItem, called with a
NULL dict, whenever inst->in_dict fails under low-memory conditions.
2000-06-28 23:46:07 +00:00
Guido van Rossum ad89bbcd88 Trent Mick: change a few casts for Win64 compatibility. 2000-06-28 21:57:18 +00:00
Guido van Rossum eceebb87d9 Jack Jansen: Moved includes to the top, removed think C support 2000-06-28 20:57:07 +00:00
Marc-André Lemburg 0f774e3987 Marc-Andre Lemburg <mal@lemburg.com>:
Patch to the standard unicode-escape codec which dynamically
loads the Unicode name to ordinal mapping from the module
ucnhash.

By Bill Tutt.
2000-06-28 16:43:35 +00:00
Marc-André Lemburg 7c014684c2 Marc-Andre Lemburg <mal@lemburg.com>:
Better error message for "1 in unicodestring". Submitted
by Andrew Kuchling.
2000-06-28 08:11:47 +00:00
Jeremy Hylton d08b4c4524 part 2 of Neil Schemenauer's GC patches:
This patch modifies the type structures of objects that
participate in GC.  The object's tp_basicsize is increased when
GC is enabled.  GC information is prefixed to the object to
maintain binary compatibility.  GC objects also define the
tp_flag Py_TPFLAGS_GC.
2000-06-23 19:37:02 +00:00
Jeremy Hylton d22162bac7 traverse functions should return 0 on success 2000-06-23 17:14:56 +00:00
Jeremy Hylton 99a8f90874 raise TypeError when PyObject_Get/SetAttr called with non-string name 2000-06-23 14:36:32 +00:00
Jeremy Hylton 8caad49c30 Round 1 of Neil Schemenauer's GC patches:
This patch adds the type methods traverse and clear necessary for GC
implementation.
2000-06-23 14:18:11 +00:00
Fred Drake 396f6e0d6a Fredrik Lundh <effbot@telia.com>:
Simplify find code; this is a performance improvement on at least some
platforms.
2000-06-20 15:47:54 +00:00
Marc-André Lemburg 49ef6dc1f4 Marc-Andre Lemburg <mal@lemburg.com>:
Fixed a bug in PyUnicode_Count() which would have caused a
core dump in case of substring coercion failure.

Synchronized .count() with the string method of the same name
to return len(s)+1 for s.count('').
2000-06-18 22:25:22 +00:00
Andrew M. Kuchling 74042d6e5d Patch from /F:
this patch introduces PySequence_Fast and PySequence_Fast_GET_ITEM,
and modifies the list.extend method to accept any kind of sequence.
2000-06-18 18:43:14 +00:00
Marc-André Lemburg bea47e768d Vladimir MARANGOZOV <Vladimir.Marangozov@inrialpes.fr>:
This patch fixes an optimisation mystery in _PyUnicodeNew causing segfaults
on AIX when the interpreter is compiled with -O.
2000-06-17 20:31:17 +00:00
Marc-André Lemburg 29dc381ce0 Michael Hudson <mwh21@cam.ac.uk>:
The error message refers to "append", yet the operation in
question is "concat".
2000-06-16 17:05:57 +00:00
Fred Drake 56780257c6 Thomas Wouters <thomas@xs4all.net>:
The following patch adds "sq_contains" support to rangeobject, and enables
the already-written support for sq_contains in listobject and tupleobject.

The rangeobject "contains" code should be a bit more efficient than the
current default "in" implementation ;-) It might not get used much, but it's
not that much to add.

listobject.c and tupleobject.c already had code for sq_contains, and the
proper struct member was set, but the PyType structure was not extended to
include tp_flags, so the object-specific code was not getting called (Go
ahead, test it ;-). I also did this for the immutable_list_type in
listobject.c, eventhough it is probably never used. Symmetry and all that.
2000-06-15 14:50:20 +00:00
Marc-André Lemburg 60bc809d9a Marc-Andre Lemburg <mal@lemburg.com>:
Added code so that .isXXX() testing returns 0 for emtpy strings.
2000-06-14 09:18:32 +00:00
Marc-André Lemburg 07ceb67d9c Marc-Andre Lemburg <mal@lemburg.com>:
Fixed a typo and removed a debug printf(). Thanks to Finn Bock
for finding these.
2000-06-10 09:32:51 +00:00
Jeremy Hylton a251ea0680 the PyDict_SetItem does not borrow a reference, so we need to decref
reported by Mark Hammon
2000-06-09 16:20:39 +00:00
Andrew M. Kuchling cb95a1470a Patch from Michael Hudson: improve unclear error message 2000-06-09 14:04:53 +00:00
Marc-André Lemburg d4ab4a5905 Marc-Andre Lemburg <mal@lemburg.com>:
Fixed %c formatting to check for one character arguments. Thanks
to Finn Bock for finding this bug.

Added a fix for bug PR#348 which originated from not resetting
the globals correctly in _PyUnicode_Fini().
2000-06-08 17:54:00 +00:00
Marc-André Lemburg 90e8147118 Marc-Andre Lemburg <mal@lemburg.com>:
Change the default encoding to 'ascii' (it was previously
defined as UTF-8).

Note: The implementation still uses UTF-8 to implement
the buffer protocol, so C APIs will still see UTF-8. This
is on purpose: rather than fixing the Unicode implementation,
the C APIs should be made Unicode aware.
2000-06-07 09:13:21 +00:00
Fred Drake 4c7fdfc35b Trent Mick <trentm@ActiveState.com>:
This patch correct bounds checking in PyLong_FromLongLong. Currently, it does
not check properly for negative values when checking to see if the incoming
value fits in a long or unsigned long. This results in possible silent
truncation of the value for very large negative values.
2000-06-01 18:37:36 +00:00
Fred Drake 914a2edb24 Improve TypeError exception message for list catenation. 2000-06-01 14:31:03 +00:00
Fred Drake b6a9ada757 Michael Hudson <mwh21@cam.ac.uk>:
Removed PyErr_BadArgument() calls and replaced them with more useful
error messages.
2000-06-01 03:12:13 +00:00
Fred Drake 785d14f965 Minimal change so I can add the rest of MAL's checkin message:
M.-A. Lemburg <mal@lemburg.com>:
Fixed a core dump in PyUnicode_Format().
2000-05-09 19:54:43 +00:00
Fred Drake e4315f58d2 M.-A. Lemburg <mal@lemburg.com>:
Added support for user settable default encodings. The
current implementation uses a per-process global which
defines the value of the encoding parameter in case it
is set to NULL (meaning: use the default encoding).
2000-05-09 19:53:39 +00:00
Guido van Rossum c18a6f466a Replace PyErr_BadArgument() error in PyInt_AsLong() with "an integer
is required" (we can't say more because we don't know in which context
it is called).
2000-05-09 14:27:48 +00:00
Guido van Rossum b8872e61c6 Trent Mick:
Fix the string methods that implement slice-like semantics with
optional args (count, find, endswith, etc.) to properly handle
indeces outside [INT_MIN, INT_MAX]. Previously the "i" formatter
for PyArg_ParseTuple was used to get the indices. These could overflow.

This patch changes the string methods to use the "O&" formatter with
the slice_index() function from ceval.c which is used to do the same
job for Python code slices (e.g. 'abcabcabc'[0:1000000000L]).
2000-05-09 14:14:27 +00:00
Guido van Rossum c682140de7 Trent Mick:
Fix the string methods that implement slice-like semantics with
optional args (count, find, endswith, etc.) to properly handle
indeces outside [INT_MIN, INT_MAX]. Previously the "i" formatter
for PyArg_ParseTuple was used to get the indices. These could overflow.

This patch changes the string methods to use the "O&" formatter with
the slice_index() function from ceval.c which is used to do the same
job for Python code slices (e.g. 'abcabcabc'[0:1000000000L]). slice_index()
is renamed _PyEval_SliceIndex() and is now exported. As well, the return
values for success/fail were changed to make slice_index directly
usable as required by the "O&" formatter.

[GvR: shouldn't a similar patch be applied to unicodeobject.c?]
2000-05-08 14:08:05 +00:00