Commit Graph

1019 Commits

Author SHA1 Message Date
Guido van Rossum 65e0b99b61 SF patch #103158 by Greg Ball: Don't do unsafe arithmetic in xrange
object.

This fixes potential overflows in xrange()'s internal calculations on
64-bit platforms.  The fix is complicated because the sq_length slot
function can only return an int; we want to support
xrange(sys.maxint), which is a 64-bit quantity on most 64-bit
platforms (except Win64).  The solution is hacky but the best
possible: when the range is that long, we can use it in a for loop but
we can't ask for its length (nor can we actually iterate beyond
2**31-1, because the sq_item slot function has the same restrictions
on its arguments.  Fixing those restrictions is a project for another
day...
2001-01-15 18:58:56 +00:00
Tim Peters 142297ac92 Speed getline_via_fgets(), by supplying two "fast paths", although one is
faster than the other.  Should be faster for Mark Favas's 254-character
mail log lines, and *is* 3-4% quicker for my test case with much shorter
lines (but they're typical of *my* text files, and I'm tired of optimizing
for everyone else at my expense <wink> -- in fact, the only one who loses
here is Guido ...).
2001-01-15 10:36:56 +00:00
Tim Peters f29b64d243 Use the "MS" getline hack (fgets()) by default on non-get_unlocked
platforms.  See NEWS for details.
2001-01-15 06:33:19 +00:00
Guido van Rossum e07d5cf966 Jeff Epler's patch adding an xreadlines() method. (It just imports
the xreadlines module and lets it do its thing.)
2001-01-09 21:50:24 +00:00
Guido van Rossum dcf5715db1 Tsk, tsk, tsk. Treat FreeBSD the same as the other BSDs when defining
a fallback for TELL64.  Fixes SF Bug #128119.
2001-01-09 02:00:11 +00:00
Neil Schemenauer 010b0cc218 Fix a silly bug in float_pow. Sorry Tim. 2001-01-08 06:29:50 +00:00
Tim Peters 1c73323d6f A few reformats; no logic changes. 2001-01-08 04:02:07 +00:00
Guido van Rossum 8628206b95 Let's hope that three time's a charm...
Tim discovered another "bug" in my get_line() code: while the comments
said that n<0 was invalid, it was in fact still called with n<0 (when
PyFile_GetLine() was called with n<0).  In that case fortunately
executed the same code as for n==0.

Changed the comment to admit this fact, and changed Tim's MS speed
hack code to use 'n <= 0' as the criteria for the speed hack.
2001-01-08 01:26:47 +00:00
Tim Peters 15b838521f Fiddled ms_getline_hack after talking w/ Guido: made clearer that the
code duplication is to let us get away without a realloc whenever possible;
boosted the init buf size (the cutoff at which we *can* get away without
a realloc) from 100 to 200 so that more files can enjoy this boost; and
allowed other threads to run in all cases.  The last two cost something,
but not significantly:  in my fat test case, less than a 1% slowdown total.
Since my test case has a great many short lines, that's probably the worst
slowdown, too.  While the logic barely changed, there were lots of edits.
This also gets rid of the reference to fp->_cnt, so the last platform
assumption being made here is that fgets doesn't overwrite bytes
capriciously (== beyond the terminating null byte it must write).
2001-01-08 00:53:12 +00:00
Tim Peters 86821b2563 MS Win32 .readline() speedup, as discussed on Python-Dev. This is a tricky
variant that never needs to "search from the right".
Also fixed unlikely memory leak in get_line, if string size overflows INTMAX.
Also new std test test_bufio to make sure .readline() works.
2001-01-07 21:19:34 +00:00
Guido van Rossum 4ddf0a01f7 Tim noticed that I had botched get_line_raw(). Looking again, I
realized that this behavior is already present in PyFile_GetLine(),
which is the only place that needs it.  A little refactoring of that
function made get_line_raw() redundant.
2001-01-07 20:51:39 +00:00
Marc-André Lemburg ec233e5803 This patch adds a new feature to the builtin charmap codec:
The mapping dictionaries can now contain 1-n mappings, meaning
that character ordinals may be mapped to strings or Unicode object,
e.g. 0x0078 ('x') -> u"abc", causing the ordinal to be replaced by
the complete string or Unicode object instead of just one character.

Another feature introduced by the patch is that of mapping oridnals to
the emtpy string. This allows removing characters.

The patch is different from patch #103100 in that it does not cause a
performance hit for the normal use case of 1-1 mappings.

Written by Marc-Andre Lemburg, copyright assigned to Guido van Rossum.
2001-01-06 14:59:58 +00:00
Guido van Rossum 1187aa4d33 Restructured get_line() for clarity and speed.
- The raw_input() functionality is moved to a separate function.

- Drop GNU getline() in favor of getc_unlocked(), which exists on more
  platforms (and is even a tad faster on my system).
2001-01-05 14:43:05 +00:00
Neil Schemenauer 5ed85ec0c0 Changes for PEP 208. PyObject_Compare has been rewritten. Instances no
longer get special treatment.  The Py_NotImplemented type is here as well.
2001-01-04 01:48:10 +00:00
Neil Schemenauer ba872e2534 Make long a new style number type. Sequence repeat is now done here
now as well.
2001-01-04 01:46:03 +00:00
Neil Schemenauer 139e72ad1a Make int a new style number type. Sequence repeat is now done here
now as well.
2001-01-04 01:45:33 +00:00
Neil Schemenauer 32117e5c29 Make float a new style number type. 2001-01-04 01:44:34 +00:00
Neil Schemenauer 29bfc07183 Make instances a new style number type. See PEP 208 for details. Instance
types no longer get special treatment from abstract.c so more number number
methods have to be implemented.
2001-01-04 01:43:46 +00:00
Neil Schemenauer 5a1f015bee Massive changes as per PEP 208. Read it for details. 2001-01-04 01:39:06 +00:00
Jeremy Hylton 1fb6088e86 dict_update has two boundary conditions: a.update(a) and a.update({})
Added test for second one.
2001-01-03 22:34:59 +00:00
Jeremy Hylton db60bb5aad fix leak 2001-01-03 22:32:16 +00:00
Marc-André Lemburg a866df806d This patch changes the default behaviour of the builtin charmap
codec to not apply Latin-1 mappings for keys which are not found
in the mapping dictionaries, but instead treat them as undefined
mappings.

The patch was originally written by Martin v. Loewis with some
additional (cosmetic) changes and an updated test script
by Marc-Andre Lemburg.

The standard codecs were recreated from the most current files
available at the Unicode.org site using the Tools/scripts/gencodec.py
tool.

This patch closes the bugs #116285 and #119960.
2001-01-03 21:29:14 +00:00
Neil Schemenauer 10e31cf82e Add garbage collection for module objects. Closes patch #102939 and
fixes bug #126345.
2001-01-02 15:58:27 +00:00
Fred Drake e7e190ef97 Make the indentation consistently use tabs instead of using spaces just
in one place.
2000-12-20 00:55:07 +00:00
Andrew M. Kuchling f947ffe951 Patch #102940: use only printable Unicode chars in reporting
incorrect % characters; characters outside the printable range are
 replaced with '?'
2000-12-19 22:49:06 +00:00
Andrew M. Kuchling 932af110d3 Patch #102868 from cgw: fix memory leak when an EOF is encountered
using GNU libc's getline()
2000-12-19 20:59:04 +00:00
Guido van Rossum cda4f9a8dc Fix off-by-one error in split_substring(). Fixes SF bug #122162. 2000-12-19 02:23:19 +00:00
Andrew M. Kuchling 6ca8917758 [ Patch #102852 ] Make % error a bit more informative by indicates the
index at which an unknown %-escape was found
2000-12-15 13:07:46 +00:00
Guido van Rossum adf5410dc4 Test for NULL returned from PyObject_NEW(). 2000-12-14 15:09:46 +00:00
Guido van Rossum 9e8f4ea0aa Test for NULL returned from PyObject_NEW(). 2000-12-14 14:59:53 +00:00
Tim Peters f7f88b11e4 Add long-overdue docstrings to dict methods. 2000-12-13 23:18:45 +00:00
Tim Peters 0e76ab2ecc Use METH_VARARGS instead of "1" in list method table. 2000-12-13 22:35:46 +00:00
Tim Peters f1c7c884b3 Typo repair in comments. Fell for GregS's .popitem() poke. 2000-12-13 19:58:25 +00:00
Tim Peters ea8f2bf9ca Bring comments up to date (e.g., they still said the table had to be
a prime size, which is in fact never true anymore ...).
2000-12-13 01:02:46 +00:00
Guido van Rossum ba6ab84e73 Add popitem() -- SF patch #102733. 2000-12-12 22:02:18 +00:00
Fred Drake 49312a52ec Jeffrey D. Collins <tokeneater@users.sourceforge.net>:
Fix type of the self parameter to some string object methods.

This closes patch #102670.
2000-12-06 14:27:49 +00:00
Moshe Zadka 5725d1eb03 Backing out my changes.
Improved version coming soon to a Source Forge near you!
2000-11-30 19:30:21 +00:00
Andrew M. Kuchling 1221e6df3d Only use getline() when compiling using glibc 2000-11-30 18:27:50 +00:00
Moshe Zadka 1a62750eda Added .first{item,value,key}() to dictionaries.
Complete with docos and tests.
OKed by Guido.
2000-11-30 12:31:03 +00:00
Tim Peters a3a3a030af Fox for SF bug #123859: %[duxXo] long formats inconsistent. 2000-11-30 05:22:44 +00:00
Andrew M. Kuchling 4b2b445f28 Patch #102469: Use glibc's getline() extension when reading unbounded lines 2000-11-29 02:53:22 +00:00
Guido van Rossum d7aa0f245f Update dependencies per /F. 2000-11-28 12:09:18 +00:00
Guido van Rossum 2ccda8a7c4 SF patch #102548, fix for bug #121013, by mwh@users.sourceforge.net.
Fixes a typo that caused "".join(u"this is a test") to dump core.
2000-11-27 18:46:26 +00:00
Guido van Rossum ecaa77798b Added _HAVE_BSDI and __APPLE__ to the list of platforms that require a
hack for TELL64()...  Sounds like there's something else going on
really.  Does anybody have a clue I can buy?
2000-11-13 19:48:22 +00:00
Fred Drake 0b796fa5c5 Fixed support for containment test when a negative step is used; this
*really* closes bug #121965.

Added three attributes to the xrange object: start, stop, and step.  These
are the same as for the slice objects.
2000-11-08 19:42:43 +00:00
Fred Drake a91e1650aa In the containment test, get the boundary condition right. ">" was used
where ">=" should have been.

This closes bug #121965.
2000-11-08 18:37:05 +00:00
Fredrik Lundh fad27aee11 Added 38,642 missing characters to the Unicode database (first-last
ranges) -- but thanks to the 2.0 compression scheme, this doesn't add
a single byte to the resulting binaries (!)

Closes bug #117524
2000-11-03 20:24:15 +00:00
Fred Drake 661ea26b3d Ka-Ping Yee <ping@lfw.org>:
Changes to error messages to increase consistency & clarity.

This (mostly) closes SourceForge patch #101839.
2000-10-24 19:57:45 +00:00
Marc-André Lemburg 53f3d4ac74 [ Bug #116174 ] using %% in cstrings sometimes fails with unicode paramsFix for the bug reported in Bug #116174: "%% %s" % u"abc" failed due
to the way string formatting delegated work to the Unicode formatting
function.
2000-10-07 08:54:09 +00:00
Fred Drake db810ac2b8 Donn Cave <donn@oz.net>:
Fix large file support for BeOS.

This closes SourceForge patch #101773.  Refer to the patch discussion for
information on possible alternate fixes.
2000-10-06 20:42:33 +00:00
Tim Peters c54d19043a SF bug 115831 and Ping's SF patch 101751, 0.0**-2.0 returns inf rather than
raise ValueError.  Checked in the patch as far as it went, but also changed
all of ints, longs and floats to raise ZeroDivisionError instead when raising
0 to a negative number.  This is what 754-inspired stds require, as the "true
result" is an infinity obtained from finite operands, i.e. it's a singularity.
Also changed float pow to not be so timid about using its square-and-multiply
algorithm.  Note that what math.pow does is unrelated to what builtin pow
does, and will still vary by platform.
2000-10-06 00:36:09 +00:00
Neil Schemenauer 08b53e6a2a Simplify _PyTuple_Resize by not using the tuple free list and dropping
support for the last_is_sticky flag.  A few hard to find bugs may be
fixed by this patch since the old code was buggy.
2000-10-05 19:36:49 +00:00
Thomas Wouters dc9100f57d Fix for SF bug #115987: PyInstance_HalfBinOp does not initialize the
result-object-pointer that is passed in, when an exception occurs during
coercion. The pointer has to be explicitly initialized in the caller to avoid
putting trash on the Python stack.
2000-10-05 12:43:25 +00:00
Tim Peters d57731f74b Move LONG_BIT from intobject.c to pyport.h. #error if it's already been
#define'd to an unreasonable value (several recent gcc systems have
misdefined it, causing bogus overflows in integer multiplication).  Nuke
CHAR_BIT entirely.
2000-10-05 01:42:25 +00:00
Neil Schemenauer e3550a65eb - fix a GC bug caused by malloc() failing 2000-10-04 16:20:41 +00:00
Barry Warsaw 5b4c22806f _PyUnicode_Fini(): Initialize the local freelist walking variable `u'
after unicode_empty has been freed, otherwise it might not point to
the real start of the unicode_freelist.  Final closure for SF bug
#110681, Jitterbug PR#398.
2000-10-03 20:45:26 +00:00
Guido van Rossum 4ae8ef84da In _PyUnicode_Fini(), decref unicode_empty before tearng down the free
list.  Discovered by Barry, fix approved by MAL.
2000-10-03 18:09:04 +00:00
Fred Drake d5fadf75e4 Rationalize use of limits.h, moving the inclusion to Python.h.
Add definitions of INT_MAX and LONG_MAX to pyport.h.
Remove includes of limits.h and conditional definitions of INT_MAX
and LONG_MAX elsewhere.

This closes SourceForge patch #101659 and bug #115323.
2000-09-26 05:46:01 +00:00
Fredrik Lundh 375732cd41 - don't set the titlecase flag for uppercase letters (sorry, tim) 2000-09-25 23:03:34 +00:00
Fredrik Lundh 9e7dd4c185 unicode database compression, step 3:
- use unidb compression for the unicodectype module.  smaller, faster,
  and slightly more portable...
2000-09-25 21:48:13 +00:00
Fredrik Lundh 69b58e2772 unicode database compression, step 3:
- use unidb compression for the unicodectype module.  smaller, faster,
  and slightly more portable...

(note: this commit doesn't include the unicodectype.c file itself; I'm
still waiting for the reviewers...)
2000-09-25 21:12:34 +00:00
Tim Peters 858346e484 Replace SIGFPE paranoia around strtod and atof. I don't believe these
fncs are allowed to raise SIGFPE (see the C std), but OK by me if
people using --with-fpectl want to pay for checking anyway.
2000-09-25 21:01:28 +00:00
Tim Peters ef14d73b7a Fix for SF bug 110624: float literals behave inconsistently.
I fixed the specific complaint but left the (many) large issues untouched.
See the (very long) bug report discussion for why:
    http://sourceforge.net/bugs/?func=detailbug&group_id=5470&bug_id=110624
Note that while I left the interface to the undocumented public API function
PyFloat_FromString alone, its 2nd argument is useless.  From a comment block
in the code:

RED_FLAG 22-Sep-2000 tim
PyFloat_FromString's pend argument is braindead.  Prior to this RED_FLAG,

1.  If v was a regular string, *pend was set to point to its terminating
    null byte.  That's useless (the caller can find that without any
    help from this function!).

2.  If v was a Unicode string, or an object convertible to a character
    buffer, *pend was set to point into stack trash (the auto temp
    vector holding the character buffer).  That was downright dangerous.

Since we can't change the interface of a public API function, pend is
still supported but now *officially* useless:  if pend is not NULL,
*pend is set to NULL.
2000-09-23 03:39:17 +00:00
Guido van Rossum 1a5e5830a7 Untested patch by Ty Sarna to make TELL64 work on older NetBSD systems.
According to Justin Pettit, this also works on OpenBSD, so I've added
that symbol as well.
2000-09-21 22:15:29 +00:00
Guido van Rossum 1e3c8ccb9b As suggested by Toby Dickenson, setting ob_type to NULL in
_Py_Dealloc(), is a bad idea (and always was!).  So let's drop it.
2000-09-21 16:25:33 +00:00
Tim Peters 38fd5b6413 Derived from Martin's SF patch 110609: support unbounded ints in %d,i,u,x,X,o formats.
Note a curious extension to the std C rules:  x, X and o formatting can never produce
a sign character in C, so the '+' and ' ' flags are meaningless for them.  But
unbounded ints *can* produce a sign character under these conversions (no fixed-
width bitstring is wide enough to hold all negative values in 2's-comp form).  So
these flags become meaningful in Python when formatting a Python long which is too
big to fit in a C long.  This required shuffling around existing code, which hacked
x and X conversions to death when both the '#' and '0' flags were specified:  the
hacks weren't strong enough to deal with the simultaneous possibility of the ' ' or
'+' flags too, since signs were always meaningless before for x and X conversions.
Isomorphic shuffling was required in unicodeobject.c.
Also added dozens of non-trivial new unbounded-int test cases to test_format.py.
2000-09-21 05:43:11 +00:00
Marc-André Lemburg d1ba443206 This patch adds a new Python C API called PyString_AsStringAndSize()
which implements the automatic conversion from Unicode to a string
object using the default encoding.

The new API is then put to use to have eval() and exec accept
Unicode objects as code parameter. This closes bugs #110924
and #113890.

As side-effect, the traditional C APIs PyString_Size() and
PyString_AsString() will also accept Unicode objects as
parameters.
2000-09-19 21:04:18 +00:00
Marc-André Lemburg e44e507b0e PyObject_SetAttr() and PyObject_GetAttr() now also accept Unicode
objects for the attribute name. Unicode objects are converted to
a string using the default encoding before trying the lookup.

Note that previously it was allowed to pass arbitrary objects as
attribute name in case the tp_getattro/setattro slots were defined.
This patch fixes this by applying an explicit string check first:
all uses of these slots expect string objects and do not check
for the type resulting in a core dump. The tp_getattro/setattro
are still useful as optimization for lookups using interned
string objects though.

This patch fixes bug #113829.
2000-09-18 16:20:57 +00:00
Tim Peters 6b184918f6 Fix for SF bug 110688: Instance deallocation neglected to account for
that Py_INCREF boosts global _Py_RefTotal when Py_REF_DEBUG is defined
but Py_TRACE_REFS isn't.

There are, IMO, way too many preprocessor gimmicks in use for refcount
debugging (at least 3 distinct true/false symbols, but not all 8 combos
are supported by the code, etc etc), and no coherent documentation of
this stuff -- 'twas too painful to track this one down.
2000-09-17 14:40:17 +00:00
Tim Peters 78fc0b57df Fixed legit gripe from c.l.py that math.fmod docs aren't confusing enough.
FRED, please check my monkey-see-monkey-do Tex fiddling!
2000-09-16 03:54:24 +00:00
Neil Schemenauer ce20967c2c Don't remove instance objects from the GC container set until we are
they are dead.  Fixes bug #113812.
2000-09-15 18:57:21 +00:00
Martin v. Löwis 3cd760425f Correctly cast the return value of realloc. 2000-09-15 07:32:39 +00:00
Martin v. Löwis c58dbebf4b Correctly use realloc return value. Fixes bug #114424. 2000-09-15 07:07:46 +00:00
Tim Peters 8f422461b4 Fix for bug 113934. string*n and unicode*n did no overflow checking at
all, either to see whether the # of chars fit in an int, or that the
amount of memory needed fit in a size_t.  Checking these is expensive, but
the alternative is silently wrong answers (as in the bug report) or
core dumps (which were easy to provoke using Unicode strings).
2000-09-09 06:13:41 +00:00
Fredrik Lundh df84675f93 changed \x to consume exactly two hex digits, also for unicode
strings.  closes PEP-223.

also added \U escape (eight hex digits).
2000-09-03 11:29:49 +00:00
Thomas Wouters f2b332dc7e Cosmetic cleanup by Vladimir. 2000-09-02 08:34:40 +00:00
Guido van Rossum 8586991099 REMOVED all CWI, CNRI and BeOpen copyright markings.
This should match the situation in the 1.6b1 tree.
2000-09-01 23:29:29 +00:00
Guido van Rossum bb8be93a50 Rewritten some pieces of PyNumber_InPlaceAdd() for clarity. 2000-09-01 23:27:32 +00:00
Thomas Wouters cadd5b6b58 Fix grouping, again. This time properly :-) Sorry, guys. 2000-09-01 07:53:25 +00:00
Jeremy Hylton b709df3810 refactor __del__ exception handler into PyErr_WriteUnraisable
add sanity check to gc: if an exception occurs during GC, call
PyErr_WriteUnraisable and then call Py_FatalEror.
2000-09-01 02:47:25 +00:00
Guido van Rossum 04127de434 Add parens suggested by gcc -Wall. 2000-09-01 02:39:00 +00:00
Fred Drake 1bff34ab96 Slight performance hack that also avoids requiring the existence of thread
state for dictionaries that have only been indexed by string keys.

See the comments in SourceForge for more.

This closes SourceForge patch #101309.
2000-08-31 19:31:38 +00:00
Fred Drake c88b99ce06 Clear errors raised by PyObject_Compare() without losing any existing
exception context.  This avoids improperly propogating errors raised by
a user-defined __cmp__() by a subsequent lookup operation.

This patch does *not* include the performance enhancement patch for
dictionaries with string keys only; that will be checked in separately.

This closes SourceForge patch #101277 and bug #112558.
2000-08-31 19:04:07 +00:00
Thomas Wouters 6b958f7d7b Fix grouping: this is how I intended it, misguided as I was in boolean
operator associativity.
2000-08-31 07:02:19 +00:00
Fred Drake 8ce159aef5 Peter Schneider-Kamp <nowonder@nowonder.de>:
Remove some of GCC's warning in -Wstrict-prototypes mode.

This closes SourceForge patch #101342.
2000-08-31 05:18:54 +00:00
Fred Drake 562f62aa9b Removed compiler warning about wanting explicit grouping around &&
expression next to a || expression; this is a readability-inspired
warning from GCC.
2000-08-31 05:15:44 +00:00
Guido van Rossum 9c0a99ec1a PyOS_CheckStack() returns 1 when failing, not -1. 2000-08-30 15:53:50 +00:00
Marc-André Lemburg f5e96fa6b7 Fixed a serious typo. 2000-08-25 22:49:05 +00:00
Marc-André Lemburg 6ef68b5b01 Fix to bug [ Bug #111860 ] file.writelines() crashes.
file.writelines() now tries to emulate the behaviour of file.write()
as closely as possible. Due to the problems with releasing the
interpreter lock the solution isn't exactly optimal, but still better
than not supporting the file.write() semantics at all.
2000-08-25 22:39:50 +00:00
Thomas Wouters 1de2a79a48 Call PyErr_Clear() to clear the AttributeError raised by GetAttr. 2000-08-25 10:47:46 +00:00
Thomas Wouters e289e0bd0c Support for the in-place operations introduced by augmented assignment. Only
the list object supports this currently, but other candidates are
gladly accepted (like arraymodule and such.)
2000-08-24 20:08:19 +00:00
Thomas Wouters e266e42c9c Addendum to previous change: now that 'f' is not unconditionally
initialized in the 'if (..)', do so manually.
2000-08-23 23:31:34 +00:00
Thomas Wouters bf6cfa5f8e Add extra check on whether 'tp_as_number' is still non-NULL after coercion,
in the PyNumber_* functions. Also, remove unnecessary tests from
PyNumber_Multiply: after BINOP(), neither argument can be an instance.
2000-08-23 23:16:10 +00:00
Jack Jansen d49cbe1060 Added PyOS_CheckStack call to PyObject_Compare
Lowered the recursion limit on compares to 60 (one recursion depth can
take a whopping 2K of stack space when running test_b1!)
2000-08-22 21:52:51 +00:00
Jack Jansen e979160f5e Added include for limits.h 2000-08-22 21:51:22 +00:00
Barry Warsaw ce4dc41b1a PyUnicode_AsUTF8String(): /F picks up what I missed: the local var
`str' is no longer necessary.  Gotta turn on -Wall!
2000-08-18 19:30:40 +00:00
Barry Warsaw 2dd4abf277 PyUnicode_AsUTF8String(): Don't need to explicitly incref str since
PyUnicode_EncodeUTF8() already returns the created object with the
proper reference count.  This fixes an Insure reported memory leak.
2000-08-18 06:58:15 +00:00
Barry Warsaw 9d23a4eb03 make_pair(): When comparing the pointers, they must be cast to integer
types (i.e. Py_uintptr_t, our spelling of C9X's uintptr_t).  ANSI
specifies that pointer compares other than == and != to non-related
structures are undefined.  This quiets an Insure portability warning.
2000-08-18 05:01:19 +00:00
Barry Warsaw 67c1a04bbb PyFloat_FromString(): Move s_buffer[] up to the top-level function
scope.  Previously, s_buffer[] was defined inside the
PyUnicode_Check() scope, but referred to in the outer scope via
assignment to s.  This quiets an Insure portability warning.
2000-08-18 05:00:03 +00:00
Barry Warsaw dc55d715bb PyInstance_DoBinOp(): When comparing the pointers, they must be cast
to integer types (i.e. Py_uintptr_t, our spelling of C9X's uintptr_t).
ANSI specifies that pointer compares other than == and != to
non-related structures are undefined.  This quiets an Insure
portability warning.
2000-08-18 04:57:32 +00:00
Thomas Wouters 1d75a79c00 Apply SF patch #101029: call __getitem__ with a proper slice object if there
is no __getslice__ available. Also does the same for C extension types.
Includes rudimentary documentation (it could use a cross reference to the
section on slice objects, I couldn't figure out how to do that) and a test
suite for all Python __hooks__ I could think of, including the new
behaviour.
2000-08-17 22:37:32 +00:00
Barry Warsaw 4df762ff98 Insure properly identifies the `interned' dictionary as leaking at
shutdown time, but CVS log entry for revision 2.45 explains why this
is so.  Simply include a comment so we don't have to re-figure it out
again 5 years from now.
2000-08-16 23:41:01 +00:00
Andrew M. Kuchling 1582a3ab98 Updated comment 2000-08-16 12:27:23 +00:00
Tim Peters 39dce29365 Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
This was a misleading bug -- the true "bug" was that hash(x) gave an error
return when x is an infinity.  Fixed that.  Added new Py_IS_INFINITY macro to
pyport.h.  Rearranged code to reduce growing duplication in hashing of float and
complex numbers, pushing Trent's earlier stab at that to a logical conclusion.
Fixed exceedingly rare bug where hashing of floats could return -1 even if there
wasn't an error (didn't waste time trying to construct a test case, it was simply
obvious from the code that it *could* happen).  Improved complex hash so that
hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.
2000-08-15 03:34:48 +00:00
Marc-André Lemburg b7520774e2 Fixed a couple of instances where a 0-length string was being
resized after creation. 0-length strings are usually shared
and _PyString_Resize() fails on these shared strings.

Fixes [ Bug #111667 ] unicode core dump.
2000-08-14 11:29:19 +00:00
Trent Mick a584664134 Check for overflow in list object insertion and raise OverflowError.
see: http://www.python.org/pipermail/python-dev/2000-August/014971.html
2000-08-13 22:47:45 +00:00
Trent Mick 20abf573ef Clean up warning from Monterey compiler.
Properly end a comment block. It was terminated fine later but by a subsequent
block and. It was also in #if 0. This patch is so trivial I can't believe I am
talking about it. :)
2000-08-12 22:14:34 +00:00
Trent Mick a248fb605f Clean up a warning on Win64. The downcast of the strlen size_t
return value to int is safe here because it previously checked that
there will be no overflow.
2000-08-12 21:37:39 +00:00
Trent Mick 8a74e5fc2c Add the current Win64 compiler to the list of those that need the
huge switch statement broken up. This will probably not be necessary when
the Win64 compiler matures.
2000-08-12 19:37:27 +00:00
Trent Mick f29f47b38b Add largefile support for Linux64 and WIn64. Add test_largefile and some minor
change to regrtest.py to allow optional running of test_largefile ('cause it's
slow on Win64).

This closes patches:
http://sourceforge.net/patch/index.php?func=detailpatch&patch_id=100510&group_id=5470
and
http://sourceforge.net/patch/index.php?func=detailpatch&patch_id=100511&group_id=5470
2000-08-11 19:02:59 +00:00
Vladimir Marangozov 1d3e239f08 Fix missing decrements of the recursive counter in PyObject_Compare().
Closes Patch #101065.
2000-08-11 00:14:26 +00:00
Guido van Rossum 164452cec4 Barry's patch to implement the new setdefault() method. 2000-08-08 16:12:54 +00:00
Marc-André Lemburg e5034378cc Removing UTF-16 aware Unicode comparison code. This kind of compare
function (together with other locale aware ones) should into a new collation
support module. See python-dev for a discussion of this removal.

Note: This patch should also be applied to the 1.6 branch.
2000-08-08 08:04:29 +00:00
Moshe Zadka cf703f04ad Removing warnings found by gcc -Wall 2000-08-04 15:36:13 +00:00
Tim Peters 72d421b75c Boost buffer sizes in the absence of snprintf on Windows.
Ensure that # of args to sprintf always matches # of format specifiers.
2000-08-04 03:05:40 +00:00
Fred Drake c76e0e5679 snprintf() is not portable, so continue to use sprintf() until a portable
snprintf() is available.
2000-08-04 02:34:41 +00:00
Marc-André Lemburg bff879cabb This patch finalizes the move from UTF-8 to a default encoding in
the Python Unicode implementation.

The internal buffer used for implementing the buffer protocol
is renamed to defenc to make this change visible. It now holds the
default encoded version of the Unicode object and is calculated
on demand (NULL otherwise).

Since the default encoding defaults to ASCII, this will mean that
Unicode objects which hold non-ASCII characters will no longer
work on C APIs using the "s" or "t" parser markers. C APIs must now
explicitly provide Unicode support via the "u", "U" or "es"/"es#"
parser markers in order to work with non-ASCII Unicode strings.

(Note: this patch will also have to be applied to the 1.6 branch
 of the CVS tree.)
2000-08-03 18:46:08 +00:00
Fred Drake 2b83b4601f Remove the tp_print handler.
Revise the tp_repr handler to produce a more "minimal" presentation.
Make the tolist() method use PyArg_ParseTuple() and provide a docstring.
2000-08-03 17:43:02 +00:00
Guido van Rossum c4a19e7fe9 Remobe beopen/cnri/cwi copyrights, according to CNRI instructions.
This doesn't change the copyright status for these files -- just the
markings!  Doing it on the main branch for these three files for which
the HEAD revision was pushed back into 1.6.
2000-08-03 16:42:14 +00:00
Guido van Rossum 16b1ad9c7d Changing the CNRI copyright notice according to CNRI's instructions.
This is a notice without a date, which apparently is not a claim to
copyright but only advice to the reader.  IANAL. :-)
2000-08-03 16:24:25 +00:00
Peter Schneider-Kamp 7e01890986 merge Include/my*.h into Include/pyport.h
marked my*.h as obsolete
2000-07-31 15:28:04 +00:00
Thomas Wouters 334fb8985b Use 'void' directly instead of the ANY #define, now that all code is ANSI C.
Leave the actual #define in for API compatibility.
2000-07-25 12:56:38 +00:00
Thomas Wouters c307352027 ANSIfy functions that were hiding inside a macro. 2000-07-23 22:09:59 +00:00
Thomas Wouters a534594fc7 ANSIfication: remove very-old-varargs code, fix function declarations so
they include prototypes.
2000-07-22 23:59:33 +00:00
Thomas Wouters 7889010731 Miscelaneous ANSIfications. I'm assuming here 'main' should take (int,
char**) and return an int even on PC platforms. If not, please fix
PC/utils/makesrc.c ;-P
2000-07-22 19:25:51 +00:00
Marc-André Lemburg 9542f48fd5 Fixed problems with UTF error reporting macros and some formatting bugs. 2000-07-17 18:23:13 +00:00
Marc-André Lemburg cf5f358784 Restore PyXXX_Length() APIs for binary compatibility.
New code will see the macros and therefore use the PyXXX_Size()
APIs instead.
By Thomas Wouters.
2000-07-17 09:22:55 +00:00
Greg Stein af36a3aa20 gcc is being stupid with if/else constructs
clean out some other warnings
2000-07-17 09:04:43 +00:00
Greg Stein ff975003cf stop messing around with goto and just write the macro correctly. 2000-07-16 21:39:49 +00:00
Fredrik Lundh 0e19e76aba - change \x to mean "byte" also in unicode literals
(patch #100912)
2000-07-16 18:47:43 +00:00
Tim Peters 855ffac224 Fix fatal compiler (MSVC6) error:
unicodeobject.c(735) :
    error C2143: syntax error : missing ';' before '}'
2000-07-16 17:10:50 +00:00
Marc-André Lemburg fb625847bf Fix to a bug found by Florian Weimer:
The UTF-8 decoder is still buggy (i.e. it doesn't pass Markus Kuhn's
stress test), mainly due to the following construct:

    #define UTF8_ERROR(details)  do {                       \
        if (utf8_decoding_error(&s, &p, errors, details))   \
            goto onError;                                   \
        continue;                                           \
    } while (0)

(The "continue" statement is supposed to exit from the outer loop,
but of course, it doesn't.  Indeed, this is a marvelous example of
the dangers of the C programming language and especially of the C
preprocessor.)
2000-07-16 13:29:13 +00:00
Thomas Wouters 7e47402264 Spelling fixes supplied by Rob W. W. Hooft. All these are fixes in either
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").

There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
2000-07-16 12:04:32 +00:00
Vladimir Marangozov 467a67e74d Fix in PyList_New(). With GC enabled and when out of memory,
free() the GC pointer, not the object pointer.
2000-07-15 03:31:31 +00:00
Andrew M. Kuchling 06051edc0d Added PyObject_AsFileDescriptor, which checks for integer, long integer,
or .fileno() method
2000-07-13 23:56:54 +00:00
Vladimir Marangozov 8dc19f672b Propagate the current exception in get_inprogress_dict() -- it doesn't
need to be cleared.
2000-07-12 23:39:38 +00:00
Jeremy Hylton 03657cfdb0 replace PyXXX_Length calls with PyXXX_Size calls 2000-07-12 13:05:33 +00:00
Jeremy Hylton 6253f83b0a change abstract size functions PySequence_Size &c.
add macros for backwards compatibility with C source
2000-07-12 12:56:19 +00:00
Andrew M. Kuchling bd9848d02f Fix typo in error message 2000-07-12 02:58:28 +00:00
Jack Jansen 28fc880e9a Include macglue.h on the macintosh, so function prototypes are in scope. 2000-07-11 21:47:20 +00:00
Jeremy Hylton 88887aa38e small updates to string_join:
use PyString_AS_STRING macro on local string object
    when resizing string, make sure resized string will always be big enough
    split string containing error message across two lines
add test to string_tests that causes resizing
2000-07-11 20:55:38 +00:00
Marc-André Lemburg 566d8a64eb Jeremy Hylton:
better error message for unicode coercion failure
2000-07-11 09:47:04 +00:00
Barry Warsaw 771d0675b6 string_join(): Some cleaning up of reference counting. In the
seqlen==1 clause, before returning item, we need to DECREF seq.  In
the res=PyString... failure clause, we need to goto finally to also
decref seq (and the DECREF of res in finally is changed to a
XDECREF).  Also, we need to DECREF seq just before the
PyUnicode_Join() return.
2000-07-11 04:58:12 +00:00
Jeremy Hylton 4904829dbf fix two refcount bugs in new string_join implementation:
1. PySequence_Fast_GET_ITEM is a macro and borrows a reference
2. The seq returned from PySequence_Fast must be decref'd
2000-07-11 03:28:17 +00:00
Jeremy Hylton 194e43e953 two changes to string_join:
implementation -- use PySequence_Fast interface to iterate over elements
interface -- if instance object reports wrong length, ignore it;
   previous version raised an IndexError if reported length was too high
2000-07-10 21:30:28 +00:00
Fredrik Lundh dde6164402 - changed hash calculation for unicode strings. the new
value is calculated from the character values, in a way
  that makes sure an 8-bit ASCII string and a unicode string
  with the same contents get the same hash value.

  (as a side effect, this also works for ISO Latin 1 strings).

  for more details, see the python-dev discussion.
2000-07-10 18:27:47 +00:00
Fred Drake 100814dc44 ANSI-fication of the sources. 2000-07-09 15:48:49 +00:00
Fred Drake a2f5511941 ANSI-fication of the sources. 2000-07-09 15:16:51 +00:00
Tim Peters c2e7da9859 Somebody started playing with const, so of course the outcome
was cascades of warnings about mismatching const decls.  Overall,
I think const creates lots of headaches and solves almost
nothing.  Added enough consts to shut up the warnings, but
this did require casting away const in one spot too (another
usual outcome of starting down this path):  the function
mymemreplace can't return const char*, but sometimes wants to
return its first argument as-is, which latter must be declared
const char* in order to avoid const warnings at mymemreplace's
call sites.  So, in the case the function wants to return the
first arg, that arg's declared constness must be subverted.
2000-07-09 08:02:21 +00:00
Fred Drake ba09633e1e ANSI-fication of the sources. 2000-07-09 07:04:36 +00:00
Fred Drake 45cfbcccc2 ANSI-fication of the sources. 2000-07-09 06:21:27 +00:00
Fred Drake ee238b977f ANSI-fication of the sources. 2000-07-09 06:03:25 +00:00
Fred Drake 1b190b4636 ANSI-fication of the sources. 2000-07-09 05:40:56 +00:00
Fred Drake 1f0968c5f8 Remove legacy use of __SC__; no longer needed now that ANSI source is
the standard for Python implementation.
2000-07-09 05:31:24 +00:00
Fred Drake fd99de6470 ANSI-fication of the sources. 2000-07-09 05:02:18 +00:00
Fred Drake 4288c80599 ANSI-fication of the sources. 2000-07-09 04:36:04 +00:00
Fred Drake 4201b9e420 type_error(): Added "const" to signature to eliminate warning with -Wall. 2000-07-09 04:34:13 +00:00
Fred Drake 3be9a8a5ed ANSI-fication of the source.
Make the indentation and brace placement internally consistent.
2000-07-09 04:14:42 +00:00
Fred Drake 799124718d ANSI-fication of the sources. 2000-07-09 04:06:11 +00:00
Tim Peters dbd9ba6a6c Nuke all remaining occurrences of Py_PROTO and Py_FPROTO. 2000-07-09 03:09:57 +00:00
Fredrik Lundh 2a1e060619 - changed __repr__ to use "unicode escape" encoding for unicode
strings, instead of the default encoding.
  (see "minidom" thread for discussion, and also patch #100706)
2000-07-08 17:43:32 +00:00
Skip Montanaro 4cbc9f7650 delete unused local variable from _PyTrash_deposit_object 2000-07-08 12:06:36 +00:00
Skip Montanaro 4ca150bdb2 _Py_RefTotal should only be declared here when Py_TRACE_REFS are #define'd 2000-07-08 12:04:57 +00:00
Tim Peters 7d3a511a40 Cray J90 fixes for long ints.
This was a convenient excuse to create the pyport.h file recently
discussed!
Please use new Py_ARITHMETIC_RIGHT_SHIFT when right-shifting a
signed int and you *need* sign-extension.  This is #define'd in
pyport.h, keying off new config symbol SIGNED_RIGHT_SHIFT_ZERO_FILLS.
If you're running on a platform that needs that symbol #define'd,
the std tests never would have worked for you (in particular,
at least test_long would have failed).
The autoconfig stuff got added to Python after my Unix days, so
I don't know how that works.  Would someone please look into doing
& testing an auto-config of the SIGNED_RIGHT_SHIFT_ZERO_FILLS
symbol?  It needs to be defined if & only if, e.g., (-1) >> 3 is
not -1.
2000-07-08 04:17:21 +00:00
Tim Peters 43f04a36cf The tail end of x_sub implicitly assumed that an unsigned short
contains 16 bits.  Not true on Cray J90.
2000-07-08 02:26:47 +00:00
Tim Peters 9ace6bc7ef Got RID of redundant coercions in longobject.c (as spotted by Greg
Stein -- thanks!).  Incidentally removed all the Py_PROTO macros
from object.h, as they prevented my editor from magically finding
the definitions of the "coercion", "cmpfunc" and "reprfunc"
typedefs that were being redundantly applied in longobject.c.
2000-07-08 00:32:04 +00:00
Marc-André Lemburg e12896ec98 New surrogate support in the UTF-8 codec. By Bill Tutt. 2000-07-07 17:51:08 +00:00
Tim Peters 9f688bf9d2 Some cleanup of longs in prepartion for Cray J90 fixes: got
rid of Py_PROTO, switched to ANSI function decls, and did some
minor fiddling.
2000-07-07 15:53:28 +00:00
Marc-André Lemburg 5a5c81a0e9 Added new API PyUnicode_FromEncodedObject() which supports decoding
objects including instance objects.

The old API PyUnicode_FromObject() is still available as shortcut.
2000-07-07 13:46:42 +00:00
Marc-André Lemburg 063e0cb4c6 Fix to bug #393 (UTF16 codec didn't like empty strings) and
corrected some usage of 'unsigned long' where Py_UNICODE
should have been used.
2000-07-07 11:27:45 +00:00
Sjoerd Mullender 2629bd5a33 Two more places where long should be used instead of int. Especially
true after revision 2.36 was checked in...
2000-07-07 09:47:24 +00:00
Marc-André Lemburg 449c325303 Fixed some code that used 'short' to use 'long' instead. 2000-07-06 20:13:23 +00:00
Marc-André Lemburg 85cc4d8940 Fixed a couple of places where 'int' was used where 'long'
should have been used.
2000-07-06 19:43:31 +00:00
Jack Jansen 56cdce3070 Conditionally (currently on ifdef macintosh) break the large switch up
into 1000-case smaller ones.
2000-07-06 13:57:38 +00:00
Marc-André Lemburg 63f3d17418 Added new codec APIs and a new interface method .encode() which
works just like the Unicode one. The C APIs match the ones in the Unicode
implementation, but were extended to be able to reuse the existing
Unicode codecs for string purposes too.

Conversions from string to Unicode and back are done using the
default encoding.
2000-07-06 11:29:01 +00:00
Marc-André Lemburg 1f46860a29 Fix to bug #389:
Full_Name: Bastian Kleineidam
Version: 2.0b1 CVS 5.7.2000
OS: Debian Linux 2.2
Submission from: earth.cs.uni-sb.de (134.96.252.92)
2000-07-05 15:32:40 +00:00
Marc-André Lemburg a7acf425f6 Added new .isalpha() and .isalnum() methods which provide interfaces
to the new alphabetic lookup APIs in unicodectype.c.
2000-07-05 09:49:44 +00:00
Marc-André Lemburg f3938f55c7 Added new lookup API which matches all alphabetic Unicode characters,
i.e the ones with category 'Ll','Lu','Lt','Lo','Lm'.
2000-07-05 09:48:59 +00:00
Marc-André Lemburg 4027f8f4b3 Added new .isalpha() and .isalnum() methods to match the same
ones on the Unicode objects. Note that the string versions use
the (locale aware) C lib APIs isalpha() and isalnum().
2000-07-05 09:47:46 +00:00
Tim Peters 1f5871e834 Removed Py_PROTO and switched to ANSI C declarations in the dict
implementation.  This was really to test whether my new CVS+SSH
setup is more usable than the old one -- and turns out it is (for
whatever reason, it was impossible to do a commit before that
involved more than one directory).
2000-07-04 17:44:48 +00:00
Marc-André Lemburg 1e7205a62a Bill Tutt:
Make unicode_compare a true UTF-16 compare function (includes
support for surrogates).
2000-07-04 09:51:07 +00:00
Marc-André Lemburg 891bc65486 If auto-conversion fails, the Unicode codecs will return NULL.
This is now checked and the error passed on to the caller.
2000-07-03 09:57:53 +00:00
Fredrik Lundh efecc7d05b changed repr and str to always convert unicode strings
to 8-bit strings, using the default encoding.
2000-07-01 14:31:09 +00:00
Guido van Rossum 4cc6ac7b87 Neil Schemenauer: small fixes for GC 2000-07-01 01:00:38 +00:00
Guido van Rossum ffcc3813d8 Change copyright notice - 2nd try. 2000-06-30 23:58:06 +00:00
Guido van Rossum fd71b9e9d4 Change copyright notice. 2000-06-30 23:50:40 +00:00
Guido van Rossum 9a15c211cf Fix an error on AIX by using a proper cast. 2000-06-30 22:46:04 +00:00
Fred Drake a44d353e2b Trent Mick <trentm@activestate.com>:
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.

The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!

The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.


Notes on the patch:

There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)


Closes SourceForge patch #100505.
2000-06-30 15:01:00 +00:00
Marc-André Lemburg d49e5b4667 Marc-Andre Lemburg <mal@lemburg.com>:
A previous patch by Jack Jansen was accidently reverted.
2000-06-30 14:58:20 +00:00
Marc-André Lemburg f28dd83b86 Marc-Andre Lemburg <mal@lemburg.com>:
New buffer overflow checks for formatting strings.

By Trent Mick.
2000-06-30 10:29:57 +00:00
Jeremy Hylton c5007aa5c3 final patches from Neil Schemenauer for garbage collection 2000-06-30 05:02:53 +00:00
Fred Drake 13634cf7a4 This patch addresses two main issues: (1) There exist some non-fatal
errors in some of the hash algorithms. For exmaple, in float_hash and
complex_hash a certain part of the value is not included in the hash
calculation. See Tim's, Guido's, and my discussion of this on
python-dev in May under the title "fix float_hash and complex_hash for
64-bit *nix"

(2) The hash algorithms that use pointers (e.g. func_hash, code_hash)
are universally not correct on Win64 (they assume that sizeof(long) ==
sizeof(void*))

As well, this patch significantly cleans up the hash code. It adds the
two function _Py_HashDouble and _PyHash_VoidPtr that the various
hashing routine are changed to use.

These help maintain the hash function invariant: (a==b) =>
(hash(a)==hash(b))) I have added Lib/test/test_hash.py and
Lib/test/output/test_hash to test this for some cases.
2000-06-29 19:17:04 +00:00
Guido van Rossum 4f4b799b33 Jack Jansen: Use include "" instead of <>; and staticforward declarations 2000-06-29 00:06:39 +00:00
Guido van Rossum d7823f2645 Vladimir Marangozov:
Avoid calling the dealloc function, previously triggered with
DECREF(inst).  This caused a segfault in PyDict_GetItem, called with a
NULL dict, whenever inst->in_dict fails under low-memory conditions.
2000-06-28 23:46:07 +00:00
Guido van Rossum ad89bbcd88 Trent Mick: change a few casts for Win64 compatibility. 2000-06-28 21:57:18 +00:00
Guido van Rossum eceebb87d9 Jack Jansen: Moved includes to the top, removed think C support 2000-06-28 20:57:07 +00:00
Marc-André Lemburg 0f774e3987 Marc-Andre Lemburg <mal@lemburg.com>:
Patch to the standard unicode-escape codec which dynamically
loads the Unicode name to ordinal mapping from the module
ucnhash.

By Bill Tutt.
2000-06-28 16:43:35 +00:00
Marc-André Lemburg 7c014684c2 Marc-Andre Lemburg <mal@lemburg.com>:
Better error message for "1 in unicodestring". Submitted
by Andrew Kuchling.
2000-06-28 08:11:47 +00:00
Jeremy Hylton d08b4c4524 part 2 of Neil Schemenauer's GC patches:
This patch modifies the type structures of objects that
participate in GC.  The object's tp_basicsize is increased when
GC is enabled.  GC information is prefixed to the object to
maintain binary compatibility.  GC objects also define the
tp_flag Py_TPFLAGS_GC.
2000-06-23 19:37:02 +00:00
Jeremy Hylton d22162bac7 traverse functions should return 0 on success 2000-06-23 17:14:56 +00:00
Jeremy Hylton 99a8f90874 raise TypeError when PyObject_Get/SetAttr called with non-string name 2000-06-23 14:36:32 +00:00
Jeremy Hylton 8caad49c30 Round 1 of Neil Schemenauer's GC patches:
This patch adds the type methods traverse and clear necessary for GC
implementation.
2000-06-23 14:18:11 +00:00
Fred Drake 396f6e0d6a Fredrik Lundh <effbot@telia.com>:
Simplify find code; this is a performance improvement on at least some
platforms.
2000-06-20 15:47:54 +00:00
Marc-André Lemburg 49ef6dc1f4 Marc-Andre Lemburg <mal@lemburg.com>:
Fixed a bug in PyUnicode_Count() which would have caused a
core dump in case of substring coercion failure.

Synchronized .count() with the string method of the same name
to return len(s)+1 for s.count('').
2000-06-18 22:25:22 +00:00
Andrew M. Kuchling 74042d6e5d Patch from /F:
this patch introduces PySequence_Fast and PySequence_Fast_GET_ITEM,
and modifies the list.extend method to accept any kind of sequence.
2000-06-18 18:43:14 +00:00
Marc-André Lemburg bea47e768d Vladimir MARANGOZOV <Vladimir.Marangozov@inrialpes.fr>:
This patch fixes an optimisation mystery in _PyUnicodeNew causing segfaults
on AIX when the interpreter is compiled with -O.
2000-06-17 20:31:17 +00:00
Marc-André Lemburg 29dc381ce0 Michael Hudson <mwh21@cam.ac.uk>:
The error message refers to "append", yet the operation in
question is "concat".
2000-06-16 17:05:57 +00:00
Fred Drake 56780257c6 Thomas Wouters <thomas@xs4all.net>:
The following patch adds "sq_contains" support to rangeobject, and enables
the already-written support for sq_contains in listobject and tupleobject.

The rangeobject "contains" code should be a bit more efficient than the
current default "in" implementation ;-) It might not get used much, but it's
not that much to add.

listobject.c and tupleobject.c already had code for sq_contains, and the
proper struct member was set, but the PyType structure was not extended to
include tp_flags, so the object-specific code was not getting called (Go
ahead, test it ;-). I also did this for the immutable_list_type in
listobject.c, eventhough it is probably never used. Symmetry and all that.
2000-06-15 14:50:20 +00:00
Marc-André Lemburg 60bc809d9a Marc-Andre Lemburg <mal@lemburg.com>:
Added code so that .isXXX() testing returns 0 for emtpy strings.
2000-06-14 09:18:32 +00:00
Marc-André Lemburg 07ceb67d9c Marc-Andre Lemburg <mal@lemburg.com>:
Fixed a typo and removed a debug printf(). Thanks to Finn Bock
for finding these.
2000-06-10 09:32:51 +00:00
Jeremy Hylton a251ea0680 the PyDict_SetItem does not borrow a reference, so we need to decref
reported by Mark Hammon
2000-06-09 16:20:39 +00:00
Andrew M. Kuchling cb95a1470a Patch from Michael Hudson: improve unclear error message 2000-06-09 14:04:53 +00:00
Marc-André Lemburg d4ab4a5905 Marc-Andre Lemburg <mal@lemburg.com>:
Fixed %c formatting to check for one character arguments. Thanks
to Finn Bock for finding this bug.

Added a fix for bug PR#348 which originated from not resetting
the globals correctly in _PyUnicode_Fini().
2000-06-08 17:54:00 +00:00
Marc-André Lemburg 90e8147118 Marc-Andre Lemburg <mal@lemburg.com>:
Change the default encoding to 'ascii' (it was previously
defined as UTF-8).

Note: The implementation still uses UTF-8 to implement
the buffer protocol, so C APIs will still see UTF-8. This
is on purpose: rather than fixing the Unicode implementation,
the C APIs should be made Unicode aware.
2000-06-07 09:13:21 +00:00
Fred Drake 4c7fdfc35b Trent Mick <trentm@ActiveState.com>:
This patch correct bounds checking in PyLong_FromLongLong. Currently, it does
not check properly for negative values when checking to see if the incoming
value fits in a long or unsigned long. This results in possible silent
truncation of the value for very large negative values.
2000-06-01 18:37:36 +00:00
Fred Drake 914a2edb24 Improve TypeError exception message for list catenation. 2000-06-01 14:31:03 +00:00
Fred Drake b6a9ada757 Michael Hudson <mwh21@cam.ac.uk>:
Removed PyErr_BadArgument() calls and replaced them with more useful
error messages.
2000-06-01 03:12:13 +00:00
Fred Drake 785d14f965 Minimal change so I can add the rest of MAL's checkin message:
M.-A. Lemburg <mal@lemburg.com>:
Fixed a core dump in PyUnicode_Format().
2000-05-09 19:54:43 +00:00
Fred Drake e4315f58d2 M.-A. Lemburg <mal@lemburg.com>:
Added support for user settable default encodings. The
current implementation uses a per-process global which
defines the value of the encoding parameter in case it
is set to NULL (meaning: use the default encoding).
2000-05-09 19:53:39 +00:00
Guido van Rossum c18a6f466a Replace PyErr_BadArgument() error in PyInt_AsLong() with "an integer
is required" (we can't say more because we don't know in which context
it is called).
2000-05-09 14:27:48 +00:00
Guido van Rossum b8872e61c6 Trent Mick:
Fix the string methods that implement slice-like semantics with
optional args (count, find, endswith, etc.) to properly handle
indeces outside [INT_MIN, INT_MAX]. Previously the "i" formatter
for PyArg_ParseTuple was used to get the indices. These could overflow.

This patch changes the string methods to use the "O&" formatter with
the slice_index() function from ceval.c which is used to do the same
job for Python code slices (e.g. 'abcabcabc'[0:1000000000L]).
2000-05-09 14:14:27 +00:00
Guido van Rossum c682140de7 Trent Mick:
Fix the string methods that implement slice-like semantics with
optional args (count, find, endswith, etc.) to properly handle
indeces outside [INT_MIN, INT_MAX]. Previously the "i" formatter
for PyArg_ParseTuple was used to get the indices. These could overflow.

This patch changes the string methods to use the "O&" formatter with
the slice_index() function from ceval.c which is used to do the same
job for Python code slices (e.g. 'abcabcabc'[0:1000000000L]). slice_index()
is renamed _PyEval_SliceIndex() and is now exported. As well, the return
values for success/fail were changed to make slice_index directly
usable as required by the "O&" formatter.

[GvR: shouldn't a similar patch be applied to unicodeobject.c?]
2000-05-08 14:08:05 +00:00
Guido van Rossum b8f820c5a9 The methods islower(), isupper(), isspace(), isdigit() and istitle()
gave bogus results for chars in the range 128-255, because their
implementation was using signed characters.  Fixed this by using
unsigned character pointers (as opposed to using Py_CHARMASK()).
2000-05-05 20:44:24 +00:00
Guido van Rossum 03e29f1ae9 Mark Hammond should get his act into gear (his words :-). Zero length
strings _are_ valid!
2000-05-04 15:52:20 +00:00
Guido van Rossum 42c29aaeb5 Fix warning detected by VC++ on assignment of Py_UNICODE to char. 2000-05-03 23:58:29 +00:00
Guido van Rossum b18618dab7 Vladimir Marangozov's long-awaited malloc restructuring.
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.

(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode.  I'm also holding back on his
change to main.c, which seems unnecessary to me.)
2000-05-03 23:44:39 +00:00
Guido van Rossum 4e751c3d12 Mark Hammond withdraws his fix -- the size includes the trailing 0 so
a size of 0 *is* illegal.
2000-05-03 12:27:22 +00:00
Guido van Rossum a6edfd9737 Mark Hammond:
Fixes the MBCS codec to work correctly with zero length strings.
2000-05-03 11:03:24 +00:00
Barry Warsaw ee98e4e75d Ignore a bunch of generated files. 2000-05-02 18:34:30 +00:00
Guido van Rossum 0e4f657a50 Marc-Andre Lemburg:
Fixed \OOO interpretation for Unicode objects. \777 now
correctly produces the Unicode character with ordinal 511.
2000-05-01 21:27:20 +00:00
Jeremy Hylton 37b1a26c89 add list_contains and tuplecontains: efficient implementations of tp_contains 2000-04-27 21:41:03 +00:00
Guido van Rossum ec5b776998 Marc-Andre Lemburg:
Doc strings can now be given as Unicode strings.
2000-04-27 20:14:13 +00:00
Guido van Rossum 3c1bb8043f Marc-Andre Lemburg:
Fixed a reference leak in the allocator.

Renamed utf8_string to _PyUnicode_AsUTF8String() and made
it external for use by other parts of the interpreter.
2000-04-27 20:13:50 +00:00
Jeremy Hylton 9e392e2412 potentially useless optimization
The previous checkin (2.84) added a PyErr_Format call that made the
cost of raising an AttributeError much more expensive.  In general
this doesn't matter, except that checks for __init__ and
__del__ methods, where exceptions are caught and cleared in C, also
got much more expensive.

The fix is to split instance_getattr1 into two calls:

instance_getattr2 checks the instance and the class for the attribute
and returns it or returns NULL on error.  It does not raise an
exception.

instance_getattr1 does rexec checks, then calls instance_getattr2.  It
raises an exception if instance_getattr2 returns NULL.

PyInstance_New and instance_dealloc now call instance_getattr2
directly.
2000-04-26 20:39:20 +00:00
Guido van Rossum e92e610a9e Christian Tismer -- total rewrite on trashcan code.
Improvements:
- does no longer need any extra memory
- has no relationship to tstate
- works in debug mode
- can easily be modified for free threading (hi Greg:)

Side effects:
Trashcan does change the order of object destruction.
Prevending that would be quite an immense effort, as
my attempts have shown. This version works always
the same, with debug mode or not. The slightly
changed destruction order should therefore be no problem.

Algorithm:
While the old idea of delaying the destruction of some
obejcts at a certain recursion level was kept, we now
no longer aloocate an object to hold these objects.
The delayed objects are instead chained together
via their ob_type field. The type is encoded via
ob_refcnt. When it comes to the destruction of the
chain of waiting objects, the topmost object is popped
off the chain and revived with type and refcount 1,
then it gets a normal Py_DECREF.

I am confident that this solution is near optimum
for minimizing side effects and code bloat.
2000-04-24 15:40:53 +00:00
Guido van Rossum 5ce78f8e4e Patch by Charles G Waldman to avoid a sneaky memory leak in
_PyTuple_Resize().  In addition, a change suggested by Jeremy Hylton
to limit the size of the free lists is also merged into this patch.

Charles wrote initially:

"""
Test Case:  run the following code:

class Nothing:
    def __len__(self):
        return 5
    def __getitem__(self, i):
        if i < 3:
            return i
        else:
            raise IndexError, i

def g(a,*b,**c):
    return

for x in xrange(1000000):
    g(*Nothing())


and watch Python's memory use go up and up.


Diagnosis:

The analysis begins with the call to PySequence_Tuple at line 1641 in
ceval.c - the argument to g is seen to be a sequence but not a tuple,
so it needs to be converted from an abstract sequence to a concrete
tuple.  PySequence_Tuple starts off by creating a new tuple of length
5 (line 1122 in abstract.c).  Then at line 1149, since only 3 elements
were assigned, _PyTuple_Resize is called to make the 5-tuple into a
3-tuple.  When we're all done the 3-tuple is decrefed, but rather than
being freed it is placed on the free_tuples cache.

The basic problem is that the 3-tuples are being added to the cache
but never picked up again, since _PyTuple_Resize doesn't make use of
the free_tuples cache.  If you are resizing a 5-tuple to a 3-tuple and
there is already a 3-tuple in free_tuples[3], instead of using this
tuple, _PyTuple_Resize will realloc the 5-tuple to a 3-tuple.  It
would more efficient to use the existing 3-tuple and cache the
5-tuple.

By making _PyTuple_Resize aware of the free_tuples (just as
PyTuple_New), we not only save a few calls to realloc, but also
prevent this misbehavior whereby tuples are being added to the
free_tuples list but never properly "recycled".
"""

And later:

"""
This patch replaces my submission of Sun, 16 Apr and addresses Jeremy
Hylton's suggestions that we also limit the size of the free tuple
list.  I chose 2000 as the maximum number of tuples of any particular
size to save.

There was also a problem with the previous version of this patch
causing a core dump if Python was built with Py_TRACE_REFS.  This is
fixed in the below version of the patch, which uses tupledealloc
instead of _Py_Dealloc.
"""
2000-04-21 21:15:05 +00:00
Jeremy Hylton 4a3dd2dcc2 Fix PR#7 comparisons of recursive objects
Note that comparisons of deeply nested objects can still dump core in
extreme cases.
2000-04-14 19:13:24 +00:00
Guido van Rossum f0b7b04ae8 Marc-Andre Lemburg:
The maxsplit functionality in .splitlines() was replaced by the keepends
functionality which allows keeping the line end markers together
with the string.

Added support for '%r' % obj: this inserts repr(obj) rather
than str(obj).
2000-04-11 15:39:26 +00:00
Guido van Rossum dc742b3184 Marc-Andre Lemburg:
Added a few missing whitespace Unicode char mappings.
Thanks to Brian Hooper.
2000-04-11 15:39:02 +00:00
Guido van Rossum 86662914be Marc-Andre Lemburg:
The maxsplit functionality in .splitlines() was replaced by the keepends
functionality which allows keeping the line end markers together
with the string.
2000-04-11 15:38:46 +00:00
Guido van Rossum ba71a247ac Simple optimization by Christian Tismer, who gives credit to Lenny
Kneler for reporting this issue: long_mult() is faster when the
smaller argument is on the left.  Swap the arguments accordingly.
2000-04-10 17:31:58 +00:00
Guido van Rossum fd4b957b06 Marc-Andre Lemburg:
* New exported API PyUnicode_Resize()

* The experimental Keep-Alive optimization was turned back
  on after some tweaks to the implementation. It should now
  work without causing core dumps... this has yet to tested
  though (switching it off is easy: see the unicodeobject.c
  file for details).

* Fixed a memory leak in the Unicode freelist cleanup code.

* Added tests to correctly process the return code from
  _PyUnicode_Resize().

* Fixed a bug in the 'ignore' error handling routines
  of some builtin codecs. Added test cases for these to
  test_unicode.py.
2000-04-10 13:51:10 +00:00
Guido van Rossum 90daa87569 Marc-Andre Lemburg:
* string_contains now calls PyUnicode_Contains() only when the other
  operand is a Unicode string (not whenever it's not a string).

* New format style '%r' inserts repr(arg) instead of str(arg).

* '...%s...' % u"abc" now coerces to Unicode just like
  string methods. Care is taken not to reevaluate already formatted
  arguments -- only the first Unicode object appearing in the
  argument mapping is looked up twice. Added test cases for
  this to test_unicode.py.
2000-04-10 13:47:21 +00:00
Guido van Rossum b244f6950b Marc-Andre Lemburg:
* TypeErrors during comparing of mixed type arguments including
  a Unicode object are now masked (just like they are for all
  other combinations).
2000-04-10 13:42:33 +00:00
Guido van Rossum 5f8b12f27e Mark Hammond:
In line with a similar checkin to object.c a while ago, this patch
gives a more descriptive error message for an attribute error on a
class instance.  The message now looks like:

AttributeError: 'Descriptor' instance has no attribute 'GetReturnType'
2000-04-10 13:03:19 +00:00
Guido van Rossum 5db862dd0c Skip Montanaro: add string precisions to calls to PyErr_Format
to prevent possible buffer overruns.
2000-04-10 12:46:51 +00:00
Guido van Rossum ba47704943 Conrad Huang points out that "if (0 < ch < 256)", while legal C,
doesn't mean what the Python programmer thought...
2000-04-06 18:18:10 +00:00
Guido van Rossum 34888ed689 Fredrik Lundh: eliminate a MSVC compiler warning. 2000-04-05 21:29:50 +00:00
Guido van Rossum 9e896b37c7 Marc-Andre's third try at this bulk patch seems to work (except that
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent).  Checkin messages:


New Unicode support for int(), float(), complex() and long().

- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
  Unicode to a decimal char* string (used in the above new
  APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above

Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
  are masked, all other errors such as ValueError during
  Unicode coercion are passed through (note that PyUnicode_Compare
  does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
  masked and 0 returned; all other errors are passed through

Better testing support for the standard codecs.

Misc minor enhancements, such as an alias dbcs for the mbcs codec.

Changes:
- PyLong_FromString() now applies the same error checks as
  does PyInt_FromString(): trailing garbage is reported
  as error and not longer silently ignored. The only characters
  which may be trailing the digits are 'L' and 'l' -- these
  are still silently ignored.
- string.ato?() now directly interface to int(), long() and
  float(). The error strings are now a little different, but
  the type still remains the same. These functions are now
  ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
  in the input string; PyNumber_Long() already did this (and
  still does)

Followed by:

Looks like I've gone a step too far there... (and test_contains.py
seem to have a bug too).

I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
2000-04-05 20:11:21 +00:00
Guido van Rossum 2ea3e143f0 Some blank lines. 2000-03-31 17:24:09 +00:00