Commit Graph

46 Commits

Author SHA1 Message Date
Martin v. Löwis 5b222135f8 Make identifiers str (not str8) objects throughout.
This affects the parser, various object implementations,
and all places that put identifiers into C string literals.

In testing, a number of crashes occurred as code would
fail when the recursion limit was reached (such as the
Unicode interning dictionary having key/value pairs where
key is not value). To solve these, I added an overflowed
flag, which allows for 50 more recursions after the
limit was reached and the exception was raised, and
a recursion_critical flag, which indicates that recursion
absolutely must be allowed, i.e. that a certain call
must not cause a stack overflow exception.

There are still some places where both str and str8 are
accepted as identifiers; these should eventually be
removed.
2007-06-10 09:51:05 +00:00
Walter Dörwald 1680713e52 Add interning of unicode strings by copying the functionality from
stringobject.c.

Intern "True" and "False" in bool_repr() again as it was in the
8bit string era.
2007-05-25 13:52:07 +00:00
Thomas Wouters 27d517b21b Merged revisions 53875-53911 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r53899 | neal.norwitz | 2007-02-25 16:52:27 +0100 (Sun, 25 Feb 2007) | 1 line

  Add more details when releasing interned strings
........
  r53900 | neal.norwitz | 2007-02-25 16:53:36 +0100 (Sun, 25 Feb 2007) | 1 line

  Whitespace only changes
........
  r53901 | jeremy.hylton | 2007-02-25 16:57:45 +0100 (Sun, 25 Feb 2007) | 8 lines

  Fix crash in exec when unicode filename can't be decoded.

  I can't think of an easy way to test this behavior.  It only occurs
  when the file system default encoding and the interpreter default
  encoding are different, such that you can open the file but not decode
  its name.
........
  r53902 | jeremy.hylton | 2007-02-25 17:01:58 +0100 (Sun, 25 Feb 2007) | 2 lines

  Put declarations before code.
........
  r53910 | fred.drake | 2007-02-25 18:56:27 +0100 (Sun, 25 Feb 2007) | 3 lines

  - SF patch #1657613: add documentation for the Element interface
  - clean up bogus use of the {datadescni} environment everywhere
........
  r53911 | neal.norwitz | 2007-02-25 20:44:48 +0100 (Sun, 25 Feb 2007) | 17 lines

  Variation of patch # 1624059 to speed up checking if an object is a subclass
  of some of the common builtin types.

  Use a bit in tp_flags for each common builtin type.  Check the bit
  to determine if any instance is a subclass of these common types.
  The check avoids a function call and O(n) search of the base classes.
  The check is done in the various Py*_Check macros rather than calling
  PyType_IsSubtype().

  All the bits are set in tp_flags when the type is declared
  in the Objects/*object.c files because PyType_Ready() is not called
  for all the types.  Should PyType_Ready() be called for all types?
  If so and the change is made, the changes to the Objects/*object.c files
  can be reverted (remove setting the tp_flags).  Objects/typeobject.c
  would also have to be modified to add conditions
  for Py*_CheckExact() in addition to each the PyType_IsSubtype check.
........
2007-02-25 20:39:11 +00:00
Georg Brandl 66a796e5ab Patch #1601678: move intern() to sys.intern(). 2006-12-19 20:50:34 +00:00
Martin v. Löwis 18e165558b Merge ssize_t branch. 2006-02-15 17:27:45 +00:00
Armin Rigo 89a39461bf Wrote down the invariants of some common objects whose structure is
exposed in header files.  Fixed a few comments in these headers.

As we might have expected, writing down invariants systematically exposed a
(minor) bug.  In this case, function objects have a writeable func_code
attribute, which could be set to code objects with the wrong number of
free variables.  Calling the resulting function segfaulted the interpreter.
Added a corresponding test.
2004-10-28 16:32:00 +00:00
Neal Norwitz 21d896cfa1 Use appropriate macros not the deprecated DL_IMPORT/DL_EXPORT macros 2003-07-01 20:15:21 +00:00
Neil Schemenauer 96aa0acef0 Use Py_GCC_ATTRIBUTE instead of __attribute__. Compilers other than GCC
might use __attribute__ in other ways (e.g. CodeWarrior).
2002-09-15 14:09:54 +00:00
Guido van Rossum 45ec02aed1 SF patch 576101, by Oren Tirosh: alternative implementation of
interning.  I modified Oren's patch significantly, but the basic idea
and most of the implementation is unchanged.  Interned strings created
with PyString_InternInPlace() are now mortal, and you must keep a
reference to the resulting string around; use the new function
PyString_InternImmortal() to create immortal interned strings.
2002-08-19 21:43:18 +00:00
Martin v. Löwis 8a8da798a5 Patch #505705: Remove eval in pickle and cPickle. 2002-08-14 07:46:28 +00:00
Mark Hammond 91a681debf Excise DL_EXPORT from Include.
Thanks to Skip Montanaro and Kalle Svensson for the patches.
2002-08-12 07:21:58 +00:00
Guido van Rossum cacfc07d08 - A new type object, 'string', is added. This is a common base type
for 'str' and 'unicode', and can be used instead of
  types.StringTypes, e.g. to test whether something is "a string":
  isinstance(x, string) is True for Unicode and 8-bit strings.  This
  is an abstract base class and cannot be instantiated directly.
2002-05-24 19:01:59 +00:00
Tim Peters 1f7df3595a Remove the CACHE_HASH and INTERN_STRINGS preprocessor symbols. 2002-03-29 03:29:08 +00:00
Neil Schemenauer 90b689076a Add function attributes that allow GCC to check the arguments of printf-like
functions.
2001-10-23 02:21:22 +00:00
Tim Peters 5a49ade70e More on SF bug [#460020] bug or feature: unicode() and subclasses.
Repaired str(i) to return a genuine string when i is an instance of a str
subclass.  New PyString_CheckExact() macro.
2001-09-11 01:41:59 +00:00
Guido van Rossum 5eef77a21b Make the Py<type>_Check() macro use PyObject_TypeCheck(). 2001-08-30 03:08:07 +00:00
Barry Warsaw dadace004b PyString_FromFormat() and PyString_FromFormatV(): Largely ripped from
PyErr_Format() these new C API methods can be used instead of
    sprintf()'s into hardcoded char* buffers.  This allows us to fix
    many situation where long package, module, or class names get
    truncated in reprs.

    PyString_FromFormat() is the varargs variety.
    PyString_FromFormatV() is the va_list variety

    Original PyErr_Format() code was modified to allow %p and %ld
    expansions.

    Many reprs were converted to this, checkins coming soo.  Not
    changed: complex_repr(), float_repr(), float_print(), float_str(),
    int_repr().  There may be other candidates not yet converted.

    Closes patch #454743.
2001-08-24 18:32:06 +00:00
Tim Peters a7259597f1 SF bug 433228: repr(list) woes when len(list) big.
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.

I don't consider this a bugfix candidate, as it's a performance boost.

Added _PyString_Join() to the internal string API.  If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
2001-06-16 05:11:17 +00:00
Martin v. Löwis cd35306a25 Patch #424335: Implement string_richcompare, remove string_compare.
Use new _PyString_Eq in lookdict_string.
2001-05-24 16:56:35 +00:00
Marc-André Lemburg 2d9204199f This patch changes the way the string .encode() method works slightly
and introduces a new method .decode().

The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.

Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.

The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.

As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).

Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
2001-05-15 12:00:02 +00:00
Barry Warsaw a903ad9855 _Py_ReleaseInternedStrings(): Private API function to decref and
release the interned string dictionary.  This is useful for memory
use debugging because it eliminates a huge source of noise from the
reports.  Only defined when INTERN_STRINGS is defined.
2001-02-23 16:40:48 +00:00
Tim Peters 38fd5b6413 Derived from Martin's SF patch 110609: support unbounded ints in %d,i,u,x,X,o formats.
Note a curious extension to the std C rules:  x, X and o formatting can never produce
a sign character in C, so the '+' and ' ' flags are meaningless for them.  But
unbounded ints *can* produce a sign character under these conversions (no fixed-
width bitstring is wide enough to hold all negative values in 2's-comp form).  So
these flags become meaningful in Python when formatting a Python long which is too
big to fit in a C long.  This required shuffling around existing code, which hacked
x and X conversions to death when both the '#' and '0' flags were specified:  the
hacks weren't strong enough to deal with the simultaneous possibility of the ' ' or
'+' flags too, since signs were always meaningless before for x and X conversions.
Isomorphic shuffling was required in unicodeobject.c.
Also added dozens of non-trivial new unbounded-int test cases to test_format.py.
2000-09-21 05:43:11 +00:00
Marc-André Lemburg d1ba443206 This patch adds a new Python C API called PyString_AsStringAndSize()
which implements the automatic conversion from Unicode to a string
object using the default encoding.

The new API is then put to use to have eval() and exec accept
Unicode objects as code parameter. This closes bugs #110924
and #113890.

As side-effect, the traditional C APIs PyString_Size() and
PyString_AsString() will also accept Unicode objects as
parameters.
2000-09-19 21:04:18 +00:00
Guido van Rossum 8586991099 REMOVED all CWI, CNRI and BeOpen copyright markings.
This should match the situation in the 1.6b1 tree.
2000-09-01 23:29:29 +00:00
Fred Drake 3cf4d2b3ea ANSI-fication and Py_PROTO extermination. 2000-07-09 00:55:06 +00:00
Marc-André Lemburg 3d1a1d7f0c Added prototypes for the new codec APIs for strings. These APIs
match the ones in the Unicode implementation, but were extended
to be able to reuse the existing Unicode codecs for string
purposes too.

Conversion from string to Unicode and back are done using the
default encoding.
2000-07-06 11:25:40 +00:00
Guido van Rossum ffcc3813d8 Change copyright notice - 2nd try. 2000-06-30 23:58:06 +00:00
Guido van Rossum fd71b9e9d4 Change copyright notice. 2000-06-30 23:50:40 +00:00
Guido van Rossum 43466ec7b0 Add DL_IMPORT(returntype) for all officially exported functions. 1998-12-04 18:48:25 +00:00
Guido van Rossum 1e6e9a2368 Two speedup hacks. Caching the hash saves recalculation of a string's
hash value.  Interning strings (which requires hash caching) tries to
ensure that only one string object with a given value exists, so
equality tests are one pointer comparison.  Together, these can speed
the interpreter up by as much as 20%.  Each costs the size of a long
or pointer per string object.  In addition, interned strings live
until the end of times.  If you are concerned about memory footprint,
simply comment the #define out here (and rebuild everything!).
1997-01-18 07:53:23 +00:00
Barry Warsaw accfb849f9 added PyString_GET_SIZE macro
for both PyString_GET_SIZE and PyString_AS_STRING, cast first argument
to a PyStringObject*
1997-01-06 22:42:50 +00:00
Guido van Rossum 067998f35e Add const to error and newstring functions 1996-12-10 15:33:34 +00:00
Guido van Rossum d266eb460e New permission notice, includes CNRI. 1996-10-25 14:44:06 +00:00
Guido van Rossum fdebf25705 Turn on CACHE_HASH, for 2% speedier dict lookups 1996-07-30 16:42:03 +00:00
Guido van Rossum 051ab123b4 make the type a parameter of the DL_IMPORT macro, for Borland C 1995-02-27 10:17:52 +00:00
Guido van Rossum 938178283c new names for lots of new functions 1995-01-17 16:01:01 +00:00
Guido van Rossum caa6380886 The great renaming, phase two: all header files have been updated to
use the new names exclusively, and the linker will see the new names.
Files that import "Python.h" also only see the new names.  Files that
import "allobjects.h" will continue to be able to use the old names,
due to the inclusion (in allobjects.h) of "rename2.h".
1995-01-12 11:45:45 +00:00
Guido van Rossum 5799b52008 Added 1995 copyright.
object.h: made sizes and refcnts signed ints.
stringobject.h: make getstrsize() signed int.
methodobject.h: add METH_VARARGS and METH_FREENAME flag bit definitions.
1995-01-04 19:06:22 +00:00
Guido van Rossum e89bc75048 Changes for dynamic linking under NT 1994-08-18 16:18:13 +00:00
Guido van Rossum b6775db241 Merge alpha100 branch back to main trunk 1994-08-01 11:34:53 +00:00
Sjoerd Mullender 3bb8a05947 Several optimizations and speed improvements.
cstubs: Use Matrix type instead of float[4][4].
1993-10-22 12:04:32 +00:00
Guido van Rossum a3309960a5 * Added support for X11 modules.
* Makefile: change location of FORMS library.
* posixmodule.c: turn #if 0 into #ifdef MSDOS (stuff in unistd.h or not)
* Almost all .h files: added CPP magic to avoid duplicate inclusions and
  to support inclusion from C++.
1993-07-28 09:05:47 +00:00
Guido van Rossum e537240c25 * Changed many files to use mkvalue() instead of newtupleobject().
* Fixcprt.py: added [-y file] option, do only files younger than file.
* modsupport.[ch]: added vmkvalue().
* intobject.c: use mkvalue().
* stringobject.c: added "formatstring"; renamed string* to string_*;
  ceval.c: call formatstring for string % value.
* longobject.c: close memory leak in divmod.
* parsetok.c: set result node to NULL when returning an error.
1993-03-16 12:15:04 +00:00
Guido van Rossum 5113f5fd34 Copyright for 1992 added 1992-04-05 14:20:22 +00:00
Guido van Rossum f70e43a073 Added copyright notice. 1991-02-19 12:39:46 +00:00
Guido van Rossum 85a5fbbdfe Initial revision 1990-10-14 12:07:46 +00:00