Commit Graph

449 Commits

Author SHA1 Message Date
Christian Heimes 3497f94476 First step of the C API rename:
renamed Include/bytesobject.h to Include/bytearrayobject.h
renamed Include/stringobject.h to Include/bytesobject.h
added Include/stringobject.h with aliases
2008-05-26 12:29:14 +00:00
Georg Brandl ecbbd94e71 #2809 followup: even better split docstring. 2008-05-11 20:53:55 +00:00
Georg Brandl dfb77dbc70 #2809: elaborate str.split docstring a bit. 2008-05-11 09:11:40 +00:00
Gregory P. Smith 2ea2968064 get rid of assert (size >= 0) now that an explicit if (size < 0) is in the code. 2008-04-10 19:50:36 +00:00
Gregory P. Smith c00eb73a30 Raise SystemError when size < 0 is passed into PyString_FromStringAndSize,
PyBytes_FromStringAndSize or PyUnicode_FromStringAndSize.  [issue2587]
2008-04-09 23:16:37 +00:00
Martin v. Löwis d918e4e068 Bug #2388: Fix gcc warnings when compiling with --enable-unicode=ucs4. 2008-04-07 03:08:28 +00:00
Neal Norwitz d183bdd6fb Revert r61969 which added casts to Py_CHARMASK to avoid compiler warnings.
Rather than sprinkle casts throughout the code, change Py_CHARMASK to
always cast it's result to an unsigned char.  This should ensure we
do the right thing when accessing an array with the result.
2008-03-28 04:58:51 +00:00
Neal Norwitz 231346e23f Fix warnings about using char as an array subscript. This is not portable
since char is signed on some platforms and unsigned on others.
2008-03-27 04:40:50 +00:00
Christian Heimes 1a6387e683 Merged revisions 61750,61752,61754,61756,61760,61763,61768,61772,61775,61805,61809,61812,61819,61917,61920,61930,61933-61934 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/trunk-bytearray

........
  r61750 | christian.heimes | 2008-03-22 20:47:44 +0100 (Sat, 22 Mar 2008) | 1 line

  Copied files from py3k w/o modifications
........
  r61752 | christian.heimes | 2008-03-22 20:53:20 +0100 (Sat, 22 Mar 2008) | 7 lines

  Take One
  * Added initialization code, warnings, flags etc. to the appropriate places
  * Added new buffer interface to string type
  * Modified tests
  * Modified Makefile.pre.in to compile the new files
  * Added bytesobject.c to Python.h
........
  r61754 | christian.heimes | 2008-03-22 21:22:19 +0100 (Sat, 22 Mar 2008) | 2 lines

  Disabled bytearray.extend for now since it causes an infinite recursion
  Fixed serveral unit tests
........
  r61756 | christian.heimes | 2008-03-22 21:43:38 +0100 (Sat, 22 Mar 2008) | 5 lines

  Added PyBytes support to several places:
  str + bytearray
  ord(bytearray)
  bytearray(str, encoding)
........
  r61760 | christian.heimes | 2008-03-22 21:56:32 +0100 (Sat, 22 Mar 2008) | 1 line

  Fixed more unit tests related to type('') is not unicode
........
  r61763 | christian.heimes | 2008-03-22 22:20:28 +0100 (Sat, 22 Mar 2008) | 2 lines

  Fixed more unit tests
  Fixed bytearray.extend
........
  r61768 | christian.heimes | 2008-03-22 22:40:50 +0100 (Sat, 22 Mar 2008) | 1 line

  Implemented old buffer interface for bytearray
........
  r61772 | christian.heimes | 2008-03-22 23:24:52 +0100 (Sat, 22 Mar 2008) | 1 line

  Added backport of the io module
........
  r61775 | christian.heimes | 2008-03-23 03:50:49 +0100 (Sun, 23 Mar 2008) | 1 line

  Fix str assignement to bytearray. Assignment of a str of size 1 is interpreted as a single byte
........
  r61805 | christian.heimes | 2008-03-23 19:33:48 +0100 (Sun, 23 Mar 2008) | 3 lines

  Fixed more tests
  Fixed bytearray() comparsion with unicode()
  Fixed iterator assignment of bytearray
........
  r61809 | christian.heimes | 2008-03-23 21:02:21 +0100 (Sun, 23 Mar 2008) | 2 lines

  str(bytesarray()) now returns the bytes and not the representation of the bytearray object
  Enabled and fixed more unit tests
........
  r61812 | christian.heimes | 2008-03-23 21:53:08 +0100 (Sun, 23 Mar 2008) | 3 lines

  Clear error PyNumber_AsSsize_t() fails
  Use CHARMASK for ob_svall access
  disabled a test with memoryview again
........
  r61819 | christian.heimes | 2008-03-23 23:05:57 +0100 (Sun, 23 Mar 2008) | 1 line

  Untested updates to the PCBuild directory
........
  r61917 | christian.heimes | 2008-03-26 00:57:06 +0100 (Wed, 26 Mar 2008) | 1 line

  The type system of Python 2.6 has subtle differences to 3.0's. I've removed the Py_TPFLAGS_BASETYPE flags from bytearray for now. bytearray can't be subclasses until the issues with bytearray subclasses are fixed.
........
  r61920 | christian.heimes | 2008-03-26 01:44:08 +0100 (Wed, 26 Mar 2008) | 2 lines

  Disabled last failing test
  I don't understand what the test is testing and how it suppose to work. Ka-Ping, please check it out.
........
  r61930 | christian.heimes | 2008-03-26 12:46:18 +0100 (Wed, 26 Mar 2008) | 1 line

  Re-enabled bytes warning code
........
  r61933 | christian.heimes | 2008-03-26 13:20:46 +0100 (Wed, 26 Mar 2008) | 1 line

  Fixed a bug in the new buffer protocol. The buffer slots weren't copied into a subclass.
........
  r61934 | christian.heimes | 2008-03-26 13:25:09 +0100 (Wed, 26 Mar 2008) | 1 line

  Re-enabled bytearray subclassing - all tests are passing.
........
2008-03-26 12:49:49 +00:00
Neal Norwitz 4677fbf7de Try to fix a bunch of compiler warnings on Win64. 2008-03-25 04:18:18 +00:00
Amaury Forgeot d'Arc fac02faed8 Issue2469: Correct a typo I introduced at r61793: compilation error with UCS4 builds.
All buildbots compile with UCS2...
2008-03-24 21:04:10 +00:00
Amaury Forgeot d'Arc 9a0d3462fc #1477: ur'\U0010FFFF' raised in narrow unicode builds.
Corrected the raw-unicode-escape codec to use UTF-16 surrogates in
this case, just like the unicode-escape codec.
2008-03-23 09:55:29 +00:00
Neal Norwitz ade57d0485 Remove compiler warnings (on Alpha at least) about using chars as
array subscripts.  Using chars are dangerous b/c they are signed
on some platforms and unsigned on others.
2008-03-23 06:19:57 +00:00
Neal Norwitz 419fd49abe Issue 2321: reduce memory usage (increase the memory that is returned
to the system) by using pymalloc for the data of unicode objects.

Will backport.
2008-03-17 20:22:43 +00:00
Guido van Rossum 5bdff60617 Fix the overflows in expandtabs(). "This time for sure!"
(Exploit at request.)
2008-03-11 21:18:06 +00:00
Facundo Batista c11cecf3d0 Issue 1742669. Now %d accepts very big float numbers.
Thanks Gabriel Genellina.
2008-02-24 03:17:21 +00:00
Eric Smith a9f7d62480 Backport of PEP 3101, Advanced String Formatting, from py3k.
Highlights:
 - Adding PyObject_Format.
 - Adding string.Format class.
 - Adding __format__ for str, unicode, int, long, float, datetime.
 - Adding builtin format.
 - Adding ''.format and u''.format.
 - str/unicode fixups for formatters.

The files in Objects/stringlib that implement PEP 3101 (stringdefs.h,
unicodedefs.h, formatter.h, string_format.h) are identical in trunk
and py3k.  Any changes from here on should be made to trunk, and
changes will propogate to py3k).
2008-02-17 19:46:49 +00:00
Christian Heimes 3b718a79af Implemented Martin's suggestion to clear the free lists during the garbage collection of the highest generation. 2008-02-14 12:47:33 +00:00
Christian Heimes 5b970ad483 Unified naming convention for free lists and their limits. All free lists
in Object/ are named ``free_list``, the counter ``numfree`` and the upper
limit is a macro ``PyName_MAXFREELIST`` inside an #ifndef block.

The chances should make it easier to adjust Python for platforms with
less memory, e.g. mobile phones.
2008-02-06 13:33:44 +00:00
Christian Heimes 4d4f270941 Patch #1970 by Antoine Pitrou: Speedup unicode whitespace and linebreak detection. The speedup is about 25% for split() (571 / 457 usec) and 35% (175 / 127 usec )for splitlines() 2008-01-30 11:32:37 +00:00
Christian Heimes 7f39c9fcbb Backport of several functions from Python 3.0 to 2.6 including PyUnicode_FromString, PyUnicode_Format and PyLong_From/AsSsize_t. The functions are partly required for the backport of the bytearray type and _fileio module. They should also make it easier to port C to 3.0.
First chapter of the Python 3.0 io framework back port: _fileio
The next step depends on a working bytearray type which itself depends on a backport of the nwe buffer API.
2008-01-25 12:18:43 +00:00
Christian Heimes 000a074c95 Modified PyImport_Import and PyImport_ImportModule to always use absolute imports by calling __import__ with an explicit level of 0
Added a new API function PyImport_ImportModuleNoBlock. It solves the problem with dead locks when mixing threads and imports
2008-01-03 22:16:32 +00:00
Christian Heimes e93237dfcc #1629: Renamed Py_Size, Py_Type and Py_Refcnt to Py_SIZE, Py_TYPE and Py_REFCNT. Macros for b/w compatibility are available. 2007-12-19 02:37:44 +00:00
Amaury Forgeot d'Arc 5087980c1e The incremental decoder for utf-7 must preserve its state between calls.
Solves issue1460.

Might not be a backport candidate: a new API function was added,
and some code may rely on details in utf-7.py.
2007-11-20 23:31:27 +00:00
Facundo Batista 6f7e6fb7a2 Made _ParseTupleFinds only defined to unicodeobject.c 2007-11-16 19:16:15 +00:00
Facundo Batista 57d5669f4b Now in find, rfind, index, and rindex, you can use None as defaults,
as usual with slicing (both with str and unicode strings).  This
fixes issue 1259.

For str only the stringobject.c file was modified.  But for unicode,
I needed to repeat in the four functions a lot of code, so created
a new function that does part of the job for them (and placed it in
find.h, following a suggestion of Barry).

Also added tests for this behaviour.
2007-11-16 18:04:14 +00:00
Guido van Rossum 1c1ac38157 Backport fixes for the code that decodes octal escapes (and for PyString
also hex escapes) -- this was reaching beyond the end of the input string
buffer, even though it is not supposed to be \0-terminated.
This has no visible effect but is clearly the correct thing to do.
(In 3.0 it had a visible effect after removing ob_sstate from PyString.)
2007-10-29 22:15:05 +00:00
Walter Dörwald 9d04542cc9 Set startinpos before calling the error handler. 2007-08-30 15:34:55 +00:00
Walter Dörwald 8757878b12 Rewrap line. 2007-08-30 15:30:09 +00:00
Thomas Wouters 3ccec68a05 Improve extended slicing support in builtin types and classes. Specifically:
- Specialcase extended slices that amount to a shallow copy the same way as
   is done for simple slices, in the tuple, string and unicode case.

 - Specialcase step-1 extended slices to optimize the common case for all
   involved types.

 - For lists, allow extended slice assignment of differing lengths as long
   as the step is 1. (Previously, 'l[:2:1] = []' failed even though
   'l[:2] = []' and 'l[:2:None] = []' do not.)

 - Implement extended slicing for buffer, array, structseq, mmap and
   UserString.UserString.

 - Implement slice-object support (but not non-step-1 slice assignment) for
   UserString.MutableString.

 - Add tests for all new functionality.
2007-08-28 15:28:19 +00:00
Walter Dörwald 9ab80a9fb4 Move another variable declaration up. 2007-08-17 16:58:43 +00:00
Walter Dörwald 20b40d3bce Move variable declaration up. 2007-08-17 16:52:50 +00:00
Walter Dörwald 6e39080649 Backport r57105 and r57145 from the py3k branch: UTF-32 codecs. 2007-08-17 16:41:28 +00:00
Georg Brandl 9efd9b6fa4 Bug #1763149: use proper slice syntax in docstring.
(backport)
2007-07-29 17:38:35 +00:00
Martin v. Löwis 6819210b9e PEP 3123: Provide forward compatibility with Python 3.0, while keeping
backwards compatibility. Add Py_Refcnt, Py_Type, Py_Size, and
PyVarObject_HEAD_INIT.
2007-07-21 06:55:02 +00:00
Georg Brandl 7c3b50db66 Patch #1673759: add a missing overflow check when formatting floats
with %G.
2007-07-12 08:38:00 +00:00
Neal Norwitz 5c9a81a3d8 Fix a bug when there was a newline in the string expandtabs was called on.
This also catches another condition that can overflow.

Will backport.
2007-06-11 02:16:10 +00:00
Neal Norwitz 7dbd2a3720 Prevent expandtabs() on string and unicode objects from causing a segfault when
a large width is passed on 32-bit platforms.  Found by Google.

It would be good for people to review this especially carefully and verify
I don't have an off by one error and there is no other way to cause overflow.
2007-06-09 03:36:34 +00:00
Neal Norwitz ee3a1b5244 Variation of patch # 1624059 to speed up checking if an object is a subclass
of some of the common builtin types.

Use a bit in tp_flags for each common builtin type.  Check the bit
to determine if any instance is a subclass of these common types.
The check avoids a function call and O(n) search of the base classes.
The check is done in the various Py*_Check macros rather than calling
PyType_IsSubtype().

All the bits are set in tp_flags when the type is declared
in the Objects/*object.c files because PyType_Ready() is not called
for all the types.  Should PyType_Ready() be called for all types?
If so and the change is made, the changes to the Objects/*object.c files
can be reverted (remove setting the tp_flags).  Objects/typeobject.c
would also have to be modified to add conditions
for Py*_CheckExact() in addition to each the PyType_IsSubtype check.
2007-02-25 19:44:48 +00:00
Armin Rigo 7ccbca93a2 Forward-port of r52136,52138: a review of overflow-detecting code.
* unified the way intobject, longobject and mystrtoul handle
  values around -sys.maxint-1.

* in general, trying to entierely avoid overflows in any computation
  involving signed ints or longs is extremely involved.  Fixed a few
  simple cases where a compiler might be too clever (but that's all
  guesswork).

* more overflow checks against bad data in marshal.c.

* 2.5 specific: fixed a number of places that were still confusing int
  and Py_ssize_t.  Some of them could potentially have caused
  "real-world" breakage.

* list.pop(x): fixing overflow issues on x was messy.  I just reverted
  to PyArg_ParseTuple("n"), which does the right thing.  (An obscure
  test was trying to give a Decimal to list.pop()... doesn't make
  sense any more IMHO)

* trying to write a few tests...
2006-10-04 12:17:45 +00:00
Raymond Hettinger a0c95fa4d8 Fix endcase for str.rpartition() 2006-09-04 15:32:48 +00:00
Neal Norwitz 17753ecbfa Patch #1541585: fix buffer overrun when performing repr() on
a unicode string in a build with wide unicode (UCS-4) support.

This code could be improved, so add an XXX comment.
2006-08-21 22:21:19 +00:00
Marc-André Lemburg 3a457790c7 Correct an accidentally removed previous patch. 2006-08-14 12:57:27 +00:00
Marc-André Lemburg 040f76b79c Slightly revised version of patch #1538956:
Replace UnicodeDecodeErrors raised during == and !=
compares of Unicode and other objects with a new
UnicodeWarning.

All other comparisons continue to raise exceptions.
Exceptions other than UnicodeDecodeErrors are also left
untouched.
2006-08-14 10:55:19 +00:00
Neal Norwitz 8a87f5d37e Patch #1538606, Patch to fix __index__() clipping.
I modified this patch some by fixing style, some error checking, and adding
XXX comments.  This patch requires review and some changes are to be expected.
I'm checking in now to get the greatest possible review and establish a
baseline for moving forward.  I don't want this to hold up release if possible.
2006-08-12 17:03:09 +00:00
Neal Norwitz e1fdb32ff2 Handle allocation failures gracefully. Found with failmalloc.
Many (all?) of these could be backported.
2006-07-21 05:32:28 +00:00
Martin v. Löwis d825143be1 Patch #1455898: Incremental mode for "mbcs" codec. 2006-06-14 05:21:04 +00:00
Neal Norwitz de4c78a1d7 Initialize the type object so pychecker can't crash the interpreter. 2006-06-13 08:28:19 +00:00
Georg Brandl 90e27d38f5 Apply perky's fix for #1503157: "/".join([u"", u""]) raising OverflowError.
Also improve error message on overflow.
2006-06-10 06:40:50 +00:00
Georg Brandl 242508160e RFE #1491485: str/unicode.endswith()/startswith() now accept a tuple as first argument. 2006-06-09 18:45:48 +00:00
Georg Brandl 9f16760666 Repair refleaks in unicodeobject. 2006-06-04 21:46:16 +00:00
Martin v. Löwis 3f767795f6 Patch #1359618: Speed-up charmap encoder. 2006-06-04 19:36:28 +00:00
Fredrik Lundh 60d8b18831 needforspeed: stringlib refactoring: changed find_obj to find_slice,
to enable use from stringobject
2006-05-27 15:20:22 +00:00
Fredrik Lundh c2d29c5a6d needforspeed: replace improvements, changed to Py_LOCAL_INLINE
where appropriate
2006-05-27 14:58:20 +00:00
Martin v. Löwis 2e3f6b77d5 Revert bogus change committed in 46432 to this file. 2006-05-27 11:07:49 +00:00
Andrew Dalke e0df762719 fixed typo 2006-05-27 11:04:36 +00:00
Fredrik Lundh 2d23d5bf2e needforspeed: more stringlib refactoring 2006-05-27 10:05:10 +00:00
Martin v. Löwis d004fc810a Patch 1494554: Update numeric properties to Unicode 4.1. 2006-05-27 08:36:52 +00:00
Neal Norwitz d1b6cd7bfb Fix Coverity warnings.
- Check the correct variable (str_obj, not str) for NULL
 - sep_len was already verified it wasn't 0
2006-05-27 05:21:30 +00:00
Andrew M. Kuchling 07bbfc6a51 Comment typo 2006-05-26 19:51:10 +00:00
Fredrik Lundh e6e43c867d needforspeed: stringlib refactoring: use stringlib/find for string find 2006-05-26 19:48:07 +00:00
Fredrik Lundh c816281304 needforspeed: use a macro to fix slice indexes 2006-05-26 19:33:03 +00:00
Fredrik Lundh ce4eccb0c4 needforspeed: stringlib refactoring: use stringlib/find for unicode
find
2006-05-26 19:29:05 +00:00
Fredrik Lundh 58b5e84d52 needforspeed: stringlib refactoring, continued. added count and
find helpers; updated unicodeobject to use stringlib_count
2006-05-26 19:24:53 +00:00
Fredrik Lundh 9c0e9c089c needspeed: rpartition documentation, tests, and a bug fixes.
feel free to add more tests and improve the documentation.
2006-05-26 18:24:15 +00:00
Fredrik Lundh b3167cbcd7 needforspeed: added rpartition implementation 2006-05-26 18:15:38 +00:00
Fredrik Lundh b947948c61 needforspeed: stringlib refactoring (in progress) 2006-05-26 17:22:38 +00:00
Fredrik Lundh a50d201bd9 needforspeed: stringlib refactoring (in progress) 2006-05-26 17:04:58 +00:00
Fredrik Lundh 95e2a91615 use Py_LOCAL also for string and unicode objects 2006-05-26 11:38:15 +00:00
Fredrik Lundh f2c0dfdb13 needforspeed: use Py_ssize_t for the fastsearch counter and skip
length (thanks, neal!).  and yes, I've verified that this doesn't
slow things down ;-)
2006-05-26 10:27:17 +00:00
Fredrik Lundh 450277fef5 needforspeed: use METH_O for argument handling, which made partition some
~15% faster for the current tests (which is noticable faster than a corre-
sponding find call).  thanks to neal-who-never-sleeps for the tip.
2006-05-26 09:46:59 +00:00
Fredrik Lundh 06a69dd8ff needforspeed: partition implementation, part two.
feel free to improve the documentation and the docstrings.
2006-05-26 08:54:28 +00:00
Andrew Dalke b552c4d848 Code had returned an ssize_t, upcast to long, then converted with PyInt_FromLong.
Now using PyInt_FromSsize_t.
2006-05-25 18:03:25 +00:00
Fredrik Lundh 0c71f88fc9 needforspeed: check for overflow in replace (from Andrew Dalke) 2006-05-25 16:46:54 +00:00
Fredrik Lundh 347ee277aa needforspeed: refactored the replace code slightly; special-case
constant-length changes; use fastsearch to locate the first match.
2006-05-24 16:35:18 +00:00
Fredrik Lundh d5e0dc51cf needforspeedindeed: use fastsearch also for __contains__ 2006-05-24 15:11:01 +00:00
Fredrik Lundh 6471ee4f18 needforspeed: use "fastsearch" for count and findstring helpers. this
results in a 2.5x speedup on the stringbench count tests, and a 20x (!)
speedup on the stringbench search/find/contains test, compared to 2.5a2.

for more on the algorithm, see:

    http://effbot.org/zone/stringlib.htm

if you get weird results, you can disable the new algoritm by undefining
USE_FAST in Objects/unicodeobject.c.

enjoy /F
2006-05-24 14:28:11 +00:00
Fredrik Lundh 240bf2a8e4 use Py_ssize_t for string indexes (thanks, neal!) 2006-05-24 10:20:36 +00:00
Fredrik Lundh 7763351808 return 0 on misses, not -1. 2006-05-23 19:47:35 +00:00
Fredrik Lundh b63588c188 needforspeed: use append+reverse for rsplit, use "bloom filters" to
speed up splitlines and strip with charsets; etc.  rsplit is now as
fast as split in all our tests (reverse takes no time at all), and
splitlines() is nearly as fast as a plain split("\n") in our tests.
and we're not done yet... ;-)
2006-05-23 18:44:25 +00:00
Fredrik Lundh 833bf9422e needforspeed: fixed unicode "in" operator to use same implementation
approach as find/index
2006-05-23 10:12:21 +00:00
Tim Peters 1bacc641a0 unicode_repeat(): Change type of local to Py_ssize_t,
since that's what it should be.
2006-05-23 05:47:16 +00:00
Tim Peters 286085c781 PyUnicode_Join(): Recent code changes introduced new
compiler warnings on Windows (signed vs unsigned mismatch
in comparisons).  Cleaned that up by switching more locals
to Py_ssize_t.  Simplified overflow checking (it can _be_
simpler because while these things are declared as
Py_ssize_t, then should in fact never be negative).
2006-05-22 19:17:04 +00:00
Fredrik Lundh 8a8e05a2b9 needforspeed: use memcpy for "long" strings; use a better algorithm
for long repeats.
2006-05-22 17:12:58 +00:00
Fredrik Lundh f1d60a5384 needforspeed: speed up unicode repeat, unicode string copy 2006-05-22 16:29:30 +00:00
Fredrik Lundh 763b50f9d9 docstring tweaks: count counts non-overlapping substrings, not
total number of occurences
2006-05-22 15:35:12 +00:00
Neal Norwitz 1004a5339a Patch #1488312, Fix memory alignment problem on SPARC in unicode. Will backport 2006-05-15 07:17:23 +00:00
Thomas Wouters 715a4cdea2 Use %zd instead of %i as format character (in call to PyErr_Format) for
Py_ssize_t argument.
2006-04-16 22:04:49 +00:00
Martin v. Löwis 5cb6936672 Make Py_BuildValue, PyObject_CallFunction and
PyObject_CallMethod aware of PY_SSIZE_T_CLEAN.
2006-04-14 09:08:42 +00:00
Martin v. Löwis f15da6995b Remove another INT_MAX limitation 2006-04-13 07:24:50 +00:00
Martin v. Löwis 412fb67368 Change more ints to Py_ssize_t. 2006-04-13 06:34:32 +00:00
Martin v. Löwis 80d2e591d5 Revert 34153: Py_UNICODE should not be signed. 2006-04-13 06:06:08 +00:00
Anthony Baxter ac6bd46d5c spread the extern "C" { } magic pixie dust around. Python itself builds now
using a C++ compiler. Still lots and lots of errors in the modules built by
setup.py, and a bunch of warnings from g++ in the core.
2006-04-13 02:06:09 +00:00
Anthony Baxter a62862120d More low-hanging fruit. Still need to re-arrange some code (or find a better
solution) in the same way as listobject.c got changed. Hoping for a better
solution.
2006-04-11 07:42:36 +00:00
Georg Brandl ecdc0a9f46 That one was a mistake. 2006-03-30 12:19:07 +00:00
Georg Brandl 347b30042b Remove unnecessary casts in type object initializers. 2006-03-30 11:57:00 +00:00
Thomas Wouters a96affe1fc - Reindent a confusingly indented piece of code (no intended code changes
there)
 - Add missing DECREFs of inner-scope 'temp' variable
 - Add various missing DECREFs by changing 'return NULL' into 'goto onError'
 - Avoid double DECREF when last _PyUnicode_Resize() fails

Coverity found one of the missing DECREFs, but oddly enough not the others.
2006-03-12 00:29:36 +00:00
Martin v. Löwis 480f1bb67b Update Unicode database to Unicode 4.1. 2006-03-09 23:38:20 +00:00
Guido van Rossum 38fff8c4e4 Checking in the code for PEP 357.
This was mostly written by Travis Oliphant.
I've inspected it all; Neal Norwitz and MvL have also looked at it
(in an earlier incarnation).
2006-03-07 18:50:55 +00:00
Hye-Shik Chang 4af5c8cee4 SF #1444030: Fix several potential defects found by Coverity.
(reviewed by Neal Norwitz)
2006-03-07 15:39:21 +00:00
Martin v. Löwis 15e62742fa Revert backwards-incompatible const changes. 2006-02-27 16:46:16 +00:00
Thomas Wouters de01774dae Use correct PyArg_Parse format char for Py_ssize_t in unicode.center().
Fixes:

>>> u"".center(10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

on 64-bit systems.
2006-02-16 19:34:37 +00:00
Martin v. Löwis eb079f1c25 Use Py_ssize_t for counts and sizes.
Convert Py_ssize_t using PyInt_FromSsize_t
2006-02-16 14:32:27 +00:00
Martin v. Löwis 2c95cc6d72 Support %zd in PyErr_Format and PyString_FromFormat. 2006-02-16 06:54:25 +00:00
Tim Peters 15231548d2 doubletounicode(), longtounicode():
Py_SAFE_DOWNCAST can evaluate its first argument multiple
times in a debug build.  This caused two distinct assert-
failures in test_unicode run under a debug build.  Rewrote
the code in trivial ways so that multiple evaluation of the
first argument doesn't hurt.
2006-02-16 01:08:01 +00:00
Thomas Wouters 4701af5bf5 Remove two unused Py_ssize_t variables (merge glitches, looks like.) 2006-02-15 23:10:32 +00:00
Martin v. Löwis 18e165558b Merge ssize_t branch. 2006-02-15 17:27:45 +00:00
Neal Norwitz fc76d633e8 - Patch #1400181, fix unicode string formatting to not use the locale.
This is how string objects work.  u'%f' could use , instead of .
  for the decimal point.  Now both strings and unicode always use periods.

This is the code that would break:

import locale
locale.setlocale(locale.LC_NUMERIC, 'de_DE')
u'%.1f' % 1.0
assert '1.0' == u'%.1f' % 1.0

I couldn't create a test case which fails, but this fixes the problem.

Will backport.
2006-01-10 06:03:13 +00:00
Neal Norwitz d43069ce95 Fix icc warnings: remove (sometimes) unused variable conditionally 2006-01-08 01:12:10 +00:00
Martin v. Löwis dea59e5755 Stop maintaining the buildno file.
Also, stop determining Unicode sizes with PyString_GET_SIZE.
2006-01-05 10:00:36 +00:00
Hye-Shik Chang 835b243c71 Bug #1379994: Fix *unicode_escape codecs to encode r'\' as r'\\'
just like string codecs.
2005-12-17 04:38:31 +00:00
Jeremy Hylton af68c874a6 Add const to several API functions that take char *.
In C++, it's an error to pass a string literal to a char* function
without a const_cast().  Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.

I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc.  Predictably, there were a large set of functions that
needed to be fixed as a result of these changes.  The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKewords() to be a const char *kwlist[].

One cast was required as a result of the changes:  A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *.
2005-12-10 18:50:16 +00:00
Walter Dörwald d4fff1731c Fix leaked reference to None. 2005-11-28 22:15:56 +00:00
Andrew M. Kuchling 8294de5673 Another comment typo fix 2005-11-02 16:36:12 +00:00
Walter Dörwald 2e2c02fedb Fix typo in comment. 2005-11-02 08:57:11 +00:00
Fred Drake db390c1ad8 fix typos, mostly in comments 2005-10-28 14:39:47 +00:00
Michael W. Hudson b2308bb9be Fix bug:
[ 1327110 ] wrong TypeError traceback in generator expressions

by removing the code that can stomp on the users' TypeError raised by the
iterable argument to ''.join() -- PySequence_Fast (now?) gives a perfectly
reasonable message itself.  Also, a couple of tests.
2005-10-21 11:45:01 +00:00
Marc-André Lemburg 5c4a9d6591 Whitespace corrections. 2005-10-19 22:39:02 +00:00
Marc-André Lemburg e115ec832c Bug fix for [ 1331062 ] utf 7 codec broken.
Backport candidate.
2005-10-19 22:33:31 +00:00
Walter Dörwald d1c1e10f70 Part of SF patch #1313939: Speedup charmap decoding by extending
PyUnicode_DecodeCharmap() the accept a unicode string as the mapping
argument which is used as a mapping table.

This code isn't used by any of the codecs yet.
2005-10-06 20:29:57 +00:00
Walter Dörwald a47d1c08d0 SF bug #1251300: On UCS-4 builds the "unicode-internal" codec will now complain
about illegal code points. The codec now supports PEP 293 style error handlers.
(This is a variant of the Nik Haldimann's patch that detects truncated data)
2005-08-30 10:23:14 +00:00
Marc-André Lemburg a9cadcd41b Correct the handling of 0-termination of PyUnicode_AsWideChar()
and its usage in PyLocale_strcoll().

Clarify the documentation on this.

Thanks to Andreas Degert for pointing this out.
2004-11-22 13:02:31 +00:00
Marc-André Lemburg 204bd6d9d2 Applied patch for [ 1047269 ] Buffer overwrite in PyUnicode_AsWideChar.
Python 2.3.x candidate.
2004-10-15 07:45:05 +00:00
Skip Montanaro 6543b45b0c Initialize sep and seplen to suppress warning from gcc. 2004-09-16 03:28:13 +00:00
Thomas Heller ca0d2cb66e Add a missing line continuation character. 2004-09-15 11:41:32 +00:00
Walter Dörwald 065a32f550 Make the hint about the None default less ambiguous. 2004-09-14 09:45:10 +00:00
Walter Dörwald 782afc5927 Enhance the docstrings for unicode.split() and string.split()
to make it clear that it is possible to pass None as the
separator argument to get the default "any whitespace" separator.
2004-09-14 09:40:45 +00:00
Walter Dörwald 69652035bc SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support
decoding incomplete input (when the input stream is temporarily exhausted).
codecs.StreamReader now implements buffering, which enables proper
readline support for the UTF-16 decoders. codecs.StreamReader.read()
has a new argument chars which specifies the number of characters to
return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
have a new argument keepends. Trailing "\n"s will be stripped from the lines
if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
PyUnicode_DecodeUTF16Stateful.
2004-09-07 20:24:22 +00:00
Tim Peters 91879ab8ea PyUnicode_Join(): Bozo Alert. While this is chugging along, it may
need to convert str objects from the iterable to unicode.  So, if
someone set the system default encoding to something nasty enough,
the conversion process could mutate the input iterable as a side
effect, and PySequence_Fast doesn't hide that from us if the input was
a list.  IOW, can't assume the size of PySequence_Fast's result is
invariant across PyUnicode_FromObject() calls.
2004-08-27 22:35:44 +00:00
Tim Peters 05eba1fdc8 PyUnicode_Join(): Rewrote to use PySequence_Fast(). This doesn't do
much to reduce the size of the code, but greatly improves its clarity.
It's also quicker in what's probably the most common case (the argument
iterable is a list).  Against it, if the iterable isn't a list or a tuple,
a temp tuple is materialized containing the entire input sequence, and
that's a bigger temp memory burden.  Yawn.
2004-08-27 21:32:02 +00:00
Tim Peters 894c512c2f PyUnicode_Join(): Missed a spot where I intended a cast from size_t to
int.  I sure wish MS would gripe about that!  Whatever, note that the
statement above it guarantees that the cast loses no info.
2004-08-27 05:08:36 +00:00
Tim Peters 8ce9f16259 PyUnicode_Join(): Two primary aims:
1. u1.join([u2]) is u2
2. Be more careful about C-level int overflow.

Since PySequence_Fast() isn't needed to achieve #1, it's not used -- but
the code could sure be simpler if it were.
2004-08-27 01:49:32 +00:00
Hye-Shik Chang e9ddfbb412 SF #989185: Drop unicode.iswide() and unicode.width() and add
unicodedata.east_asian_width().  You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
2004-08-04 07:38:35 +00:00
Marc-André Lemburg d25c650461 Let u'%s' % obj try obj.__unicode__() first and fallback to obj.__str__(). 2004-07-23 16:13:25 +00:00
Nicholas Bastin 9ba301e589 Moved SunPro warning suppression into pyport.h and out of individual
modules and objects.
2004-07-15 15:54:05 +00:00
Marc-André Lemburg 126b44cd41 Fix a copy&paste typo. 2004-07-10 12:04:20 +00:00
Marc-André Lemburg 1dffb120b7 .encode()/.decode() patch part 2. 2004-07-08 19:13:55 +00:00
Marc-André Lemburg d2d4598ec2 Allow string and unicode return types from .encode()/.decode()
methods on string and unicode objects. Added unicode.decode()
which was missing for no apparent reason.
2004-07-08 17:57:32 +00:00
Nicholas Bastin 1ce9e4cfc1 Fixed end-of-loop code not reached warning when using SunPro C 2004-06-17 18:27:18 +00:00
Hye-Shik Chang 974ed7cfa5 - SF #962502: Add two more methods for unicode type; width() and
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
2004-06-02 16:49:17 +00:00
Hye-Shik Chang 4057483164 SF Patch #926375: Remove a useless UTF-16 support code that is never
been used. (Suggested by Martin v. Loewis)
2004-04-06 07:24:51 +00:00
Walter Dörwald cd736e71a3 Fix reallocation bug in unicode.translate(): The code was comparing
characters instead of character pointers to determine space requirements.
2004-02-05 17:36:00 +00:00
Hye-Shik Chang 1bc09b7c2a Cosmetic fix for wrongly indented tabs with ts=4. 2004-01-03 19:35:43 +00:00
Hye-Shik Chang 7fc4cf57b8 Fix unicode.rsplit()'s bug that ignores separater on the end of string when
using specialized splitter for 1 char sep.
2003-12-23 09:10:16 +00:00
Hye-Shik Chang 40e9509dc7 Fix broken xmlcharrefreplace by rev 2.204.
(Pointy hat goes to perky)
2003-12-22 01:31:13 +00:00
Hye-Shik Chang 4a264fb054 SF #859573: Reduce compiler warnings on gcc 3.2 and above. 2003-12-19 01:59:56 +00:00
Hye-Shik Chang 3ae811b57d Add rsplit method for str and unicode builtin types.
SF feature request #801847.
Original patch is written by Sean Reifschneider.
2003-12-15 18:49:53 +00:00
Guido van Rossum 6c9e130524 - Removed FutureWarnings related to hex/oct literals and conversions
and left shifts.  (Thanks to Kalle Svensson for SF patch 849227.)
  This addresses most of the remaining semantic changes promised by
  PEP 237, except for repr() of a long, which still shows the trailing
  'L'.  The PEP appears to promise warnings for operations that
  changed semantics compared to Python 2.3, but this is not
  implemented; we've suffered through enough warnings related to
  hex/oct literals and I think it's best to be silent now.
2003-11-29 23:52:13 +00:00
Raymond Hettinger 4f8f976576 Add optional fillchar argument to ljust(), rjust(), and center() string methods. 2003-11-26 08:21:35 +00:00
Walter Dörwald 4894c30626 Fix a bug in the memory reallocation code of PyUnicode_TranslateCharmap().
charmaptranslate_makespace() allocated more memory than required for the
next replacement but didn't remember that fact, so memory size was growing
exponentially every time a replacement string is longer that one character.
This fixes SF bug #828737.
2003-10-24 14:25:28 +00:00