cpython

Commit Graph

Author	SHA1	Message	Date
Neal Norwitz	a7edb11122	Whitespace normalization	2006-07-30 06:59:13 +00:00
Neal Norwitz	f71ec5a0ac	Bug #1515471 : string.replace() accepts character buffers again. Pass the char* and size around rather than PyObject's.	2006-07-30 06:57:04 +00:00
Neal Norwitz	8e6675a7dc	Update doc to make it agree with code. Bottom factor out some common code.	2006-06-11 05:47:14 +00:00
Georg Brandl	90e27d38f5	Apply perky's fix for #1503157 : "/".join([u"", u""]) raising OverflowError. Also improve error message on overflow.	2006-06-10 06:40:50 +00:00
Georg Brandl	242508160e	RFE #1491485 : str/unicode.endswith()/startswith() now accept a tuple as first argument.	2006-06-09 18:45:48 +00:00
Neal Norwitz	b16e4e7860	Remove ; at end of macro. There was a compiler recently that warned about extra semi-colons. It may have been the HP C compiler. This file will trigger a bunch of those warnings now.	2006-06-01 05:32:49 +00:00
Fredrik Lundh	80f8e80c15	needforspeed: added Py_MEMCPY macro (currently tuned for Visual C only), and use it for string copy operations. this gives a 20% speedup on some string benchmarks.	2006-05-28 12:06:46 +00:00
Fredrik Lundh	0b7ef46950	needforspeed: stringlib refactoring: use find_slice for stringobject	2006-05-27 15:26:19 +00:00
Fredrik Lundh	c2d29c5a6d	needforspeed: replace improvements, changed to Py_LOCAL_INLINE where appropriate	2006-05-27 14:58:20 +00:00
Andrew Dalke	d49d5c49ba	cleanup - removed trailing whitespace	2006-05-27 14:16:40 +00:00
Fredrik Lundh	2d23d5bf2e	needforspeed: more stringlib refactoring	2006-05-27 10:05:10 +00:00
Andrew Dalke	7e0a62ea90	Added description of why splitlines doesn't use the prealloc strategy	2006-05-26 22:49:03 +00:00
Andrew Dalke	5132407868	Added limits to the replace code so it does not count all of the matching patterns in a string, only the number needed by the max limit.	2006-05-26 20:25:22 +00:00
Fredrik Lundh	e6e43c867d	needforspeed: stringlib refactoring: use stringlib/find for string find	2006-05-26 19:48:07 +00:00
Fredrik Lundh	58b5e84d52	needforspeed: stringlib refactoring, continued. added count and find helpers; updated unicodeobject to use stringlib_count	2006-05-26 19:24:53 +00:00
Andrew Dalke	c5da53ba78	substring split now uses /F's fast string matching algorithm. (If compiled without FAST search support, changed the pre-memcmp test to check the last character as well as the first. This gave a 25% speedup for my test case.) Rewrote the split algorithms so they stop when maxsplit gets to 0. Previously they did a string match first then checked if the maxsplit was reached. The new way prevents a needless string search.	2006-05-26 19:02:09 +00:00
Fredrik Lundh	b3167cbcd7	needforspeed: added rpartition implementation	2006-05-26 18:15:38 +00:00
Fredrik Lundh	3a65d87e8c	needforspeed: remove remaining USE_FAST macros; if fastsearch was broken, someone would have noticed by now ;-)	2006-05-26 17:31:41 +00:00
Fredrik Lundh	c2032fb86a	needforspeed: cleanup	2006-05-26 17:26:39 +00:00
Fredrik Lundh	b947948c61	needforspeed: stringlib refactoring (in progress)	2006-05-26 17:22:38 +00:00
Fredrik Lundh	a50d201bd9	needforspeed: stringlib refactoring (in progress)	2006-05-26 17:04:58 +00:00
Fredrik Lundh	7c940d1d68	needforspeed: use Py_LOCAL on a few more locals in stringobject.c	2006-05-26 16:32:42 +00:00
Andrew Dalke	02758d66ce	Eeked out another 3% or so performance in split whitespace by cleaning up the algorithm.	2006-05-26 15:21:01 +00:00
Andrew Dalke	525eab3712	Changes to string.split/rsplit on whitespace to preallocate space in the results list. Originally it allocated 0 items and used the list growth during append. Now it preallocates 12 items so the first few appends don't need list reallocs. ("Here are some words ."*2).split(None, 1) is 7% faster ("Here are some words ."*2).split() is is 15% faster (Your milage may vary, see dealership for details.) File parsing like this for line in f: count += len(line.split()) is also about 15% faster. There is a slowdown of about 3% for large strings because of the additional overhead of checking if the append is to a preallocated region of the list or not. This will be the rare case. It could be improved with special case code but we decided it was not useful enough. There is a cost of 12*sizeof(PyObject *) bytes per list. For the normal case of file parsing this is not a problem because of the lists have a short lifetime. We have not come up with cases where this is a problem in real life. I chose 12 because human text averages about 11 words per line in books, one of my data sets averages 6.2 words with a final peak at 11 words per line, and I work with a tab delimited data set with 8 tabs per line (or 9 words per line). 12 encompasses all of these. Also changed the last rstrip code to append then reverse, rather than doing insert(0). The strip() and rstrip() times are now comparable.	2006-05-26 14:00:45 +00:00
Fredrik Lundh	95e2a91615	use Py_LOCAL also for string and unicode objects	2006-05-26 11:38:15 +00:00
Fredrik Lundh	f2c0dfdb13	needforspeed: use Py_ssize_t for the fastsearch counter and skip length (thanks, neal!). and yes, I've verified that this doesn't slow things down ;-)	2006-05-26 10:27:17 +00:00
Fredrik Lundh	450277fef5	needforspeed: use METH_O for argument handling, which made partition some ~15% faster for the current tests (which is noticable faster than a corresponding find call). thanks to neal-who-never-sleeps for the tip.	2006-05-26 09:46:59 +00:00
Fredrik Lundh	06a69dd8ff	needforspeed: partition implementation, part two. feel free to improve the documentation and the docstrings.	2006-05-26 08:54:28 +00:00
Fredrik Lundh	fe5bb7e6d9	needforspeed: partition for 8-bit strings. for some simple tests, this is on par with a corresponding find, and nearly twice as fast as split(sep, 1) full tests, a unicode version, and documentation will follow tomorrow.	2006-05-25 23:27:53 +00:00
Bob Ippolito	955b64c031	squelch gcc4 darwin/x86 compiler warnings	2006-05-25 20:52:38 +00:00
Fredrik Lundh	554da412a8	needforspeed: use insert+reverse instead of append	2006-05-25 19:19:05 +00:00
Jack Diederich	60cbb3fe49	* eliminate warning by reverting tmp_s type to 'const char*'	2006-05-25 18:47:15 +00:00
Fredrik Lundh	c3434b3834	needforspeed: use fastsearch also for find/index and contains. the related tests are now about 10x faster.	2006-05-25 18:44:29 +00:00
Andrew Dalke	598710c727	Added overflow test for adding two (very) large strings where the new string is over max Py_ssize_t. I have no way to test it on my box or any box I have access to. At least it doesn't break anything.	2006-05-25 18:18:39 +00:00
Andrew M. Kuchling	f344c94c85	Comment typo	2006-05-25 18:11:16 +00:00
Fredrik Lundh	af72237abc	needforspeed: use "fastsearch" for count. this results in a 3x speedup for the related stringbench tests.	2006-05-25 17:55:31 +00:00
Andrew Dalke	8c9091074b	Fixed problem identified by Georg. The special-case in-place code for replace made a copy of the string using PyString_FromStringAndSize(s, n) and modify the copied string in-place. However, 1 (and 0) character strings are shared from a cache. This cause "A".replace("A", "a") to change the cached version of "A" -- used by everyone. Now may the copy with NULL as the string and do the memcpy manually. I've added regression tests to check if this happens in the future. Perhaps there should be a PyString_Copy for this case?	2006-05-25 17:53:00 +00:00
Fredrik Lundh	e68955cf32	needforspeed: new replace implementation by Andrew Dalke. replace is now about 3x faster on my machine, for the replace tests from stringbench.	2006-05-25 17:08:14 +00:00
Fredrik Lundh	0c71f88fc9	needforspeed: check for overflow in replace (from Andrew Dalke)	2006-05-25 16:46:54 +00:00
Fredrik Lundh	dfe503d3f0	needforspeed: _toupper/_tolower is a SUSv2 thing; fall back on ISO C versions if they're not defined.	2006-05-25 16:10:12 +00:00
Fredrik Lundh	4b4e33ef14	needforspeed: make new upper/lower work properly for single-character strings too... (thanks to georg brandl for spotting the exact problem faster than anyone else)	2006-05-25 15:49:45 +00:00
Fredrik Lundh	39ccef607e	needforspeed: speed up upper and lower for 8-bit string objects. (the unicode versions of these are still 2x faster on windows, though...) based on work by Andrew Dalke, with tweaks by yours truly.	2006-05-25 15:22:03 +00:00
Fredrik Lundh	763b50f9d9	docstring tweaks: count counts non-overlapping substrings, not total number of occurences	2006-05-22 15:35:12 +00:00
Tim Peters	8931ff1f67	Teach PyString_FromFormat, PyErr_Format, and PyString_FromFormatV about "%u", "%lu" and "%zu" formats. Since PyString_FromFormat and PyErr_Format have exactly the same rules (both inherited from PyString_FromFormatV), it would be good if someone with more LaTeX Fu changed one of them to just point to the other. Their docs were way out of synch before this patch, and I just did a mass copy+paste to repair that. Not a backport candidate (this is a new feature).	2006-05-13 23:28:20 +00:00
Martin v. Löwis	822f34a848	Revert 43315: Printing of %zd must be signed.	2006-05-13 13:34:04 +00:00
Thomas Wouters	568f1d0eed	Py_ssize_t issue; repr()'ing a very large string would result in a teensy string, because of a cast to int.	2006-04-21 13:54:43 +00:00
Thomas Wouters	dc5f808cbc	Make s.replace() work with explicit counts exceeding 2Gb.	2006-04-19 15:38:01 +00:00
Thomas Wouters	4abb3660ca	Use Py_ssize_t to hold the 'width' argument to the ljust, rjust, center and zfill stringmethods, so they can create strings larger than 2Gb on 64bit systems (even win64.) The unicode versions of these methods already did this right.	2006-04-19 14:50:15 +00:00
Skip Montanaro	429433b30b	C++ compiler cleanup: bunch-o-casts, plus use of unsigned loop index var in a couple places	2006-04-18 00:35:43 +00:00
Neal Norwitz	0e2cbabb8d	No need to cast a Py_ssize_t, use %z in PyErr_Format	2006-04-17 05:56:32 +00:00

309 Commits
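
Several of the commits listed above change behaviour that is visible from Python code: partition/rpartition, tuple arguments to startswith()/endswith(), the non-overlapping semantics of count(), the join() overflow fix, and the replace() fix for cached one-character strings. The sketch below exercises those behaviours. It is only an illustration, not code from the repository: it assumes a modern Python 3 interpreter, where `str` stands in for the 2006-era 8-bit and unicode string types, and the `demo` function and its dictionary keys are made-up names.

```python
# Illustrative sketch only; `demo` and its keys are hypothetical names.
# Assumes Python 3, where `str` replaces the 2006-era string types.

def demo():
    results = {}

    # partition / rpartition split once around the first / last separator.
    results["partition"] = "key=value=x".partition("=")
    results["rpartition"] = "key=value=x".rpartition("=")

    # startswith() / endswith() accept a tuple of candidates (RFE #1491485).
    results["endswith_tuple"] = "setup.py".endswith((".py", ".pyc"))

    # count() reports non-overlapping occurrences, not every match position.
    results["count"] = "aaaa".count("aa")

    # Joining empty strings must not raise OverflowError (#1503157).
    results["join"] = "/".join(["", ""])

    # replace() must return a new string and leave the original untouched,
    # even for one-character strings that the interpreter may cache.
    original = "A"
    results["replace"] = (original.replace("A", "a"), original)

    # Whitespace split with a maxsplit limit stops after the first split.
    results["split_maxsplit"] = ("Here are some words. " * 2).split(None, 1)

    return results


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, "->", value)
```

Running the file prints one line per behaviour. None of the speedups quoted in the commit messages are measured here; timing them would need a benchmark such as the stringbench suite the messages mention.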