Commit Graph

256 Commits

Author SHA1 Message Date
Serhiy Storchaka 413fdcea21 Issue #24821: Refactor STRINGLIB(fastsearch_memchr_1char) and split it on
STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independedly
without special preconditions.
2015-11-14 15:42:17 +02:00
Victor Stinner c3d2bc19e4 Use _PyBytesWriter in _PyBytes_FromIterator() 2015-10-14 14:15:49 +02:00
Victor Stinner c5c3ba4bec Add _PyBytesWriter_Resize() function
This function gives a control to the buffer size without using min_size.
2015-10-14 13:56:47 +02:00
Victor Stinner 3c50ce39bf Factorize _PyBytes_FromList() and _PyBytes_FromTuple() code using a C macro 2015-10-14 13:50:40 +02:00
Victor Stinner f2eafa323b Split PyBytes_FromObject() into subfunctions 2015-10-14 13:44:29 +02:00
Victor Stinner 2ec8063cc9 Modify _PyBytes_DecodeEscapeRecode() to use _PyBytesAPI
* Don't overallocate by 400% when recode is needed: only overallocate on demand
  using _PyBytesWriter.
* Use _PyLong_DigitValue to convert hexadecimal digit to int
* Create _PyBytes_DecodeEscapeRecode() subfunction
2015-10-14 13:32:13 +02:00
Victor Stinner f6358a7e4c _PyBytesWriter_Alloc(): only use 10 bytes of the small buffer in debug mode to
enhance code to detect buffer under- and overflow.
2015-10-14 12:02:39 +02:00
Victor Stinner 2bf8993db9 Optimize bytes.fromhex() and bytearray.fromhex()
Issue #25401: Optimize bytes.fromhex() and bytearray.fromhex(): they are now
between 2x and 3.5x faster. Changes:

* Use a fast-path working on a char* string for ASCII string
* Use a slow-path for non-ASCII string
* Replace slow hex_digit_to_int() function with a O(1) lookup in
  _PyLong_DigitValue precomputed table
* Use _PyBytesWriter API to handle the buffer
* Add unit tests to check the error position in error messages
2015-10-14 11:25:33 +02:00
Victor Stinner 772b2b09f2 Optimize bytearray % args
Issue #25399: Don't create temporary bytes objects: modify _PyBytes_Format() to
create work directly on bytearray objects.

* Rename _PyBytes_Format() to _PyBytes_FormatEx() just in case if something
  outside CPython uses it
* _PyBytes_FormatEx() now uses (char*, Py_ssize_t) for the input string, so
  bytearray_format() doesn't need tot create a temporary input bytes object
* Add use_bytearray parameter to _PyBytes_FormatEx() which is passed to
  _PyBytesWriter, to create a bytearray buffer instead of a bytes buffer

Most formatting operations are now between 2.5 and 5 times faster.
2015-10-14 09:56:53 +02:00
Victor Stinner 661aaccf9d Add use_bytearray attribute to _PyBytesWriter
Issue #25399: Add a new use_bytearray attribute to _PyBytesWriter to use a
bytearray buffer, instead of using a bytes object.
2015-10-14 09:41:48 +02:00
Victor Stinner 03dab786b2 Rewrite PyBytes_FromFormatV() using _PyBytesWriter API
* Add much more unit tests on PyBytes_FromFormatV()
* Remove the first loop to compute the length of the output string
* Use _PyBytesWriter to handle the bytes buffer, use overallocation
* Cleanup the code to make simpler and easier to review
2015-10-14 00:21:35 +02:00
Victor Stinner e9aa5950bb Fix compilation error in _PyBytesWriter_WriteBytes() on Windows 2015-10-12 13:57:47 +02:00
Victor Stinner 6c2cdae9e6 Writer APIs: use empty string singletons
Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the
empty bytes/Unicode string if the string is empty.
2015-10-12 13:29:43 +02:00
Victor Stinner c29e29bed1 Relax _PyBytesWriter API
Don't require _PyBytesWriter pointer to be a "char *". Same change for
_PyBytesWriter_WriteBytes() parameter.

For example, binascii uses "unsigned char*".
2015-10-12 13:12:54 +02:00
Victor Stinner 0cdad1e2bc Issue #25349: Add fast path for b'%c' % int
Optimize also %% formater.
2015-10-09 22:50:36 +02:00
Victor Stinner be75b8cf23 Issue #25349: Optimize bytes % int
Optimize bytes.__mod__(args) for integere formats: %d (%i, %u), %o, %x and %X.
_PyBytesWriter is now used to format directly the integer into the writer
buffer, instead of using a temporary bytes object.

Formatting is between 30% and 50% faster on a microbenchmark.
2015-10-09 22:43:24 +02:00
Victor Stinner ce179bf6ba Add _PyBytesWriter_WriteBytes() to factorize the code 2015-10-09 12:57:22 +02:00
Victor Stinner ad7715891e _PyBytesWriter: simplify code to avoid "prealloc" parameters
Substract preallocate bytes from min_size before calling
_PyBytesWriter_Prepare().
2015-10-09 12:38:53 +02:00
Victor Stinner 53926a1ce2 _PyBytesWriter: rename size attribute to min_size 2015-10-09 12:37:03 +02:00
Victor Stinner fa7762ec06 Issue #25349: Optimize bytes % args using the new private _PyBytesWriter API
* Thanks to the _PyBytesWriter API, output smaller than 512 bytes are allocated
  on the stack and so avoid calling _PyBytes_Resize(). Because of that, change
  the default buffer size to fmtcnt instead of fmtcnt+100.
* Rely on _PyBytesWriter algorithm to overallocate the buffer instead of using
  a custom code. For example, _PyBytesWriter uses a different overallocation
  factor (25% or 50%) depending on the platform to get best performances.
* Disable overallocation for the last write.
* Replace C loops to fill characters with memset()
* Add also many comments to _PyBytes_Format()
* Remove unused FORMATBUFLEN constant
* Avoid the creation of a temporary bytes object when formatting a floating
  point number (when no custom formatting option is used)
* Fix also reference leaks on error handling
* Use Py_MEMCPY() to copy bytes between two formatters (%)
2015-10-09 11:48:06 +02:00
Victor Stinner b3653a3458 Issue #25318: cleanup code _PyBytesWriter
Rename "stack buffer" to "small buffer".

Add also an assertion in _PyBytesWriter_GetPos().
2015-10-09 03:38:24 +02:00
Victor Stinner b13b97d3b8 Issue #25318: Fix compilation error
Replace "#if Py_DEBUG" with "#ifdef Py_DEBUG".
2015-10-09 02:52:16 +02:00
Victor Stinner 0016507c16 Issue #25318: Move _PyBytesWriter to bytesobject.c
Declare also the private API in bytesobject.h.
2015-10-09 01:53:21 +02:00
Serhiy Storchaka d92d4efe3d Issue #23573: Restored optimization of bytes.rfind() and bytearray.rfind()
for single-byte argument on Linux.
2015-07-20 22:58:02 +03:00
Serhiy Storchaka ac5569b1fa Issue #24115: Update uses of PyObject_IsTrue(), PyObject_Not(),
PyObject_IsInstance(), PyObject_RichCompareBool() and _PyDict_Contains()
to check for and handle errors correctly.
2015-05-30 17:48:19 +03:00
Serhiy Storchaka fa494fd883 Issue #24115: Update uses of PyObject_IsTrue(), PyObject_Not(),
PyObject_IsInstance(), PyObject_RichCompareBool() and _PyDict_Contains()
to check for and handle errors correctly.
2015-05-30 17:45:22 +03:00
Serhiy Storchaka 8b2e8b6cce Specify default values of semantic booleans in Argument Clinic generated signatures as booleans. 2015-05-30 11:30:39 +03:00
Gregory P. Smith 8cb6569fe1 Implements issue #9951: Adds a hex() method to bytes, bytearray, & memoryview.
Also updates a few internal implementations of the same thing to use the
new built-in code.

Contributed by Arnon Yaari.
2015-04-25 23:22:26 +00:00
Christian Heimes 4e25913f9f Remove local dead code. In both blocks dir is always greater 0. 2015-04-18 05:54:02 +02:00
Larry Hastings 89964c48d1 Issue #23944: Argument Clinic now wraps long impl prototypes at column 78. 2015-04-14 18:07:59 -04:00
Serhiy Storchaka 1009bf18b3 Issue #23501: Argumen Clinic now generates code into separate files by default. 2015-04-03 23:53:51 +03:00
Serhiy Storchaka 41525e31a5 Issue #23466: Raised OverflowError if %c argument is out of range. 2015-04-03 20:53:46 +03:00
Serhiy Storchaka 2c7b5a9d0d Issue #23466: %c, %o, %x, and %X in bytes formatting now raise TypeError on
non-integer input.
2015-03-30 09:19:08 +03:00
Victor Stinner dabbfe7b30 Issue #23573: Fix bytes.rfind() and bytearray.rfind() on Windows
Windows has no memrchr() function.

This change is only a workaround, the optimization must be reenabled on other
platforms.
2015-03-25 03:16:32 +01:00
Serhiy Storchaka d9d769fcdd Issue #23573: Increased performance of string search operations (str.find,
str.index, str.count, the in operator, str.split, str.partition) with
arguments of different kinds (UCS1, UCS2, UCS4).
2015-03-24 21:55:47 +02:00
Serhiy Storchaka 1dd49824df Issue #23681: The -b option now affects comparisons of bytes with int. 2015-03-20 16:54:57 +02:00
Ethan Furman 62e977f1b6 Close issue23467: add %r compatibility to bytes and bytearray 2015-03-11 08:17:00 -07:00
Antoine Pitrou 63afdaa110 Issue #23629: Fix the default __sizeof__ implementation for variable-sized objects. 2015-03-10 22:35:24 +01:00
Antoine Pitrou a654510150 Issue #23629: Fix the default __sizeof__ implementation for variable-sized objects. 2015-03-10 22:32:00 +01:00
Serhiy Storchaka 26861b0b29 Issue #23450: Fixed possible integer overflows. 2015-02-16 20:52:17 +02:00
Serhiy Storchaka ea5ce5a15e Issue #23383: Cleaned up bytes formatting. 2015-02-10 23:23:12 +02:00
Serhiy Storchaka 83848704f5 Issue #22896: Fixed using _getbuffer() in recently added _PyBytes_Format(). 2015-02-03 01:49:18 +02:00
Serhiy Storchaka 3dd3e26680 Issue #22896: Avoid to use PyObject_AsCharBuffer(), PyObject_AsReadBuffer()
and PyObject_AsWriteBuffer().
2015-02-03 01:25:42 +02:00
Serhiy Storchaka 4fdb68491e Issue #22896: Avoid to use PyObject_AsCharBuffer(), PyObject_AsReadBuffer()
and PyObject_AsWriteBuffer().
2015-02-03 01:21:08 +02:00
Victor Stinner 5474d0ba19 Issue #20284: Fix a compilation warning on Windows
Explicitly cast the long to char.
2015-01-26 16:43:36 +01:00
Benjamin Peterson a8efc9601d ensure ilen is initialized when it is assigned to len 2015-01-26 09:23:41 -05:00
Ethan Furman b95b56150f Issue20284: Implement PEP461 2015-01-23 20:05:18 -08:00
Serhiy Storchaka 83cf99d733 Issue #20335: bytes constructor now raises TypeError when encoding or errors
is specified with non-string argument.  Based on patch by Renaud Blanch.
2014-12-02 09:24:06 +02:00
Serhiy Storchaka 0b2cacb42a Issue #20335: bytes constructor now raises TypeError when encoding or errors
is specified with non-string argument.  Based on patch by Renaud Blanch.
2014-12-02 09:26:14 +02:00
Larry Hastings dfbeb160de Issue #22615: Argument Clinic now supports the "type" argument for the
int converter.  This permits using the int converter with enums and
typedefs.
2014-10-13 10:39:41 +01:00