cpython

Commit Graph

Author	SHA1	Message	Date
Greg Price	2f09413947	closes bpo-37966: Fully implement the UAX #15 quick-check algorithm. (GH-15558) The purpose of the `unicodedata.is_normalized` function is to answer the question `str == unicodedata.normalize(form, str)` more efficiently than writing just that, by using the "quick check" optimization described in the Unicode standard in UAX #15. However, it turns out the code doesn't implement the full algorithm from the standard, and as a result we often miss the optimization and end up having to compute the whole normalized string after all. Implement the standard's algorithm. This greatly speeds up `unicodedata.is_normalized` in many cases where our partial variant of quick-check had been returning MAYBE and the standard algorithm returns NO. At a quick test on my desktop, the existing code takes about 4.4 ms/MB (so 4.4 ns per byte) when the partial quick-check returns MAYBE and it has to do the slow normalize-and-compare: $ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' -- 'unicodedata.is_normalized("NFD", s)' 50 loops, best of 5: 4.39 msec per loop With this patch, it gets the answer instantly (58 ns) on the same 1 MB string: $ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' -- 'unicodedata.is_normalized("NFD", s)' 5000000 loops, best of 5: 58.2 nsec per loop This restores a small optimization that the original version of this code had for the `unicodedata.normalize` use case. With this, that case is actually faster than in master! $ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' -- 'unicodedata.normalize("NFD", s)' 500 loops, best of 5: 561 usec per loop $ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' -- 'unicodedata.normalize("NFD", s)' 500 loops, best of 5: 512 usec per loop	2019-09-03 19:45:44 -07:00
Steve Dower	993ac92418	bpo-38020: Fixes crash in os.readlink() on Windows (GH-15663)	2019-09-03 12:50:51 -07:00
Dong-hee Na	0cf832a9ef	bpo-37798: Fix _statistics module doc (GH-15546)	2019-09-03 12:21:45 +03:00
Serhiy Storchaka	1f21eaa15e	bpo-15999: Clean up of handling boolean arguments. (GH-15610) * Use the 'p' format unit instead of manually called PyObject_IsTrue(). * Pass boolean value instead of 0/1 integers to functions that need boolean. * Convert some arguments to boolean only once.	2019-09-01 12:16:51 +03:00
Serhiy Storchaka	5eca7f3f38	bpo-15999: Always pass bool instead of int to socket.setblocking(). (GH-15621)	2019-09-01 12:12:52 +03:00
Serhiy Storchaka	41c57b3353	bpo-37994: Fix silencing all errors if an attribute lookup fails. (GH-15630) Only AttributeError should be silenced.	2019-09-01 12:03:39 +03:00
Serhiy Storchaka	f02ea6225b	bpo-36543: Remove old-deprecated ElementTree features. (GH-12707) Remove methods Element.getchildren(), Element.getiterator() and ElementTree.getiterator() and the xml.etree.cElementTree module.	2019-09-01 11:18:35 +03:00
Inada Naoki	013e52fd34	bpo-37990: fix gc stats (GH-15626)	2019-08-31 09:13:42 +09:00
Min ho Kim	39d87b5471	Fix typos mostly in comments, docs and test names (GH-15209)	2019-08-30 16:21:19 -04:00
Victor Stinner	96b4087ce7	bpo-37140: Fix StructUnionType_paramfunc() (GH-15612) Fix a ctypes regression of Python 3.8. When a ctypes.Structure is passed by copy to a function, ctypes internals created a temporary object which had the side effect of calling the structure finalizer (__del__) twice. The Python semantics requires a finalizer to be called exactly once. Fix ctypes internals to no longer call the finalizer twice. Create a new internal StructParam_Type which is only used by _ctypes_callproc() to call PyMem_Free(ptr) on Py_DECREF(argument). StructUnionType_paramfunc() creates such object.	2019-08-30 14:30:33 +02:00
Sergey Fedoseev	6a650aaf77	bpo-37976: Prevent shadowing of TypeError in zip() (GH-15592)	2019-08-29 21:25:48 -07:00
Thomas A Caswell	e278335a6e	bpo-37933: Fix faulthandler.cancel_dump_traceback_later() (GH-15440) Fix faulthandler.cancel_dump_traceback_later() call if cancel_dump_traceback_later() was not called previously.	2019-08-29 18:30:04 +02:00
Rémi Lapeyre	4901fe274b	bpo-37034: Display argument name on errors with keyword arguments with Argument Clinic. (GH-13593)	2019-08-29 17:49:08 +03:00
Joannah Nanjekye	2c5fb17118	bpo-36833: Add tests for Datetime C API Macros (GH-14842) Added tests for PyDateTime_xxx_GET_xxx() macros of the C API of the datetime module.	2019-08-29 14:54:46 +02:00
Justin Blanchard	122376df55	bpo-37372: Fix error unpickling datetime.time objects from Python 2 with seconds>=24. (GH-14307)	2019-08-29 10:36:15 +03:00
Serhiy Storchaka	b235a1b473	bpo-37960: Silence only necessary errors in repr() of buffered and text streams. (GH-15543)	2019-08-29 09:25:22 +03:00
HongWeipeng	fa220ec763	Raise a RuntimeError when tee iterator is consumed from different threads (GH-15567)	2019-08-28 20:39:25 -07:00
Vinay Sharma	13f37f2ba8	closes bpo-37964: add F_GETPATH command to fcntl (GH-15550) https://bugs.python.org/issue37964 Automerge-Triggered-By: @benjaminp	2019-08-28 18:56:17 -07:00
Christian Heimes	98d90f745d	bpo-37951: Lift subprocess's fork() restriction (GH-15544)	2019-08-27 23:36:56 +02:00
vrajivk	8bf5fef873	bpo-36205: Fix the rusage implementation of time.process_time() (GH-15538)	2019-08-27 00:13:12 -04:00
Raymond Hettinger	6fee0f8ea7	bpo-37798: Minor code formatting and comment clean-ups. (GH-15526)	2019-08-26 11:25:58 -07:00
Inada Naoki	b27cbec801	bpo-37055: fix warnings in _blake2 module (GH-14646) https://bugs.python.org/issue37055 Automerge-Triggered-By: @tiran	2019-08-26 10:52:36 -07:00
Raymond Hettinger	aef9ad82f7	bpo-37942: Improve argument clinic float converter (GH-15470)	2019-08-24 19:10:39 -07:00
Dong-hee Na	0a18ee4be7	bpo-37798: Add C fastpath for statistics.NormalDist.inv_cdf() (GH-15266)	2019-08-23 15:20:30 -07:00
Pablo Galindo	4be11c009a	bpo-37915: Fix comparison between tzinfo objects and timezone objects (GH-15390) https://bugs.python.org/issue37915 Automerge-Triggered-By: @pablogsal	2019-08-22 12:24:25 -07:00
Steve Dower	df2d4a6f3d	bpo-37834: Normalise handling of reparse points on Windows (GH-15231) * ntpath.realpath() and nt.stat() will traverse all supported reparse points (previously was mixed) * nt.lstat() will let the OS traverse reparse points that are not name surrogates (previously would not traverse any reparse point) * nt.[l]stat() will only set S_IFLNK for symlinks (previous behaviour) * nt.readlink() will read destinations for symlinks and junction points only bpo-1311: os.path.exists('nul') now returns True on Windows * nt.stat('nul').st_mode is now S_IFCHR (previously was an error)	2019-08-21 15:27:33 -07:00
Stefan Krah	bcc446f525	Revert mode change that loses information in directory listings on Linux. (#15366)	2019-08-21 23:00:04 +02:00
Victor Stinner	d8c5adf6f8	bpo-37851: faulthandler allocates its stack on demand (GH-15358) The faulthandler module no longer allocates its alternative stack at Python startup. Now the stack is only allocated at the first faulthandler usage. faulthandler no longer ignores memory allocation failure when allocating the stack. sigaltstack() failure now raises an OSError exception, rather than being ignored. The alternative stack is no longer used if sigaction() is not available. In practice, sigaltstack() should only be available when sigaction() is available, so this change should have no effect in practice. faulthandler.dump_traceback_later() internal locks are now only allocated at the first dump_traceback_later() call, rather than always being allocated at Python startup.	2019-08-21 13:40:42 +01:00
Greg Price	9ece4a5057	Unmark files as executable that can't actually be executed. (GH-15353) There are plenty of legitimate scripts in the tree that begin with a `#!`, but also a few that seem to be marked executable by mistake. Found them with this command -- it gets executable files known to Git, filters to the ones that don't start with a `#!`, and then unmarks them as executable: $ git ls-files --stage | perl -lane 'print $F[3] if (!/^100644/)' | while read f; do head -c2 "$f" | grep -qxF '#!' || chmod a-x "$f"; done Looking at the list by hand confirms that we didn't sweep up any files that should have the executable bit after all. In particular * The `.psd` files are images from Photoshop. * The `.bat` files sure look like things that can be run. But we have lots of other `.bat` files, and they don't have this bit set, so it must not be needed for them. Automerge-Triggered-By: @benjaminp	2019-08-20 21:53:59 -07:00
Brett Cannon	1407038e0b	Remove a dead comment from ossaudiodev.c (#15346)	2019-08-20 12:20:47 -07:00
Joannah Nanjekye	9e66aba999	bpo-15913: Implement PyBuffer_SizeFromFormat() (GH-13873) Implement PyBuffer_SizeFromFormat() function (previously documented but not implemented): call struct.calcsize().	2019-08-20 15:46:36 +01:00
Alex Gaynor	40dad9545a	Replace usage of the obscure PEM_read_bio_X509_AUX with the more standard PEM_read_bio_X509 (GH-15303) X509_AUX is an odd, not widely used, OpenSSL extension to the X509 file format. This function doesn't actually use any of the extra metadata that it parses, so just use the standard API. Automerge-Triggered-By: @tiran	2019-08-15 05:31:28 -07:00
Victor Stinner	ac827edc49	bpo-21131: Fix faulthandler.register(chain=True) stack (GH-15276) faulthandler now allocates a dedicated stack of SIGSTKSZ*2 bytes, instead of just SIGSTKSZ bytes. Calling the previous signal handler in faulthandler signal handler uses more than SIGSTKSZ bytes of stack memory on some platforms.	2019-08-14 23:35:27 +02:00
Artem Khramov	2814620657	bpo-37811: FreeBSD, OSX: fix poll(2) usage in sockets module (GH-15202) FreeBSD implementation of poll(2) restricts the timeout argument to be either zero, or positive, or equal to INFTIM (-1). Unless otherwise overridden, socket timeout defaults to -1. This value is then converted to milliseconds (-1000) and used as argument to the poll syscall. poll returns EINVAL (22), and the connection fails. This bug was discovered during the EINTR handling testing, and the reproduction code can be found in https://bugs.python.org/issue23618 (see connect_eintr.py, attached). On GNU/Linux, the example runs as expected. This change is trivial: If the supplied timeout value is negative, truncate it to -1.	2019-08-14 23:21:48 +02:00
Victor Stinner	077af8c2c9	bpo-37738: Fix curses addch(str, color_pair) (GH-15071) Fix the implementation of curses addch(str, color_pair): pass the color pair to setcchar(), instead of always passing 0 as the color pair.	2019-08-14 12:31:43 +02:00
Greg Price	51aac15f6d	Delete leftover clinic-generated file for C zipimport. (GH-15174)	2019-08-10 10:20:27 +03:00
Ngalim Siregar	92c7e30adf	bpo-37642: Update acceptable offsets in timezone (GH-14878) This fixes an inconsistency between the Python and C implementations of the datetime module. The pure python version of the code was not accepting offsets greater than 23:59 but less than 24:00. This is an accidental legacy of the original implementation, which was put in place before tzinfo allowed sub-minute time zone offsets. GH-14878	2019-08-09 10:22:16 -04:00
Inada Naoki	2a570af12a	bpo-37587: optimize json.loads (GH-15134) Use a tighter scope temporary variable to help register allocation. 1% speedup for large string. Use PyDict_SetItemDefault() for memoizing keys. At most 4% speedup when the cache hit ratio is low.	2019-08-08 17:57:10 +09:00
Sergey Fedoseev	3e41f3cabb	bpo-34488: optimize BytesIO.writelines() (GH-8904) Avoid the creation of unused int object for each line.	2019-08-07 09:38:31 +09:00
Serhiy Storchaka	18b711c5a7	bpo-37648: Fixed minor inconsistency in some __contains__. (GH-14904) The collection's item is now always at the left and the needle is on the right of ==.	2019-08-04 14:12:48 +03:00
Serhiy Storchaka	17e52649c0	bpo-37685: Fixed comparisons of datetime.timedelta and datetime.timezone. (GH-14996) There was a discrepancy between the Python and C implementations. Add singletons ALWAYS_EQ, LARGEST and SMALLEST in test.support to test mixed type comparison.	2019-08-04 12:38:46 +03:00
Greg Bowser	8fbece135d	bpo-36590: Add Bluetooth RFCOMM and support for Windows. (GH-12767) Support for RFCOMM, L2CAP, HCI, SCO is based on the BTPROTO_* macros being defined. Winsock only supports RFCOMM, even though it has a BTHPROTO_L2CAP macro. L2CAP support would build on windows, but not necessarily work. This also adds some basic unittests for constants (all of which existed prior to this commit, just not on windows) and creating sockets. pair: Nate Duarte <slacknate@gmail.com>	2019-08-02 13:29:52 -07:00
Inada Naoki	bf8162c8c4	bpo-37729: gc: write stats at once (GH-15050) gc used several PySys_WriteStderr() calls to write stats. It caused stats mixed up when stderr is shared by multiple processes like this: gc: collecting generation 2... gc: objects in each generation: 0 0gc: collecting generation 2... gc: objects in each generation: 0 0 126077 126077 gc: objects in permanent generation: 0 gc: objects in permanent generation: 0 gc: done, 112575 unreachable, 0 uncollectablegc: done, 112575 unreachable, 0 uncollectable, 0.2223s elapsed , 0.2344s elapsed	2019-08-02 16:25:29 +09:00
Anthony Sottile	c9345e382c	bpo-37695: Correct unget_wch error message. (GH-14986)	2019-07-31 15:11:24 +03:00
karl ding	31c4fd2a10	bpo-37085: Expose SocketCAN bcm_msg_head flags (#13646) Expose the CAN_BCM SocketCAN constants used in the bcm_msg_head struct flags (provided by <linux/can/bcm.h>) under the socket library. This adds the following constants with a CAN_BCM prefix: * SETTIMER * STARTTIMER * TX_COUNTEVT * TX_ANNOUNCE * TX_CP_CAN_ID * RX_FILTER_ID * RX_CHECK_DLC * RX_NO_AUTOTIMER * RX_ANNOUNCE_RESUME * TX_RESET_MULTI_IDX * RX_RTR_FRAME * CAN_FD_FRAME The CAN_FD_FRAME flag was introduced in the 4.8 kernel, while the other ones were present since SocketCAN drivers were mainlined in 2.6.25. As such, it is probably unnecessary to guard against these constants being missing.	2019-07-31 10:47:16 +02:00
Min ho Kim	c4cacc8c5e	Fix typos in comments, docs and test names (#15018) * Fix typos in comments, docs and test names * Update test_pyparse.py account for change in string length * Apply suggestion: splitable -> splittable Co-Authored-By: Terry Jan Reedy <tjreedy@udel.edu> * Apply suggestion: splitable -> splittable Co-Authored-By: Terry Jan Reedy <tjreedy@udel.edu> * Apply suggestion: Dealloccte -> Deallocate Co-Authored-By: Terry Jan Reedy <tjreedy@udel.edu> * Update posixmodule checksum. * Reverse idlelib changes.	2019-07-30 18:16:13 -04:00
Marco Paolini	8a758f5b99	bpo-37587: Make json.loads faster for long strings (GH-14752) When scanning the string, most characters are valid, so checking for invalid characters first means never needing to check the value of strict on valid strings, and only needing to check it on invalid characters when doing non-strict parsing of invalid strings. This provides a measurable reduction in per-character processing time (~11% in the pre-merge patch testing).	2019-07-31 00:16:34 +10:00
Pablo Galindo	9211e2fd81	bpo-37268: Add deprecation notice and a DeprecationWarning for the parser module (GH-15017) Deprecate the parser module and add a deprecation warning triggered on import and a warning block in the documentation. https://bugs.python.org/issue37268 Automerge-Triggered-By: @pablogsal	2019-07-30 04:04:01 -07:00
Raymond Hettinger	6b5f1b496f	bpo-37691: Let math.dist() accept sequences and iterables for coordinates (GH-14975)	2019-07-27 14:04:29 -07:00
Markus Mohrhard	898318b53d	bpo-37502: handle default parameter for buffers argument of pickle.loads correctly (GH-14593)	2019-07-25 18:00:34 +02:00

11254 Commits