* Rename PyThreadState_DeleteCurrent()
to _PyThreadState_DeleteCurrent()
* Move it to the internal C API
Co-Authored-By: Carol Willing <carolcode@willingconsulting.com>
The purpose of the `unicodedata.is_normalized` function is to answer
the question `str == unicodedata.normalized(form, str)` more
efficiently than writing just that, by using the "quick check"
optimization described in the Unicode standard in UAX #15.
However, it turns out the code doesn't implement the full algorithm
from the standard, and as a result we often miss the optimization and
end up having to compute the whole normalized string after all.
Implement the standard's algorithm. This greatly speeds up
`unicodedata.is_normalized` in many cases where our partial variant
of quick-check had been returning MAYBE and the standard algorithm
returns NO.
At a quick test on my desktop, the existing code takes about 4.4 ms/MB
(so 4.4 ns per byte) when the partial quick-check returns MAYBE and it
has to do the slow normalize-and-compare:
$ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
50 loops, best of 5: 4.39 msec per loop
With this patch, it gets the answer instantly (58 ns) on the same 1 MB
string:
$ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
5000000 loops, best of 5: 58.2 nsec per loop
This restores a small optimization that the original version of this
code had for the `unicodedata.normalize` use case.
With this, that case is actually faster than in master!
$ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 561 usec per loop
$ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 512 usec per loop
Adds a link to `dateutil.parser.isoparse` in the documentation.
It would be nice to set up intersphinx for things like this, but I think we can leave that for a separate PR.
CC: @pitrou
[bpo-37979](https://bugs.python.org/issue37979)
https://bugs.python.org/issue37979
Automerge-Triggered-By: @pitrou
- drop TargetScopeError in favour of raising SyntaxError directly
as per the updated PEP 572
- comprehension iteration variables are explicitly local, but
named expression targets in comprehensions are nonlocal or
global. Raise SyntaxError as specified in PEP 572
- named expression targets in the outermost iterable of a
comprehension have an ambiguous target scope. Avoid resolving
that question now by raising SyntaxError. PEP 572
originally required this only for cases where the bound name
conflicts with the iteration variable in the comprehension,
but CPython can't easily restrict the exception to that case
(as it doesn't know the target variable names when visiting
the outermost iterator expression)
"Arguments may be integers... " could be misunderstand as they also
could be strings.
New wording makes it clear that arguments have to be integers.
modified: Doc/library/datetime.rst
Automerge-Triggered-By: @pganssle
Fix typo in description of link to mozilla bug report writing guidelines.
Though the URL is misleading, we're indeed trying to write bug _reports_, not to add bugs.
Automerge-Triggered-By: @ned-deily
The activation scripts generated by venv were inconsistent in how they changed the shell's prompt. Some used `__VENV_PROMPT__` exclusively, some used `__VENV_PROMPT__` if it was set even though by default `__VENV_PROMPT__` is always set and the fallback matched the default, and one ignored `__VENV_PROMPT__` and used `__VENV_NAME__` instead (and even used a differing format to the default prompt). This change now has all activation scripts use `__VENV_PROMPT__` only and relies on the fact that venv sets that value by default.
The color of the customization is also now set in fish to the blue from the Python logo for as hex color support is built into that shell (much like PowerShell where the built-in green color is used).
bpo-37834: Normalise handling of reparse points on Windows
* ntpath.realpath() and nt.stat() will traverse all supported reparse points (previously was mixed)
* nt.lstat() will let the OS traverse reparse points that are not name surrogates (previously would not traverse any reparse point)
* nt.[l]stat() will only set S_IFLNK for symlinks (previous behaviour)
* nt.readlink() will read destinations for symlinks and junction points only
bpo-1311: os.path.exists('nul') now returns True on Windows
* nt.stat('nul').st_mode is now S_IFCHR (previously was an error)
Added back mention that ensure_future actually scheduled obj. This documentation just mentions what ensure_future returns, so I did not realize that ensure_future also schedules obj.
There are plenty of legitimate scripts in the tree that begin with a
`#!`, but also a few that seem to be marked executable by mistake.
Found them with this command -- it gets executable files known to Git,
filters to the ones that don't start with a `#!`, and then unmarks
them as executable:
$ git ls-files --stage \
| perl -lane 'print $F[3] if (!/^100644/)' \
| while read f; do
head -c2 "$f" | grep -qxF '#!' \
|| chmod a-x "$f"; \
done
Looking at the list by hand confirms that we didn't sweep up any
files that should have the executable bit after all. In particular
* The `.psd` files are images from Photoshop.
* The `.bat` files sure look like things that can be run.
But we have lots of other `.bat` files, and they don't have
this bit set, so it must not be needed for them.
Automerge-Triggered-By: @benjaminp
The fact that keyword names are strings is now part of the vectorcall and `METH_FASTCALL` protocols. The biggest concrete change is that `_PyStack_UnpackDict` now checks that and raises `TypeError` if not.
CC @markshannon @vstinner
https://bugs.python.org/issue37540
The documented definition was much broader than the real one:
there are tons of characters with general category "Other",
and we don't (and shouldn't) treat most of them as whitespace.
Rewrite the definition to agree with the comment on
_PyUnicode_IsWhitespace, and with the logic in makeunicodedata.py,
which is what generates that function and so ultimately governs.
Add suitable breadcrumbs so that a reader who wants to pin down
exactly what this definition means (what's a "bidirectional class"
of "B"?) can do so. The `unicodedata` module documentation is an
appropriate central place for our references to Unicode's own copious
documentation, so point there.
Also add to the isspace() test a thorough check that the
implementation agrees with the intended definition.
* bpo-37256: Wording in Request class docs
* 📜🤖 Added by blurb_it.
* Update Misc/NEWS.d/next/Documentation/2019-07-16-14-48-12.bpo-37256.qJTrBb.rst
Co-Authored-By: Kyle Stanley <aeros167@gmail.com>
https://bugs.python.org/issue37814:
> The empty tuple syntax in type annotations, `Tuple[()]`, is not obvious from the examples given in the documentation (I naively expected `Tuple[]` to work); it has been documented in PEP 484 and in mypy, but not in the documentation for the typing module.
https://bugs.python.org/issue37814
DeprecationWarning will continue to be emitted for invalid escape
sequences in string and bytes literals just as it did in 3.7.
SyntaxWarning may be emitted in the future. But per mailing list
discussion, we don't yet know when because we haven't settled on how to
do so in a non-disruptive manner.
(Applies 4c5b6bac24 to the master branch).
(This is https://github.com/python/cpython/pull/15142 for master/3.9)
https://bugs.python.org/issue32912
Automerge-Triggered-By: @gpshead
There was a discrepancy between the Python and C implementations.
Add singletons ALWAYS_EQ, LARGEST and SMALLEST in test.support
to test mixed type comparison.
Imports now raise `TypeError` instead of `ValueError` for relative import failures. This makes things consistent between `builtins.__import__` and `importlib.__import__` as well as using a more natural import for the failure.
https://bugs.python.org/issue37444
Automerge-Triggered-By: @brettcannon
Expose the CAN_BCM SocketCAN constants used in the bcm_msg_head struct
flags (provided by <linux/can/bcm.h>) under the socket library.
This adds the following constants with a CAN_BCM prefix:
* SETTIMER
* STARTTIMER
* TX_COUNTEVT
* TX_ANNOUNCE
* TX_CP_CAN_ID
* RX_FILTER_ID
* RX_CHECK_DLC
* RX_NO_AUTOTIMER
* RX_ANNOUNCE_RESUME
* TX_RESET_MULTI_IDX
* RX_RTR_FRAME
* CAN_FD_FRAME
The CAN_FD_FRAME flag was introduced in the 4.8 kernel, while the other
ones were present since SocketCAN drivers were mainlined in 2.6.25. As
such, it is probably unnecessary to guard against these constants being
missing.
Deprecate the parser module and add a deprecation warning triggered on import and a warning block in the documentation.
https://bugs.python.org/issue37268
Automerge-Triggered-By: @pablogsal
Prior to this change the guard on an 'elif' used an assignment expression whose value was used in a later 'else' block, causing some confusion for people.
(Discussion on Twitter: https://twitter.com/brettsky/status/1153861041068994566.)
Automerge-Triggered-By: @brettcannon
* Fix the formatting in the documentation of the tostring() functions.
* bpo-34160: Document that the tostring() and tostringlist() functions also preserve the attribute order now.
* bpo-34160: Add an explanation of how users should deal with the attribute order.
* Remove a vague statement in documentation
* Remove another vague sentence
A sentence starting with "So it should be possible..." shouldn't be in the docs either.
Co-Authored-By: Kyle Stanley <aeros167@gmail.com>
* Include the removal of the previous line
Co-Authored-By: Kyle Stanley <aeros167@gmail.com>
* Remove an extra space
The `allow_abbrev` option for ArgumentParser is documented and intended to disable support for unique prefixes of --options, which may sometimes be ambiguous due to deferred parsing.
However, the initial implementation also broke parsing of grouped short flags, such as `-ab` meaning `-a -b` (or `-a=b`). Checking the argument for a leading `--` before rejecting it fixes this.
This was prompted by pytest-dev/pytest#5469, so a backport to at least 3.8 would be great 😄
And this is my first PR to CPython, so please let me know if I've missed anything!
https://bugs.python.org/issue26967
Hi,
I've faced an issue w/ `mailbox.Maildir()`. The case is following:
1. I create a folder with `tempfile.TemporaryDirectory()`, so it's empty
2. I pass that folder path as an argument when instantiating `mailbox.Maildir()`
3. Then I receive an exception happening because "there's no such file or directory" (namely `cur`, `tmp` or `new`) during interaction with Maildir
**Expected result:** subdirs are created during `Maildir()` instance creation.
**Actual result:** subdirs are assumed as existing which leads to exceptions during use.
**Workaround:** remove the actual dir before passing the path to `Maildir()`. It will be created automatically with all subdirs needed.
**Fix:** This PR. Basically it adds creation of subdirs regardless of whether the base dir existed before.
https://bugs.python.org/issue30088
Fix importlib examples to insert any newly created modules via importlib.util.module_from_spec() immediately into sys.modules instead of after calling loader.exec_module().
Thanks to Benjamin Mintz for finding the bug.
https://bugs.python.org/issue37521
This is done to compensate for the extra stack frames added by
IDLE itself, which cause problems when setting the recursion limit
to low values.
This wraps sys.setrecursionlimit() and sys.getrecursionlimit()
as invisibly as possible.
bdist_wininst depends on MBCS codec, unavailable on non-Windows,
and bdist_wininst have not worked since at least Python 3.2, possibly
never on Python 3.
Here we document that bdist_wininst is only supported on Windows,
and we mark it unsupported otherwise to skip tests.
Distributors of Python 3 can now safely drop the bdist_wininst .exe files
without the need to skip bdist_wininst related tests.
Remove the undocumented sys.callstats() function. Since Python 3.7,
it was deprecated and always returned None. It required a special
build option CALL_PROFILE which was already removed in Python 3.7.
The os.getcwdb() function now uses the UTF-8 encoding on Windows,
rather than the ANSI code page: see PEP 529 for the rationale. The
function is no longer deprecated on Windows.
os.getcwd() and os.getcwdb() now detect integer overflow on memory
allocations. On Unix, these functions properly report MemoryError on
memory allocation failure.
In development mode and in debug build, encoding and errors arguments
are now checked on string encoding and decoding operations. Examples:
open(), str.encode() and bytes.decode().
By default, for best performances, the errors argument is only
checked at the first encoding/decoding error, and the encoding
argument is sometimes ignored for empty strings.
Python now gets the absolute path of the script filename specified on
the command line (ex: "python3 script.py"): the __file__ attribute of
the __main__ module, sys.argv[0] and sys.path[0] become an absolute
path, rather than a relative path.
* Add _Py_isabs() and _Py_abspath() functions.
* _PyConfig_Read() now tries to get the absolute path of
run_filename, but keeps the relative path if _Py_abspath() fails.
* Reimplement os._getfullpathname() using _Py_abspath().
* Use _Py_isabs() in getpath.c.
Remove sys.getcheckinterval() and sys.setcheckinterval() functions.
They were deprecated since Python 3.2. Use sys.getswitchinterval()
and sys.setswitchinterval() instead.
Remove also check_interval field of the PyInterpreterState structure.
At the moment you can definitely use UDPLITE sockets on Linux systems, but it would be good if this support were formalized such that you can detect support at runtime easily.
At the moment, to make and use a UDPLITE socket requires something like the following code:
```
>>> import socket
>>> a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 136)
>>> b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 136)
>>> a.bind(('localhost', 44444))
>>> b.sendto(b'test'*256, ('localhost', 44444))
>>> b.setsockopt(136, 10, 16)
>>> b.sendto(b'test'*256, ('localhost', 44444))
>>> b.setsockopt(136, 10, 32)
>>> b.sendto(b'test'*256, ('localhost', 44444))
>>> b.setsockopt(136, 10, 64)
>>> b.sendto(b'test'*256, ('localhost', 44444))
```
If you look at this through Wireshark, you can see that the packets are different in that the checksums and checksum coverages change.
With the pull request that I am submitting momentarily, you could do the following code instead:
```
>>> import socket
>>> a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDPLITE)
>>> b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDPLITE)
>>> a.bind(('localhost', 44444))
>>> b.sendto(b'test'*256, ('localhost', 44444))
>>> b.set_send_checksum_coverage(16)
>>> b.sendto(b'test'*256, ('localhost', 44444))
>>> b.set_send_checksum_coverage(32)
>>> b.sendto(b'test'*256, ('localhost', 44444))
>>> b.set_send_checksum_coverage(64)
>>> b.sendto(b'test'*256, ('localhost', 44444))
```
One can also detect support for UDPLITE just by checking
```
>>> hasattr(socket, 'IPPROTO_UDPLITE')
```
https://bugs.python.org/issue37345
This is to help prevent people from accidentally installing into the wrong Python interpreter if they are not aware of which Python interpreter `pip` points to.
* Docs: Improved phrasing
Removed usage of second person pronouns in the section and made the assumption of "uneasiness" in code style transition more neutral.
* Removed trailing whitespace on line 34
* Rename PyImport_Cleanup() to _PyImport_Cleanup() and move it to the
internal C API. Add 'tstate' parameters.
* Remove documentation of _PyImport_Init(), PyImport_Cleanup(),
_PyImport_Fini(). All three were documented as "For internal use
only.".