"Generally, mixed-mode arithmetic combining real and complex variables should
be performed directly, not by first coercing the real to complex, lest the sign
of zero be rendered uninformative; the same goes for combinations of pure
imaginary quantities with complex variables." (c) Kahan, W: Branch cuts for
complex elementary functions.
This patch implements mixed-mode arithmetic rules, combining real and
complex variables as specified by C standards since C99 (in particular,
there is no special version for the true division with real lhs
operand). Most C compilers implementing C99+ Annex G have only these
special rules (without support for imaginary type, which is going to be
deprecated in C2y).
When handed an absolute Windows path such as `C:\foo` or `//server/share`,
the `urllib.request.pathname2url()` function returns a URL with an
authority section, such as `///C:/foo` or `//server/share` (or before
GH-126205, `////server/share`). Only the `file:` prefix is omitted.
But when handed an absolute POSIX path such as `/etc/hosts`, or a Windows
path of the same form (rooted but lacking a drive), the function returns a
URL without an authority section, such as `/etc/hosts`.
This patch corrects the discrepancy by adding a `//` prefix before
drive-less, rooted paths when generating URLs.
These functions have long sown confusion among Python developers. The
existing documentation says they deal with URL path components, but that
doesn't fit the evidence on Windows:
>>> pathname2url(r'C:\foo')
'///C:/foo'
>>> pathname2url(r'\\server\share')
'////server/share' # or '//server/share' as of quite recently
If these were URL path components, they would imply complete URLs like
`file://///C:/foo` and `file://////server/share`. Clearly this isn't right.
Yet the implementation in `nturl2path` is deliberate, and the
`url2pathname()` function correctly inverts it.
On non-Windows platforms, the behaviour until quite recently is to simply
quote/unquote the path without adding or removing any leading slashes. This
behaviour is compatible with *both* interpretations -- 1) the value is a
URL path component (existing docs), and 2) the value is everything
following `file:` (this commit)
The conclusion I draw is that these functions operate on everything after
the `file:` prefix, which may include an authority section. This is the
only explanation that fits both the Windows and non-Windows behaviour.
It's also a better match for the function names.
Stop converting Windows drive letters to uppercase in
`urllib.request.pathname2url()` and `url2pathname()`. This behaviour is
unnecessary and inconsistent with pathlib's file URI implementation.
* Name without a PATHEXT extension is only searched if the mode does not
include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
(".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
The usage parameter of argparse.ArgumentParser no longer
affects the default value of the prog parameter in subparsers.
Previously the full custom usage of the main parser was used as
the prog prefix in subparsers.
Adjust `pathname2url()` to encode embedded colon characters in Windows
paths, rather than bailing out with an `OSError`.
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
This adds authentication to the forkserver control socket. In the past only filesystem permissions protected this socket from code injection into the forkserver process by limiting access to the same UID, which didn't exist when Linux abstract namespace sockets were used (see issue) meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type is a good idea.
This reuses the HMAC based shared key auth already used by `multiprocessing.connection` sockets for other purposes.
Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID with access to the unix socket.
### pyperformance benchmarks
No significant changes. Including `concurrent_imap` which exercises `multiprocessing.Pool.imap` in that suite.
### Microbenchmarks
This does _slightly_ slow down forkserver use. How much so appears to depend on the platform. Modern platforms and simple platforms are less impacted. This PR adds additional IPC round trips to the control socket to tell forkserver to spawn a new process. Systems with potentially high latency IPC are naturally impacted more.
Typically a 1-4% slowdown on a very targeted process creation microbenchmark, with a worst case overloaded system slowdown of 20%. No evidence that these slowdowns appear in practical sense. See the PR for details.
If the cpXXX encoding is not directly implemented in Python, fall back
to use the Windows-specific API codecs.code_page_encode() and
codecs.code_page_decode().
This unifies the code for nodejs and the code for the browser. After this
commit, the browser example doesn't work; this will be fixed in a
subsequent update.
This was flagged to me at a party today by someone who works in red-teaming as a frequently encountered footgun. Documenting the potentially unexpected behavior seemed like a good place to start.
* Document that slices can be marshalled
* Deduplicate and organize the list of supported types
in docs
* Organize the type code list in marshal.c, to make
it more obvious that this is a versioned format
* Back-fill some historical info
Co-authored-by: Michael Droettboom <mdboom@gmail.com>
* gh-123832: Adjust `socket.getaddrinfo` docs for better POSIX compliance
This changes nothing changes for CPython supported platforms,
but hints how to deal with platforms that stick to the letter of
the spec.
It also marks `socket.getaddrinfo` as a wrapper around `getaddrinfo(3)`;
specifically, workarounds to make the function work consistently across
platforms are out of scope in its code.
Include wording similar to the POSIX's “by providing options and by
limiting the returned information”, which IMO suggests that the
hints limit the resulting list compared to the defaults, *but* can
be interpreted differently. Details are added in a note.
Specifically say that this wraps the underlying C function. So, the
details are in OS docs. The “full range of results” bit goes away.
Use `AF_UNSPEC` rather than zero for the *family* default, although
I don't think a system where it's nonzero would be very usable.
Suggest setting proto and/or type (with examples, as the appropriate
values aren't obvious). Say why you probably want to do that that
on all systems; mention the behavior on the “letter of the spec”
systems.
Suggest that the results should be tried in order, which is,
AFAIK best practice -- see RFC 6724 section 2, and its predecessor
from 2003 (which are specific to IP, but indicate how people use this):
> Well-behaved applications SHOULD iterate through the list of
> addresses returned from `getaddrinfo()` until they find a working address.
Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
One of the most common reasons I see the old `pipes` module still in use
when porting to Python 3.13 is for the undocumented `pipes.quote`
function, which can easily be replaced with `shlex.quote`. I think it's
worth specifically calling this out, since being directed to the
`subprocess` module would be confusing in this case.