Commit Graph

3718 Commits

Author SHA1 Message Date
Georges Toth 303aac8c56
bpo-30681: Support invalid date format or value in email Date header (GH-22090)
I am re-submitting an older PR which was abandoned but is still relevant, #10783 by @timb07.

The issue being solved () is still relevant. The original PR #10783 was closed as
the final request changes were not applied and since abandoned.

In this new PR I have re-used the original patch plus applied both comments from the review, by @maxking and @pganssle.


For reference, here is the original PR description:
In email.utils.parsedate_to_datetime(), a failure to parse the date, or invalid date components (such as hour outside 0..23) raises an exception. Document this behaviour, and add tests to test_email/test_utils.py to confirm this behaviour.

In email.headerregistry.DateHeader.parse(), check when parsedate_to_datetime() raises an exception and add a new defect InvalidDateDefect; preserve the invalid value as the string value of the header, but set the datetime attribute to None.

Add tests to test_email/test_headerregistry.py to confirm this behaviour; also added test to test_email/test_inversion.py to confirm emails with such defective date headers round trip successfully.

This pull request incorporates feedback gratefully received from @bitdancer, @brettcannon, @Mariatta and @warsaw, and replaces the earlier PR #2254.

Automerge-Triggered-By: GH:warsaw
2020-10-26 17:31:06 -07:00
Lysandros Nikolaou bca7014032
bpo-42123: Run the parser two times and only enable invalid rules on the second run (GH-22111)
* Implement running the parser a second time for the errors messages

The first parser run is only responsible for detecting whether
there is a `SyntaxError` or not. If there isn't the AST gets returned.
Otherwise, the parser is run a second time with all the `invalid_*`
rules enabled so that all the customized error messages get produced.
2020-10-27 00:42:04 +02:00
Victor Stinner c8c4200b65
bpo-42157: Convert unicodedata.UCD to heap type (GH-22991)
Convert the unicodedata extension module to the multiphase
initialization API (PEP 489) and convert the unicodedata.UCD static
type to a heap type.

Co-Authored-By: Mohamed Koubaa <koubaa.m@gmail.com>
2020-10-26 23:19:22 +01:00
Victor Stinner 920cb647ba
bpo-42157: unicodedata avoids references to UCD_Type (GH-22990)
* UCD_Check() uses PyModule_Check()
* Simplify the internal _PyUnicode_Name_CAPI structure:

  * Remove size and state members
  * Remove state and self parameters of getcode() and getname()
    functions

* Remove global_module_state
2020-10-26 19:19:36 +01:00
Lisa Roach 8374d2ee15
bpo-39101: Fixes BaseException hang in IsolatedAsyncioTestCase. (GH-22654) 2020-10-26 09:28:17 -07:00
Victor Stinner 47e1afd2a1
bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713)
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.

* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
2020-10-26 16:43:47 +01:00
Alexey Izbyshev c0590c0033
bpo-42146: Fix memory leak in subprocess.Popen() in case of uid/gid overflow (GH-22966)
Fix memory leak in subprocess.Popen() in case of uid/gid overflow

Also add a test that would catch this leak with `--huntrleaks`.

Alas, the test for `extra_groups` also exposes an inconsistency
in our error reporting: we use a custom ValueError for `extra_groups`,
but propagate OverflowError for `user` and `group`.
2020-10-25 17:09:32 -07:00
Pablo Galindo e68c67805e
bpo-42150: Avoid buffer overflow in the new parser (GH-22978) 2020-10-25 23:03:41 +00:00
Jason R. Coombs d1a0a960ee
bpo-42043: Add support for zipfile.Path subclasses (#22716)
* bpo-42043: Add support for zipfile.Path inheritance as introduced in zipp 3.2.0.

* Add blurb.
2020-10-25 14:45:05 -04:00
Jason R. Coombs df8d4c83a6
bpo-41490: ``path`` and ``contents`` to aggressively close handles (#22915)
* bpo-41490: ``path`` method to aggressively close handles

* Add blurb

* In ZipReader.contents, eagerly evaluate the contents to release references to the zipfile.

* Instead use _ensure_sequence to ensure any iterable from a reader is eagerly converted to a list if it's not already a sequence.
2020-10-25 14:21:46 -04:00
Mark Roseman 5df6c99cb4
bpo-33987: Add master ttk Frame to IDLE search dialogs (GH-22942) 2020-10-24 23:14:02 -04:00
Serhiy Storchaka 8cd1dbae32
bpo-41052: Fix pickling heap types implemented in C with protocols 0 and 1 (GH-22870) 2020-10-24 21:14:23 +03:00
Alexey Izbyshev 976da903a7
bpo-35823: subprocess: Use vfork() instead of fork() on Linux when safe (GH-11671)
* bpo-35823: subprocess: Use vfork() instead of fork() on Linux when safe

When used to run a new executable image, fork() is not a good choice
for process creation, especially if the parent has a large working set:
fork() needs to copy page tables, which is slow, and may fail on systems
where overcommit is disabled, despite that the child is not going to
touch most of its address space.

Currently, subprocess is capable of using posix_spawn() instead, which
normally provides much better performance. However, posix_spawn() does not
support many of child setup operations exposed by subprocess.Popen().
Most notably, it's not possible to express `close_fds=True`, which
happens to be the default, via posix_spawn(). As a result, most users
can't benefit from faster process creation, at least not without
changing their code.

However, Linux provides vfork() system call, which creates a new process
without copying the address space of the parent, and which is actually
used by C libraries to efficiently implement posix_spawn(). Due to sharing
of the address space and even the stack with the parent, extreme care
is required to use vfork(). At least the following restrictions must hold:

* No signal handlers must execute in the child process. Otherwise, they
  might clobber memory shared with the parent, potentially confusing it.

* Any library function called after vfork() in the child must be
  async-signal-safe (as for fork()), but it must also not interact with any
  library state in a way that might break due to address space sharing
  and/or lack of any preparations performed by libraries on normal fork().
  POSIX.1 permits to call only execve() and _exit(), and later revisions
  remove vfork() specification entirely. In practice, however, almost all
  operations needed by subprocess.Popen() can be safely implemented on
  Linux.

* Due to sharing of the stack with the parent, the child must be careful
  not to clobber local variables that are alive across vfork() call.
  Compilers are normally aware of this and take extra care with vfork()
  (and setjmp(), which has a similar problem).

* In case the parent is privileged, special attention must be paid to vfork()
  use, because sharing an address space across different privilege domains
  is insecure[1].

This patch adds support for using vfork() instead of fork() on Linux
when it's possible to do safely given the above. In particular:

* vfork() is not used if credential switch is requested. The reverse case
  (simple subprocess.Popen() but another application thread switches
  credentials concurrently) is not possible for pure-Python apps because
  subprocess.Popen() and functions like os.setuid() are mutually excluded
  via GIL. We might also consider to add a way to opt-out of vfork() (and
  posix_spawn() on platforms where it might be implemented via vfork()) in
  a future PR.

* vfork() is not used if `preexec_fn != None`.

With this change, subprocess will still use posix_spawn() if possible, but
will fallback to vfork() on Linux in most cases, and, failing that,
to fork().

[1] https://ewontfix.com/7

Co-authored-by: Gregory P. Smith [Google LLC] <gps@google.com>
2020-10-23 17:47:01 -07:00
Jacob Neil Taylor 16ee68da6e
bpo-38976: Add support for HTTP Only flag in MozillaCookieJar (#17471)
Add support for HTTP Only flag in MozillaCookieJar

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
2020-10-23 15:48:55 -07:00
Christopher Marchfelder da6f098188
bpo-40592: shutil.which will not return None anymore if ; is the last char in PATHEXT (GH-20088)
shutil.which will not return None anymore for empty str in PATHEXT
Empty PATHEXT will now be defaulted to _WIN_DEFAULT_PATHEXT
2020-10-23 11:08:24 +01:00
Brett Cannon 3c69f0c933
bpo-41910: specify the default implementations of object.__eq__ and object.__ne__ (GH-22874)
See Objects/typeobject.c:object_richcompare() for the implementation of this in CPython.

Automerge-Triggered-By: GH:brettcannon
2020-10-21 16:24:38 -07:00
Pablo Galindo b451b0e9a7
bpo-38980: Add -fno-semantic-interposition when building with optimizations (GH-22862) 2020-10-21 22:46:52 +01:00
kpinc c60394c7fc
bpo-39416: Document some restrictions on the default string representations of numeric classes (GH-18111)
[bpo-39416](): Document string representations of the Numeric classes

This is a change to the specification of the Python language.

The idea here is to put sane minimal limits on the Python language's default
representations of its Numeric classes.  That way "Marty's Robotic Massage Parlor
and Python Interpreter" implementation of Python won't do anything too
crazy.

Some discussion in the email thread:
Subject: Documenting Python's float.__str__()
https://mail.python.org/archives/list/python-dev@python.org/thread/FV22TKT3S2Q3P7PNN6MCXI6IX3HRRNAL/
2020-10-21 10:13:50 -07:00
Batuhan Taskaya c7437e2c02
bpo-41747: Ensure all dataclass methods uses their parents' qualname (GH-22155)
* bpo-41747: Ensure all dataclass methods uses their parents' qualname

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2020-10-21 09:49:22 -04:00
Dong-hee Na c0f22fb8b3
bpo-41902: Micro optimization for range.index if step is 1 (GH-22479) 2020-10-21 11:29:56 +09:00
Максим 5f22741340
bpo-23706: Add newline parameter to pathlib.Path.write_text (GH-22420) (GH-22420)
* Add _newline_ parameter to `pathlib.Path.write_text()`
* Update documentation of `pathlib.Path.write_text()`
* Add test case for `pathlib.Path.write_text()` calls with _newline_ parameter passed

Automerge-Triggered-By: GH:methane
2020-10-20 19:08:19 -07:00
Dong-hee Na 25492a5b59
bpo-41902: Micro optimization for compute_item of range (GH-22492) 2020-10-21 10:29:14 +09:00
kj 7cdf30fff3
bpo-42010: [docs] Clarify subscription of types (GH-22822) 2020-10-20 16:38:08 -07:00
Steve Dower 6d883fbe14
bpo-38439: Update the Windows Store package's icons for IDLE. Artwork by Andrew Clover (GH-22817) 2020-10-20 15:54:13 +01:00
Andrey Doroschenko ec42789e6e
bpo-39693: mention KeyError in tarfile extractfile documentation (GH-18639)
Co-authored-by: Andrey Darascheka <andrei.daraschenka@leverx.com>
2020-10-20 10:05:01 -04:00
Miro Hrončok faddc7449d
bpo-38439: Add 256px IDLE icon to the .ico, drop gifs from it (GH-19648) 2020-10-20 13:21:08 +01:00
TIGirardi f2312037e3
bpo-38324: Fix test__locale.py Windows failures (GH-20529)
Use wide-char _W_* fields of lconv structure on Windows
Remove "ps_AF" from test__locale.known_numerics on Windows
2020-10-20 12:39:52 +01:00
Ronald Oussoren 3185267400
bpo-41491: plistlib: accept hexadecimal integer values in xml plist files (GH-22764) 2020-10-20 09:26:33 +02:00
Pablo Galindo 109826c850
bpo-42093: Add opcode cache for LOAD_ATTR (GH-22803) 2020-10-20 06:22:44 +01:00
Raymond Hettinger 871934d4cf
bpo-4356: Add key function support to the bisect module (GH-20556) 2020-10-19 22:04:01 -07:00
Ruben Vorderman 23c0fb8edd
bpo-41586: Add pipesize parameter to subprocess & F_GETPIPE_SZ and F_SETPIPE_SZ to fcntl. (GH-21921)
* Add F_SETPIPE_SZ and F_GETPIPE_SZ to fcntl module
* Add pipesize parameter for subprocess.Popen class

This will allow the user to control the size of the pipes.
On linux the default is 64K. When a pipe is full it blocks for writing.
When a pipe is empty it blocks for reading. On processes that are
very fast this can lead to a lot of wasted CPU cycles. On a typical
Linux system the max pipe size is 1024K which is much better.
For high performance-oriented libraries such as xopen it is nice to
be able to set the pipe size.

The workaround without this feature is to use my_popen_process.stdout.fileno() in
conjuction with fcntl and 1031 (value of F_SETPIPE_SZ) to acquire this behavior.
2020-10-19 16:30:02 -07:00
Mark Sapiro bf838227c3
bpo-27321 Fix email.generator.py to not replace a non-existent header. (GH-18074)
This PR replaces #1977. The reason for the replacement is two-fold.

The fix itself is different is that if the CTE header doesn't exist in the original message, it is inserted. This is important because the new CTE could be quoted-printable whereas the original is implicit 8bit.

Also the tests are different. The test_nonascii_as_string_without_cte test in #1977 doesn't actually test the issue in that it passes without the fix. The test_nonascii_as_string_without_content_type_and_cte test is improved here, and even though it doesn't fail without the fix, it is included for completeness.

Automerge-Triggered-By: @warsaw
2020-10-19 15:49:19 -07:00
Zackery Spytz 1438c2ac77
bpo-41845: Move PyObject_GenericGetDict() back into the limited API (GH22646)
It was moved out of the limited API in 7d95e40721.
This change re-enables it from 3.10, to avoid generating invalid extension modules for earlier versions.
2020-10-19 23:47:37 +01:00
Alex Gaynor 3a8fdb2879
bpo-41784: make PyUnicode_AsUTF8AndSize part of the limited API (GH-22252) 2020-10-19 23:17:50 +01:00
Jason R. Coombs 5456e78f45
bpo-16396: Allow wintypes to be imported on non-Windows systems. (GH-21394)
Co-authored-by: Christian Heimes <christian@python.org>
2020-10-19 23:06:05 +01:00
Barry Warsaw 96ddc58281
bpo-42089: Sync with current cpython branch of importlib_metadata (GH-22775)
~~The only differences are in the test files.~~

Automerge-Triggered-By: @jaraco
2020-10-19 14:14:21 -07:00
Ronald Oussoren 93a1ccabde
bpo-41471: Ignore invalid prefix lengths in system proxy settings on macOS (GH-22762) 2020-10-19 20:16:21 +02:00
Ronald Oussoren 05ee790f4d
bpo-42051: Reject XML entity declarations in plist files (#22760) 2020-10-19 20:13:49 +02:00
Steve Dower 985f0ab3ad
bpo-39107: Updated Tcl and Tk to 8.6.10 in Windows installer (GH-22405) 2020-10-19 16:55:10 +01:00
Mark Shannon b580ed1d9d
Correct name of bytecode in change note. (GH-22723) 2020-10-19 13:20:33 +01:00
Bar Harel 5368c2b6e2
bpo-19270: Fixed sched.scheduler.cancel to cancel correct event (GH-22729) 2020-10-19 10:33:43 +03:00
Anthony Sottile 3c0ac18504
bpo-40492: Fix --outfile with relative path when the program changes it working dir (GH-19910) 2020-10-18 23:48:31 +03:00
Irit Katriel b81c833ab5
bpo-28660: Make TextWrapper break long words on hyphens (GH-22721) 2020-10-18 20:01:15 +03:00
scaramallion c304c9a7ef
bpo-41966: Fix pickling pure datetime.time subclasses (GH-22731) 2020-10-18 17:49:48 +03:00
Ma Lin a0c603cb9d
bpo-38252: Use 8-byte step to detect ASCII sequence in 64bit Windows build (GH-16334) 2020-10-18 17:48:38 +03:00
Max Bernstein 3635388f52
bpo-42065: Fix incorrectly formatted _codecs.charmap_decode error message (GH-19940) 2020-10-17 23:38:21 +03:00
Kevin Adler 1dd6d956a3
closes bpo-42030: Remove legacy AIX dynload support (GH-22717)
Since c19c5a6, AIX builds have defaulted to using dynload_shlib over
dynload_aix when dlopen is available. This function has been available
since AIX 4.3, which went out of support in 2003, the same year the
previously referenced commit was made. It has been nearly 20 years
since a version of AIX has been supported which has not used
dynload_shlib so there's no reason to keep this legacy code around.
2020-10-16 13:03:28 -05:00
Erlend Egeberg Aasland 644e94272a
bpo-42021: Fix possible ref leaks during _sqlite3 module init (GH-22673) 2020-10-15 21:20:15 +09:00
Kevin Adler 2d2af320d9
bpo-41894: Fix UnicodeDecodeError while loading native module (GH-22466)
When running in a non-UTF-8 locale, if an error occurs while importing a
native Python module (say because a dependent share library is missing),
the error message string returned may contain non-ASCII code points
causing a UnicodeDecodeError.

PyUnicode_DecodeFSDefault is used for buffers which may contain
filesystem  paths. For consistency with os.strerror(),
PyUnicode_DecodeLocale is used for buffers which contain system error
messages. While the shortname parameter is always encoded in ASCII
according to PEP 489, it is left decoded using PyUnicode_FromString to
minimize the changes and since it should not affect the decoding (albeit
_potentially_ slower).

In dynload_hpux, since the error buffer contains a message generated
from a static ASCII string and the module filesystem path,
PyUnicode_DecodeFSDefault is used instead of PyUnicode_DecodeLocale as
is used elsewhere.

* bpo-41894: Fix bugs in dynload error msg handling

For both dynload_aix and dynload_hpux, properly handle the possibility
that decoding strings may return NULL and when such an error happens,
properly decrement any previously decoded strings and return early.

In addition, in dynload_aix, ensure that we pass the decoded string
*object* pathname_ob to PyErr_SetImportError instead of the original
pathname buffer.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2020-10-15 10:53:27 +09:00
Brandt Bucher c13b847a6f
bpo-41984: GC track all user classes (GH-22701) 2020-10-14 18:44:07 -07:00