Although the underlying libffi issue remains open, adding these
checks have caused problems in third-party projects which are in
widespread use. See the issue for examples.
The corresponding tests have also been skipped.
The comment about the collection rules for the permanent generation was
incorrectly referenced by a comment in gcmodule.c (the comment has been
moved long ago into a header file). Moving the comment into the relevant
code helps with readability and avoids broken references.
test_openssl_version now accepts version 3.0.0.
getpeercert() no longer returns IPv6 addresses with a trailing new line.
Signed-off-by: Christian Heimes <christian@python.org>
https://bugs.python.org/issue38820
On most platforms, the `environ` symbol is accessible everywhere.
In a dylib on OSX, it's not easily accessible, you need to find it with
_NSGetEnviron.
The code was caching the *value* of environ. But a setenv() can change the value,
leaving garbage at the old value. Fix: don't cache the value of environ, just
read it every time.
The readline module now detects if Python is linked to libedit at runtime
on all platforms. Previously, the check was only done on macOS.
If Python is used as a library by a binary linking to libedit, the linker
resolves the rl_initialize symbol required by the readline module against
libedit instead of libreadline, which leads to a segfault.
Take advantage of the existing supporting code to have readline module being
compatible with both situations.
The previous code was raising a `KeyError` for both the Python and C implementation.
This was caused by the specified index of an invalid input which did not exist
in the memo structure, where the pickle stores what objects it has seen.
The malformed input would have caused either a `BINGET` or `LONG_BINGET` load
from the memo, leading to a `KeyError` as the determined index was bogus.
https://bugs.python.org/issue38876https://bugs.python.org/issue38876
Remove PyMethod_ClearFreeList() and PyCFunction_ClearFreeList()
functions: the free lists of bound method objects have been removed.
Remove also _PyMethod_Fini() and _PyCFunction_Fini() functions.
* Add GCState type for readability
* gcmodule.c now gets its gcstate from tstate
* _PyGC_DumpShutdownStats() now expects tstate rather than runtime
* Rename "state" to "gcstate" for readability: to avoid confusion
between "state" and "tstate" for example.
* collect() now only expects tstate: it gets gcstate from tstate.
* Pass tstate to _PyErr_xxx() functions
Clear the current thread later in the Python finalization.
* The PyInterpreterState_Delete() function is now responsible
to call PyThreadState_Swap(NULL).
* The tstate_delete_common() function is now responsible to clear the
"autoTSSKey" thread local storage and it only clears it once the
thread state is fully cleared. It allows to still get the current
thread from TSS in tstate_delete_common().
* Factorize code in common between Py_FinalizeEx() and
Py_EndInterpreter().
* Py_EndInterpreter() now also calls _PyWarnings_Fini().
* Call _PyExc_Fini() and _PyGC_Fini() later in the finalization.
This exposes a Linux-specific syscall for sending a signal to a process
identified by a file descriptor rather than a pid.
For simplicity, we don't support the siginfo_t parameter to the syscall. This
parameter allows implementing a pidfd version of rt_sigqueueinfo(2), which
Python also doesn't support.
The PyFPE_START_PROTECT() and PyFPE_END_PROTECT() macros are empty:
they have been doing nothing for the last year (since commit
735ae8d139), so stop using them.
Add PyInterpreterState.runtime field: reference to the _PyRuntime
global variable. This field exists to not have to pass runtime in
addition to tstate to a function. Get runtime from tstate:
tstate->interp->runtime.
Remove "_PyRuntimeState *runtime" parameter from functions already
taking a "PyThreadState *tstate" parameter.
_PyGC_Init() first parameter becomes "PyThreadState *tstate".
If an exception is raised and PyInit__multibytecodec() returns NULL,
Python reports properly the exception to the user. There is no need
to crash Python with Py_FatalError().
* Add _PyObject_VectorcallTstate() function: similar to
_PyObject_Vectorcall(), but with tstate parameter
* Add tstate parameter to _PyObject_MakeTpCall()
After #9665, this moves the remaining types in posixmodule to be heap-allocated to make it compatible with PEP384 as well as modifying all the type accessors to fully make the type opaque.
The original PR that got messed up a rebase: https://github.com/python/cpython/pull/10854. All the issues in that commit have now been addressed since https://github.com/python/cpython/pull/11661 got committed.
This change also removes any state from the data segment and onto the module state itself.
https://bugs.python.org/issue35381
Automerge-Triggered-By: @encukou
open(), io.open(), codecs.open() and fileinput.FileInput no longer
accept "U" ("universal newline") in the file mode. This flag was
deprecated since Python 3.3.
It should be impossible for an untracked object to have the collecting
flag set. Back when state was stored in gc_refs, it obviously was
impossible (gc_refs couldn't possibly have a positive & negative value
simultaneously). While the _implementation_ of "state" has gotten much
more complicated, it's still _logically_ just as impossible.
Add a total_nframe field to the traces collected by the tracemalloc module.
This field indicates the original number of frames before it was truncated.
* Misc gc code & comment cleanups.
validate_list: there are two temp flags polluting pointers, but this checked only one. Now it checks both, and verifies that the list head's pointers are not polluted.
move_unreachable: repaired incoherent comments. Added new comments. Cleared the pollution of the unreachable list head's 'next' pointer (it was expedient while the function was running, but there's no excuse for letting this damage survive the function's end).
* Update Modules/gcmodule.c
Co-Authored-By: Pablo Galindo <Pablogsal@gmail.com>
Currently if any finalizer invoked during garbage collection resurrects any object, the gc gives up and aborts the collection. Although finalizers are assured to only run once per object, this behaviour of the gc can lead to an ever-increasing memory situation if new resurrecting objects are allocated in every new gc collection.
To avoid this, recompute what objects among the unreachable set need to be resurrected and what objects can be safely collected. In this way, resurrecting objects will not block the collection of other objects in the unreachable set.
Rewrite getsockaddrarg() helper function of socketmodule.c (_socket
module) to prevent a false alarm when compiling codde using GCC with
_FORTIFY_SOURCE=2. Pass a pointer of the sock_addr_t union, rather
than passing a pointer to a sockaddr structure.
Add "struct sockaddr_tipc tipc;" to the sock_addr_t union.
subtract_refs() now pass the parent object to visit_decref() which
pass it to _PyObject_ASSERT(). So if the "is freed" assertion fails,
the parent is used in debug trace, rather than the freed object. The
parent object is more likely to contain useful information. Freed
objects cannot be inspected are are displayed as "<object at xxx is
freed>" with no other detail.
In debug mode, PyObject_GC_Track() now calls tp_traverse() of the
object type to ensure that the object is valid: test that objects
visited by tp_traverse() are valid.
Fix pyexpat.c: only track the parser in the GC once the parser is
fully initialized.
bpo-36389, bpo-38376: The _PyObject_CheckConsistency() function is
now also available in release mode. For example, it can be used to
debug a crash in the visit_decref() function of the GC.
Modify the following functions to also work in release mode:
* _PyDict_CheckConsistency()
* _PyObject_CheckConsistency()
* _PyType_CheckConsistency()
* _PyUnicode_CheckConsistency()
Other changes:
* _PyMem_IsPtrFreed(ptr) now also returns 1 if ptr is NULL
(equals to 0).
* _PyBytesWriter_CheckConsistency() now returns 1 and is only used
with assert().
* Reorder _PyObject_Dump() to write safe fields first, and only
attempt to render repr() at the end.
* _Py_FindEnvConfigValue() now returns a string allocated
by PyMem_RawMalloc().
* calculate_init() now decodes VPATH macro.
* Add calculate_open_pyenv() function.
* Add substring() and joinpath2() functions.
* Fix add_exe_suffix()
And a few cleanup changes.
On Windows use UTF-16 (or UTF-32 for 32-bit Tcl_UniChar) with the
"surrogatepass" error handler for converting to/from Tcl Unicode objects.
On Linux use UTF-8 with the "surrogateescape" error handler for converting
to/from Tcl String objects.
Converting strings from Tcl to Python and back now never fails
(except MemoryError).
Following symbolic links is now limited to 40 attempts, just to
prevent loops.
Add subfunctions:
* Add resolve_symlinks()
* Add calculate_argv0_path_framework()
* Add calculate_which()
* Add calculate_program_macos()
Fix also _Py_wreadlink(): readlink() result type is Py_ssize_t, not
int.
On FreeBSD, Python no longer calls fedisableexcept() at startup to
control the floating point control mode. The call became useless
since FreeBSD 6: it became the default mode.
For now, we'll rely on the fact that the config structures aren't covered by the stable ABI.
We may revisit this in the future if we further explore the idea of offering a stable embedding API.
(cherry picked from commit bdace21b76)
Fix a bug due to the interaction of weakrefs and the cyclic garbage
collector. We must clear any weakrefs in garbage in order to prevent
their callbacks from executing and causing a crash.
bpo-22273, bpo-38321: Fix following warning:
modules\_ctypes\stgdict.c(704):
warning C4244: 'initializing': conversion from 'Py_ssize_t' to 'int', possible loss of data
bpo-38248, bpo-38321: Fix warning:
modules\_asynciomodule.c(2667):
warning C4102: 'set_exception': unreferenced label
The related goto has been removed by
commit edad4d89e3.
Add a new struct_size field to PyPreConfig and PyConfig structures to
allow to modify these structures in the future without breaking the
backward compatibility.
* Replace private _config_version field with public struct_size field
in PyPreConfig and PyConfig.
* Public PyPreConfig_InitIsolatedConfig() and
PyPreConfig_InitPythonConfig()
return type becomes PyStatus, instead of void.
* Internal _PyConfig_InitCompatConfig(),
_PyPreConfig_InitCompatConfig(), _PyPreConfig_InitFromConfig(),
_PyPreConfig_InitFromPreConfig() return type becomes PyStatus,
instead of void.
* Remove _Py_CONFIG_VERSION
* Update the Initialization Configuration documentation.
test_hmac and test_hashlib test built-in hashing implementations and
OpenSSL-based hashing implementations. Add more checks to skip OpenSSL
implementations when a strict crypto policy is active.
Use EVP_DigestInit_ex() instead of EVP_DigestInit() to initialize the
EVP context. The EVP_DigestInit() function clears alls flags and breaks
usedforsecurity flag again.
Signed-off-by: Christian Heimes <christian@python.org>
https://bugs.python.org/issue38270
* Updated _hashopenssl.c to be PEP 384 compliant
* Remove refleak test from test_hashlib. The updated type no longer accepts random arguments to __init__.
* search_for_prefix() directly calls reduce() if found is greater
than 0.
* Add calculate_pybuilddir() subfunction.
* search_for_prefix(): add path string buffer for readability.
* Fix some error handling code paths: release resources on error.
* calculate_read_pyenv(): rename tmpbuffer to filename.
* test.pythoninfo now also logs windows.dll_path
Refactor path configuration code:
* read_pth_file() now returns PyStatus to report errors, rather than
calling Py_FatalError().
* Move argv0_path and zip_path buffers out of PyCalculatePath
structures.
* On Windows, _PyPathConfig.home is now preferred over PyConfig.home.
* _PyConfig_InitPathConfig() now starts by copying the global path
configuration, and then override values set in PyConfig.
* _PyPathConfig_Calculate() implementations no longer override
_PyPathConfig fields which are already computed. For example,
if _PyPathConfig.prefix is not NULL, leave it unchanged.
* If Py_SetPath() has been called, _PyConfig_InitPathConfig() doesn't
call _PyPathConfig_Calculate() anymore.
* _PyPathConfig_Calculate() no longer uses PyConfig,
except to initialize PyCalculatePath structure.
* pathconfig_calculate(): remove useless temporary
"_PyPathConfig new_config" variable.
* calculate_module_search_path(): remove hack to workaround memory
allocation failure, call Py_FatalError() instead.
* Fix get_program_full_path(): handle memory allocation failure.
* If Py_SetPath() has been called, _PyConfig_InitPathConfig() now
uses its value.
* Py_Initialize() now longer copies path configuration from PyConfig
to the global path configuration (_Py_path_config).
* Convert select module to PEP-384
Summary: Do the necessary versions to be Pyro-compatible, including migrating `PyType_Ready` to `PyType_FromSpec` and moving static data into a new `_selectstate` struct.
* 📜🤖 Added by blurb_it.
* Fixup Mac OS/X build
In ArgumentClinic, value "NULL" should now be used only for unrepresentable default values
(like in the optional third parameter of getattr). "None" should be used if None is accepted
as argument and passing None has the same effect as not passing the argument at all.
* Fix a crash in comparing with float (and maybe other crashes).
* They are now never equal to strings and non-integer numbers.
* Comparison with a large number no longer raises OverflowError.
* Arbitrary exceptions no longer silenced in constructors and comparisons.
* TypeError raised in the constructor contains now the name of the type.
* Accept only ChannelID and int-like objects in channel functions.
* Accept only InterpreterId, int-like objects and str in the InterpreterId constructor.
* Accept int-like objects, not just int in interpreter related functions.
- Migrate `Random_Type` to `PyType_FromSpec`
- To simulate an old use of `PyLong_Type.tp_as_number->nb_absolute`, I added
code to the module init function to stash `int.__abs__` for later
use. Ideally we'd use `PyType_GetSlot()` instead, but it doesn't currently
work for static types in CPython, and implementing it just for this case
doesn't seem worth it.
- Do exact check for long and dispatch to PyNumber_Absolute, use vector call when not exact.
The usedforsecurity keyword only argument added to the hash constructors is useful for FIPS builds and similar restrictive environment with non-technical requirements that legacy algorithms be forbidden by their implementations without being explicitly annotated as not being used for any security related purposes. Linux distros with FIPS support benefit from this being standard rather than making up their own way(s) to do it.
Contributed and Signed-off-by: Christian Heimes christian@python.org
* subprocess: Add user, group and extra_groups paremeters to subprocess.Popen
This adds a `user` parameter to the Popen constructor that will call
setreuid() in the child before calling exec(). This allows processes
running as root to safely drop privileges before running the subprocess
without having to use a preexec_fn.
This also adds a `group` parameter that will call setregid() in
the child process before calling exec().
Finally an `extra_groups` parameter was added that will call
setgroups() to set the supplimental groups.
* 1. add test case with wrong behavior
* 2. fix bug when max_length == -1
* 3. allow b"" as valid input data for decompress_buf()
* 4. when max_length >= 0, let needs_input mechanism works
* add more asserts to test case
The instance destructor for a type is responsible for preparing
an instance for deallocation by decrementing the reference counts
of its referents.
If an instance belongs to a heap type, the type object of an instance
has its reference count decremented while for static types, which
are permanently allocated, the type object is unaffected by the
instance destructor.
Previously, the default instance destructor searched the class
hierarchy for an inherited instance destructor and, if present,
would invoke it.
Then, if the instance type is a heap type, it would decrement the
reference count of that heap type. However, this could result in the
premature destruction of a type because the inherited instance
destructor should have already decremented the reference count
of the type object.
This change avoids the premature destruction of the type object
by suppressing the decrement of its reference count when an
inherited, non-default instance destructor has been invoked.
Finally, an assertion on the Py_SIZE of a type was deleted. Heap
types have a non zero size, making this into an incorrect assertion.
https://github.com/python/cpython/pull/15323
Add functions with various calling conventions to `_testcapi`, expose them as module-level functions, bound methods, class methods, and static methods, and test calling them and introspecting them through GDB.
https://bugs.python.org/issue37499
Co-authored-by: Jeroen Demeyer <J.Demeyer@UGent.be>
Automerge-Triggered-By: @pganssle
Summary:
Eliminate uses of `_Py_IDENTIFIER` from `_posixsubprocess`, replacing them with interned strings.
Also tries to find an existing version of the module, which will allow subinterpreters.
https://bugs.python.org/issue38069
* PEP-384 _struct
* More PEP-384 fixes for _struct
Summary: Add a couple of more fixes for `_struct` that were previously missed such as removing `tp_*` accessors and using `PyBytesWriter` instead of calling `PyBytes_FromStringAndSize` with `NULL`. Also added a test to confirm that `iter_unpack` type is still uninstantiable.
* 📜🤖 Added by blurb_it.
Accumulate certificates in a set instead of doing a costly list contain
operation. A Windows cert store can easily contain over hundred
certificates. The old code would result in way over 5,000 comparison
operations
Signed-off-by: Christian Heimes <christian@python.org>
In debug mode, visit_decref() now calls _PyObject_IsFreed() to ensure
that the object is not freed. If it's freed, the program fails with
an assertion error and Python dumps informations about the freed
object.
ssl_collect_certificates function in _ssl.c has a memory leak.
Calling CertOpenStore() and CertAddStoreToCollection(), a store's refcnt gets incremented by 2.
But CertCloseStore() is called only once and the refcnt leaves 1.
If FormatMessageW() is passed the FORMAT_MESSAGE_FROM_SYSTEM flag without FORMAT_MESSAGE_IGNORE_INSERTS, it will fail if there are insert sequences in the message definition.
* Rename PyThreadState_DeleteCurrent()
to _PyThreadState_DeleteCurrent()
* Move it to the internal C API
Co-Authored-By: Carol Willing <carolcode@willingconsulting.com>
The purpose of the `unicodedata.is_normalized` function is to answer
the question `str == unicodedata.normalized(form, str)` more
efficiently than writing just that, by using the "quick check"
optimization described in the Unicode standard in UAX #15.
However, it turns out the code doesn't implement the full algorithm
from the standard, and as a result we often miss the optimization and
end up having to compute the whole normalized string after all.
Implement the standard's algorithm. This greatly speeds up
`unicodedata.is_normalized` in many cases where our partial variant
of quick-check had been returning MAYBE and the standard algorithm
returns NO.
At a quick test on my desktop, the existing code takes about 4.4 ms/MB
(so 4.4 ns per byte) when the partial quick-check returns MAYBE and it
has to do the slow normalize-and-compare:
$ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
50 loops, best of 5: 4.39 msec per loop
With this patch, it gets the answer instantly (58 ns) on the same 1 MB
string:
$ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
5000000 loops, best of 5: 58.2 nsec per loop
This restores a small optimization that the original version of this
code had for the `unicodedata.normalize` use case.
With this, that case is actually faster than in master!
$ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 561 usec per loop
$ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 512 usec per loop
* Use the 'p' format unit instead of manually called PyObject_IsTrue().
* Pass boolean value instead 0/1 integers to functions that needs boolean.
* Convert some arguments to boolean only once.
Fix a ctypes regression of Python 3.8. When a ctypes.Structure is
passed by copy to a function, ctypes internals created a temporary
object which had the side effect of calling the structure finalizer
(__del__) twice. The Python semantics requires a finalizer to be
called exactly once. Fix ctypes internals to no longer call the
finalizer twice.
Create a new internal StructParam_Type which is only used by
_ctypes_callproc() to call PyMem_Free(ptr) on Py_DECREF(argument).
StructUnionType_paramfunc() creates such object.
bpo-37834: Normalise handling of reparse points on Windows
* ntpath.realpath() and nt.stat() will traverse all supported reparse points (previously was mixed)
* nt.lstat() will let the OS traverse reparse points that are not name surrogates (previously would not traverse any reparse point)
* nt.[l]stat() will only set S_IFLNK for symlinks (previous behaviour)
* nt.readlink() will read destinations for symlinks and junction points only
bpo-1311: os.path.exists('nul') now returns True on Windows
* nt.stat('nul').st_mode is now S_IFCHR (previously was an error)
The faulthandler module no longer allocates its alternative stack at
Python startup. Now the stack is only allocated at the first
faulthandler usage.
faulthandler no longer ignores memory allocation failure when
allocating the stack. sigaltstack() failure now raises an OSError
exception, rather than being ignored.
The alternative stack is no longer used if sigaction() is
not available. In practice, sigaltstack() should only be available
when sigaction() is avaialble, so this change should have no effect
in practice.
faulthandler.dump_traceback_later() internal locks are now only
allocated at the first dump_traceback_later() call, rather than
always being allocated at Python startup.
There are plenty of legitimate scripts in the tree that begin with a
`#!`, but also a few that seem to be marked executable by mistake.
Found them with this command -- it gets executable files known to Git,
filters to the ones that don't start with a `#!`, and then unmarks
them as executable:
$ git ls-files --stage \
| perl -lane 'print $F[3] if (!/^100644/)' \
| while read f; do
head -c2 "$f" | grep -qxF '#!' \
|| chmod a-x "$f"; \
done
Looking at the list by hand confirms that we didn't sweep up any
files that should have the executable bit after all. In particular
* The `.psd` files are images from Photoshop.
* The `.bat` files sure look like things that can be run.
But we have lots of other `.bat` files, and they don't have
this bit set, so it must not be needed for them.
Automerge-Triggered-By: @benjaminp
X509_AUX is an odd, note widely used, OpenSSL extension to the X509 file format. This function doesn't actually use any of the extra metadata that it parses, so just use the standard API.
Automerge-Triggered-By: @tiran
faulthandler now allocates a dedicated stack of SIGSTKSZ*2 bytes,
instead of just SIGSTKSZ bytes. Calling the previous signal handler
in faulthandler signal handler uses more than SIGSTKSZ bytes of stack
memory on some platforms.
FreeBSD implementation of poll(2) restricts the timeout argument to be
either zero, or positive, or equal to INFTIM (-1).
Unless otherwise overridden, socket timeout defaults to -1. This value
is then converted to milliseconds (-1000) and used as argument to the
poll syscall. poll returns EINVAL (22), and the connection fails.
This bug was discovered during the EINTR handling testing, and the
reproduction code can be found in
https://bugs.python.org/issue23618 (see connect_eintr.py,
attached). On GNU/Linux, the example runs as expected.
This change is trivial:
If the supplied timeout value is negative, truncate it to -1.
This fixes an inconsistency between the Python and C implementations of
the datetime module. The pure python version of the code was not
accepting offsets greater than 23:59 but less than 24:00. This is an
accidental legacy of the original implementation, which was put in place
before tzinfo allowed sub-minute time zone offsets.
GH-14878
Use a tighter scope temporary variable to help register allocation.
1% speedup for large string.
Use PyDict_SetItemDefault() for memoizing keys.
At most 4% speedup when the cache hit ratio is low.
There was a discrepancy between the Python and C implementations.
Add singletons ALWAYS_EQ, LARGEST and SMALLEST in test.support
to test mixed type comparison.
Support for RFCOMM, L2CAP, HCI, SCO is based on the BTPROTO_* macros
being defined. Winsock only supports RFCOMM, even though it has a
BTHPROTO_L2CAP macro. L2CAP support would build on windows, but not
necessarily work.
This also adds some basic unittests for constants (all of which existed
prior to this commit, just not on windows) and creating sockets.
pair: Nate Duarte <slacknate@gmail.com>
gc used several PySys_WriteStderr() calls to write stats.
It caused stats mixed up when stderr is shared by multiple
processes like this:
gc: collecting generation 2...
gc: objects in each generation: 0 0gc: collecting generation 2...
gc: objects in each generation: 0 0 126077 126077
gc: objects in permanent generation: 0
gc: objects in permanent generation: 0
gc: done, 112575 unreachable, 0 uncollectablegc: done, 112575 unreachable, 0 uncollectable, 0.2223s elapsed
, 0.2344s elapsed
Expose the CAN_BCM SocketCAN constants used in the bcm_msg_head struct
flags (provided by <linux/can/bcm.h>) under the socket library.
This adds the following constants with a CAN_BCM prefix:
* SETTIMER
* STARTTIMER
* TX_COUNTEVT
* TX_ANNOUNCE
* TX_CP_CAN_ID
* RX_FILTER_ID
* RX_CHECK_DLC
* RX_NO_AUTOTIMER
* RX_ANNOUNCE_RESUME
* TX_RESET_MULTI_IDX
* RX_RTR_FRAME
* CAN_FD_FRAME
The CAN_FD_FRAME flag was introduced in the 4.8 kernel, while the other
ones were present since SocketCAN drivers were mainlined in 2.6.25. As
such, it is probably unnecessary to guard against these constants being
missing.
When scanning the string, most characters are valid, so
checking for invalid characters first means never needing
to check the value of strict on valid strings, and only
needing to check it on invalid characters when doing
non-strict parsing of invalid strings.
This provides a measurable reduction in per-character
processing time (~11% in the pre-merge patch testing).
Deprecate the parser module and add a deprecation warning triggered on import and a warning block in the documentation.
https://bugs.python.org/issue37268
Automerge-Triggered-By: @pablogsal
* bpo-37399: Correctly attach tail text to the last element/comment/pi, even when comments or pis are discarded.
Also fixes the insertion of PIs when "insert_pis=True" is configured for a TreeBuilder.