Commit Graph

120 Commits

Author SHA1 Message Date
Sebastian Pipping 6a95676bb5
gh-115398: Expose Expat >=2.6.0 reparse deferral API (CVE-2023-52425) (GH-115623)
Allow controlling Expat >=2.6.0 reparse deferral (CVE-2023-52425) by adding five new methods:

- `xml.etree.ElementTree.XMLParser.flush`
- `xml.etree.ElementTree.XMLPullParser.flush`
- `xml.parsers.expat.xmlparser.GetReparseDeferralEnabled`
- `xml.parsers.expat.xmlparser.SetReparseDeferralEnabled`
- `xml.sax.expatreader.ExpatParser.flush`

Based on the "flush" idea from https://github.com/python/cpython/pull/115138#issuecomment-1932444270 .

### Notes

- Please treat as a security fix related to CVE-2023-52425.

Includes code suggested-by: Snild Dolkow <snild@sony.com>
and by core dev Serhiy Storchaka.
2024-02-29 14:52:50 -08:00
Serhiy Storchaka ca715e56a1
gh-69893: Add the close() method for xml.etree.ElementTree.iterparse() iterator (GH-114534) 2024-02-04 17:25:21 +02:00
Sam Gross 66f95ea6a6
gh-114737: Revert change to ElementTree.iterparse "root" attribute (GH-114755)
Prior to gh-114269, the iterator returned by ElementTree.iterparse was
initialized with the root attribute as None. This restores the previous
behavior.
2024-01-31 13:22:24 +02:00
Sam Gross ce01ab536f
gh-101438: Avoid reference cycle in ElementTree.iterparse. (GH-114269)
The iterator returned by ElementTree.iterparse() may hold on to a file
descriptor. The reference cycle prevented prompt clean-up of the file
descriptor if the returned iterator was not exhausted.
2024-01-23 20:14:46 +00:00
Jacob Walls d717be04dc
gh-83122: Deprecate testing element truth values in `ElementTree` (#31149)
When testing element truth values, emit a DeprecationWarning in all implementations.

This had emitted a FutureWarning in the rarely used python-only implementation since ~2.7 and has always been documented as a behavior not to rely on.

Matching an element in a tree search but having it test False can be unexpected. Raising the warning enables making the choice to finally raise an exception for this ambiguous behavior in the future.
2023-01-22 17:16:48 -08:00
Nick Drozd 024ac542d7
bpo-45975: Simplify some while-loops with walrus operator (GH-29347) 2022-11-26 14:33:25 -08:00
Victor Stinner fd76eb547d
gh-94383: Remove ElementTree.Element.copy() method (#94384)
xml.etree: Remove the ElementTree.Element.copy() method of the pure
Python implementation, deprecated in Python 3.10, use the copy.copy()
function instead. The C implementation of xml.etree has no copy()
method, only a __copy__() method.
2022-07-04 15:51:01 +02:00
Serhiy Storchaka d7db9dc3cc
gh-91810: Fix regression with writing an XML declaration with encoding='unicode' (GH-93426)
Suppress writing an XML declaration in open files in ElementTree.write()
with encoding='unicode' and xml_declaration=None.

If file patch is passed to ElementTree.write() with encoding='unicode',
always open a new file in UTF-8.
2022-06-14 07:25:33 +03:00
Serhiy Storchaka 707839b0fe
gh-91810: ElementTree: Use text file's encoding by default in XML declaration (GH-91903)
ElementTree method write() and function tostring() now use the text file's
encoding ("UTF-8" if not available) instead of locale encoding in XML
declaration when encoding="unicode" is specified.
2022-05-11 09:31:07 +03:00
Jacob Walls 496c428de3
bpo-43292: Fix file leak in `ET.iterparse()` when not exhausted (GH-31696)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2022-03-07 11:31:46 +02:00
Jannis Vajen 345572a1a0
bpo-46786: Make ElementTree write the HTML tags embed, source, track, wbr as empty tags (GH-31406)
See https://html.spec.whatwg.org/multipage/syntax.html#void-elements
for reference.
2022-02-27 15:25:54 +01:00
Noah Kantrowitz be42c06bb0
Update URLs in comments and metadata to use HTTPS (GH-27458) 2021-07-30 15:54:46 +02:00
Alex Prengère 51a85ddce8
bpo-43399: Fix ElementTree.extend not working on iterators (GH-24751) 2021-03-31 00:11:29 +03:00
Felix C. Stegerman 1f433406bd
bpo-42151: don't set specified_attributes=1 in pure Python ElementTree (GH-22987) 2021-02-24 11:25:31 +09:00
scoder 6a412c94b6
bpo-41900: C14N 2.0 serialisation failed for unprefixed attributes when a default namespace was defined. (GH-22474) 2020-10-03 08:07:07 +02:00
mefistotelis 5fd8123dfd
bpo-39011: Preserve line endings within ElementTree attributes (GH-18468)
* bpo-39011: Preserve line endings within attributes

Line endings within attributes were previously normalized to "\n" in Py3.7/3.8.
This patch removes that normalization, as line endings which were
replaced by entity numbers should be preserved in original form.
2020-04-12 14:51:58 +02:00
Gordon P. Hemsley 7d952ded68 bpo-32424: Deprecate xml.etree.ElementTree.Element.copy() in favor of copy.copy() (GH-12995) 2019-09-10 16:22:01 +01:00
Serhiy Storchaka eb8974616b
bpo-15999: Always pass bool instead of int to the expat parser. (GH-15622) 2019-09-01 12:11:43 +03:00
Serhiy Storchaka f02ea6225b
bpo-36543: Remove old-deprecated ElementTree features. (GH-12707)
Remove methods Element.getchildren(), Element.getiterator() and
ElementTree.getiterator() and the xml.etree.cElementTree module.
2019-09-01 11:18:35 +03:00
Stefan Behnel b5d3ceea48
bpo-14465: Add an indent() function to xml.etree.ElementTree to pretty-print XML trees (GH-15200) 2019-08-23 16:44:25 +02:00
Stefan Behnel e1d5dd645d
bpo-13611: C14N 2.0 implementation for ElementTree (GH-12966)
* Implement C14N 2.0 as a new canonicalize() function in ElementTree.

Missing features:
- prefix renaming in XPath expressions (tag and attribute text is supported)
- preservation of original prefixes given redundant namespace declarations
2019-05-01 22:34:13 +02:00
Stefan Behnel dde3eebdaa
bpo-36676: Namespace prefix aware parsing support for the ET.XMLParser target (GH-12885)
* bpo-36676: Implement namespace prefix aware parsing support for the XMLParser target in ElementTree.
2019-05-01 21:49:58 +02:00
Stefan Behnel 43851a202c
bpo-36673: Implement comment/PI parsing support for the TreeBuilder in ElementTree. (#12883)
* bpo-36673: Implement comment/PI parsing support for the TreeBuilder in ElementTree.

* bpo-36673: Rewrite the comment/PI factory handling for the TreeBuilder in "_elementtree" to make it use the same factories as the ElementTree module, and to make it explicit when the comments/PIs are inserted into the tree and when they are not (which is the default).
2019-05-01 21:20:38 +02:00
Bernt Røskar Brenna ffca16e25a bpo-36227: ElementTree.tostring() default_namespace and xml_declaration arguments (GH-12225)
Add new keyword arguments "default_namespace" and "xml_declaration" to functions ET.tostring() and ET.tostringlist(), as known from ElementTree.write().
2019-04-14 10:07:02 +02:00
Serhiy Storchaka da0847048a
bpo-36431: Use PEP 448 dict unpacking for merging two dicts. (GH-12553) 2019-03-27 08:02:28 +02:00
Serhiy Storchaka 3b05ad7be0
bpo-34160: Preserve user specified order of Element attributes in html. (GH-10190) 2018-10-29 19:31:04 +02:00
Raymond Hettinger e3685fd5fd
bpo-34160: Preserve user specified order of Element attributes (GH-10163) 2018-10-28 11:18:22 -07:00
Serhiy Storchaka f081fd8303
bpo-35013: Add more type checks for children of Element. (GH-9944)
It is now guarantied that children of xml.etree.ElementTree.Element
are Elements (at least in C implementation). Previously methods
__setitem__(), __setstate__() and __deepcopy__() could be used for
adding non-Element children.
2018-10-19 12:12:57 +03:00
Serhiy Storchaka 02ec92fa7b
bpo-29209: Remove old-deprecated features in ElementTree. (GH-6769)
Also make getchildren() and getiterator() emitting
a DeprecationWarning instead of PendingDeprecationWarning.
2018-07-24 12:03:34 +03:00
Mike 53f7a7c281 bpo-32297: Few misspellings found in Python source code comments. (#4803)
* Fix multiple typos in code comments

* Add spacing in comments (test_logging.py, test_math.py)

* Fix spaces at the beginning of comments in test_logging.py
2017-12-14 13:04:53 +02:00
Serhiy Storchaka 2e576f5aec bpo-30144: Import collections ABC from collections.abc rather than collections. (#1263) 2017-04-24 09:05:00 +03:00
Serhiy Storchaka 762ec97ea6 bpo-29204: Emit warnings for already deprecated ElementTree features. (#773)
Element.getiterator() and the html parameter of XMLParser() were
deprecated only in the documentation (since Python 3.2 and 3.4 correspondintly).
Now using them emits a deprecation warning.

* Don’t need check_warnings any more.
2017-03-30 18:12:06 +03:00
Raymond Hettinger 11fa3ffcb1 merge 2016-09-11 23:23:24 -07:00
Raymond Hettinger 076366c2a5 Issue #17582: xml.etree.ElementTree nows preserves whitespaces in attributes
(Patch by Duane Griffin.  Reviewed and approved by Stefan Behnel.)
2016-09-11 23:18:03 -07:00
R David Murray 44b548dda8 #27364: fix "incorrect" uses of escape character in the stdlib.
And most of the tools.

Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
2016-09-08 13:59:53 -04:00
Martin Panter 32eeb56f42 Merge Element Tree doc string from 3.5 2016-06-04 07:13:38 +00:00
Martin Panter 29ce082c10 Clarify deprecation of ElementTree.XMLParser(html=...) parameter 2016-06-04 07:12:51 +00:00
Martin Panter dcfebb32e2 Issue #26676: Add missing XMLPullParser to ElementTree.__all__ 2016-04-01 06:55:55 +00:00
Serhiy Storchaka 47a9d59d51 Issue #25902: Fixed various refcount issues in ElementTree iteration. 2015-12-21 11:11:12 +02:00
Serhiy Storchaka 66c08d90f6 Issue #25902: Fixed various refcount issues in ElementTree iteration. 2015-12-21 11:09:48 +02:00
Serhiy Storchaka 9ec5e25f26 Issue #25638: Optimized ElementTree.iterparse(); it is now 2x faster.
ElementTree.XMLParser._setevents now accepts any objects with the append
method, not just a list.
2015-12-07 02:31:11 +02:00
Serhiy Storchaka 6f988b5990 Issue #25688: Fixed file leak in ElementTree.iterparse() raising an error. 2015-11-23 15:45:12 +02:00
Serhiy Storchaka e3d4ec4766 Issue #25688: Fixed file leak in ElementTree.iterparse() raising an error. 2015-11-23 15:44:03 +02:00
Martin Panter 982a08f8bb Issue #25047: Merge Element Tree encoding from 3.4 into 3.5 2015-09-23 01:43:08 +00:00
Martin Panter 89f76d3f91 Issue #25047: Respect case writing XML encoding declarations
This restores the ability to write encoding names in uppercase like "UTF-8",
which worked in Python 2.
2015-09-23 01:14:35 +00:00
Serhiy Storchaka 08448a1f4d Issue #23326: Removed __ne__ implementations. Since fixing default __ne__
implementation in issue #21408 they are redundant.
2015-01-31 12:05:05 +02:00
Serhiy Storchaka 83000a490a Removed duplicated words in in comments and docs. 2014-12-01 18:30:14 +02:00
Serhiy Storchaka 56a6d855e2 Removed duplicated words in in comments and docs. 2014-12-01 18:28:43 +02:00
Serhiy Storchaka 465e60e654 Issue #22033: Reprs of most Python implemened classes now contain actual
class name instead of hardcoded one.
2014-07-25 23:36:00 +03:00
R David Murray 410d320703 whatsnew: XMLPullParser, plus some doc updates.
I was confused by the text saying that read_events "iterated", since it
actually returns an iterator (that's what a generator does) that the
caller must then iterate.  So I tidied up the language.  I'm not sure
what the sentence "Events provided in a previous call to read_events()
will not be yielded again." is trying to convey, so I didn't try to fix that.

Also fixed a couple more news items.
2014-01-04 23:52:50 -05:00