cpython

Commit Graph

Author	SHA1	Message	Date
Motoki Naruse	3358d589fb	bpo-30629: Remove second call of str.lower() in html.parser.parse_endtag. (#2099) elem is the result of .lower() 6 lines above the handle_endtag call. Patch by Motoki Naruse	2017-06-16 21:15:25 -04:00
Serhiy Storchaka	c842efc6ae	Revert "Fixed a typo in the HTMLParser.feed docstrings" (#1771) * Revert "Fixed a typo in the HTMLParser.feed docstrings. The docstring started with an 'r', like a The docstring was correct. I read the patch in opposite direction, as adding the "r" prefix. This reverts commit `5ba185039f`.	2017-05-24 07:20:45 +03:00
Jani Šumak	5ba185039f	Fixed a typo in the HTMLParser.feed docstrings. The docstring started with an 'r', like a rawstring. (#1759)	2017-05-23 16:40:54 +03:00
R David Murray	44b548dda8	#27364: fix "incorrect" uses of escape character in the stdlib. And most of the tools. Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and Martin Panter.	2016-09-08 13:59:53 -04:00
Martin Panter	46f50726a0	Issue #27076: Doc, comment and tests spelling fixes Most fixes to Doc/ and Lib/ directories by Ville Skyttä.	2016-05-26 05:35:26 +00:00
Martin Panter	4827e488a4	Merge spelling fixes from 3.4 into 3.5	2015-10-31 12:16:18 +00:00
Martin Panter	1f1177d69a	Fix some spelling errors in documentation and code comments	2015-10-31 11:48:53 +00:00
Ezio Melotti	20a2c6482e	#23144: merge with 3.4.	2015-09-06 21:44:45 +03:00
Ezio Melotti	6f2bb98966	#23144: Make sure that HTMLParser.feed() returns all the data, even when convert_charrefs is True.	2015-09-06 21:38:06 +03:00
Serhiy Storchaka	82e07b92b3	Issue #23181: More "codepoint" -> "code point".	2015-01-18 11:33:31 +02:00
Serhiy Storchaka	d3faf43f9b	Issue #23181: More "codepoint" -> "code point".	2015-01-18 11:28:37 +02:00
Ezio Melotti	6fc16d81af	#21047: set the default value for the convert_charrefs argument of HTMLParser to True. Patch by Berker Peksag.	2014-08-02 18:36:12 +03:00
Ezio Melotti	11bec7a1b8	Add an __all__ to html.entities.	2014-08-02 15:15:02 +03:00
Ezio Melotti	73a4359eb0	#15114: the strict mode and argument of HTMLParser, HTMLParser.error, and the HTMLParserError exception have been removed.	2014-08-02 14:10:30 +03:00
Ezio Melotti	153d97b24e	#20288: merge with 3.3.	2014-02-01 21:22:26 +02:00
Ezio Melotti	f27b9a741a	#20288: fix handling of invalid numeric charrefs in HTMLParser.	2014-02-01 21:21:01 +02:00
Ezio Melotti	95401c5f6b	#13633: Added a new convert_charrefs keyword arg to HTMLParser that, when True, automatically converts all character references.	2013-11-23 19:52:05 +02:00
Ezio Melotti	f6de9eb2bb	#19688: add back and deprecate the internal HTMLParser.unescape() method.	2013-11-22 05:49:29 +02:00
Ezio Melotti	4a9ee26750	#2927: Added the unescape() function to the html module.	2013-11-19 20:28:45 +02:00
Ezio Melotti	b7038817fe	#19480: merge with 3.3.	2013-11-07 18:35:27 +02:00
Ezio Melotti	7165d8b9ba	#19480: HTMLParser now accepts all valid start-tag names as defined by the HTML5 standard.	2013-11-07 18:33:24 +02:00
Ezio Melotti	88ebfb129b	#15114: The html.parser module now raises a DeprecationWarning when the strict argument of HTMLParser or the HTMLParser.error method are used.	2013-11-02 17:08:24 +02:00
Ezio Melotti	4603487dc9	#18020: improve html.escape speed by an order of magnitude. Patch by Matt Bryant.	2013-07-07 11:11:24 +02:00
Ezio Melotti	f6ca26fbff	#17802: merge with 3.3.	2013-05-01 16:20:00 +03:00
Ezio Melotti	8e596a765c	#17802: Fix an UnboundLocalError in html.parser. Initial tests by Thomas Barlow.	2013-05-01 16:18:25 +03:00
Ezio Melotti	1698babd1b	#14679: add an __all__ (that contains only HTMLParser) to html.parser.	2013-05-01 16:09:34 +03:00
Ezio Melotti	e6e96eea51	#16245: Fix the value of a few entities in html.entities.html5.	2012-10-23 15:51:27 +02:00
Ezio Melotti	518dbfd7b5	Reorder html.entities.html5 entities to make updates easier. Patch by Iuliia Proskurnia.	2012-10-23 14:45:58 +02:00
Ezio Melotti	46495182d0	#15156: HTMLParser now uses the new "html.entities.html5" dictionary.	2012-06-24 22:02:56 +02:00
Ezio Melotti	dc44f55cc9	#11113: add a new "html5" dictionary containing the named character references defined by the HTML5 standard and the equivalent Unicode character(s) to the html.entities module.	2012-06-24 04:37:41 +02:00
Ezio Melotti	3861d8b271	#15114: the strict mode of HTMLParser and the HTMLParseError exception are deprecated now that the parser is able to parse invalid markup.	2012-06-23 15:27:51 +02:00
Ezio Melotti	0780b6bc58	#14538: HTMLParser can now parse correctly start tags that contain a bare /.	2012-04-18 19:18:22 -06:00
Ezio Melotti	29877e8e04	HTMLParser is now able to handle slashes in the start tag.	2012-02-21 09:25:00 +02:00
Ezio Melotti	e31ddedb0e	Fix an index and clean up comments.	2012-02-13 20:20:00 +02:00
Ezio Melotti	f4ab491901	Improve handling of declarations in HTMLParser.	2012-02-13 15:50:37 +02:00
Ezio Melotti	5211ffe4df	#13993: HTMLParser is now able to handle broken end tags when strict=False.	2012-02-13 11:24:50 +02:00
Ezio Melotti	fa3702dc28	#13960: HTMLParser is now able to handle broken comments when strict=False.	2012-02-10 10:45:44 +02:00
Ezio Melotti	15cb489234	#13358: HTMLParser now calls handle_data only once for each CDATA.	2011-11-18 18:01:49 +02:00
Ezio Melotti	c2fe57762b	#1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser.	2011-11-14 18:53:33 +02:00
Ezio Melotti	7de56f6a04	#670664: Fix HTMLParser to correctly handle the content of ``<script>...</script>`` and ``<style>...</style>``.	2011-11-01 14:12:22 +02:00
Ezio Melotti	f50ffa94ab	#13273: fix a bug that prevented HTMLParser to properly detect some tags when strict=False.	2011-10-28 13:21:09 +03:00
Senthil Kumaran	d71bbf9fd5	Fix issue12938 - Update the docstring of html.escape. Include the information on single quote.	2011-09-13 07:14:13 +08:00
Ezio Melotti	d9e0b068af	#12888: Fix a bug in HTMLParser.unescape that prevented it to escape more than 128 entities. Patch by Peter Otten.	2011-09-05 17:11:06 +03:00
Éric Araujo	51b7aedadd	Merge 3.1	2011-05-25 18:13:49 +02:00
Éric Araujo	39f180bb1f	Fix display of html.parser.HTMLParser.feed docstring	2011-05-04 15:55:47 +02:00
Ezio Melotti	2e3607c1e7	#7311: fix html.parser to accept non-ASCII attribute values.	2011-04-07 22:03:31 +03:00
Senthil Kumaran	6c85838489	Merged revisions 87542 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k ........ r87542 | senthil.kumaran | 2010-12-28 23:55:16 +0800 (Tue, 28 Dec 2010) | 3 lines Fix Issue10759 - html.parser.unescape() fails on HTML entities with incorrect syntax ........	2010-12-28 16:10:56 +00:00
Senthil Kumaran	164540fee1	Fix Issue10759 - html.parser.unescape() fails on HTML entities with incorrect syntax	2010-12-28 15:55:16 +00:00
R. David Murray	b579dba119	#1486713: Add a tolerant mode to HTMLParser. The motivation for adding this option is that the the functionality it provides used to be provided by sgmllib in Python2, and was used by, for example, BeautifulSoup. Without this option, the Python3 version of BeautifulSoup and the many programs that use it are crippled. The original patch was by 'kxroberto'. I modified it heavily but kept his heuristics and test. I also added additional heuristics to fix #975556, #1046092, and part of #6191. This patch should be completely backward compatible: the behavior with the default strict=True is unchanged.	2010-12-03 04:06:39 +00:00
Georg Brandl	1f7fffb308	#2830: add html.escape() helper and move cgi.escape() uses in the standard library to it. It defaults to quote=True and also escapes single quotes, which makes casual use safer. The cgi.escape() interface is not touched, but emits a (silent) PendingDeprecationWarning.	2010-10-15 15:57:45 +00:00

55 Commits