Commit Graph

21 Commits

Author SHA1 Message Date
R David Murray 87cbfb20fb #24926: Fix typo in example. 2015-08-24 12:55:03 -04:00
Ezio Melotti 95401c5f6b #13633: Added a new convert_charrefs keyword arg to HTMLParser that, when True, automatically converts all character references. 2013-11-23 19:52:05 +02:00
Ezio Melotti 88ebfb129b #15114: The html.parser module now raises a DeprecationWarning when the strict argument of HTMLParser or the HTMLParser.error method are used. 2013-11-02 17:08:24 +02:00
Georg Brandl 61063cca6a Fix a couple of versionadded/versionchanged related markup errors. 2012-06-24 22:48:30 +02:00
Ezio Melotti 3861d8b271 #15114: the strict mode of HTMLParser and the HTMLParseError exception are deprecated now that the parser is able to parse invalid markup. 2012-06-23 15:27:51 +02:00
Ezio Melotti 4279bc7aef #14020: improve HTMLParser documentation. 2012-02-18 02:01:36 +02:00
Ezio Melotti 7de56f6a04 #670664: Fix HTMLParser to correctly handle the content of ``<script>...</script>`` and ``<style>...</style>``. 2011-11-01 14:12:22 +02:00
Ezio Melotti f99e4b5dbe Improve HTMLParser example in the doc and fix a couple minor things. 2011-10-28 14:34:56 +03:00
Georg Brandl 2f5cac6ab4 Merge to 3.2. 2011-03-21 08:55:31 +01:00
Georg Brandl 82d8ec5eef Fix duplicate word. 2011-03-21 08:55:16 +01:00
Raymond Hettinger a199368b23 More source links. 2011-01-27 01:20:32 +00:00
R. David Murray bb7b753cfc Add missing versionchanged, correct 'throw' wording to 'raise'. 2010-12-03 04:26:18 +00:00
R. David Murray b579dba119 #1486713: Add a tolerant mode to HTMLParser.
The motivation for adding this option is that the functionality it
provides used to be provided by sgmllib in Python2, and was used by,
for example, BeautifulSoup.  Without this option, the Python3 version
of BeautifulSoup and the many programs that use it are crippled.

The original patch was by 'kxroberto'.  I modified it heavily but kept his
heuristics and test.  I also added additional heuristics to fix #975556,
#1046092, and part of #6191.  This patch should be completely backward
compatible:  the behavior with the default strict=True is unchanged.
2010-12-03 04:06:39 +00:00
Georg Brandl 13f959b501 Merged revisions 83561,83563,83565-83566,83569,83571,83574-83575,83580,83584,83599,83612,83659,83977,84015-84018,84020,84141 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r83561 | georg.brandl | 2010-08-02 22:17:50 +0200 (Mo, 02 Aug 2010) | 1 line

  #4280: remove outdated "versionchecker" tool.
........
  r83563 | georg.brandl | 2010-08-02 22:21:21 +0200 (Mo, 02 Aug 2010) | 1 line

  #9037: add example how to raise custom exceptions from C code.
........
  r83565 | georg.brandl | 2010-08-02 22:27:20 +0200 (Mo, 02 Aug 2010) | 1 line

  #9111: document that do_help() looks at docstrings.
........
  r83566 | georg.brandl | 2010-08-02 22:30:57 +0200 (Mo, 02 Aug 2010) | 1 line

  #9019: remove false (in 3k) claim about Headers updates.
........
  r83569 | georg.brandl | 2010-08-02 22:39:35 +0200 (Mo, 02 Aug 2010) | 1 line

  #7797: be explicit about bytes-oriented interface of base64 functions.
........
  r83571 | georg.brandl | 2010-08-02 22:44:34 +0200 (Mo, 02 Aug 2010) | 1 line

  Clarify that abs() is not a namespace.
........
  r83574 | georg.brandl | 2010-08-02 22:47:56 +0200 (Mo, 02 Aug 2010) | 1 line

  #6867: epoll.register() returns None.
........
  r83575 | georg.brandl | 2010-08-02 22:52:10 +0200 (Mo, 02 Aug 2010) | 1 line

  #9238: zipfile does handle archive comments.
........
  r83580 | georg.brandl | 2010-08-02 23:02:36 +0200 (Mo, 02 Aug 2010) | 1 line

  #8119: fix copy-paste error.
........
  r83584 | georg.brandl | 2010-08-02 23:07:14 +0200 (Mo, 02 Aug 2010) | 1 line

  #9457: fix documentation links for 3.2.
........
  r83599 | georg.brandl | 2010-08-02 23:51:18 +0200 (Mo, 02 Aug 2010) | 1 line

  #9061: warn that single quotes are never escaped.
........
  r83612 | georg.brandl | 2010-08-03 00:59:44 +0200 (Di, 03 Aug 2010) | 1 line

  Fix unicode literal.
........
  r83659 | georg.brandl | 2010-08-03 14:06:29 +0200 (Di, 03 Aug 2010) | 1 line

  Terminology fix: exceptions are raised, except in generator.throw().
........
  r83977 | georg.brandl | 2010-08-13 17:10:49 +0200 (Fr, 13 Aug 2010) | 1 line

  Fix copy-paste error.
........
  r84015 | georg.brandl | 2010-08-14 17:44:34 +0200 (Sa, 14 Aug 2010) | 1 line

  Add some maintainers.
........
  r84016 | georg.brandl | 2010-08-14 17:46:15 +0200 (Sa, 14 Aug 2010) | 1 line

  Wording fix.
........
  r84017 | georg.brandl | 2010-08-14 17:46:59 +0200 (Sa, 14 Aug 2010) | 1 line

  Typo fix.
........
  r84018 | georg.brandl | 2010-08-14 17:48:49 +0200 (Sa, 14 Aug 2010) | 1 line

  Typo fix.
........
  r84020 | georg.brandl | 2010-08-14 17:57:20 +0200 (Sa, 14 Aug 2010) | 1 line

  Fix format.
........
  r84141 | georg.brandl | 2010-08-17 16:11:59 +0200 (Di, 17 Aug 2010) | 1 line

  Markup nits.
........
2010-10-06 08:35:38 +00:00
Georg Brandl 7cb1319688 Terminology fix: exceptions are raised, except in generator.throw(). 2010-08-03 12:06:29 +00:00
Georg Brandl feb4c88014 Merged revisions 83223 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r83223 | georg.brandl | 2010-07-29 15:38:37 +0200 (Do, 29 Jul 2010) | 1 line

  #3874: document HTMLParser.unknown_decl().
........
2010-08-01 21:09:54 +00:00
Georg Brandl 46aa5c5ba1 #3874: document HTMLParser.unknown_decl(). 2010-07-29 13:38:37 +00:00
Georg Brandl f78dd3452d Merged revisions 73592,73823 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r73592 | ezio.melotti | 2009-06-28 00:58:15 +0200 (So, 28 Jun 2009) | 1 line

  Updated the last example as requested in #6350
........
  r73823 | ezio.melotti | 2009-07-04 03:14:30 +0200 (Sa, 04 Jul 2009) | 1 line

  #6398 typo: versio. -> version.
........
2009-08-13 08:58:24 +00:00
Ezio Melotti 2fad00c198 Updated the last example as requested in #6350 2009-06-27 22:58:15 +00:00
Georg Brandl 877b10add4 Remove the htmllib and sgmllib modules as per PEP 3108. 2008-06-01 21:25:55 +00:00
Georg Brandl 9087b7f83b Merged revisions 63438 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r63438 | georg.brandl | 2008-05-17 23:54:03 +0200 (Sat, 17 May 2008) | 3 lines

  Rename html.parser file, and split html.entities from htmllib
  to ease removal of the latter in Py3k.
........
2008-05-18 07:53:01 +00:00
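
Note on commit 95401c5f6b (#13633): the message says the new convert_charrefs keyword argument, when True, automatically converts all character references. The following is a minimal sketch of what that looks like from a subclass's point of view, assuming Python 3.4 or later. The Collector class, the parse helper and the sample markup are hypothetical names made up for this illustration; they do not come from any commit listed here.

```python
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Hypothetical helper: records which handler methods the parser calls."""

    def __init__(self, *, convert_charrefs):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Called for named references such as &amp; only when
        # convert_charrefs is False.
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # Called for numeric references such as &#62; only when
        # convert_charrefs is False.
        self.events.append(("charref", name))


def parse(markup, convert_charrefs):
    """Feed markup to a fresh Collector and return the recorded events."""
    parser = Collector(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()
    return parser.events


SAMPLE = "<p>a &amp; b &#62; c</p>"  # illustrative input only

converted = parse(SAMPLE, convert_charrefs=True)
raw = parse(SAMPLE, convert_charrefs=False)

# Text delivered to handle_data in each mode.
converted_text = "".join(value for kind, value in converted if kind == "data")
raw_text = "".join(value for kind, value in raw if kind == "data")
```

With convert_charrefs=True the references reach handle_data already converted, and the two reference handlers are not called. With convert_charrefs=False the references are reported separately through handle_entityref and handle_charref, and the subclass has to convert them itself.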