Merged revisions 63438 via svnmerge from

svn+ssh://pythondev@svn.python.org/python/trunk

........
  r63438 | georg.brandl | 2008-05-17 23:54:03 +0200 (Sat, 17 May 2008) | 3 lines

  Rename html.parser file, and split html.entities from htmllib
  to ease removal of the latter in Py3k.
........
Georg Brandl 2008-05-18 07:53:01 +00:00
parent bf93b0470a
commit 9087b7f83b
4 changed files with 39 additions and 40 deletions

@@ -0,0 +1,30 @@
:mod:`html.entities` --- Definitions of HTML general entities
=============================================================
.. module:: html.entities
:synopsis: Definitions of HTML general entities.
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
provide the :attr:`entitydefs` member of the :class:`html.parser.HTMLParser`
class. The definition provided here contains all the entities defined by XHTML
1.0 that can be handled using simple textual substitution in the Latin-1
character set (ISO-8859-1).
.. data:: entitydefs
A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
ISO Latin-1.
.. data:: name2codepoint
A dictionary that maps HTML entity names to the Unicode codepoints.
.. data:: codepoint2name
A dictionary that maps Unicode codepoints to HTML entity names.
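The relationship between the three dictionaries can be sketched as follows. This is a minimal illustration that assumes a current Python 3 interpreter, where ``html.entities`` is importable under that name; the ``decode_entity`` helper is hypothetical and not part of the module.

```python
from html.entities import codepoint2name, entitydefs, name2codepoint


def decode_entity(name):
    """Return the character for an entity name, or None if it is unknown.

    Hypothetical helper for illustration; not part of html.entities.
    """
    codepoint = name2codepoint.get(name)
    return chr(codepoint) if codepoint is not None else None


# name2codepoint and codepoint2name map in opposite directions.
amp_codepoint = name2codepoint["amp"]
amp_name = codepoint2name[amp_codepoint]

print(amp_codepoint, amp_name)
print(decode_entity("eacute"))
print(entitydefs["lt"])
```

Looking names up with ``dict.get`` keeps unknown entity names from raising :exc:`KeyError`, which is usually what a caller wants when handling untrusted markup.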

@@ -1,4 +1,3 @@
:mod:`html.parser` --- Simple HTML and XHTML parser
===================================================
@@ -6,7 +5,9 @@
:synopsis: A simple parser that can handle HTML and XHTML.
.. index:: HTML, XHTML
.. index::
single: HTML
single: XHTML
This module defines a class :class:`HTMLParser` which serves as the basis for
parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.
@@ -87,8 +88,8 @@ An exception is defined as well:
HREF="http://www.cwi.nl/">``, this method would be called as
``handle_starttag('a', [('href', 'http://www.cwi.nl/')])``.
All entity references from :mod:`html.entities` are replaced in the
attribute values.
All entity references from :mod:`html.entities` are replaced in the attribute
values.
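The entity replacement in attribute values can be observed with a small sketch, assuming a current Python 3 interpreter. The ``AttrCollector`` class and the query string appended to the URL are assumptions made for illustration; they do not appear in the original text.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical subclass that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # tag and attribute names arrive lowercased; entity references
        # in attribute values have already been replaced.
        self.seen.append((tag, attrs))


collector = AttrCollector()
collector.feed('<A HREF="http://www.cwi.nl/?a=1&amp;b=2">')
collector.close()
print(collector.seen)
```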
.. method:: HTMLParser.handle_startendtag(tag, attrs)
@@ -171,8 +172,8 @@ As a basic example, below is a very basic HTML parser that uses the
class MyHTMLParser(HTMLParser):
def handle_starttag(self, tag, attrs):
print("Encountered the beginning of a %s tag" % tag)
print "Encountered the beginning of a %s tag" % tag
def handle_endtag(self, tag):
print("Encountered the end of a %s tag" % tag)
print "Encountered the end of a %s tag" % tag
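The hunk above shows both the function-call and the statement spelling of ``print``. For reference, a complete version of the example in the function-call form might look like the sketch below. It assumes a Python 3 interpreter, and the sample markup passed to ``feed()`` is an assumption that is not part of the original example.

```python
from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):

    def handle_starttag(self, tag, attrs):
        print("Encountered the beginning of a %s tag" % tag)

    def handle_endtag(self, tag):
        print("Encountered the end of a %s tag" % tag)


parser = MyHTMLParser()
# Sample input chosen for illustration only.
parser.feed("<html><body><p>Hello</p></body></html>")
parser.close()
```

Each start and end tag in the input triggers one handler call, in document order.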

@@ -145,36 +145,3 @@ additional methods and instance variables for use within tag methods.
call to :meth:`save_bgn`. If the :attr:`nofill` flag is false, whitespace is
collapsed to single spaces. A call to this method without a preceding call to
:meth:`save_bgn` will raise a :exc:`TypeError` exception.
:mod:`html.entities` --- Definitions of HTML general entities
=============================================================
.. module:: html.entities
:synopsis: Definitions of HTML general entities.
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
provide the :attr:`entitydefs` member of the :class:`HTMLParser` class. The
definition provided here contains all the entities defined by XHTML 1.0 that
can be handled using simple textual substitution in the Latin-1 character set
(ISO-8859-1).
.. data:: entitydefs
A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
ISO Latin-1.
.. data:: name2codepoint
A dictionary that maps HTML entity names to the Unicode codepoints.
.. data:: codepoint2name
A dictionary that maps Unicode codepoints to HTML entity names.

@@ -21,7 +21,8 @@ definition of the Python bindings for the DOM and SAX interfaces.
.. toctree::
htmlparser.rst
html.parser.rst
html.entities.rst
sgmllib.rst
htmllib.rst
pyexpat.rst