diff --git a/Doc/library/html.entities.rst b/Doc/library/html.entities.rst
new file mode 100644
index 00000000000..0b2efeb48c9
--- /dev/null
+++ b/Doc/library/html.entities.rst
@@ -0,0 +1,30 @@
+:mod:`html.entities` --- Definitions of HTML general entities
+=============================================================
+
+.. module:: html.entities
+   :synopsis: Definitions of HTML general entities.
+.. sectionauthor:: Fred L. Drake, Jr.
+
+
+This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
+and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
+provide the :attr:`entitydefs` member of the :class:`html.parser.HTMLParser`
+class. The definition provided here contains all the entities defined by XHTML
+1.0 that can be handled using simple textual substitution in the Latin-1
+character set (ISO-8859-1).
+
+
+.. data:: entitydefs
+
+   A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
+   ISO Latin-1.
+
+
+.. data:: name2codepoint
+
+   A dictionary that maps HTML entity names to the Unicode codepoints.
+
+
+.. data:: codepoint2name
+
+   A dictionary that maps Unicode codepoints to HTML entity names.
diff --git a/Doc/library/htmlparser.rst b/Doc/library/html.parser.rst
similarity index 97%
rename from Doc/library/htmlparser.rst
rename to Doc/library/html.parser.rst
index 4bfb2878403..fe6748a2775 100644
--- a/Doc/library/htmlparser.rst
+++ b/Doc/library/html.parser.rst
@@ -1,4 +1,3 @@
-
 :mod:`html.parser` --- Simple HTML and XHTML parser
 ===================================================
 
@@ -6,7 +5,9 @@
    :synopsis: A simple parser that can handle HTML and XHTML.
 
 
-.. index:: HTML, XHTML
+.. index::
+   single: HTML
+   single: XHTML
 
 This module defines a class :class:`HTMLParser` which serves as the basis for
 parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.
@@ -87,8 +88,8 @@ An exception is defined as well:
    HREF="http://www.cwi.nl/">``, this method would be called as
    ``handle_starttag('a', [('href', 'http://www.cwi.nl/')])``.
 
-   All entity references from :mod:`html.entities` are replaced in the
-   attribute values.
+   All entity references from :mod:`html.entities` are replaced in the attribute
+   values.
 
 
 .. method:: HTMLParser.handle_startendtag(tag, attrs)
@@ -171,8 +172,8 @@ As a basic example, below is a very basic HTML parser that uses the
    class MyHTMLParser(HTMLParser):
 
        def handle_starttag(self, tag, attrs):
-           print("Encountered the beginning of a %s tag" % tag)
+           print "Encountered the beginning of a %s tag" % tag
 
        def handle_endtag(self, tag):
-           print("Encountered the end of a %s tag" % tag)
+           print "Encountered the end of a %s tag" % tag
 
diff --git a/Doc/library/htmllib.rst b/Doc/library/htmllib.rst
index 34423a04471..5e6554a74c1 100644
--- a/Doc/library/htmllib.rst
+++ b/Doc/library/htmllib.rst
@@ -145,36 +145,3 @@ additional methods and instance variables for use within tag methods.
    call to :meth:`save_bgn`. If the :attr:`nofill` flag is false, whitespace is
    collapsed to single spaces. A call to this method without a preceding call to
    :meth:`save_bgn` will raise a :exc:`TypeError` exception.
-
-
-:mod:`html.entities` --- Definitions of HTML general entities
-=============================================================
-
-.. module:: html.entities
-   :synopsis: Definitions of HTML general entities.
-.. sectionauthor:: Fred L. Drake, Jr.
-
-
-This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
-and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
-provide the :attr:`entitydefs` member of the :class:`HTMLParser` class. The
-definition provided here contains all the entities defined by XHTML 1.0 that
-can be handled using simple textual substitution in the Latin-1 character set
-(ISO-8859-1).
-
-
-.. data:: entitydefs
-
-   A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
-   ISO Latin-1.
-
-
-.. data:: name2codepoint
-
-   A dictionary that maps HTML entity names to the Unicode codepoints.
-
-
-.. data:: codepoint2name
-
-   A dictionary that maps Unicode codepoints to HTML entity names.
-
diff --git a/Doc/library/markup.rst b/Doc/library/markup.rst
index 19ce7b9e735..1a900311bb8 100644
--- a/Doc/library/markup.rst
+++ b/Doc/library/markup.rst
@@ -21,7 +21,8 @@ definition of the Python bindings for the DOM and SAX interfaces.
 
 .. toctree::
 
-   htmlparser.rst
+   html.parser.rst
+   html.entities.rst
    sgmllib.rst
    htmllib.rst
    pyexpat.rst