mirror of https://github.com/python/cpython
Merged revisions 63438 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r63438 | georg.brandl | 2008-05-17 23:54:03 +0200 (Sat, 17 May 2008) | 3 lines

  Rename html.parser file, and split html.entities from htmllib to ease removal of the latter in Py3k.
........
parent bf93b0470a
commit 9087b7f83b
@@ -0,0 +1,30 @@
+:mod:`html.entities` --- Definitions of HTML general entities
+=============================================================
+
+.. module:: html.entities
+   :synopsis: Definitions of HTML general entities.
+.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
+
+
+This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
+and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
+provide the :attr:`entitydefs` member of the :class:`html.parser.HTMLParser`
+class. The definition provided here contains all the entities defined by XHTML
+1.0 that can be handled using simple textual substitution in the Latin-1
+character set (ISO-8859-1).
+
+
+.. data:: entitydefs
+
+   A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
+   ISO Latin-1.
+
+
+.. data:: name2codepoint
+
+   A dictionary that maps HTML entity names to the Unicode codepoints.
+
+
+.. data:: codepoint2name
+
+   A dictionary that maps Unicode codepoints to HTML entity names.
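For illustration only (this is not part of the commit), the three dictionaries documented in the hunk above can be exercised roughly as follows. This is a minimal sketch that assumes a Python 3 interpreter where ``html.entities`` is importable; the helpers ``expand_named_entities`` and ``escape_non_ascii`` are hypothetical names introduced here, not functions provided by the module.

```python
import re
from html.entities import codepoint2name, entitydefs, name2codepoint

# Direct lookups in the three documented dictionaries.
amp_codepoint = name2codepoint["amp"]   # entity name -> Unicode code point
amp_name = codepoint2name[38]           # Unicode code point -> entity name
eacute_text = entitydefs["eacute"]      # entity name -> replacement text


def expand_named_entities(text):
    """Replace known named references such as ``&amp;`` (hypothetical helper).

    Unknown names are left untouched rather than guessed at.
    """
    def _substitute(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return match.group(0)

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", _substitute, text)


def escape_non_ascii(text):
    """Write non-ASCII characters as named references where one exists
    (hypothetical helper); characters without a known name pass through."""
    pieces = []
    for char in text:
        name = codepoint2name.get(ord(char))
        if ord(char) > 127 and name is not None:
            pieces.append("&%s;" % name)
        else:
            pieces.append(char)
    return "".join(pieces)


if __name__ == "__main__":
    print(amp_codepoint, amp_name, eacute_text)
    print(expand_named_entities("caf&eacute; &amp; bar &bogus;"))
    print(escape_non_ascii("caf\xe9"))
```

The sketch sticks to entities in the Latin-1 range, which is the range the documentation above describes for ``entitydefs``.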
@@ -1,4 +1,3 @@
-
 :mod:`html.parser` --- Simple HTML and XHTML parser
 ===================================================

@@ -6,7 +5,9 @@
    :synopsis: A simple parser that can handle HTML and XHTML.


-.. index:: HTML, XHTML
+.. index::
+   single: HTML
+   single: XHTML

 This module defines a class :class:`HTMLParser` which serves as the basis for
 parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.

@@ -87,8 +88,8 @@ An exception is defined as well:
    HREF="http://www.cwi.nl/">``, this method would be called as
    ``handle_starttag('a', [('href', 'http://www.cwi.nl/')])``.

-   All entity references from :mod:`html.entities` are replaced in the
-   attribute values.
+   All entity references from :mod:`html.entities` are replaced in the attribute
+   values.


 .. method:: HTMLParser.handle_startendtag(tag, attrs)

@@ -171,8 +172,8 @@ As a basic example, below is a very basic HTML parser that uses the
    class MyHTMLParser(HTMLParser):

        def handle_starttag(self, tag, attrs):
-           print("Encountered the beginning of a %s tag" % tag)
+           print "Encountered the beginning of a %s tag" % tag

        def handle_endtag(self, tag):
-           print("Encountered the end of a %s tag" % tag)
+           print "Encountered the end of a %s tag" % tag

@@ -145,36 +145,3 @@ additional methods and instance variables for use within tag methods.
    call to :meth:`save_bgn`. If the :attr:`nofill` flag is false, whitespace is
    collapsed to single spaces. A call to this method without a preceding call to
    :meth:`save_bgn` will raise a :exc:`TypeError` exception.
-
-
-:mod:`html.entities` --- Definitions of HTML general entities
-=============================================================
-
-.. module:: html.entities
-   :synopsis: Definitions of HTML general entities.
-.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
-
-
-This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
-and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
-provide the :attr:`entitydefs` member of the :class:`HTMLParser` class. The
-definition provided here contains all the entities defined by XHTML 1.0 that
-can be handled using simple textual substitution in the Latin-1 character set
-(ISO-8859-1).
-
-
-.. data:: entitydefs
-
-   A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
-   ISO Latin-1.
-
-
-.. data:: name2codepoint
-
-   A dictionary that maps HTML entity names to the Unicode codepoints.
-
-
-.. data:: codepoint2name
-
-   A dictionary that maps Unicode codepoints to HTML entity names.
-
@@ -21,7 +21,8 @@ definition of the Python bindings for the DOM and SAX interfaces.

 .. toctree::

-   htmlparser.rst
+   html.parser.rst
+   html.entities.rst
    sgmllib.rst
    htmllib.rst
    pyexpat.rst