Merged revisions 63438 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r63438 | georg.brandl | 2008-05-17 23:54:03 +0200 (Sat, 17 May 2008) | 3 lines

  Rename html.parser file, and split html.entities from htmllib
  to ease removal of the latter in Py3k.
........
parent bf93b0470a
commit 9087b7f83b
|
@ -0,0 +1,30 @@
:mod:`html.entities` --- Definitions of HTML general entities
=============================================================

.. module:: html.entities
   :synopsis: Definitions of HTML general entities.
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>


This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
provide the :attr:`entitydefs` member of the :class:`html.parser.HTMLParser`
class. The definition provided here contains all the entities defined by XHTML
1.0 that can be handled using simple textual substitution in the Latin-1
character set (ISO-8859-1).


.. data:: entitydefs

   A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
   ISO Latin-1.


.. data:: name2codepoint

   A dictionary that maps HTML entity names to the Unicode codepoints.


.. data:: codepoint2name

   A dictionary that maps Unicode codepoints to HTML entity names.
@ -1,4 +1,3 @@
|
||||||
|
|
||||||
:mod:`html.parser` --- Simple HTML and XHTML parser
|
:mod:`html.parser` --- Simple HTML and XHTML parser
|
||||||
===================================================
|
===================================================
|
||||||
|
|
||||||
|
@ -6,7 +5,9 @@
|
||||||
:synopsis: A simple parser that can handle HTML and XHTML.
|
:synopsis: A simple parser that can handle HTML and XHTML.
|
||||||
|
|
||||||
|
|
||||||
.. index:: HTML, XHTML
|
.. index::
|
||||||
|
single: HTML
|
||||||
|
single: XHTML
|
||||||
|
|
||||||
This module defines a class :class:`HTMLParser` which serves as the basis for
|
This module defines a class :class:`HTMLParser` which serves as the basis for
|
||||||
parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.
|
parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML.
|
||||||
|
@ -87,8 +88,8 @@ An exception is defined as well:
|
||||||
HREF="http://www.cwi.nl/">``, this method would be called as
|
HREF="http://www.cwi.nl/">``, this method would be called as
|
||||||
``handle_starttag('a', [('href', 'http://www.cwi.nl/')])``.
|
``handle_starttag('a', [('href', 'http://www.cwi.nl/')])``.
|
||||||
|
|
||||||
All entity references from :mod:`html.entities` are replaced in the
|
All entity references from :mod:`html.entities` are replaced in the attribute
|
||||||
attribute values.
|
values.
|
||||||
|
|
||||||
|
|
||||||
.. method:: HTMLParser.handle_startendtag(tag, attrs)
|
.. method:: HTMLParser.handle_startendtag(tag, attrs)
|
||||||
|
@ -171,8 +172,8 @@ As a basic example, below is a very basic HTML parser that uses the
|
||||||
class MyHTMLParser(HTMLParser):
|
class MyHTMLParser(HTMLParser):
|
||||||
|
|
||||||
def handle_starttag(self, tag, attrs):
|
def handle_starttag(self, tag, attrs):
|
||||||
print("Encountered the beginning of a %s tag" % tag)
|
print "Encountered the beginning of a %s tag" % tag
|
||||||
|
|
||||||
def handle_endtag(self, tag):
|
def handle_endtag(self, tag):
|
||||||
print("Encountered the end of a %s tag" % tag)
|
print "Encountered the end of a %s tag" % tag
|
||||||
|
|
|
@ -145,36 +145,3 @@ additional methods and instance variables for use within tag methods.
|
||||||
call to :meth:`save_bgn`. If the :attr:`nofill` flag is false, whitespace is
|
call to :meth:`save_bgn`. If the :attr:`nofill` flag is false, whitespace is
|
||||||
collapsed to single spaces. A call to this method without a preceding call to
|
collapsed to single spaces. A call to this method without a preceding call to
|
||||||
:meth:`save_bgn` will raise a :exc:`TypeError` exception.
|
:meth:`save_bgn` will raise a :exc:`TypeError` exception.
|
||||||
|
|
||||||
|
|
||||||
:mod:`html.entities` --- Definitions of HTML general entities
|
|
||||||
=============================================================
|
|
||||||
|
|
||||||
.. module:: html.entities
|
|
||||||
:synopsis: Definitions of HTML general entities.
|
|
||||||
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
|
|
||||||
|
|
||||||
|
|
||||||
This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
|
|
||||||
and ``entitydefs``. ``entitydefs`` is used by the :mod:`htmllib` module to
|
|
||||||
provide the :attr:`entitydefs` member of the :class:`HTMLParser` class. The
|
|
||||||
definition provided here contains all the entities defined by XHTML 1.0 that
|
|
||||||
can be handled using simple textual substitution in the Latin-1 character set
|
|
||||||
(ISO-8859-1).
|
|
||||||
|
|
||||||
|
|
||||||
.. data:: entitydefs
|
|
||||||
|
|
||||||
A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
|
|
||||||
ISO Latin-1.
|
|
||||||
|
|
||||||
|
|
||||||
.. data:: name2codepoint
|
|
||||||
|
|
||||||
A dictionary that maps HTML entity names to the Unicode codepoints.
|
|
||||||
|
|
||||||
|
|
||||||
.. data:: codepoint2name
|
|
||||||
|
|
||||||
A dictionary that maps Unicode codepoints to HTML entity names.
|
|
||||||
|
|
||||||
|
|
|
@ -21,7 +21,8 @@ definition of the Python bindings for the DOM and SAX interfaces.
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
|
|
||||||
htmlparser.rst
|
html.parser.rst
|
||||||
|
html.entities.rst
|
||||||
sgmllib.rst
|
sgmllib.rst
|
||||||
htmllib.rst
|
htmllib.rst
|
||||||
pyexpat.rst
|
pyexpat.rst
|
||||||
|
|
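Note on the new html.entities page: the ``name2codepoint`` and ``codepoint2name`` dictionaries it documents are inverse lookups, and a short round trip may make that clearer. The following is only a minimal sketch, assuming a Python 3 interpreter where the module is importable as ``html.entities``. The helper names ``decode_entities`` and ``encode_entities`` are hypothetical, chosen for illustration, and are not part of this commit or of the module.

```python
import re
from html.entities import codepoint2name, name2codepoint

# Hypothetical helper (not part of html.entities): expand named references.
def decode_entities(text):
    """Replace references such as &eacute; with the character they name."""
    def replace(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return match.group(0)  # unknown names are left untouched
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)

# Hypothetical helper (not part of html.entities): the reverse direction.
def encode_entities(text):
    """Replace every character that has a named entity with &name;."""
    pieces = []
    for char in text:
        name = codepoint2name.get(ord(char))
        pieces.append("&%s;" % name if name is not None else char)
    return "".join(pieces)

if __name__ == "__main__":
    print(decode_entities("caf&eacute; &amp; bar"))
    print(encode_entities("caf\u00e9 & bar"))
```

The decoder makes a single pass, so text such as ``&amp;lt;`` becomes ``&lt;`` rather than ``<``. Numeric character references (``&#233;``) are deliberately out of scope here, since the dictionaries only cover named entities.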