Allowed '=' and '~' in unquoted attribute values.
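As an illustration only, the effect on attribute scanning can be sketched with a regular expression. The pattern, the names NAME, UNQUOTED and ATTR, and the helper parse_attrs() are assumptions made for this sketch, not the module's actual code; the exact character class the parser uses may differ.

```python
import re

# Hypothetical patterns for this sketch; the real parser's pattern may differ.
NAME = r"[A-Za-z_][-.A-Za-z0-9_]*"
# Unquoted values: note that '=' and '~' are part of the accepted set.
UNQUOTED = r"[-A-Za-z0-9./:+*%?!()_#=~]+"
ATTR = re.compile(
    r"\s*(" + NAME + r")(?:\s*=\s*('[^']*'|\"[^\"]*\"|" + UNQUOTED + r"))?"
)


def parse_attrs(text):
    """Return a list of (name, value) pairs found in an attribute string."""
    attrs = []
    pos = 0
    while pos < len(text):
        m = ATTR.match(text, pos)
        if not m or m.end() == pos:
            break  # nothing more that looks like an attribute
        name, value = m.group(1), m.group(2)
        if value is None:
            value = name  # attribute given without a value
        elif value[:1] in ("'", '"'):
            value = value[1:-1]  # strip the surrounding quotes
        attrs.append((name.lower(), value))
        pos = m.end()
    return attrs
```

With this sketch, an unquoted URL such as href=http://host/~user/q?a=1 is kept as one value instead of being cut at the '~' or the second '='.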
Added overridable methods handle_starttag(tag, method, attrs) and
handle_endtag(tag, method) so subclasses can decide whether they
really want to call the method (e.g. when suppressing some portion of
the document).
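A minimal sketch of how a subclass might use these hooks to suppress part of a document. MiniParser is a hypothetical stand-in written for this example (it is not the real parser class), and dispatch_start(), dispatch_end() and the suppress flag are assumed names; only the handle_starttag(tag, method, attrs) and handle_endtag(tag, method) signatures come from the change described above.

```python
class MiniParser:
    """Hypothetical stand-in for the parser; only the dispatch hooks are shown."""

    def __init__(self):
        self.output = []

    def dispatch_start(self, tag, attrs):
        # Look up start_<tag> and let the hook decide whether to call it.
        method = getattr(self, "start_" + tag, None)
        if method is not None:
            self.handle_starttag(tag, method, attrs)

    def dispatch_end(self, tag):
        # Look up end_<tag> and let the hook decide whether to call it.
        method = getattr(self, "end_" + tag, None)
        if method is not None:
            self.handle_endtag(tag, method)

    # Overridable hooks: the default behaviour simply calls the method.
    def handle_starttag(self, tag, method, attrs):
        method(attrs)

    def handle_endtag(self, tag, method):
        method()

    def start_b(self, attrs):
        self.output.append("<b>")

    def end_b(self):
        self.output.append("</b>")


class SuppressingParser(MiniParser):
    """Skips all tag methods while self.suppress is true."""

    suppress = False

    def handle_starttag(self, tag, method, attrs):
        if not self.suppress:
            method(attrs)

    def handle_endtag(self, tag, method):
        if not self.suppress:
            method()
```

The point of passing the bound method to the hook is that the subclass does not need to repeat the lookup; it only decides whether the call happens.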
Added support for a number of SGML shortcuts:
    shorthand           full notation
    <tag>...<>...       <tag>...<tag>...
    <tag>...</>         <tag>...</tag>
    <tag/.../           <tag>...</tag>
    <tag1<tag2>         <tag1><tag2>
    </tag1</tag2>       </tag1></tag2>
    </tag1<tag2>        </tag1><tag2>
This required factoring out some common actions and rationalizing the
interface to parse_endtag(), so as to make the code more readable.
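The first two shortcuts in the table (the empty start tag and the empty end tag) can be illustrated with a small stand-alone sketch. The function expand_empty_tags() and the TAG pattern are hypothetical names for this example; it handles only attribute-free tags and is not how the parser itself is structured.

```python
import re

# Matches <name>, </name>, <> and </> (attribute-free tags only).
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)?>")


def expand_empty_tags(text):
    """Rewrite '<>' and '</>' into their full notation."""
    stack = []   # names of currently open elements
    last = None  # most recently opened tag, reused by '<>'
    out = []
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])
        pos = m.end()
        closing, name = m.group(1), m.group(2)
        if closing:
            if name is None and stack:
                name = stack[-1]  # '</>' closes the innermost open element
            if stack and stack[-1] == name:
                stack.pop()
            # With nothing open, '</>' is left untouched.
            out.append("</%s>" % name if name else m.group(0))
        else:
            if name is None:
                name = last  # '<>' repeats the last start tag
            if name is None:
                out.append(m.group(0))  # no previous tag to repeat
                continue
            stack.append(name)
            last = name
            out.append("<%s>" % name)
    out.append(text[pos:])
    return "".join(out)
```

For example, expand_empty_tags("<li>a<>b") gives "<li>a<li>b", and expand_empty_tags("<b>x</>") gives "<b>x</b>".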
Fixed syntax for &entity and &#char references so the trailing
semicolon is optional; removed explicit support for trailing period
(which was a TBL mistake in HTML 0.0).
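A rough sketch of reference matching with an optional trailing semicolon and no special treatment of a trailing period. The REF pattern and replace_refs() are assumptions for illustration; entity names are looked up in the standard html.entities table, which is a choice made for this example rather than what the parser does.

```python
import re
from html.entities import name2codepoint

# '&#' digits or '&' name, followed by an optional ';'.
REF = re.compile(r"&(?:#([0-9]+)|([A-Za-z][A-Za-z0-9]*));?")


def replace_refs(text):
    """Replace character and entity references, with or without ';'."""

    def repl(m):
        if m.group(1):
            # Numeric character reference (assumes a valid code point).
            return chr(int(m.group(1)))
        codepoint = name2codepoint.get(m.group(2))
        # Unknown entity names are left exactly as written.
        return chr(codepoint) if codepoint is not None else m.group(0)

    return REF.sub(repl, text)
```

Because the period is no longer treated as a terminator, "&lt." becomes "<." with the period kept as ordinary text.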
Generalized the test program.
Tried to speed things up a little. (More to come after the profile
results are in.)
Fixed error recovery: call the end method of each element popped from
the stack instead of the end method of the tag that triggered the
recovery. (Plus some complications because of the way HTML extensions
are handled in Grail.)
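The recovery behaviour can be sketched as follows. close_tag() and its call_end callback are hypothetical names for this example, and the Grail-specific complications are left out entirely.

```python
def close_tag(stack, tag, call_end):
    """Pop open elements down to `tag`, ending each popped element.

    `stack` is a list of open tag names (innermost last) and `call_end`
    is a callable invoked with the name of every element that is closed.
    """
    if tag not in stack:
        return  # unmatched end tag: nothing to recover from
    while stack:
        popped = stack.pop()
        # End the element that was actually popped, not the one named
        # by the end tag that started the recovery.
        call_end(popped)
        if popped == tag:
            break
```

With the stack ["html", "ul", "li", "b"], an end tag for "ul" ends "b", "li" and "ul" in that order and leaves ["html"] open.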