Links are now either in 'todo' or 'done', and ext links
are handled more like local links except that no further
links are gathered (and sometimes they aren't checked,
e.g. for mailto and news URLs). The -x option reverses
its meaning: it disables checking of ext links (they are
moved to 'done' without checking). A new 'errors' table
collects pages with bad links as we go -- redundant,
but useful for the GUI version which needs to report
this as we go. Some new methods, including reset().
New checkpoint format.
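
A minimal sketch of the bookkeeping described above, assuming plain
dictionaries for the three tables. The class name LinkTables and the
methods new_link() and mark_bad() are hypothetical names chosen for
illustration; only 'todo', 'done', 'errors' and reset() come from the
entry itself.

```python
class LinkTables:
    """Hypothetical stand-in for the checker's link bookkeeping."""

    def __init__(self, check_external=True):
        # check_external=False plays the role of the -x option.
        self.check_external = check_external
        self.reset()

    def reset(self):
        self.todo = {}    # link -> pages that reference it, not yet checked
        self.done = {}    # link -> pages that reference it, already handled
        self.errors = {}  # page -> list of (link, message) found so far

    def new_link(self, link, origin, external=False):
        if link in self.done:
            self.done[link].append(origin)
        elif link in self.todo:
            self.todo[link].append(origin)
        elif external and not self.check_external:
            # External links go straight to 'done' without being checked.
            self.done[link] = [origin]
        else:
            self.todo[link] = [origin]

    def mark_bad(self, link, message):
        # Move the link to 'done' and record it against every page
        # that refers to it, so errors can be reported as we go.
        origins = self.todo.pop(link)
        self.done[link] = origins
        for page in origins:
            self.errors.setdefault(page, []).append((link, message))
```

The 'errors' table duplicates information that could be derived from
the other tables, which matches the "redundant, but useful" remark.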
Adapted the GUI to the changes in the Checker class.
Added Quit and "Start over" buttons, and a checkbox
to disable checking external links. The details
window now also shows bad links emanating from the
selected page. Miscellaneous small changes.
- Faster HTML parser derived from SGMLParser (Fred Gansevles).
- All manipulations of todo, done, ext, bad are done via methods, so a
derived class can override. Also moved the 'done' marking to
dopage(), so run() is much simpler.
- Added a method status() which returns a string containing the
summary counts; added a "total" count.
- Drop the guessing of the file type before opening the document -- we
still need to check those links for validity!
- Added a subroutine to close a connection which first slurps up the
remaining data when it's an ftp URL -- apparently closing an ftp
connection without reading till the end makes it hang.
- Added -n option to skip running (only useful with -R).
- The Checker object now has an instance variable which is set to 1
whenever the object is changed. This variable is not pickled.
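
One way a "dirty" flag can be kept out of a pickle is sketched below.
This is an assumption about the general technique, not the Checker's
actual code: the class name Checkpointed, the attribute name 'changed'
and the method note() are hypothetical.

```python
class Checkpointed:
    """Hypothetical object with a change flag that is never pickled."""

    def __init__(self):
        self.done = {}
        self.changed = 0

    def note(self, link):
        # Any mutation sets the flag so the caller knows a save is needed.
        self.done[link] = 1
        self.changed = 1

    def __getstate__(self):
        # pickle calls this to get the state; leave the flag out.
        state = self.__dict__.copy()
        del state["changed"]
        return state

    def __setstate__(self, state):
        # A freshly loaded object matches its checkpoint, so it is clean.
        self.__dict__.update(state)
        self.changed = 0
```

With hooks like these, pickle.dumps() stores only the tables, and an
object restored by pickle.loads() starts out with the flag cleared.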
in the 'bad' dictionary (sanitize them so they are picklable; the
sanitation code is now a subroutine); don't check mailto: URLs; omit
colon in Error message.
removed from .mirrorinfo. Now they are (even if -r is not specified
-- the files are not removed, just their .mirrorinfo entry).
Added a feature: the -s pattern option is also used to skip local
files when removing (i.e. -r won't remove local files matching the -s
patterns).
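
The interaction between the skip patterns and removal can be sketched
as follows. This assumes shell-style patterns matched with the standard
fnmatch module; the function name removable() and its parameters are
hypothetical and are not taken from the script.

```python
from fnmatch import fnmatch

def removable(local_names, remote_names, skip_patterns):
    """Return local files that removal would delete.

    A local file is a candidate when it no longer exists remotely,
    unless it matches one of the skip patterns.
    """
    remote = set(remote_names)
    result = []
    for name in local_names:
        if name in remote:
            continue  # still present remotely, keep it
        if any(fnmatch(name, pat) for pat in skip_patterns):
            continue  # protected by a skip pattern
        result.append(name)
    return result
```

For example, with a skip pattern of "*.bak", a local "old.bak" would be
left alone even though it has no remote counterpart.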
navigation links for HTML 3 version.
Forced a blank line above the footnotes separator for HTML 2; at
least one page did not get this spaced correctly.
print section titles even when the debugging output is not enabled.
Added -3 option to generate HTML 3.0 constructs where meaningful.
Removed repetitive garbage generation: the old version added simple
descriptive comments after every datadesc/funcdesc/*desc entry:

    function(args) -- function of module xxxx
        Description....

These comments are no longer generated:

    function(args)
        Description....

to 4 spaces per level (no longer 8).
(Makefile): Use .pyc versions of partparse.py and texi2html.py to generate
converted documentation formats. This reduces the startup costs;
probably doesn't affect anyone but me in reality, but helps when
working on the docs.