I'll clean that up later. Also corrected a mistake introduced by the
previous reformatting: an 'else' belonging to a 'for' was accidentally
reindented to belong to the 'if' inside the 'for'. Note that the
module uses inconsistent indentation -- most code is indented with 8
spaces, but some of the reformatted code uses 4 spaces. I'll fix this
later in the promised cleanup pass.
Links are now either in 'todo' or 'done', and ext links
are hadled more like local links except that no further
links are gathered (and sometimes they aren't checked,
e.g. for mailto and news URLs). The -x option reverses
its meaning: it disables checking of ext links (they are
moved to 'done' without checking). A new 'errors' table
collects pages with bad links as we go -- redundant,
but useful for the GUI version which needs to report
this as we go. Some new methods, including reset().
New checkpoint format.
Adapted the GUI to the changes in the Checker class.
Added Quit and "Start over" buttons, and a checkbox
to disable checking external links. The details
window now also shows bad links emanating from the
selected page. Miscellaneous small chages.
- Faster HTML parser derivede from SGMLparser (Fred Gansevles).
- All manipulations of todo, done, ext, bad are done via methods, so a
derived class can override. Also moved the 'done' marking to
dopage(), so run() is much simpler.
- Added a method status() which returns a string containing the
summary counts; added a "total" count.
- Drop the guessing of the file type before opening the document -- we
still need to check those links for validity!
- Added a subroutine to close a connection which first slurps up the
remaining data when it's an ftp URL -- apparently closing an ftp
connection without reading till the end makes it hang.
- Added -n option to skip running (only useful with -R).
- The Checker object now has an instance variable which is set to 1
when it is changed. This is not pickled.
in the 'bad' dictionary (sanitize them so they are picklable; the
sanitation code is now a subroutine); don't check mailto: URLs; omit
colon in Error message.
The table size is now constrained to be a power of two, and we use a
variable increment based on GF(2^n)-{0} (not that I have the faintest
idea what that is :-) which helps avoid the expensive '%' operation.
Some of the entries in the table of polynomials have been modified
according to a post by Tim Peters.
- Use co->... instead of f->f_code->...; save an extra lookup of what
we already have in a local variable).
- Remove test for nlocals > 0 before setting fastlocals to
f->f_localsplus; 0 is a rare case and the assignment is safe even
then.
called with keyword arguments -- the keyword and value were leaked.
This affected for instance with a __call__() method.
Bug reported and fix supplied by Jim Fulton.
i.e., counting opcode frequencies, or (with DXPAIRS defined) opcode
pair frequencies. Define DYNAMIC_EXECUTION_PROFILE on the command
line (for this file and for sysmodule.c) to enable.