cpython

Raymond Hettinger 122541bece Issue 21469: Mitigate risk of false positives with robotparser.		2014-05-12 21:56:33 -07:00

* Repair the broken link to norobots-rfc.txt.
* HTTP response codes >= 500 treated as a failed read rather than as a not found. Not found means that we can assume the entire site is allowed. A 5xx server error tells us nothing.
* A successful read() or parse() updates the mtime (which is defined to be "the time the robots.txt file was last fetched").
* The can_fetch() method returns False unless we've had a read() with a 2xx or 4xx response. This avoids false positives in the case where a user calls can_fetch() before calling read().
* I don't see any easy way to test this patch without hitting internet resources that might change or without use of mock objects that wouldn't provide much reassurance.
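
The behavior described in the commit message above can be checked offline with a small sketch. It uses parse() in place of read() so no network access is needed; the example.com URLs and the robots.txt rules are illustrative assumptions, not part of the patch, and the expected results assume a Python version that includes this change.

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()

# Before any read() or parse(), nothing has been fetched, so mtime() is 0
# and can_fetch() is expected to answer False instead of guessing.
mtime_before = rp.mtime()
allowed_before = rp.can_fetch("*", "http://example.com/index.html")

# Hypothetical robots.txt content, fed in directly rather than downloaded.
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# A successful parse() records the time the rules were last obtained.
mtime_after = rp.mtime()

# With rules loaded, answers now follow the parsed robots.txt.
allowed_public = rp.can_fetch("*", "http://example.com/index.html")
allowed_private = rp.can_fetch("*", "http://example.com/private/data.html")

print(allowed_before, allowed_public, allowed_private)
```

The 5xx case is not exercised here, since it only arises inside read() when the server responds with an error.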
__init__.py	…
error.py	Replace IOError with OSError (#16715)	2012-12-25 16:47:37 +02:00
parse.py	Issue #20879: Delay the initialization of encoding and decoding tables for	2014-03-17 22:38:41 +01:00
request.py	Convert urllib.request parse_proxy doctests to unittests.	2014-04-14 16:32:20 -04:00
response.py	urllib.response object to use _TemporaryFileWrapper (and _TemporaryFileCloser)	2014-04-20 09:41:29 -07:00
robotparser.py	Issue 21469: Mitigate risk of false positives with robotparser.	2014-05-12 21:56:33 -07:00