When the body object is a file, its size is no longer determined with
fstat(), since that can report the wrong result (e.g. reading from a pipe).
Instead, determine the size using seek(), or fall back to chunked encoding
for unseekable files.
Also, change the logic for detecting text files to check for TextIOBase
inheritance, rather than inspecting the “mode” attribute, which may not
exist (e.g. BytesIO and StringIO). The Content-Length for text files is no
longer determined ahead of time, because the original logic could have been
wrong depending on the codec and newline translation settings.
Patch by Demian Brecht and Rolf Krahl, with a few tweaks by me.
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.
Issue #27568 Reported and patch contributed by Rémi Rampin.
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.
Issue #27568 Reported and patch contributed by Rémi Rampin.
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.
Issue #27568 Reported and patch contributed by Rémi Rampin.
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.
Issue #27568 Reported and patch contributed by Rémi Rampin.
Some servers send Location header fields with non-ASCII bytes, but "http.
client" requires the request target to be ASCII-encodable, otherwise a
UnicodeEncodeError is raised. Based on patch by Christian Heimes.
Python 2 does not suffer any problem because it allows non-ASCII bytes in the
HTTP request target.
Issue #26804: urllib.request will prefer lower_case proxy environment variables
over UPPER_CASE or Mixed_Case ones.
Patch contributed by Hans-Peter Jansen. Reviewed by Martin Panter and Senthil Kumaran.
This fix is a superset of the functionality introduced by the issue #19494
enhancement, and supersedes that fix. Instead of a new handler, we have a new
password manager that tracks whether we should send the auth for a given uri.
This allows us to say "always send", satisfying #19494, or track that we've
succeeded in auth and send the creds right away on every *subsequent* request.
The support for using the password manager is added to AbstractBasicAuth,
which means the proxy handler also now can handle prior auth if passed
the new password manager.
Patch by Akshit Khurana, docs mostly by me.
This auth handler adds the Authorization header to the first
HTTP request rather than waiting for a HTTP 401 Unauthorized
response from the server as the default HTTPBasicAuthHandler
does.
This allows working with websites like https://api.github.com which do
not follow the strict interpretation of RFC, but more the dicta in the
end of section 2 of RFC 2617:
> A client MAY preemptively send the corresponding Authorization
> header with requests for resources in that space without receipt
> of another challenge from the server. Similarly, when a client
> sends a request to a proxy, it may reuse a userid and password in
> the Proxy-Authorization header field without receiving another
> challenge from the proxy server. See section 4 for security
> considerations associated with Basic authentication.
Patch by Matej Cepl.
* Repair the broken link to norobots-rfc.txt.
* HTTP response codes >= 500 treated as a failed read rather than as a not
found. Not found means that we can assume the entire site is allowed. A 5xx
server error tells us nothing.
* A successful read() or parse() updates the mtime (which is defined to be "the
time the robots.txt file was last fetched").
* The can_fetch() method returns False unless we've had a read() with a 2xx or
4xx response. This avoids false positives in the case where a user calls
can_fetch() before calling read().
* I don't see any easy way to test this patch without hitting internet
resources that might change or without use of mock objects that wouldn't
provide must reassurance.
base32, ascii85 and base85 codecs in the base64 module, and delay the
initialization of the unquote_to_bytes() table of the urllib.parse module, to
not waste memory if these modules are not used.
Fix#17967 - Fix related to regression on Windows.
os.path.join(*self.dirs) produces an invalid path on windows.
ftp paths are always forward-slash seperated like this. /pub/dir.
Fix thishost helper funtion in urllib. Returns the ipaddress of localhost when
hostname is resolvable by socket.gethostname for local machine. This all fixes
certain freebsd builtbot failures.
Fix#17967: For ftp urls CWD to target instead of hopping to each directory
towards target. This fixes a bug where target is accessible, but parent
directories are restricted.
#17403: urllib.parse.robotparser normalizes the urls before adding to
ruleline. This helps in handling certain types invalid urls in a conservative
manner. Patch contributed by Mher Movsisyan.