blocksize was hardcoded to 8192, preventing efficient upload when using
file-like body. Add blocksize argument to __init__, so users can
configure the blocksize to fit their needs.
I tested this uploading data from /dev/zero to a web server dropping the
received data, to test the overhead of the HTTPConnection.send() with a
file-like object.
Here is an example 10g upload with the default buffer size (8192):
$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 17.53 seconds (584.00m/s)
real 0m17.574s
user 0m8.887s
sys 0m5.971s
Same with 512k blocksize:
$ time ~/src/cpython/release/python upload-httplib.py 10 https://localhost:8000/
Uploaded 10.00g in 6.60 seconds (1551.15m/s)
real 0m6.641s
user 0m3.426s
sys 0m2.162s
In real world usage the difference will be smaller, depending on the
local and remote storage and the network.
See https://github.com/nirs/http-bench for more info.
The previous attempt to determine the file’s Content-Length gave a false
positive for pipes on Windows.
Also, drop the special case for sending zero-length iterable bodies.
When the body object is a file, its size is no longer determined with
fstat(), since that can report the wrong result (e.g. reading from a pipe).
Instead, determine the size using seek(), or fall back to chunked encoding
for unseekable files.
Also, change the logic for detecting text files to check for TextIOBase
inheritance, rather than inspecting the “mode” attribute, which may not
exist (e.g. BytesIO and StringIO). The Content-Length for text files is no
longer determined ahead of time, because the original logic could have been
wrong depending on the codec and newline translation settings.
Patch by Demian Brecht and Rolf Krahl, with a few tweaks by me.
The @reap_threads decorator made the test wait (for up to 1 s) until
background threads have finished. Calling join() with a timeout should be
equivalent.
* No longer attempts to close already freed socket file descriptor
* Use socket object to be compatible with Windows
* Do not use a timeout to avoid complication with non-blocking mode
* Use internal localhost server rather than depending on a third party
* Avoid trouble with buffered HTTP data by testing tunnelled CONNECT data
This changeset does two things: introduces a new RemoteDisconnected exception
(that subclasses ConnectionResetError and BadStatusLine) so that a remote
server disconnection can be detected by client code (and provides a better
error message for debugging purposes), and ensures that the client socket is
closed if a ConnectionError happens, so that the automatic re-connection code
can work if the application handles the error and continues on.
Tests are added that confirm that a connection is re-used or not re-used
as appropriate to the various combinations of protocol version and headers.
Patch by Martin Panter, reviewed by Demian Brecht. (Tweaked only slightly by
me.)
Some http servers will reject PUT, POST, and PATCH requests if they
do not have a Content-Length header.
Patch by James Rutherford, with additional cleaning up of the
'request' documentation by me.
Increase http.client.HTTPConnection test coverage.
Added a new tunnel test to verify setting of _tunnel_host, _tunnel_port,
_tunnel_headers attributes on HTTPConnection object.