This change addresses part of issue 4336.
Change endheaders() to take an optional message_body argument
that is sent along with the headers. Change xmlrpclib and
httplib's other methods to use this new interface.
It is more efficient to make a single send() call, which should
get the entire client request into one packet (assuming it is
smaller than the MTU) and will avoid the long pause for delayed
ack following timeout.
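A minimal sketch of the idea, not the actual httplib code: header lines accumulate in a list, and endheaders() joins them, appends the optional body, and issues a single send. RecordingSocket and RequestBuffer are hypothetical names used only for illustration.

```python
class RecordingSocket:
    """Hypothetical stand-in for a socket; records each sendall() payload."""

    def __init__(self):
        self.sent = []

    def sendall(self, data):
        self.sent.append(data)


class RequestBuffer:
    """Illustrative only: buffers request lines until endheaders()."""

    def __init__(self, sock):
        self.sock = sock
        self._buffer = []

    def putline(self, line):
        # Nothing goes on the wire yet; just remember the line.
        self._buffer.append(line)

    def endheaders(self, message_body=None):
        # Two empty entries yield the blank line that ends the header block.
        self._buffer.extend((b"", b""))
        msg = b"\r\n".join(self._buffer)
        del self._buffer[:]
        if message_body is not None:
            # Piggyback the body on the same send as the headers.
            msg += message_body
        self.sock.sendall(msg)


sock = RecordingSocket()
req = RequestBuffer(sock)
req.putline(b"POST /rpc HTTP/1.1")
req.putline(b"Host: example.com")
req.putline(b"Content-Length: 5")
req.endheaders(b"hello")
```

With this shape, sock.sent holds exactly one payload containing both the headers and the body.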
Also:
- Add a comment about the buffer size for makefile().
- Extract _set_content_length() method and fix whitespace issues there.
catch_warnings(), and clean up the API.
While expanding the test suite, a bug was found where a warning about the
'line' argument to showwarning() was not letting functions with '*args' go
without a warning.
Closes issue 3602.
Code review by Benjamin Peterson.
all the upper level libraries that use it, including urllib2.
Added and fixed some tests, and changed docs correspondingly.
Thanks to John J Lee for the patch and the pushing :)
it closes itself. When the stream is read in several calls to read(n),
it should behave in the same way if HTTPConnection knows where the end
of the stream is (through self.length). Added a test case for this
behaviour.
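The self-closing behaviour on partial reads can be sketched roughly as below. This is an illustration under assumptions, not the httplib implementation: LengthTrackingResponse is a hypothetical class, and `length` stands for the number of body bytes still expected.

```python
import io


class LengthTrackingResponse:
    """Illustrative only: closes itself once `length` bytes were read."""

    def __init__(self, fp, length):
        self.fp = fp
        self.length = length  # body bytes still expected

    def isclosed(self):
        return self.fp is None

    def close(self):
        if self.fp is not None:
            self.fp.close()
            self.fp = None

    def read(self, amt):
        if self.fp is None:
            return b""
        # Never read past the known end of the body.
        amt = min(amt, self.length)
        data = self.fp.read(amt)
        self.length -= len(data)
        if self.length == 0:
            # Last byte consumed: behave as a full read() would.
            self.close()
        return data


resp = LengthTrackingResponse(io.BytesIO(b"abcdefgh-next-response"), length=8)
first = resp.read(5)
second = resp.read(5)
```

The second read(5) returns only the three remaining body bytes and leaves the response closed, without touching the bytes that follow.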
* Much expanded test suite:
All protocols tested against all other protocols.
All protocols tested with all certificate options.
Tests for bad key and bad cert.
Test of STARTTLS functionality.
Test of RAND_* functions.
* Fixes for threading/malloc bug.
* Issue 1065 fixed:
sslsocket class renamed to SSLSocket.
sslerror class renamed to SSLError.
Function "wrap_socket" now used to wrap an existing socket.
* Issue 1583946 finally fixed:
Support for subjectAltName added.
Subject name now returned as proper DN list of RDNs.
* SSLError exported from socket as "sslerror".
* RAND_* functions properly exported from ssl.py.
* Documentation improved:
Example of how to create a self-signed certificate.
Better indexing.
1) Improve the documentation of the SSL module, with a fuller
explanation of certificate usage, another reference, proper
formatting of this and that.
2) Fix Windows bug in ssl.py, and general bug in sslsocket.close().
Remove some unused code from ssl.py. Allow accept() to be called on
sslsocket sockets.
3) Use try-except-else in import of ssl in socket.py. Deprecate use of
socket.ssl().
4) Remove use of socket.ssl() in every library module, except for
test_socket_ssl.py and test_ssl.py.
Hack httplib to work with broken Akamai proxies.
Make sure that httplib doesn't add extra Accept-Encoding or
Content-Length headers if the client has already set them.
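One way to express that check, as a sketch only: compare the client's header names case-insensitively and add a default only when the name is absent. The function name and the "identity" default are assumptions made for the example.

```python
def default_headers(user_headers, body=None):
    """Return the headers a client library would add on its own.

    Illustrative only; the 'identity' value and the function name are
    assumptions, not taken from the commit.
    """
    # Header names are case-insensitive, so compare in lower case.
    supplied = {name.lower() for name in user_headers}
    extra = {}
    if "accept-encoding" not in supplied:
        extra["Accept-Encoding"] = "identity"
    if body is not None and "content-length" not in supplied:
        extra["Content-Length"] = str(len(body))
    return extra
```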
The obvious way for this assertion to fail is if the LineAndFileWrapper
constructor is called with an empty line. Raise a BadStatusLine before the
call.
In response to "shouldn't the client close the file?", the answer is
"no". The original design behind HTTPConnection is that the client did
not have to worry about it. The response would close itself when you
read the last of the data from it. This closing also dealt with
allowing the connection to perform another request/response (if it was
a persistent connection).
However... the auto-close behavior broke compatibility with the
classic httplib.HTTP class' behavior when a zero-length response body
was present. In that situation, the HTTPResponse object was
auto-closing it since there was no data present, and for an HTTP/1.0
connection-close socket (or an HTTP/0.9 request) connection, that also
ended up closing the socket. When an httplib.HTTP user went to read
the socket... boom. A patch to correct the auto-close (for compat with
old httplib users) was added in rev 1.22.
But for non-zero-length *chunked* bodies, we should keep the
auto-close behavior. The library user is not reading the socket (they
can't, because of the chunked response we just finished handling), so they
should be immune to the response closing the socket. In fact, I would
like to see (one day) the auto-close restored, and the HTTP subclass
would simply have a flag to disable that behavior (for back-compat
purposes).
* Replaced "while 1" with "while True"
* Rewrote read() and readline() for clarity and speed.
* Replaced variable 'list' with 'hlist'
* Used augmented assignment in two places.
The implementation now stores all the lines of the request in a buffer
and makes a single send() call when the request is finished,
specifically when endheaders() is called.
This appears to improve performance. The old code called send() for
each line. The sends are all short, so they caused bad interactions
with the Nagle algorithm and delayed acknowledgements. In simple
tests, the second packet was delayed by hundreds of ms. The second send was
delayed by the Nagle algorithm, waiting for the ack. The delayed ack
strategy delays the ack in hopes of piggybacking it on a data packet,
but the server won't send any data until it receives the complete
request.
This change reduces the chance that Nagle + delayed ack will cause
a problem, although a request large enough to be broken into two
packets will still suffer some delay. Luckily the MSS is large enough
to accommodate most requests in a single packet.
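As a rough back-of-the-envelope check of the single-packet claim, the sketch below estimates how many TCP segments a buffered request would span. The MSS of 1460 bytes is an assumption (a 1500-byte Ethernet MTU minus 20 bytes each of IPv4 and TCP header, no options); real paths may differ.

```python
import math

# Assumed value: 1500-byte Ethernet MTU minus 20 bytes of IPv4 header
# and 20 bytes of TCP header, with no options.
ASSUMED_MSS = 1460


def segments_needed(request, mss=ASSUMED_MSS):
    """Return how many TCP segments a request of this size would span."""
    return max(1, math.ceil(len(request) / mss))


small = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
large = b"POST / HTTP/1.1\r\n\r\n" + b"x" * 3000
```

A typical header-only request such as `small` fits in one segment, while `large` would still be split and can hit the delay described above.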
XXX Bug fix candidate?
The recent SSL changes resulted in important, but subtle changes to
close() semantics. Since builtin socket makefile() is not called for
SSL connections, we don't get separately closeable fds for connection
and response. Comments in the code explain how to restore makefile
semantics.
Bug fix candidate.
If multiple header fields with the same name occur, they are combined
according to the rules in RFC 2616 sec 4.2:
Appending each subsequent field-value to the first, each separated by
a comma. The order in which header fields with the same field-name are
received is significant to the interpretation of the combined field
value.
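The combining rule quoted above can be sketched as follows. This is an illustration, not the httplib code; combine_headers is a hypothetical helper that keys the result by lower-cased field name.

```python
def combine_headers(pairs):
    """Fold repeated header fields into one comma-separated value.

    Illustrative only; names are compared case-insensitively and the
    result is keyed by the lower-cased field name.
    """
    combined = {}
    for name, value in pairs:
        key = name.lower()
        if key in combined:
            # Append in arrival order, since order is significant.
            combined[key] = combined[key] + ", " + value
        else:
            combined[key] = value
    return combined
```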
Section 19.6 of RFC 2616 (HTTP/1.1):
It is beyond the scope of a protocol specification to mandate
compliance with previous versions. HTTP/1.1 was deliberately
designed, however, to make supporting previous versions easy....
And we would expect HTTP/1.1 clients to:
- recognize the format of the Status-Line for HTTP/1.0 and 1.1
responses;
- understand any valid response in the format of HTTP/0.9, 1.0, or
1.1.
The changes to the code do handle responses in the format of HTTP/0.9.
Some users may consider this a bug because all responses with a
sufficiently corrupted status line will look like an HTTP/0.9
response. These users can pass strict=1 to the HTTP constructors to
get a BadStatusLine exception instead.
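The lenient versus strict behaviour might look roughly like this. It is a sketch under assumptions: parse_status_line is a hypothetical function, and the BadStatusLine class here is a local stand-in for the exception named in the text.

```python
class BadStatusLine(Exception):
    """Local stand-in for the exception named in the text."""


def parse_status_line(line, strict=False):
    """Return (version, status, reason); illustrative sketch only."""
    parts = line.strip().split(None, 2)
    if not parts or not parts[0].startswith("HTTP/"):
        if strict:
            raise BadStatusLine(line)
        # Lenient mode: treat anything unrecognized as an HTTP/0.9
        # response, which has no status line at all.
        return ("HTTP/0.9", 200, "")
    if len(parts) < 2 or not parts[1].isdigit():
        raise BadStatusLine(line)
    reason = parts[2] if len(parts) == 3 else ""
    return (parts[0], int(parts[1]), reason)
```

This makes the trade-off visible: in lenient mode any garbage first line is accepted as HTTP/0.9, which is exactly what strict mode refuses.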
While this is a new feature of sorts, it enhances the robustness of
the code (be tolerant in what you accept). Thus, I consider it a bug
fix candidate.
XXX strict needs to be documented.
The HTTPResponse class now handles 100 continue responses, instead of
choking on them. It detects them internally in the _begin() method
and ignores them. Based on a patch by Bob Kline.
This closes SF bugs 498149 and 551273.
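Skipping interim responses can be sketched as a loop that discards each 100 status line together with its headers. This is an illustration that assumes well-formed input, not the _begin() code itself; read_final_status is a hypothetical name.

```python
import io

CONTINUE = 100


def read_final_status(fp):
    """Return the first status code that is not 100; sketch only."""
    while True:
        status_line = fp.readline()
        status = int(status_line.split(None, 2)[1])
        if status != CONTINUE:
            return status
        # Discard the headers of the interim response, up to and
        # including the blank line that terminates them.
        while True:
            skipped = fp.readline()
            if not skipped.strip():
                break


stream = io.BytesIO(
    b"HTTP/1.1 100 Continue\r\n"
    b"\r\n"
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)
final = read_final_status(stream)
```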
The FakeSocket class (for SSL) is now usable with HTTP/1.1
connections. The old version of the code could not work with
persistent connections, because the makefile() implementation read
until EOF before returning. If the connection is persistent, the
server sends a response and leaves the connection open. A client that
reads until EOF will block until the server gives up on the connection
-- more than a minute in my test case.
The problem was fixed by implementing a reasonable makefile(). It
reads data only when it is needed by the layers above it. Its
implementation uses an internal buffer with a default size of 8192.
Also, rename begin() method of HTTPResponse to _begin() because it
should only be called by the HTTPConnection.
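A minimal sketch of such an on-demand reader, assuming only that the wrapped object has a recv() method. LazySocketFile and ChunkedSource are hypothetical names; the real FakeSocket code differs.

```python
class LazySocketFile:
    """Illustrative only: read from `sock` on demand, never until EOF."""

    BUFSIZE = 8192  # default buffer size mentioned in the text

    def __init__(self, sock, bufsize=None):
        self._sock = sock
        self._bufsize = bufsize or self.BUFSIZE
        self._buf = b""

    def _fill(self):
        # One recv() per call; returns False once the peer has closed.
        chunk = self._sock.recv(self._bufsize)
        self._buf += chunk
        return bool(chunk)

    def read(self, size):
        while len(self._buf) < size and self._fill():
            pass
        data, self._buf = self._buf[:size], self._buf[size:]
        return data

    def readline(self):
        while b"\n" not in self._buf and self._fill():
            pass
        end = self._buf.find(b"\n") + 1
        if end == 0:
            end = len(self._buf)  # no newline before EOF
        line, self._buf = self._buf[:end], self._buf[end:]
        return line


class ChunkedSource:
    """Hypothetical socket stand-in that hands out canned chunks."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.calls = 0

    def recv(self, bufsize):
        self.calls += 1
        return self.chunks.pop(0) if self.chunks else b""


source = ChunkedSource([b"HTTP/1.1 200 OK\r\nCon", b"tent-Length: 2\r\n\r\nhi"])
reader = LazySocketFile(source)
status = reader.readline()
calls_after_status = source.calls
```

Reading the status line costs a single recv(); the reader never waits for EOF, so a persistent connection does not block it.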
FakeSocket class. Without it, the sendall() call got the method on
the underlying socket object, and that messed up SSL.
Does httplib use other methods of sockets that FakeSocket doesn't support?
Someone should take a look... (I'll try to give it a once-over.)
2.2.1 bugfix candidate.
In August, Greg said this looked good, so I'm going ahead with it.
The fix is different from the one in the bug report. Instead of using
a regular expression to extract the host from the url, I use
urlparse.urlsplit.
Martin commented that the patch doesn't address URLs that have basic
authentication username and password in the header. I don't see any
code anywhere in httplib that supports this feature, so I'm not going
to address it for this fix.
Bug fix candidate.
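The urlsplit approach can be sketched as below. The import path is the Python 3 location of what the text calls urlparse.urlsplit, and host_from_url is a hypothetical helper. Note that netloc still contains any user:password@ prefix, in line with the limitation mentioned above.

```python
from urllib.parse import urlsplit  # Python 3 home of urlparse.urlsplit


def host_from_url(url):
    """Return the 'host[:port]' part of an absolute URL, or ''."""
    return urlsplit(url).netloc
```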
Try to be systematic about dealing with socket and ssl exceptions in
FakeSocket.makefile(). The previous version of the code caught all
ssl errors and treated them as EOF, even though most of the errors
don't mean EOF.
An SSL error can mean one of three things:
1. The SSL/TLS connection was closed.
2. The operation should be retried.
3. An error occurred.
Also, if a socket error occurred and the error was EINTR, retry the
call. Otherwise, it was a legitimate error and the caller should
receive the exception.
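The three-way split and the EINTR retry could be sketched as follows. The grouping of error codes is an assumption made for illustration, and both function names are hypothetical; recent Python versions already retry interrupted socket calls on their own.

```python
import errno
import ssl

# Assumed grouping of the ssl module's error codes into the three cases.
_CLOSED = {ssl.SSL_ERROR_ZERO_RETURN, ssl.SSL_ERROR_EOF}
_RETRY = {ssl.SSL_ERROR_WANT_READ, ssl.SSL_ERROR_WANT_WRITE}


def classify_ssl_error(err):
    """Map an ssl.SSLError to 'closed', 'retry', or 'error'."""
    code = err.args[0] if err.args else None
    if code in _CLOSED:
        return "closed"
    if code in _RETRY:
        return "retry"
    return "error"


def call_retrying_eintr(func, *args):
    """Call func, retrying while it fails with EINTR; re-raise the rest."""
    while True:
        try:
            return func(*args)
        except ssl.SSLError:
            # SSL error codes share the errno slot, so let them
            # through untouched rather than mistaking one for EINTR.
            raise
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
```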
For the HTTPS class (when available), ensure that the x509 certificate data
gets passed through to the HTTPSConnection class. Create a new
HTTPS.__init__ to do this, and refactor the HTTP.__init__ into a new _setup
method for both init's to call.
Note: this is solved differently from the patch, which advocated a new
**x509 parameter on the base HTTPConnection class. But that would open
HTTPConnection to arbitrary (ignored) parameters, so was not as desirable.
Header
Dougfort's comments: httplib does not include ':port ' in the HTTP 1.1
'Host:' header. This causes problems if the server is not listening
on Port 80. The test case I use is the login to /manage under Zope,
with Zope listening on port 8080. Zope returns a <frameset> with the
<frame> source URLs lacking the :8080.
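The fix amounts to including the port in the Host value whenever it is not the default. A minimal sketch, with host_header as a hypothetical helper:

```python
def host_header(host, port, default_port=80):
    """Return the Host header value; sketch with assumed names."""
    if port == default_port:
        return host
    # Non-default port: the server needs it to build correct URLs.
    return "%s:%d" % (host, port)
```

For the Zope case above, host_header("localhost", 8080) yields "localhost:8080".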
The ASCII-art diagram at the top of httplib contains a backslash at
the end of a line, which causes Python to remove the newline. This
one-character patch adds a space after the backslash so it will
appear at the end of the line in the docstring as intended.