In August, Greg said this looked good, so I'm going ahead with it.
The fix is different from the one in the bug report. Instead of using
a regular expression to extract the host from the URL, I use
urlparse.urlsplit.
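A minimal sketch of the idea, written for Python 3, where urlsplit
lives in urllib.parse rather than the urlparse module named above.
The helper name host_from_url is hypothetical and not part of httplib.

```python
from urllib.parse import urlsplit

def host_from_url(url):
    """Return the host[:port] part of a URL, or '' if it has none."""
    # urlsplit does the parsing; netloc is everything between "//" and
    # the start of the path, so no regular expression is needed.
    return urlsplit(url).netloc

print(host_from_url("http://www.python.org:8080/doc/index.html"))
```

Note that netloc still contains any user:password@ prefix, which
matches the limitation described in this message.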
Martin commented that the patch doesn't address URLs that have basic
authentication username and password in the header. I don't see any
code anywhere in httplib that supports this feature, so I'm not going
to address it for this fix.
Bug fix candidate.
Try to be systematic about dealing with socket and ssl exceptions in
FakeSocket.makefile(). The previous version of the code caught all
ssl errors and treated them as EOF, even though most of the errors
don't mean EOF.
An SSL error can mean one of three things:
1. The SSL/TLS connection was closed.
2. The operation should be retried.
3. An error occurred.
Also, if a socket error occurred and the error was EINTR, retry the
call. Otherwise, it was a legitimate error and the caller should
receive the exception.
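The classification above can be sketched roughly as follows. This is
not the FakeSocket.makefile() code itself: read_with_retry, read_fn and
max_retries are hypothetical names, the mapping of particular
SSL_ERROR_* codes onto the three cases is an assumption, and the code
targets the Python 3 ssl module.

```python
import errno
import ssl

def read_with_retry(read_fn, bufsize=8192, max_retries=100):
    """Call read_fn(bufsize), sorting failures into EOF, retry, or error."""
    for _ in range(max_retries):
        try:
            return read_fn(bufsize)
        except ssl.SSLError as err:
            code = err.args[0] if err.args else None
            if code in (ssl.SSL_ERROR_ZERO_RETURN, ssl.SSL_ERROR_EOF):
                return b""      # 1. connection closed: report EOF
            if code in (ssl.SSL_ERROR_WANT_READ, ssl.SSL_ERROR_WANT_WRITE):
                continue        # 2. the operation should be retried
            raise               # 3. a real error: let the caller see it
        except OSError as err:
            if err.errno == errno.EINTR:
                continue        # interrupted system call: retry
            raise               # any other socket error is legitimate
    raise OSError("gave up after %d retries" % max_retries)
```

The retry cap is only there so the sketch cannot loop forever; the
message above does not say whether the real code bounds its retries.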
For the HTTPS class (when available), ensure that the x509 certificate data
gets passed through to the HTTPSConnection class. Create a new
HTTPS.__init__ to do this, and refactor the HTTP.__init__ into a new _setup
method for both __init__ methods to call.
Note: this is solved differently from the patch, which advocated a new
**x509 parameter on the base HTTPConnection class. But that would open
HTTPConnection to arbitrary (ignored) parameters, so was not as desirable.
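A simplified sketch of the resulting shape. The connection classes
here are bare stand-ins, and the key_file and cert_file parameter names
are assumptions used for illustration; only HTTP, HTTPS, _setup and
HTTPSConnection are named in the message above.

```python
class HTTPConnection:
    """Stand-in for the real connection class; accepts no extra options."""
    def __init__(self, host, port=None):
        self.host, self.port = host, port

class HTTPSConnection(HTTPConnection):
    """Stand-in that additionally stores the x509 certificate data."""
    def __init__(self, host, port=None, key_file=None, cert_file=None):
        HTTPConnection.__init__(self, host, port)
        self.key_file, self.cert_file = key_file, cert_file

class HTTP:
    def __init__(self, host="", port=None):
        self._setup(HTTPConnection(host, port))

    def _setup(self, conn):
        # Shared initialisation, called from both __init__ methods.
        self._conn = conn

class HTTPS(HTTP):
    def __init__(self, host="", port=None, key_file=None, cert_file=None):
        # Pass the certificate data through to the connection object.
        self._setup(HTTPSConnection(host, port, key_file, cert_file))
```

Keeping the extra parameters on the HTTPS side means the base
HTTPConnection signature stays strict, which is the design choice
argued for above.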
Header
Dougfort's comments: httplib does not include ':port ' in the HTTP 1.1
'Host:' header. This causes problems if the server is not listening
on port 80. The test case I use is the login to /manage under Zope,
with Zope listening on port 8080. Zope returns a <frameset> with the
<frame> source URLs lacking the :8080.
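For illustration, a sketch of building the header value so that the
port is included. The function host_header is hypothetical, and
leaving the port off when it equals the default is an assumption about
the fix, not something stated in the comments above.

```python
def host_header(host, port=None, default_port=80):
    """Return the value for the Host: header, with :port when needed."""
    if port is None or port == default_port:
        return host
    return "%s:%d" % (host, port)

print("Host: " + host_header("localhost", 8080))
```

With the port present, a server such as the Zope instance described
above can build absolute URLs that point back at the port it is
actually listening on.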
The ASCII-art diagram at the top of httplib contains a backslash at
the end of a line, which causes Python to remove the newline. This
one-character patch adds a space after the backslash so it will
appear at the end of the line in the docstring as intended.
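The effect can be reproduced with a small experiment. The two literals
below are made-up examples rather than lines from the httplib diagram,
and their source text is written out with escapes so that the trailing
space stays visible.

```python
import ast
import warnings

# Source text of two triple-quoted string literals.
before_src = '"""idle \\\nnext"""'    # backslash directly before the newline
after_src = '"""idle \\ \nnext"""'    # backslash, one space, then the newline

with warnings.catch_warnings():
    # Newer Pythons warn that backslash-space is not a recognised escape.
    warnings.simplefilter("ignore")
    before = ast.literal_eval(before_src)
    after = ast.literal_eval(after_src)

print(repr(before))   # the backslash and the newline are both gone
print(repr(after))    # the backslash, the space and the newline all survive
```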
socket in httplib.py.
The bug reports that on Windows, you must pass sock._sock to the
socket.ssl() call. But on Unix, you must pass sock itself. (sock is
a wrapper on Windows but not on Unix; the ssl() call wants the real
socket object, not the wrapper.)
So we see if sock has an _sock attribute and if so, extract it.
Unfortunately, the submitter of the bug didn't confirm that this patch
works, so I'll just have to believe it (I can't test it myself since I
don't have OpenSSL set up on Windows, and setting that up is nontrivial,
I believe).
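The attribute check described in this message amounts to something
like the following. The helper name real_socket is hypothetical, and
the two small classes only imitate a raw socket and a wrapper around
one.

```python
def real_socket(sock):
    """Return sock._sock if sock is a wrapper, otherwise sock itself."""
    return getattr(sock, "_sock", sock)

class RawSocket:
    """Imitates a platform where sock is already the real socket."""

class WrappedSocket:
    """Imitates a platform where sock wraps the real socket in _sock."""
    def __init__(self, inner):
        self._sock = inner

raw = RawSocket()
print(real_socket(raw) is raw)
print(real_socket(WrappedSocket(raw)) is raw)
```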
interface consistent: The client is responsible for closing the
socket, regardless of the amount of data received.
Restore support for set_debuglevel call.
Modify HTTP to use delegation instead of inheritance. The
_connection_class attribute of the class defines what class to
delegate to. The HTTPS class is a subclass of HTTP that redefines
_connection_class.
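A rough sketch of the delegation pattern, using bare stand-in
connection classes. The _conn attribute name and the debuglevel field
are assumptions made for the example; _connection_class and
set_debuglevel come from the message above.

```python
class HTTPConnection:
    """Stand-in connection class."""
    def __init__(self, host):
        self.host = host
        self.debuglevel = 0

    def set_debuglevel(self, level):
        self.debuglevel = level

class HTTPSConnection(HTTPConnection):
    """Stand-in for the SSL-capable connection class."""

class HTTP:
    # Subclasses change behaviour by pointing this at another class.
    _connection_class = HTTPConnection

    def __init__(self, host=""):
        self._conn = self._connection_class(host)

    def set_debuglevel(self, level):
        # Forward the call to the connection object instead of inheriting.
        self._conn.set_debuglevel(level)

class HTTPS(HTTP):
    _connection_class = HTTPSConnection
```

Because HTTP only holds a connection object, HTTPS needs no method
overrides in this sketch; redefining the class attribute is enough.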
Brian E Gallew, which were improved and adapted to OpenSSL 0.9.4 by
Laszlo Kovacs of HP. Both have kindly given permission to include
the patches in the Python distribution. Final formatting by GvR.
(1) No longer close self.sock; close it on close(). (Guido)
(2) Don't use regular expressions for what can be done simply with
string.split() -- regex is thread unsafe. (Jeremy)
(3) Delete unused imports. (Jeremy)