mirror of https://github.com/python/cpython
Issue #12319: Always send file request bodies using chunked encoding
The previous attempt to determine the file’s Content-Length gave a false positive for pipes on Windows. Also, drop the special case for sending zero-length iterable bodies.
This commit is contained in:
parent
8f96a30630
commit
ef91bb2660
|
@ -240,17 +240,17 @@ HTTPConnection Objects
|
|||
The *headers* argument should be a mapping of extra HTTP headers to send
|
||||
with the request.
|
||||
|
||||
If *headers* contains neither Content-Length nor Transfer-Encoding, a
|
||||
Content-Length header will be added automatically if possible. If
|
||||
If *headers* contains neither Content-Length nor Transfer-Encoding,
|
||||
but there is a request body, one of those
|
||||
header fields will be added automatically. If
|
||||
*body* is ``None``, the Content-Length header is set to ``0`` for
|
||||
methods that expect a body (``PUT``, ``POST``, and ``PATCH``). If
|
||||
*body* is a string or bytes-like object, the Content-Length header is
|
||||
set to its length. If *body* is a binary :term:`file object`
|
||||
supporting :meth:`~io.IOBase.seek`, this will be used to determine
|
||||
its size. Otherwise, the Content-Length header is not added
|
||||
automatically. In cases where determining the Content-Length up
|
||||
front is not possible, the body will be chunk-encoded and the
|
||||
Transfer-Encoding header will automatically be set.
|
||||
*body* is a string or a bytes-like object that is not also a
|
||||
:term:`file <file object>`, the Content-Length header is
|
||||
set to its length. Any other type of *body* (files
|
||||
and iterables in general) will be chunk-encoded, and the
|
||||
Transfer-Encoding header will automatically be set instead of
|
||||
Content-Length.
|
||||
|
||||
The *encode_chunked* argument is only relevant if Transfer-Encoding is
|
||||
specified in *headers*. If *encode_chunked* is ``False``, the
|
||||
|
@ -260,19 +260,18 @@ HTTPConnection Objects
|
|||
.. note::
|
||||
Chunked transfer encoding has been added to the HTTP protocol
|
||||
version 1.1. Unless the HTTP server is known to handle HTTP 1.1,
|
||||
the caller must either specify the Content-Length or must use a
|
||||
body representation whose length can be determined automatically.
|
||||
the caller must either specify the Content-Length, or must pass a
|
||||
:class:`str` or bytes-like object that is not also a file as the
|
||||
body representation.
|
||||
|
||||
.. versionadded:: 3.2
|
||||
*body* can now be an iterable.
|
||||
|
||||
.. versionchanged:: 3.6
|
||||
If neither Content-Length nor Transfer-Encoding are set in
|
||||
*headers* and Content-Length cannot be determined, *body* will now
|
||||
be automatically chunk-encoded. The *encode_chunked* argument
|
||||
was added.
|
||||
The Content-Length for binary file objects is determined with seek.
|
||||
No attempt is made to determine the Content-Length for text file
|
||||
*headers*, file and iterable *body* objects are now chunk-encoded.
|
||||
The *encode_chunked* argument was added.
|
||||
No attempt is made to determine the Content-Length for file
|
||||
objects.
|
||||
|
||||
.. method:: HTTPConnection.getresponse()
|
||||
|
|
|
@ -187,12 +187,11 @@ The following classes are provided:
|
|||
server, or ``None`` if no such data is needed. Currently HTTP
|
||||
requests are the only ones that use *data*. The supported object
|
||||
types include bytes, file-like objects, and iterables. If no
|
||||
``Content-Length`` header has been provided, :class:`HTTPHandler` will
|
||||
try to determine the length of *data* and set this header accordingly.
|
||||
If this fails, ``Transfer-Encoding: chunked`` as specified in
|
||||
:rfc:`7230`, Section 3.3.1 will be used to send the data. See
|
||||
:meth:`http.client.HTTPConnection.request` for details on the
|
||||
supported object types and on how the content length is determined.
|
||||
``Content-Length`` nor ``Transfer-Encoding`` header field
|
||||
has been provided, :class:`HTTPHandler` will set these headers according
|
||||
to the type of *data*. ``Content-Length`` will be used to send
|
||||
bytes objects, while ``Transfer-Encoding: chunked`` as specified in
|
||||
:rfc:`7230`, Section 3.3.1 will be used to send files and other iterables.
|
||||
|
||||
For an HTTP POST request method, *data* should be a buffer in the
|
||||
standard :mimetype:`application/x-www-form-urlencoded` format. The
|
||||
|
@ -256,8 +255,8 @@ The following classes are provided:
|
|||
|
||||
.. versionchanged:: 3.6
|
||||
Do not raise an error if the ``Content-Length`` has not been
|
||||
provided and could not be determined. Fall back to use chunked
|
||||
transfer encoding instead.
|
||||
provided and *data* is neither ``None`` nor a bytes object.
|
||||
Fall back to use chunked transfer encoding instead.
|
||||
|
||||
.. class:: OpenerDirector()
|
||||
|
||||
|
|
|
@ -579,8 +579,8 @@ The :class:`~unittest.mock.Mock` class has the following improvements:
|
|||
urllib.request
|
||||
--------------
|
||||
|
||||
If a HTTP request has a non-empty body but no Content-Length header
|
||||
and the content length cannot be determined up front, rather than
|
||||
If a HTTP request has a file or iterable body (other than a
|
||||
bytes object) but no Content-Length header, rather than
|
||||
throwing an error, :class:`~urllib.request.AbstractHTTPHandler` now
|
||||
falls back to use chunked transfer encoding.
|
||||
(Contributed by Demian Brecht and Rolf Krahl in :issue:`12319`.)
|
||||
|
@ -935,6 +935,13 @@ Changes in the Python API
|
|||
This behavior has also been backported to earlier Python versions
|
||||
by Setuptools 26.0.0.
|
||||
|
||||
* In the :mod:`urllib.request` module and the
|
||||
:meth:`http.client.HTTPConnection.request` method, if no Content-Length
|
||||
header field has been specified and the request body is a file object,
|
||||
it is now sent with HTTP 1.1 chunked encoding. If a file object has to
|
||||
be sent to a HTTP 1.0 server, the Content-Length value now has to be
|
||||
specified by the caller. See :issue:`12319`.
|
||||
|
||||
Changes in the C API
|
||||
--------------------
|
||||
|
||||
|
|
|
@ -805,35 +805,21 @@ class HTTPConnection:
|
|||
def _get_content_length(body, method):
|
||||
"""Get the content-length based on the body.
|
||||
|
||||
If the body is "empty", we set Content-Length: 0 for methods
|
||||
that expect a body (RFC 7230, Section 3.3.2). If the body is
|
||||
set for other methods, we set the header provided we can
|
||||
figure out what the length is.
|
||||
If the body is None, we set Content-Length: 0 for methods that expect
|
||||
a body (RFC 7230, Section 3.3.2). We also set the Content-Length for
|
||||
any method if the body is a str or bytes-like object and not a file.
|
||||
"""
|
||||
if not body:
|
||||
if body is None:
|
||||
# do an explicit check for not None here to distinguish
|
||||
# between unset and set but empty
|
||||
if method.upper() in _METHODS_EXPECTING_BODY or body is not None:
|
||||
if method.upper() in _METHODS_EXPECTING_BODY:
|
||||
return 0
|
||||
else:
|
||||
return None
|
||||
|
||||
if hasattr(body, 'read'):
|
||||
# file-like object.
|
||||
if HTTPConnection._is_textIO(body):
|
||||
# text streams are unpredictable because it depends on
|
||||
# character encoding and line ending translation.
|
||||
return None
|
||||
else:
|
||||
# Is it seekable?
|
||||
try:
|
||||
curpos = body.tell()
|
||||
sz = body.seek(0, io.SEEK_END)
|
||||
except (TypeError, AttributeError, OSError):
|
||||
return None
|
||||
else:
|
||||
body.seek(curpos)
|
||||
return sz - curpos
|
||||
return None
|
||||
|
||||
try:
|
||||
# does it implement the buffer protocol (bytes, bytearray, array)?
|
||||
|
@ -1266,8 +1252,7 @@ class HTTPConnection:
|
|||
# the caller passes encode_chunked=True or the following
|
||||
# conditions hold:
|
||||
# 1. content-length has not been explicitly set
|
||||
# 2. the length of the body cannot be determined
|
||||
# (e.g. it is a generator or unseekable file)
|
||||
# 2. the body is a file or iterable, but not a str or bytes-like
|
||||
# 3. Transfer-Encoding has NOT been explicitly set by the caller
|
||||
|
||||
if 'content-length' not in header_names:
|
||||
|
@ -1280,7 +1265,7 @@ class HTTPConnection:
|
|||
encode_chunked = False
|
||||
content_length = self._get_content_length(body, method)
|
||||
if content_length is None:
|
||||
if body:
|
||||
if body is not None:
|
||||
if self.debuglevel > 0:
|
||||
print('Unable to determine size of %r' % body)
|
||||
encode_chunked = True
|
||||
|
|
|
@ -381,6 +381,16 @@ class TransferEncodingTest(TestCase):
|
|||
# same request
|
||||
self.assertNotIn('content-length', [k.lower() for k in headers])
|
||||
|
||||
def test_empty_body(self):
|
||||
# Zero-length iterable should be treated like any other iterable
|
||||
conn = client.HTTPConnection('example.com')
|
||||
conn.sock = FakeSocket(b'')
|
||||
conn.request('POST', '/', ())
|
||||
_, headers, body = self._parse_request(conn.sock.data)
|
||||
self.assertEqual(headers['Transfer-Encoding'], 'chunked')
|
||||
self.assertNotIn('content-length', [k.lower() for k in headers])
|
||||
self.assertEqual(body, b"0\r\n\r\n")
|
||||
|
||||
def _make_body(self, empty_lines=False):
|
||||
lines = self.expected_body.split(b' ')
|
||||
for idx, line in enumerate(lines):
|
||||
|
@ -652,7 +662,9 @@ class BasicTest(TestCase):
|
|||
|
||||
def test_send_file(self):
|
||||
expected = (b'GET /foo HTTP/1.1\r\nHost: example.com\r\n'
|
||||
b'Accept-Encoding: identity\r\nContent-Length:')
|
||||
b'Accept-Encoding: identity\r\n'
|
||||
b'Transfer-Encoding: chunked\r\n'
|
||||
b'\r\n')
|
||||
|
||||
with open(__file__, 'rb') as body:
|
||||
conn = client.HTTPConnection('example.com')
|
||||
|
@ -1717,7 +1729,7 @@ class RequestBodyTest(TestCase):
|
|||
self.assertEqual("5", message.get("content-length"))
|
||||
self.assertEqual(b'body\xc1', f.read())
|
||||
|
||||
def test_file_body(self):
|
||||
def test_text_file_body(self):
|
||||
self.addCleanup(support.unlink, support.TESTFN)
|
||||
with open(support.TESTFN, "w") as f:
|
||||
f.write("body")
|
||||
|
@ -1726,10 +1738,8 @@ class RequestBodyTest(TestCase):
|
|||
message, f = self.get_headers_and_fp()
|
||||
self.assertEqual("text/plain", message.get_content_type())
|
||||
self.assertIsNone(message.get_charset())
|
||||
# Note that the length of text files is unpredictable
|
||||
# because it depends on character encoding and line ending
|
||||
# translation. No content-length will be set, the body
|
||||
# will be sent using chunked transfer encoding.
|
||||
# No content-length will be determined for files; the body
|
||||
# will be sent using chunked transfer encoding instead.
|
||||
self.assertIsNone(message.get("content-length"))
|
||||
self.assertEqual("chunked", message.get("transfer-encoding"))
|
||||
self.assertEqual(b'4\r\nbody\r\n0\r\n\r\n', f.read())
|
||||
|
@ -1743,8 +1753,9 @@ class RequestBodyTest(TestCase):
|
|||
message, f = self.get_headers_and_fp()
|
||||
self.assertEqual("text/plain", message.get_content_type())
|
||||
self.assertIsNone(message.get_charset())
|
||||
self.assertEqual("5", message.get("content-length"))
|
||||
self.assertEqual(b'body\xc1', f.read())
|
||||
self.assertEqual("chunked", message.get("Transfer-Encoding"))
|
||||
self.assertNotIn("Content-Length", message)
|
||||
self.assertEqual(b'5\r\nbody\xc1\r\n0\r\n\r\n', f.read())
|
||||
|
||||
|
||||
class HTTPResponseTest(TestCase):
|
||||
|
|
|
@ -913,40 +913,50 @@ class HandlerTests(unittest.TestCase):
|
|||
self.assertEqual(req.unredirected_hdrs["Spam"], "foo")
|
||||
|
||||
def test_http_body_file(self):
|
||||
# A regular file - Content Length is calculated unless already set.
|
||||
# A regular file - chunked encoding is used unless Content Length is
|
||||
# already set.
|
||||
|
||||
h = urllib.request.AbstractHTTPHandler()
|
||||
o = h.parent = MockOpener()
|
||||
|
||||
file_obj = tempfile.NamedTemporaryFile(mode='w+b', delete=False)
|
||||
file_path = file_obj.name
|
||||
file_obj.write(b"Something\nSomething\nSomething\n")
|
||||
file_obj.close()
|
||||
self.addCleanup(os.unlink, file_path)
|
||||
|
||||
for headers in {}, {"Content-Length": 30}:
|
||||
with open(file_path, "rb") as f:
|
||||
req = Request("http://example.com/", f, headers)
|
||||
newreq = h.do_request_(req)
|
||||
self.assertEqual(int(newreq.get_header('Content-length')), 30)
|
||||
with open(file_path, "rb") as f:
|
||||
req = Request("http://example.com/", f, {})
|
||||
newreq = h.do_request_(req)
|
||||
te = newreq.get_header('Transfer-encoding')
|
||||
self.assertEqual(te, "chunked")
|
||||
self.assertFalse(newreq.has_header('Content-length'))
|
||||
|
||||
os.unlink(file_path)
|
||||
with open(file_path, "rb") as f:
|
||||
req = Request("http://example.com/", f, {"Content-Length": 30})
|
||||
newreq = h.do_request_(req)
|
||||
self.assertEqual(int(newreq.get_header('Content-length')), 30)
|
||||
self.assertFalse(newreq.has_header("Transfer-encoding"))
|
||||
|
||||
def test_http_body_fileobj(self):
|
||||
# A file object - Content Length is calculated unless already set.
|
||||
# A file object - chunked encoding is used
|
||||
# unless Content Length is already set.
|
||||
# (Note that there are some subtle differences to a regular
|
||||
# file, that is why we are testing both cases.)
|
||||
|
||||
h = urllib.request.AbstractHTTPHandler()
|
||||
o = h.parent = MockOpener()
|
||||
|
||||
file_obj = io.BytesIO()
|
||||
file_obj.write(b"Something\nSomething\nSomething\n")
|
||||
|
||||
for headers in {}, {"Content-Length": 30}:
|
||||
file_obj.seek(0)
|
||||
req = Request("http://example.com/", file_obj, headers)
|
||||
newreq = h.do_request_(req)
|
||||
self.assertEqual(int(newreq.get_header('Content-length')), 30)
|
||||
req = Request("http://example.com/", file_obj, {})
|
||||
newreq = h.do_request_(req)
|
||||
self.assertEqual(newreq.get_header('Transfer-encoding'), 'chunked')
|
||||
self.assertFalse(newreq.has_header('Content-length'))
|
||||
|
||||
headers = {"Content-Length": 30}
|
||||
req = Request("http://example.com/", file_obj, headers)
|
||||
newreq = h.do_request_(req)
|
||||
self.assertEqual(int(newreq.get_header('Content-length')), 30)
|
||||
self.assertFalse(newreq.has_header("Transfer-encoding"))
|
||||
|
||||
file_obj.close()
|
||||
|
||||
|
@ -959,9 +969,7 @@ class HandlerTests(unittest.TestCase):
|
|||
h = urllib.request.AbstractHTTPHandler()
|
||||
o = h.parent = MockOpener()
|
||||
|
||||
cmd = [sys.executable, "-c",
|
||||
r"import sys; "
|
||||
r"sys.stdout.buffer.write(b'Something\nSomething\nSomething\n')"]
|
||||
cmd = [sys.executable, "-c", r"pass"]
|
||||
for headers in {}, {"Content-Length": 30}:
|
||||
with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
|
||||
req = Request("http://example.com/", proc.stdout, headers)
|
||||
|
@ -983,8 +991,6 @@ class HandlerTests(unittest.TestCase):
|
|||
|
||||
def iterable_body():
|
||||
yield b"one"
|
||||
yield b"two"
|
||||
yield b"three"
|
||||
|
||||
for headers in {}, {"Content-Length": 11}:
|
||||
req = Request("http://example.com/", iterable_body(), headers)
|
||||
|
@ -996,6 +1002,14 @@ class HandlerTests(unittest.TestCase):
|
|||
else:
|
||||
self.assertEqual(int(newreq.get_header('Content-length')), 11)
|
||||
|
||||
def test_http_body_empty_seq(self):
|
||||
# Zero-length iterable body should be treated like any other iterable
|
||||
h = urllib.request.AbstractHTTPHandler()
|
||||
h.parent = MockOpener()
|
||||
req = h.do_request_(Request("http://example.com/", ()))
|
||||
self.assertEqual(req.get_header("Transfer-encoding"), "chunked")
|
||||
self.assertFalse(req.has_header("Content-length"))
|
||||
|
||||
def test_http_body_array(self):
|
||||
# array.array Iterable - Content Length is calculated
|
||||
|
||||
|
|
|
@ -52,10 +52,9 @@ Library
|
|||
- Issue #12319: Chunked transfer encoding support added to
|
||||
http.client.HTTPConnection requests. The
|
||||
urllib.request.AbstractHTTPHandler class does not enforce a Content-Length
|
||||
header any more. If a HTTP request has a non-empty body, but no
|
||||
Content-Length header, and the content length cannot be determined
|
||||
up front, rather than throwing an error, the library now falls back
|
||||
to use chunked transfer encoding.
|
||||
header any more. If a HTTP request has a file or iterable body, but no
|
||||
Content-Length header, the library now falls back to use chunked transfer-
|
||||
encoding.
|
||||
|
||||
- A new version of typing.py from https://github.com/python/typing:
|
||||
- Collection (only for 3.6) (Issue #27598)
|
||||
|
|
Loading…
Reference in New Issue