`urllib.unquote_to_bytes` and `urllib.unquote` could both potentially generate `O(len(string))` intermediate `bytes` or `str` objects while computing the unquoted final result depending on the input provided. As Python objects are relatively large, this could consume a lot of ram.
This switches the implementation to using an expanding `bytearray` and a generator internally instead of precomputed `split()` style operations.
Microbenchmarks with some antagonistic inputs like `mess = "\u0141%%%20a%fe"*1000` show this is 10-20% slower for unquote and unquote_to_bytes and no different for typical inputs that are short or lack much unicode or % escaping. But the functions are already quite fast anyways so not a big deal. The slowdown scales consistently linear with input size as expected.
Memory usage observed manually using `/usr/bin/time -v` on `python -m timeit` runs of larger inputs. Unittesting memory consumption is difficult and does not seem worthwhile.
Observed memory usage is ~1/2 for `unquote()` and <1/3 for `unquote_to_bytes()` using `python -m timeit -s 'from urllib.parse import unquote, unquote_to_bytes; v="\u0141%01\u0161%20"*500_000' 'unquote_to_bytes(v)'` as a test.
- WASI's ``gethostname()`` is a stub that always fails with OSError
``ENOTSUP``
- skip mailcap ``test`` if subprocess is not available
- WASI process_time clock does not work.
Add methods enterContext() and enterClassContext() in TestCase.
Add method enterAsyncContext() in IsolatedAsyncioTestCase.
Add function enterModuleContext().
test_urllib commented since 2007:
commit d9880d07fc
Author: Facundo Batista <facundobatista@gmail.com>
Date: Fri May 25 04:20:22 2007 +0000
Commenting out the tests until find out who can test them in
one of the problematic enviroments.
pynche code commented since 1998 and 2001:
commit ef30092207
Author: Barry Warsaw <barry@python.org>
Date: Tue Dec 15 01:04:38 1998 +0000
Added most of the mechanism to change the strips from color variations
to color constants (i.e. red constant, green constant, blue
constant). But I haven't hooked this up yet because the UI gets more
crowded and the arrows don't reflect the correct values.
Added "Go to Black" and "Go to White" buttons.
commit 741eae0b31
Author: Barry Warsaw <barry@python.org>
Date: Wed Apr 18 03:51:55 2001 +0000
StripWidget.__init__(), update_yourself(): Removed some unused local
variables reported by PyChecker.
__togglegentype(): PyChecker accurately reported that the variable
__gentypevar was unused -- actually this whole method is currently
unused so comment it out.
urllib.request tests now call urlcleanup() to remove temporary files
created by urlretrieve() tests and to clear the _opener global
variable set by urlopen() and functions calling indirectly urlopen().
regrtest now checks if urllib.request._url_tempfiles and
urllib.request._opener are changed by tests.
CVE-2019-9948: Avoid file reading as disallowing the unnecessary URL
scheme in URLopener().open() and URLopener().retrieve()
of urllib.request.
Co-Authored-By: SH <push0ebp@gmail.com>
Disallow control chars in http URLs in urllib.urlopen. This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected.
* bpo-16285: Update urllib quoting to RFC 3986
urllib.parse.quote is now based on RFC 3986, and hence
includes `'~'` in the set of characters that is not escaped
by default.
Patch by Christian Theune and Ratnadeep Debnath.
In urllib.request, suffixes in no_proxy environment variable with
leading dots could match related hostnames again (e.g. .b.c matches a.b.c).
Patch by Milan Oberkirch.
The deprecation include manual creation of SSLSocket and certfile/keyfile
(or similar) in ftplib, httplib, imaplib, smtplib, poplib and urllib.
ssl.wrap_socket() is not marked as deprecated yet.
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.
Issue #27568 Reported and patch contributed by Rémi Rampin.
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.
Issue #27568 Reported and patch contributed by Rémi Rampin.
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.
Issue #27568 Reported and patch contributed by Rémi Rampin.
This changes the main documentation, doc strings, source code comments, and a
couple error messages in the test suite. In some cases the word was removed
or edited some other way to fix the grammar.