187 Commits

Author SHA1 Message Date
Victor Stinner 2587b9f64e
gh-105382: Remove urllib.request cafile parameter (#105384)
Remove cafile, capath and cadefault parameters of the
urllib.request.urlopen() function, deprecated in Python 3.6.
2023-06-06 21:17:45 +00:00
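For callers migrating away from the removed parameters, one possible replacement is sketched below. It assumes the CA settings move into an `ssl.SSLContext` that is handed to `HTTPSHandler(context=...)`. The helper name `build_https_opener` and its `ca_file` argument are hypothetical, and no request is sent.

```python
import ssl
import urllib.request


def build_https_opener(ca_file=None):
    """Return (opener, context); ca_file=None falls back to the system trust store."""
    # The CA bundle that used to be passed as cafile= now lives in the context.
    ctx = ssl.create_default_context(cafile=ca_file)
    handler = urllib.request.HTTPSHandler(context=ctx)
    return urllib.request.build_opener(handler), ctx


opener, ctx = build_https_opener()
# opener.open("https://...") would now verify certificates using ctx.
```

The same context can also be passed directly as `urlopen(url, context=ctx)`.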
Gregory P. Smith 2e279e85fe
gh-88500: Reduce memory use of `urllib.unquote` (#96763)
`urllib.unquote_to_bytes` and `urllib.unquote` could both potentially generate `O(len(string))` intermediate `bytes` or `str` objects while computing the unquoted final result, depending on the input provided. As Python objects are relatively large, this could consume a lot of RAM.

This switches the implementation to using an expanding `bytearray` and a generator internally instead of precomputed `split()` style operations.

Microbenchmarks with some antagonistic inputs like `mess = "\u0141%%%20a%fe"*1000` show this is 10-20% slower for unquote and unquote_to_bytes, and no different for typical inputs that are short or lack much unicode or % escaping. The functions are already quite fast, so this is not a big deal. The slowdown scales linearly with input size, as expected.

Memory usage was observed manually using `/usr/bin/time -v` on `python -m timeit` runs of larger inputs. Unit testing memory consumption is difficult and does not seem worthwhile.

Observed memory usage is ~1/2 for `unquote()` and <1/3 for `unquote_to_bytes()` using `python -m timeit -s 'from urllib.parse import unquote, unquote_to_bytes; v="\u0141%01\u0161%20"*500_000' 'unquote_to_bytes(v)'` as a test.
2022-12-10 16:17:39 -08:00
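As a small illustration of the two functions discussed above, the sketch below runs them on a single repetition of the benchmark pattern. It only shows the visible behaviour (mixed non-ASCII text and `%` escapes); it does not measure memory, and the variable names are illustrative.

```python
from urllib.parse import unquote, unquote_to_bytes

# One repetition of the pattern used in the timeit command above.
sample = "\u0141%01\u0161%20"

# str input is encoded as UTF-8 first, then %XX escapes become single bytes.
as_bytes = unquote_to_bytes(sample)

# unquote() decodes the escapes and leaves the non-ASCII characters untouched.
as_text = unquote(sample)

print(as_bytes)
print(ascii(as_text))
```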
Christian Heimes 760ec8940a
gh-90473: WASI: skip gethostname tests (GH-93092)
- WASI's ``gethostname()`` is a stub that always fails with OSError
  ``ENOTSUP``
- skip mailcap ``test`` if subprocess is not available
- WASI process_time clock does not work.
2022-05-23 10:39:57 +02:00
Serhiy Storchaka 086c6b1b0f
bpo-45046: Support context managers in unittest (GH-28045)
Add methods enterContext() and enterClassContext() in TestCase.
Add method enterAsyncContext() in IsolatedAsyncioTestCase.
Add function enterModuleContext().
2022-05-08 17:49:09 +03:00
Steve Dower 3513d55a61
bpo-43607: Fix urllib handling of Windows paths with \\?\ prefix (GH-25539) 2021-04-23 18:02:47 +01:00
Hai Shi 3ddc634cd5
bpo-40275: Use new test.support helper submodules in tests (GH-21219) 2020-06-30 15:46:06 +02:00
Serhiy Storchaka 700cfa8c90
bpo-41069: Make TESTFN and the CWD for tests containing non-ascii characters. (GH-21035) 2020-06-25 17:56:31 +03:00
Ashwin Ramaswami 9165addc22
bpo-38576: Disallow control characters in hostnames in http.client (GH-18995)
Add host validation for control characters for more CVE-2019-18348 protection.
2020-03-14 11:56:06 -07:00
Serhiy Storchaka 6a265f0d0c
bpo-39057: Fix urllib.request.proxy_bypass_environment(). (GH-17619)
Ignore leading dots and no longer ignore a trailing newline.
2020-01-05 14:14:31 +02:00
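The leading-dot behaviour can be checked without touching the real environment by passing a proxies mapping explicitly. This is a sketch that assumes the `proxies` argument with a `"no"` key, as accepted by `proxy_bypass_environment()`; the host names are made up.

```python
from urllib.request import proxy_bypass_environment

# Hypothetical no_proxy value with a leading dot on the suffix.
proxies = {"no": ".b.c"}

# The leading dot is ignored, so both the bare name and subdomains match.
bypass_exact = proxy_bypass_environment("b.c", proxies)
bypass_sub = proxy_bypass_environment("a.b.c", proxies)

# A host that merely ends with the same letters is not a suffix match.
bypass_other = proxy_bypass_environment("notb.c", proxies)
```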
Victor Stinner ae7aa42774
Remove code commented for more than 10 years (GH-16965)
test_urllib commented since 2007:

commit d9880d07fc
Author: Facundo Batista <facundobatista@gmail.com>
Date:   Fri May 25 04:20:22 2007 +0000

    Commenting out the tests until find out who can test them in
    one of the problematic environments.

pynche code commented since 1998 and 2001:

commit ef30092207
Author: Barry Warsaw <barry@python.org>
Date:   Tue Dec 15 01:04:38 1998 +0000

    Added most of the mechanism to change the strips from color variations
    to color constants (i.e. red constant, green constant, blue
    constant).  But I haven't hooked this up yet because the UI gets more
    crowded and the arrows don't reflect the correct values.

    Added "Go to Black" and "Go to White" buttons.

commit 741eae0b31
Author: Barry Warsaw <barry@python.org>
Date:   Wed Apr 18 03:51:55 2001 +0000

    StripWidget.__init__(), update_yourself(): Removed some unused local
    variables reported by PyChecker.

    __togglegentype(): PyChecker accurately reported that the variable
    __gentypevar was unused -- actually this whole method is currently
    unused so comment it out.
2019-10-28 22:35:31 +01:00
Stein Karlsen aad2ee0156 bpo-32498: urllib.parse.unquote also accepts bytes (GH-7768) 2019-10-14 13:36:29 +03:00
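A short sketch of the behaviour this change describes, assuming Python 3.9 or later: `unquote()` accepts either `str` or `bytes` and returns `str` in both cases. The sample value is illustrative.

```python
from urllib.parse import unquote

# bytes input is unquoted and then decoded (UTF-8 by default).
from_bytes = unquote(b"caf%C3%A9")

# str input behaves as before.
from_str = unquote("caf%C3%A9")
```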
Ashwin Ramaswami ff2e182865 bpo-12707: deprecate info(), geturl(), getcode() methods in favor of headers, url, and status properties for HTTPResponse and addinfourl (GH-11447)
Co-Authored-By: epicfaace <aramaswamis@gmail.com>
2019-09-13 12:40:07 +01:00
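The preferred attributes can be tried offline by building an `addinfourl` object by hand. This is a sketch under the assumption that `addinfourl(fp, headers, url, code)` is constructed directly; the URL and header values are invented, and real code would normally receive the object from `urlopen()`.

```python
import io
from email.message import Message
from urllib.response import addinfourl

headers = Message()
headers["Content-Type"] = "text/plain"

# Build a response-like object without any network access.
resp = addinfourl(io.BytesIO(b"hello"), headers, "http://example.invalid/", 200)

# Attribute access replaces the deprecated getcode(), geturl() and info().
status = resp.status
url = resp.url
content_type = resp.headers["Content-Type"]
body = resp.read()
resp.close()
```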
Victor Stinner 7cb9204ee1
bpo-37421: urllib.request tests call urlcleanup() (GH-14529)
urllib.request tests now call urlcleanup() to remove temporary files
created by urlretrieve() tests and to clear the _opener global
variable set by urlopen() and functions indirectly calling urlopen().

regrtest now checks if urllib.request._url_tempfiles and
urllib.request._opener are changed by tests.
2019-07-02 14:50:19 +02:00
Victor Stinner eb976e47e2
bpo-36918: Fix "Exception ignored in" in test_urllib (GH-13996)
Mock the HTTPConnection.close() method in a few unit tests to avoid
logging "Exception ignored in: ..." messages.
2019-06-12 04:07:38 +02:00
Victor Stinner 0c2b6a3943
bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474)
CVE-2019-9948: Avoid file reading by disallowing the unnecessary
local_file:// URL scheme in URLopener().open() and
URLopener().retrieve() of urllib.request.

Co-Authored-By: SH <push0ebp@gmail.com>
2019-05-22 22:15:01 +02:00
Berker Peksag 2725cb01d7
bpo-36948: Fix test_urlopener_retrieve_file on Windows (GH-13476) 2019-05-22 02:00:35 +03:00
Xtreak c661b30f89 bpo-36948: Fix NameError in urllib.request.URLopener.retrieve (GH-13389) 2019-05-19 16:40:05 +03:00
Gregory P. Smith b7378d7728
bpo-30458: Use InvalidURL instead of ValueError. (GH-13044)
Use http.client.InvalidURL instead of ValueError as the new error case's exception.
2019-05-01 16:39:21 -04:00
Xtreak 2fc936ed24 bpo-30458: Disable https related urllib tests on a build without ssl (GH-13032)
These tests require an SSL enabled build. Skip these tests when python is built without SSL to fix test failures.


https://bugs.python.org/issue30458
2019-05-01 04:59:48 -07:00
Gregory P. Smith c4e671eec2
bpo-30458: Disallow control chars in http URLs. (GH-12755)
Disallow control chars in http URLs in urllib.urlopen.  This addresses a potential security problem for applications that do not sanity check their URLs where http request headers could be injected.
2019-04-30 19:12:21 -07:00
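The rejection can be observed at the `http.client` layer without opening a socket. The sketch below assumes that `HTTPConnection.putrequest()` validates the request target before anything is sent, and that the failure surfaces as `http.client.InvalidURL` (the exception chosen in the follow-up change). The helper name `path_is_rejected` is hypothetical.

```python
import http.client


def path_is_rejected(path):
    """Return True if http.client refuses to build a request line for path."""
    # Constructing the connection object does not connect to anything.
    conn = http.client.HTTPConnection("localhost")
    try:
        conn.putrequest("GET", path)
    except http.client.InvalidURL:
        return True
    return False


# A CR/LF pair in the path is the classic header-injection attempt.
injected = path_is_rejected("/a\r\nX-Injected: 1")
plain = path_is_rejected("/ok")
```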
Stéphane Wirtel a40681dd5d bpo-36019: Use pythontest.net instead of example.com in network tests (GH-11941) 2019-02-22 14:45:36 +01:00
Senthil Kumaran efbd4ea65d Minor spell fix and formatting fixes in urllib tests. (#959) 2017-04-01 23:47:35 -07:00
Ratnadeep Debnath 21024f0662 bpo-16285: Update urllib quoting to RFC 3986 (#173)
* bpo-16285: Update urllib quoting to RFC 3986

urllib.parse.quote is now based on RFC 3986, and hence
includes `'~'` in the set of characters that is not escaped
by default.

Patch by Christian Theune and Ratnadeep Debnath.
2017-02-25 19:00:28 +10:00
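A quick check of the RFC 3986 behaviour described above, assuming Python 3.7 or later: `~` is left alone by `quote()`, while other reserved or unsafe characters are still escaped. The sample path is made up.

```python
from urllib.parse import quote

# '~' is in the unreserved set and is not escaped; the space still is.
quoted = quote("/~user/my file.txt")
```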
Xiang Zhang c44d58a77a Issue #29142: Merge 3.5. 2017-01-09 11:50:02 +08:00
Xiang Zhang 959ff7f1c6 Issue #29142: Fix suffixes in no_proxy handling in urllib.
In urllib.request, suffixes in no_proxy environment variable with
leading dots could match related hostnames again (e.g. .b.c matches a.b.c).
Patch by Milan Oberkirch.
2017-01-09 11:47:55 +08:00
Christian Heimes d04863771b Issue #28022: Deprecate ssl-related arguments in favor of SSLContext.
The deprecation includes manual creation of SSLSocket and certfile/keyfile
(or similar) in ftplib, httplib, imaplib, smtplib, poplib and urllib.

ssl.wrap_socket() is not marked as deprecated yet.
2016-09-10 23:23:33 +02:00
Martin Panter 0be894b2f6 Issue #27895: Spelling fixes (Contributed by Ville Skyttä). 2016-09-07 12:03:06 +00:00
R David Murray 44b548dda8 #27364: fix "incorrect" uses of escape character in the stdlib.
And most of the tools.

Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
2016-09-08 13:59:53 -04:00
Raymond Hettinger 15f44ab043 Issue #27895: Spelling fixes (Contributed by Ville Skyttä). 2016-08-30 10:47:49 -07:00
Senthil Kumaran 17742f2d45 [merge from 3.4] - Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:39:06 -07:00
Senthil Kumaran 436fe5a447 [merge from 3.3] Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:34:34 -07:00
Senthil Kumaran 4cbb23f8f2 Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:24:16 -07:00
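The HTTPoxy mitigation can be sketched with a temporarily patched environment. The helper `proxies_for` and the proxy URL are hypothetical; the example assumes `getproxies_environment()` reads `os.environ` at call time.

```python
import os
from unittest import mock
from urllib.request import getproxies_environment


def proxies_for(env):
    """Return the proxy mapping urllib would derive from the given environment."""
    with mock.patch.dict(os.environ, env, clear=True):
        return getproxies_environment()


# Outside CGI mode, HTTP_PROXY is honoured.
normal = proxies_for({"HTTP_PROXY": "http://proxy.invalid:3128"})

# With REQUEST_METHOD set (CGI mode), HTTP_PROXY may come from a client's
# "Proxy:" request header, so it is ignored.
cgi = proxies_for({
    "HTTP_PROXY": "http://proxy.invalid:3128",
    "REQUEST_METHOD": "GET",
})
```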
Martin Panter ce6e06874b Issue #14132: Fix redirect handling when target is just a query string 2016-05-16 01:07:13 +00:00
Martin Panter aa27982ffc Issue #26864: Fix case insensitivity and suffix comparison with no_proxy
Patch by Xiang Zhang.
2016-04-30 01:03:40 +00:00
Senthil Kumaran a7c0ff2f0b Issue #26804: urllib.request will prefer lower_case proxy environment variables
over UPPER_CASE or Mixed_Case ones.

Patch contributed by Hans-Peter Jansen. Reviewed by Martin Panter and Senthil Kumaran.
2016-04-25 08:16:23 -07:00
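The lower-case preference can be illustrated in the same way. This sketch assumes a platform with case-sensitive environment variable names (for example Linux); on Windows the two names collapse into one. The helper name and proxy URLs are invented.

```python
import os
from unittest import mock
from urllib.request import getproxies_environment


def proxies_for(env):
    """Return the proxy mapping urllib would derive from the given environment."""
    with mock.patch.dict(os.environ, env, clear=True):
        return getproxies_environment()


# Both spellings are set; the lower-case variable should win.
mixed = proxies_for({
    "HTTP_PROXY": "http://upper.invalid:3128",
    "http_proxy": "http://lower.invalid:3128",
})
```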
Martin Panter 7462b64911 Issue #25523: Correct "a" article to "an" article
This changes the main documentation, doc strings, source code comments, and a
couple error messages in the test suite. In some cases the word was removed
or edited some other way to fix the grammar.
2015-11-02 03:37:02 +00:00
Serhiy Storchaka 9270be7662 Added more tests for urllib.parse utility functions.
These functions are not documented but used in third-party code.
2015-03-02 16:32:29 +02:00
Senthil Kumaran 8b7e161ac3 backport context argument of urlopen (#22366) for PEP 476 2014-09-19 15:23:30 +08:00
Serhiy Storchaka f54c350160 Issue #19524: Fixed resource leak in the HTTP connection when an invalid
response is received.  Patch by Martin Panter.
2014-09-06 21:41:39 +03:00
Benjamin Peterson 3c2dca67ac in ftp cache pruning, avoid changing the size of a dict while iterating over it (closes #21463)
Patch by Skyler Leigh Amador.
2014-06-07 15:08:04 -07:00
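The general technique named in this fix can be sketched as follows. This is not the actual ftp cache code; `prune_expired` and its arguments are hypothetical, and the point is only that iterating over a snapshot of the keys makes deletion safe.

```python
def prune_expired(cache, now):
    """Delete entries whose expiry time has passed; cache maps key -> expiry."""
    # list(cache) takes a snapshot of the keys, so deleting from the dict
    # inside the loop cannot raise "dictionary changed size during iteration".
    for key in list(cache):
        if cache[key] <= now:
            del cache[key]
    return cache


cache = {"a": 5, "b": 20, "c": 10}
pruned = prune_expired(cache, now=10)
```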
Serhiy Storchaka d3e1207191 Issue #20555: Use specific asserts in urllib, httplib, ftplib, cgi, wsgiref tests. 2014-02-08 14:51:10 +02:00
Serhiy Storchaka 25d8aeac7c Issue #20555: Use specific asserts in urllib, httplib, ftplib, cgi, wsgiref tests. 2014-02-08 14:50:08 +02:00
Ezio Melotti a7e7497d88 #18466: merge with 3.3. 2013-08-17 16:58:13 +03:00
Ezio Melotti 85a8629d21 #18466: fix more typos. Patch by Févry Thibault. 2013-08-17 16:57:41 +03:00
Senthil Kumaran f49581c2a1 normalize whitespace 2013-04-10 20:55:58 -07:00
Senthil Kumaran c7e0980259 normalize whitespace. caught by hook 2013-04-10 20:54:23 -07:00
Senthil Kumaran 8b081b7ba1 merge from 3.3
#5609 - test_urllib coverage for url2pathname and pathname2url. Patch
contribution by Thomas Fenzl & Maksim Kozyarchuk
2013-04-10 20:53:12 -07:00
Senthil Kumaran 277e9090b0 #5609 - test_urllib coverage for url2pathname and pathname2url. Patch
contribution by Thomas Fenzl & Maksim Kozyarchuk
2013-04-10 20:51:19 -07:00
Ezio Melotti d8bc0a3693 Merge DeprecationWarnings silencing in test_urllib from 3.3. 2013-02-21 02:55:56 +02:00
Ezio Melotti 79b99dba0f Silence DeprecationWarnings in test_urllib. 2013-02-21 02:41:42 +02:00