Commit Graph

410 Commits

Author SHA1 Message Date
Miss Skeleton (bot) e7c5a43984
bpo-41471: Ignore invalid prefix lengths in system proxy settings on macOS (GH-22762) (GH-22774)
(cherry picked from commit 93a1ccabde)

Co-authored-by: Ronald Oussoren <ronaldoussoren@mac.com>
2020-10-20 09:17:58 +02:00
Irit Katriel 1a3f7c042a
[3.8] bpo-32498: Improve exception message on passing bytes to urllib.parse.unquote (GH-22746) 2020-10-19 00:06:34 +03:00
Miss Islington (bot) ea9e240aa0
bpo-39503: CVE-2020-8492: Fix AbstractBasicAuthHandler (GH-18284) (GH-19296)
The AbstractBasicAuthHandler class of the urllib.request module uses
an inefficient regular expression which can be exploited by an
attacker to cause a denial of service. Fix the regex to prevent the
catastrophic backtracking. Vulnerability reported by Ben Caller
and Matt Schwager.

AbstractBasicAuthHandler of urllib.request now parses all
WWW-Authenticate HTTP headers and accepts multiple challenges per
header: use the realm of the first Basic challenge.

Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>

(cherry picked from commit 0b297d4ff1)
2020-04-02 12:15:55 +02:00
Miss Islington (bot) e4686b7979
bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest Auth (GH-18338)
* bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest authentication

 - The 'qop' value in the 'WWW-Authenticate' header is optional. The
   presence of 'qop' in the header should be checked before its value
   is parsed with 'split'.

Signed-off-by: Stephen Balousek <stephen@balousek.net>

* bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest authentication

 - Add NEWS item

Signed-off-by: Stephen Balousek <stephen@balousek.net>

* Update Misc/NEWS.d/next/Library/2020-02-06-05-33-52.bpo-39548.DF4FFe.rst

Co-Authored-By: Brandt Bucher <brandtbucher@gmail.com>

Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
(cherry picked from commit 5e260e0fde)

Co-authored-by: Stephen Balousek <sbalousek@users.noreply.github.com>
2020-02-29 13:05:23 -08:00
Senthil Kumaran ea316fd215
Revert "[3.8] bpo-27657: Fix urlparse() with numeric paths (GH-16839)" (GH-18525)
This reverts commit 0f3187c1ce.

The change broke the backwards compatibility of parsing behavior in a
patch release of Python (3.8.1). A decision was taken to revert this
patch in 3.8.2.

In https://bugs.python.org/issue27657 it was decided that the previous
behavior like

>>> urlparse('localhost:8080')
ParseResult(scheme='', netloc='', path='localhost:8080', params='', query='', fragment='')

>>> urlparse('undefined:8080')
ParseResult(scheme='', netloc='', path='undefined:8080', params='', query='', fragment='')

needs to be preserved in patch releases as number of users rely upon it.

Explicitly mention the releases involved with the revert in NEWS.
Adopt the wording suggested by @ned-deily.
2020-02-16 13:47:21 -08:00
Miss Islington (bot) fc84d501b9
bpo-39057: Fix urllib.request.proxy_bypass_environment(). (GH-17619)
Ignore leading dots and no longer ignore a trailing newline.
(cherry picked from commit 6a265f0d0c)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2020-01-05 04:32:00 -08:00
Miss Islington (bot) b9e5547f58
bpo-38686: fix HTTP Digest handling in request.py (GH-17045)
* fix HTTP Digest handling in request.py

There is a bug triggered when server replies to a request with `WWW-Authenticate: Digest` where `qop="auth,auth-int"` rather than mere `qop="auth"`. Having both `auth` and `auth-int` is legitimate according to the `qop-options` rule in §3.2.1 of [[https://www.ietf.org/rfc/rfc2617.txt|RFC 2617]]:
>      qop-options       = "qop" "=" <"> 1GH-qop-value <">
>      qop-value         = "auth" | "auth-int" | token
> **qop-options**: [...] If present, it is a quoted string **of one or more** tokens indicating the "quality of protection" values supported by the server.  The value `"auth"` indicates authentication; the value `"auth-int"` indicates authentication with integrity protection

This is description confirmed by the definition of the [//n//]`GH-`[//m//]//rule// extended-BNF pattern defined in §2.1 of [[https://www.ietf.org/rfc/rfc2616.txt|RFC 2616]] as 'a comma-separated list of //rule// with at least //n// and at most //m// items'.

When this reply is parsed by `get_authorization`, request.py only tests for identity with `'auth'`, failing to recognize it as one of the supported modes the server announced, and claims that `"qop 'auth,auth-int' is not supported"`.

* 📜🤖 Added by blurb_it.

* bpo-38686 review fix: remember why.

* fix trailing space in Lib/urllib/request.py

Co-Authored-By: Brandt Bucher <brandtbucher@gmail.com>
(cherry picked from commit 14a89c4798)

Co-authored-by: PypeBros <PypeBros@users.noreply.github.com>
2019-11-22 15:36:38 -08:00
Senthil Kumaran 0f3187c1ce
[3.8] bpo-27657: Fix urlparse() with numeric paths (GH-661) (#16839)
* bpo-27657: Fix urlparse() with numeric paths

Revert parsing decision from bpo-754016 in favor of the documented
consensus in bpo-16932 of how to treat strings without a // to
designate the netloc.

* bpo-22891: Remove urlsplit() optimization for 'http' prefixed inputs.
(cherry picked from commit 5a88d50ff0)

Co-authored-by: Tim Graham <timograham@gmail.com>
2019-10-18 08:23:14 -07:00
Miss Islington (bot) 590ed09a5b
bpo-25068: urllib.request.ProxyHandler now lowercases the dict keys (GH-13489)
(cherry picked from commit b761e3aed1)

Co-authored-by: Zackery Spytz <zspytz@gmail.com>
2019-09-13 07:25:51 -07:00
Miss Islington (bot) 58a1a76bae
bpo-35922: Fix RobotFileParser when robots.txt has no relevant crawl delay or request rate (GH-11791)
Co-Authored-By: Tal Einat <taleinat+github@gmail.com>
(cherry picked from commit 8047e0e1c6)

Co-authored-by: Rémi Lapeyre <remi.lapeyre@henki.fr>
2019-06-16 00:07:54 -07:00
Steve Dower 8d0ef0b5ed bpo-36742: Corrects fix to handle decomposition in usernames (#13812) 2019-06-04 17:55:29 +02:00
Rémi Lapeyre 674ee12600 bpo-35397: Remove deprecation and document urllib.parse.unwrap (GH-11481) 2019-05-27 09:43:45 -04:00
Steve Dower b82e17e626
bpo-36842: Implement PEP 578 (GH-12613)
Adds sys.audit, sys.addaudithook, io.open_code, and associated C APIs.
2019-05-23 08:45:22 -07:00
Victor Stinner 0c2b6a3943
bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474)
CVE-2019-9948: Avoid file reading as disallowing the unnecessary URL
scheme in URLopener().open() and URLopener().retrieve()
of urllib.request.

Co-Authored-By: SH <push0ebp@gmail.com>
2019-05-22 22:15:01 +02:00
Xtreak c661b30f89 bpo-36948: Fix NameError in urllib.request.URLopener.retrieve (GH-13389) 2019-05-19 16:40:05 +03:00
Steve Dower d537ab0ff9
bpo-36742: Fixes handling of pre-normalization characters in urlsplit() (GH-13017) 2019-04-30 12:03:02 +00:00
Jörn Hees 750d74fac5 bpo-12910: update and correct quote docstring (#2568)
Fixes some mistakes and misleadings in the quote function docstring:
- reserved chars are never actually used by quote code, unreserved chars are
- reserved chars were wrong and incomplete
- mentioned that use-case is not minimal quoting wrt. RFC, but cautious quoting
2019-04-09 17:31:18 -07:00
Serhiy Storchaka da0847048a
bpo-36431: Use PEP 448 dict unpacking for merging two dicts. (GH-12553) 2019-03-27 08:02:28 +02:00
Steve Dower 16e6f7dee7
bpo-36216: Add check for characters in netloc that normalize to separators (GH-12201) 2019-03-07 08:02:26 -08:00
Boštjan Mejak 158695817d closes bpo-35309: cpath should be capath (GH-10699) 2018-11-25 12:32:50 -06:00
matthewbelisle-wf 209144831b bpo-34866: Adding max_num_fields to cgi.FieldStorage (GH-9660)
Adding `max_num_fields` to `cgi.FieldStorage` to make DOS attacks harder by
limiting the number of `MiniFieldStorage` objects created by `FieldStorage`.
2018-10-19 03:52:59 -07:00
Christopher Beacham 5db5c0669e bpo-21475: Support the Sitemap extension in robotparser (GH-6883) 2018-05-16 10:52:07 -04:00
Michael Lazar bd08a0af2d bpo-32861: urllib.robotparser fix incomplete __str__ methods. (GH-5711)
The urllib.robotparser's __str__ representation now includes wildcard
entries and the "Crawl-delay" and "Request-rate" fields. Also removes extra
newlines that were being appended to the end of the string.
2018-05-14 17:10:41 +03:00
Cheryl Sabella 0250de4819 bpo-27485: Rename and deprecate undocumented functions in urllib.parse (GH-2205) 2018-04-25 16:51:54 -07:00
Matt Eaton 2cb4661707 bpo-33034: Improve exception message when cast fails for {Parse,Split}Result.port (GH-6078) 2018-03-20 09:41:37 +03:00
Serhiy Storchaka 3f2e6f15d6
Revert unneccessary changes made in bpo-30296 and apply other improvements. (GH-2624) 2018-02-26 16:50:11 +02:00
INADA Naoki 579e0b80b9
urllib.request: Remove unused import (GH-5268) 2018-01-22 16:45:31 +09:00
Коренберг Марк fbd605151f bpo-32323: urllib.parse.urlsplit() must not lowercase() IPv6 scope value (#4867) 2017-12-21 14:16:17 +02:00
Berker Peksag 3df02dbc8e bpo-31325: Fix usage of namedtuple in RobotFileParser.parse() (#4529) 2017-11-23 15:40:26 -08:00
Oren Milman 8df44ee8e0 remove a redundant lower in urllib.parse.urlsplit (#3008) 2017-09-02 21:51:39 -07:00
postmasters 90e01e50ef urllib: Simplify splithost by calling into urlparse. (#1849)
The current regex based splitting produces a wrong result. For example::

  http://abc#@def

Web browsers parse that URL as ``http://abc/#@def``, that is, the host
is ``abc``, the path is ``/``, and the fragment is ``#@def``.
2017-06-20 15:02:44 +02:00
Jon Dufresne 3972628de3 bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489)
* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator
  expression>) when supported (e.g. any(), all(), tuple(), min(), &
  max())
2017-05-18 07:35:54 -07:00
Senthil Kumaran 906f5330b9 bpo-29976: urllib.parse clarify '' in scheme values. (GH-984) 2017-05-17 21:48:59 -07:00
Serhiy Storchaka 55fe1ae970 bpo-30022: Get rid of using EnvironmentError and IOError (except test… (#1051) 2017-04-16 10:46:38 +03:00
Senthil Kumaran 6fab78e902 Remove superfluous comment in urllib.error. (#1076) 2017-04-10 21:08:35 -07:00
Senthil Kumaran 6dfcc81f6b Remove OSError related comment in urllib.request. (#1070) 2017-04-09 19:49:34 -07:00
Senthil Kumaran a2a9ddd923 Remove invalid comment in urllib.request. (#1054) 2017-04-08 23:27:25 -07:00
Senthil Kumaran 257b980b31 correct parse_qs and parse_qsl test case descriptions. (#968)
* correct parse_qs and parse_qsl test case descriptions.
2017-04-04 21:19:43 -07:00
Ratnadeep Debnath 21024f0662 bpo-16285: Update urllib quoting to RFC 3986 (#173)
* bpo-16285: Update urllib quoting to RFC 3986

urllib.parse.quote is now based on RFC 3986, and hence
includes `'~'` in the set of characters that is not escaped
by default.

Patch by Christian Theune and Ratnadeep Debnath.
2017-02-25 19:00:28 +10:00
Xiang Zhang 04c15d5bdc Issue #29142: Merge 3.6. 2017-01-09 11:52:10 +08:00
Xiang Zhang c44d58a77a Issue #29142: Merge 3.5. 2017-01-09 11:50:02 +08:00
Xiang Zhang 959ff7f1c6 Issue #29142: Fix suffixes in no_proxy handling in urllib.
In urllib.request, suffixes in no_proxy environment variable with
leading dots could match related hostnames again (e.g. .b.c matches a.b.c).
Patch by Milan Oberkirch.
2017-01-09 11:47:55 +08:00
Serhiy Storchaka 8cbd3df3ce Issue #28992: Use bytes.fromhex(). 2016-12-21 12:59:28 +02:00
Serhiy Storchaka 70d28a184c Remove unused imports. 2016-12-16 20:00:15 +02:00
Berker Peksag 9a7bbb2e3f Issue #25400: RobotFileParser now correctly returns default values for crawl_delay and request_rate
Initial patch by Peter Wirtz.
2016-09-18 20:17:58 +03:00
Berker Peksag f8479eeb34 Issue #25895: Merge from 3.5 2016-09-16 14:45:15 +03:00
Berker Peksag f676748a05 Issue #25895: Enable WebSocket URL schemes in urllib.parse.urljoin
Patch by Gergely Imreh and Markus Holtermann.
2016-09-16 14:43:58 +03:00
Christian Heimes d04863771b Issue #28022: Deprecate ssl-related arguments in favor of SSLContext.
The deprecation include manual creation of SSLSocket and certfile/keyfile
(or similar) in ftplib, httplib, imaplib, smtplib, poplib and urllib.

ssl.wrap_socket() is not marked as deprecated yet.
2016-09-10 23:23:33 +02:00
Raymond Hettinger b7f3c944d1 Merge 2016-09-09 16:44:53 -07:00
Raymond Hettinger ae9e5f032d Issue #22450: Use "Accept: */*" in the default headers for urllib.request 2016-09-09 16:43:48 -07:00