Commit Graph

394 Commits

Author SHA1 Message Date
Miss Islington (bot) 45d6547acf
bpo-35922: Fix RobotFileParser when robots.txt has no relevant crawl delay or request rate (GH-11791)
Co-Authored-By: Tal Einat <taleinat+github@gmail.com>
(cherry picked from commit 8047e0e1c6)

Co-authored-by: Rémi Lapeyre <remi.lapeyre@henki.fr>
2019-06-16 00:10:06 -07:00
Miss Islington (bot) 250b62acc5
bpo-36742: Corrects fix to handle decomposition in usernames (GH-13812)
(cherry picked from commit 8d0ef0b5ed)

Co-authored-by: Steve Dower <steve.dower@python.org>
2019-06-04 09:15:13 -07:00
Victor Stinner 34bab21559
bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474) (GH-13505)
CVE-2019-9948: Avoid file reading as disallowing the unnecessary URL
scheme in URLopener().open() and URLopener().retrieve()
of urllib.request.

Co-Authored-By: SH <push0ebp@gmail.com>
(cherry picked from commit 0c2b6a3943)
2019-05-22 23:28:28 +02:00
Miss Islington (bot) 4d723e76e1
bpo-36742: Fixes handling of pre-normalization characters in urlsplit() (GH-13017)
(cherry picked from commit d537ab0ff9)

Co-authored-by: Steve Dower <steve.dower@python.org>
2019-04-30 05:21:02 -07:00
Miss Islington (bot) 796698adf5
bpo-12910: update and correct quote docstring (GH-2568)
Fixes some mistakes and misleadings in the quote function docstring:
- reserved chars are never actually used by quote code, unreserved chars are
- reserved chars were wrong and incomplete
- mentioned that use-case is not minimal quoting wrt. RFC, but cautious quoting
(cherry picked from commit 750d74fac5)

Co-authored-by: Jörn Hees <joernhees@users.noreply.github.com>
2019-04-09 17:53:03 -07:00
Steve Dower daad2c482c
bpo-36216: Add check for characters in netloc that normalize to separators (GH-12201) 2019-03-07 09:08:18 -08:00
Miss Islington (bot) 6a528ccb4c closes bpo-35309: cpath should be capath (GH-10701)
(cherry picked from commit 158695817d)

Co-authored-by: Boštjan Mejak <bostjan.xperia@gmail.com>
2018-11-25 14:51:02 -06:00
Miss Islington (bot) a66f279a13
bpo-34866: Adding max_num_fields to cgi.FieldStorage (GH-9660)
Adding `max_num_fields` to `cgi.FieldStorage` to make DOS attacks harder by
limiting the number of `MiniFieldStorage` objects created by `FieldStorage`.
(cherry picked from commit 209144831b)

Co-authored-by: matthewbelisle-wf <matthew.belisle@workiva.com>
2018-10-19 04:11:16 -07:00
Miss Islington (bot) c3fa1f2b93 [3.7] bpo-32861: urllib.robotparser fix incomplete __str__ methods. (GH-5711) (GH-6795)
The urllib.robotparser's __str__ representation now includes wildcard
entries and the "Crawl-delay" and "Request-rate" fields.
(cherry picked from commit bd08a0af2d)

Co-authored-by: Michael Lazar <lazar.michael22@gmail.com>
2018-05-14 21:14:30 +03:00
Miss Islington (bot) f8a3485dcd
Revert unneccessary changes made in bpo-30296 and apply other improvements. (GH-2624)
(cherry picked from commit 3f2e6f15d6)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2018-02-26 08:22:24 -08:00
INADA Naoki 579e0b80b9
urllib.request: Remove unused import (GH-5268) 2018-01-22 16:45:31 +09:00
Коренберг Марк fbd605151f bpo-32323: urllib.parse.urlsplit() must not lowercase() IPv6 scope value (#4867) 2017-12-21 14:16:17 +02:00
Berker Peksag 3df02dbc8e bpo-31325: Fix usage of namedtuple in RobotFileParser.parse() (#4529) 2017-11-23 15:40:26 -08:00
Oren Milman 8df44ee8e0 remove a redundant lower in urllib.parse.urlsplit (#3008) 2017-09-02 21:51:39 -07:00
postmasters 90e01e50ef urllib: Simplify splithost by calling into urlparse. (#1849)
The current regex based splitting produces a wrong result. For example::

  http://abc#@def

Web browsers parse that URL as ``http://abc/#@def``, that is, the host
is ``abc``, the path is ``/``, and the fragment is ``#@def``.
2017-06-20 15:02:44 +02:00
Jon Dufresne 3972628de3 bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489)
* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator
  expression>) when supported (e.g. any(), all(), tuple(), min(), &
  max())
2017-05-18 07:35:54 -07:00
Senthil Kumaran 906f5330b9 bpo-29976: urllib.parse clarify '' in scheme values. (GH-984) 2017-05-17 21:48:59 -07:00
Serhiy Storchaka 55fe1ae970 bpo-30022: Get rid of using EnvironmentError and IOError (except test… (#1051) 2017-04-16 10:46:38 +03:00
Senthil Kumaran 6fab78e902 Remove superfluous comment in urllib.error. (#1076) 2017-04-10 21:08:35 -07:00
Senthil Kumaran 6dfcc81f6b Remove OSError related comment in urllib.request. (#1070) 2017-04-09 19:49:34 -07:00
Senthil Kumaran a2a9ddd923 Remove invalid comment in urllib.request. (#1054) 2017-04-08 23:27:25 -07:00
Senthil Kumaran 257b980b31 correct parse_qs and parse_qsl test case descriptions. (#968)
* correct parse_qs and parse_qsl test case descriptions.
2017-04-04 21:19:43 -07:00
Ratnadeep Debnath 21024f0662 bpo-16285: Update urllib quoting to RFC 3986 (#173)
* bpo-16285: Update urllib quoting to RFC 3986

urllib.parse.quote is now based on RFC 3986, and hence
includes `'~'` in the set of characters that is not escaped
by default.

Patch by Christian Theune and Ratnadeep Debnath.
2017-02-25 19:00:28 +10:00
Xiang Zhang 04c15d5bdc Issue #29142: Merge 3.6. 2017-01-09 11:52:10 +08:00
Xiang Zhang c44d58a77a Issue #29142: Merge 3.5. 2017-01-09 11:50:02 +08:00
Xiang Zhang 959ff7f1c6 Issue #29142: Fix suffixes in no_proxy handling in urllib.
In urllib.request, suffixes in no_proxy environment variable with
leading dots could match related hostnames again (e.g. .b.c matches a.b.c).
Patch by Milan Oberkirch.
2017-01-09 11:47:55 +08:00
Serhiy Storchaka 8cbd3df3ce Issue #28992: Use bytes.fromhex(). 2016-12-21 12:59:28 +02:00
Serhiy Storchaka 70d28a184c Remove unused imports. 2016-12-16 20:00:15 +02:00
Berker Peksag 9a7bbb2e3f Issue #25400: RobotFileParser now correctly returns default values for crawl_delay and request_rate
Initial patch by Peter Wirtz.
2016-09-18 20:17:58 +03:00
Berker Peksag f8479eeb34 Issue #25895: Merge from 3.5 2016-09-16 14:45:15 +03:00
Berker Peksag f676748a05 Issue #25895: Enable WebSocket URL schemes in urllib.parse.urljoin
Patch by Gergely Imreh and Markus Holtermann.
2016-09-16 14:43:58 +03:00
Christian Heimes d04863771b Issue #28022: Deprecate ssl-related arguments in favor of SSLContext.
The deprecation include manual creation of SSLSocket and certfile/keyfile
(or similar) in ftplib, httplib, imaplib, smtplib, poplib and urllib.

ssl.wrap_socket() is not marked as deprecated yet.
2016-09-10 23:23:33 +02:00
Raymond Hettinger b7f3c944d1 Merge 2016-09-09 16:44:53 -07:00
Raymond Hettinger ae9e5f032d Issue #22450: Use "Accept: */*" in the default headers for urllib.request 2016-09-09 16:43:48 -07:00
Martin Panter 3c0d0baf2b Issue #12319: Support for chunked encoding of HTTP request bodies
When the body object is a file, its size is no longer determined with
fstat(), since that can report the wrong result (e.g. reading from a pipe).
Instead, determine the size using seek(), or fall back to chunked encoding
for unseekable files.

Also, change the logic for detecting text files to check for TextIOBase
inheritance, rather than inspecting the “mode” attribute, which may not
exist (e.g. BytesIO and StringIO).  The Content-Length for text files is no
longer determined ahead of time, because the original logic could have been
wrong depending on the codec and newline translation settings.

Patch by Demian Brecht and Rolf Krahl, with a few tweaks by me.
2016-08-24 06:33:33 +00:00
Senthil Kumaran cde03fa038 [merge from 3.5] - Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:51:13 -07:00
Senthil Kumaran 17742f2d45 [merge from 3.4] - Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:39:06 -07:00
Senthil Kumaran 436fe5a447 [merge from 3.3] Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:34:34 -07:00
Senthil Kumaran 4cbb23f8f2 Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:24:16 -07:00
Martin Panter 29f256909f Issue #22797: Synchronize urlopen() doc string with RST documentation 2016-06-04 05:06:34 +00:00
Martin Panter 0f29ad1be5 More typo fixes for 3.6 2016-06-04 05:06:25 +00:00
R David Murray d2367c651e Clean up urlopen doc string.
Clarifies what is returned when and that the methods are common between the two.

Patch by Alexander Liu as part of #22797.
2016-06-03 20:16:06 -04:00
Martin Panter 0b39a556e8 Issue #14132, Issue #17214: Merge two redirect handling fixes from 3.5 2016-05-16 07:45:28 +00:00
Martin Panter e6f060903c Issue #17214: Percent-encode non-ASCII bytes in redirect targets
Some servers send Location header fields with non-ASCII bytes, but "http.
client" requires the request target to be ASCII-encodable, otherwise a
UnicodeEncodeError is raised. Based on patch by Christian Heimes.

Python 2 does not suffer any problem because it allows non-ASCII bytes in the
HTTP request target.
2016-05-16 01:14:20 +00:00
Martin Panter ce6e06874b Issue #14132: Fix redirect handling when target is just a query string 2016-05-16 01:07:13 +00:00
Senthil Kumaran 5d1110a952 merge from 3.5
Issue #26892: Honor debuglevel flag in urllib.request.HTTPHandler.

Patch contributed by Chi Hsuan Yen.
2016-05-13 01:35:29 -07:00
Senthil Kumaran 9642eedc0a Issue #26892: Honor debuglevel flag in urllib.request.HTTPHandler.
Patch contributed by Chi Hsuan Yen.
2016-05-13 01:32:42 -07:00
Martin Panter 1ce738e08f Merge typo fixes from 3.5 2016-05-08 14:02:35 +00:00
Martin Panter f0564164ba Fix typos in comments, documentation and test method names 2016-05-08 13:48:10 +00:00
Martin Panter 51b697b7f3 Issue #26864: Merge no_proxy fixes from 3.5 2016-04-30 01:30:57 +00:00