Commit Graph

412 Commits

Author SHA1 Message Date
Raymond Hettinger b7f3c944d1 Merge 2016-09-09 16:44:53 -07:00
Raymond Hettinger ae9e5f032d Issue #22450: Use "Accept: */*" in the default headers for urllib.request 2016-09-09 16:43:48 -07:00
Martin Panter 3c0d0baf2b Issue #12319: Support for chunked encoding of HTTP request bodies
When the body object is a file, its size is no longer determined with
fstat(), since that can report the wrong result (e.g. reading from a pipe).
Instead, determine the size using seek(), or fall back to chunked encoding
for unseekable files.

Also, change the logic for detecting text files to check for TextIOBase
inheritance, rather than inspecting the “mode” attribute, which may not
exist (e.g. BytesIO and StringIO).  The Content-Length for text files is no
longer determined ahead of time, because the original logic could have been
wrong depending on the codec and newline translation settings.

Patch by Demian Brecht and Rolf Krahl, with a few tweaks by me.
2016-08-24 06:33:33 +00:00
Senthil Kumaran cde03fa038 [merge from 3.5] - Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:51:13 -07:00
Senthil Kumaran 17742f2d45 [merge from 3.4] - Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:39:06 -07:00
Senthil Kumaran 436fe5a447 [merge from 3.3] Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:34:34 -07:00
Senthil Kumaran 4cbb23f8f2 Prevent HTTPoxy attack (CVE-2016-1000110)
Ignore the HTTP_PROXY variable when REQUEST_METHOD environment is set, which
indicates that the script is in CGI mode.

Issue #27568 Reported and patch contributed by Rémi Rampin.
2016-07-30 23:24:16 -07:00
Martin Panter 29f256909f Issue #22797: Synchronize urlopen() doc string with RST documentation 2016-06-04 05:06:34 +00:00
Martin Panter 0f29ad1be5 More typo fixes for 3.6 2016-06-04 05:06:25 +00:00
R David Murray d2367c651e Clean up urlopen doc string.
Clarifies what is returned when and that the methods are common between the two.

Patch by Alexander Liu as part of #22797.
2016-06-03 20:16:06 -04:00
Martin Panter 0b39a556e8 Issue #14132, Issue #17214: Merge two redirect handling fixes from 3.5 2016-05-16 07:45:28 +00:00
Martin Panter e6f060903c Issue #17214: Percent-encode non-ASCII bytes in redirect targets
Some servers send Location header fields with non-ASCII bytes, but "http.
client" requires the request target to be ASCII-encodable, otherwise a
UnicodeEncodeError is raised. Based on patch by Christian Heimes.

Python 2 does not suffer any problem because it allows non-ASCII bytes in the
HTTP request target.
2016-05-16 01:14:20 +00:00
Martin Panter ce6e06874b Issue #14132: Fix redirect handling when target is just a query string 2016-05-16 01:07:13 +00:00
Senthil Kumaran 5d1110a952 merge from 3.5
Issue #26892: Honor debuglevel flag in urllib.request.HTTPHandler.

Patch contributed by Chi Hsuan Yen.
2016-05-13 01:35:29 -07:00
Senthil Kumaran 9642eedc0a Issue #26892: Honor debuglevel flag in urllib.request.HTTPHandler.
Patch contributed by Chi Hsuan Yen.
2016-05-13 01:32:42 -07:00
Martin Panter 1ce738e08f Merge typo fixes from 3.5 2016-05-08 14:02:35 +00:00
Martin Panter f0564164ba Fix typos in comments, documentation and test method names 2016-05-08 13:48:10 +00:00
Martin Panter 51b697b7f3 Issue #26864: Merge no_proxy fixes from 3.5 2016-04-30 01:30:57 +00:00
Martin Panter aa27982ffc Issue #26864: Fix case insensitivity and suffix comparison with no_proxy
Patch by Xiang Zhang.
2016-04-30 01:03:40 +00:00
Senthil Kumaran 0996fa3bd8 merge 3.5
Issue #26804: urllib.request will prefer lower_case proxy environment variables
over UPPER_CASE or Mixed_Case ones.

Patch contributed by Hans-Peter Jansen. Reviewed by Martin Panter and Senthil Kumaran.
2016-04-25 08:18:07 -07:00
Senthil Kumaran a7c0ff2f0b Issue #26804: urllib.request will prefer lower_case proxy environment variables
over UPPER_CASE or Mixed_Case ones.

Patch contributed by Hans-Peter Jansen. Reviewed by Martin Panter and Senthil Kumaran.
2016-04-25 08:16:23 -07:00
Berker Peksag 48238c7e37 Issue #2202: Fix UnboundLocalError in AbstractDigestAuthHandler.get_algorithm_impls
Raise ValueError if algorithm is not MD5 or SHA.

Initial patch by Mathieu Dupuy.
2016-03-06 16:17:47 +02:00
Berker Peksag e88dd1c32c Issue #2202: Fix UnboundLocalError in AbstractDigestAuthHandler.get_algorithm_impls
Raise ValueError if algorithm is not MD5 or SHA.

Initial patch by Mathieu Dupuy.
2016-03-06 16:16:40 +02:00
Serhiy Storchaka 885bdc4946 Issue #25985: sys.version_info is now used instead of sys.version
to format short Python version.
2016-02-11 13:10:36 +02:00
Martin Panter a3643c280f Issue #12923: Merge FancyURLopener fix from 3.5 2016-02-06 01:08:40 +00:00
Martin Panter a03702252f Issue #12923: Reset FancyURLopener's redirect counter even on exception
Based on patches by Brian Brazil and Daniel Rocco.
2016-02-04 06:01:35 +00:00
Senthil Kumaran 0b57f0adde merge from 3.5
Remove unnecessary test case comment in urllib.parse.py. These are asserted as test cases.
2016-01-25 18:54:37 -08:00
Senthil Kumaran d4e51f45a9 Remove unnecessary test case comment in urllib.parse.py. These are asserted as test cases. 2016-01-25 18:53:34 -08:00
Senthil Kumaran 86f7109dad Issue #25822: Add docstrings to the fields of urllib.parse results.
Patch contributed by Swati Jaiswal.
2016-01-14 00:11:39 -08:00
Serhiy Storchaka 3fd4a735d8 Issue #25899: Converted non-ASCII characters in docstrings and manpage
to ASCII replacements.  Removed UTF-8 BOM from Misc/NEWS.
Original patch by Chris Angelico.
2015-12-18 13:10:37 +02:00
Martin Panter f65dd1d4db Issue #25576: Apply fix to new urlopen() doc string 2015-11-24 23:00:37 +00:00
Berker Peksag 960e848f0d Issue #16099: RobotFileParser now supports Crawl-delay and Request-rate
extensions.

Patch by Nikolay Bogoychev.
2015-10-08 12:27:06 +03:00
Raymond Hettinger 507343a2ef Add missing docstring 2015-08-18 00:35:52 -07:00
Robert Collins dfa95c9a8f Issue #20059: urllib.parse raises ValueError on all invalid ports.
Patch by Martin Panter.
2015-08-10 09:53:30 +12:00
Robert Collins 1f9a29f31b Issue #24021: docstring for urllib.urlcleanup.
Patch from Daniel Andrade Groppe and Peter Lovett
2015-08-04 12:52:43 +12:00
Robert Collins 2fee5c9367 Issue #24021: docstring for urllib.urlcleanup.
Patch from Daniel Andrade Groppe and Peter Lovett
2015-08-04 12:52:06 +12:00
R David Murray c17686f071 Issue #13866: add *quote_via* argument to urlencode.
Patch by samwyse, completed by Arnon Yaari, and reviewed by
Martin Panter.
2015-05-17 20:44:50 -04:00
Facundo Batista 244afcf26c Issue #23887: urllib.error.HTTPError now has a proper repr() representation. 2015-04-22 18:35:54 -03:00
R David Murray 4c7f995e80 #7159: generalize urllib prior auth support.
This fix is a superset of the functionality introduced by the issue #19494
enhancement, and supersedes that fix.  Instead of a new handler, we have a new
password manager that tracks whether we should send the auth for a given uri.
This allows us to say "always send", satisfying #19494, or track that we've
succeeded in auth and send the creds right away on every *subsequent* request.
The support for using the password manager is added to AbstractBasicAuth,
which means the proxy handler also now can handle prior auth if passed
the new password manager.

Patch by Akshit Khurana, docs mostly by me.
2015-04-16 16:36:18 -04:00
Berker Peksag 20416f7994 Issue #23703: Fix a regression in urljoin() introduced in 901e4e52b20a.
Patch by Demian Brecht.
2015-04-16 02:31:14 +03:00
Serhiy Storchaka 7e7a3dba5f Issue #23865: close() methods in multiple modules now are idempotent and more
robust at shutdown. If needs to release multiple resources, they are released
even if errors are occured.
2015-04-10 13:24:41 +03:00
Serhiy Storchaka 2116b12da5 Issue #23865: close() methods in multiple modules now are idempotent and more
robust at shutdown. If needs to release multiple resources, they are released
even if errors are occured.
2015-04-10 13:29:28 +03:00
Serhiy Storchaka 1515450440 Issue #23411: Added DefragResult, ParseResult, SplitResult, DefragResultBytes,
ParseResultBytes, and SplitResultBytes to urllib.parse.__all__.
Patch by Martin Panter.
2015-04-07 19:09:01 +03:00
Victor Stinner a9dd680d23 (Merge 3.4) Issue #23881: urllib.request.ftpwrapper constructor now closes the
socket if the FTP connection failed to fix a ResourceWarning.
2015-04-07 12:50:24 +02:00
Victor Stinner ab73e65032 Issue #23881: urllib.request.ftpwrapper constructor now closes the socket if
the FTP connection failed to fix a ResourceWarning.
2015-04-07 12:49:27 +02:00
Serhiy Storchaka 44eceb6e2a Issue #23563: Optimized utility functions in urllib.parse. 2015-03-03 20:21:35 +02:00
R David Murray 3ab6ba4744 Merge: #23040: Clarify treatment of encoding and errors when component is bytes. 2014-12-24 21:24:07 -05:00
R David Murray 8c4e112afc #23040: Clarify treatment of encoding and errors when component is bytes.
Patch by Wojtek Ruszczewski.
2014-12-24 21:23:18 -05:00
Benjamin Peterson b666697fa8 use context's check_hostname attribute rather than the HTTPSHandler check_hostname parameter 2014-12-07 13:46:02 -05:00
Benjamin Peterson 074b95da48 merge 3.4 2014-12-07 13:47:39 -05:00
Nick Coghlan c216c48699 Close #19494: add urrlib.request.HTTPBasicPriorAuthHandler
This auth handler adds the Authorization header to the first
HTTP request rather than waiting for a HTTP 401 Unauthorized
response from the server as the default HTTPBasicAuthHandler
does.

This allows working with websites like https://api.github.com which do
not follow the strict interpretation of RFC, but more the dicta in the
end of section 2 of RFC 2617:

    > A client MAY preemptively send the corresponding Authorization
    > header with requests for resources in that space without receipt
    > of another challenge from the server.  Similarly, when a client
    > sends a request to a proxy, it may reuse a userid and password in
    > the Proxy-Authorization header field without receiving another
    > challenge from the proxy server. See section 4 for security
    > considerations associated with Basic authentication.

Patch by Matej Cepl.
2014-11-12 23:33:50 +10:00
Senthil Kumaran a66e3885fb Issue #22278: Fix urljoin problem with relative urls, a regression observed
after changes to issue22118 were submitted.

Patch contributed by Demian Brecht and reviewed by Antoine Pitrou.
2014-09-22 15:49:16 +08:00
Senthil Kumaran 8b7e161ac3 backport context argument of urlopen (#22366) for pep 476 2014-09-19 15:23:30 +08:00
Senthil Kumaran a5c85b3f5f Issue #22366: urllib.request.urlopen will accept a context object (SSLContext)
as an argument which will then used be for HTTPS connection.

Patch by Alex Gaynor.
2014-09-19 15:23:30 +08:00
Serhiy Storchaka 91453026ff Issue #19524: Fixed resource leak in the HTTP connection when an invalid
response is received.  Patch by Martin Panter.
2014-09-06 21:43:49 +03:00
Serhiy Storchaka f54c350160 Issue #19524: Fixed resource leak in the HTTP connection when an invalid
response is received.  Patch by Martin Panter.
2014-09-06 21:41:39 +03:00
Antoine Pitrou 55ac5b3f7b Issue #22118: Switch urllib.parse to use RFC 3986 semantics for the resolution of relative URLs, rather than RFCs 1808 and 2396.
Patch by Demian Brecht.
2014-08-21 19:16:17 -04:00
Senthil Kumaran 2b7ccbda90 merge from 3.4
Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull.
2014-08-20 07:55:53 +05:30
Senthil Kumaran 783737625d Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull. 2014-08-20 07:53:58 +05:30
Senthil Kumaran e2953e5146 merge 3.4; backout changeset 3435c5865cfc due to buildbot failures. Ref #8797 2014-08-16 22:54:24 +05:30
Senthil Kumaran 402df0975c backout changeset 3435c5865cfc due to buildbot failures. Ref #8797 2014-08-16 22:52:37 +05:30
Senthil Kumaran 39e6c07beb merge from 3.4
Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull.
2014-08-16 14:19:09 +05:30
Senthil Kumaran b2e3a939bf Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull. 2014-08-16 14:17:38 +05:30
Serhiy Storchaka 465e60e654 Issue #22033: Reprs of most Python implemened classes now contain actual
class name instead of hardcoded one.
2014-07-25 23:36:00 +03:00
Senthil Kumaran 284a4a1bb0 Merge 3.4
Fix localhost checking in FileHandler. Raised in #21970.
2014-07-22 00:16:18 -07:00
Senthil Kumaran bc07ac5180 Fix localhost checking in FileHandler. Raised in #21970. 2014-07-22 00:15:20 -07:00
Benjamin Peterson edb07d28fb merge 3.4 (#21463) 2014-06-07 15:09:36 -07:00
Benjamin Peterson 3c2dca67ac in ftp cache pruning, avoid changing the size of a dict while iterating over it (closes #21463)
Patch by Skyler Leigh Amador.
2014-06-07 15:08:04 -07:00
Raymond Hettinger 38acd4c028 Issue 21469: Minor code modernization (convert and/or expression to an if/else expression).
Suggested by: Tal Einat
2014-05-12 22:22:46 -07:00
Raymond Hettinger 122541bece Issue 21469: Mitigate risk of false positives with robotparser.
* Repair the broken link to norobots-rfc.txt.

* HTTP response codes >= 500 treated as a failed read rather than as a not
found.  Not found means that we can assume the entire site is allowed.  A 5xx
server error tells us nothing.

* A successful read() or parse() updates the mtime (which is defined to be "the
  time the robots.txt file was last fetched").

* The can_fetch() method returns False unless we've had a read() with a 2xx or
4xx response.  This avoids false positives in the case where a user calls
can_fetch() before calling read().

* I don't see any easy way to test this patch without hitting internet
resources that might change or without use of mock objects that wouldn't
provide must reassurance.
2014-05-12 21:56:33 -07:00
Senthil Kumaran 6117e5d8e3 urllib.response object to use _TemporaryFileWrapper (and _TemporaryFileCloser)
facility. Provides a better way to handle file descriptor close.

Address issue #15002 . Patch contributed by Christian Theune.
2014-04-20 09:41:29 -07:00
Senthil Kumaran d8e24f1f71 Convert urllib.request parse_proxy doctests to unittests. 2014-04-14 16:32:20 -04:00
Benjamin Peterson 78c8538461 fix typo 2014-04-01 16:27:30 -04:00
Benjamin Peterson 5dd3caed2b simplify check, since now there are only new-style classes 2014-04-01 14:20:56 -04:00
Victor Stinner d6a91a7ab6 Issue #20879: Delay the initialization of encoding and decoding tables for
base32, ascii85 and base85 codecs in the base64 module, and delay the
initialization of the unquote_to_bytes() table of the urllib.parse module, to
not waste memory if these modules are not used.
2014-03-17 22:38:41 +01:00
Serhiy Storchaka 5d83d1a814 Issue #20270: urllib.urlparse now supports empty ports. 2014-01-18 18:31:41 +02:00
Serhiy Storchaka ff97b08d00 Issue #20270: urllib.urlparse now supports empty ports. 2014-01-18 18:30:33 +02:00
Senthil Kumaran b6fac245b5 Backporing the fix from Issue #12692 2013-12-28 17:36:18 -08:00
Christian Heimes 67986f9431 Issue #19735: Implement private function ssl._create_stdlib_context() to
create SSLContext objects in Python's stdlib module. It provides a single
configuration point and makes use of SSLContext.load_default_certs().
2013-11-23 22:43:47 +01:00
Jason R. Coombs aae6a1d76f Issue #18978: A more elegant technique for resolving the method 2013-09-08 12:54:33 -04:00
Jason R. Coombs 7dc4f4bbab Issue #18978: Allow Request.method to be defined at the class level. 2013-09-08 12:47:07 -04:00
Senthil Kumaran d80f7be580 merge from 3.3
Improve urlencode docstring. Patch by Brian Brazil.
Closes issue #15350
2013-09-05 21:43:53 -07:00
Senthil Kumaran 324ae385fe Improve urlencode docstring. Patch by Brian Brazil. 2013-09-05 21:42:38 -07:00
Brett Cannon cd171c8e92 Issue #18200: Back out usage of ModuleNotFoundError (8d28d44f3a9a) 2013-07-04 17:43:24 -04:00
Brett Cannon 0a140668fa Issue #18200: Update the stdlib (except tests) to use
ModuleNotFoundError.
2013-06-13 20:57:26 -04:00
Senthil Kumaran caa00fec19 Fix #17967 - Fix related to regression on Windows.
os.path.join(*self.dirs) produces an invalid path on windows.
ftp paths are always forward-slash seperated like this. /pub/dir.
2013-06-02 11:59:47 -07:00
Senthil Kumaran dcdadfe39a Fix thishost helper funtion in urllib. Returns the ipaddress of localhost when
hostname is resolvable by socket.gethostname for local machine. This all fixes
certain freebsd builtbot failures.
2013-06-01 11:12:17 -07:00
Senthil Kumaran 4e42ae81f6 Fix #17967: For ftp urls CWD to target instead of hopping to each directory
towards target. This fixes a bug where target is accessible, but parent
directories are restricted.
2013-06-01 08:27:06 -07:00
Senthil Kumaran c70a6ae49b #17403: urllib.parse.robotparser normalizes the urls before adding to ruleline.
This helps in handling certain types invalid urls in a conservative manner.
2013-05-29 05:54:31 -07:00
Senthil Kumaran 5ccf2ff3e9 merge from 3.3
Fix #17967 - Fix related to regression on Windows.

os.path.join(*self.dirs) produces an invalid path on windows.
ftp paths are always forward-slash seperated like this. /pub/dir.
2013-06-02 12:00:45 -07:00
Senthil Kumaran 88249b80d7 merge from 3.3
Fix thishost helper funtion in urllib. Returns the ipaddress of localhost when
hostname is resolvable by socket.gethostname for local machine. This all fixes
certain freebsd builtbot failures.
2013-06-01 11:12:52 -07:00
Senthil Kumaran e9ec2e173d merge from 3.3
Fix #17967: For ftp urls CWD to target instead of hopping to each directory
towards target. This fixes a bug where target is accessible, but parent
directories are restricted.
2013-06-01 08:27:53 -07:00
Senthil Kumaran 6b3026ce72 merge from 3.3
#17403: urllib.parse.robotparser normalizes the urls before adding to
ruleline. This helps in handling certain types invalid urls in a conservative
manner. Patch contributed by Mher Movsisyan.
2013-05-29 05:57:21 -07:00
Senthil Kumaran 8307075ce8 Fix #17272 - Make Request.full_url and Request.get_full_url return same result under all circumstances.
Document the change of Request.full_url to a property.
2013-05-24 09:14:12 -07:00
Benjamin Peterson fa6bdc6d86 merge 3.3 2013-05-12 19:02:05 -05:00
Benjamin Peterson 901a278861 use correct format code for exceptions 2013-05-12 19:01:52 -05:00
Senthil Kumaran 5238092592 Issue #17272: Making the urllib.request's Request.full_url a descriptor. Fixes
bugs with assignment to full_url. Patch by Demian Brecht.
2013-04-25 05:45:48 -07:00
Raymond Hettinger 56b0a3d89a Remove redundant imports 2013-04-06 20:53:12 -07:00
Senthil Kumaran 4a2ab120f3 Issue #17483: 3.3 Branch - Remove unreachable code in urllib.request 2013-04-04 19:34:02 -07:00
R David Murray c616604a15 Merge: Use repr when printing unknown url type in urlopen. 2013-04-03 07:01:07 -04:00