Commit Graph

381 Commits

Author SHA1 Message Date
Berker Peksag 960e848f0d Issue #16099: RobotFileParser now supports Crawl-delay and Request-rate
extensions.

Patch by Nikolay Bogoychev.
2015-10-08 12:27:06 +03:00
Raymond Hettinger 507343a2ef Add missing docstring 2015-08-18 00:35:52 -07:00
Robert Collins dfa95c9a8f Issue #20059: urllib.parse raises ValueError on all invalid ports.
Patch by Martin Panter.
2015-08-10 09:53:30 +12:00
Robert Collins 1f9a29f31b Issue #24021: docstring for urllib.urlcleanup.
Patch from Daniel Andrade Groppe and Peter Lovett
2015-08-04 12:52:43 +12:00
Robert Collins 2fee5c9367 Issue #24021: docstring for urllib.urlcleanup.
Patch from Daniel Andrade Groppe and Peter Lovett
2015-08-04 12:52:06 +12:00
R David Murray c17686f071 Issue #13866: add *quote_via* argument to urlencode.
Patch by samwyse, completed by Arnon Yaari, and reviewed by
Martin Panter.
2015-05-17 20:44:50 -04:00
Facundo Batista 244afcf26c Issue #23887: urllib.error.HTTPError now has a proper repr() representation. 2015-04-22 18:35:54 -03:00
R David Murray 4c7f995e80 #7159: generalize urllib prior auth support.
This fix is a superset of the functionality introduced by the issue #19494
enhancement, and supersedes that fix.  Instead of a new handler, we have a new
password manager that tracks whether we should send the auth for a given uri.
This allows us to say "always send", satisfying #19494, or track that we've
succeeded in auth and send the creds right away on every *subsequent* request.
The support for using the password manager is added to AbstractBasicAuth,
which means the proxy handler also now can handle prior auth if passed
the new password manager.

Patch by Akshit Khurana, docs mostly by me.
2015-04-16 16:36:18 -04:00
Berker Peksag 20416f7994 Issue #23703: Fix a regression in urljoin() introduced in 901e4e52b20a.
Patch by Demian Brecht.
2015-04-16 02:31:14 +03:00
Serhiy Storchaka 7e7a3dba5f Issue #23865: close() methods in multiple modules now are idempotent and more
robust at shutdown. If needs to release multiple resources, they are released
even if errors are occured.
2015-04-10 13:24:41 +03:00
Serhiy Storchaka 2116b12da5 Issue #23865: close() methods in multiple modules now are idempotent and more
robust at shutdown. If needs to release multiple resources, they are released
even if errors are occured.
2015-04-10 13:29:28 +03:00
Serhiy Storchaka 1515450440 Issue #23411: Added DefragResult, ParseResult, SplitResult, DefragResultBytes,
ParseResultBytes, and SplitResultBytes to urllib.parse.__all__.
Patch by Martin Panter.
2015-04-07 19:09:01 +03:00
Victor Stinner a9dd680d23 (Merge 3.4) Issue #23881: urllib.request.ftpwrapper constructor now closes the
socket if the FTP connection failed to fix a ResourceWarning.
2015-04-07 12:50:24 +02:00
Victor Stinner ab73e65032 Issue #23881: urllib.request.ftpwrapper constructor now closes the socket if
the FTP connection failed to fix a ResourceWarning.
2015-04-07 12:49:27 +02:00
Serhiy Storchaka 44eceb6e2a Issue #23563: Optimized utility functions in urllib.parse. 2015-03-03 20:21:35 +02:00
R David Murray 3ab6ba4744 Merge: #23040: Clarify treatment of encoding and errors when component is bytes. 2014-12-24 21:24:07 -05:00
R David Murray 8c4e112afc #23040: Clarify treatment of encoding and errors when component is bytes.
Patch by Wojtek Ruszczewski.
2014-12-24 21:23:18 -05:00
Benjamin Peterson b666697fa8 use context's check_hostname attribute rather than the HTTPSHandler check_hostname parameter 2014-12-07 13:46:02 -05:00
Benjamin Peterson 074b95da48 merge 3.4 2014-12-07 13:47:39 -05:00
Nick Coghlan c216c48699 Close #19494: add urrlib.request.HTTPBasicPriorAuthHandler
This auth handler adds the Authorization header to the first
HTTP request rather than waiting for a HTTP 401 Unauthorized
response from the server as the default HTTPBasicAuthHandler
does.

This allows working with websites like https://api.github.com which do
not follow the strict interpretation of RFC, but more the dicta in the
end of section 2 of RFC 2617:

    > A client MAY preemptively send the corresponding Authorization
    > header with requests for resources in that space without receipt
    > of another challenge from the server.  Similarly, when a client
    > sends a request to a proxy, it may reuse a userid and password in
    > the Proxy-Authorization header field without receiving another
    > challenge from the proxy server. See section 4 for security
    > considerations associated with Basic authentication.

Patch by Matej Cepl.
2014-11-12 23:33:50 +10:00
Senthil Kumaran a66e3885fb Issue #22278: Fix urljoin problem with relative urls, a regression observed
after changes to issue22118 were submitted.

Patch contributed by Demian Brecht and reviewed by Antoine Pitrou.
2014-09-22 15:49:16 +08:00
Senthil Kumaran 8b7e161ac3 backport context argument of urlopen (#22366) for pep 476 2014-09-19 15:23:30 +08:00
Senthil Kumaran a5c85b3f5f Issue #22366: urllib.request.urlopen will accept a context object (SSLContext)
as an argument which will then used be for HTTPS connection.

Patch by Alex Gaynor.
2014-09-19 15:23:30 +08:00
Serhiy Storchaka 91453026ff Issue #19524: Fixed resource leak in the HTTP connection when an invalid
response is received.  Patch by Martin Panter.
2014-09-06 21:43:49 +03:00
Serhiy Storchaka f54c350160 Issue #19524: Fixed resource leak in the HTTP connection when an invalid
response is received.  Patch by Martin Panter.
2014-09-06 21:41:39 +03:00
Antoine Pitrou 55ac5b3f7b Issue #22118: Switch urllib.parse to use RFC 3986 semantics for the resolution of relative URLs, rather than RFCs 1808 and 2396.
Patch by Demian Brecht.
2014-08-21 19:16:17 -04:00
Senthil Kumaran 2b7ccbda90 merge from 3.4
Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull.
2014-08-20 07:55:53 +05:30
Senthil Kumaran 783737625d Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull. 2014-08-20 07:53:58 +05:30
Senthil Kumaran e2953e5146 merge 3.4; backout changeset 3435c5865cfc due to buildbot failures. Ref #8797 2014-08-16 22:54:24 +05:30
Senthil Kumaran 402df0975c backout changeset 3435c5865cfc due to buildbot failures. Ref #8797 2014-08-16 22:52:37 +05:30
Senthil Kumaran 39e6c07beb merge from 3.4
Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull.
2014-08-16 14:19:09 +05:30
Senthil Kumaran b2e3a939bf Fix Issue #8797: Raise HTTPError on failed Basic Authentication immediately. Initial patch by Sam Bull. 2014-08-16 14:17:38 +05:30
Serhiy Storchaka 465e60e654 Issue #22033: Reprs of most Python implemened classes now contain actual
class name instead of hardcoded one.
2014-07-25 23:36:00 +03:00
Senthil Kumaran 284a4a1bb0 Merge 3.4
Fix localhost checking in FileHandler. Raised in #21970.
2014-07-22 00:16:18 -07:00
Senthil Kumaran bc07ac5180 Fix localhost checking in FileHandler. Raised in #21970. 2014-07-22 00:15:20 -07:00
Benjamin Peterson edb07d28fb merge 3.4 (#21463) 2014-06-07 15:09:36 -07:00
Benjamin Peterson 3c2dca67ac in ftp cache pruning, avoid changing the size of a dict while iterating over it (closes #21463)
Patch by Skyler Leigh Amador.
2014-06-07 15:08:04 -07:00
Raymond Hettinger 38acd4c028 Issue 21469: Minor code modernization (convert and/or expression to an if/else expression).
Suggested by: Tal Einat
2014-05-12 22:22:46 -07:00
Raymond Hettinger 122541bece Issue 21469: Mitigate risk of false positives with robotparser.
* Repair the broken link to norobots-rfc.txt.

* HTTP response codes >= 500 treated as a failed read rather than as a not
found.  Not found means that we can assume the entire site is allowed.  A 5xx
server error tells us nothing.

* A successful read() or parse() updates the mtime (which is defined to be "the
  time the robots.txt file was last fetched").

* The can_fetch() method returns False unless we've had a read() with a 2xx or
4xx response.  This avoids false positives in the case where a user calls
can_fetch() before calling read().

* I don't see any easy way to test this patch without hitting internet
resources that might change or without use of mock objects that wouldn't
provide must reassurance.
2014-05-12 21:56:33 -07:00
Senthil Kumaran 6117e5d8e3 urllib.response object to use _TemporaryFileWrapper (and _TemporaryFileCloser)
facility. Provides a better way to handle file descriptor close.

Address issue #15002 . Patch contributed by Christian Theune.
2014-04-20 09:41:29 -07:00
Senthil Kumaran d8e24f1f71 Convert urllib.request parse_proxy doctests to unittests. 2014-04-14 16:32:20 -04:00
Benjamin Peterson 78c8538461 fix typo 2014-04-01 16:27:30 -04:00
Benjamin Peterson 5dd3caed2b simplify check, since now there are only new-style classes 2014-04-01 14:20:56 -04:00
Victor Stinner d6a91a7ab6 Issue #20879: Delay the initialization of encoding and decoding tables for
base32, ascii85 and base85 codecs in the base64 module, and delay the
initialization of the unquote_to_bytes() table of the urllib.parse module, to
not waste memory if these modules are not used.
2014-03-17 22:38:41 +01:00
Serhiy Storchaka 5d83d1a814 Issue #20270: urllib.urlparse now supports empty ports. 2014-01-18 18:31:41 +02:00
Serhiy Storchaka ff97b08d00 Issue #20270: urllib.urlparse now supports empty ports. 2014-01-18 18:30:33 +02:00
Senthil Kumaran b6fac245b5 Backporing the fix from Issue #12692 2013-12-28 17:36:18 -08:00
Christian Heimes 67986f9431 Issue #19735: Implement private function ssl._create_stdlib_context() to
create SSLContext objects in Python's stdlib module. It provides a single
configuration point and makes use of SSLContext.load_default_certs().
2013-11-23 22:43:47 +01:00
Jason R. Coombs aae6a1d76f Issue #18978: A more elegant technique for resolving the method 2013-09-08 12:54:33 -04:00
Jason R. Coombs 7dc4f4bbab Issue #18978: Allow Request.method to be defined at the class level. 2013-09-08 12:47:07 -04:00
Senthil Kumaran d80f7be580 merge from 3.3
Improve urlencode docstring. Patch by Brian Brazil.
Closes issue #15350
2013-09-05 21:43:53 -07:00
Senthil Kumaran 324ae385fe Improve urlencode docstring. Patch by Brian Brazil. 2013-09-05 21:42:38 -07:00
Brett Cannon cd171c8e92 Issue #18200: Back out usage of ModuleNotFoundError (8d28d44f3a9a) 2013-07-04 17:43:24 -04:00
Brett Cannon 0a140668fa Issue #18200: Update the stdlib (except tests) to use
ModuleNotFoundError.
2013-06-13 20:57:26 -04:00
Senthil Kumaran caa00fec19 Fix #17967 - Fix related to regression on Windows.
os.path.join(*self.dirs) produces an invalid path on windows.
ftp paths are always forward-slash seperated like this. /pub/dir.
2013-06-02 11:59:47 -07:00
Senthil Kumaran dcdadfe39a Fix thishost helper funtion in urllib. Returns the ipaddress of localhost when
hostname is resolvable by socket.gethostname for local machine. This all fixes
certain freebsd builtbot failures.
2013-06-01 11:12:17 -07:00
Senthil Kumaran 4e42ae81f6 Fix #17967: For ftp urls CWD to target instead of hopping to each directory
towards target. This fixes a bug where target is accessible, but parent
directories are restricted.
2013-06-01 08:27:06 -07:00
Senthil Kumaran c70a6ae49b #17403: urllib.parse.robotparser normalizes the urls before adding to ruleline.
This helps in handling certain types invalid urls in a conservative manner.
2013-05-29 05:54:31 -07:00
Senthil Kumaran 5ccf2ff3e9 merge from 3.3
Fix #17967 - Fix related to regression on Windows.

os.path.join(*self.dirs) produces an invalid path on windows.
ftp paths are always forward-slash seperated like this. /pub/dir.
2013-06-02 12:00:45 -07:00
Senthil Kumaran 88249b80d7 merge from 3.3
Fix thishost helper funtion in urllib. Returns the ipaddress of localhost when
hostname is resolvable by socket.gethostname for local machine. This all fixes
certain freebsd builtbot failures.
2013-06-01 11:12:52 -07:00
Senthil Kumaran e9ec2e173d merge from 3.3
Fix #17967: For ftp urls CWD to target instead of hopping to each directory
towards target. This fixes a bug where target is accessible, but parent
directories are restricted.
2013-06-01 08:27:53 -07:00
Senthil Kumaran 6b3026ce72 merge from 3.3
#17403: urllib.parse.robotparser normalizes the urls before adding to
ruleline. This helps in handling certain types invalid urls in a conservative
manner. Patch contributed by Mher Movsisyan.
2013-05-29 05:57:21 -07:00
Senthil Kumaran 8307075ce8 Fix #17272 - Make Request.full_url and Request.get_full_url return same result under all circumstances.
Document the change of Request.full_url to a property.
2013-05-24 09:14:12 -07:00
Benjamin Peterson fa6bdc6d86 merge 3.3 2013-05-12 19:02:05 -05:00
Benjamin Peterson 901a278861 use correct format code for exceptions 2013-05-12 19:01:52 -05:00
Senthil Kumaran 5238092592 Issue #17272: Making the urllib.request's Request.full_url a descriptor. Fixes
bugs with assignment to full_url. Patch by Demian Brecht.
2013-04-25 05:45:48 -07:00
Raymond Hettinger 56b0a3d89a Remove redundant imports 2013-04-06 20:53:12 -07:00
Senthil Kumaran 4a2ab120f3 Issue #17483: 3.3 Branch - Remove unreachable code in urllib.request 2013-04-04 19:34:02 -07:00
R David Murray c616604a15 Merge: Use repr when printing unknown url type in urlopen. 2013-04-03 07:01:07 -04:00
R David Murray d8a46969f7 Use repr when printing unknown url type in urlopen. 2013-04-03 06:58:34 -04:00
Antoine Pitrou 9a8d6934df Issue #17483: remove unreachable code in urlopen(). 2013-04-01 18:55:35 +02:00
R David Murray 9cc7d45571 #17485: Delete the Content-Length header if the data attribute is deleted.
This is a follow on to issue 16464.  Original patch by Daniel Wozniak.
2013-03-20 00:10:51 -04:00
Senthil Kumaran 41518b4af0 #17474 - Remove the various deprecated methods of Request class. 2013-03-18 18:06:00 -07:00
Serhiy Storchaka c12956ddf2 Issue #1285086: Get rid of the refcounting hack and speed up
urllib.parse.unquote() and urllib.parse.unquote_to_bytes().
2013-03-14 21:34:55 +02:00
Serhiy Storchaka a9d24e6766 Issue #1285086: Get rid of the refcounting hack and speed up
urllib.parse.unquote() and urllib.parse.unquote_to_bytes().
2013-03-14 21:33:35 +02:00
Serhiy Storchaka 8ea4616f16 Issue #1285086: Get rid of the refcounting hack and speed up
urllib.parse.unquote() and urllib.parse.unquote_to_bytes().
2013-03-14 21:31:37 +02:00
Andrew Svetlov f7a17b48d7 Replace IOError with OSError (#16715) 2012-12-25 16:47:37 +02:00
Senthil Kumaran 750909e618 Fix issue16713 - tel url parsing with params 2012-12-24 14:01:48 -08:00
Senthil Kumaran bd6667aae3 Fix issue16713 - tel url parsing with params 2012-12-24 14:01:13 -08:00
Senthil Kumaran ed30199e78 Fix issue16713 - tel url parsing with params 2012-12-24 14:00:20 -08:00
Senthil Kumaran 0a6b9eca68 merge from 3.2
Fix Issue15701 - HTTPError info method call raises AttributeError. Fix that to return headers correctly
2012-12-23 09:12:13 -08:00
Senthil Kumaran 41e66a26b0 Fix Issue15701 - HTTPError info method call raises AttributeError. Fix that to return headers correctly 2012-12-23 09:04:24 -08:00
Andrew Svetlov 2606a6f197 Issue #16719: Get rid of WindowsError. Use OSError instead
Patch by Serhiy Storchaka.
2012-12-19 14:33:35 +02:00
Andrew Svetlov 0832af6628 Issue #16717: get rid of socket.error, replace with OSError 2012-12-18 23:10:48 +02:00
Andrew Svetlov 3438fa496d Get rig of EnvironmentError (#16705) 2012-12-17 23:35:18 +02:00
Senthil Kumaran 5962cce050 Fix Issue15701 : add .headers attribute to urllib.error.HTTPError 2012-12-10 02:09:35 -08:00
Andrew Svetlov bff98fe536 Issue #16464: reset Request's Content-Length header on .data change.
It will be recalculated on sending request to HTTP server.

Patch by Alexey Kachayev
2012-11-27 23:06:19 +02:00
Antoine Pitrou df204be922 Issue #16423: urllib.request now has support for ``data:`` URLs.
Patch by Mathias Panzenböck.
2012-11-24 17:59:08 +01:00
Gregory P. Smith b696610c2d Fixes issue #16409: The reporthook callback made by the legacy
urllib.request.urlretrieve API now properly supplies a constant
non-zero block_size as it did in Python 3.2 and 2.7.  This matches the
behavior of urllib.request.URLopener.retrieve.
2012-11-10 13:44:50 -08:00
Gregory P. Smith 6b0bdab429 Fixes issue #16409: The reporthook callback made by the legacy
urllib.request.urlretrieve API now properly supplies a constant
non-zero block_size as it did in Python 3.2 and 2.7.  This matches the
behavior of urllib.request.URLopener.retrieve.
2012-11-10 13:43:44 -08:00
Senthil Kumaran cc2f0421c7 Issue #16250: Fix URLError invocation with proper args 2012-10-27 02:48:21 -07:00
Senthil Kumaran cad7b31467 Issue #16250: Fix URLError invocation with proper args. 2012-10-27 02:26:46 -07:00
Senthil Kumaran 40d8078f41 Issue #16301: Fix the localhost verification in urllib/request.py for file://. Modify tests to use localhost for local temp files, which could make Windows Buildbot (#16300) happy 2012-10-22 09:43:04 -07:00
Senthil Kumaran 3ebef36eea Issue #16250: Fix the invocations of URLError which had misplaced filename attribute for exception 2012-10-21 18:31:25 -07:00
Senthil Kumaran f577686fd3 Issue #10836: Fix exception raised when file not found in urlretrieve 2012-10-21 13:30:02 -07:00
Nadeem Vawda bd26b5463e Issue #12692: Fix resource leak in urllib.request. 2012-10-21 17:37:43 +02:00
Giampaolo Rodola' 2d51f687e1 Fix issue #16270: urllib may hang when used for retrieving files via FTP by using a context manager. 2012-10-19 13:40:28 +02:00
Giampaolo Rodola' b0cc91290c Fix issue #16270: urllib may hang when used for retrieving files via FTP by using a context manager. 2012-10-19 13:34:32 +02:00
Giampaolo Rodola' 89e92854b6 Fix issue #16270: urllib may hang when used for retrieving files via FTP by using a context manager. 2012-10-19 13:25:17 +02:00
Georg Brandl 491b1dc79e Closes #9374: merge with 3.2 2012-08-24 18:15:46 +02:00