mirror of https://github.com/python/cpython
Move the urllib-inherited API to a distinguished section
parent 0358a17838
commit b8eb9cbd71
@@ -104,52 +104,6 @@ The :mod:`urllib.request` module defines the following functions:
   member variable to modify its position in the handlers list.


.. function:: urlretrieve(url, filename=None, reporthook=None, data=None)

   Copy a network object denoted by a URL to a local file, if necessary. If the URL
   points to a local file, or a valid cached copy of the object exists, the object
   is not copied. Return a tuple ``(filename, headers)`` where *filename* is the
   local file name under which the object can be found, and *headers* is whatever
   the :meth:`info` method of the object returned by :func:`urlopen` returned (for
   a remote object, possibly cached). Exceptions are the same as for
   :func:`urlopen`.

   The second argument, if present, specifies the file location to copy to (if
   absent, the location will be a tempfile with a generated name). The third
   argument, if present, is a hook function that will be called once on
   establishment of the network connection and once after each block read
   thereafter. The hook will be passed three arguments: a count of blocks
   transferred so far, a block size in bytes, and the total size of the file. The
   hook's third argument may be ``-1`` on older FTP servers which do not return a
   file size in response to a retrieval request.

   If the *url* uses the :file:`http:` scheme identifier, the optional *data*
   argument may be given to specify a ``POST`` request (normally the request type
   is ``GET``). The *data* argument must be in standard
   :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
   function below.

   :func:`urlretrieve` will raise :exc:`ContentTooShortError` when it detects that
   the amount of data available was less than the expected amount (which is the
   size reported by a *Content-Length* header). This can occur, for example, when
   the download is interrupted.

   The *Content-Length* is treated as a lower bound: if there's more data to read,
   :func:`urlretrieve` reads more data, but if less data is available, it raises
   the exception.

   You can still retrieve the downloaded data in this case; it is stored in the
   :attr:`content` attribute of the exception instance.

   If no *Content-Length* header was supplied, :func:`urlretrieve` cannot check the
   size of the data it has downloaded, and just returns it. In this case you just
   have to assume that the download was successful.

.. function:: urlcleanup()

   Clear the cache that may have been built up by previous calls to
   :func:`urlretrieve`.

.. function:: pathname2url(path)

   Convert the pathname *path* from the local syntax for a path to the form used in

@@ -218,116 +172,6 @@ The following classes are provided:
   fetching of the image, this should be true.


.. class:: URLopener(proxies=None, **x509)

   Base class for opening and reading URLs. Unless you need to support opening
   objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`,
   you probably want to use :class:`FancyURLopener`.

   By default, the :class:`URLopener` class sends a :mailheader:`User-Agent` header
   of ``urllib/VVV``, where *VVV* is the :mod:`urllib` version number.
   Applications can define their own :mailheader:`User-Agent` header by subclassing
   :class:`URLopener` or :class:`FancyURLopener` and setting the class attribute
   :attr:`version` to an appropriate string value in the subclass definition.

   The optional *proxies* parameter should be a dictionary mapping scheme names to
   proxy URLs, where an empty dictionary turns proxies off completely. Its default
   value is ``None``, in which case environmental proxy settings will be used if
   present, as discussed in the definition of :func:`urlopen`, above.

   Additional keyword parameters, collected in *x509*, may be used for
   authentication of the client when using the :file:`https:` scheme. The keywords
   *key_file* and *cert_file* are supported to provide an SSL key and certificate;
   both are needed to support client authentication.

   :class:`URLopener` objects will raise an :exc:`IOError` exception if the server
   returns an error code.

   .. method:: open(fullurl, data=None)

      Open *fullurl* using the appropriate protocol. This method sets up cache and
      proxy information, then calls the appropriate open method with its input
      arguments. If the scheme is not recognized, :meth:`open_unknown` is called.
      The *data* argument has the same meaning as the *data* argument of
      :func:`urlopen`.


   .. method:: open_unknown(fullurl, data=None)

      Overridable interface to open unknown URL types.


   .. method:: retrieve(url, filename=None, reporthook=None, data=None)

      Retrieve the contents of *url* and place them in *filename*. The return value
      is a tuple consisting of a local filename and either a
      :class:`email.message.Message` object containing the response headers (for remote
      URLs) or ``None`` (for local URLs). The caller must then open and read the
      contents of *filename*. If *filename* is not given and the URL refers to a
      local file, the input filename is returned. If the URL is non-local and
      *filename* is not given, the filename is the output of :func:`tempfile.mktemp`
      with a suffix that matches the suffix of the last path component of the input
      URL. If *reporthook* is given, it must be a function accepting three numeric
      parameters. It will be called after each chunk of data is read from the
      network. *reporthook* is ignored for local URLs.

      If the *url* uses the :file:`http:` scheme identifier, the optional *data*
      argument may be given to specify a ``POST`` request (normally the request type
      is ``GET``). The *data* argument must be in standard
      :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
      function below.


   .. attribute:: version

      Variable that specifies the user agent of the opener object. To get
      :mod:`urllib` to tell servers that it is a particular user agent, set this in a
      subclass as a class variable or in the constructor before calling the base
      constructor.


.. class:: FancyURLopener(...)

   :class:`FancyURLopener` subclasses :class:`URLopener` providing default handling
   for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x
   response codes listed above, the :mailheader:`Location` header is used to fetch
   the actual URL. For 401 response codes (authentication required), basic HTTP
   authentication is performed. For the 30x response codes, recursion is bounded
   by the value of the *maxtries* attribute, which defaults to 10.

   For all other response codes, the method :meth:`http_error_default` is called,
   which you can override in subclasses to handle the error appropriately.

   .. note::

      According to the letter of :rfc:`2616`, 301 and 302 responses to POST requests
      must not be automatically redirected without confirmation by the user. In
      reality, browsers do allow automatic redirection of these responses, changing
      the POST to a GET, and :mod:`urllib` reproduces this behaviour.

   The parameters to the constructor are the same as those for :class:`URLopener`.

   .. note::

      When performing basic authentication, a :class:`FancyURLopener` instance calls
      its :meth:`prompt_user_passwd` method. The default implementation asks the
      user for the required information on the controlling terminal. A subclass may
      override this method to support more appropriate behavior if needed.

   The :class:`FancyURLopener` class offers one additional method that should be
   overridden to provide the appropriate behavior:

   .. method:: prompt_user_passwd(host, realm)

      Return information needed to authenticate the user at the given host in the
      specified security realm. The return value should be a tuple, ``(user,
      password)``, which can be used for basic authentication.

      The implementation prompts for this information on the terminal; an application
      should override this method to use an appropriate interaction model in the local
      environment.


.. class:: OpenerDirector()

   The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained

@@ -1220,6 +1064,170 @@ The following example uses no proxies at all, overriding environment settings::
   >>> f.read().decode('utf-8')


Legacy interface
----------------

The following functions and classes are ported from the Python 2 module
``urllib`` (as opposed to ``urllib2``). They might become deprecated at
some point in the future.


.. function:: urlretrieve(url, filename=None, reporthook=None, data=None)

   Copy a network object denoted by a URL to a local file, if necessary. If the URL
   points to a local file, or a valid cached copy of the object exists, the object
   is not copied. Return a tuple ``(filename, headers)`` where *filename* is the
   local file name under which the object can be found, and *headers* is whatever
   the :meth:`info` method of the object returned by :func:`urlopen` returned (for
   a remote object, possibly cached). Exceptions are the same as for
   :func:`urlopen`.

   The second argument, if present, specifies the file location to copy to (if
   absent, the location will be a tempfile with a generated name). The third
   argument, if present, is a hook function that will be called once on
   establishment of the network connection and once after each block read
   thereafter. The hook will be passed three arguments: a count of blocks
   transferred so far, a block size in bytes, and the total size of the file. The
   hook's third argument may be ``-1`` on older FTP servers which do not return a
   file size in response to a retrieval request.

   If the *url* uses the :file:`http:` scheme identifier, the optional *data*
   argument may be given to specify a ``POST`` request (normally the request type
   is ``GET``). The *data* argument must be in standard
   :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
   function below.

   :func:`urlretrieve` will raise :exc:`ContentTooShortError` when it detects that
   the amount of data available was less than the expected amount (which is the
   size reported by a *Content-Length* header). This can occur, for example, when
   the download is interrupted.

   The *Content-Length* is treated as a lower bound: if there's more data to read,
   :func:`urlretrieve` reads more data, but if less data is available, it raises
   the exception.

   You can still retrieve the downloaded data in this case; it is stored in the
   :attr:`content` attribute of the exception instance.

   If no *Content-Length* header was supplied, :func:`urlretrieve` cannot check the
   size of the data it has downloaded, and just returns it. In this case you just
   have to assume that the download was successful.

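As a rough illustration of the *reporthook* protocol described for
:func:`urlretrieve`, the sketch below copies a local file through a ``file:``
URL so that it runs without network access. The hook name ``report`` and the
file names are placeholders invented for this example. A target filename is
passed explicitly because, without one, a local file is not copied at all.

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

# Each hook call is recorded as (block_count, block_size, total_size).
calls = []


def report(block_count, block_size, total_size):
    """Hypothetical progress hook: just remember what we were told."""
    calls.append((block_count, block_size, total_size))


with tempfile.TemporaryDirectory() as tmp:
    # Create a local "network object" to copy.
    source = Path(tmp) / "source.bin"
    source.write_bytes(b"x" * 20000)
    target = Path(tmp) / "copy.bin"

    # Path.as_uri() produces a file: URL for the absolute source path.
    filename, headers = urlretrieve(source.as_uri(), str(target),
                                    reporthook=report)
    copied = target.read_bytes()

# The first call happens before any block is read (block_count == 0);
# later calls follow each block that was transferred.
for block_count, block_size, total_size in calls:
    print(block_count, block_size, total_size)
```

With the 20000-byte source used here the hook receives the same total on every
call; when the size is not reported, the third value would be ``-1`` instead.
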
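To make the *data* requirement concrete, the following sketch builds a
form-encoded body with :func:`urllib.parse.urlencode`. The field names are made
up for illustration, and the ``encode`` step assumes current Python 3
behaviour, where request data is expected to be bytes rather than ``str``. No
request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical form fields; any mapping of names to values works.
form = {"name": "Ada Lovelace", "lang": "en"}

# urlencode() returns a str in application/x-www-form-urlencoded format;
# encode it because request bodies are expected to be bytes.
data = urlencode(form).encode("ascii")

# The result would then be passed along as, for example,
# urlretrieve(url, data=data) for an http: URL.
print(data)
```
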
.. function:: urlcleanup()

   Clear the cache that may have been built up by previous calls to
   :func:`urlretrieve`.

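A small offline sketch of the cleanup behaviour, assuming current CPython,
where :func:`urlretrieve` writes non-local objects to a temporary file and
:func:`urlcleanup` deletes such files. A ``data:`` URL stands in for a remote
resource here; that choice is an assumption made only so the example runs
without a network.

```python
import os
from urllib.request import urlcleanup, urlretrieve

# No filename is given, so the payload lands in a generated temporary file.
filename, headers = urlretrieve("data:text/plain,hello")

with open(filename, "rb") as f:
    payload = f.read()

existed_before = os.path.exists(filename)

# Remove temporary files left behind by earlier urlretrieve() calls.
urlcleanup()

exists_after = os.path.exists(filename)
print(payload, existed_before, exists_after)
```
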
.. class:: URLopener(proxies=None, **x509)

   Base class for opening and reading URLs. Unless you need to support opening
   objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`,
   you probably want to use :class:`FancyURLopener`.

   By default, the :class:`URLopener` class sends a :mailheader:`User-Agent` header
   of ``urllib/VVV``, where *VVV* is the :mod:`urllib` version number.
   Applications can define their own :mailheader:`User-Agent` header by subclassing
   :class:`URLopener` or :class:`FancyURLopener` and setting the class attribute
   :attr:`version` to an appropriate string value in the subclass definition.

   The optional *proxies* parameter should be a dictionary mapping scheme names to
   proxy URLs, where an empty dictionary turns proxies off completely. Its default
   value is ``None``, in which case environmental proxy settings will be used if
   present, as discussed in the definition of :func:`urlopen`, above.

   Additional keyword parameters, collected in *x509*, may be used for
   authentication of the client when using the :file:`https:` scheme. The keywords
   *key_file* and *cert_file* are supported to provide an SSL key and certificate;
   both are needed to support client authentication.

   :class:`URLopener` objects will raise an :exc:`IOError` exception if the server
   returns an error code.

   .. method:: open(fullurl, data=None)

      Open *fullurl* using the appropriate protocol. This method sets up cache and
      proxy information, then calls the appropriate open method with its input
      arguments. If the scheme is not recognized, :meth:`open_unknown` is called.
      The *data* argument has the same meaning as the *data* argument of
      :func:`urlopen`.


   .. method:: open_unknown(fullurl, data=None)

      Overridable interface to open unknown URL types.


   .. method:: retrieve(url, filename=None, reporthook=None, data=None)

      Retrieve the contents of *url* and place them in *filename*. The return value
      is a tuple consisting of a local filename and either a
      :class:`email.message.Message` object containing the response headers (for remote
      URLs) or ``None`` (for local URLs). The caller must then open and read the
      contents of *filename*. If *filename* is not given and the URL refers to a
      local file, the input filename is returned. If the URL is non-local and
      *filename* is not given, the filename is the output of :func:`tempfile.mktemp`
      with a suffix that matches the suffix of the last path component of the input
      URL. If *reporthook* is given, it must be a function accepting three numeric
      parameters. It will be called after each chunk of data is read from the
      network. *reporthook* is ignored for local URLs.

      If the *url* uses the :file:`http:` scheme identifier, the optional *data*
      argument may be given to specify a ``POST`` request (normally the request type
      is ``GET``). The *data* argument must be in standard
      :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode`
      function below.


   .. attribute:: version

      Variable that specifies the user agent of the opener object. To get
      :mod:`urllib` to tell servers that it is a particular user agent, set this in a
      subclass as a class variable or in the constructor before calling the base
      constructor.


.. class:: FancyURLopener(...)

   :class:`FancyURLopener` subclasses :class:`URLopener` providing default handling
   for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x
   response codes listed above, the :mailheader:`Location` header is used to fetch
   the actual URL. For 401 response codes (authentication required), basic HTTP
   authentication is performed. For the 30x response codes, recursion is bounded
   by the value of the *maxtries* attribute, which defaults to 10.

   For all other response codes, the method :meth:`http_error_default` is called,
   which you can override in subclasses to handle the error appropriately.

   .. note::

      According to the letter of :rfc:`2616`, 301 and 302 responses to POST requests
      must not be automatically redirected without confirmation by the user. In
      reality, browsers do allow automatic redirection of these responses, changing
      the POST to a GET, and :mod:`urllib` reproduces this behaviour.

   The parameters to the constructor are the same as those for :class:`URLopener`.

   .. note::

      When performing basic authentication, a :class:`FancyURLopener` instance calls
      its :meth:`prompt_user_passwd` method. The default implementation asks the
      user for the required information on the controlling terminal. A subclass may
      override this method to support more appropriate behavior if needed.

   The :class:`FancyURLopener` class offers one additional method that should be
   overridden to provide the appropriate behavior:

   .. method:: prompt_user_passwd(host, realm)

      Return information needed to authenticate the user at the given host in the
      specified security realm. The return value should be a tuple, ``(user,
      password)``, which can be used for basic authentication.

      The implementation prompts for this information on the terminal; an application
      should override this method to use an appropriate interaction model in the local
      environment.


:mod:`urllib.request` Restrictions
----------------------------------
