Fix old urllib/urllib2/urlparse usage.

Georg Brandl 2008-06-23 11:44:14 +00:00
parent 0f7ede4569
commit 029986af24
7 changed files with 42 additions and 39 deletions
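For orientation, the renames applied throughout this commit follow the
Python 3 stdlib reorganization; a minimal sketch of the mapping (not code
from the commit itself)::

    import urllib.request      # was urllib2 (and urllib.urlopen)
    import urllib.parse        # was urlparse (and urllib.urlencode/unquote)
    import urllib.error        # was urllib2.URLError / urllib2.HTTPError
    import urllib.robotparser  # was robotparser

    # the pattern substituted throughout the files below:
    req = urllib.request.Request('http://www.example.com/')
    data = urllib.parse.urlencode({'key': 'value'})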

View File

@@ -100,7 +100,7 @@ The following classes are provided:
 .. seealso::

-   Module :mod:`urllib2`
+   Module :mod:`urllib.request`
       URL opening with automatic cookie handling.

    Module :mod:`http.cookies`
@@ -149,11 +149,11 @@ contained :class:`Cookie` objects.
 the :class:`CookieJar`'s :class:`CookiePolicy` instance are true and false
 respectively), the :mailheader:`Cookie2` header is also added when appropriate.

-The *request* object (usually a :class:`urllib2.Request` instance) must support
-the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`get_type`,
-:meth:`unverifiable`, :meth:`get_origin_req_host`, :meth:`has_header`,
-:meth:`get_header`, :meth:`header_items`, and :meth:`add_unredirected_header`, as
-documented by :mod:`urllib2`.
+The *request* object (usually a :class:`urllib.request.Request` instance)
+must support the methods :meth:`get_full_url`, :meth:`get_host`,
+:meth:`get_type`, :meth:`unverifiable`, :meth:`get_origin_req_host`,
+:meth:`has_header`, :meth:`get_header`, :meth:`header_items`, and
+:meth:`add_unredirected_header`, as documented by :mod:`urllib.request`.

 .. method:: CookieJar.extract_cookies(response, request)
@@ -166,14 +166,15 @@ contained :class:`Cookie` objects.
 as appropriate (subject to the :meth:`CookiePolicy.set_ok` method's approval).

 The *response* object (usually the result of a call to
-:meth:`urllib2.urlopen`, or similar) should support an :meth:`info` method,
-which returns a :class:`email.message.Message` instance.
+:meth:`urllib.request.urlopen`, or similar) should support an :meth:`info`
+method, which returns a :class:`email.message.Message` instance.

-The *request* object (usually a :class:`urllib2.Request` instance) must support
-the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`unverifiable`, and
-:meth:`get_origin_req_host`, as documented by :mod:`urllib2`. The request is
-used to set default values for cookie-attributes as well as for checking that
-the cookie is allowed to be set.
+The *request* object (usually a :class:`urllib.request.Request` instance)
+must support the methods :meth:`get_full_url`, :meth:`get_host`,
+:meth:`unverifiable`, and :meth:`get_origin_req_host`, as documented by
+:mod:`urllib.request`. The request is used to set default values for
+cookie-attributes as well as for checking that the cookie is allowed to be
+set.

 .. method:: CookieJar.set_policy(policy)
@@ -715,18 +716,18 @@ Examples
 The first example shows the most common usage of :mod:`http.cookiejar`::

-   import http.cookiejar, urllib2
+   import http.cookiejar, urllib.request
    cj = http.cookiejar.CookieJar()
-   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
+   opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    r = opener.open("http://example.com/")

 This example illustrates how to open a URL using your Netscape, Mozilla, or Lynx
 cookies (assumes Unix/Netscape convention for location of the cookies file)::

-   import os, http.cookiejar, urllib2
+   import os, http.cookiejar, urllib.request
    cj = http.cookiejar.MozillaCookieJar()
    cj.load(os.path.join(os.environ["HOME"], ".netscape/cookies.txt"))
-   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
+   opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    r = opener.open("http://example.com/")

 The next example illustrates the use of :class:`DefaultCookiePolicy`. Turn on
@@ -734,12 +735,12 @@ RFC 2965 cookies, be more strict about domains when setting and returning
 Netscape cookies, and block some domains from setting cookies or having them
 returned::

-   import urllib2
+   import urllib.request
    from http.cookiejar import CookieJar, DefaultCookiePolicy
    policy = DefaultCookiePolicy(
        rfc2965=True, strict_ns_domain=Policy.DomainStrict,
        blocked_domains=["ads.net", ".ads.net"])
    cj = CookieJar(policy)
-   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
+   opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    r = opener.open("http://example.com/")
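A side note on the second example: ``Policy.DomainStrict`` is not a defined
name in that snippet; the constant actually lives on the policy class, so a
runnable version (unchanged otherwise) would read::

    import urllib.request
    from http.cookiejar import CookieJar, DefaultCookiePolicy

    # DomainStrict is a class attribute of DefaultCookiePolicy
    policy = DefaultCookiePolicy(
        rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict,
        blocked_domains=["ads.net", ".ads.net"])
    cj = CookieJar(policy)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    r = opener.open("http://example.com/")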

View File

@@ -1077,7 +1077,7 @@ Adding HTTP headers:
 Use the *headers* argument to the :class:`Request` constructor, or::

-   import urllib
+   import urllib.request
    req = urllib.request.Request('http://www.example.com/')
    req.add_header('Referer', 'http://www.python.org/')
    r = urllib.request.urlopen(req)
@@ -1085,7 +1085,7 @@ Use the *headers* argument to the :class:`Request` constructor, or::
 :class:`OpenerDirector` automatically adds a :mailheader:`User-Agent` header to
 every :class:`Request`. To change this::

-   import urllib
+   import urllib.request
    opener = urllib.request.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    opener.open('http://www.example.com/')

View File

@@ -1305,7 +1305,7 @@ class CookieJar:
         return attrs

     def add_cookie_header(self, request):
-        """Add correct Cookie: header to request (urllib2.Request object).
+        """Add correct Cookie: header to request (urllib.request.Request object).

         The Cookie2 header is also added unless policy.hide_cookie2 is true.
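For reference, the method this docstring describes is typically driven with a
``urllib.request.Request`` directly (a minimal sketch, not part of the diff)::

    import http.cookiejar, urllib.request

    cj = http.cookiejar.CookieJar()
    req = urllib.request.Request('http://example.com/')
    cj.add_cookie_header(req)   # adds any matching Cookie: header to req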

View File

@@ -1002,11 +1002,11 @@ class HTTPHandler(logging.Handler):
         Send the record to the Web server as an URL-encoded dictionary
         """
         try:
-            import http.client, urllib
+            import http.client, urllib.parse
             host = self.host
             h = http.client.HTTP(host)
             url = self.url
-            data = urllib.urlencode(self.mapLogRecord(record))
+            data = urllib.parse.urlencode(self.mapLogRecord(record))
             if self.method == "GET":
                 if (url.find('?') >= 0):
                     sep = '&'
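Note that the ``http.client.HTTP`` compatibility class used above did not
survive into released Python 3; the handler was later rebuilt around
``HTTPConnection``. A hedged sketch of that emit flow (names mirror the
handler's attributes, not this commit's code)::

    import http.client, urllib.parse

    def emit_record(host, url, method, mapping):
        # sketch only: build the request the way HTTPHandler.emit() does
        data = urllib.parse.urlencode(mapping)
        conn = http.client.HTTPConnection(host)
        if method == "GET":
            sep = '&' if '?' in url else '?'
            url = url + sep + data
        conn.putrequest(method, url)
        if method == "POST":
            conn.putheader("Content-type", "application/x-www-form-urlencoded")
            conn.putheader("Content-length", str(len(data)))
        conn.endheaders()
        if method == "POST":
            conn.send(data.encode('utf-8'))
        conn.getresponse()    # can't do anything useful with the result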

View File

@@ -47,24 +47,25 @@ _call_chain conventions
 Example usage:

-import urllib2
+import urllib.request

 # set up authentication info
-authinfo = urllib2.HTTPBasicAuthHandler()
+authinfo = urllib.request.HTTPBasicAuthHandler()
 authinfo.add_password(realm='PDQ Application',
                       uri='https://mahler:8092/site-updates.py',
                       user='klem',
                       passwd='geheim$parole')

-proxy_support = urllib2.ProxyHandler({"http" : "http://ahad-haam:3128"})
+proxy_support = urllib.request.ProxyHandler({"http" : "http://ahad-haam:3128"})

 # build a new opener that adds authentication and caching FTP handlers
-opener = urllib2.build_opener(proxy_support, authinfo, urllib2.CacheFTPHandler)
+opener = urllib.request.build_opener(proxy_support, authinfo,
+                                     urllib.request.CacheFTPHandler)

 # install it
-urllib2.install_opener(opener)
+urllib.request.install_opener(opener)

-f = urllib2.urlopen('http://www.python.org/')
+f = urllib.request.urlopen('http://www.python.org/')
 """

 # XXX issues:
@@ -502,7 +503,7 @@ class HTTPRedirectHandler(BaseHandler):
         # Strictly (according to RFC 2616), 301 or 302 in response to
         # a POST MUST NOT cause a redirection without confirmation
-        # from the user (of urllib2, in this case).  In practice,
+        # from the user (of urllib.request, in this case).  In practice,
         # essentially all clients do redirect in this case, so we do
         # the same.
         # be conciliant with URIs containing a space
@@ -655,7 +656,7 @@ class ProxyHandler(BaseHandler):
         if proxy_type is None:
             proxy_type = orig_type
         if user and password:
-            user_pass = '%s:%s' % (unquote(user),
+            user_pass = '%s:%s' % (urllib.parse.unquote(user),
                                    urllib.parse.unquote(password))
             creds = base64.b64encode(user_pass.encode()).decode("ascii")
             req.add_header('Proxy-authorization', 'Basic ' + creds)
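The hunk above is the code path that honors credentials embedded in a proxy
URL: they are percent-decoded, base64-encoded, and sent as a
Proxy-Authorization header. A small usage sketch (proxy host and credentials
are placeholders)::

    import urllib.request

    # '%24' decodes to '$', which is why unquote() matters here
    proxy = urllib.request.ProxyHandler(
        {'http': 'http://klem:geheim%24parole@proxy.example.com:3128'})
    opener = urllib.request.build_opener(proxy)
    r = opener.open('http://www.example.com/')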
@@ -808,7 +809,7 @@ class ProxyBasicAuthHandler(AbstractBasicAuthHandler, BaseHandler):
     def http_error_407(self, req, fp, code, msg, headers):
         # http_error_auth_reqed requires that there is no userinfo component in
-        # authority.  Assume there isn't one, since urllib2 does not (and
+        # authority.  Assume there isn't one, since urllib.request does not (and
         # should not, RFC 3986 s. 3.2.1) support requests for URLs containing
         # userinfo.
         authority = req.get_host()
@@ -1194,7 +1195,7 @@ class FileHandler(BaseHandler):
                 return urllib.response.addinfourl(open(localfile, 'rb'),
                                                   headers, 'file:'+file)
         except OSError as msg:
-            # urllib2 users shouldn't expect OSErrors coming from urlopen()
+            # users shouldn't expect OSErrors coming from urlopen()
             raise urllib.error.URLError(msg)
         raise urllib.error.URLError('file not on local host')
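The comment change keeps the behavioral contract: file: URLs that cannot be
served surface as URLError rather than OSError. A quick illustration (the
path is a deliberately non-existent placeholder)::

    import urllib.request, urllib.error

    try:
        f = urllib.request.urlopen('file:///no/such/path')
    except urllib.error.URLError as exc:
        print('urlopen failed:', exc.reason)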

View File

@@ -9,7 +9,8 @@ bootstrap issues (/usr/bin/python is Python 2.3 on OSX 10.4)
 Usage: see USAGE variable in the script.
 """
-import platform, os, sys, getopt, textwrap, shutil, urllib2, stat, time, pwd
+import platform, os, sys, getopt, textwrap, shutil, stat, time, pwd
+import urllib.request
 import grp

 INCLUDE_TIMESTAMP = 1
@@ -442,7 +443,7 @@ def downloadURL(url, fname):
     if KNOWNSIZES.get(url) == size:
         print("Using existing file for", url)
         return
-    fpIn = urllib2.urlopen(url)
+    fpIn = urllib.request.urlopen(url)
     fpOut = open(fname, 'wb')
     block = fpIn.read(10240)
     try:
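The hunk cuts off inside downloadURL(); for context, a self-contained
chunked-download loop in the same urllib.request style might look like this
(function name and chunk size are illustrative, not the script's code)::

    import urllib.request

    def download_url(url, fname, chunk_size=10240):
        fp_in = urllib.request.urlopen(url)
        try:
            with open(fname, 'wb') as fp_out:
                while True:
                    block = fp_in.read(chunk_size)
                    if not block:         # EOF: read() returns b''
                        break
                    fp_out.write(block)
        finally:
            fp_in.close()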

View File

@@ -1889,7 +1889,6 @@ random Random variable generators
 re              Regular Expressions.
 reprlib         Redo repr() but with limits on most sizes.
 rlcompleter     Word completion for GNU readline 2.0.
-robotparser     Parse robots.txt files, useful for web spiders.
 sched           A generally useful event scheduler class.
 shelve          Manage shelves of pickled objects.
 shlex           Lexical analyzer class for simple shell-like syntaxes.
@@ -1920,8 +1919,9 @@ turtle LogoMation-like turtle graphics
 types           Define names for all type symbols in the std interpreter.
 tzparse         Parse a timezone specification.
 unicodedata     Interface to unicode properties.
-urllib          Open an arbitrary URL.
-urlparse        Parse URLs according to latest draft of standard.
+urllib.parse    Parse URLs according to latest draft of standard.
+urllib.request  Open an arbitrary URL.
+urllib.robotparser  Parse robots.txt files, useful for web spiders.
 user            Hook to allow user-specified customization code to run.
 uu              UUencode/UUdecode.
 unittest        Utilities for implementing unit testing.