a delimiter. Previously, the 'network location' (<authority> in RFC 2396) would
become 'www.example.com?query=spam', even though RFC 2396 does not allow a '?'
in <authority>. See bug #548176 for further discussion.
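A minimal sketch of the fixed behaviour, assuming Python 3's urllib.parse
(the successor to the urlparse module) and a made-up example URL:

```python
from urllib.parse import urlsplit

# Hypothetical URL with a query string but no path component.
parts = urlsplit("http://www.example.com?query=spam")

# The '?' ends the network location instead of being absorbed into it.
print(parts.netloc)  # www.example.com
print(parts.query)   # query=spam
```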
split parameters from the last path segment. Introduces two new functions,
urlsplit() and urlunsplit(), that do the simpler job of splitting the URL
without monkeying around with the parameters field, since that was not being
handled properly.
This closes bug #478038.
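A short illustration of the difference between the two entry points, again
assuming Python 3's urllib.parse; the URL is invented for the example:

```python
from urllib.parse import urlparse, urlsplit, urlunsplit

# Hypothetical URL with ';' in two path segments.
url = "http://www.example.com/a;b/c;d?q=1"

parsed = urlparse(url)  # splits parameters off the last path segment only
split = urlsplit(url)   # leaves the path alone

print(parsed.path, parsed.params)  # /a;b/c d
print(split.path)                  # /a;b/c;d
print(urlunsplit(split) == url)    # True
```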
urljoin(): Make this conform to RFC 1808 for all examples given in that
RFC (both "Normal" and "Abnormal"), so long as that RFC does
not conflict with the older RFC 1630, which also specified
relative URL resolution.
This closes SF bug #110832 (Jitterbug PR#194).
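As a rough check, a few of the "Normal" examples from RFC 1808 can be run
against urljoin(). This sketch assumes Python 3's urllib.parse and covers
only a handful of the RFC's cases:

```python
from urllib.parse import urljoin

# Base URL used by the examples in RFC 1808.
base = "http://a/b/c/d;p?q#f"

# Relative reference -> expected absolute URL (subset of the "Normal" cases).
expected = {
    "g": "http://a/b/c/g",
    "./g": "http://a/b/c/g",
    "g/": "http://a/b/c/g/",
    "/g": "http://a/g",
    "//g": "http://g",
    "../g": "http://a/b/g",
    "../../g": "http://a/g",
}

results = {rel: urljoin(base, rel) for rel in expected}
for rel, absolute in results.items():
    print(f"{rel:8} -> {absolute}")
```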
so we can't use it.
While I'm at it, got rid of string module use. (Found several new
hard special cases for a hypothetical conversion tool: from string
import join, find, rfind; and a local assignment "find=string.find".)
The following adds support for RTSP (RFC 2326) URLs to the standard
urlparse.py module.
(Augmented by FLD to include rtspu:, specified in the same RFC & OK'd
by Anthony.)
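A sketch of what the added schemes allow, assuming Python 3's urllib.parse;
the host and track names are made up:

```python
from urllib.parse import urljoin, urlparse

# Hypothetical RTSP presentation URL.
base = "rtsp://media.example.com:554/twister/audiotrack"

parts = urlparse(base)
print(parts.scheme, parts.netloc, parts.port)

# Because rtsp: and rtspu: are registered as relative-capable schemes,
# relative references resolve against the base as they do for http:.
joined = urljoin(base, "videotrack")
joined_udp = urljoin("rtspu://media.example.com/twister/audiotrack", "videotrack")
print(joined)
print(joined_udp)
```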
The attached patches update the standard library so that all modules
have docstrings beginning with one-line summaries.
A new docstring was added to formatter. The docstring for os.py
was updated to mention nt, os2, ce in addition to posix, dos, mac.
If a filename on Windows starts with \\, it is converted to a URL
which starts with ////. If this URL is passed to urlparse.urlparse,
you get a path that starts with // (and an empty netloc). If you pass
the result back to urlparse.urlunparse, you get a URL that starts with
//, which is parsed differently by urlparse.urlparse. The fix is to
add the (empty) netloc with accompanying slashes if the path in
urlunparse starts with //. Do this for all schemes that use a netloc.
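The round-trip problem and the idea behind the fix can be sketched as
follows. This assumes Python 3's urllib.parse for the parsing side; rebuild()
is a hypothetical helper written for this note, not the library's
urlunparse(), and it ignores the per-scheme check for brevity:

```python
from urllib.parse import urlsplit

def rebuild(scheme, netloc, path):
    """Hypothetical helper: re-emit the empty netloc when path starts with //."""
    url = path
    if netloc or path.startswith("//"):
        url = "//" + netloc + url
    if scheme:
        url = scheme + ":" + url
    return url

# A URL such as the one produced from a Windows name starting with \\.
original = "////host/share/file"
parts = urlsplit(original)
print(repr(parts.netloc), parts.path)  # '' //host/share/file

# Emitting the bare path would change its meaning: "host" becomes the netloc.
misparsed = urlsplit(parts.path)
print(misparsed.netloc, misparsed.path)  # host /share/file

# Putting the empty netloc back preserves the round trip.
print(rebuild(parts.scheme, parts.netloc, parts.path) == original)  # True
```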
netloc from the base url as the default netloc for the resulting url
even if the schemes differ.
Once upon a time, when the web was wild, this was a valuable hack
because some people had a URL referencing an ftp server colocated with
an http server without having the host in the ftp URL (so they could
replicate it or change the hostname easily).
More recently, after the file: scheme got added back to the list of
schemes that accept a netloc, it turns out that this caused weirdness
when joining an http: URL with a file: URL -- the resulting file: URL
would always inherit the host from the http: URL because the file:
scheme supports a netloc but in practice never has one.
There are two reasons to get rid of the old, once-valuable hack,
instead of removing the file: scheme from the uses_netloc list. One,
the RFC says that file: uses the netloc syntax, and does not endorse
the old hack. Two, neither Netscape 4.5 nor IE 4.0 supports the old
hack.
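The behaviour after this change can be sketched as below, assuming Python 3's
urllib.parse; the URLs are invented for illustration:

```python
from urllib.parse import urljoin

# Hypothetical base document served over http:.
base = "http://www.example.com/docs/index.html"

# A reference with a different scheme no longer inherits the base's host.
file_url = urljoin(base, "file:///tmp/notes.txt")
ftp_url = urljoin(base, "ftp:/pub/README")
print(file_url)  # file:///tmp/notes.txt
print(ftp_url)   # ftp:/pub/README

# A reference without a scheme still picks up the base's scheme and netloc.
same_scheme = urljoin(base, "/other.html")
print(same_scheme)  # http://www.example.com/other.html
```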