From 624f3372e28e81516d5fc1e38d309066ba8464e6 Mon Sep 17 00:00:00 2001 From: Georg Brandl Date: Tue, 31 Mar 2009 16:11:45 +0000 Subject: [PATCH] #5529: backport new docs of import semantics written by Brett to 2.x. --- Doc/glossary.rst | 14 +++ Doc/library/sys.rst | 37 +++++++ Doc/reference/simple_stmts.rst | 176 +++++++++++++++++++++++---------- 3 files changed, 176 insertions(+), 51 deletions(-) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index 808993e18e8..87a77d07d56 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -185,6 +185,11 @@ Glossary A module written in C or C++, using Python's C API to interact with the core and with user code. + finder + An object that tries to find the :term:`loader` for a module. It must + implement a method named :meth:`find_module`. See :pep:`302` for + details. + function A series of statements which returns some value to a caller. It can also be passed zero or more arguments which may be used in the execution of @@ -288,6 +293,10 @@ Glossary fraction. Integer division can be forced by using the ``//`` operator instead of the ``/`` operator. See also :term:`__future__`. + importer + An object that both finds and loads a module; both a + :term:`finder` and :term:`loader` object. + interactive Python has an interactive interpreter which means you can enter statements and expressions at the interpreter prompt, immediately @@ -368,6 +377,11 @@ Glossary clause is optional. If omitted, all elements in ``range(256)`` are processed. + loader + An object that loads a module. It must define a method named + :meth:`load_module`. A loader is typically returned by a + :term:`finder`. See :pep:`302` for details. + mapping A container object (such as :class:`dict`) which supports arbitrary key lookups using the special method :meth:`__getitem__`. diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst index 813e788532c..30555c824b4 100644 --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -554,6 +554,22 @@ always available. 
characters are stored as UCS-2 or UCS-4. +.. data:: meta_path + + A list of :term:`finder` objects that have their :meth:`find_module` + methods called to see if one of the objects can find the module to be + imported. The :meth:`find_module` method is called at least with the + absolute name of the module being imported. If the module to be imported is + contained in a package then the parent package's :attr:`__path__` attribute + is passed in as a second argument. The method returns :keyword:`None` if + the module cannot be found, else returns a :term:`loader`. + + :data:`sys.meta_path` is searched before any implicit default finders or + :data:`sys.path`. + + See :pep:`302` for the original specification. + + .. data:: modules .. index:: builtin: reload @@ -590,6 +606,27 @@ always available. :data:`sys.path`. +.. data:: path_hooks + + A list of callables that take a path argument to try to create a + :term:`finder` for the path. If a finder can be created, the callable + returns it; otherwise the callable raises :exc:`ImportError`. + + Originally specified in :pep:`302`. + + +.. data:: path_importer_cache + + A dictionary acting as a cache for :term:`finder` objects. The keys are + paths that have been passed to :data:`sys.path_hooks` and the values are + the finders that are found. If a path is a valid file system path but no + explicit finder is found on :data:`sys.path_hooks` then :keyword:`None` is + stored to indicate that the implicit default finder should be used. If the path + is not an existing path then :class:`imp.NullImporter` is set. + + Originally specified in :pep:`302`. + + .. 
data:: platform This string contains a platform identifier that can be used to append diff --git a/Doc/reference/simple_stmts.rst b/Doc/reference/simple_stmts.rst index 87ce4032de8..c15f796f000 100644 --- a/Doc/reference/simple_stmts.rst +++ b/Doc/reference/simple_stmts.rst @@ -653,48 +653,124 @@ The :keyword:`import` statement Import statements are executed in two steps: (1) find a module, and initialize it if necessary; (2) define a name or names in the local namespace (of the scope -where the :keyword:`import` statement occurs). The first form (without -:keyword:`from`) repeats these steps for each identifier in the list. The form -with :keyword:`from` performs step (1) once, and then performs step (2) -repeatedly. - -In this context, to "initialize" a built-in or extension module means to call an -initialization function that the module must provide for the purpose (in the -reference implementation, the function's name is obtained by prepending string -"init" to the module's name); to "initialize" a Python-coded module means to -execute the module's body. +where the :keyword:`import` statement occurs). The statement comes in two +forms differing on whether it uses the :keyword:`from` keyword. The first form +(without :keyword:`from`) repeats these steps for each identifier in the list. +The form with :keyword:`from` performs step (1) once, and then performs step +(2) repeatedly. .. index:: - single: modules (in module sys) - single: sys.modules - pair: module; name - pair: built-in; module - pair: user-defined; module - module: sys - pair: filename; extension - triple: module; search; path + single: package -The system maintains a table of modules that have been or are being initialized, -indexed by module name. This table is accessible as ``sys.modules``. When a -module name is found in this table, step (1) is finished. If not, a search for -a module definition is started. When a module is found, it is loaded. 
Details -of the module searching and loading process are implementation and platform -specific. It generally involves searching for a "built-in" module with the -given name and then searching a list of locations given as ``sys.path``. +To understand how step (1) occurs, one must first understand how Python handles +hierarchical naming of modules. To help organize modules and provide a +hierarchy in naming, Python has a concept of packages. A package can contain +other packages and modules while modules cannot contain other modules or +packages. From a file system perspective, packages are directories and modules +are files. The original `specification for packages +`_ is still available to read, +although minor details have changed since the writing of that document. .. index:: - pair: module; initialization - exception: ImportError - single: code block - exception: SyntaxError + single: sys.modules -If a built-in module is found, its built-in initialization code is executed and -step (1) is finished. If no matching file is found, :exc:`ImportError` is -raised. If a file is found, it is parsed, yielding an executable code block. If -a syntax error occurs, :exc:`SyntaxError` is raised. Otherwise, an empty module -of the given name is created and inserted in the module table, and then the code -block is executed in the context of this module. Exceptions during this -execution terminate step (1). +Once the name of the module is known (unless otherwise specified, the term +"module" will refer to both packages and modules), searching +for the module or package can begin. The first place checked is +:data:`sys.modules`, the cache of all modules that have been imported +previously. If the module is found there then it is used in step (2) of import. + +.. 
index:: + single: sys.meta_path + single: finder + pair: finder; find_module + single: __path__ + +If the module is not found in the cache, then :data:`sys.meta_path` is searched +(the specification for :data:`sys.meta_path` can be found in :pep:`302`). +The object is a list of :term:`finder` objects which are queried in order as to +whether they know how to load the module by calling their :meth:`find_module` +method with the name of the module. If the module happens to be contained +within a package (as denoted by the existence of a dot in the name), then a +second argument to :meth:`find_module` is given as the value of the +:attr:`__path__` attribute from the parent package (everything up to the last +dot in the name of the module being imported). If a finder can find the module +it returns a :term:`loader` (discussed later); otherwise it returns :keyword:`None`. + +.. index:: + single: sys.path_hooks + single: sys.path_importer_cache + single: sys.path + +If none of the finders on :data:`sys.meta_path` are able to find the module +then some implicitly defined finders are queried. Implementations of Python +vary in what implicit meta path finders are defined. The one they all do +define, though, is one that handles :data:`sys.path_hooks`, +:data:`sys.path_importer_cache`, and :data:`sys.path`. + +The implicit finder searches for the requested module in the "paths" specified +in one of two places ("paths" do not have to be file system paths). If the +module being imported is supposed to be contained within a package then the +second argument passed to :meth:`find_module`, :attr:`__path__` on the parent +package, is used as the source of paths. If the module is not contained in a +package then :data:`sys.path` is used as the source of paths. + +Once the source of paths is chosen it is iterated over to find a finder that +can handle that path. The dict at :data:`sys.path_importer_cache` caches +finders for paths and is checked for a finder. 
If the path does not have a +finder cached then :data:`sys.path_hooks` is searched by calling each object in +the list with a single argument of the path; each call either returns a finder or raises +:exc:`ImportError`. If a finder is returned then it is cached in +:data:`sys.path_importer_cache` and then used for that path entry. If no finder +can be found but the path exists then a value of :keyword:`None` is +stored in :data:`sys.path_importer_cache` to signify that an implicit, +file-based finder that handles modules stored as individual files should be +used for that path. If the path does not exist then a finder which always +returns :keyword:`None` is placed in the cache for the path. + +.. index:: + single: loader + pair: loader; load_module + exception: ImportError + +If no finder can find the module then :exc:`ImportError` is raised. Otherwise +some finder returned a loader whose :meth:`load_module` method is called with +the name of the module to load (see :pep:`302` for the original definition of +loaders). A loader has several responsibilities to perform on a module it +loads. First, if the module already exists in :data:`sys.modules` (a +possibility if the loader is called outside of the import machinery) then it +is to use that module for initialization and not a new module. But if the +module does not exist in :data:`sys.modules` then it is to be added to that +dict before initialization begins. If an error occurs during loading of the +module and it was added to :data:`sys.modules` it is to be removed from the +dict. If an error occurs but the module was already in :data:`sys.modules` it +is left in the dict. + +.. index:: + single: __name__ + single: __file__ + single: __path__ + single: __package__ + single: __loader__ + +The loader must set several attributes on the module. :data:`__name__` is to be +set to the name of the module. 
:data:`__file__` is to be the "path" to the file +unless the module is built-in (and thus listed in +:data:`sys.builtin_module_names`) in which case the attribute is not set. +If what is being imported is a package then :data:`__path__` is to be set to a +list of paths to be searched when looking for modules and packages contained +within the package being imported. :data:`__package__` is optional but should +be set to the name of the package that contains the module or package (the empty +string is used for a module not contained in a package). :data:`__loader__` is +also optional but should be set to the loader object that is loading the +module. + +.. index:: + exception: ImportError + +If an error occurs during loading then the loader raises :exc:`ImportError` if +some other exception is not already being propagated. Otherwise the loader +returns the module that was loaded and initialized. When step (1) finishes without raising an exception, step (2) can begin. @@ -734,23 +810,21 @@ function contains or is a nested block with free variables, the compiler will raise a :exc:`SyntaxError`. .. index:: - keyword: from - statement: from - triple: hierarchical; module; names - single: packages - single: __init__.py + single: relative; import -**Hierarchical module names:** when the module names contains one or more dots, -the module search path is carried out differently. The sequence of identifiers -up to the last dot is used to find a "package"; the final identifier is then -searched inside the package. A package is generally a subdirectory of a -directory on ``sys.path`` that has a file :file:`__init__.py`. +When specifying what module to import you do not have to specify the absolute +name of the module. When a module or package is contained within another +package it is possible to make a relative import within the same top package +without having to mention the package name. 
By using leading dots in the +specified module or package after :keyword:`from` you can specify how high to +traverse up the current package hierarchy without specifying exact names. One +leading dot means the current package where the module making the import +exists. Two dots means up one package level. Three dots is up two levels, etc. +So if you execute ``from . import mod`` from a module in the ``pkg`` package +then you will end up importing ``pkg.mod``. If you execute ``from ..subpkg2 +import mod`` from within ``pkg.subpkg1`` you will import ``pkg.subpkg2.mod``. +The specification for relative imports is contained within :pep:`328`. -.. - [XXX Can't be - bothered to spell this out right now; see the URL - http://www.python.org/doc/essays/packages.html for more details, also about how - the module search works from inside a package.] .. index:: builtin: __import__