cpython/Lib
Eric Snow 79cf20e48d
bpo-21736: Set __file__ on frozen stdlib modules. (gh-28656)
Currently frozen modules do not have __file__ set.  In their spec, origin is set to "frozen" and they are marked as not having a location.  (Similarly, for frozen packages __path__ is set to an empty list.)  However, for frozen stdlib modules we are able to extrapolate __file__ as long as we can determine the stdlib directory at runtime.  (We now do so since gh-28586.)  Having __file__ set is helpful for a number of reasons.  Likewise, having a non-empty __path__ means we can import submodules of a frozen package from the filesystem (e.g. we could partially freeze the encodings module).
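
As a rough illustration of the attributes discussed above, the following sketch reports whether a module's spec marks it as frozen and what file information it carries.  The helper name describe_module is hypothetical and not part of this change; it only reads attributes and does not assume __file__ is present.

```python
import importlib.machinery
import types


def describe_module(module):
    """Summarize the frozen/file-related attributes of a module object."""
    spec = getattr(module, "__spec__", None)
    path = getattr(module, "__path__", None)
    return {
        "name": module.__name__,
        # Frozen modules have their spec origin set to the string "frozen".
        "frozen": spec is not None and spec.origin == "frozen",
        # __file__ may be missing entirely, so fall back to None.
        "file": getattr(module, "__file__", None),
        # __path__ only exists on packages.
        "path": list(path) if path is not None else None,
    }


# A stand-in module built by hand (an assumption for illustration),
# mimicking a frozen module that has no __file__ set.
fake = types.ModuleType("fake_frozen")
fake.__spec__ = importlib.machinery.ModuleSpec(
    "fake_frozen", None, origin="frozen"
)
info = describe_module(fake)
```

Calling describe_module() on a real stdlib module such as os shows what a given interpreter build actually sets; the result depends on the Python version and on whether the stdlib directory is known.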

This change sets __file__ (and adds to __path__) for frozen stdlib modules.  It uses sys._stdlibdir (from gh-28586) and the frozen module alias information (from gh-28655).  All that work is done in FrozenImporter (in Lib/importlib/_bootstrap.py).  Also, if a frozen module is imported before importlib is bootstrapped (during interpreter initialization) then we fix up that module and its spec during the importlib bootstrapping step (i.e. importlib._bootstrap._setup()) to match what gets set by FrozenImporter, including setting the file info (if the stdlib dir is known).  To facilitate this, modules imported using PyImport_ImportFrozenModule() have __origname__ set using the frozen module alias info.  __origname__ is popped off during importlib bootstrap.

(To be clear, even with this change the new code to set __file__ during fixups in importlib._bootstrap._setup() doesn't actually get triggered yet.  This is because sys._stdlibdir hasn't been set yet in interpreter initialization at the point importlib is bootstrapped.  However, we do fix up such modules at that point to otherwise match the result of importing through FrozenImporter, just not the __file__ and __path__ parts.  Setting those will require changes in the order in which things happen during interpreter initialization.  That can be addressed separately.  Once it is, the file-related fixup code from this PR will kick in.)

Here are things this change does not do:

* set __file__ for non-stdlib modules (no way of knowing the parent dir)
* set __file__ if the stdlib dir is not known (nor assume the expense of finding it)
* relatedly, set __file__ if the stdlib is in a zip file
* verify that the filename set to __file__ actually exists (too expensive)
* update __path__ for frozen packages that alias a non-package (since there is no package dir)
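
The filename extrapolation described above, together with the first few exclusions in this list, can be sketched roughly as follows.  This is a simplified illustration, not the code in FrozenImporter: the function name guess_frozen_filename and its parameters are hypothetical, it always uses "/" as the separator, and it ignores alias handling.

```python
import posixpath


def guess_frozen_filename(stdlib_dir, origname, is_package=False):
    """Return (filename, pkgdir) for a frozen stdlib module.

    Returns (None, None) when the stdlib dir or original name is unknown.
    """
    # Without a known stdlib dir there is nothing to extrapolate from.
    if not stdlib_dir or not origname:
        return None, None
    # Dotted module names map to nested directories.
    relpath = origname.replace(".", "/")
    if is_package:
        # A package contributes its directory (for __path__) and
        # points __file__ at its __init__.py.
        pkgdir = posixpath.join(stdlib_dir, relpath)
        return posixpath.join(pkgdir, "__init__.py"), pkgdir
    # Plain modules have no package directory.
    return posixpath.join(stdlib_dir, relpath + ".py"), None


# Example inputs; the stdlib location here is made up for illustration.
module_result = guess_frozen_filename("/usr/lib/python3.11", "os")
package_result = guess_frozen_filename(
    "/usr/lib/python3.11", "encodings", is_package=True
)
unknown_result = guess_frozen_filename(None, "os")
```

As in the list above, the sketch never checks that the resulting file exists, and it returns nothing useful when the stdlib dir is unknown.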

Other things this change skips, but we may do later:

* set __file__ on modules imported using PyImport_ImportFrozenModule()
* set co_filename when we unmarshal the frozen code object while importing the module (e.g. in FrozenImporter.exec_module()) -- this would allow tracebacks to show source lines
* implement FrozenImporter.get_filename() and FrozenImporter.get_source()

https://bugs.python.org/issue21736
2021-10-14 15:32:18 -06:00
__phello__ bpo-45020: Add more test cases for frozen modules. (gh-28664) 2021-09-30 18:38:52 -06:00
asyncio bpo-45416: Fix use of asyncio.Condition() with explicit Lock objects (GH-28850) 2021-10-10 19:01:41 +03:00
collections bpo-27275: Change popitem() and pop() methods of collections.OrderedDict (GH-27530) 2021-08-03 13:00:55 +02:00
concurrent bpo-45021: Fix a hang in forked children (GH-28007) 2021-09-20 11:30:19 -07:00
ctypes Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
curses
dbm bpo-40563: Support pathlike objects on dbm/shelve (GH-21849) 2021-09-10 15:26:16 +03:00
distutils Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
email bpo-45239: Fix parsedate_tz when time has more than 2 dots in it (GH-28452) 2021-10-13 18:21:27 +02:00
encodings bpo-45467: Fix IncrementalDecoder and StreamReader in the "raw-unicode-escape" codec (GH-28944) 2021-10-14 20:04:19 +03:00
ensurepip bpo-45343: Update bundled pip to 21.2.4 and setuptools to 58.1.0 (GH-28684) 2021-10-05 23:30:38 +02:00
html bpo-45421: Remove dead code from html.parser (GH-28847) 2021-10-12 10:12:21 -07:00
http bpo-45328: Avoid failure in OSs without TCP_NODELAY support (GH-28646) 2021-10-06 19:49:44 +02:00
idlelib Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
importlib bpo-21736: Set __file__ on frozen stdlib modules. (gh-28656) 2021-10-14 15:32:18 -06:00
json
lib2to3 Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
logging bpo-45401: Change shouldRollover() methods to only rollover regular f… (GH-28822) 2021-10-10 08:15:24 -07:00
msilib [codemod] Fix non-matching bracket pairs (GH-28473) 2021-09-22 01:09:00 +02:00
multiprocessing bpo-38840: Incorrect __all__ in multiprocessing.managers (GH-18034) 2021-08-09 18:44:55 +02:00
pydoc_data bpo-10716: Migrating pydoc to html5. (GH-28651) 2021-10-09 09:36:50 +02:00
site-packages
sqlite3 bpo-16379: Fix SQLite version checks in test_module_constants() (GH-28809) 2021-10-07 12:48:13 -07:00
test bpo-21736: Set __file__ on frozen stdlib modules. (gh-28656) 2021-10-14 15:32:18 -06:00
tkinter bpo-45229: Make tkinter tests discoverable (GH-28637) 2021-10-13 18:12:48 +02:00
turtledemo bpo-44254: On Mac, remove disfunctional colors from turtledemo buttons (GH-26448) 2021-05-29 03:19:50 -04:00
unittest Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
urllib bpo-40321: Add missing test, slightly expand documentation (GH-28760) 2021-10-06 17:28:16 +02:00
venv bpo-45337: Use the realpath of the new executable when creating a venv on Windows (GH-28663) 2021-10-07 21:26:12 +01:00
wsgiref Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
xml bpo-45132 Remove deprecated __getitem__ methods (GH-28225) 2021-09-08 13:07:40 +03:00
xmlrpc bpo-45386: Handle strftime's ValueError graciously in xmlrpc.client (GH-28765) 2021-10-13 18:38:36 +02:00
zoneinfo Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
__future__.py Set the release for `__future__.annotations` to 3.11 (#25596) 2021-04-25 17:09:24 +01:00
__hello__.py bpo-45019: Clean up the frozen __hello__ module. (gh-28374) 2021-09-15 14:15:32 -06:00
_aix_support.py Fix typos in multiple files (GH-26689) 2021-06-12 22:47:44 -04:00
_bootsubprocess.py
_collections_abc.py bpo-44801: Check arguments in substitution of ParamSpec in Callable (GH-27585) 2021-08-04 20:07:01 +02:00
_compat_pickle.py
_compression.py bpo-41486: Faster bz2/lzma/zlib via new output buffering (GH-21740) 2021-04-27 23:58:54 -07:00
_markupbase.py
_osx_support.py [codemod] Fix non-matching bracket pairs (GH-28473) 2021-09-22 01:09:00 +02:00
_py_abc.py
_pydecimal.py Remove unnecessary test for `xc == 1` in _pydecimal (GH-27102) 2021-07-15 12:48:46 +02:00
_pyio.py bpo-37330: open() no longer accept 'U' in file mode (GH-28118) 2021-09-02 12:58:00 +02:00
_sitebuiltins.py bpo-43651: PEP 597: Fix EncodingWarning in some tests (GH-25189) 2021-04-06 11:18:41 +09:00
_strptime.py bpo-43295: Fix error handling of datetime.strptime format string '%z' (GH-24627) 2021-03-03 08:58:57 -08:00
_threading_local.py
_weakrefset.py bpo-44962: Fix a race in WeakKeyDict, WeakValueDict and WeakSet when two threads attempt to commit the last pending removal (GH-27921) 2021-08-28 19:07:37 +02:00
abc.py Clarify the order of a stacked `abstractmethod` (GH-26892) 2021-06-27 21:02:23 +03:00
aifc.py bpo-30077: Add support for Apple aifc/sowt pseudo-compression (GH-24449) 2021-08-13 13:31:25 +02:00
antigravity.py
argparse.py bpo-24444: fix an error in argparse help when help for an option is blank (GH-28050) 2021-10-13 18:31:51 +02:00
ast.py Fix typos in multiple files (GH-26689) 2021-06-12 22:47:44 -04:00
asynchat.py bpo-44498: Issue a deprecation warning on asynchat, asyncore and smtpd import (#26882) 2021-06-24 12:37:26 -07:00
asyncore.py bpo-44498: Issue a deprecation warning on asynchat, asyncore and smtpd import (#26882) 2021-06-24 12:37:26 -07:00
base64.py bpo-35970: Add help flag to base64 module (GH-28774) 2021-10-06 18:38:43 -07:00
bdb.py fix docstring typo in bdb.py (GH-22323) 2021-05-17 00:20:33 +01:00
bisect.py
bz2.py bpo-44439: BZ2File.write() / LZMAFile.write() handle buffer protocol correctly (GH-26764) 2021-06-22 10:04:23 +03:00
cProfile.py bpo-42005: profile and cProfile catch BrokenPipeError (GH-22643) 2021-01-20 09:56:21 +01:00
calendar.py bpo-39710: Remove Python 2-specific sentence from calendar documentation (GH-26985) 2021-09-15 22:36:38 +02:00
cgi.py bpo-41139: Deprecate `cgi.log()` (GH-25625) 2021-04-29 11:36:04 +09:00
cgitb.py bpo-10716: Migrating pydoc to html5. (GH-28651) 2021-10-09 09:36:50 +02:00
chunk.py
cmd.py
code.py
codecs.py bpo-14014: Clarify StreamWriter.reset() documentation (GH-13716) 2021-01-06 04:14:42 +02:00
codeop.py bpo-43202: More codeop._maybe_compile clean-ups (GH-24512) 2021-02-13 01:49:18 -05:00
colorsys.py Improve consistency of colorsys.rgb_to_hsv (GH-27277) 2021-07-23 09:59:30 -03:00
compileall.py Fix missing space with help for `-m compileall -o` (GH-27591) 2021-09-18 00:28:09 +02:00
configparser.py bpo-45173 Remove configparser deprecations (GH-28292) 2021-09-13 19:12:36 +02:00
contextlib.py bpo-44594: fix (Async)ExitStack handling of __context__ (gh-27089) 2021-10-03 23:49:55 -07:00
contextvars.py
copy.py
copyreg.py bpo-44676: Serialize the union type using only public API (GH-27323) 2021-07-24 21:26:02 +03:00
crypt.py
csv.py bpo-43625: Enhance csv sniffer has_headers() to be more accurate (GH-26939) 2021-07-30 19:10:37 +02:00
dataclasses.py Fix dataclassses spelling (GH-28837) 2021-10-09 15:17:52 -04:00
datetime.py Fix typo (GH-23019) 2021-02-03 13:25:28 -08:00
decimal.py
difflib.py Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
dis.py bpo-45152: refactor the dis module to make handling of hasconst opcodes more generic (GH-28258) 2021-09-15 10:14:15 +01:00
doctest.py bpo-35753: Fix crash in doctest with unwrap-able functions (#22981) 2021-05-05 19:33:17 +02:00
enum.py bpo-45417: [Enum] fix quadratic behavior during creation (GH-28907) 2021-10-14 13:59:51 -07:00
filecmp.py bpo-42958: Improve description of shallow= in filecmp.cmp docs (GH-27166) 2021-08-04 21:39:45 +02:00
fileinput.py bpo-45132 Remove deprecated __getitem__ methods (GH-28225) 2021-09-08 13:07:40 +03:00
fnmatch.py bpo-42799: fnmatch module: bump up size of lru_cache for patterns (GH-27084) 2021-07-15 12:53:26 +02:00
fractions.py bpo-44258: support PEP 515 for Fraction's initialization from string (GH-26422) 2021-06-07 08:06:33 +01:00
ftplib.py bpo-43285 Make ftplib not trust the PASV response. (GH-24838) 2021-03-15 11:39:31 -07:00
functools.py bpo-44605: Teach @total_ordering() to work with metaclasses (GH-27633) 2021-08-06 14:33:30 -05:00
genericpath.py
getopt.py
getpass.py update docstring for `win_getpass` to reflect code changes (GH-24967) 2021-05-03 23:48:29 -07:00
gettext.py bpo-44235: Remove deprecated functions in the gettext module. (GH-26378) 2021-05-30 10:29:45 +09:00
glob.py bpo-44482: Fix very unlikely resource leak in glob in non-CPython implementations (GH-26843) 2021-06-23 12:53:37 +03:00
graphlib.py [codemod] Fix non-matching bracket pairs (GH-28473) 2021-09-22 01:09:00 +02:00
gzip.py bpo-43613: Faster implementation of gzip.compress and gzip.decompress (GH-27941) 2021-09-02 17:02:59 +02:00
hashlib.py bpo-45155: Apply new byteorder default values for int.to/from_bytes (GH-28465) 2021-09-20 13:22:55 -05:00
heapq.py
hmac.py bpo-40645: use C implementation of HMAC (GH-24920) 2021-03-27 06:55:03 -07:00
imaplib.py bpo-44045: fix spelling of uppercase vs upper-case (GH-25985) 2021-05-28 17:54:25 -03:00
imghdr.py bpo-44539: Support recognizing JPEG files without JFIF or Exif markers (GH-26964) 2021-07-20 20:56:57 +02:00
imp.py bpo-45019: Do some cleanup related to frozen modules. (gh-28319) 2021-09-13 16:18:37 -06:00
inspect.py bpo-30951: Correct co_names docstring in inspect module (GH-2743) 2021-09-24 12:05:34 +02:00
io.py bpo-43680: Deprecate io.OpenWrapper (GH-25357) 2021-04-14 03:24:33 +02:00
ipaddress.py bpo-45155: Apply new byteorder default values for int.to/from_bytes (GH-28465) 2021-09-20 13:22:55 -05:00
keyword.py bpo-42128: Structural Pattern Matching (PEP 634) (GH-22917) 2021-02-26 14:51:55 -08:00
linecache.py Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
locale.py bpo-34311: Add locale.localize (GH-15275) 2021-04-12 14:17:40 +02:00
lzma.py bpo-44439: BZ2File.write() / LZMAFile.write() handle buffer protocol correctly (GH-26764) 2021-06-22 10:04:23 +03:00
mailbox.py
mailcap.py
mimetypes.py bpo-45411: Update mimetypes.py (GH-28792) 2021-10-11 13:05:28 +02:00
modulefinder.py bpo-45017: move opcode-related logic from modulefinder to dis (GH-28246) 2021-09-09 14:04:12 +01:00
netrc.py bpo-43733: netrc try to use UTF-8 before using locale encoding. (GH-25781) 2021-05-02 14:01:02 +09:00
nntplib.py
ntpath.py bpo-43757: Make pathlib use os.path.realpath() to resolve symlinks in a path (GH-25264) 2021-04-28 16:50:17 +01:00
nturl2path.py bpo-43607: Fix urllib handling of Windows paths with \\?\ prefix (GH-25539) 2021-04-23 18:02:47 +01:00
numbers.py bpo-44072: fix Complex, Integral docs for `**` (GH-25986) 2021-05-14 18:01:48 -04:00
opcode.py bpo-45367: Specialize BINARY_MULTIPLY (GH-28727) 2021-10-14 15:56:33 +01:00
operator.py bpo-44019: Implement operator.call(). (GH-27888) 2021-09-24 16:22:49 +01:00
optparse.py
os.py bpo-42053: Remove misleading check in os.fwalk() (GH-27669) 2021-08-08 21:04:02 +03:00
pathlib.py bpo-27827: identify a greater range of reserved filename on Windows. (GH-26698) 2021-07-28 16:28:14 +02:00
pdb.py bpo-44682: Handle invalid arg to pdb's "commands" directive (#27252) 2021-07-28 18:55:03 +02:00
pickle.py Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
pickletools.py
pipes.py Change type check to isinstance in pipes (GH-27291) 2021-07-28 15:38:06 +02:00
pkgutil.py [codemod] Fix non-matching bracket pairs (GH-28473) 2021-09-22 01:09:00 +02:00
platform.py Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
plistlib.py Fix typos in multiple files (GH-26689) 2021-06-12 22:47:44 -04:00
poplib.py
posixpath.py bpo-26329: update os.path.normpath documentation (GH-20138) 2021-07-12 09:48:01 -03:00
pprint.py bpo-41546: make pprint (like print) not write to stdout when it is None (GH-26810) 2021-07-19 10:19:02 +01:00
profile.py bpo-42005: profile and cProfile catch BrokenPipeError (GH-22643) 2021-01-20 09:56:21 +01:00
pstats.py Fix typos in multiple files (GH-26689) 2021-06-12 22:47:44 -04:00
pty.py bpo-26228: [doc] Adapt PTY documentation updates from GH-4167 (GH-27754) 2021-08-13 12:57:07 +02:00
py_compile.py
pyclbr.py bpo-40443: Remove unused imports (GH-25429) 2021-04-16 11:26:06 +02:00
pydoc.py bpo-10716: Migrating pydoc to html5. (GH-28651) 2021-10-09 09:36:50 +02:00
queue.py
quopri.py
random.py bpo-45155: Apply new byteorder default values for int.to/from_bytes (GH-28465) 2021-09-20 13:22:55 -05:00
re.py bpo-38659: [Enum] add _simple_enum decorator (GH-25497) 2021-04-21 10:20:44 -07:00
reprlib.py bpo-39549: reprlib.Repr uses a “fillvalue” attribute (GH-18343) 2021-09-22 15:45:58 -05:00
rlcompleter.py bpo-44752: refactor part of rlcompleter.Completer.attr_matches (GH-27433) 2021-07-29 16:01:21 +02:00
runpy.py bpo-41718: runpy now imports pkgutil in functions (GH-24996) 2021-03-23 19:22:57 +01:00
sched.py
secrets.py
selectors.py
shelve.py bpo-34204: Use pickle.DEFAULT_PROTOCOL in shelve (GH-19639) 2020-10-29 02:44:35 -07:00
shlex.py
shutil.py bpo-45234: Fix FileNotFound exception raised instead of IsADirectoryError in shutil.copyfile() (GH-28421) 2021-09-21 23:53:07 +02:00
signal.py
site.py bpo-43510: Implement PEP 597 opt-in EncodingWarning. (GH-19481) 2021-03-29 12:28:14 +09:00
smtpd.py bpo-44498: Issue a deprecation warning on asynchat, asyncore and smtpd import (#26882) 2021-06-24 12:37:26 -07:00
smtplib.py bpo-43124: Fix smtplib multiple CRLF injection (GH-25987) 2021-08-29 16:10:50 +02:00
sndhdr.py
socket.py bpo-40635: Fix getfqdn() docstring and docs (GH-27971) 2021-08-26 20:40:28 +02:00
socketserver.py bpo-37193: Remove thread objects which finished process its request (GH-23127) 2020-12-31 20:19:30 +00:00
sre_compile.py
sre_constants.py
sre_parse.py
ssl.py Fix typos in multiple files (GH-26689) 2021-06-12 22:47:44 -04:00
stat.py
statistics.py bpo-20499: Rounding error in statistics.pvariance (GH-28230) 2021-09-08 22:00:12 -05:00
string.py bpo-45225: use map function instead of genexpr in capwords (GH-28342) 2021-09-16 14:49:38 -05:00
stringprep.py
struct.py
subprocess.py bpo-40497: Fix handling of check in subprocess.check_output() (GH-19897) 2021-09-20 17:09:05 +02:00
sunau.py
symtable.py bpo-42355: symtable.get_namespace() now checks whether there are multiple or any namespaces found (GH-23278) 2021-07-18 15:56:09 +03:00
sysconfig.py Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
tabnanny.py
tarfile.py bpo-39039: tarfile raises descriptive exception from zlib.error (GH-27766) 2021-09-29 11:25:48 +02:00
telnetlib.py Remove unnecessary pass statements (GH-27103) 2021-07-13 15:02:30 +02:00
tempfile.py bpo-4928: Document NamedTemporaryFile non-deletion after SIGKILL (#26198) 2021-05-19 10:21:03 -04:00
textwrap.py
this.py
threading.py Fix typos in the Lib directory (GH-28775) 2021-10-06 16:13:48 -07:00
timeit.py
token.py bpo-43822: Improve syntax errors for missing commas (GH-25377) 2021-04-15 21:38:45 +01:00
tokenize.py Add tests for the C tokenizer and expose it as a private module (GH-27924) 2021-08-24 17:50:05 +01:00
trace.py Fix typo in Lib/trace.py (GH-24309) 2021-02-01 21:16:38 +05:30
traceback.py bpo-45249: Ensure the traceback module prints correctly syntax errors with ranges (GH-28575) 2021-09-27 21:59:06 +01:00
tracemalloc.py bpo-37961: Fix regression in tracemalloc.Traceback.__repr__ (GH-23805) 2020-12-16 22:38:32 +01:00
tty.py
turtle.py Update URLs in comments and metadata to use HTTPS (GH-27458) 2021-07-30 15:54:46 +02:00
types.py bpo-44732: Rename types.Union to types.UnionType (#27342) 2021-07-26 18:00:21 +02:00
typing.py bpo-45166: fixes `get_type_hints` failure on `Final` (GH-28279) 2021-09-25 10:56:22 +02:00
uu.py
uuid.py bpo-45155: Apply new byteorder default values for int.to/from_bytes (GH-28465) 2021-09-20 13:22:55 -05:00
warnings.py
wave.py
weakref.py bpo-44962: Fix a race in WeakKeyDict, WeakValueDict and WeakSet when two threads attempt to commit the last pending removal (GH-27921) 2021-08-28 19:07:37 +02:00
webbrowser.py bpo-42255: Deprecate webbrowser.MacOSX from Python 3.11 (GH-27837) 2021-09-03 18:21:03 +02:00
xdrlib.py
zipapp.py
zipfile.py bpo-39359: [zipfile] add missing "pwd: expected bytes, got str" exception (GH-18031) 2021-09-23 23:37:53 +02:00
zipimport.py bpo-45183: don't raise an exception when calling zipimport.zipimporter.find_spec() when the zip file is missing and the internal cache has been reset (GH-28435) 2021-09-17 16:48:17 -07:00