This patch fixes possible overflow in the use of
PyOS_GetLastModificationTime in getmtime.c and Python/import.c.
Currently PyOS_GetLastModificationTime returns a C long. This can
overflow on Win64 where sizeof(time_t) > sizeof(long). Besides it
should logically return a time_t anyway (this patch changes this).
As well, import.c uses PyOS_GetLastModificationTime for .pyc
timestamping. There has been recent discussion about the .pyc header
format on python-dev. This patch adds oveflow checking to import.c so
that an exception will be raised if the modification time
overflows. There are a few other minor 64-bit readiness changes made
to the module as well:
- size_t instead of int or long for function-local buffer and string
length variables
- one buffer overflow check was added (raises an exception on possible
overflow, this overflow chance exists on 32-bit platforms as well), no
other possible buffer overflows existed (from my analysis anyway)
Closes SourceForge patch #100509.
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
Changed all references to the MAGIC constant to use a global
pyc_magic instead. This global is initially set to MAGIC, but can be
changed by the _PyImport_Init() function to provide for
special features implemented in the compiler which are settable
using command line switches and affect the way PYC files are
generated.
Currently this change is only done for the -U flag.
* in import.c, #ifdef out references to dynamic loading based on
HAVE_DYNAMIC_LOADING
* clean out the platform-specific crud from importdl.c.
[ maybe fold this function into import.c and drop the importdl.c file? Greg.]
* change GetDynLoadFunc's "funcname" parameter to "shortname". change
"name" to "fqname" for clarification.
* each GetDynLoadFunc now creates its own funcname value.
WARNING: as I mentioned previously, we may run into an issue with a
missing "_" on some platforms. Testing will show this pretty quickly,
however.
* move pathname munging into dynload_shlib.c
that file in fact did not exist or at least was not used. Change this
so that __file__ is *only* set to the .pyc/.pyo file when it actually
read the code object from it; otherwise __file__ is set to the .py
file.
frozen packages. (I *think* this means that we can now have a
built-in module bar that's a submodule of a frozen package foo, by
registering the built-in module with a name "foo.bar" in the table of
builtin modules.)
because the path through the code would notice that sys.__path__ did
not exist and it would fall back to the default path (builtins +
sys.path) instead of failing). No longer.
to the table of built-in modules. This should normally be called
*before* Py_Initialize(). When the malloc() or realloc() call fails,
-1 is returned and the existing table is unchanged.
After a similar function by Just van Rossum.
int PyImport_ExtendInittab(struct _inittab *newtab);
int PyImport_AppendInittab(char *name, void (*initfunc)());
Frozen packages are indicated by a negative size (the code string
is the __import__.py file). A frozen package module has its __path__
set to a string, the package name.
time can be in PyImport_ImportModuleEx(). Recursive calls from the
same thread are okay.
Potential problems:
- The lock should really be part of the interpreter state rather than
global, but that would require modifying more files, and I first want
to figure out whether this works at all.
- One could argue that the lock should be per module -- however that
would be complicated to implement. We would have to have a linked
list of locks per module name, *or* invent a new object type to
represent a lock, so we can store the locks in the module or in a
separate dictionary. Both seem unwarranted. The one situation where
this can cause problems is when loading a module takes a long time,
e.g. when the module's initialization code interacts with the user --
during that time, no other threads can run. I say, "too bad."
(modified) and use that.
Some differences in the cleanup algorithm:
- Clear __main__ before the other modules.
- Delete more sys variables: including ps1, ps2, exitfunc, argv, and
even path -- this will prevent new imports!
- Restore stdin, stdout, stderr from __stdin__, __stdout__,
__stderr__, effectively deleting hooks that the user might have
installed -- so their (the hooks') destructors will run.
This is an option for OS-es with case-insensitive but case-preserving
filesystems. It is currently supported for Win32 and MacOS. To
enable it, #define CHECK_IMPORT_CASE in your platform specific
config.h. It is enabled by default on those systems where it is
supported. On Win32, it can be disabled at runtime by setting the
environment variable PYTHONCASEOK (to any value).
When enabled, the feature checks that the case of the requested module
name matches that of the filename found in the filesystem, and raises
a NameError exception when they don't match.
pass it the true file. This is used to set __file__ properly, instead
of believing what the code object carries with it. (If the pointer
is NULL, the code object's co_filename is still used.)
(1) Explicitly clear __builtin__._ and sys.{last,exc}_* before
clearing anything else. These are common places where user values
hide and people complain when their destructors fail. Since the
modules containing them are deleted *last* of all, they would come too
late in the normal destruction order. Sigh.
(2) Add some debugging aid to cleanup (after a suggestion by Marc
Lemburg) -- print the names of the modules being cleaned, and (when
-vv is used) print the names of the variables being cleared.
now implement the following finalization strategy.
1. Whenever this code deletes a module, its directory is cleared
carefully, as follows:
- set all names to None that begin with exactly one underscore
- set all names to None that don't begin with two underscores
- clear the directory
2. Modules are deleted in the following order:
- modules with a reference count of 1, except __builtin__ or __sys__
- repeat until no more are found with a reference count of 1
- __main__ if it's still there
- all remaining modules except __builtin__ or sys
- sys
_ __builtin__
__init__.py (or __init__.pyc/.pyo, whichever applies) is considered a
package. All other subdirectories are left alone. Should make Konrad
Hinsen happy!
dummy entry to sys.modules, marking the absence of a submodule by the
same name.
Thus, if module foo.bar executes the statement "import time",
sys.modules['foo.time'] will be set to None, once the absence of a
module foo.time is confirmed (by looking for it in foo's path).
The next time when foo.bar (or any other submodule of foo) executes
"import time", no I/O is necessary to determine that there is no
module foo.time.
(Justification: It may seem strange to pollute sys.modules. However,
since we're doing the lookup anyway it's definitely the fastest
solution. This is the same convention that 'ni' uses and I haven't
heard any complaints.)
right thing.
Still to do:
- Make reload() of a submodule work.
- Performance tweaks -- currently, a submodule that tries to import a
global module *always* searches the package directory first, even if
the global module was already imported. Not sure how to solve this
one; probably need to record misses per package.
- Documentation!
This doesn't yet support "import a.b.c" or "from a.b.c import x", but
it does recognize directories. When importing a directory, it
initializes __path__ to a list containing the directory name, and
loads the __init__ module if found.
The (internal) find_module() and load_module() functions are
restructured so that they both also handle built-in and frozen modules
and Mac resources (and directories of course). The imp module's
find_module() and (new) load_module() also have this functionality.
Moreover, imp unconditionally defines constants for all module types,
and has two more new functions: find_module_in_package() and
find_module_in_directory().
There's also a new API function, PyImport_ImportModuleEx(), which
takes all four __import__ arguments (name, globals, locals, fromlist).
The last three may be NULL. This is currently the same as
PyImport_ImportModule() but in the future it will be able to do
relative dotted-path imports.
Other changes:
- bltinmodule.c: in __import__, call PyImport_ImportModuleEx().
- ceval.c: always pass the fromlist to __import__, even if it is a C
function, so PyImport_ImportModuleEx() is useful.
- getmtime.c: the function has a second argument, the FILE*, on which
it applies fstat(). According to Sjoerd this is much faster. The
first (pathname) argument is ignored, but remains for backward
compatibility (so the Mac version still works without changes).
By cleverly combining the new imp functionality, the full support for
dotted names in Python (mini.py, not checked in) is now about 7K,
lavishly commented (vs. 14K for ni plus 11K for ihooks, also lavishly
commented).
Good night!
scheme based on object's types, have a simple two-phase scheme based
on object's *names*:
/* To make the execution order of destructors for global
objects a bit more predictable, we first zap all objects
whose name starts with a single underscore, before we clear
the entire dictionary. We zap them by replacing them with
None, rather than deleting them from the dictionary, to
avoid rehashing the dictionary (to some extent). */
importdl.c: the MAXSUFFIXSIZE macro is now defined in importdl.h, and
the modules dictionary is now passed using PyImport_GetModuleDict().
Also undefine USE_SHLIB for AIX -- in AIX 4.2 and up, dlfcn.h exists
but we don't want to use it.
replaces its own entry in sys.module, reference count errors ensue;
even if there is no reference count problem, it would be preferable
for the import to yield the new thing in sys.modules anyway (if only
because that's what later imports will yield). This opens the road to
an official hack to implement a __getattr__ like feature for modules:
stick an instance in sys.modules[__name__].
bltinmodule.c: fixed coerce() nightmare in ternary pow().
modsupport.c (initmodule2): pass METH_FREENAME flag to newmethodobject().
pythonrun.c: move flushline() into and around print_error().
mac/macsetfiletype.c: changes by Jack to execute .pyc file passed
as command line argument. On the Mac .pyc files are given a
special type so they can be double-clicked
*module.o/*module.so
* Python/import.c: if initializing a module did not enter the
module into sys.modules, it may have raised an exception -- don't
override this exception.
Merged NT changes
* Python/import.c: add lost NT-specific code back in
Fixed NT changes
* funcobject.c (func_repr): don't call getstringvalue(None) for anonymous
functions.
* bltinmodule.c: removed lambda (which is now a built-in function);
removed implied lambda for string arg to filter/map/reduce.
* Grammar, graminit.[ch], compile.[ch]: replaced lambda as built-in
function by lambda as grammar entity: instead of "lambda('x: x+1')" you
write "lambda x: x+1".
* Xtmodule.c (checkargdict): return 0, not NULL, for error.
each dir in sys.path, try each possible extension. (Note: C extensions
are loaded before Python modules in the same directory, to allow having
a C version used when dynamic loading is supported and a Python version
as a back-up.)
* import.c (reload_module): test for error from getmodulename()
* moduleobject.c: implement module name as dict entry '__name__' instead
of special-casing it in module_getattr(); this way a module (or
function!) can access its own module name, and programs that know what
they are doing can rename modules.
* stdwinmodule.c (initstdwin): strip ".py" suffix of argv[0].
function vs. exec statement
* bltinmodule.c: renamed the module to __builtin__.
* posixmodule.c (posix_execv): renamed exec --> execv since it is now a
reserved word.
no longer done by config.c).
* stdwinmodule.c (initstdwin), config.c (initall): get command line
arguments from sys.argv instead of special-casing stdwin in config.c
* import.c (get_module): fix core dump when foomodule.o does not define
initfoo().
* ChangeLog: documented changes by Sjoerd.
without .py file); Bill's dynamic loading for SunOS using shared
libraries.
pwdmodule.c (mkgrent): remove DECREF of uninitialized variable.
classobject.c (instance_getattr): Fix case when class lookup returns
unbound method instead of function.
function found as instance data.
* socketmodule.c: added 'flags' argument sendto/recvfrom, rewrite
argument parsing in send/recv.
* More changes related to access (terminology change: owner instead of
class; allow any object as owner; local/global variables are owned
by their dictionary, only class/instance data is owned by the class;
"from...import *" now only imports objects with public access; etc.)
yet). The class is now passed to eval_code and stored in the current
frame. It is also stored in instance method objects. An "unbound"
instance method is now returned when a function is retrieved through
"classname.funcname", which when called passes the class to eval_code.
(1) dictionaries/mappings now have attributes values() and items() as
well as keys(); at the C level, use the new function mappinggetnext()
to iterate over a dictionary.
(2) "class C(): ..." is now illegal; you must write "class C: ...".
(3) Class objects now know their own name (finally!); and minor
improvements to the way how classes, functions and methods are
represented as strings.
(4) Added an "access" statement and semantics. (This is still
experimental -- as long as you don't use the keyword 'access' nothing
should be changed.)
lookup (opcode.h, ceval.[ch], compile.c, frameobject.[ch],
pythonrun.c, import.c). The .pyc MAGIC number is changed again.
Added get_menu_text to flmodule.
* Stubs for faster implementation of local variables (not yet finished)
* Added function name to code object. Print it for code and function
objects. THIS MAKES THE .PYC FILE FORMAT INCOMPATIBLE (the version
number has changed accordingly)
* Print address of self for built-in methods
* New internal functions getattro and setattro (getattr/setattr with
string object arg)
* Replaced "dictobject" with more powerful "mappingobject"
* New per-type functio tp_hash to implement arbitrary object hashing,
and hashobject() to interface to it
* Added built-in functions hash(v) and hasattr(v, 'name')
* classobject: made some functions static that accidentally weren't;
added __hash__ special instance method to implement hash()
* Added proper comparison for built-in methods and functions
* various modules: added 1993 to copyright.
* thread.c: added copyright notice.
* ceval.c: minor change to error message for "+"
* stdwinmodule.c: check for error from wfetchcolor
* config.c: MS-DOS fixes (define PYTHONPATH, use DELIM, use osdefs.h)
* Add declaration of inittab to import.h
* sysmodule.c: added sys.builtin_module_names
* xxmodule.c, xxobject.c: fix minor errors
* socketmodule.c: get rid of makepair(); fix makesocketaddr to fix
broken recvfrom()
* socketmodule: get rid of getStrarg()
* ceval.h: move eval_code() to new file eval.h, so compile.h is no
longer needed.
* ceval.c: move thread comments to ceval.h; always make save/restore
thread functions available (for dynloaded modules)
* cdmodule.c, listobject.c: don't include compile.h
* flmodule.c: include ceval.h
* import.c: include eval.h instead of ceval.h
* cgen.py: add forground(); noport(); winopen(""); to initgl().
* bltinmodule.c, socketmodule.c, fileobject.c, posixmodule.c,
selectmodule.c:
adapt to threads (add BGN/END SAVE macros)
* stdwinmodule.c: adapt to threads and use a special stdwin lock.
* pythonmain.c: don't include getpythonpath().
* pythonrun.c: use BGN/END SAVE instead of direct calls; also more
BGN/END SAVE calls etc.
* thread.c: bigger stack size for sun; change exit() to _exit()
* threadmodule.c: use BGN/END SAVE macros where possible
* timemodule.c: adapt better to threads; use BGN/END SAVE; add
longsleep internal function if BSD_TIME; cosmetics