the symbol table pass. These blocks were already ignored by the code
gen pass. Both passes must visit the same set of blocks in the same
order.
Fixes SF buf 132820
They're actually complaining about something more specific, an assignment
in a lambda as an actual argument, so that Python parses the
lambda as if it were a keyword argument. Like f(lambda x: x[0]=42).
The "lambda x: x[0]" part gets parsed as if it were a keyword, being
bound to 42, and the resulting error msg didn't make much sense.
_testcapimodule.c
make sure PyList_Reverse doesn't blow up again
getargs.c
assert args isn't NULL at the top of vgetargs1 instead of
waiting for a NULL-pointer dereference at the end
Bug was introduced by tricks played to make .pyc files executable
via cmdline arg. Then again, -x worked via a trick to begin with.
If anyone can think of a portable way to test -x, be my guest!
create an empty dictionary if it is called without keyword args. Just
pass NULL.
XXX I had believed that this caused weird errors, but the test suite
runs cleanly.
of nested functions. Either is allowed in a function if it contains
no defs or lambdas or the defs and lambdas it contains have no free
variables. If a function is itself nested and has free variables,
either is illegal.
Revise the symtable to use a PySymtableEntryObject, which holds all
the revelent information for a scope, rather than using a bunch of
st_cur_XXX pointers in the symtable struct. The changes simplify the
internal management of the current symtable scope and of the stack.
Added new C source file: Python/symtable.c. (Does the Windows build
process need to be updated?)
As part of these changes, the initial _symtable module interface
introduced in 2.1a2 is replaced. A dictionary of
PySymtableEntryObjects are returned.
hooks to take over the Python import machinery at a very early stage
in the Python startup phase.
If there are still places in the Python interpreter which need to
bypass the __import__ hook, these places must now use
PyImport_ImportModuleEx() instead. So far no other places than in
the import mechanism itself have been identified.
symtable.h, so that they can be used by external module.
Improve error handling in symtable_enter_scope(), which return an
error code that went unchecked by most callers. XXX The error handling
in symtable code is sloppy in general.
Modify symtable to record the line number that begins each scope.
This can help to identify which code block is being referred to when
multiple blocks are bound to the same name.
Add st_scopes dict that is used to preserve scope info when
PyNode_CompileSymtable() is called. Otherwise, this information is
tossed as soon as it is no longer needed.
Add Py_SymtableString() to pythonrun; analogous to Py_CompileString().
discussion on python-dev. 'from mod import *' is still banned except
at the module level.
Fix value for special NOOPT entry in symtable. Initialze to 0 instead
of None, so that later uses of PyInt_AS_LONG() are valid. (Bug
reported by Donn Cave.)
replace local REPR macros with PyObject_REPR in object.h
reference manual but not checked: Names bound by import statemants may
not occur in global statements in the same scope. The from ... import *
form may only occur in a module scope.
I guess these changes could break code, but the reference manual
warned about them.
Several other small changes
If a variable is declared global in the nearest enclosing scope of a
free variable, then treat it is a global in the nested scope too.
Get rid of com_mangle and symtable_mangle functions and call mangle
directly.
If errors occur during symtable table creation, return -1 from
symtable_build().
Do not increment st_errors in assignment to lambda, because exception
is not set.
Add extra argument to symtable_assign(); the argument, flag, is ORed
with DEF_LOCAL for each symtable_add_def() call.
This change eliminates an extra malloc/free when a frame with free
variables is created. Any cell vars or free vars are stored in
f_localsplus after the locals and before the stack.
eval_code2() fills in the appropriate values after handling
initialization of locals.
To track the size the frame has an f_size member that tracks the total
size of f_localsplus. It used to be implicitly f_nlocals + f_stacksize.
They're named as if public, so I did a Bad Thing by changing
PyMarshal_ReadObjectFromFile() to suck up the remainder of the file in one
gulp: anyone who counted on that leaving the file pointer merely at the
end of the next object would be screwed. So restored
PyMarshal_ReadObjectFromFile() to its earlier state, renamed the new greedy
code to PyMarshal_ReadLastObjectFromFile(), and changed Python internals to
call the latter instead.
SF patch http://sourceforge.net/patch/?func=detailpatch&patch_id=103453&group_id=5470
PyMember_Set of T_CHAR always raises exception.
Unfortunately, this is a use of a C API function that Python itself never makes, so
there's no .py test I can check in to verify this stays fixed. But the fault in the
code is obvious, and Dave Cole's patch just as obviously fixes it.
The majority of the changes are in the compiler. The mainloop changes
primarily to implement the new opcodes and to pass a function's
closure to eval_code2(). Frames and functions got new slots to hold
the closure.
Include/compile.h
Add co_freevars and co_cellvars slots to code objects.
Update PyCode_New() to take freevars and cellvars as arguments
Include/funcobject.h
Add func_closure slot to function objects.
Add GetClosure()/SetClosure() functions (and corresponding
macros) for getting at the closure.
Include/frameobject.h
PyFrame_New() now takes a closure.
Include/opcode.h
Add four new opcodes: MAKE_CLOSURE, LOAD_CLOSURE, LOAD_DEREF,
STORE_DEREF.
Remove comment about old requirement for opcodes to fit in 7
bits.
compile.c
Implement changes to code objects for co_freevars and co_cellvars.
Modify symbol table to use st_cur_name (string object for the name
of the current scope) and st_cur_children (list of nested blocks).
Also define st_nested, which might more properly be called
st_cur_nested. Add several DEF_XXX flags to track def-use
information for free variables.
New or modified functions of note:
com_make_closure(struct compiling *, PyCodeObject *)
Emit LOAD_CLOSURE opcodes as needed to pass cells for free
variables into nested scope.
com_addop_varname(struct compiling *, int, char *)
Emits opcodes for LOAD_DEREF and STORE_DEREF.
get_ref_type(struct compiling *, char *name)
Return NAME_CLOSURE if ref type is FREE or CELL
symtable_load_symbols(struct compiling *)
Decides what variables are cell or free based on def-use info.
Can now raise SyntaxError if nested scopes are mixed with
exec or from blah import *.
make_scope_info(PyObject *, PyObject *, int, int)
Helper functions for symtable scope stack.
symtable_update_free_vars(struct symtable *)
After a code block has been analyzed, it must check each of
its children for free variables that are not defined in the
block. If a variable is free in a child and not defined in
the parent, then it is defined by block the enclosing the
current one or it is a global. This does the right logic.
symtable_add_use() is now a macro for symtable_add_def()
symtable_assign(struct symtable *, node *)
Use goto instead of for (;;)
Fixed bug in symtable where name of keyword argument in function
call was treated as assignment in the scope of the call site. Ex:
def f():
g(a=2) # a was considered a local of f
ceval.c
eval_code2() now take one more argument, a closure.
Implement LOAD_CLOSURE, LOAD_DEREF, STORE_DEREF, MAKE_CLOSURE>
Also: When name error occurs for global variable, report that the
name was global in the error mesage.
Objects/frameobject.c
Initialize f_closure to be a tuple containing space for cellvars
and freevars. f_closure is NULL if neither are present.
Objects/funcobject.c
Add support for func_closure.
Python/import.c
Change the magic number.
Python/marshal.c
Track changes to code objects.
parameters that contained both anonymous tuples and *arg or **arg. Ex:
def f(a, (b, c), *d): pass
Fix the symtable_params() to generate names in the right order for
co_varnames slot of code object. Consider *arg and **arg before the
"complex" names introduced by anonymous tuples.
module__doc__: Document the Warning subclass heirarchy.
make_class(): Added a "goto finally" so that if populate_methods()
fails, the return status will be -1 (failure) instead of 0 (success).
fini_exceptions(): When decref'ing the static pointers to the
exception classes, clear out their dictionaries too. This breaks a
cycle from class->dict->method->class and allows the classes with
unbound methods to be reclaimed. This plugs a large memory leak in a
common Py_Initialize()/dosomething/Py_Finalize() loop.
pythonrun.c: In Py_Finalize, don't reset the initialized flag until after
the exit funcs have run.
atexit.py: in _run_exitfuncs, mutate the list of pending calls in a
threadsafe way. This wasn't a contributor to bug 128475, it just burned
my eyeballs when looking at that bug.
symbol table for each top-level compilation unit. The information in
the symbol table allows the elimination of the later optimize() pass;
the bytecode generation emits the correct opcodes.
The current version passes the complete regression test, but may still
contain some bugs. It's a fairly substantial revision. The current
code adds an assert() and a test that may lead to a Py_FatalError().
I expect to remove these before 2.1 beta 1.
The symbol table (struct symtable) is described in comments in the
code.
The changes affects the several com_XXX() functions that were used to
emit LOAD_NAME and its ilk. The primary interface for this bytecode
is now com_addop_varname() which takes a kind and a name, where kind
is one of VAR_LOAD, VAR_STORE, or VAR_DELETE.
There are many other smaller changes:
- The name mangling code is no longer contained in ifdefs. There are
two functions that expose the mangling logical: com_mangle() and
symtable_mangle().
- The com_error() function can accept NULL for its first argument;
this is useful with is_constant_false() is called during symbol
table generation.
- The loop index names used by list comprehensions have been changed
from __1__ to [1], so that they can not be accessed by Python code.
- in com_funcdef(), com_argdefs() is now called before the body of the
function is compiled. This provides consistency with com_lambdef()
and symtable_funcdef().
- Helpers do_pad(), dump(), and DUMP() are added to aid in debugging
the compiler.
except that it always returns Unicode objects.
A new C API PyObject_Unicode() is also provided.
This closes patch #101664.
Written by Marc-Andre Lemburg. Copyright assigned to Guido van Rossum.
- Use PyObject_RichCompare*() where possible: when comparing
keyword arguments, in _PyEval_SliceIndex(), and of course in
cmp_outcome().
Unrelated stuff:
- Removed all trailing whitespace.
- Folded some long lines.
message, and tries to make the messages more consistent and helpful when
the wrong number of arguments or duplicate keyword arguments are supplied.
Comes with more tests for test_extcall.py and and an update to an error
message in test/output/test_pyexpat.
re-initializing Python (Py_Finalize() followed by Py_Initialize()) to
blow up quickly. With the DECREF removed I can't get it to fail any
more. (Except it still leaks, but that's probably a separate issue.)
1) "from M import X" now works even if M is not a real module; it's
basically a getattr() operation with AttributeError exceptions
changed into ImportError.
2) "from M import *" now looks for M.__all__ to decide which names to
import; if M.__all__ doesn't exist, it uses M.__dict__.keys() but
filters out names starting with '_' as before. Whether or not
__all__ exists, there's no restriction on the type of M.
- Make error messages from issubclass() and isinstance() a bit more
descriptive (Ping, modified by Guido)
- Couple of tiny fixes to other docstrings (Ping)
- Get rid of trailing whitespace (Guido)
Cygwin Python DLL and Shared Extension Patch). Add module.dll as a
valid extension.
jlt63 writes: Note that his change essentially backs out the fix for
bug #115973. Should ".pyd" be retained instead for posterity?
an empty keywords dictionary (via apply() or the extended call syntax),
the keywords dict should be ignored. If the keywords dict is not empty,
TypeError should be raised. (Between the restructuring of the call
machinery and this patch, an empty dict in this situation would trigger
a SystemError via PyErr_BadInternalCall().)
Added regression tests to detect errors for this.
More revision still needed.
Much of the code that was in the mainloop was moved to a series of
helper functions. PyEval_CallObjectWithKeywords was split into two
parts. The first part now only does argument handling. The second
part is now named call_object and delegates the call to a
call_(function,method,etc.) helper.
XXX The call_XXX helper functions should be replaced with tp_call
functions for the respective types.
The CALL_FUNCTION implementation contains three kinds of optimization:
1. fast_cfunction and fast_function are called when the arguments on
the stack can be passed directly to eval_code2() without copying
them into a tuple.
2. PyCFunction objects are dispatched immediately, because they are
presumed to occur more often than anything else.
3. Bound methods are dispatched inline. The method object contains a
pointer to the function object that will be called. The function
is called from within the mainloop, which may allow optimization #1
to be used, too.
The extened call implementation -- f(*args) and f(**kw) -- are
implemented as a separate case in the mainloop. This allows the
common case of normal function calls to execute without wasting time
on checks for extended calls, although it does introduce a small
amount of code duplication.
Also, the unused final argument of eval_code2() was removed. This is
probably the last trace of the access statement :-).
"..." in "from M import ..." was never DECREFed. Leak reported by
James Slaughter and nailed by Barry, who also provided an earlier
version of this patch.
the bug report (for details, look at it), but agree there's no need for Python
to declare atof itself: we #include stdlib.h, and ANSI C sez atof is declared
there already.
regardless of whether the system getopt() does what we want. This avoids the
hassle with prototypes and externs, and the check to see if the system
getopt() does what we want. Prefix optind, optarg and opterr with _PyOS_ to
avoid name clashes. Add new include file to define the right symbols. Fix
Demo/pyserv/pyserv.c to include getopt.h itself, instead of relying on
Python to provide it.
When a method is called with no regular arguments and * args, defer
the first arg is subclass check until after the * args have been
expanded.
N.B. The CALL_FUNCTION implementation is getting really hairy; should
review it to see if it can be simplified.
by making the DUP_TOPX code utterly straightforward. This also gets rid
of all normal-case internal DUP_TOPX if/branches, and allows replacing one
POP() with TOP() in each case, so is a good idea regardless.
Do not assume that all platforms using a MetroWorks compiler can use
POSIX threads; the assumption breaks on BeOS. This fix only helps
for BeOS.
This closes SourceForge patch #101772.
unintentionally caused them to get written in text mode under Windows.
As a result, when .pyc files were later read-- in binary mode --the
magic number was always wrong (note that .pyc magic numbers deliberately
include \r and \n characters, so this was "good" breakage, 100% across
all .pyc files, not random corruption in a subset). Fixed that.
Add definitions of INT_MAX and LONG_MAX to pyport.h.
Remove includes of limits.h and conditional definitions of INT_MAX
and LONG_MAX elsewhere.
This closes SourceForge patch #101659 and bug #115323.
Add three new convenience functions to the PyModule_*() family:
PyModule_AddObject(), PyModule_AddIntConstant(), PyModule_AddStringConstant().
This closes SourceForge patch #101233.
"s#" will now return a pointer to the default encoded string data
of the Unicode object instead of a pointer to the raw UTF-16
data.
The latter is still available via PyObject_AsReadBuffer().
The patch also adds an optimization for string objects which is
based on the fact that string objects return the raw character data
for getreadbuffer access and are always single-segment.
which implements the automatic conversion from Unicode to a string
object using the default encoding.
The new API is then put to use to have eval() and exec accept
Unicode objects as code parameter. This closes bugs #110924
and #113890.
As side-effect, the traditional C APIs PyString_Size() and
PyString_AsString() will also accept Unicode objects as
parameters.
When reading a short, sign-extend on platforms where shorts are
bigger than 16 bits.
When reading a long, repair the unportable sign extension that was
being done for 64-bit machines (it assumed that signed right shift
sign-extends).
I can't test this, so I'm just checking it in with blind faith in Andy.
I've tested that it doesn't broeak a non-Pth build on Linux.
Changes include:
- There's a --with-pth configure option.
- Instead of _GNU_PTH, we test for HAVE_PTH.
- Better signal handling.
- (The config.h.in file is regenerated in a slightly different order.)
can cause it to get called by multiple threads simultaneously.
Ditto for PyInterpreterState_Delete.
Of the former, the docs say "The interpreter lock need not be held, but may
be held if it is necessary to serialize calls to this function". This
kinda implies it both is and isn't thread-safe.
Of the latter, the docs merely say "The interpreter lock need not be
held.", and the clause about serializing is absent.
I expect it was *believed* these are both thread-safe, and the bit about
serializing via the global lock was meant as a permission rather than a
caution.
I also expect we've never seen a problem here because the Python core
(prior to the _PyPclose fix) only calls these functions once per run.
The Py_NewInterpreter subsystem exposed by the C API (but not used by
Python itself) also calls them, but that subsystem appears to be very
rarely used.
Whatever, they're both thread-safe now.
ceval.c:
define recurion_limit (static), default value is 2500
define Py_GetRecursionLimit and Py_SetRecursionLimit
raise RuntimeError if limit is exceeded
PC/config.h:
remove plat-specific definition
sysmodule.c:
add sys.(get|set)recursionlimit
how 'import' was called with a compiletime mechanism: create either a tuple
of the import arguments, or None (in the case of a normal import), add it to
the code-block constants, and load it onto the stack before calling
IMPORT_NAME.
PyRun_FileEx(). These are the same as their non-Ex counterparts but
have an extra argument, a flag telling them to close the file when
done.
Then this is used by Py_Main() and execfile() to close the file after
it is parsed but before it is executed.
Adding APIs seems strange given the feature freeze but it's the only
way I see to close the bug report without incompatible changes.
[ Bug #110616 ] source file stays open after parsing is done (PR#209)
(This fix is a bit broken, just as the test already was: the test for
testlist and listmaker are done always, whereas the test for exprlist and
the actual abort() are only done if Py_DEBUG is defined. Suggestions
welcome, I guess ;)
Add the EXTENDED_ARG opcode to the virtual machine, allowing 32-bit
arguments to opcodes instead of being forced to stick to the 16-bit
limit. This is especially useful for machine-generated code, which
can be too long for the SET_LINENO parameter to fit into 16 bits.
This closes the implementation portion of SourceForge patch #100893.
- Fix bug in thread_pthread.h::PyThread_get_thread_ident() where
sizeof(pthread) < sizeof(long).
- Add 'configure' for:
- SIZEOF_PTHREAD is pthread_t can be included via <pthread.h>
- setting Monterey system name
- appropriate CC,LINKCC,LDSHARED,OPT, and CCSHARED for Monterey
- Add section in README for Monterey build
eval_code2(): Implement new bytecodes PRINT_ITEM_TO and
PRINT_NEWLINE_TO, as per accepted SF patch #100970.
Also update graminit.c based on related Grammar/Grammar changes.
trying hard enough to find out what the arguments to an import were. There
is no test-case for this bug, yet, but this is what it looked like:
from encodings import cp1006, cp1026
ImportError: cannot import name cp1026
'__import__' was called with only the first name in the 'arguments' list.
load mod.submod as m, or mod as m ? Both can be achieved differently, and
unambiguously. Also attempt to document this restriction (editor
appreciated!)
Note that this is an artificial check during compile, because incorporating
this in the grammar is hard, and then adjusting the compiler to do the right
thing with the right nodes is harder.
scope. Previously, s_buffer[] was defined inside the
PyUnicode_Check() scope, but referred to in the outer scope via
assignment to s. This quiets an Insure portability warning.
name as n'. By doing some twists and turns, "as" is not a reserved word.
There is a slight change in semantics for 'from module import name' (it will
now honour the 'global' keyword) but only in cases that are explicitly
undocumented.
First, the allocated buffer was never freed after using it to create
the PyString object. Second, it was possible that have_filename would
be false (meaning that filename was not a PyString object), but that
the code would still try to PyString_GET_SIZE() it.
in binascii.c (only on platforms with signed chars -- although Py_CHARMASK
is documented as returning an int, it only does so on platforms with
signed chars).
returning a pointer to the start of the file's "base" name;
similar to os.path.basename().
SyntaxError__str__(): Use my_basename() to keep the length of the
file name included in the exception message short.
filename and lineno attributes, but do not mask the SyntaxError if we
fail.
This is part of what is needed to close SoruceForge bug #110628
(Jitterbug PR#278).
Wrap a long line to fit in under 80 columns.
filename and lineno attributes, but do not mask the SyntaxError if we
fail.
This is part of what is needed to close SoruceForge bug #110628
(Jitterbug PR#278).
than depending on the site that raises the exception. If the
filename and lineno attributes are set on the exception object,
use them to augment the message displayed.
This is part of what is needed to close SoruceForge bug #110628
(Jitterbug PR#278).
string literals has not been tested on an MS_WIN16 platform; the trailing
";" was inside the #ifndef MS_WIN16, which should cause an error (missing
semi-colon) when compiled with that symbol #defined.
did the same anyway.
I'm not sure what to do with Tools/compiler/compiler/* -- that isn't part of
distutils, is it ? Should it try to be compatible with old bytecode version ?
-32768..65535 is acceptable. Added B specifier (with values from
-128..255). No L added (which would have completed the set) because l
already accepts any value (and the letter L is taken for quadwords).
python-dev discussion.
This should catch future version incompatibilities on Windows. Alas,
this doesn't help for 1.5 vs. 1.6; but it will help for 1.6 vs. 2.0.
the Python Unicode implementation.
The internal buffer used for implementing the buffer protocol
is renamed to defenc to make this change visible. It now holds the
default encoded version of the Unicode object and is calculated
on demand (NULL otherwise).
Since the default encoding defaults to ASCII, this will mean that
Unicode objects which hold non-ASCII characters will no longer
work on C APIs using the "s" or "t" parser markers. C APIs must now
explicitly provide Unicode support via the "u", "U" or "es"/"es#"
parser markers in order to work with non-ASCII Unicode strings.
(Note: this patch will also have to be applied to the 1.6 branch
of the CVS tree.)
accepted by the BDFL.
builtin_zip(): New function to implement the zip() function described
in the above proposal.
zip_doc[]: Docstring for zip().
builtin_methods[]: added entry for zip()
for systems that are missing those declarations from system include files.
Start by moving a pointy-haired ones from their previous locations to the
new section.
(The gethostname() one, for instance, breaks on several systems, because
some define it as (char *, size_t) and some as (char *, int).)
I purposely decided not to include the summary of used #defines like Tim did
in the first section of pyport.h. In my opinion, the number of #defines
likedly to be used by this section would make such an overview unwieldy. I
would suggest documenting the non-obvious ones, though.
good C practice hasn't been available to everything all along.
Added Py_SAFE_DOWNCAST(VALUE, WIDE, NARROW) macro to pyport.h; this
just casts VALUE from type WIDE to type NARROW, but assert-fails if
Py_DEBUG is defined and info is lost due to casting.
Replaced a line in Fredrik's fix to marshal.c to use the new macro.
MAGIC number. When updating it next time, be sure it's higher than 50715 *
constants. (Shouldn't be a problem if everyone keeps to the proper
algorithm.)
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
used for indentation related errors. This patch includes Ping's
improvements for indentation-related error messages.
Closes SourceForge patches #100734 and #100856.
This adds support for instance to the constructor (instances
have to define __str__ and can return Unicode objects via that
hook; string return values are decoded into Unicode using the
current default encoding).
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.
The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!
The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.
Notes on the patch:
There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)
Closes SourceForge patch #100505.
This patch fixes possible overflow in the use of
PyOS_GetLastModificationTime in getmtime.c and Python/import.c.
Currently PyOS_GetLastModificationTime returns a C long. This can
overflow on Win64 where sizeof(time_t) > sizeof(long). Besides it
should logically return a time_t anyway (this patch changes this).
As well, import.c uses PyOS_GetLastModificationTime for .pyc
timestamping. There has been recent discussion about the .pyc header
format on python-dev. This patch adds oveflow checking to import.c so
that an exception will be raised if the modification time
overflows. There are a few other minor 64-bit readiness changes made
to the module as well:
- size_t instead of int or long for function-local buffer and string
length variables
- one buffer overflow check was added (raises an exception on possible
overflow, this overflow chance exists on 32-bit platforms as well), no
other possible buffer overflows existed (from my analysis anyway)
Closes SourceForge patch #100509.
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.
The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!
The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.
Notes on the patch:
There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)
Closes SourceForge patch #100505.
This patch fixes a problem on AIX with the signed int case code in
getargs.c, after Trent Mick's intervention about MIN/MAX overflow
checks. The AIX compiler/optimizer generates bogus code with the
default flags "-g -O" causing test_builtin to fail: int("10", 16) <>
16L. Swapping the two checks in the signed int code makes the problem
go away.
Also, make the error messages fit in 80 char lines in the
source.
The depth field was never decremented inside w_object(), and it was
never initialized in PyMarshal_WriteObjectToFile().
This caused imports from .pyc files to fil mysteriously when the .pyc
file was written by the broken code -- w_object() would bail out
early, but PyMarshal_WriteObjectToFile() doesn't check the error or
return an error code, and apparently the marshalling code doesn't call
PyErr_Check() either. (That's a separate patch if I feel like it.)
Various small fixes to the builtin module to ensure no buffer
overflows.
- chunk #1:
Proper casting to ensure no truncation, and hence no surprises, in the
comparison.
- chunk #2:
The id() function guarantees a unique return value for different
objects. It does this by returning the pointer to the object. By
returning a PyInt, on Win64 (sizeof(long) < sizeof(void*)) the pointer
is truncated and the guarantee may be proven false. The appropriate
return function is PyLong_FromVoidPtr, this returns a PyLong if that
is necessary to return the pointer without truncation.
[GvR: note that this means that id() can now return a long on Win32
platforms. This *might* break some code...]
- chunk #3:
Ensure no overflow in raw_input(). Granted the user would have to pass
in >2GB of data but it *is* a possible buffer overflow condition.
As I really do not have anything better to do at the moment, I have written
a patch to Python/marshal.c that prevents Python dumping core when trying
to marshal stack bustingly deep (or recursive) data structure.
It just throws an exception; even slightly clever handling of recursive
data is what pickle is for...
[Fred Drake:] Moved magic constant 5000 to a #define.
This closes SourceForge patch #100645.
the number of children of a node exceeds the max possible value for
the short that is used to count them. The Python runtime converts
this parser error into the SyntaxError "expression too long."
module and into _exceptions.c. This includes all the PyExc_* globals,
the bltin_exc table, init_class_exc(), fini_instances(),
finierrors().
Renamed _PyBuiltin_Init_1() to _PyBuiltin_Init() since the two phase
initializations are necessary any more.
Removed as obsolete _PyBuiltin_Init_2(), _PyBuiltin_Fini_1() and
_PyBuiltin_Fini_2().
need two phase init or fini of the builtin module. Change the call of
_PyBuiltin_Init_1() to _PyBuiltin_Init(). Add a call to
init_exceptions().
Py_Finalize(): Don't call _PyBuiltin_Fini_1(). Instead call
fini_exceptions() but move this to before the thread state is
cleared.
Limit the 'b' formatter of PyArg_ParseTuple to valid values of an unsigned
char, i.e. [0,UCHAR_MAX]. It is expected that this is the common usage of 'b'.
An OverflowError is raised if the parsed value is outside this range.
Changes the 'b', 'h', and 'i' formatters in PyArg_ParseTuple to raise an
Overflow exception if they overflow (previously they just silently
overflowed).
Changes by Guido: always accept values [0..255] (in addition to
[CHAR_MIN..CHAR_MAX]) for 'b' format; changed some spaces into tabs in
other code.
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
- When 'import exceptions' fails, don't suggest to use -v to print the traceback;
this doesn't actually work.
- Remove comment about fallback to string exceptions.
- Remove a PyErr_Occurred() check after all is said and done that can
never trigger.
- Remove static function newstdexception() which is no longer called.
Added 'u' and 'u#' tags for PyArg_ParseTuple - these turn a
PyUnicodeObject argument into a Py_UNICODE * buffer, or a Py_UNICODE *
buffer plus a length with the '#'. Also added an analog to 'U'
for Py_BuildValue.
return 0 (exceptions don't match). This means that if an ImportError
is raised because exceptions.py can't be imported, the interpreter
will exit "cleanly" with an error message instead of just core
dumping.
PyErr_SetFromErrnoWithFilename(), PyErr_SetFromWindowsErrWithFilename():
Don't test on Py_UseClassExceptionsFlag.
are no longer supported (i.e. -X option is removed).
_PyBuiltin_Init_1(): Don't call initerrors(). This does mean that it
is possible to raise an ImportError before that exception has been
initialized, say because exceptions.py can't be found, or contains
bogosity. See changes to errors.c for how this is handled.
_PyBuiltin_Init_2(): Don't test Py_UseClassExceptionsFlag, just go
ahead and initialize the class-based standard exceptions. If this
fails, we throw a Py_FatalError.
Changed all references to the MAGIC constant to use a global
pyc_magic instead. This global is initially set to MAGIC, but can be
changed by the _PyImport_Init() function to provide for
special features implemented in the compiler which are settable
using command line switches and affect the way PYC files are
generated.
Currently this change is only done for the -U flag.
Support for the new -U command line option option:
with the option enabled the Python compiler
interprets all "..." strings as u"..." (same with r"..." and
ur"...").
Follow a suggestion in an /*XXX*/ comment [in com_add()] to speed up
compilation by using supplemental dictionaries to keep track of names
and constants, eliminating quadratic behavior. With this patch in
place, the time to import a 5000-line file with lots of constants [at
the global level] is reduced from 20 seconds to under 3 on my system.
Here's a patch which changes modsupport to add 'u' and 'u#',
to support building Unicode objects from a null-terminated
Py_UNICODE *, and a Py_UNICODE * with length, respectively.
[Conversion from 'U' to 'u' by Fred, based on python-dev comments.]
Note that the use of None for NULL values of the Py_UNICODE* value is
still in; I'm not sure of the conclusion on that issue.
remaining object references if the environment variable PYTHONDUMPREFS
exists. The default behaviour caused problems for background or
otherwise invisible processes that use the debug build of Python.
Fixed a memory leak found by Fredrik Lundh. Instead of
PyUnicode_AsUTF8String() we now use _PyUnicode_AsUTF8String() which
returns the string object without incremented refcount (and assures
that the so obtained object remains alive until the Unicode object is
garbage collected).
"""
Running "test_extcall" repeatedly results in memory leaks.
One of these can't be fixed (at least not easily!), it happens since
this code:
def saboteur(**kw):
kw['x'] = locals()
d = {}
saboteur(a=1, **d)
creates a circular reference - d['x']['d']==d
The others are due to some missing decrefs in ceval.c, fixed by the
patch attached below.
Note: I originally wrote this without the "goto", just adding the
missing decref's where needed. But I think the goto is justified in
keeping the executable code size of ceval as small as possible.
"""
[I think the circular reference is more like kw['x']['kw'] == kw. --GvR]
Added special case to unicode(): when being passed a
Unicode object as first argument, return the object as-is.
Raises an exception when given a Unicode object *and* an
encoding name.
comparing code objects. This give sless surprising results in
-Optimized code. It also sorts code objects by name, now.
[I changed the patch to hash() slightly to touch fewer lines.]
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent). Checkin messages:
New Unicode support for int(), float(), complex() and long().
- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
Unicode to a decimal char* string (used in the above new
APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above
Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
are masked, all other errors such as ValueError during
Unicode coercion are passed through (note that PyUnicode_Compare
does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
masked and 0 returned; all other errors are passed through
Better testing support for the standard codecs.
Misc minor enhancements, such as an alias dbcs for the mbcs codec.
Changes:
- PyLong_FromString() now applies the same error checks as
does PyInt_FromString(): trailing garbage is reported
as error and not longer silently ignored. The only characters
which may be trailing the digits are 'L' and 'l' -- these
are still silently ignored.
- string.ato?() now directly interface to int(), long() and
float(). The error strings are now a little different, but
the type still remains the same. These functions are now
ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
in the input string; PyNumber_Long() already did this (and
still does)
Followed by:
Looks like I've gone a step too far there... (and test_contains.py
seem to have a bug too).
I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
If a non-tuple sequence is passed as the *arg, convert it to a tuple
before checking its length.
If named keyword arguments are used in combination with **kwargs, make
a copy of kwargs before inserting the new keys.
the return value of PySequence_Length(). If an exception occurred,
the returned length will be -1. Make sure this doesn't get obscurred,
and that the bogus length isn't used.
executive summary:
Instead of typing 'apply(f, args, kwargs)' you can type 'f(*arg, **kwargs)'.
Some file-by-file details follow.
Grammar/Grammar:
simplify varargslist, replacing '*' '*' with '**'
add * & ** options to arglist
Include/opcode.h & Lib/dis.py:
define three new opcodes
CALL_FUNCTION_VAR
CALL_FUNCTION_KW
CALL_FUNCTION_VAR_KW
Python/ceval.c:
extend TypeError "keyword parameter redefined" message to include
the name of the offending keyword
reindent CALL_FUNCTION using four spaces
add handling of sequences and dictionaries using extend calls
fix function import_from to use PyErr_Format
The attached patch set includes a workaround to get Python with
Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause
is a bug in the BSDI wchar.h header file) and Python interfaces
for the MBCS codec donated by Mark Hammond.
Also included are some minor corrections w/r to the docs of
the new "es" and "es#" parser markers (use PyMem_Free() instead
of free(); thanks to Mark Hammond for finding these).
The unicodedata tests are now in a separate file
(test_unicodedata.py) to avoid problems if the module cannot
be found.
Attached you find the latest update of the Unicode implementation.
The patch is against the current CVS version.
It includes the fix I posted yesterday for the core dump problem
in codecs.c (was introduced by my previous patch set -- sorry),
adds more tests for the codecs and two new parser markers
"es" and "es#".