for 'str' and 'unicode', and can be used instead of
types.StringTypes, e.g. to test whether something is "a string":
isinstance(x, string) is True for Unicode and 8-bit strings. This
is an abstract base class and cannot be instantiated directly.
don't understand how this function works, also beefed up the docs. The
most common usage error is of this form (often spread out across gotos):
if (_PyString_Resize(&s, n) < 0) {
Py_DECREF(s);
s = NULL;
goto outtahere;
}
The error is that if _PyString_Resize runs out of memory, it automatically
decrefs the input string object s (which also deallocates it, since its
refcount must be 1 upon entry), and sets s to NULL. So if the "if"
branch ever triggers, it's an error to call Py_DECREF(s): s is already
NULL! A correct way to write the above is the simpler (and intended)
if (_PyString_Resize(&s, n) < 0)
goto outtahere;
Bugfix candidate.
Highlights: import and friends will understand any of \r, \n and \r\n
as end of line. Python file input will do the same if you use mode 'U'.
Everything can be disabled by configuring with --without-universal-newlines.
See PEP278 for details.
PEP 285. Everything described in the PEP is here, and there is even
some documentation. I had to fix 12 unit tests; all but one of these
were printing Boolean outcomes that changed from 0/1 to False/True.
(The exception is test_unicode.py, which did a type(x) == type(y)
style comparison. I could've fixed that with a single line using
issubtype(x, type(y)), but instead chose to be explicit about those
places where a bool is expected.
Still to do: perhaps more documentation; change standard library
modules to return False/True from predicates.
Based on the patch from Danny Yoo. The fix is in exec_statement() in
ceval.c.
There are also changes to introduce use of PyCode_GetNumFree() in
several places.
of PyMapping_Keys because we know we have a real dict. Tolerate that
objects may have an attr named "__dict__" that's not a dict (Py_None
popped up during testing).
test_descr.py, test_dir(): Test the new classic-class behavior; beef up
the new-style class test similarly.
test_pyclbr.py, checkModule(): dir(C) is no longer a synonym for
C.__dict__.keys() when C is a classic class (looks like the same thing
that burned distutils! -- should it be *made* a synoym again? Then it
would be inconsistent with new-style class behavior.).
bag. It's clearly wrong for classic classes, at heart because a classic
class doesn't have a __class__ attribute, and I'm unclear on whether
that's feature or bug. I'll repair this once I find out (in the
meantime, dir() applied to classic classes won't find the base classes,
while dir() applied to a classic-class instance *will* find the base
classes but not *their* base classes).
Please give the new dir() a try and see whether you love it or hate it.
The new dir([]) behavior is something I could come to love. Here's
something to hate:
>>> class C:
... pass
...
>>> c = C()
>>> dir(c)
['__doc__', '__module__']
>>>
The idea that an instance has a __doc__ attribute is jarring (of course
it's really c.__class__.__doc__ == C.__doc__; likewise for __module__).
OTOH, the code already has too many special cases, and dir(x) doesn't
have a compelling or clear purpose when x isn't a module.
builtin_eval wasn't merging in the compiler flags from the current frame;
I suppose we never noticed this before because future division is the
first future-feature that can affect expressions (nested_scopes and
generators had only statement-level effects).
- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disables unicode literals, and the builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
Fix suggested by Michael Hudson: Raise TypeError if attribute name
passed to getattr() is not a string or Unicode. There is some
unfortunate duplication of code between builtin_getattr() and
PyObject_GetAttr(), but it appears to be unavoidable.
that info to code dynamically compiled *by* code compiled with generators
enabled. Doesn't yet work because there's still no way to tell the parser
that "yield" is OK (unlike nested_scopes, the parser has its fingers in
this too).
Replaced PyEval_GetNestedScopes by a more-general
PyEval_MergeCompilerFlags. Perhaps I should not have? I doubted it was
*intended* to be part of the public API, so just did.
- the correct range for the error message is range(0x110000);
- put the 4-byte Unicode-size code inside the same else branch as the
2-byte code, rather generating unreachable code in the 2-byte case.
- Don't hide the 'else' behine the '}'.
(I would prefer that in 4-byte mode, any value should be accepted, but
reasonable people can argue about that, so I'll put that off.)
Add configure option --enable-unicode.
Add config.h macros Py_USING_UNICODE, PY_UNICODE_TYPE, Py_UNICODE_SIZE,
SIZEOF_WCHAR_T.
Define Py_UCS2.
Encode and decode large UTF-8 characters into single Py_UNICODE values
for wide Unicode types; likewise for UTF-16.
Remove test whether sizeof Py_UNICODE is two.
NEEDS DOC CHANGES.
More AttributeErrors transmuted into TypeErrors, in test_b2.py, and,
again, this strikes me as a good thing.
This checkin completes the iterator generalization work that obviously
needed to be done. Can anyone think of others that should be changed?
NEEDS DOC CHANGES.
Possibly contentious: The first time s.next() yields StopIteration (for
a given map argument s) is the last time map() *tries* s.next(). That
is, if other sequence args are longer, s will never again contribute
anything but None values to the result, even if trying s.next() again
could yield another result. This is the same behavior map() used to have
wrt IndexError, so it's the only way to be wholly backward-compatible.
I'm not a fan of letting StopIteration mean "try again later" anyway.
Also a 2.1 bugfix candidate (am I supposed to do something with those?).
Took away map()'s insistence that sequences support __len__, and cleaned
up the convoluted code that made it *look* like it really cared about
__len__ (in fact the old ->len field was only *used* as a flag bit, as
the main loop only looked at its sign bit, setting the field to -1 when
IndexError got raised; renamed the field to ->saw_IndexError instead).
new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines
TODO:
documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)
Jeffery Collins pointed out that filterstring decrefs a character object
before it's done using it. This works by accident today because another
module always happens to have an active reference too at the time. The
accident doesn't work after his Pippy modifications, and since it *is*
an accident even in the mainline Python, it should work by design there too.
The patch accomplishes that.
If a module has a future statement enabling nested scopes, they are
also enable for the exec statement and the functions compile() and
execfile() if they occur in the module.
If Python is run with the -i option, which enters interactive mode
after executing a script, and the script it runs enables nested
scopes, they are also enabled in interactive mode.
XXX The use of -i with -c "from __future__ import nested_scopes" is
not supported. What's the point?
To support these changes, many function variants have been added to
pythonrun.c. All the variants names end with Flags and they take an
extra PyCompilerFlags * argument. It is possible that this complexity
will be eliminated in a future version of the interpreter in which
nested scopes are not optional.
except that it always returns Unicode objects.
A new C API PyObject_Unicode() is also provided.
This closes patch #101664.
Written by Marc-Andre Lemburg. Copyright assigned to Guido van Rossum.
- Make error messages from issubclass() and isinstance() a bit more
descriptive (Ping, modified by Guido)
- Couple of tiny fixes to other docstrings (Ping)
- Get rid of trailing whitespace (Guido)
Add definitions of INT_MAX and LONG_MAX to pyport.h.
Remove includes of limits.h and conditional definitions of INT_MAX
and LONG_MAX elsewhere.
This closes SourceForge patch #101659 and bug #115323.
which implements the automatic conversion from Unicode to a string
object using the default encoding.
The new API is then put to use to have eval() and exec accept
Unicode objects as code parameter. This closes bugs #110924
and #113890.
As side-effect, the traditional C APIs PyString_Size() and
PyString_AsString() will also accept Unicode objects as
parameters.
PyRun_FileEx(). These are the same as their non-Ex counterparts but
have an extra argument, a flag telling them to close the file when
done.
Then this is used by Py_Main() and execfile() to close the file after
it is parsed but before it is executed.
Adding APIs seems strange given the feature freeze but it's the only
way I see to close the bug report without incompatible changes.
[ Bug #110616 ] source file stays open after parsing is done (PR#209)
scope. Previously, s_buffer[] was defined inside the
PyUnicode_Check() scope, but referred to in the outer scope via
assignment to s. This quiets an Insure portability warning.
accepted by the BDFL.
builtin_zip(): New function to implement the zip() function described
in the above proposal.
zip_doc[]: Docstring for zip().
builtin_methods[]: added entry for zip()
This adds support for instance to the constructor (instances
have to define __str__ and can return Unicode objects via that
hook; string return values are decoded into Unicode using the
current default encoding).
Various small fixes to the builtin module to ensure no buffer
overflows.
- chunk #1:
Proper casting to ensure no truncation, and hence no surprises, in the
comparison.
- chunk #2:
The id() function guarantees a unique return value for different
objects. It does this by returning the pointer to the object. By
returning a PyInt, on Win64 (sizeof(long) < sizeof(void*)) the pointer
is truncated and the guarantee may be proven false. The appropriate
return function is PyLong_FromVoidPtr, this returns a PyLong if that
is necessary to return the pointer without truncation.
[GvR: note that this means that id() can now return a long on Win32
platforms. This *might* break some code...]
- chunk #3:
Ensure no overflow in raw_input(). Granted the user would have to pass
in >2GB of data but it *is* a possible buffer overflow condition.
module and into _exceptions.c. This includes all the PyExc_* globals,
the bltin_exc table, init_class_exc(), fini_instances(),
finierrors().
Renamed _PyBuiltin_Init_1() to _PyBuiltin_Init() since the two phase
initializations are necessary any more.
Removed as obsolete _PyBuiltin_Init_2(), _PyBuiltin_Fini_1() and
_PyBuiltin_Fini_2().
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
- When 'import exceptions' fails, don't suggest to use -v to print the traceback;
this doesn't actually work.
- Remove comment about fallback to string exceptions.
- Remove a PyErr_Occurred() check after all is said and done that can
never trigger.
- Remove static function newstdexception() which is no longer called.
are no longer supported (i.e. -X option is removed).
_PyBuiltin_Init_1(): Don't call initerrors(). This does mean that it
is possible to raise an ImportError before that exception has been
initialized, say because exceptions.py can't be found, or contains
bogosity. See changes to errors.c for how this is handled.
_PyBuiltin_Init_2(): Don't test Py_UseClassExceptionsFlag, just go
ahead and initialize the class-based standard exceptions. If this
fails, we throw a Py_FatalError.
Added special case to unicode(): when being passed a
Unicode object as first argument, return the object as-is.
Raises an exception when given a Unicode object *and* an
encoding name.
his copy of test_contains.py seems to be broken -- the lines he
deleted were already absent). Checkin messages:
New Unicode support for int(), float(), complex() and long().
- new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
- added support for Unicode to PyFloat_FromString()
- new encoding API PyUnicode_EncodeDecimal() which converts
Unicode to a decimal char* string (used in the above new
APIs)
- shortcuts for calls like int(<int object>) and float(<float obj>)
- tests for all of the above
Unicode compares and contains checks:
- comparing Unicode and non-string types now works; TypeErrors
are masked, all other errors such as ValueError during
Unicode coercion are passed through (note that PyUnicode_Compare
does not implement the masking -- PyObject_Compare does this)
- contains now works for non-string types too; TypeErrors are
masked and 0 returned; all other errors are passed through
Better testing support for the standard codecs.
Misc minor enhancements, such as an alias dbcs for the mbcs codec.
Changes:
- PyLong_FromString() now applies the same error checks as
does PyInt_FromString(): trailing garbage is reported
as error and not longer silently ignored. The only characters
which may be trailing the digits are 'L' and 'l' -- these
are still silently ignored.
- string.ato?() now directly interface to int(), long() and
float(). The error strings are now a little different, but
the type still remains the same. These functions are now
ready to get declared obsolete ;-)
- PyNumber_Int() now also does a check for embedded NULL chars
in the input string; PyNumber_Long() already did this (and
still does)
Followed by:
Looks like I've gone a step too far there... (and test_contains.py
seem to have a bug too).
I've changed back to reporting all errors in PyUnicode_Contains()
and added a few more test cases to test_contains.py (plus corrected
the join() NameError).
Introduce a new builtin exception, UnboundLocalError, raised when ceval.c
tries to retrieve or delete a local name that isn't bound to a value.
Currently raises NameError, which makes this behavior a FAQ since the same
error is raised for "missing" global names too: when the user has a global
of the same name as the unbound local, NameError makes no sense to them.
Even in the absence of shadowing, knowing whether a bogus name is local or
global is a real aid to quick understanding.
Example:
D:\src\PCbuild>type local.py
x = 42
def f():
print x
x = 13
return x
f()
D:\src\PCbuild>python local.py
Traceback (innermost last):
File "local.py", line 8, in ?
f()
File "local.py", line 4, in f
print x
UnboundLocalError: x
D:\src\PCbuild>
Note that UnboundLocalError is a subclass of NameError, for compatibility
with existing class-exception code that may be trying to catch this as a
NameError. Unfortunately, I see no way to make this wholly compatible
with -X (see comments in bltinmodule.c): under -X, [UnboundLocalError
is an alias for NameError --GvR].
[The ceval.c patch differs slightly from the second version that Tim
submitted; I decided not to raise UnboundLocalError for DELETE_NAME,
only for DELETE_LOCAL. DELETE_NAME is only generated at the module
level, and since at that level a NameError is raised for referencing
an undefined name, it should also be raised for deleting one.]
ExtensionClasses in isinstance() and issubclass().
- abstract instance and class protocols are used *only* in those
cases that would generate errors before the patch. That is, there's
no penalty for the normal case.
- instance protocol: an object smells like an instance if it
has a __class__ attribute that smells like a class.
- class protocol: an object smells like a class if it has a
__bases__ attribute that is a tuple with elements that
smell like classes (although not all elements may actually get
sniffed ;).
xrange(), especially for platforms where int and long are different
sizes (so sys.maxint isn't actually the theoretical limit for the
length of a list, but the largest C int is -- sys.maxint is the
largest Python int, which is actually a C long).
test for classes with a __complex__() method. The attribute is pulled
out of the instance with PyObject_GetAttr() but this transfers
ownership and the function object was never DECREF'd.
initialization of class exceptions. Specifically:
init_class_exc(): This function now returns an integer status of the
class exception initialization. No fatal errors in this method now.
Also, use PySys_WriteStderr() when writing error messages. When an
error occurs in this function, 0 is returned, but the partial creation
of the exception classes is not undone (this happens elsewhere).
Things that could trigger the fallback:
- exceptions.py fails to be imported (due to syntax error, etc.)
- one of the exception classes is missing (e.g. due to library
version mismatch)
- exception class can't be inserted into __builtin__'s dictionary
- MemoryError instance can't be pre-allocated
- some other PyErr_Occurred
newstdexception(): Changed the error message. This is still a fatal
error because if the string based exceptions can't be created, we
really can't continue.
initerrors(): Be sure to xdecref the .exc field, which might be
non-NULL if class exceptions init was aborted.
_PyBuiltin_Init_2(): If class exception init fails, print a warning
message and reinstate the string based exceptions.
OSError. The EnvironmentError serves primarily as the (common
implementation) base class for IOError and OSError. OSError is used
by posixmodule.c
Also added tuple definition of EnvironmentError when using string
based exceptions.
(1) If a sequence S is shorter than len(S) indicated, don't fail --
just use the shorter size. (I.e, len(S) is just a hint.)
(2) Implement the special case map(None, S) as list(S) -- it's faster.
the code here becomes much simpler. In particular: abs(), divmod(),
pow(), int(), long(), float(), len(), tuple(), list().
Also make sure that no use of a function pointer gotten from a
tp_as_sequence or tp_as_mapping structure is made without checking it
for NULL first.
A few other cosmetic things, such as properly reindenting slice().
but annoying memory leak. This was introduced when PyExc_Exception
was added; the loop above populating the PyExc_StandardError exception
tuple started at index 1 in bltin_exc, but PyExc_Exception was added
at index 0, so PyExc_StandardError was getting inserted in itself!
How else can a tuple include itself?!
Change the loop to start at index 2.
This was a *fun* one! :-)