find_class(): We no longer mask all exceptions[1] by transforming them
into SystemError. The latter is definitely not the right thing to do,
so we let any exceptions that occur in the PyObject_GetAttr() call to
simply propagate up if they occur.
[1] Note that pickle only masked ImportError, KeyError, and
AttributeError, but cPickle masked all exceptions.
This gives mmap() on Windows the ability to create read-only, write-
through and copy-on-write mmaps. A new keyword argument is introduced
because the mmap() signatures diverged between Windows and Unix, so
while they (now) both support this functionality, there wasn't a way to
spell it in a common way without introducing a new spelling gimmick.
The old spellings are still accepted, so there isn't a backward-
compatibility issue here.
objects to save in gc.garbage. This should be the last change needed to
fix SF bug 477059: "__del__ on new classes vs. GC".
Note that this change slightly changes the behavior of the collector.
Before, if a cycle was found that contained instances with __del__
methods then all instance objects in that cycle were saved in
gc.garbage. Now, only objects with __del__ methods are saved in
gc.garbage.
When moving objects with a __del__ attribute to a special list, look
for __del__ on new-style classes with the HEAPTYPE flag set as well.
(HEAPTYPE means the class was created by a class statement.)
to lists of values, giving the contents of all the ADD_INFO records
seen so far. This is initialized agressively when the log file is
opened, so that whoever is looking at the log reader can always see
the initial data loaded into the data stream. ADD_INFO events later
in the log file continue to be reported to the application layer as
before.
Add a new method, addinfo(), to the profiler. This can be used to
insert additional ADD_INFO records into the profiler log.
Fix the tp_flags and tp_name slots on the type objects.
"socket.socket" -- on Windows, "socket.socket" is the wrapper class.
Also added the module name to the SSL type (which is not a new-style
class -- I don't want to mess with it yet).
constructor acts just like socket() before. All three arguments have
a sensible default now; socket() is equivalent to
socket(AF_INET, SOCK_STREAM).
One minor issue: the socket() function and the SocketType had
different doc strings; socket.__doc__ gave the signature,
SocketType.__doc__ gave the methods. I've merged these for now, but
maybe the list of methods is no longer necessary since it can easily
be recovered through socket.__dict__.keys(). The problem with keeping
it is that the total doc string is a bit long (34 lines -- it scrolls
of a standard tty screen).
Another general issue with the socket module is that it's a big mess.
There's pages and pages of random platform #ifdefs, and the naming
conventions are totally wrong: it uses Py prefixes and CapWords for
static functions. That's a cleanup for another day... (Also I think
the big starting comment that summarizes the API can go -- it's a
repeat of the docstring.)
error occurs, and doesn't return a count. (This is my second patch
from SF patch #474307, with small change to the docstring for send().)
2.1.2 "bugfix" candidate.
Replace some tortuous code that was trying to be clever but forgot to
DECREF the key and value, by more longwinded but obviously correct
code.
(Inspired by but not copying the fix from SF patch #475033.)
STRICT_SYSV_CURSES when compiling curses module on HP/UX. Generalize
access to _flags on systems where WINDOW is opaque. Fixes bugs
#432497, #422265, and the curses parts of #467145 and #473150.
strings, not C strings)
removed USE_PYTHON defines, and related sre.py helpers
skip calling the subx helper if the template is callable.
interestingly enough, this means that
def callback(m):
return literal
result = pattern.sub(callback, string)
is much faster than
result = pattern.sub(literal, string)
removed (conceptually flawed) getliteral helper; the new sub/subn code
uses a faster code path for literal replacement strings, but doesn't
(yet) look for literal patterns.
added STATE_OFFSET macro, and use it to convert state.start/ptr to
char indexes
Allow passing strings to the .border() method
Correct some error messages ("1 or 4" -> "1 to 4")
Bump version number
Tweak code formatting
Update my e-mail address
This adds unsetenv to posix, and uses it in the __delitem__ method of
os.environ.
(XXX Should we change the preferred name for putenv to setenv, for
consistency?)
This was submitted by Moshe, but apparently he's too busy to check it
in himself. He wrote:
Here is a function in GNU readline called add_history,
which is used to manage the history list. Though Python
uses this function internally, it does not expose it to
the Python programmer. This patch adds direct interface
to this function with documentation.
This could be used by friendly modules to "seed" the
history with commands.
This is a big one, touching lots of files. Some of the platforms
aren't tested yet. Briefly, this changes the return value of the
os/posix functions stat(), fstat(), statvfs(), fstatvfs(), and the
time functions localtime(), gmtime(), and strptime() from tuples into
pseudo-sequences. When accessed as a sequence, they behave exactly as
before. But they also have attributes like st_mtime or tm_year. The
stat return value, moreover, has a few platform-specific attributes
that are not available through the sequence interface (because
everybody expects the sequence to have a fixed length, these couldn't
be added there). If your platform's struct stat doesn't define
st_blksize, st_blocks or st_rdev, they won't be accessible from Python
either.
(Still missing is a documentation update.)
Quoth the OpenSSL RAND_add man page:
OpenSSL makes sure that the PRNG state is unique for each
thread. On systems that provide /dev/urandom, the
randomness device is used to seed the PRNG transparently.
However, on all other systems, the application is
responsible for seeding the PRNG by calling RAND_add(),
RAND_egd(3) or RAND_load_file(3).
I decided to expose RAND_add() because it's general and RAND_egd()
because it's a useful special case. RAND_load_file() didn't seem to
offer much over RAND_add(), so I skipped it. Also supplied
RAND_status() which returns true if the PRNG is seeded and false if
not.
Apparently this patch (rev 2.41) replaced all the good old "s#"
formats in PyArg_ParseTuple() with "S". Then it did
PyString_FromStringAndSize() to get back the values setup by the
"s#" format. It also incref'd and decref'd the string obtained by
"S" even though the argument tuple had a reference to it.
Replace PyString_AsString() calls with PyString_AS_STRING().
A good rule of thumb -- if you never check the return value of
PyString_AsString() to see if it's NULL, you ought to be using the
macro <wink>.
Many functions used a local variable called return_error, which was
initialized to zero. If an error occurred, it was set to true. Most
of the code paths checked were only executed if return_error was
false. goto is clearer.
The code also seemed to be written under the curious assumption that
calling Py_DECREF() on a local variable would assign the variable to
NULL. As a result, more of the error-exit code paths returned an
object that had a reference count of zero instead of just returning
NULL. Fixed the code to explicitly assign NULL after the DECREF.
A bit more reformatting, but not much.
XXX Need a much better test suite for zlib, since it the current tests
don't exercise any of this broken code.
This changes Pythread_start_thread() to return the thread ID, or -1
for an error. (It's technically an incompatible API change, but I
doubt anyone calls it.)
Mostly by Toby Dickenson and Titus Brown.
Add an optional argument to a decompression object's decompress()
method. The argument specifies the maximum length of the return
value. If the uncompressed data exceeds this length, the excess data
is stored as the unconsumed_tail attribute. (Not to be confused with
unused_data, which is a separate issue.)
Difference from SF patch: Default value for unconsumed_tail is ""
rather than None. It's simpler if the attribute is always a string.
Added support for saving the names of the functions observed into the
profile log.
Added support for using the profiler to measure coverage without collecting
timing information (which is the slow part). Coverage logs can also be
substantially smaller than profiling logs where per-line information is
being collected.
Updated comments on the log format; corrected record type values in some
of the record descriptions.
Raise ValueError when an object contains an arbitrarily nested
reference to itself. (The previous fix just produced invalid
pickles.)
Solution is very much like Py_ReprEnter() and Py_ReprLeave():
fast_save_enter() and fast_save_leave() that tracks the fast_container
limit and keeps a fast_memo of objects currently being pickled.
The cost of the solution is moderately expensive for deeply nested
structures, but it still seems to be faster than normal pickling,
based on tests with deeply nested lists.
Once FAST_LIMIT is exceeded, the new code is about twice as slow as
fast-mode code that doesn't check for recursion. It's still twice as
fast as the normal pickling code. In the absence of deeply nested
structures, I couldn't measure a difference.
To whoever who changed a bunch of (PyCFunction) casts to
(PyNoArgsFunction) in PyMethodDef initializers: don't do that. The
cast is to shut the compiler up. The compiler wants the function
pointer initializer to be a PyCFunction.
"for <var> in <testlist> may no longer be a single test followed by
a comma. This solves SF bug #431886. Note that if the testlist
contains more than one test, a trailing comma is still allowed, for
maximum backward compatibility; but this example is not:
[(x, y) for x in range(10), for y in range(10)]
^
The fix involved creating a new nonterminal 'testlist_safe' whose
definition doesn't allow the trailing comma if there's only one test:
testlist_safe: test [(',' test)+ [',']]
a misunderstanding of the refcont behavior of the 'O' format code in
PyArg_ParseTuple() and Py_BuildValue(), respectively.
- pobj is only a borrowed reference, so should *not* be DECREF'ed at
the end. This was the cause of SF bug #470635.
- The Py_BuildValue() call would leak the object produced by
makesockaddr(). (I found this by eyeballing the code.)
Add a fast_container member to Picklerobject. If fast is true, then
fast_container counts the depth of nested container calls. If the
depth exceeds FAST_LIMIT (2000), the fast flag is ignored and the
normal checks occur. This approach is much like the approach for
prevent stack overflow for comparison and reprs of recursive objects
(e.g. [[...]]).
- Fast container used for save_list(), save_dict(), and
save_inst().
XXX Not clear which other save_xxx() functions should use it.
Make Picklerobject into new-style types, using PyObject_GenericGetAttr()
and PyObject_GenericSetAttr().
- Use PyMemberDef for binary and fast members
- Use PyGetSetDef for persistent_id, inst_persistent_id, memo, and
PicklingError.
XXX Not all of these seem like they need to use getset, but it's
not clear why the old getattr() and setattr() had such odd
semantics. One change is that the getvalue() attribute will
exist on all Picklers, not just list-based picklers; I think
this is a more rationale interface.
There is a long laundry list of other changes:
- Remove unused #defines for PyList_SET_ITEM() etc.
- Make some of the indentation consistent
- Replace uses of cPickle_PyMapping_HasKey() where the first
argument is self->memo with calls to PyDict_GetItem(), because
self->memo must be a dictionary.
- Don't bother to check if cPickle_PyMapping_HasKey() returns < 0,
because it can only return 0 or 1.
- Replace uses of PyObject_CallObject() with PyObject_Call(), when
we can guarantee that the argument tuple is really a tuple.
Performance impacts of these changes:
- 5% speedup for normal pickling
- No change to fast-mode pickling.
XXX Really need tests for all the features in cPickle that aren't in
pickle.
The platform requires 8-byte alignment for doubles, but the GC header
was 12 bytes and that threw off the natural alignment of the double
members of a subtype of complex. The fix puts the GC header into a
union with a double as the other member, to force no-looser-than
double alignment of GC headers. On boxes that require 8-byte alignment
for doubles, this may add pad bytes to the GC header accordingly; ditto
for platforms that *prefer* 8-byte alignment for doubles. On platforms
that don't care, it shouldn't change the memory layout (because the
size of the old GC header is certainly greater than the size of a double
on all platforms, so unioning with a double shouldn't change size or
alignment on such boxes).
Use #define X509_NAME_MAXLEN for server/issuer length on an SSL
object.
Update doc strings for socket.ssl() and ssl methods read() and
write().
PySSL_SSLwrite(): Check return value and raise exception on error.
Use int for len instead of size_t. (All the function the size_t obj
was passed to our from expected an int!)
PySSL_SSLread(): Check return value of PyArg_ParseTuple()! More
robust checks of return values from SSL_read().
Change all the local names that start with SSL to start with PySSL.
The OpenSSL library defines lots of calls that start with "SSL_". The
calls for Python's SSL objects also started with "SSL_". This choice
made it really confusing to figure out which calls were to the library
and which calls were local to the file.
Add PySSL_SetError() that sets an exception based on the information
from SSL_get_error(). This function will eventually replace all the
calls that set it with an error message that is based on the name of
the call that failed rather than the reason it failed. (Example: If
SSL_connect() failed it used to report "SSL_connect error" now it will
offer a specific message about why SSL_connect failed.)
XXX It might be helpful to augment the error message generated
below with the name of the SSL function that generated the error.
I expect it's obvious most of the time.
Remove several unnecessary INCREFs in the module's constructor call.
PyDict_SetItem() and friends do the INCREF for you.
In SSL_dealloc(), free/dealloc them only if they're non-NULL.
Fixes some obvious core dumps, but not sure yet if there are more
semantics to the SSL calls that would affect the dealloc.
XXX [1] These changes aren't tested very thoroughly, because regrtest
doesn't do any SSL tests. I've done some trivial tests on my own, but
don't really know how to use the key and cert files. In one case, an
SSL-level error causes Python to dump core. I'll get the fixed in the
next round of changes.
XXX [2] The checkin removes the x_attr member of the SSLObject struct.
I'm not sure if this is kosher for backwards compatibility at the
binary level. Perhaps its safer to keep the member but keep it
assigned to NULL.
And the leaks?
newSSLObject() called PyDict_New(), stored the result in x_attr
without checking it, and later stored NULL in x_attr without doing
anything to the dict. So the dict always leaks. There is no further
reference to x_attr, so I just removed it completely.
The error cases in newSSLObject() passed the return value of
PyString_FromString() directly to PyErr_SetObject().
PyErr_SetObject() expects a borrowed reference, so the string leaked.
This simplifies the rounding in _PyObject_VAR_SIZE, allows to restore the
pre-rounding calling sequence, and allows some nice little simplifications
in its callers. I'm still making it return a size_t, though.
As Guido suggested, this makes the new subclassing code substantially
simpler. But the mechanics of doing it w/ C macro semantics are a mess,
and _PyObject_VAR_SIZE has a new calling sequence now.
Question: The PyObject_NEW_VAR macro appears to be part of the public API.
Regardless of what it expands to, the notion that it has to round up the
memory it allocates is new, and extensions containing the old
PyObject_NEW_VAR macro expansion (which was embedded in the
PyObject_NEW_VAR expansion) won't do this rounding. But the rounding
isn't actually *needed* except for new-style instances with dict pointers
after a variable-length blob of embedded data. So my guess is that we do
not need to bump the API version for this (as the rounding isn't needed
for anything an extension can do unless it's recompiled anyway). What's
your guess?
pad memory to properly align the __dict__ pointer in all cases.
gcmodule.c/objimpl.h, _PyObject_GC_Malloc:
+ Added a "padding" argument so that this flavor of malloc can allocate
enough bytes for alignment padding (it can't know this is needed, but
its callers do).
typeobject.c, PyType_GenericAlloc:
+ Allocated enough bytes to align the __dict__ pointer.
+ Sped and simplified the round-up-to-PTRSIZE logic.
+ Added blank lines so I could parse the if/else blocks <0.7 wink>.
Generalize PyLong_AsLongLong to accept int arguments too. The real point
is so that PyArg_ParseTuple's 'L' code does too. That code was
undocumented (AFAICT), so documented it.
#424002.
Refactor init_path_from_argv0() and rename to copy_absolute(); add
absolutize() which does the same in-place.
Clean up whitespace (leading tabs -> spaces, delete trailing
spaces/tabs).
Add raise_exception() to the _testcapi module. It isn't a test, but
the C API exists only to support test_exceptions. raise_exception()
takes two arguments -- an exception class and an integer specifying
how many arguments it should be called with.
test_exceptions uses BadException() to test the interpreter's behavior
when there is a problem instantiating the exception. test_capi1()
calls it with too many arguments. test_capi2() causes an exception to
be raised in the Python code of the constructor.
no backwards compatibility to worry about, so I just pushed the
'closure' struct member to the back -- it's never used in the current
code base (I may eliminate it, but that's more work because the getter
and setter signatures would have to change.)
As examples, I added actual docstrings to the getset attributes of a
few types: file.closed, xxsubtype.spamdict.state.
compatibility, this required all places where an array of "struct
memberlist" structures was declared that is referenced from a type's
tp_members slot to change the type of the structure to PyMemberDef;
"struct memberlist" is now only used by old code that still calls
PyMember_Get/Set. The code in PyObject_GenericGetAttr/SetAttr now
calls the new APIs PyMember_GetOne/SetOne, which take a PyMemberDef
argument.
As examples, I added actual docstrings to the attributes of a few
types: file, complex, instance method, super, and xxsubtype.spamlist.
Also converted the symtable to new style getattr.
#462270: sub-tle difference between pre.sub and sre.sub. PRE ignored
an empty match at the previous location, SRE didn't.
also synced with Secret Labs "sreopen" codebase.
Curious: the MS docs say stati64 etc are supported even on Win95, but
Win95 doesn't support a filesystem that allows partitions > 2 Gb.
test_largefile: This was opening its test file in text mode. I have no
idea how that worked under Win64, but it sure needs binary mode on Win98.
BTW, on Win98 test_largefile runs quickly (under a second).
requires that errno ever get set, and it looks like glibc is already
playing that game. New rules:
+ Never use HUGE_VAL. Use the new Py_HUGE_VAL instead.
+ Never believe errno. If overflow is the only thing you're interested in,
use the new Py_OVERFLOWED(x) macro. If you're interested in any libm
errors, use the new Py_SET_ERANGE_IF_OVERFLOW(x) macro, which attempts
to set errno the way C89 said it worked.
Unfortunately, none of these are reliable, but they work on Windows and I
*expect* under glibc too.
getting Infs, NaNs, or nonsense in 2.1 and before; in yesterday's CVS we
were getting OverflowError; but these functions always make good sense
for positive arguments, no matter how large).
PEP 238. Changes:
- add a new flag variable Py_DivisionWarningFlag, declared in
pydebug.h, defined in object.c, set in main.c, and used in
{int,long,float,complex}object.c. When this flag is set, the
classic division operator issues a DeprecationWarning message.
- add a new API PyRun_SimpleStringFlags() to match
PyRun_SimpleString(). The main() function calls this so that
commands run with -c can also benefit from -Dnew.
- While I was at it, I changed the usage message in main() somewhat:
alphabetized the options, split it in *four* parts to fit in under
512 bytes (not that I still believe this is necessary -- doc strings
elsewhere are much longer), and perhaps most visibly, don't display
the full list of options on each command line error. Instead, the
full list is only displayed when -h is used, and otherwise a brief
reminder of -h is displayed. When -h is used, write to stdout so
that you can do `python -h | more'.
Notes:
- I don't want to use the -W option to control whether the classic
division warning is issued or not, because the machinery to decide
whether to display the warning or not is very expensive (it involves
calling into the warnings.py module). You can use -Werror to turn
the warnings into exceptions though.
- The -Dnew option doesn't select future division for all of the
program -- only for the __main__ module. I don't know if I'll ever
change this -- it would require changes to the .pyc file magic
number to do it right, and a more global notion of compiler flags.
- You can usefully combine -Dwarn and -Dnew: this gives the __main__
module new division, and warns about classic division everywhere
else.