loses information:
OverflowError: regular expression code size limit exceeded
Otherwise the compiled code is gibberish, which at the very least
can produce wrong results or (as reported on c.l.py) internal
sre errors at match time.
I'm not sure how to test this. SRE_CODE is a 2-byte type on
my box, and it's easy to create a regexp that causes the new
exception to trigger here. But it may be a 4-byte type on
other boxes, and creating a regexp large enough to trigger
problems there would be pretty crazy.
Bugfix candidate.
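
A minimal sketch of the kind of check described above, in plain C rather than the actual _sre source. The names code_unit and store_code are made up for illustration, and the 2-byte width is an assumption that matches the box mentioned above but is not guaranteed everywhere.

```c
/* Assumption: the code unit is a 2-byte type, as on the box described
   above. On a platform with a wider unit the limit is simply higher. */
typedef unsigned short code_unit;

/* Store value into *out.
   Returns 0 on success, or -1 if the narrowing cast would lose
   information (the case that should raise
   "OverflowError: regular expression code size limit exceeded").
   *out is left untouched on failure. */
static int store_code(unsigned long value, code_unit *out)
{
    code_unit narrowed = (code_unit)value;

    /* Round-trip check: if widening the narrowed value does not give
       back the original, high bits were dropped by the cast. */
    if ((unsigned long)narrowed != value)
        return -1;

    *out = narrowed;
    return 0;
}
```

With a 2-byte unit, 65535 is the largest value that survives the round trip, which is why the exception is easy to trigger there and hard to trigger with a 4-byte unit.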
cast first PyUnicode_Decode argument to proper type (why is
"char *" used for encoded byte streams, btw? shouldn't that
be "void *" or, if necessary, "unsigned char *"?)
Subversion revision number.
First, in an svn export, there will be no .svn directory, so use an in-file
$Revision$ keyword string with the keyword chrome stripped off.
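
Stripping the keyword chrome could look roughly like the sketch below. This is not the code that was checked in; strip_keyword_chrome is a hypothetical helper, and the revision number used in the usage note is made up.

```c
#include <stddef.h>
#include <string.h>

/* Copy the value part of an expanded keyword such as
   "$Revision: 41744 $" into buf, dropping the "$Revision: " prefix
   and the trailing " $". For an unexpanded keyword ("$Revision$") or
   a malformed string, buf is set to "". Returns buf. */
static char *strip_keyword_chrome(const char *keyword, char *buf,
                                  size_t bufsize)
{
    const char *start, *end;
    size_t len;

    if (bufsize == 0)
        return buf;
    buf[0] = '\0';

    start = strchr(keyword, ':');
    if (start == NULL)
        return buf;                 /* unexpanded keyword: no value */
    start++;
    while (*start == ' ')
        start++;

    end = strrchr(start, '$');
    if (end == NULL)
        return buf;                 /* no closing '$': malformed */
    while (end > start && end[-1] == ' ')
        end--;

    len = (size_t)(end - start);
    if (len >= bufsize)
        len = bufsize - 1;          /* truncate rather than overflow */
    memcpy(buf, start, len);
    buf[len] = '\0';
    return buf;
}
```

Given "$Revision: 41744 $" this leaves "41744" in buf; given a bare "$Revision$" it leaves an empty string.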
Also, use $(srcdir) in Makefile.pre.in to handle the case where Python is
built outside the source tree.
Add C API function Py_GetBuildNumber(), add it to the interactive prompt
banner (i.e. Py_GetBuildInfo()), and add it as the sys.build_number
attribute. The build number is a string instead of an int because it may
contain a trailing 'M' if there are local modifications.
In C++, it's an error to pass a string literal to a function that takes
a char * without a const_cast. Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.
I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc. Predictably, there was a large set of functions that
needed to be fixed as a result of these changes. The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKeywords() a const char *kwlist[].
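
To illustrate the shape of the change, here is a small stand-alone sketch. Both functions are hypothetical stand-ins, not part of the Python C API; they only show read-only parameters declared as const char * and const char *kwlist[], which is what lets callers pass literals without a cast.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only function: counts the characters of a format
   string up to an optional ':' suffix, skipping the '|' marker.
   This is a toy, not the real argument parser. Because the parameter
   is const char *, a string literal can be passed directly. */
static size_t count_format_units(const char *format)
{
    size_t n = 0;

    for (; *format != '\0' && *format != ':'; format++) {
        if (*format != '|')
            n++;
    }
    return n;
}

/* Hypothetical lookup over a NULL-terminated keyword list declared as
   const char *kwlist[]. Returns the index of name, or -1. */
static int find_keyword(const char *kwlist[], const char *name)
{
    int i;

    for (i = 0; kwlist[i] != NULL; i++) {
        if (strcmp(kwlist[i], name) == 0)
            return i;
    }
    return -1;
}
```

A caller can then write const char *kwlist[] = {"host", "port", NULL}; and pass it, along with literal format strings, with no casts on either argument.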
One cast was required as a result of the changes: A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *.
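
The one cast can be pictured with a toy struct. This is only a sketch of the ownership pattern: toy_type, its heap_allocated flag, and both functions are invented for the example and are not the real PyTypeObject or type_new() code.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a type object: users see the doc string as
   read-only, but a heap-created type owns the memory behind it. */
typedef struct {
    const char *doc;
    int heap_allocated;   /* hypothetical flag: 1 if doc was malloc'ed */
} toy_type;

/* Give the type its own malloc'ed copy of doc.
   Returns 0 on success, -1 if allocation fails. */
static int toy_type_init_heap(toy_type *t, const char *doc)
{
    size_t n = strlen(doc) + 1;
    char *copy = malloc(n);

    if (copy == NULL)
        return -1;
    memcpy(copy, doc, n);
    t->doc = copy;
    t->heap_allocated = 1;
    return 0;
}

/* Release the doc string. The cast drops const, which is safe only
   because we know this object allocated the buffer itself. */
static void toy_type_clear(toy_type *t)
{
    if (t->heap_allocated)
        free((char *)t->doc);
    t->doc = NULL;
    t->heap_allocated = 0;
}
```

A statically defined type would leave heap_allocated at 0 and point doc at a literal, in which case nothing is freed and no cast is needed.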