In C++, it's an error to pass a string literal to a char* function
without a const_cast(). Rather than require every C++ extension
module to put a cast around string literals, fix the API to state the
const-ness.
I focused on parts of the API where people usually pass literals:
PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
slots, etc. Predictably, there were a large set of functions that
needed to be fixed as a result of these changes. The most pervasive
change was to make the keyword args list passed to
PyArg_ParseTupleAndKewords() to be a const char *kwlist[].
One cast was required as a result of the changes: A type object
mallocs the memory for its tp_doc slot and later frees it.
PyTypeObject says that tp_doc is const char *; but if the type was
created by type_new(), we know it is safe to cast to char *.
If the callback raised an exception but did not set curexc_traceback,
the trace function was called with PyTrace_RETURN. That is, the trace
function was called with an exception set. The main loop detected the
exception when the trace function returned; it complained and disabled
tracing.
Fix the logic error so that PyTrace_RETURN only occurs if the callback
returned normally.
The trace function must be called for exceptions, too. So we had
to add new functionality to call with PyTrace_EXCEPTION. (Leads to a
rather ugly ifdef / else block that contains only a '}'.)
Reverse the logic and name of NOFIX_TRACE to FIX_TRACE.
Joint work with Fred.
60: Added support for the SkippedEntityHandler, new in Expat 1.95.4.
61: Added support for namespace prefixes, which can be enabled by setting the
"namespace_prefixes" attribute on the parser object.
65: Disable profiling changes for Python 2.0 and 2.1.
66: Update pyexpat to export the Expat 1.95.5 XML_GetFeatureList()
information, and tighten up a type declaration now that Expat is using
an incomplete type rather than a void * for the XML_Parser type.
67: Clarified a comment.
Added support for XML_UseForeignDTD(), new in Expat 1.95.5.
68: Refactor to avoid partial duplication of the code to construct an
ExpatError instance, and actually conform to the API for the exception
instance as well.
69: Remove some spurious trailing whitespace.
Add a special external-entity-ref handler that gets installed once a
handler has raised a Python exception; this can cancel actual parsing
earlier if there's an external entity reference in the input data
after the the Python excpetion has been raised.
70: Untabify APPEND.
71: Backport PyMODINIT_FUNC for 2.2 and earlier.
XML_Parser, which happens to be a pointer type, not an XML_Parser*.
This generated warnings when compiled with Expat 1.95.5, which no
longer defines XML_Parser to be void*.
the "safety" parentheses since some older compilers refuse to compile
the module then, claiming that static initializers are non-constant.
This doesn't actually make any difference for Python, since these
definitions are not used when compiling with a version of Python that
already defines the PyDoc_* macros.
-- replace then with slightly faster PyObject_Call(o,a,NULL). (The
difference is that the latter requires a to be a tuple; the former
allows other values and wraps them in a tuple if necessary; it
involves two more levels of C function calls to accomplish all that.)
The staticforward define was needed to support certain broken C
compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the
static keyword when it was used with a forward declaration of a static
initialized structure. Standard C allows the forward declaration with
static, and we've decided to stop catering to broken C compilers. (In
fact, we expect that the compilers are all fixed eight years later.)
I'm leaving staticforward and statichere defined in object.h as
static. This is only for backwards compatibility with C extensions
that might still use it.
XXX I haven't updated the documentation.
Setting the buffer_text attribute to true causes the parser to collect
character data, waiting as long as possible to report it to the Python
callback. This can save an enormous number of callbacks from C to
Python, which can be a substantial performance improvement.
buffer_text defaults to false.
The handlers array on each parser now has the invariant that None will
never be set as a handler; it will always be NULL or a Python-level
value passed in for the specific handler.
have_handler(): Return true if there is a Python handler for a
particular event.
get_handler_name(): Return a string object giving the name of a
particular handler. This caches the string object so it doesn't
need to be created more than once.
get_parse_result(): Helper to allow the Parse() and ParseFile()
methods to share the same logic for determining the return value
or exception state.
PyUnknownEncodingHandler(), PyModule_AddIntConstant():
Made these helpers static. (The later is only defined for older
versions of Python.)
pyxml_UpdatePairedHandlers(), pyxml_SetStartElementHandler(),
pyxml_SetEndElementHandler(), pyxml_SetStartNamespaceDeclHandler(),
pyxml_SetEndNamespaceDeclHandler(), pyxml_SetStartCdataSection(),
pyxml_SetEndCdataSection(), pyxml_SetStartDoctypeDeclHandler(),
pyxml_SetEndDoctypeDeclHandler():
Removed. These are no longer needed with Expat 1.95.x.
handler_info:
Use the setter functions provided by Expat 1.95.x instead of the
pyxml_Set*Handler() functions which have been removed.
Minor code formatting changes for consistency.
Trailing whitespace removed.
type.__module__ behavior.
This adds the module name and a dot in front of the type name in every
type object initializer, except for built-in types (and those that
already had this). Note that it touches lots of Mac modules -- I have
no way to test these but the changes look right. Apologies if they're
not. This also touches the weakref docs, which contains a sample type
object initializer. It also touches the mmap test output, because the
mmap type's repr is included in that output. It touches object.h to
put the correct description in a comment.
- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disables unicode literals, and the builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
while not generally a good idea, this is used by RDF users, and works
to implement RDF-style namespace+localname concatenation as defined
in the RDF specifications. (This also corrects a backwards-compatibility
bug.)
Be more conservative while clearing out handlers; set the slot in the
self->handlers array to NULL before DECREFing the callback.
Still more adjustments to make the code style internally consistent.
gives the CVS revision of this file even if it does not include the
extra RCS "$Revision: " cruft.
initpyexpat(): Use get_version_string() instead of hard-coding magic
indexes into the RCS string (which may be affected by export options).
with free variables. Thanks to Martin v. Loewis for finding two of
the problems. This fixes SF buf 405583.
There is also a C API change: PyFrame_New() is reverting to its
pre-2.1 signature. The change introduced by nested scopes was a
mistake. XXX Is this okay between beta releases?
cell_clear(), the GC helper, must decref its reference to break
cycles.
frame_dealloc() must dealloc all cell vars and free vars in addition
to locals.
eval_code2() setup code must INCREF cells it copies out of the
closure.
The STORE_DEREF opcode implementation must DECREF the object it passes
to PyCell_Set().
compiled only for some versions of Expat, but was no longer needed as the
new implementation works for all versions. Keeping it created multiple
definitions for Expat 1.2, which caused compilation to fail.
in_callback field that's set to true whenever a callback into an
event handler is true. Needed for:
set_error(): Add line number of offset information to the exception
as attributes, so users don't need to parse the text of the
message.
set_error_attr(): New helper function for set_error().
xmlparse_GetInputContext(): New function of the parser object;
returns the document source for an event during a callback, None
at all other times.
xmlparse_SetParamEntityParsing(): Make the signature consistent with
the other parser methods (use xmlparseobject* for self instead of
PyObject*).
initpyexpat(): Don't lose the reference to the exception class!
call_with_frame(),
getcode(): Re-indent to be consistent with the rest of the file.
now carry a 'code' attribute that gives the Expat error
number.
Added support for additional handlers for Expat 1.95.*, including
XmlDeclHandler, EntityDeclHandler, ElementDeclHandler, and
AttlistDeclHandler. Associated constants are in the 'model'
sub-object.
Added two new attributes to the parser object: ordered_attributes and
specified_attributes. These are used to control how attributes are
reported and which attributes are reported.
Participate in garbage collection if available.
Potentially decref handlers in clear_handlers.
Partially reindent.
Put synthetic frame object on the stack to support better error output.
Expose Python codecs to pyexpat.
Add new Expat 1.2 handlers and API.
Fix memory leak: release self->handlers.
Do not expect PyModule_AddObject and PyModule_AddStringConstant in 2.0b1.
Raise exception in ParseFile.
"xml.parsers.expat.error", so it will reflect the public name of the
exception rather than the internal name.
Also change some of the initialization to use the new PyModule_Add*()
convenience functions.
data and default handlers -- a new reference was being passed to
Py_BuildValue() for the "O" format character; using "N" plugs the leak.
Fixed two other (minor) leaks that occurred on various error conditions.
Removed uses of the UNLESS macro, which makes code hard to read, and is
Evil.
Windows "inconsistent linkage" warnings at the same time. I agree
with Mark Hammond that the whole DL_IMPORT/DL_EXPORT macro system
needs an overhaul; this is just an expedient hack until then.
Added prototype to remove yet another warning.
Make a number of the handlers and helpers "static" since they are not
used in other C source files. This also reduces the number of warnings.
Make a lot of the code "more Python". (Need to get the style guide done!)
that this is not appropriate.
Made somewhat more robust in the face of reload() (exception is not
rebuilt, etc.).
Made the exception a class exception.
* There was no error reported if the .read() method returns a non-string
* If read() returned too much data, the buffer would be overflowed causing a
core dump
* Used strncpy, not memcpy, which seems incorrect if there are embedded \0s.
* The args and bytes objects were leaked
The first two warnings seem harmless enough,
but the last one looks like a potential bug: an
uninitialized int is returned on error. (I also
ended up reformatting some of the code,
because it was hard to read.)
It gets initialized when pyexpat is imported, and is only accessible as an
attribute of pyexpat; it cannot be imported itself. This allows it to at
least be importable after pyexpat itself has been imported by adding it
to sys.modules, so it is not quite as strange.
This arrangement needs to be better thought out.
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.
(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode. I'm also holding back on his
change to main.c, which seems unnecessary to me.)
The Setup.in entry is sort of a lie; it links with -lexpat, but
Expat's Makefile doesn't actually build a libexpat.a. I'll send
Expat's author a patch to do that; if he doesn't accept it, this
rule will have to list Expat's object files (ick!), or have a
comment explaining how to build a .a file.