cpython/Modules/expat/xmlparse.c

6184 lines
185 KiB
C
Raw Normal View History

2003-01-25 18:41:29 -04:00
/* Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd
See the file COPYING for copying permission.
2002-02-11 19:13:04 -04:00
*/
#define XML_BUILDING_EXPAT 1
2003-01-25 18:41:29 -04:00
#ifdef COMPILED_FROM_DSP
2003-01-25 18:41:29 -04:00
#include "winconfig.h"
#elif defined(MACOS_CLASSIC)
#include "macconfig.h"
2004-08-03 04:06:22 -03:00
#elif defined(HAVE_EXPAT_CONFIG_H)
2003-01-25 18:41:29 -04:00
#include <expat_config.h>
#endif /* ndef COMPILED_FROM_DSP */
2002-02-11 19:13:04 -04:00
Much-needed merge (using svnmerge.py this time) of trunk changes into p3yk. Inherits test_gzip/test_tarfile failures on 64-bit platforms from the trunk, but I don't want the merge to hang around too long (even though the regular p3yk-contributors are/have been busy with other things.) Merged revisions 45621-46490 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r45621 | george.yoshida | 2006-04-21 18:34:17 +0200 (Fri, 21 Apr 2006) | 2 lines Correct the grammar ........ r45622 | tim.peters | 2006-04-21 18:34:54 +0200 (Fri, 21 Apr 2006) | 2 lines Whitespace normalization. ........ r45624 | thomas.heller | 2006-04-21 18:48:56 +0200 (Fri, 21 Apr 2006) | 1 line Merge in changes from ctypes 0.9.9.6 upstream version. ........ r45625 | thomas.heller | 2006-04-21 18:51:04 +0200 (Fri, 21 Apr 2006) | 1 line Merge in changes from ctypes 0.9.9.6 upstream version. ........ r45630 | thomas.heller | 2006-04-21 20:29:17 +0200 (Fri, 21 Apr 2006) | 8 lines Documentation for ctypes. I think that 'generic operating system services' is the best category. Note that the Doc/lib/libctypes.latex file is generated from reST sources. You are welcome to make typo fixes, and I'll try to keep the reST sources in sync, but markup changes would be lost - they should be fixed in the tool that creates the latex file. The conversion script is external/ctypes/docs/manual/mkpydoc.py. ........ r45631 | tim.peters | 2006-04-21 23:18:10 +0200 (Fri, 21 Apr 2006) | 24 lines SF bug #1473760 TempFile can hang on Windows. Python 2.4 changed ntpath.abspath to do an import inside the function. As a result, due to Python's import lock, anything calling abspath on Windows (directly, or indirectly like tempfile.TemporaryFile) hung when it was called from a thread spawned as a side effect of importing a module. This is a depressingly frequent problem, and deserves a more general fix. I'm settling for a micro-fix here because this specific one accounts for a report of Zope Corp's ZEO hanging on Windows, and it was an odd way to change abspath to begin with (ntpath needs a different implementation depending on whether we're actually running on Windows, and the _obvious_ way to arrange for that is not to bury a possibly-failing import _inside_ the function). Note that if/when other micro-fixes of this kind get made, the new Lib/test/threaded_import_hangers.py is a convenient place to add tests for them. ........ r45634 | phillip.eby | 2006-04-21 23:53:37 +0200 (Fri, 21 Apr 2006) | 2 lines Guido wrote contextlib, not me, but thanks anyway. ;) ........ r45636 | andrew.kuchling | 2006-04-22 03:51:41 +0200 (Sat, 22 Apr 2006) | 1 line Typo fixes ........ r45638 | andrew.kuchling | 2006-04-22 03:58:40 +0200 (Sat, 22 Apr 2006) | 1 line Fix comment typo ........ r45639 | andrew.kuchling | 2006-04-22 04:06:03 +0200 (Sat, 22 Apr 2006) | 8 lines Make copy of test_mailbox.py. We'll still want to check the backward compatibility classes in the new mailbox.py that I'll be committing in a few minutes. One change has been made: the tests use len(mbox) instead of len(mbox.boxes). The 'boxes' attribute was never documented and contains some internal state that seems unlikely to have been useful. ........ r45640 | andrew.kuchling | 2006-04-22 04:32:43 +0200 (Sat, 22 Apr 2006) | 16 lines Add Gregory K. Johnson's revised version of mailbox.py (funded by the 2005 Summer of Code). The revision adds a number of new mailbox classes that support adding and removing messages; these classes also support mailbox locking and default to using email.Message instead of rfc822.Message. The old mailbox classes are largely left alone for backward compatibility. The exception is the Maildir class, which was present in the old module and now inherits from the new classes. The Maildir class's interface is pretty simple, though, so I think it'll be compatible with existing code. (The change to the NEWS file also adds a missing word to a different news item, which unfortunately required rewrapping the line.) ........ r45641 | tim.peters | 2006-04-22 07:52:59 +0200 (Sat, 22 Apr 2006) | 2 lines Whitespace normalization. ........ r45642 | neal.norwitz | 2006-04-22 08:07:46 +0200 (Sat, 22 Apr 2006) | 1 line Add libctypes as a dep ........ r45643 | martin.v.loewis | 2006-04-22 13:15:41 +0200 (Sat, 22 Apr 2006) | 1 line Fix more ssize_t problems. ........ r45644 | martin.v.loewis | 2006-04-22 13:40:03 +0200 (Sat, 22 Apr 2006) | 1 line Fix more ssize_t issues. ........ r45645 | george.yoshida | 2006-04-22 17:10:49 +0200 (Sat, 22 Apr 2006) | 2 lines Typo fixes ........ r45647 | martin.v.loewis | 2006-04-22 17:19:54 +0200 (Sat, 22 Apr 2006) | 1 line Port to Python 2.5. Drop .DEF file. Change output file names to .pyd. ........ r45648 | george.yoshida | 2006-04-22 17:27:14 +0200 (Sat, 22 Apr 2006) | 3 lines - add versionadded tag - make arbitrary arguments come last ........ r45649 | hyeshik.chang | 2006-04-22 17:48:15 +0200 (Sat, 22 Apr 2006) | 3 lines Remove $CJKCodecs$ RCS tags. The CJKCodecs isn't maintained outside anymore. ........ r45654 | greg.ward | 2006-04-23 05:47:58 +0200 (Sun, 23 Apr 2006) | 2 lines Update optparse to Optik 1.5.1. ........ r45658 | george.yoshida | 2006-04-23 11:27:10 +0200 (Sun, 23 Apr 2006) | 2 lines wrap SyntaxError with \exception{} ........ r45660 | ronald.oussoren | 2006-04-23 13:59:25 +0200 (Sun, 23 Apr 2006) | 6 lines Patch 1471925 - Weak linking support for OSX This patch causes several symbols in the socket and posix module to be weakly linked on OSX and disables usage of ftime on OSX. These changes make it possible to use a binary build on OSX 10.4 on a 10.3 system. ........ r45661 | ronald.oussoren | 2006-04-23 14:36:23 +0200 (Sun, 23 Apr 2006) | 5 lines Patch 1471761 - test for broken poll at runtime This patch checks if poll is broken when the select module is loaded instead of doing so at configure-time. This functionality is only active on Mac OS X. ........ r45662 | nick.coghlan | 2006-04-23 17:13:32 +0200 (Sun, 23 Apr 2006) | 1 line Add a Context Types section to parallel the Iterator Types section (uses the same terminology as the 2.5a1 implementation) ........ r45663 | nick.coghlan | 2006-04-23 17:14:37 +0200 (Sun, 23 Apr 2006) | 1 line Update contextlib documentation to use the same terminology as the module implementation ........ r45664 | gerhard.haering | 2006-04-23 17:24:26 +0200 (Sun, 23 Apr 2006) | 2 lines Updated the sqlite3 module to the external pysqlite 2.2.2 version. ........ r45666 | nick.coghlan | 2006-04-23 17:39:16 +0200 (Sun, 23 Apr 2006) | 1 line Update with statement documentation to use same terminology as 2.5a1 implementation ........ r45667 | nick.coghlan | 2006-04-23 18:05:04 +0200 (Sun, 23 Apr 2006) | 1 line Add a (very) brief mention of the with statement to the end of chapter 8 ........ r45668 | nick.coghlan | 2006-04-23 18:35:19 +0200 (Sun, 23 Apr 2006) | 1 line Take 2 on mentioning the with statement, this time without inadvertently killing the Unicode examples ........ r45669 | nick.coghlan | 2006-04-23 19:04:07 +0200 (Sun, 23 Apr 2006) | 1 line Backdated NEWS entry to record the implementation of PEP 338 for alpha 1 ........ r45670 | tim.peters | 2006-04-23 20:13:45 +0200 (Sun, 23 Apr 2006) | 2 lines Whitespace normalization. ........ r45671 | skip.montanaro | 2006-04-23 21:14:27 +0200 (Sun, 23 Apr 2006) | 1 line first cut at trace module doc ........ r45672 | skip.montanaro | 2006-04-23 21:26:33 +0200 (Sun, 23 Apr 2006) | 1 line minor tweak ........ r45673 | skip.montanaro | 2006-04-23 21:30:50 +0200 (Sun, 23 Apr 2006) | 1 line it's always helpful if the example works... ........ r45674 | skip.montanaro | 2006-04-23 21:32:14 +0200 (Sun, 23 Apr 2006) | 1 line correct example ........ r45675 | andrew.kuchling | 2006-04-23 23:01:04 +0200 (Sun, 23 Apr 2006) | 1 line Edits to the PEP 343 section ........ r45676 | andrew.kuchling | 2006-04-23 23:51:10 +0200 (Sun, 23 Apr 2006) | 1 line Add two items ........ r45677 | tim.peters | 2006-04-24 04:03:16 +0200 (Mon, 24 Apr 2006) | 5 lines Bug #1337990: clarified that `doctest` does not support examples requiring both expected output and an exception. I'll backport to 2.4 next. ........ r45679 | nick.coghlan | 2006-04-24 05:04:43 +0200 (Mon, 24 Apr 2006) | 1 line Note changes made to PEP 343 related documentation ........ r45681 | nick.coghlan | 2006-04-24 06:17:02 +0200 (Mon, 24 Apr 2006) | 1 line Change PEP 343 related documentation to use the term context specifier instead of context object ........ r45682 | nick.coghlan | 2006-04-24 06:32:47 +0200 (Mon, 24 Apr 2006) | 1 line Add unit tests for the -m and -c command line switches ........ r45683 | nick.coghlan | 2006-04-24 06:37:15 +0200 (Mon, 24 Apr 2006) | 1 line Fix contextlib.nested to cope with exit methods raising and handling exceptions ........ r45685 | nick.coghlan | 2006-04-24 06:59:28 +0200 (Mon, 24 Apr 2006) | 1 line Fix broken contextlib test from last checkin (I'd've sworn I tested that before checking it in. . .) ........ r45686 | nick.coghlan | 2006-04-24 07:24:26 +0200 (Mon, 24 Apr 2006) | 1 line Back out new command line tests (broke buildbot) ........ r45687 | nick.coghlan | 2006-04-24 07:52:15 +0200 (Mon, 24 Apr 2006) | 1 line More reliable version of new command line tests that just checks the exit codes ........ r45688 | thomas.wouters | 2006-04-24 13:37:13 +0200 (Mon, 24 Apr 2006) | 4 lines Stop test_tcl's testLoadTk from leaking the Tk commands 'loadtk' registers. ........ r45690 | andrew.kuchling | 2006-04-24 16:30:47 +0200 (Mon, 24 Apr 2006) | 2 lines Edits, using the new term 'context specifier' in a few places ........ r45697 | phillip.eby | 2006-04-24 22:53:13 +0200 (Mon, 24 Apr 2006) | 2 lines Revert addition of setuptools ........ r45698 | tim.peters | 2006-04-25 00:45:13 +0200 (Tue, 25 Apr 2006) | 2 lines Whitespace normalization. ........ r45700 | trent.mick | 2006-04-25 02:34:50 +0200 (Tue, 25 Apr 2006) | 4 lines Put break at correct level so *all* root HKEYs acutally get checked for an installed VC6. Otherwise only the first such tree gets checked and this warning doesn't get displayed. ........ r45701 | tim.peters | 2006-04-25 05:31:36 +0200 (Tue, 25 Apr 2006) | 3 lines Patch #1475231: add a new SKIP doctest option, thanks to Edward Loper. ........ r45702 | neal.norwitz | 2006-04-25 07:04:35 +0200 (Tue, 25 Apr 2006) | 1 line versionadded for SKIP ........ r45703 | neal.norwitz | 2006-04-25 07:05:03 +0200 (Tue, 25 Apr 2006) | 1 line Restore Walters name ........ r45704 | neal.norwitz | 2006-04-25 07:49:42 +0200 (Tue, 25 Apr 2006) | 1 line Revert previous change, SKIP had a versionadded elsewhere ........ r45706 | nick.coghlan | 2006-04-25 12:56:51 +0200 (Tue, 25 Apr 2006) | 31 lines Move the PEP 343 documentation and implementation closer to the terminology in the alpha 1 documentation. - "context manager" reverts to its alpha 1 definition - the term "context specifier" goes away entirely - contextlib.GeneratorContextManager is renamed GeneratorContext There are still a number of changes relative to alpha 1: - the expression in the with statement is explicitly called the "context expression" in the language reference - the terms 'with statement context', 'context object' or 'with statement context' are used in several places instead of a bare 'context'. The aim of this is to avoid ambiguity in relation to the runtime context set up when the block is executed, and the context objects that already exist in various application domains (such as decimal.Context) - contextlib.contextmanager is renamed to contextfactory This best reflects the nature of the function resulting from the use of that decorator - decimal.ContextManager is renamed to WithStatementContext Simple dropping the 'Manager' part wasn't possible due to the fact that decimal.Context already exists and means something different. WithStatementContext is ugly but workable. A technically unrelated change snuck into this commit: contextlib.closing now avoids the overhead of creating a generator, since it's trivial to implement that particular context manager directly. ........ r45707 | nick.coghlan | 2006-04-25 13:05:56 +0200 (Tue, 25 Apr 2006) | 1 line Fix latex typo ........ r45708 | thomas.wouters | 2006-04-25 14:28:56 +0200 (Tue, 25 Apr 2006) | 4 lines Fix markup glitch in unittest docs. Will backport. ........ r45710 | andrew.kuchling | 2006-04-25 14:31:38 +0200 (Tue, 25 Apr 2006) | 1 line Add two items; easy_install is now off the table, though pkgutil still is ........ r45711 | andrew.kuchling | 2006-04-25 14:47:25 +0200 (Tue, 25 Apr 2006) | 1 line Rework context terminology ........ r45712 | thomas.wouters | 2006-04-25 15:53:23 +0200 (Tue, 25 Apr 2006) | 9 lines SF bug/patch #1433877: string parameter to ioctl not null terminated The new char-array used in ioctl calls wasn't explicitly NUL-terminated; quite probably the cause for the test_pty failures on Solaris that we circumvented earlier. (I wasn't able to reproduce it with this patch, but it has been somewhat elusive to start with.) ........ r45713 | george.yoshida | 2006-04-25 16:09:58 +0200 (Tue, 25 Apr 2006) | 2 lines minor tweak ........ r45714 | thomas.wouters | 2006-04-25 17:08:10 +0200 (Tue, 25 Apr 2006) | 7 lines Fix SF bug #1476111: SystemError in socket sendto. The AF_INET6 and AF_PACKET cases in getsockaddrarg were missing their own checks for tuple-ness of the address argument, which means a confusing SystemError was raised by PyArg_ParseTuple instead. ........ r45715 | thomas.wouters | 2006-04-25 17:29:46 +0200 (Tue, 25 Apr 2006) | 10 lines Define MAXPATHLEN to be at least PATH_MAX, if that's defined. Python uses MAXPATHLEN-sized buffers for various output-buffers (like to realpath()), and that's correct on BSD platforms, but not Linux (which uses PATH_MAX, and does not define MAXPATHLEN.) Cursory googling suggests Linux is following a newer standard than BSD, but in cases like this, who knows. Using the greater of PATH_MAX and 1024 as a fallback for MAXPATHLEN seems to be the most portable solution. ........ r45717 | thomas.heller | 2006-04-25 20:26:08 +0200 (Tue, 25 Apr 2006) | 3 lines Fix compiler warnings on Darwin. Patch by Brett Canon, see https://sourceforge.net/tracker/?func=detail&atid=532156&aid=1475959&group_id=71702 ........ r45718 | guido.van.rossum | 2006-04-25 22:12:45 +0200 (Tue, 25 Apr 2006) | 4 lines Implement MvL's improvement on __context__ in Condition; this can just call __context__ on the underlying lock. (The same change for Semaphore does *not* work!) ........ r45721 | tim.peters | 2006-04-26 03:15:53 +0200 (Wed, 26 Apr 2006) | 13 lines Rev 45706 renamed stuff in contextlib.py, but didn't rename uses of it in test_with.py. As a result, test_with has been skipped (due to failing imports) on all buildbot boxes since. Alas, that's not a test failure -- you have to pay attention to the 1 skip unexpected on PLATFORM: test_with kinds of output at the ends of test runs to notice that this got broken. It's likely that more renaming in test_with.py would be desirable. ........ r45722 | fred.drake | 2006-04-26 07:15:41 +0200 (Wed, 26 Apr 2006) | 1 line markup fixes, cleanup ........ r45723 | fred.drake | 2006-04-26 07:19:39 +0200 (Wed, 26 Apr 2006) | 1 line minor adjustment suggested by Peter Gephardt ........ r45724 | neal.norwitz | 2006-04-26 07:34:03 +0200 (Wed, 26 Apr 2006) | 10 lines Patch from Aldo Cortesi (OpenBSD buildbot owner). After the patch (45590) to add extra debug stats to the gc module, Python was crashing on OpenBSD due to: Fatal Python error: Interpreter not initialized (version mismatch?) This seems to occur due to calling collect() when initialized (in pythonrun.c) is set to 0. Now, the import will occur in the init function which shouldn't suffer this problem. ........ r45725 | neal.norwitz | 2006-04-26 08:26:12 +0200 (Wed, 26 Apr 2006) | 3 lines Fix this test on Solaris. There can be embedded \r, so don't just replace the one at the end. ........ r45727 | nick.coghlan | 2006-04-26 13:50:04 +0200 (Wed, 26 Apr 2006) | 1 line Fix an error in the last contextlib.closing example ........ r45728 | andrew.kuchling | 2006-04-26 14:21:06 +0200 (Wed, 26 Apr 2006) | 1 line [Bug #1475080] Fix example ........ r45729 | andrew.kuchling | 2006-04-26 14:23:39 +0200 (Wed, 26 Apr 2006) | 1 line Add labels to all sections ........ r45730 | thomas.wouters | 2006-04-26 17:53:30 +0200 (Wed, 26 Apr 2006) | 7 lines The result of SF patch #1471578: big-memory tests for strings, lists and tuples. Lots to be added, still, but this will give big-memory people something to play with in 2.5 alpha 2, and hopefully get more people to write these tests. ........ r45731 | tim.peters | 2006-04-26 19:11:16 +0200 (Wed, 26 Apr 2006) | 2 lines Whitespace normalization. ........ r45732 | martin.v.loewis | 2006-04-26 19:19:44 +0200 (Wed, 26 Apr 2006) | 1 line Use GS- and bufferoverlowU.lib where appropriate, for AMD64. ........ r45733 | thomas.wouters | 2006-04-26 20:46:01 +0200 (Wed, 26 Apr 2006) | 5 lines Add tests for += and *= on strings, and fix the memory-use estimate for the list.extend tests (they were estimating half the actual use.) ........ r45734 | thomas.wouters | 2006-04-26 21:14:46 +0200 (Wed, 26 Apr 2006) | 5 lines Some more test-size-estimate fixes: test_append and test_insert trigger a list resize, which overallocates. ........ r45735 | hyeshik.chang | 2006-04-26 21:20:26 +0200 (Wed, 26 Apr 2006) | 3 lines Fix build on MIPS for libffi. I haven't tested this yet because I don't have an access on MIPS machines. Will be tested by buildbot. :) ........ r45737 | fred.drake | 2006-04-27 01:40:32 +0200 (Thu, 27 Apr 2006) | 1 line one more place to use the current Python version ........ r45738 | fred.drake | 2006-04-27 02:02:24 +0200 (Thu, 27 Apr 2006) | 3 lines - update version numbers in file names again, until we have a better way - elaborate instructions for Cygwin support (closes SF #839709) ........ r45739 | fred.drake | 2006-04-27 02:20:14 +0200 (Thu, 27 Apr 2006) | 1 line add missing word ........ r45740 | anthony.baxter | 2006-04-27 04:11:24 +0200 (Thu, 27 Apr 2006) | 2 lines 2.5a2 ........ r45741 | anthony.baxter | 2006-04-27 04:13:13 +0200 (Thu, 27 Apr 2006) | 1 line 2.5a2 ........ r45749 | andrew.kuchling | 2006-04-27 14:22:37 +0200 (Thu, 27 Apr 2006) | 1 line Now that 2.5a2 is out, revert to the current date ........ r45750 | andrew.kuchling | 2006-04-27 14:23:07 +0200 (Thu, 27 Apr 2006) | 1 line Bump document version ........ r45751 | andrew.kuchling | 2006-04-27 14:34:39 +0200 (Thu, 27 Apr 2006) | 6 lines [Bug #1477102] Add necessary import to example This may be a useful style question for the docs -- should examples show the necessary imports, or should it be assumed that the reader will figure it out? In the What's New, I'm not consistent but usually opt for omitting the imports. ........ r45753 | andrew.kuchling | 2006-04-27 14:38:35 +0200 (Thu, 27 Apr 2006) | 1 line [Bug #1477140] Import Error base class ........ r45754 | andrew.kuchling | 2006-04-27 14:42:54 +0200 (Thu, 27 Apr 2006) | 1 line Mention the xmlrpclib.Error base class, which is used in one of the examples ........ r45756 | george.yoshida | 2006-04-27 15:41:07 +0200 (Thu, 27 Apr 2006) | 2 lines markup fix ........ r45757 | thomas.wouters | 2006-04-27 15:46:59 +0200 (Thu, 27 Apr 2006) | 4 lines Some more size-estimate fixes, for large-list-tests. ........ r45758 | thomas.heller | 2006-04-27 17:50:42 +0200 (Thu, 27 Apr 2006) | 3 lines Rerun the libffi configuration if any of the files used for that are newer then fficonfig.py. ........ r45766 | thomas.wouters | 2006-04-28 00:37:50 +0200 (Fri, 28 Apr 2006) | 6 lines Some style fixes and size-calculation fixes. Also do the small-memory run using a prime number, rather than a convenient power-of-2-and-multiple-of-5, so incorrect testing algorithms fail more easily. ........ r45767 | thomas.wouters | 2006-04-28 00:38:32 +0200 (Fri, 28 Apr 2006) | 6 lines Do the small-memory run of big-meormy tests using a prime number, rather than a convenient power-of-2-and-multiple-of-5, so incorrect testing algorithms fail more easily. ........ r45768 | david.goodger | 2006-04-28 00:53:05 +0200 (Fri, 28 Apr 2006) | 1 line Added SVN access for Steven Bethard and Talin, for PEP updating. ........ r45770 | thomas.wouters | 2006-04-28 01:13:20 +0200 (Fri, 28 Apr 2006) | 16 lines - Add new Warning class, ImportWarning - Warn-raise ImportWarning when importing would have picked up a directory as package, if only it'd had an __init__.py. This swaps two tests (for case-ness and __init__-ness), but case-test is not really more expensive, and it's not in a speed-critical section. - Test for the new warning by importing a common non-package directory on sys.path: site-packages - In regrtest.py, silence warnings generated by the build-environment because Modules/ (which is added to sys.path for Setup-created modules) has 'zlib' and '_ctypes' directories without __init__.py's. ........ r45771 | thomas.wouters | 2006-04-28 01:41:27 +0200 (Fri, 28 Apr 2006) | 6 lines Add more ignores of ImportWarnings; these are all just potential triggers (since they won't trigger if zlib is already sucessfully imported); they were found by grepping .py files, instead of looking at warning output :) ........ r45773 | neal.norwitz | 2006-04-28 06:32:20 +0200 (Fri, 28 Apr 2006) | 1 line Add some whitespace to be more consistent. ........ r45774 | neal.norwitz | 2006-04-28 06:34:43 +0200 (Fri, 28 Apr 2006) | 5 lines Try to really fix the slow buildbots this time. Printing to stdout, doesn't mean the data was actually written. It depends on the buffering, so we need to flush. This will hopefully really fix the buildbots getting killed due to no output on the slow bots. ........ r45775 | neal.norwitz | 2006-04-28 07:28:05 +0200 (Fri, 28 Apr 2006) | 1 line Fix some warnings on Mac OS X 10.4 ........ r45776 | neal.norwitz | 2006-04-28 07:28:30 +0200 (Fri, 28 Apr 2006) | 1 line Fix a warning on alpha ........ r45777 | neal.norwitz | 2006-04-28 07:28:54 +0200 (Fri, 28 Apr 2006) | 1 line Fix a warning on ppc (debian) ........ r45778 | george.yoshida | 2006-04-28 18:09:45 +0200 (Fri, 28 Apr 2006) | 2 lines fix markup glitch ........ r45780 | georg.brandl | 2006-04-28 18:31:17 +0200 (Fri, 28 Apr 2006) | 3 lines Add SeaMonkey to the list of Mozilla browsers. ........ r45781 | georg.brandl | 2006-04-28 18:36:55 +0200 (Fri, 28 Apr 2006) | 2 lines Bug #1475009: clarify ntpath.join behavior with absolute components ........ r45783 | george.yoshida | 2006-04-28 18:40:14 +0200 (Fri, 28 Apr 2006) | 2 lines correct a dead link ........ r45785 | georg.brandl | 2006-04-28 18:54:25 +0200 (Fri, 28 Apr 2006) | 4 lines Bug #1472949: stringify IOErrors in shutil.copytree when appending them to the Error errors list. ........ r45786 | georg.brandl | 2006-04-28 18:58:52 +0200 (Fri, 28 Apr 2006) | 3 lines Bug #1478326: don't allow '/' in distutils.util.get_platform machine names since this value is used to name the build directory. ........ r45788 | thomas.heller | 2006-04-28 19:02:18 +0200 (Fri, 28 Apr 2006) | 1 line Remove a duplicated test (the same test is in test_incomplete.py). ........ r45792 | georg.brandl | 2006-04-28 21:09:24 +0200 (Fri, 28 Apr 2006) | 3 lines Bug #1478429: make datetime.datetime.fromtimestamp accept every float, possibly "rounding up" to the next whole second. ........ r45796 | george.yoshida | 2006-04-29 04:43:30 +0200 (Sat, 29 Apr 2006) | 2 lines grammar fix ........ r45800 | ronald.oussoren | 2006-04-29 13:31:35 +0200 (Sat, 29 Apr 2006) | 2 lines Patch 1471883: --enable-universalsdk on Mac OS X ........ r45801 | andrew.kuchling | 2006-04-29 13:53:15 +0200 (Sat, 29 Apr 2006) | 1 line Add item ........ r45802 | andrew.kuchling | 2006-04-29 14:10:28 +0200 (Sat, 29 Apr 2006) | 1 line Make case of 'ZIP' consistent ........ r45803 | andrew.kuchling | 2006-04-29 14:10:43 +0200 (Sat, 29 Apr 2006) | 1 line Add item ........ r45808 | martin.v.loewis | 2006-04-29 14:37:25 +0200 (Sat, 29 Apr 2006) | 3 lines Further changes for #1471883: Edit Misc/NEWS, and add expat_config.h. ........ r45809 | brett.cannon | 2006-04-29 23:29:50 +0200 (Sat, 29 Apr 2006) | 2 lines Fix docstring for contextfactory; mentioned old contextmanager name. ........ r45810 | gerhard.haering | 2006-04-30 01:12:41 +0200 (Sun, 30 Apr 2006) | 3 lines This is the start of documentation for the sqlite3 module. Please feel free to find a better place for the link to it than alongside bsddb & friends. ........ r45811 | andrew.kuchling | 2006-04-30 03:07:09 +0200 (Sun, 30 Apr 2006) | 1 line Add two items ........ r45814 | george.yoshida | 2006-04-30 05:49:56 +0200 (Sun, 30 Apr 2006) | 2 lines Use \versionchanged instead of \versionadded for new parameter support. ........ r45815 | georg.brandl | 2006-04-30 09:06:11 +0200 (Sun, 30 Apr 2006) | 2 lines Patch #1470846: fix urllib2 ProxyBasicAuthHandler. ........ r45817 | georg.brandl | 2006-04-30 10:57:35 +0200 (Sun, 30 Apr 2006) | 3 lines In stdlib, use hashlib instead of deprecated md5 and sha modules. ........ r45819 | georg.brandl | 2006-04-30 11:23:59 +0200 (Sun, 30 Apr 2006) | 3 lines Patch #1470976: don't NLST files when retrieving over FTP. ........ r45821 | georg.brandl | 2006-04-30 13:13:56 +0200 (Sun, 30 Apr 2006) | 6 lines Bug #1473625: stop cPickle making float dumps locale dependent in protocol 0. On the way, add a decorator to test_support to facilitate running single test functions in different locales with automatic cleanup. ........ r45822 | phillip.eby | 2006-04-30 17:59:26 +0200 (Sun, 30 Apr 2006) | 2 lines Fix infinite regress when inspecting <string> or <stdin> frames. ........ r45824 | georg.brandl | 2006-04-30 19:42:26 +0200 (Sun, 30 Apr 2006) | 3 lines Fix another problem in inspect: if the module for an object cannot be found, don't try to give its __dict__ to linecache. ........ r45825 | georg.brandl | 2006-04-30 20:14:54 +0200 (Sun, 30 Apr 2006) | 3 lines Patch #1472854: make the rlcompleter.Completer class usable on non- UNIX platforms. ........ r45826 | georg.brandl | 2006-04-30 21:34:19 +0200 (Sun, 30 Apr 2006) | 3 lines Patch #1479438: add \keyword markup for "with". ........ r45827 | andrew.kuchling | 2006-04-30 23:19:31 +0200 (Sun, 30 Apr 2006) | 1 line Add urllib2 HOWTO from Michael Foord ........ r45828 | andrew.kuchling | 2006-04-30 23:19:49 +0200 (Sun, 30 Apr 2006) | 1 line Add item ........ r45830 | barry.warsaw | 2006-05-01 05:03:02 +0200 (Mon, 01 May 2006) | 11 lines Port forward from 2.4 branch: Patch #1464708 from William McVey: fixed handling of nested comments in mail addresses. E.g. "Foo ((Foo Bar)) <foo@example.com>" Fixes for both rfc822.py and email package. This patch needs to be back ported to Python 2.3 for email 2.5. ........ r45832 | fred.drake | 2006-05-01 08:25:58 +0200 (Mon, 01 May 2006) | 4 lines - minor clarification in section title - markup adjustments (there is clearly much to be done in this section) ........ r45833 | martin.v.loewis | 2006-05-01 08:28:01 +0200 (Mon, 01 May 2006) | 2 lines Work around deadlock risk. Will backport. ........ r45836 | andrew.kuchling | 2006-05-01 14:45:02 +0200 (Mon, 01 May 2006) | 1 line Some ElementTree fixes: import from xml, not xmlcore; fix case of module name; mention list() instead of getchildren() ........ r45837 | gerhard.haering | 2006-05-01 17:14:48 +0200 (Mon, 01 May 2006) | 3 lines Further integration of the documentation for the sqlite3 module. There's still quite some content to move over from the pysqlite manual, but it's a start now. ........ r45838 | martin.v.loewis | 2006-05-01 17:56:03 +0200 (Mon, 01 May 2006) | 2 lines Rename uisample to text, drop all non-text tables. ........ r45839 | martin.v.loewis | 2006-05-01 18:12:44 +0200 (Mon, 01 May 2006) | 2 lines Add msilib documentation. ........ r45840 | martin.v.loewis | 2006-05-01 18:14:16 +0200 (Mon, 01 May 2006) | 4 lines Rename parameters to match the documentation (which in turn matches Microsoft's documentation). Drop unused parameter in CAB.append. ........ r45841 | fred.drake | 2006-05-01 18:28:54 +0200 (Mon, 01 May 2006) | 1 line add dependency ........ r45842 | andrew.kuchling | 2006-05-01 18:30:25 +0200 (Mon, 01 May 2006) | 1 line Markup fixes; add some XXX comments noting problems ........ r45843 | andrew.kuchling | 2006-05-01 18:32:49 +0200 (Mon, 01 May 2006) | 1 line Add item ........ r45844 | andrew.kuchling | 2006-05-01 19:06:54 +0200 (Mon, 01 May 2006) | 1 line Markup fixes ........ r45850 | neal.norwitz | 2006-05-02 06:43:14 +0200 (Tue, 02 May 2006) | 3 lines SF #1479181: split open() and file() from being aliases for each other. ........ r45852 | neal.norwitz | 2006-05-02 08:23:22 +0200 (Tue, 02 May 2006) | 1 line Try to fix breakage caused by patch #1479181, r45850 ........ r45853 | fred.drake | 2006-05-02 08:53:59 +0200 (Tue, 02 May 2006) | 3 lines SF #1479988: add methods to allow access to weakrefs for the weakref.WeakKeyDictionary and weakref.WeakValueDictionary ........ r45854 | neal.norwitz | 2006-05-02 09:27:47 +0200 (Tue, 02 May 2006) | 5 lines Fix breakage from patch 1471883 (r45800 & r45808) on OSF/1. The problem was that pyconfig.h was being included before some system headers which caused redefinitions and other breakage. This moves system headers after expat_config.h which includes pyconfig.h. ........ r45855 | vinay.sajip | 2006-05-02 10:35:36 +0200 (Tue, 02 May 2006) | 1 line Replaced my dumb way of calculating seconds to midnight with Tim Peters' much more sensible suggestion. What was I thinking ?!? ........ r45856 | andrew.kuchling | 2006-05-02 13:30:03 +0200 (Tue, 02 May 2006) | 1 line Provide encoding as keyword argument; soften warning paragraph about encodings ........ r45858 | guido.van.rossum | 2006-05-02 19:36:09 +0200 (Tue, 02 May 2006) | 2 lines Fix the formatting of KeyboardInterrupt -- a bad issubclass() call. ........ r45862 | guido.van.rossum | 2006-05-02 21:47:52 +0200 (Tue, 02 May 2006) | 7 lines Get rid of __context__, per the latest changes to PEP 343 and python-dev discussion. There are two places of documentation that still mention __context__: Doc/lib/libstdtypes.tex -- I wasn't quite sure how to rewrite that without spending a whole lot of time thinking about it; and whatsnew, which Andrew usually likes to change himself. ........ r45863 | armin.rigo | 2006-05-02 21:52:32 +0200 (Tue, 02 May 2006) | 4 lines Documentation bug: PySet_Pop() returns a new reference (because the caller becomes the owner of that reference). ........ r45864 | guido.van.rossum | 2006-05-02 22:47:36 +0200 (Tue, 02 May 2006) | 4 lines Hopefully this will fix the spurious failures of test_mailbox.py that I'm experiencing. (This code and mailbox.py itself are full of calls to file() that should be calls to open() -- but I'm not fixing those.) ........ r45865 | andrew.kuchling | 2006-05-02 23:44:33 +0200 (Tue, 02 May 2006) | 1 line Use open() instead of file() ........ r45866 | andrew.kuchling | 2006-05-03 00:47:49 +0200 (Wed, 03 May 2006) | 1 line Update context manager section for removal of __context__ ........ r45867 | fred.drake | 2006-05-03 03:46:52 +0200 (Wed, 03 May 2006) | 1 line remove unnecessary assignment ........ r45868 | fred.drake | 2006-05-03 03:48:24 +0200 (Wed, 03 May 2006) | 4 lines tell LaTeX2HTML to: - use UTF-8 output - not mess with the >>> prompt! ........ r45869 | fred.drake | 2006-05-03 04:04:40 +0200 (Wed, 03 May 2006) | 3 lines avoid ugly markup based on the unfortunate conversions of ">>" and "<<" to guillemets; no need for magic here ........ r45870 | fred.drake | 2006-05-03 04:12:47 +0200 (Wed, 03 May 2006) | 1 line at least comment on why curly-quotes are not enabled ........ r45871 | fred.drake | 2006-05-03 04:27:40 +0200 (Wed, 03 May 2006) | 1 line one more place to avoid extra markup ........ r45872 | fred.drake | 2006-05-03 04:29:09 +0200 (Wed, 03 May 2006) | 1 line one more place to avoid extra markup (how many will there be?) ........ r45873 | fred.drake | 2006-05-03 04:29:39 +0200 (Wed, 03 May 2006) | 1 line fix up whitespace in prompt strings ........ r45876 | tim.peters | 2006-05-03 06:46:14 +0200 (Wed, 03 May 2006) | 2 lines Whitespace normalization. ........ r45877 | martin.v.loewis | 2006-05-03 06:52:04 +0200 (Wed, 03 May 2006) | 2 lines Correct some formulations, fix XXX comments. ........ r45879 | georg.brandl | 2006-05-03 07:05:02 +0200 (Wed, 03 May 2006) | 2 lines Patch #1480067: don't redirect HTTP digest auth in urllib2 ........ r45881 | georg.brandl | 2006-05-03 07:15:10 +0200 (Wed, 03 May 2006) | 3 lines Move network tests from test_urllib2 to test_urllib2net. ........ r45887 | nick.coghlan | 2006-05-03 15:02:47 +0200 (Wed, 03 May 2006) | 1 line Finish bringing SVN into line with latest version of PEP 343 by getting rid of all remaining references to context objects that I could find. Without a __context__() method context objects no longer exist. Also get test_with working again, and adopt a suggestion from Neal for decimal.Context.get_manager() ........ r45888 | nick.coghlan | 2006-05-03 15:17:49 +0200 (Wed, 03 May 2006) | 1 line Get rid of a couple more context object references, fix some markup and clarify what happens when a generator context function swallows an exception. ........ r45889 | georg.brandl | 2006-05-03 19:46:13 +0200 (Wed, 03 May 2006) | 3 lines Add seamonkey to list of Windows browsers too. ........ r45890 | georg.brandl | 2006-05-03 20:03:22 +0200 (Wed, 03 May 2006) | 3 lines RFE #1472176: In httplib, don't encode the netloc and hostname with "idna" if not necessary. ........ r45891 | georg.brandl | 2006-05-03 20:12:33 +0200 (Wed, 03 May 2006) | 2 lines Bug #1472191: convert breakpoint indices to ints before comparing them to ints ........ r45893 | georg.brandl | 2006-05-03 20:18:32 +0200 (Wed, 03 May 2006) | 3 lines Bug #1385040: don't allow "def foo(a=1, b): pass" in the compiler package. ........ r45894 | thomas.heller | 2006-05-03 20:35:39 +0200 (Wed, 03 May 2006) | 1 line Don't fail the tests when libglut.so or libgle.so cannot be loaded. ........ r45895 | georg.brandl | 2006-05-04 07:08:10 +0200 (Thu, 04 May 2006) | 2 lines Bug #1481530: allow "from os.path import ..." with imputil ........ r45897 | martin.v.loewis | 2006-05-04 07:51:03 +0200 (Thu, 04 May 2006) | 2 lines Patch #1475845: Raise IndentationError for unexpected indent. ........ r45898 | martin.v.loewis | 2006-05-04 12:08:42 +0200 (Thu, 04 May 2006) | 1 line Implement os.{chdir,rename,rmdir,remove} using Win32 directly. ........ r45899 | martin.v.loewis | 2006-05-04 14:04:27 +0200 (Thu, 04 May 2006) | 2 lines Drop now-unnecessary arguments to posix_2str. ........ r45900 | martin.v.loewis | 2006-05-04 16:27:52 +0200 (Thu, 04 May 2006) | 1 line Update checks to consider Windows error numbers. ........ r45913 | thomas.heller | 2006-05-05 20:42:14 +0200 (Fri, 05 May 2006) | 2 lines Export the 'free' standard C function for use in the test suite. ........ r45914 | thomas.heller | 2006-05-05 20:43:24 +0200 (Fri, 05 May 2006) | 3 lines Fix memory leaks in the ctypes test suite, reported by valgrind, by free()ing the memory we allocate. ........ r45915 | thomas.heller | 2006-05-05 20:46:27 +0200 (Fri, 05 May 2006) | 1 line oops - the function is exported as 'my_free', not 'free'. ........ r45916 | thomas.heller | 2006-05-05 21:14:24 +0200 (Fri, 05 May 2006) | 2 lines Clean up. ........ r45920 | george.yoshida | 2006-05-06 15:09:45 +0200 (Sat, 06 May 2006) | 2 lines describe optional arguments for DocFileSuite ........ r45924 | george.yoshida | 2006-05-06 16:16:51 +0200 (Sat, 06 May 2006) | 2 lines Use \versionchanged for the feature change ........ r45925 | martin.v.loewis | 2006-05-06 18:32:54 +0200 (Sat, 06 May 2006) | 1 line Port access, chmod, parts of getcwdu, mkdir, and utime to direct Win32 API. ........ r45926 | martin.v.loewis | 2006-05-06 22:04:08 +0200 (Sat, 06 May 2006) | 2 lines Handle ERROR_ALREADY_EXISTS. ........ r45931 | andrew.kuchling | 2006-05-07 19:12:12 +0200 (Sun, 07 May 2006) | 1 line [Patch #1479977] Revised version of urllib2 HOWTO, edited by John J. Lee ........ r45932 | andrew.kuchling | 2006-05-07 19:14:53 +0200 (Sun, 07 May 2006) | 1 line Minor language edit ........ r45934 | georg.brandl | 2006-05-07 22:44:34 +0200 (Sun, 07 May 2006) | 3 lines Patch #1483395: add new TLDs to cookielib ........ r45936 | martin.v.loewis | 2006-05-08 07:25:56 +0200 (Mon, 08 May 2006) | 2 lines Add missing PyMem_Free. ........ r45938 | georg.brandl | 2006-05-08 19:28:47 +0200 (Mon, 08 May 2006) | 3 lines Add test for rev. 45934. ........ r45939 | georg.brandl | 2006-05-08 19:36:08 +0200 (Mon, 08 May 2006) | 3 lines Patch #1479302: Make urllib2 digest auth and basic auth play together. ........ r45940 | georg.brandl | 2006-05-08 19:48:01 +0200 (Mon, 08 May 2006) | 3 lines Patch #1478993: take advantage of BaseException/Exception split in cookielib ........ r45941 | neal.norwitz | 2006-05-09 07:38:56 +0200 (Tue, 09 May 2006) | 5 lines Micro optimization. In the first case, we know that frame->f_exc_type is NULL, so there's no reason to do anything with it. In the second case, we know frame->f_exc_type is not NULL, so we can just do an INCREF. ........ r45943 | thomas.heller | 2006-05-09 22:20:15 +0200 (Tue, 09 May 2006) | 2 lines Disable a test that is unreliable. ........ r45944 | tim.peters | 2006-05-10 04:43:01 +0200 (Wed, 10 May 2006) | 4 lines Variant of patch #1478292. doctest.register_optionflag(name) shouldn't create a new flag when `name` is already the name of an option flag. ........ r45947 | neal.norwitz | 2006-05-10 08:57:58 +0200 (Wed, 10 May 2006) | 14 lines Fix problems found by Coverity. longobject.c: also fix an ssize_t problem <a> could have been NULL, so hoist the size calc to not use <a>. _ssl.c: under fail: self is DECREF'd, but it would have been NULL. _elementtree.c: delete self if there was an error. _csv.c: I'm not sure if lineterminator could have been anything other than a string. However, other string method calls are checked, so check this one too. ........ r45948 | thomas.wouters | 2006-05-10 17:04:11 +0200 (Wed, 10 May 2006) | 4 lines Ignore reflog.txt, too. ........ r45949 | georg.brandl | 2006-05-10 17:59:06 +0200 (Wed, 10 May 2006) | 3 lines Bug #1482988: indicate more prominently that the Stats class is in the pstats module. ........ r45950 | georg.brandl | 2006-05-10 18:09:03 +0200 (Wed, 10 May 2006) | 2 lines Bug #1485447: subprocess: document that the "cwd" parameter isn't used to find the executable. Misc. other markup fixes. ........ r45952 | georg.brandl | 2006-05-10 18:11:44 +0200 (Wed, 10 May 2006) | 2 lines Bug #1484978: curses.panel: clarify that Panel objects are destroyed on garbage collection. ........ r45954 | georg.brandl | 2006-05-10 18:26:03 +0200 (Wed, 10 May 2006) | 4 lines Patch #1484695: Update the tarfile module to version 0.8. This fixes a couple of issues, notably handling of long file names using the GNU LONGNAME extension. ........ r45955 | georg.brandl | 2006-05-10 19:13:20 +0200 (Wed, 10 May 2006) | 4 lines Patch #721464: pdb.Pdb instances can now be given explicit stdin and stdout arguments, making it possible to redirect input and output for remote debugging. ........ r45956 | andrew.kuchling | 2006-05-10 19:19:04 +0200 (Wed, 10 May 2006) | 1 line Clarify description of exception handling ........ r45957 | georg.brandl | 2006-05-10 22:09:23 +0200 (Wed, 10 May 2006) | 2 lines Fix two small errors in argument lists. ........ r45960 | brett.cannon | 2006-05-11 07:11:33 +0200 (Thu, 11 May 2006) | 5 lines Detect if %zd is supported by printf() during configure and sets PY_FORMAT_SIZE_T appropriately. Removes warnings on OS X under gcc 4.0.1 when PY_FORMAT_SIZE_T is set to "" instead of "z" as is needed. ........ r45963 | neal.norwitz | 2006-05-11 09:51:59 +0200 (Thu, 11 May 2006) | 1 line Don't mask a no memory error with a less meaningful one as discussed on python-checkins ........ r45964 | martin.v.loewis | 2006-05-11 15:28:43 +0200 (Thu, 11 May 2006) | 3 lines Change WindowsError to carry the Win32 error code in winerror, and the DOS error code in errno. Revert changes where WindowsError catch blocks unnecessarily special-case OSError. ........ r45965 | george.yoshida | 2006-05-11 17:53:27 +0200 (Thu, 11 May 2006) | 2 lines Grammar fix ........ r45967 | andrew.kuchling | 2006-05-11 18:32:24 +0200 (Thu, 11 May 2006) | 1 line typo fix ........ r45968 | tim.peters | 2006-05-11 18:37:42 +0200 (Thu, 11 May 2006) | 5 lines BaseThreadedTestCase.setup(): stop special-casing WindowsError. Rev 45964 fiddled with WindowsError, and broke test_bsddb3 on all the Windows buildbot slaves as a result. This should repair it. ........ r45969 | georg.brandl | 2006-05-11 21:57:09 +0200 (Thu, 11 May 2006) | 2 lines Typo fix. ........ r45970 | tim.peters | 2006-05-12 03:57:59 +0200 (Fri, 12 May 2006) | 5 lines SF patch #1473132: Improve docs for tp_clear and tp_traverse, by Collin Winter. Bugfix candidate (but I'm not going to bother). ........ r45974 | martin.v.loewis | 2006-05-12 14:27:28 +0200 (Fri, 12 May 2006) | 4 lines Dynamically allocate path name buffer for Unicode path name in listdir. Fixes #1431582. Stop overallocating MAX_PATH characters for ANSI path names. Stop assigning to errno. ........ r45975 | martin.v.loewis | 2006-05-12 15:57:36 +0200 (Fri, 12 May 2006) | 1 line Move icon files into DLLs dir. Fixes #1477968. ........ r45976 | george.yoshida | 2006-05-12 18:40:11 +0200 (Fri, 12 May 2006) | 2 lines At first there were 6 steps, but one was removed after that. ........ r45977 | martin.v.loewis | 2006-05-12 19:22:04 +0200 (Fri, 12 May 2006) | 1 line Fix alignment error on Itanium. ........ r45978 | george.yoshida | 2006-05-12 19:25:26 +0200 (Fri, 12 May 2006) | 3 lines Duplicated description about the illegal continue usage can be found in nearly the same place. They are same, so keep the original one and remove the later-added one. ........ r45980 | thomas.heller | 2006-05-12 20:16:03 +0200 (Fri, 12 May 2006) | 2 lines Add missing svn properties. ........ r45981 | thomas.heller | 2006-05-12 20:47:35 +0200 (Fri, 12 May 2006) | 1 line set svn properties ........ r45982 | thomas.heller | 2006-05-12 21:31:46 +0200 (Fri, 12 May 2006) | 1 line add svn:eol-style native svn:keywords Id ........ r45987 | gerhard.haering | 2006-05-13 01:49:49 +0200 (Sat, 13 May 2006) | 3 lines Integrated the rest of the pysqlite reference manual into the Python documentation. Ready to be reviewed and improved upon. ........ r45988 | george.yoshida | 2006-05-13 08:53:31 +0200 (Sat, 13 May 2006) | 2 lines Add \exception markup ........ r45990 | martin.v.loewis | 2006-05-13 15:34:04 +0200 (Sat, 13 May 2006) | 2 lines Revert 43315: Printing of %zd must be signed. ........ r45992 | tim.peters | 2006-05-14 01:28:20 +0200 (Sun, 14 May 2006) | 11 lines Teach PyString_FromFormat, PyErr_Format, and PyString_FromFormatV about "%u", "%lu" and "%zu" formats. Since PyString_FromFormat and PyErr_Format have exactly the same rules (both inherited from PyString_FromFormatV), it would be good if someone with more LaTeX Fu changed one of them to just point to the other. Their docs were way out of synch before this patch, and I just did a mass copy+paste to repair that. Not a backport candidate (this is a new feature). ........ r45993 | tim.peters | 2006-05-14 01:31:05 +0200 (Sun, 14 May 2006) | 2 lines Typo repair. ........ r45994 | tim.peters | 2006-05-14 01:33:19 +0200 (Sun, 14 May 2006) | 2 lines Remove lie in new comment. ........ r45995 | ronald.oussoren | 2006-05-14 21:56:34 +0200 (Sun, 14 May 2006) | 11 lines Rework the build system for osx applications: * Don't use xcodebuild for building PythonLauncher, but use a normal unix makefile. This makes it a lot easier to use the same build flags as for the rest of python (e.g. make a universal version of python launcher) * Convert the mac makefile-s to makefile.in-s and use configure to set makefile variables instead of forwarding them as command-line arguments * Add a C version of pythonw, that we you can use '#!/usr/local/bin/pythonw' * Build IDLE.app using bundlebuilder instead of BuildApplet, that will allow easier modification of the bundle contents later on. ........ r45996 | ronald.oussoren | 2006-05-14 22:35:41 +0200 (Sun, 14 May 2006) | 6 lines A first cut at replacing the icons on MacOS X. This replaces all icons by icons based on the new python.org logo. These are also the first icons that are "proper" OSX icons. These icons were created by Jacob Rus. ........ r45997 | ronald.oussoren | 2006-05-14 23:07:41 +0200 (Sun, 14 May 2006) | 3 lines I missed one small detail in my rewrite of the osx build files: the path to the Python.app template. ........ r45998 | martin.v.loewis | 2006-05-15 07:51:36 +0200 (Mon, 15 May 2006) | 2 lines Fix memory leak. ........ r45999 | neal.norwitz | 2006-05-15 08:48:14 +0200 (Mon, 15 May 2006) | 1 line Move items implemented after a2 into the new a3 section ........ r46000 | neal.norwitz | 2006-05-15 09:04:36 +0200 (Mon, 15 May 2006) | 5 lines - Bug #1487966: Fix SystemError with conditional expression in assignment Most of the test_syntax changes are just updating the numbers. ........ r46001 | neal.norwitz | 2006-05-15 09:17:23 +0200 (Mon, 15 May 2006) | 1 line Patch #1488312, Fix memory alignment problem on SPARC in unicode. Will backport ........ r46003 | martin.v.loewis | 2006-05-15 11:22:27 +0200 (Mon, 15 May 2006) | 3 lines Remove bogus DECREF of self. Change __str__() functions to METH_O. Change WindowsError__str__ to use PyTuple_Pack. ........ r46005 | georg.brandl | 2006-05-15 21:30:35 +0200 (Mon, 15 May 2006) | 3 lines [ 1488881 ] tarfile.py: support for file-objects and bz2 (cp. #1488634) ........ r46007 | tim.peters | 2006-05-15 22:44:10 +0200 (Mon, 15 May 2006) | 9 lines ReadDetectFileobjTest: repair Windows disasters by opening the file object in binary mode. The Windows buildbot slaves shouldn't swap themselves to death anymore. However, test_tarfile may still fail because of a temp directory left behind from a previous failing run. Windows buildbot owners may need to remove that directory by hand. ........ r46009 | tim.peters | 2006-05-15 23:32:25 +0200 (Mon, 15 May 2006) | 3 lines test_directory(): Remove the leftover temp directory that's making the Windows buildbots fail test_tarfile. ........ r46010 | martin.v.loewis | 2006-05-16 09:05:37 +0200 (Tue, 16 May 2006) | 4 lines - Test for sys/statvfs.h before including it, as statvfs is present on some OSX installation, but its header file is not. Will backport to 2.4 ........ r46012 | georg.brandl | 2006-05-16 09:38:27 +0200 (Tue, 16 May 2006) | 3 lines Patch #1435422: zlib's compress and decompress objects now have a copy() method. ........ r46015 | andrew.kuchling | 2006-05-16 18:11:54 +0200 (Tue, 16 May 2006) | 1 line Add item ........ r46016 | andrew.kuchling | 2006-05-16 18:27:31 +0200 (Tue, 16 May 2006) | 3 lines PEP 243 has been withdrawn, so don't refer to it any more. The PyPI upload material has been moved into the section on PEP314. ........ r46017 | george.yoshida | 2006-05-16 19:42:16 +0200 (Tue, 16 May 2006) | 2 lines Update for 'ImportWarning' ........ r46018 | george.yoshida | 2006-05-16 20:07:00 +0200 (Tue, 16 May 2006) | 4 lines Mention that Exception is now a subclass of BaseException. Remove a sentence that says that BaseException inherits from BaseException. (I guess this is just a copy & paste mistake.) ........ r46019 | george.yoshida | 2006-05-16 20:26:10 +0200 (Tue, 16 May 2006) | 2 lines Document ImportWarning ........ r46020 | tim.peters | 2006-05-17 01:22:20 +0200 (Wed, 17 May 2006) | 2 lines Whitespace normalization. ........ r46021 | tim.peters | 2006-05-17 01:24:08 +0200 (Wed, 17 May 2006) | 2 lines Text files missing the SVN eol-style property. ........ r46022 | tim.peters | 2006-05-17 03:30:11 +0200 (Wed, 17 May 2006) | 2 lines PyZlib_copy(), PyZlib_uncopy(): Repair leaks on the normal-case path. ........ r46023 | georg.brandl | 2006-05-17 16:06:07 +0200 (Wed, 17 May 2006) | 3 lines Remove misleading comment about type-class unification. ........ r46024 | georg.brandl | 2006-05-17 16:11:36 +0200 (Wed, 17 May 2006) | 3 lines Apply patch #1489784 from Michael Foord. ........ r46025 | georg.brandl | 2006-05-17 16:18:20 +0200 (Wed, 17 May 2006) | 3 lines Fix typo in os.utime docstring (patch #1490189) ........ r46026 | georg.brandl | 2006-05-17 16:26:50 +0200 (Wed, 17 May 2006) | 3 lines Patch #1490224: set time.altzone correctly on Cygwin. ........ r46027 | georg.brandl | 2006-05-17 16:45:06 +0200 (Wed, 17 May 2006) | 4 lines Add global debug flag to cookielib to avoid heavy dependency on the logging module. Resolves #1484758. ........ r46028 | georg.brandl | 2006-05-17 16:56:04 +0200 (Wed, 17 May 2006) | 3 lines Patch #1486962: Several bugs in the turtle Tk demo module were fixed and several features added, such as speed and geometry control. ........ r46029 | georg.brandl | 2006-05-17 17:17:00 +0200 (Wed, 17 May 2006) | 4 lines Delay-import some large modules to speed up urllib2 import. (fixes #1484793). ........ r46030 | georg.brandl | 2006-05-17 17:51:16 +0200 (Wed, 17 May 2006) | 3 lines Patch #1180296: improve locale string formatting functions ........ r46032 | tim.peters | 2006-05-18 04:06:40 +0200 (Thu, 18 May 2006) | 2 lines Whitespace normalization. ........ r46033 | georg.brandl | 2006-05-18 08:11:19 +0200 (Thu, 18 May 2006) | 3 lines Amendments to patch #1484695. ........ r46034 | georg.brandl | 2006-05-18 08:18:06 +0200 (Thu, 18 May 2006) | 3 lines Remove unused import. ........ r46035 | georg.brandl | 2006-05-18 08:33:27 +0200 (Thu, 18 May 2006) | 3 lines Fix test_locale for platforms without a default thousands separator. ........ r46036 | neal.norwitz | 2006-05-18 08:51:46 +0200 (Thu, 18 May 2006) | 1 line Little cleanup ........ r46037 | georg.brandl | 2006-05-18 09:01:27 +0200 (Thu, 18 May 2006) | 4 lines Bug #1462152: file() now checks more thoroughly for invalid mode strings and removes a possible "U" before passing the mode to the C library function. ........ r46038 | georg.brandl | 2006-05-18 09:20:05 +0200 (Thu, 18 May 2006) | 3 lines Bug #1490688: properly document %e, %f, %g format subtleties. ........ r46039 | vinay.sajip | 2006-05-18 09:28:58 +0200 (Thu, 18 May 2006) | 1 line Changed status from "beta" to "production"; since logging has been part of the stdlib since 2.3, it should be safe to make this assertion ;-) ........ r46040 | ronald.oussoren | 2006-05-18 11:04:15 +0200 (Thu, 18 May 2006) | 2 lines Fix some minor issues with the generated application bundles on MacOSX ........ r46041 | andrew.kuchling | 2006-05-19 02:03:55 +0200 (Fri, 19 May 2006) | 1 line Typo fix; add clarifying word ........ r46044 | neal.norwitz | 2006-05-19 08:31:23 +0200 (Fri, 19 May 2006) | 3 lines Fix #132 from Coverity, retval could have been derefed if a continue inside a try failed. ........ r46045 | neal.norwitz | 2006-05-19 08:43:50 +0200 (Fri, 19 May 2006) | 2 lines Fix #1474677, non-keyword argument following keyword. ........ r46046 | neal.norwitz | 2006-05-19 09:00:58 +0200 (Fri, 19 May 2006) | 4 lines Bug/Patch #1481770: Use .so extension for shared libraries on HP-UX for ia64. I suppose this could be backported if anyone cares. ........ r46047 | neal.norwitz | 2006-05-19 09:05:01 +0200 (Fri, 19 May 2006) | 7 lines Oops, I forgot to include this file in the last commit (46046): Bug/Patch #1481770: Use .so extension for shared libraries on HP-UX for ia64. I suppose this could be backported if anyone cares. ........ r46050 | ronald.oussoren | 2006-05-19 20:17:31 +0200 (Fri, 19 May 2006) | 6 lines * Change working directory to the users home directory, that makes the file open/save dialogs more useable. * Don't use argv emulator, its not needed for idle. ........ r46052 | tim.peters | 2006-05-19 21:16:34 +0200 (Fri, 19 May 2006) | 2 lines Whitespace normalization. ........ r46054 | ronald.oussoren | 2006-05-20 08:17:01 +0200 (Sat, 20 May 2006) | 9 lines Fix bug #1000914 (again). This patches a file that is generated by bgen, however the code is now the same as a current copy of bgen would generate. Without this patch most types in the Carbon.CF module are unusable. I haven't managed to coax bgen into generating a complete copy of _CFmodule.c yet :-(, hence the manual patching. ........ r46055 | george.yoshida | 2006-05-20 17:36:19 +0200 (Sat, 20 May 2006) | 3 lines - markup fix - add clarifying words ........ r46057 | george.yoshida | 2006-05-20 18:29:14 +0200 (Sat, 20 May 2006) | 3 lines - Add 'as' and 'with' as new keywords in 2.5. - Regenerate keyword lists with reswords.py. ........ r46058 | george.yoshida | 2006-05-20 20:07:26 +0200 (Sat, 20 May 2006) | 2 lines Apply patch #1492147 from Mike Foord. ........ r46059 | andrew.kuchling | 2006-05-20 21:25:16 +0200 (Sat, 20 May 2006) | 1 line Minor edits ........ r46061 | george.yoshida | 2006-05-21 06:22:59 +0200 (Sun, 21 May 2006) | 2 lines Fix the TeX compile error. ........ r46062 | george.yoshida | 2006-05-21 06:40:32 +0200 (Sun, 21 May 2006) | 2 lines Apply patch #1492255 from Mike Foord. ........ r46063 | martin.v.loewis | 2006-05-22 10:48:14 +0200 (Mon, 22 May 2006) | 1 line Patch 1490384: New Icons for the PC build. ........ r46064 | martin.v.loewis | 2006-05-22 11:15:18 +0200 (Mon, 22 May 2006) | 1 line Patch #1492356: Port to Windows CE (patch set 1). ........ r46065 | tim.peters | 2006-05-22 13:29:41 +0200 (Mon, 22 May 2006) | 4 lines Define SIZEOF_{DOUBLE,FLOAT} on Windows. Else Michael Hudson's nice gimmicks for IEEE special values (infinities, NaNs) don't work. ........ r46070 | bob.ippolito | 2006-05-22 16:31:24 +0200 (Mon, 22 May 2006) | 2 lines GzipFile.readline performance improvement (~30-40%), patch #1281707 ........ r46071 | bob.ippolito | 2006-05-22 17:22:46 +0200 (Mon, 22 May 2006) | 1 line Revert gzip readline performance patch #1281707 until a more generic performance improvement can be found ........ r46073 | fredrik.lundh | 2006-05-22 17:35:12 +0200 (Mon, 22 May 2006) | 4 lines docstring tweaks: count counts non-overlapping substrings, not total number of occurences ........ r46075 | bob.ippolito | 2006-05-22 17:59:12 +0200 (Mon, 22 May 2006) | 1 line Apply revised patch for GzipFile.readline performance #1281707 ........ r46076 | fredrik.lundh | 2006-05-22 18:29:30 +0200 (Mon, 22 May 2006) | 3 lines needforspeed: speed up unicode repeat, unicode string copy ........ r46079 | fredrik.lundh | 2006-05-22 19:12:58 +0200 (Mon, 22 May 2006) | 4 lines needforspeed: use memcpy for "long" strings; use a better algorithm for long repeats. ........ r46084 | tim.peters | 2006-05-22 21:17:04 +0200 (Mon, 22 May 2006) | 7 lines PyUnicode_Join(): Recent code changes introduced new compiler warnings on Windows (signed vs unsigned mismatch in comparisons). Cleaned that up by switching more locals to Py_ssize_t. Simplified overflow checking (it can _be_ simpler because while these things are declared as Py_ssize_t, then should in fact never be negative). ........ r46085 | tim.peters | 2006-05-23 07:47:16 +0200 (Tue, 23 May 2006) | 3 lines unicode_repeat(): Change type of local to Py_ssize_t, since that's what it should be. ........ r46094 | fredrik.lundh | 2006-05-23 12:10:57 +0200 (Tue, 23 May 2006) | 3 lines needforspeed: check first *and* last character before doing a full memcmp ........ r46095 | fredrik.lundh | 2006-05-23 12:12:21 +0200 (Tue, 23 May 2006) | 4 lines needforspeed: fixed unicode "in" operator to use same implementation approach as find/index ........ r46096 | richard.jones | 2006-05-23 12:37:38 +0200 (Tue, 23 May 2006) | 7 lines Merge from rjones-funccall branch. Applied patch zombie-frames-2.diff from sf patch 876206 with updates for Python 2.5 and also modified to retain the free_list to avoid the 67% slow-down in pybench recursion test. 5% speed up in function call pybench. ........ r46098 | ronald.oussoren | 2006-05-23 13:04:24 +0200 (Tue, 23 May 2006) | 2 lines Avoid creating a mess when installing a framework for the second time. ........ r46101 | georg.brandl | 2006-05-23 13:17:21 +0200 (Tue, 23 May 2006) | 3 lines PyErr_NewException now accepts a tuple of base classes as its "base" parameter. ........ r46103 | ronald.oussoren | 2006-05-23 13:47:16 +0200 (Tue, 23 May 2006) | 3 lines Disable linking extensions with -lpython2.5 for darwin. This should fix bug #1487105. ........ r46104 | ronald.oussoren | 2006-05-23 14:01:11 +0200 (Tue, 23 May 2006) | 6 lines Patch #1488098. This patchs makes it possible to create a universal build on OSX 10.4 and use the result to build extensions on 10.3. It also makes it possible to override the '-arch' and '-isysroot' compiler arguments for specific extensions. ........ r46108 | andrew.kuchling | 2006-05-23 14:44:36 +0200 (Tue, 23 May 2006) | 1 line Add some items; mention the sprint ........ r46109 | andrew.kuchling | 2006-05-23 14:47:01 +0200 (Tue, 23 May 2006) | 1 line Mention string improvements ........ r46110 | andrew.kuchling | 2006-05-23 14:49:35 +0200 (Tue, 23 May 2006) | 4 lines Use 'speed' instead of 'performance', because I agree with the argument at http://zestyping.livejournal.com/193260.html that 'erformance' really means something more general. ........ r46113 | ronald.oussoren | 2006-05-23 17:09:57 +0200 (Tue, 23 May 2006) | 2 lines An improved script for building the binary distribution on MacOSX. ........ r46128 | richard.jones | 2006-05-23 20:28:17 +0200 (Tue, 23 May 2006) | 3 lines Applied patch 1337051 by Neal Norwitz, saving 4 ints on frame objects. ........ r46129 | richard.jones | 2006-05-23 20:32:11 +0200 (Tue, 23 May 2006) | 1 line fix broken merge ........ r46130 | bob.ippolito | 2006-05-23 20:41:17 +0200 (Tue, 23 May 2006) | 1 line Update Misc/NEWS for gzip patch #1281707 ........ r46131 | bob.ippolito | 2006-05-23 20:43:47 +0200 (Tue, 23 May 2006) | 1 line Update Misc/NEWS for gzip patch #1281707 ........ r46132 | fredrik.lundh | 2006-05-23 20:44:25 +0200 (Tue, 23 May 2006) | 7 lines needforspeed: use append+reverse for rsplit, use "bloom filters" to speed up splitlines and strip with charsets; etc. rsplit is now as fast as split in all our tests (reverse takes no time at all), and splitlines() is nearly as fast as a plain split("\n") in our tests. and we're not done yet... ;-) ........ r46133 | tim.peters | 2006-05-23 20:45:30 +0200 (Tue, 23 May 2006) | 38 lines Bug #1334662 / patch #1335972: int(string, base) wrong answers. In rare cases of strings specifying true values near sys.maxint, and oddball bases (not decimal or a power of 2), int(string, base) could deliver insane answers. This repairs all such problems, and also speeds string->int significantly. On my box, here are % speedups for decimal strings of various lengths: length speedup ------ ------- 1 12.4% 2 15.7% 3 20.6% 4 28.1% 5 33.2% 6 37.5% 7 41.9% 8 46.3% 9 51.2% 10 19.5% 11 19.9% 12 23.9% 13 23.7% 14 23.3% 15 24.9% 16 25.3% 17 28.3% 18 27.9% 19 35.7% Note that the difference between 9 and 10 is the difference between short and long Python ints on a 32-bit box. The patch doesn't actually do anything to speed conversion to long: the speedup is due to detecting "unsigned long" overflow more quickly. This is a bugfix candidate, but it's a non-trivial patch and it would be painful to separate the "bug fix" from the "speed up" parts. ........ r46134 | bob.ippolito | 2006-05-23 20:46:41 +0200 (Tue, 23 May 2006) | 1 line Patch #1493701: performance enhancements for struct module. ........ r46136 | andrew.kuchling | 2006-05-23 21:00:45 +0200 (Tue, 23 May 2006) | 1 line Remove duplicate item ........ r46141 | bob.ippolito | 2006-05-23 21:09:51 +0200 (Tue, 23 May 2006) | 1 line revert #1493701 ........ r46142 | bob.ippolito | 2006-05-23 21:11:34 +0200 (Tue, 23 May 2006) | 1 line patch #1493701: performance enhancements for struct module ........ r46144 | bob.ippolito | 2006-05-23 21:12:41 +0200 (Tue, 23 May 2006) | 1 line patch #1493701: performance enhancements for struct module ........ r46148 | bob.ippolito | 2006-05-23 21:25:52 +0200 (Tue, 23 May 2006) | 1 line fix linking issue, warnings, in struct ........ r46149 | andrew.kuchling | 2006-05-23 21:29:38 +0200 (Tue, 23 May 2006) | 1 line Add two items ........ r46150 | bob.ippolito | 2006-05-23 21:31:23 +0200 (Tue, 23 May 2006) | 1 line forward declaration for PyStructType ........ r46151 | bob.ippolito | 2006-05-23 21:32:25 +0200 (Tue, 23 May 2006) | 1 line fix typo in _struct ........ r46152 | andrew.kuchling | 2006-05-23 21:32:35 +0200 (Tue, 23 May 2006) | 1 line Add item ........ r46153 | tim.peters | 2006-05-23 21:34:37 +0200 (Tue, 23 May 2006) | 3 lines Get the Windows build working again (recover from `struct` module changes). ........ r46155 | fredrik.lundh | 2006-05-23 21:47:35 +0200 (Tue, 23 May 2006) | 3 lines return 0 on misses, not -1. ........ r46156 | tim.peters | 2006-05-23 23:51:35 +0200 (Tue, 23 May 2006) | 4 lines test_struct grew weird behavior under regrtest.py -R, due to a module-level cache. Clearing the cache should make it stop showing up in refleak reports. ........ r46157 | tim.peters | 2006-05-23 23:54:23 +0200 (Tue, 23 May 2006) | 2 lines Whitespace normalization. ........ r46158 | tim.peters | 2006-05-23 23:55:53 +0200 (Tue, 23 May 2006) | 2 lines Add missing svn:eol-style property to text files. ........ r46161 | fredrik.lundh | 2006-05-24 12:20:36 +0200 (Wed, 24 May 2006) | 3 lines use Py_ssize_t for string indexes (thanks, neal!) ........ r46173 | fredrik.lundh | 2006-05-24 16:28:11 +0200 (Wed, 24 May 2006) | 14 lines needforspeed: use "fastsearch" for count and findstring helpers. this results in a 2.5x speedup on the stringbench count tests, and a 20x (!) speedup on the stringbench search/find/contains test, compared to 2.5a2. for more on the algorithm, see: http://effbot.org/zone/stringlib.htm if you get weird results, you can disable the new algoritm by undefining USE_FAST in Objects/unicodeobject.c. enjoy /F ........ r46182 | fredrik.lundh | 2006-05-24 17:11:01 +0200 (Wed, 24 May 2006) | 3 lines needforspeedindeed: use fastsearch also for __contains__ ........ r46184 | bob.ippolito | 2006-05-24 17:32:06 +0200 (Wed, 24 May 2006) | 1 line refactor unpack, add unpack_from ........ r46189 | fredrik.lundh | 2006-05-24 18:35:18 +0200 (Wed, 24 May 2006) | 4 lines needforspeed: refactored the replace code slightly; special-case constant-length changes; use fastsearch to locate the first match. ........ r46198 | andrew.dalke | 2006-05-24 20:55:37 +0200 (Wed, 24 May 2006) | 10 lines Added a slew of test for string replace, based various corner cases from the Need For Speed sprint coding. Includes commented out overflow tests which will be uncommented once the code is fixed. This test will break the 8-bit string tests because "".replace("", "A") == "" when it should == "A" We have a fix for it, which should be added tomorrow. ........ r46200 | tim.peters | 2006-05-24 22:27:18 +0200 (Wed, 24 May 2006) | 2 lines We can't leave the checked-in tests broken. ........ r46201 | tim.peters | 2006-05-24 22:29:44 +0200 (Wed, 24 May 2006) | 2 lines Whitespace normalization. ........ r46202 | tim.peters | 2006-05-24 23:00:45 +0200 (Wed, 24 May 2006) | 4 lines Disable the damn empty-string replace test -- it can't be make to pass now for unicode if it passes for str, or vice versa. ........ r46203 | tim.peters | 2006-05-24 23:10:40 +0200 (Wed, 24 May 2006) | 58 lines Heavily fiddled variant of patch #1442927: PyLong_FromString optimization. ``long(str, base)`` is now up to 6x faster for non-power-of-2 bases. The largest speedup is for inputs with about 1000 decimal digits. Conversion from non-power-of-2 bases remains quadratic-time in the number of input digits (it was and remains linear-time for bases 2, 4, 8, 16 and 32). Speedups at various lengths for decimal inputs, comparing 2.4.3 with current trunk. Note that it's actually a bit slower for 1-digit strings: len speedup ---- ------- 1 -4.5% 2 4.6% 3 8.3% 4 12.7% 5 16.9% 6 28.6% 7 35.5% 8 44.3% 9 46.6% 10 55.3% 11 65.7% 12 77.7% 13 73.4% 14 75.3% 15 85.2% 16 103.0% 17 95.1% 18 112.8% 19 117.9% 20 128.3% 30 174.5% 40 209.3% 50 236.3% 60 254.3% 70 262.9% 80 295.8% 90 297.3% 100 324.5% 200 374.6% 300 403.1% 400 391.1% 500 388.7% 600 440.6% 700 468.7% 800 498.0% 900 507.2% 1000 501.2% 2000 450.2% 3000 463.2% 4000 452.5% 5000 440.6% 6000 439.6% 7000 424.8% 8000 418.1% 9000 417.7% ........ r46204 | andrew.kuchling | 2006-05-25 02:23:03 +0200 (Thu, 25 May 2006) | 1 line Minor edits; add an item ........ r46205 | fred.drake | 2006-05-25 04:42:25 +0200 (Thu, 25 May 2006) | 3 lines fix broken links in PDF (SF patch #1281291, contributed by Rory Yorke) ........ r46208 | walter.doerwald | 2006-05-25 10:53:28 +0200 (Thu, 25 May 2006) | 2 lines Replace tab inside comment with space. ........ r46209 | thomas.wouters | 2006-05-25 13:25:51 +0200 (Thu, 25 May 2006) | 4 lines Fix #1488915, Multiple dots in relative import statement raise SyntaxError. ........ r46210 | thomas.wouters | 2006-05-25 13:26:25 +0200 (Thu, 25 May 2006) | 5 lines Update graminit.c for the fix for #1488915, Multiple dots in relative import statement raise SyntaxError, and add testcase. ........ r46211 | andrew.kuchling | 2006-05-25 14:27:59 +0200 (Thu, 25 May 2006) | 1 line Add entry; and fix a typo ........ r46214 | fredrik.lundh | 2006-05-25 17:22:03 +0200 (Thu, 25 May 2006) | 7 lines needforspeed: speed up upper and lower for 8-bit string objects. (the unicode versions of these are still 2x faster on windows, though...) based on work by Andrew Dalke, with tweaks by yours truly. ........ r46216 | fredrik.lundh | 2006-05-25 17:49:45 +0200 (Thu, 25 May 2006) | 5 lines needforspeed: make new upper/lower work properly for single-character strings too... (thanks to georg brandl for spotting the exact problem faster than anyone else) ........ r46217 | kristjan.jonsson | 2006-05-25 17:53:30 +0200 (Thu, 25 May 2006) | 1 line Added a new macro, Py_IS_FINITE(X). On windows there is an intrinsic for this and it is more efficient than to use !Py_IS_INFINITE(X) && !Py_IS_NAN(X). No change on other platforms ........ r46219 | fredrik.lundh | 2006-05-25 18:10:12 +0200 (Thu, 25 May 2006) | 4 lines needforspeed: _toupper/_tolower is a SUSv2 thing; fall back on ISO C versions if they're not defined. ........ r46220 | andrew.kuchling | 2006-05-25 18:23:15 +0200 (Thu, 25 May 2006) | 1 line Fix comment typos ........ r46221 | andrew.dalke | 2006-05-25 18:30:52 +0200 (Thu, 25 May 2006) | 2 lines Added tests for implementation error we came up with in the need for speed sprint. ........ r46222 | andrew.kuchling | 2006-05-25 18:34:54 +0200 (Thu, 25 May 2006) | 1 line Fix another typo ........ r46223 | kristjan.jonsson | 2006-05-25 18:39:27 +0200 (Thu, 25 May 2006) | 1 line Fix incorrect documentation for the Py_IS_FINITE(X) macro. ........ r46224 | fredrik.lundh | 2006-05-25 18:46:54 +0200 (Thu, 25 May 2006) | 3 lines needforspeed: check for overflow in replace (from Andrew Dalke) ........ r46226 | fredrik.lundh | 2006-05-25 19:08:14 +0200 (Thu, 25 May 2006) | 5 lines needforspeed: new replace implementation by Andrew Dalke. replace is now about 3x faster on my machine, for the replace tests from string- bench. ........ r46227 | tim.peters | 2006-05-25 19:34:03 +0200 (Thu, 25 May 2006) | 5 lines A new table to help string->integer conversion was added yesterday to both mystrtoul.c and longobject.c. Share the table instead. Also cut its size by 64 entries (they had been used for an inscrutable trick originally, but the code no longer tries to use that trick). ........ r46229 | andrew.dalke | 2006-05-25 19:53:00 +0200 (Thu, 25 May 2006) | 11 lines Fixed problem identified by Georg. The special-case in-place code for replace made a copy of the string using PyString_FromStringAndSize(s, n) and modify the copied string in-place. However, 1 (and 0) character strings are shared from a cache. This cause "A".replace("A", "a") to change the cached version of "A" -- used by everyone. Now may the copy with NULL as the string and do the memcpy manually. I've added regression tests to check if this happens in the future. Perhaps there should be a PyString_Copy for this case? ........ r46230 | fredrik.lundh | 2006-05-25 19:55:31 +0200 (Thu, 25 May 2006) | 4 lines needforspeed: use "fastsearch" for count. this results in a 3x speedup for the related stringbench tests. ........ r46231 | andrew.dalke | 2006-05-25 20:03:25 +0200 (Thu, 25 May 2006) | 4 lines Code had returned an ssize_t, upcast to long, then converted with PyInt_FromLong. Now using PyInt_FromSsize_t. ........ r46233 | andrew.kuchling | 2006-05-25 20:11:16 +0200 (Thu, 25 May 2006) | 1 line Comment typo ........ r46234 | andrew.dalke | 2006-05-25 20:18:39 +0200 (Thu, 25 May 2006) | 4 lines Added overflow test for adding two (very) large strings where the new string is over max Py_ssize_t. I have no way to test it on my box or any box I have access to. At least it doesn't break anything. ........ r46235 | bob.ippolito | 2006-05-25 20:20:23 +0200 (Thu, 25 May 2006) | 1 line Faster path for PyLong_FromLongLong, using PyLong_FromLong algorithm ........ r46238 | georg.brandl | 2006-05-25 20:44:09 +0200 (Thu, 25 May 2006) | 3 lines Guard the _active.remove() call to avoid errors when there is no _active list. ........ r46239 | fredrik.lundh | 2006-05-25 20:44:29 +0200 (Thu, 25 May 2006) | 4 lines needforspeed: use fastsearch also for find/index and contains. the related tests are now about 10x faster. ........ r46240 | bob.ippolito | 2006-05-25 20:44:50 +0200 (Thu, 25 May 2006) | 1 line Struct now unpacks to PY_LONG_LONG directly when possible, also include #ifdef'ed out code that will return int instead of long when in bounds (not active since it's an API and doc change) ........ r46241 | jack.diederich | 2006-05-25 20:47:15 +0200 (Thu, 25 May 2006) | 1 line * eliminate warning by reverting tmp_s type to 'const char*' ........ r46242 | bob.ippolito | 2006-05-25 21:03:19 +0200 (Thu, 25 May 2006) | 1 line Fix Cygwin compiler issue ........ r46243 | bob.ippolito | 2006-05-25 21:15:27 +0200 (Thu, 25 May 2006) | 1 line fix a struct regression where long would be returned for short unsigned integers ........ r46244 | georg.brandl | 2006-05-25 21:15:31 +0200 (Thu, 25 May 2006) | 4 lines Replace PyObject_CallFunction calls with only object args with PyObject_CallFunctionObjArgs, which is 30% faster. ........ r46245 | fredrik.lundh | 2006-05-25 21:19:05 +0200 (Thu, 25 May 2006) | 3 lines needforspeed: use insert+reverse instead of append ........ r46246 | bob.ippolito | 2006-05-25 21:33:38 +0200 (Thu, 25 May 2006) | 1 line Use LONG_MIN and LONG_MAX to check Python integer bounds instead of the incorrect INT_MIN and INT_MAX ........ r46248 | bob.ippolito | 2006-05-25 21:56:56 +0200 (Thu, 25 May 2006) | 1 line Use faster struct pack/unpack functions for the endian table that matches the host's ........ r46249 | bob.ippolito | 2006-05-25 21:59:56 +0200 (Thu, 25 May 2006) | 1 line enable darwin/x86 support for libffi and hence ctypes (doesn't yet support --enable-universalsdk) ........ r46252 | georg.brandl | 2006-05-25 22:28:10 +0200 (Thu, 25 May 2006) | 4 lines Someone seems to just have copy-pasted the docs of tp_compare to tp_richcompare ;) ........ r46253 | brett.cannon | 2006-05-25 22:44:08 +0200 (Thu, 25 May 2006) | 2 lines Swap out bare malloc()/free() use for PyMem_MALLOC()/PyMem_FREE() . ........ r46254 | bob.ippolito | 2006-05-25 22:52:38 +0200 (Thu, 25 May 2006) | 1 line squelch gcc4 darwin/x86 compiler warnings ........ r46255 | bob.ippolito | 2006-05-25 23:09:45 +0200 (Thu, 25 May 2006) | 1 line fix test_float regression and 64-bit size mismatch issue ........ r46256 | georg.brandl | 2006-05-25 23:11:56 +0200 (Thu, 25 May 2006) | 3 lines Add a x-ref to newer calling APIs. ........ r46257 | ronald.oussoren | 2006-05-25 23:30:54 +0200 (Thu, 25 May 2006) | 2 lines Fix minor typo in prep_cif.c ........ r46259 | brett.cannon | 2006-05-25 23:33:11 +0200 (Thu, 25 May 2006) | 4 lines Change test_values so that it compares the lowercasing of group names since getgrall() can return all lowercase names while getgrgid() returns proper casing. Discovered on Ubuntu 5.04 (custom). ........ r46261 | tim.peters | 2006-05-25 23:50:17 +0200 (Thu, 25 May 2006) | 7 lines Some Win64 pre-release in 2000 didn't support QueryPerformanceCounter(), but we believe Win64 does support it now. So use in time.clock(). It would be peachy if someone with a Win64 box tried this ;-) ........ r46262 | tim.peters | 2006-05-25 23:52:19 +0200 (Thu, 25 May 2006) | 2 lines Whitespace normalization. ........ r46263 | bob.ippolito | 2006-05-25 23:58:05 +0200 (Thu, 25 May 2006) | 1 line Add missing files from x86 darwin ctypes patch ........ r46264 | brett.cannon | 2006-05-26 00:00:14 +0200 (Fri, 26 May 2006) | 2 lines Move over to use of METH_O and METH_NOARGS. ........ r46265 | tim.peters | 2006-05-26 00:25:25 +0200 (Fri, 26 May 2006) | 3 lines Repair idiot typo, and complete the job of trying to use the Windows time.clock() implementation on Win64. ........ r46266 | tim.peters | 2006-05-26 00:28:46 +0200 (Fri, 26 May 2006) | 9 lines Patch #1494387: SVN longobject.c compiler warnings The SIGCHECK macro defined here has always been bizarre, but it apparently causes compiler warnings on "Sun Studio 11". I believe the warnings are bogus, but it doesn't hurt to make the macro definition saner. Bugfix candidate (but I'm not going to bother). ........ r46268 | fredrik.lundh | 2006-05-26 01:27:53 +0200 (Fri, 26 May 2006) | 8 lines needforspeed: partition for 8-bit strings. for some simple tests, this is on par with a corresponding find, and nearly twice as fast as split(sep, 1) full tests, a unicode version, and documentation will follow to- morrow. ........ r46271 | andrew.kuchling | 2006-05-26 03:46:22 +0200 (Fri, 26 May 2006) | 1 line Add Soc student ........ r46272 | ronald.oussoren | 2006-05-26 10:41:25 +0200 (Fri, 26 May 2006) | 3 lines Without this patch OSX users couldn't add new help sources because the code tried to update one item in a tuple. ........ r46273 | fredrik.lundh | 2006-05-26 10:54:28 +0200 (Fri, 26 May 2006) | 5 lines needforspeed: partition implementation, part two. feel free to improve the documentation and the docstrings. ........ r46274 | georg.brandl | 2006-05-26 11:05:54 +0200 (Fri, 26 May 2006) | 3 lines Clarify docs for str.partition(). ........ r46278 | fredrik.lundh | 2006-05-26 11:46:59 +0200 (Fri, 26 May 2006) | 5 lines needforspeed: use METH_O for argument handling, which made partition some ~15% faster for the current tests (which is noticable faster than a corre- sponding find call). thanks to neal-who-never-sleeps for the tip. ........ r46280 | fredrik.lundh | 2006-05-26 12:27:17 +0200 (Fri, 26 May 2006) | 5 lines needforspeed: use Py_ssize_t for the fastsearch counter and skip length (thanks, neal!). and yes, I've verified that this doesn't slow things down ;-) ........ r46285 | andrew.dalke | 2006-05-26 13:11:38 +0200 (Fri, 26 May 2006) | 2 lines Added a few more test cases for whitespace split. These strings have leading whitespace. ........ r46286 | jack.diederich | 2006-05-26 13:15:17 +0200 (Fri, 26 May 2006) | 1 line use Py_ssize_t in places that may need it ........ r46287 | andrew.dalke | 2006-05-26 13:15:22 +0200 (Fri, 26 May 2006) | 2 lines Added split whitespace checks for characters other than space. ........ r46288 | ronald.oussoren | 2006-05-26 13:17:55 +0200 (Fri, 26 May 2006) | 2 lines Fix buglet in postinstall script, it would generate an invalid .cshrc file. ........ r46290 | georg.brandl | 2006-05-26 13:26:11 +0200 (Fri, 26 May 2006) | 3 lines Add "partition" to UserString. ........ r46291 | fredrik.lundh | 2006-05-26 13:29:39 +0200 (Fri, 26 May 2006) | 5 lines needforspeed: added Py_LOCAL macro, based on the LOCAL macro used for SRE and others. applied Py_LOCAL to relevant portion of ceval, which gives a 1-2% speedup on my machine. ymmv. ........ r46292 | jack.diederich | 2006-05-26 13:37:20 +0200 (Fri, 26 May 2006) | 1 line when generating python code prefer to generate valid python code ........ r46293 | fredrik.lundh | 2006-05-26 13:38:15 +0200 (Fri, 26 May 2006) | 3 lines use Py_LOCAL also for string and unicode objects ........ r46294 | ronald.oussoren | 2006-05-26 13:38:39 +0200 (Fri, 26 May 2006) | 12 lines - Search the sqlite specific search directories after the normal include directories when looking for the version of sqlite to use. - On OSX: * Extract additional include and link directories from the CFLAGS and LDFLAGS, if the user has bothered to specify them we might as wel use them. * Add '-Wl,-search_paths_first' to the extra_link_args for readline and sqlite. This makes it possible to use a static library to override the system provided dynamic library. ........ r46295 | ronald.oussoren | 2006-05-26 13:43:26 +0200 (Fri, 26 May 2006) | 6 lines Integrate installing a framework in the 'make install' target. Until now users had to use 'make frameworkinstall' to install python when it is configured with '--enable-framework'. This tends to confuse users that don't hunt for readme files hidden in platform specific directories :-) ........ r46297 | fredrik.lundh | 2006-05-26 13:54:04 +0200 (Fri, 26 May 2006) | 4 lines needforspeed: added PY_LOCAL_AGGRESSIVE macro to enable "aggressive" LOCAL inlining; also added some missing whitespace ........ r46298 | andrew.kuchling | 2006-05-26 14:01:44 +0200 (Fri, 26 May 2006) | 1 line Typo fixes ........ r46299 | fredrik.lundh | 2006-05-26 14:01:49 +0200 (Fri, 26 May 2006) | 4 lines Py_LOCAL shouldn't be used for data; it works for some .NET 2003 compilers, but Trent's copy thinks that it's an anachronism... ........ r46300 | martin.blais | 2006-05-26 14:03:27 +0200 (Fri, 26 May 2006) | 12 lines Support for buffer protocol for socket and struct. * Added socket.recv_buf() and socket.recvfrom_buf() methods, that use the buffer protocol (send and sendto already did). * Added struct.pack_to(), that is the corresponding buffer compatible method to unpack_from(). * Fixed minor typos in arraymodule. ........ r46302 | ronald.oussoren | 2006-05-26 14:23:20 +0200 (Fri, 26 May 2006) | 6 lines - Remove previous version of the binary distribution script for OSX - Some small bugfixes for the IDLE.app wrapper - Tweaks to build-installer to ensure that python gets build in the right way, including sqlite3. - Updated readme files ........ r46305 | tim.peters | 2006-05-26 14:26:21 +0200 (Fri, 26 May 2006) | 2 lines Whitespace normalization. ........ r46307 | andrew.dalke | 2006-05-26 14:28:15 +0200 (Fri, 26 May 2006) | 7 lines I like tests. The new split functions use a preallocated list. Added tests which exceed the preallocation size, to exercise list appends/resizes. Also added more edge case tests. ........ r46308 | andrew.dalke | 2006-05-26 14:31:00 +0200 (Fri, 26 May 2006) | 2 lines Test cases for off-by-one errors in string split with multicharacter pattern. ........ r46309 | tim.peters | 2006-05-26 14:31:20 +0200 (Fri, 26 May 2006) | 2 lines Whitespace normalization. ........ r46313 | andrew.kuchling | 2006-05-26 14:39:48 +0200 (Fri, 26 May 2006) | 1 line Add str.partition() ........ r46314 | bob.ippolito | 2006-05-26 14:52:53 +0200 (Fri, 26 May 2006) | 1 line quick hack to fix busted binhex test ........ r46316 | andrew.dalke | 2006-05-26 15:05:55 +0200 (Fri, 26 May 2006) | 2 lines Added more rstrip tests, including for prealloc'ed arrays ........ r46320 | bob.ippolito | 2006-05-26 15:15:44 +0200 (Fri, 26 May 2006) | 1 line fix #1229380 No struct.pack exception for some out of range integers ........ r46325 | tim.peters | 2006-05-26 15:39:17 +0200 (Fri, 26 May 2006) | 2 lines Use open() to open files (was using file()). ........ r46327 | andrew.dalke | 2006-05-26 16:00:45 +0200 (Fri, 26 May 2006) | 37 lines Changes to string.split/rsplit on whitespace to preallocate space in the results list. Originally it allocated 0 items and used the list growth during append. Now it preallocates 12 items so the first few appends don't need list reallocs. ("Here are some words ."*2).split(None, 1) is 7% faster ("Here are some words ."*2).split() is is 15% faster (Your milage may vary, see dealership for details.) File parsing like this for line in f: count += len(line.split()) is also about 15% faster. There is a slowdown of about 3% for large strings because of the additional overhead of checking if the append is to a preallocated region of the list or not. This will be the rare case. It could be improved with special case code but we decided it was not useful enough. There is a cost of 12*sizeof(PyObject *) bytes per list. For the normal case of file parsing this is not a problem because of the lists have a short lifetime. We have not come up with cases where this is a problem in real life. I chose 12 because human text averages about 11 words per line in books, one of my data sets averages 6.2 words with a final peak at 11 words per line, and I work with a tab delimited data set with 8 tabs per line (or 9 words per line). 12 encompasses all of these. Also changed the last rstrip code to append then reverse, rather than doing insert(0). The strip() and rstrip() times are now comparable. ........ r46328 | tim.peters | 2006-05-26 16:02:05 +0200 (Fri, 26 May 2006) | 5 lines Explicitly close files. I'm trying to stop the frequent spurious test_tarfile failures on Windows buildbots, but it's hard to know how since the regrtest failure output is useless here, and it never fails when a buildbot slave runs test_tarfile the second time in verbose mode. ........ r46329 | andrew.kuchling | 2006-05-26 16:03:41 +0200 (Fri, 26 May 2006) | 1 line Add buffer support for struct, socket ........ r46330 | andrew.kuchling | 2006-05-26 16:04:19 +0200 (Fri, 26 May 2006) | 1 line Typo fix ........ r46331 | bob.ippolito | 2006-05-26 16:07:23 +0200 (Fri, 26 May 2006) | 1 line Fix distutils so that libffi will cross-compile between darwin/x86 and darwin/ppc ........ r46333 | bob.ippolito | 2006-05-26 16:23:21 +0200 (Fri, 26 May 2006) | 1 line Fix _struct typo that broke some 64-bit platforms ........ r46335 | bob.ippolito | 2006-05-26 16:29:35 +0200 (Fri, 26 May 2006) | 1 line Enable PY_USE_INT_WHEN_POSSIBLE in struct ........ r46343 | andrew.dalke | 2006-05-26 17:21:01 +0200 (Fri, 26 May 2006) | 2 lines Eeked out another 3% or so performance in split whitespace by cleaning up the algorithm. ........ r46352 | andrew.dalke | 2006-05-26 18:22:52 +0200 (Fri, 26 May 2006) | 3 lines Test for more edge strip cases; leading and trailing separator gets removed even with strip(..., 0) ........ r46354 | bob.ippolito | 2006-05-26 18:23:28 +0200 (Fri, 26 May 2006) | 1 line fix signed/unsigned mismatch in struct ........ r46355 | steve.holden | 2006-05-26 18:27:59 +0200 (Fri, 26 May 2006) | 5 lines Add -t option to allow easy test selection. Action verbose option correctly. Tweak operation counts. Add empty and new instances tests. Enable comparisons across different warp factors. Change version. ........ r46356 | fredrik.lundh | 2006-05-26 18:32:42 +0200 (Fri, 26 May 2006) | 3 lines needforspeed: use Py_LOCAL on a few more locals in stringobject.c ........ r46357 | thomas.heller | 2006-05-26 18:42:44 +0200 (Fri, 26 May 2006) | 4 lines For now, I gave up with automatic conversion of reST to Python-latex, so I'm writing this in latex now. Skeleton for the ctypes reference. ........ r46358 | tim.peters | 2006-05-26 18:49:28 +0200 (Fri, 26 May 2006) | 3 lines Repair Windows compiler warnings about mixing signed and unsigned integral types in comparisons. ........ r46359 | tim.peters | 2006-05-26 18:52:04 +0200 (Fri, 26 May 2006) | 2 lines Whitespace normalization. ........ r46360 | tim.peters | 2006-05-26 18:53:04 +0200 (Fri, 26 May 2006) | 2 lines Add missing svn:eol-style property to text files. ........ r46362 | fredrik.lundh | 2006-05-26 19:04:58 +0200 (Fri, 26 May 2006) | 3 lines needforspeed: stringlib refactoring (in progress) ........ r46363 | thomas.heller | 2006-05-26 19:18:33 +0200 (Fri, 26 May 2006) | 1 line Write some docs. ........ r46364 | fredrik.lundh | 2006-05-26 19:22:38 +0200 (Fri, 26 May 2006) | 3 lines needforspeed: stringlib refactoring (in progress) ........ r46366 | fredrik.lundh | 2006-05-26 19:26:39 +0200 (Fri, 26 May 2006) | 3 lines needforspeed: cleanup ........ r46367 | fredrik.lundh | 2006-05-26 19:31:41 +0200 (Fri, 26 May 2006) | 4 lines needforspeed: remove remaining USE_FAST macros; if fastsearch was broken, someone would have noticed by now ;-) ........ r46368 | steve.holden | 2006-05-26 19:41:32 +0200 (Fri, 26 May 2006) | 5 lines Use minimum calibration time rather than avergae to avoid the illusion of negative run times. Halt with an error if run times go below 10 ms, indicating that results will be unreliable. ........ r46370 | thomas.heller | 2006-05-26 19:47:40 +0200 (Fri, 26 May 2006) | 2 lines Reordered, and wrote more docs. ........ r46372 | georg.brandl | 2006-05-26 20:03:31 +0200 (Fri, 26 May 2006) | 9 lines Need for speed: Patch #921466 : sys.path_importer_cache is now used to cache valid and invalid file paths for the built-in import machinery which leads to fewer open calls on startup. Also fix issue with PEP 302 style import hooks which lead to more open() calls than necessary. ........ r46373 | fredrik.lundh | 2006-05-26 20:05:34 +0200 (Fri, 26 May 2006) | 3 lines removed unnecessary include ........ r46377 | fredrik.lundh | 2006-05-26 20:15:38 +0200 (Fri, 26 May 2006) | 3 lines needforspeed: added rpartition implementation ........ r46380 | fredrik.lundh | 2006-05-26 20:24:15 +0200 (Fri, 26 May 2006) | 5 lines needspeed: rpartition documentation, tests, and a bug fixes. feel free to add more tests and improve the documentation. ........ r46381 | steve.holden | 2006-05-26 20:26:21 +0200 (Fri, 26 May 2006) | 4 lines Revert tests to MAL's original round sizes to retiain comparability from long ago and far away. Stop calling this pybench 1.4 because it isn't. Remove the empty test, which was a bad idea. ........ r46387 | andrew.kuchling | 2006-05-26 20:41:18 +0200 (Fri, 26 May 2006) | 1 line Add rpartition() and path caching ........ r46388 | andrew.dalke | 2006-05-26 21:02:09 +0200 (Fri, 26 May 2006) | 10 lines substring split now uses /F's fast string matching algorithm. (If compiled without FAST search support, changed the pre-memcmp test to check the last character as well as the first. This gave a 25% speedup for my test case.) Rewrote the split algorithms so they stop when maxsplit gets to 0. Previously they did a string match first then checked if the maxsplit was reached. The new way prevents a needless string search. ........ r46391 | brett.cannon | 2006-05-26 21:04:47 +0200 (Fri, 26 May 2006) | 2 lines Change C spacing to 4 spaces by default to match PEP 7 for new C files. ........ r46392 | georg.brandl | 2006-05-26 21:04:47 +0200 (Fri, 26 May 2006) | 3 lines Exception isn't the root of all exception classes anymore. ........ r46397 | fredrik.lundh | 2006-05-26 21:23:21 +0200 (Fri, 26 May 2006) | 3 lines added rpartition method to UserString class ........ r46398 | fredrik.lundh | 2006-05-26 21:24:53 +0200 (Fri, 26 May 2006) | 4 lines needforspeed: stringlib refactoring, continued. added count and find helpers; updated unicodeobject to use stringlib_count ........ r46400 | fredrik.lundh | 2006-05-26 21:29:05 +0200 (Fri, 26 May 2006) | 4 lines needforspeed: stringlib refactoring: use stringlib/find for unicode find ........ r46403 | fredrik.lundh | 2006-05-26 21:33:03 +0200 (Fri, 26 May 2006) | 3 lines needforspeed: use a macro to fix slice indexes ........ r46404 | thomas.heller | 2006-05-26 21:43:45 +0200 (Fri, 26 May 2006) | 1 line Write more docs. ........ r46406 | fredrik.lundh | 2006-05-26 21:48:07 +0200 (Fri, 26 May 2006) | 3 lines needforspeed: stringlib refactoring: use stringlib/find for string find ........ r46407 | andrew.kuchling | 2006-05-26 21:51:10 +0200 (Fri, 26 May 2006) | 1 line Comment typo ........ r46409 | georg.brandl | 2006-05-26 22:04:44 +0200 (Fri, 26 May 2006) | 3 lines Replace Py_BuildValue("OO") by PyTuple_Pack. ........ r46411 | georg.brandl | 2006-05-26 22:14:47 +0200 (Fri, 26 May 2006) | 2 lines Patch #1492218: document None being a constant. ........ r46415 | georg.brandl | 2006-05-26 22:22:50 +0200 (Fri, 26 May 2006) | 3 lines Simplify calling. ........ r46416 | andrew.dalke | 2006-05-26 22:25:22 +0200 (Fri, 26 May 2006) | 4 lines Added limits to the replace code so it does not count all of the matching patterns in a string, only the number needed by the max limit. ........ r46417 | bob.ippolito | 2006-05-26 22:25:23 +0200 (Fri, 26 May 2006) | 1 line enable all of the struct tests, use ssize_t, fix some whitespace ........ r46418 | tim.peters | 2006-05-26 22:56:56 +0200 (Fri, 26 May 2006) | 2 lines Record Iceland sprint attendees. ........ r46421 | tim.peters | 2006-05-26 23:51:13 +0200 (Fri, 26 May 2006) | 2 lines Whitespace normalization. ........ r46422 | steve.holden | 2006-05-27 00:17:54 +0200 (Sat, 27 May 2006) | 2 lines Add Richard Tew to developers ........ r46423 | steve.holden | 2006-05-27 00:33:20 +0200 (Sat, 27 May 2006) | 2 lines Update help text and documentaition. ........ r46424 | steve.holden | 2006-05-27 00:39:27 +0200 (Sat, 27 May 2006) | 2 lines Blasted typos ... ........ r46425 | andrew.dalke | 2006-05-27 00:49:03 +0200 (Sat, 27 May 2006) | 2 lines Added description of why splitlines doesn't use the prealloc strategy ........ r46426 | tim.peters | 2006-05-27 01:14:37 +0200 (Sat, 27 May 2006) | 19 lines Patch 1145039. set_exc_info(), reset_exc_info(): By exploiting the likely (who knows?) invariant that when an exception's `type` is NULL, its `value` and `traceback` are also NULL, save some cycles in heavily-executed code. This is a "a kronar saved is a kronar earned" patch: the speedup isn't reliably measurable, but it obviously does reduce the operation count in the normal (no exception raised) path through PyEval_EvalFrameEx(). The tim-exc_sanity branch tries to push this harder, but is still blowing up (at least in part due to pre-existing subtle bugs that appear to have no other visible consequences!). Not a bugfix candidate. ........ r46429 | steve.holden | 2006-05-27 02:51:52 +0200 (Sat, 27 May 2006) | 2 lines Reinstate new-style object tests. ........ r46430 | neal.norwitz | 2006-05-27 07:18:57 +0200 (Sat, 27 May 2006) | 1 line Fix compiler warning (and whitespace) on Mac OS 10.4. (A lot of this code looked duplicated, I wonder if a utility function could help reduce the duplication here.) ........ r46431 | neal.norwitz | 2006-05-27 07:21:30 +0200 (Sat, 27 May 2006) | 4 lines Fix Coverity warnings. - Check the correct variable (str_obj, not str) for NULL - sep_len was already verified it wasn't 0 ........ r46432 | martin.v.loewis | 2006-05-27 10:36:52 +0200 (Sat, 27 May 2006) | 2 lines Patch 1494554: Update numeric properties to Unicode 4.1. ........ r46433 | martin.v.loewis | 2006-05-27 10:54:29 +0200 (Sat, 27 May 2006) | 2 lines Explain why 'consumed' is initialized. ........ r46436 | fredrik.lundh | 2006-05-27 12:05:10 +0200 (Sat, 27 May 2006) | 3 lines needforspeed: more stringlib refactoring ........ r46438 | fredrik.lundh | 2006-05-27 12:39:48 +0200 (Sat, 27 May 2006) | 5 lines needforspeed: backed out the Py_LOCAL-isation of ceval; the massive in- lining killed performance on certain Intel boxes, and the "aggressive" macro itself gives most of the benefits on others. ........ r46439 | andrew.dalke | 2006-05-27 13:04:36 +0200 (Sat, 27 May 2006) | 2 lines fixed typo ........ r46440 | martin.v.loewis | 2006-05-27 13:07:49 +0200 (Sat, 27 May 2006) | 2 lines Revert bogus change committed in 46432 to this file. ........ r46444 | andrew.kuchling | 2006-05-27 13:26:33 +0200 (Sat, 27 May 2006) | 1 line Add Py_LOCAL macros ........ r46450 | bob.ippolito | 2006-05-27 13:47:12 +0200 (Sat, 27 May 2006) | 1 line Remove the range checking and int usage #defines from _struct and strip out the now-dead code ........ r46454 | bob.ippolito | 2006-05-27 14:11:36 +0200 (Sat, 27 May 2006) | 1 line Fix up struct docstrings, add struct.pack_to function for symmetry ........ r46456 | richard.jones | 2006-05-27 14:29:24 +0200 (Sat, 27 May 2006) | 2 lines Conversion of exceptions over from faked-up classes to new-style C types. ........ r46457 | georg.brandl | 2006-05-27 14:30:25 +0200 (Sat, 27 May 2006) | 3 lines Add news item for new-style exception class branch merge. ........ r46458 | tim.peters | 2006-05-27 14:36:53 +0200 (Sat, 27 May 2006) | 3 lines More random thrashing trying to understand spurious Windows failures. Who's keeping a bz2 file open? ........ r46460 | andrew.kuchling | 2006-05-27 15:44:37 +0200 (Sat, 27 May 2006) | 1 line Mention new-style exceptions ........ r46461 | richard.jones | 2006-05-27 15:50:42 +0200 (Sat, 27 May 2006) | 1 line credit where credit is due ........ r46462 | georg.brandl | 2006-05-27 16:02:03 +0200 (Sat, 27 May 2006) | 3 lines Always close BZ2Proxy object. Remove unnecessary struct usage. ........ r46463 | tim.peters | 2006-05-27 16:13:13 +0200 (Sat, 27 May 2006) | 2 lines The cheery optimism of old age. ........ r46464 | andrew.dalke | 2006-05-27 16:16:40 +0200 (Sat, 27 May 2006) | 2 lines cleanup - removed trailing whitespace ........ r46465 | georg.brandl | 2006-05-27 16:41:55 +0200 (Sat, 27 May 2006) | 3 lines Remove spurious semicolons after macro invocations. ........ r46468 | fredrik.lundh | 2006-05-27 16:58:20 +0200 (Sat, 27 May 2006) | 4 lines needforspeed: replace improvements, changed to Py_LOCAL_INLINE where appropriate ........ r46469 | fredrik.lundh | 2006-05-27 17:20:22 +0200 (Sat, 27 May 2006) | 4 lines needforspeed: stringlib refactoring: changed find_obj to find_slice, to enable use from stringobject ........ r46470 | fredrik.lundh | 2006-05-27 17:26:19 +0200 (Sat, 27 May 2006) | 3 lines needforspeed: stringlib refactoring: use find_slice for stringobject ........ r46472 | kristjan.jonsson | 2006-05-27 17:41:31 +0200 (Sat, 27 May 2006) | 1 line Add a PCBuild8 build directory for building with Visual Studio .NET 2005. Contains a special project to perform profile guided optimizations on the pythoncore.dll, by instrumenting and running pybench.py ........ r46473 | jack.diederich | 2006-05-27 17:44:34 +0200 (Sat, 27 May 2006) | 3 lines needforspeed: use PyObject_MALLOC instead of system malloc for small allocations. Use PyMem_MALLOC for larger (1k+) chunks. 1%-2% speedup. ........ r46474 | bob.ippolito | 2006-05-27 17:53:49 +0200 (Sat, 27 May 2006) | 1 line fix struct regression on 64-bit platforms ........ r46475 | richard.jones | 2006-05-27 18:07:28 +0200 (Sat, 27 May 2006) | 1 line doc string additions and tweaks ........ r46477 | richard.jones | 2006-05-27 18:15:11 +0200 (Sat, 27 May 2006) | 1 line move semicolons ........ r46478 | george.yoshida | 2006-05-27 18:32:44 +0200 (Sat, 27 May 2006) | 2 lines minor markup nits ........ r46488 | george.yoshida | 2006-05-27 18:51:43 +0200 (Sat, 27 May 2006) | 3 lines End of Ch.3 is now about "with statement". Avoid obsolescence by directly referring to the section. ........ r46489 | george.yoshida | 2006-05-27 19:09:17 +0200 (Sat, 27 May 2006) | 2 lines fix typo ........
2006-05-27 16:21:47 -03:00
#include <stddef.h>
#include <string.h> /* memset(), memcpy() */
#include <assert.h>
2002-02-11 19:13:04 -04:00
#include "expat.h"
#ifdef XML_UNICODE
#define XML_ENCODE_MAX XML_UTF16_ENCODE_MAX
#define XmlConvert XmlUtf16Convert
#define XmlGetInternalEncoding XmlGetUtf16InternalEncoding
#define XmlGetInternalEncodingNS XmlGetUtf16InternalEncodingNS
#define XmlEncode XmlUtf16Encode
#define MUST_CONVERT(enc, s) (!(enc)->isUtf16 || (((unsigned long)s) & 1))
typedef unsigned short ICHAR;
#else
#define XML_ENCODE_MAX XML_UTF8_ENCODE_MAX
#define XmlConvert XmlUtf8Convert
#define XmlGetInternalEncoding XmlGetUtf8InternalEncoding
#define XmlGetInternalEncodingNS XmlGetUtf8InternalEncodingNS
#define XmlEncode XmlUtf8Encode
#define MUST_CONVERT(enc, s) (!(enc)->isUtf8)
typedef char ICHAR;
#endif
#ifndef XML_NS
#define XmlInitEncodingNS XmlInitEncoding
#define XmlInitUnknownEncodingNS XmlInitUnknownEncoding
#undef XmlGetInternalEncodingNS
#define XmlGetInternalEncodingNS XmlGetInternalEncoding
#define XmlParseXmlDeclNS XmlParseXmlDecl
#endif
2003-01-25 18:41:29 -04:00
#ifdef XML_UNICODE
2002-02-11 19:13:04 -04:00
#ifdef XML_UNICODE_WCHAR_T
2003-01-25 18:41:29 -04:00
#define XML_T(x) (const wchar_t)x
#define XML_L(x) L ## x
#else
#define XML_T(x) (const unsigned short)x
#define XML_L(x) x
#endif
2002-02-11 19:13:04 -04:00
#else
2003-01-25 18:41:29 -04:00
2002-02-11 19:13:04 -04:00
#define XML_T(x) x
2003-01-25 18:41:29 -04:00
#define XML_L(x) x
2002-02-11 19:13:04 -04:00
#endif
/* Round up n to be a multiple of sz, where sz is a power of 2. */
#define ROUND_UP(n, sz) (((n) + ((sz) - 1)) & ~((sz) - 1))
/* Handle the case where memmove() doesn't exist. */
#ifndef HAVE_MEMMOVE
#ifdef HAVE_BCOPY
#define memmove(d,s,l) bcopy((s),(d),(l))
#else
#error memmove does not exist on this platform, nor is a substitute available
#endif /* HAVE_BCOPY */
#endif /* HAVE_MEMMOVE */
2003-01-25 18:41:29 -04:00
#include "internal.h"
2002-02-11 19:13:04 -04:00
#include "xmltok.h"
#include "xmlrole.h"
typedef const XML_Char *KEY;
typedef struct {
KEY name;
} NAMED;
typedef struct {
NAMED **v;
unsigned char power;
2002-02-11 19:13:04 -04:00
size_t size;
size_t used;
2003-01-25 18:41:29 -04:00
const XML_Memory_Handling_Suite *mem;
2002-02-11 19:13:04 -04:00
} HASH_TABLE;
/* Basic character hash algorithm, taken from Python's string hash:
h = h * 1000003 ^ character, the constant being a prime number.
*/
#ifdef XML_UNICODE
#define CHAR_HASH(h, c) \
(((h) * 0xF4243) ^ (unsigned short)(c))
#else
#define CHAR_HASH(h, c) \
(((h) * 0xF4243) ^ (unsigned char)(c))
#endif
/* For probing (after a collision) we need a step size relative prime
to the hash table size, which is a power of 2. We use double-hashing,
since we can calculate a second hash value cheaply by taking those bits
of the first hash value that were discarded (masked out) when the table
index was calculated: index = hash & mask, where mask = table->size - 1.
We limit the maximum step size to table->size / 4 (mask >> 2) and make
it odd, since odd numbers are always relative prime to a power of 2.
*/
#define SECOND_HASH(hash, mask, power) \
((((hash) & ~(mask)) >> ((power) - 1)) & ((mask) >> 2))
#define PROBE_STEP(hash, mask, power) \
((unsigned char)((SECOND_HASH(hash, mask, power)) | 1))
2002-02-11 19:13:04 -04:00
typedef struct {
NAMED **p;
NAMED **end;
} HASH_TABLE_ITER;
#define INIT_TAG_BUF_SIZE 32 /* must be a multiple of sizeof(XML_Char) */
#define INIT_DATA_BUF_SIZE 1024
#define INIT_ATTS_SIZE 16
#define INIT_ATTS_VERSION 0xFFFFFFFF
2002-02-11 19:13:04 -04:00
#define INIT_BLOCK_SIZE 1024
#define INIT_BUFFER_SIZE 1024
#define EXPAND_SPARE 24
typedef struct binding {
struct prefix *prefix;
struct binding *nextTagBinding;
struct binding *prevPrefixBinding;
const struct attribute_id *attId;
XML_Char *uri;
int uriLen;
int uriAlloc;
} BINDING;
typedef struct prefix {
const XML_Char *name;
BINDING *binding;
} PREFIX;
typedef struct {
const XML_Char *str;
const XML_Char *localPart;
2003-01-25 18:41:29 -04:00
const XML_Char *prefix;
int strLen;
2002-02-11 19:13:04 -04:00
int uriLen;
2003-01-25 18:41:29 -04:00
int prefixLen;
2002-02-11 19:13:04 -04:00
} TAG_NAME;
2003-01-25 18:41:29 -04:00
/* TAG represents an open element.
The name of the element is stored in both the document and API
encodings. The memory buffer 'buf' is a separately-allocated
memory area which stores the name. During the XML_Parse()/
XMLParseBuffer() when the element is open, the memory for the 'raw'
version of the name (in the document encoding) is shared with the
document buffer. If the element is open across calls to
XML_Parse()/XML_ParseBuffer(), the buffer is re-allocated to
contain the 'raw' name as well.
A parser re-uses these structures, maintaining a list of allocated
TAG objects in a free list.
*/
2002-02-11 19:13:04 -04:00
typedef struct tag {
2003-01-25 18:41:29 -04:00
struct tag *parent; /* parent of this element */
const char *rawName; /* tagName in the original encoding */
2002-02-11 19:13:04 -04:00
int rawNameLength;
2003-01-25 18:41:29 -04:00
TAG_NAME name; /* tagName in the API encoding */
char *buf; /* buffer for name components */
char *bufEnd; /* end of the buffer */
2002-02-11 19:13:04 -04:00
BINDING *bindings;
} TAG;
typedef struct {
const XML_Char *name;
const XML_Char *textPtr;
2004-08-03 04:06:22 -03:00
int textLen; /* length in XML_Chars */
int processed; /* # of processed bytes - when suspended */
2002-02-11 19:13:04 -04:00
const XML_Char *systemId;
const XML_Char *base;
const XML_Char *publicId;
const XML_Char *notation;
2003-01-25 18:41:29 -04:00
XML_Bool open;
XML_Bool is_param;
XML_Bool is_internal; /* true if declared in internal subset outside PE */
2002-02-11 19:13:04 -04:00
} ENTITY;
typedef struct {
2003-01-25 18:41:29 -04:00
enum XML_Content_Type type;
enum XML_Content_Quant quant;
const XML_Char * name;
int firstchild;
int lastchild;
int childcnt;
int nextsib;
2002-02-11 19:13:04 -04:00
} CONTENT_SCAFFOLD;
2003-01-25 18:41:29 -04:00
#define INIT_SCAFFOLD_ELEMENTS 32
2002-02-11 19:13:04 -04:00
typedef struct block {
struct block *next;
int size;
XML_Char s[1];
} BLOCK;
typedef struct {
BLOCK *blocks;
BLOCK *freeBlocks;
const XML_Char *end;
XML_Char *ptr;
XML_Char *start;
2003-01-25 18:41:29 -04:00
const XML_Memory_Handling_Suite *mem;
2002-02-11 19:13:04 -04:00
} STRING_POOL;
/* The XML_Char before the name is used to determine whether
2003-01-25 18:41:29 -04:00
an attribute has been specified. */
2002-02-11 19:13:04 -04:00
typedef struct attribute_id {
XML_Char *name;
PREFIX *prefix;
2003-01-25 18:41:29 -04:00
XML_Bool maybeTokenized;
XML_Bool xmlns;
2002-02-11 19:13:04 -04:00
} ATTRIBUTE_ID;
typedef struct {
const ATTRIBUTE_ID *id;
2003-01-25 18:41:29 -04:00
XML_Bool isCdata;
2002-02-11 19:13:04 -04:00
const XML_Char *value;
} DEFAULT_ATTRIBUTE;
typedef struct {
unsigned long version;
unsigned long hash;
const XML_Char *uriName;
} NS_ATT;
2002-02-11 19:13:04 -04:00
typedef struct {
const XML_Char *name;
PREFIX *prefix;
const ATTRIBUTE_ID *idAtt;
int nDefaultAtts;
int allocDefaultAtts;
DEFAULT_ATTRIBUTE *defaultAtts;
} ELEMENT_TYPE;
typedef struct {
HASH_TABLE generalEntities;
HASH_TABLE elementTypes;
HASH_TABLE attributeIds;
HASH_TABLE prefixes;
STRING_POOL pool;
2003-01-25 18:41:29 -04:00
STRING_POOL entityValuePool;
/* false once a parameter entity reference has been skipped */
XML_Bool keepProcessing;
/* true once an internal or external PE reference has been encountered;
this includes the reference to an external subset */
XML_Bool hasParamEntityRefs;
XML_Bool standalone;
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
/* indicates if external PE has been read */
XML_Bool paramEntityRead;
2002-02-11 19:13:04 -04:00
HASH_TABLE paramEntities;
#endif /* XML_DTD */
PREFIX defaultPrefix;
/* === scaffolding for building content model === */
2003-01-25 18:41:29 -04:00
XML_Bool in_eldecl;
2002-02-11 19:13:04 -04:00
CONTENT_SCAFFOLD *scaffold;
unsigned contentStringLen;
unsigned scaffSize;
unsigned scaffCount;
int scaffLevel;
int *scaffIndex;
} DTD;
typedef struct open_internal_entity {
const char *internalEventPtr;
const char *internalEventEndPtr;
struct open_internal_entity *next;
ENTITY *entity;
2004-08-03 04:06:22 -03:00
int startTagLevel;
XML_Bool betweenDecl; /* WFC: PE Between Declarations */
2002-02-11 19:13:04 -04:00
} OPEN_INTERNAL_ENTITY;
2003-01-25 18:41:29 -04:00
typedef enum XML_Error PTRCALL Processor(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr);
2002-02-11 19:13:04 -04:00
static Processor prologProcessor;
static Processor prologInitProcessor;
static Processor contentProcessor;
static Processor cdataSectionProcessor;
#ifdef XML_DTD
static Processor ignoreSectionProcessor;
2003-01-25 18:41:29 -04:00
static Processor externalParEntProcessor;
static Processor externalParEntInitProcessor;
static Processor entityValueProcessor;
static Processor entityValueInitProcessor;
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
static Processor epilogProcessor;
static Processor errorProcessor;
static Processor externalEntityInitProcessor;
static Processor externalEntityInitProcessor2;
static Processor externalEntityInitProcessor3;
static Processor externalEntityContentProcessor;
2004-08-03 04:06:22 -03:00
static Processor internalEntityProcessor;
2002-02-11 19:13:04 -04:00
static enum XML_Error
handleUnknownEncoding(XML_Parser parser, const XML_Char *encodingName);
static enum XML_Error
2003-01-25 18:41:29 -04:00
processXmlDecl(XML_Parser parser, int isGeneralTextEntity,
2004-08-03 04:06:22 -03:00
const char *s, const char *next);
2002-02-11 19:13:04 -04:00
static enum XML_Error
initializeEncoding(XML_Parser parser);
static enum XML_Error
2004-08-03 04:06:22 -03:00
doProlog(XML_Parser parser, const ENCODING *enc, const char *s,
const char *end, int tok, const char *next, const char **nextPtr,
XML_Bool haveMore);
2002-02-11 19:13:04 -04:00
static enum XML_Error
2004-08-03 04:06:22 -03:00
processInternalEntity(XML_Parser parser, ENTITY *entity,
XML_Bool betweenDecl);
2002-02-11 19:13:04 -04:00
static enum XML_Error
doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc,
2004-08-03 04:06:22 -03:00
const char *start, const char *end, const char **endPtr,
XML_Bool haveMore);
2002-02-11 19:13:04 -04:00
static enum XML_Error
2003-01-25 18:41:29 -04:00
doCdataSection(XML_Parser parser, const ENCODING *, const char **startPtr,
2004-08-03 04:06:22 -03:00
const char *end, const char **nextPtr, XML_Bool haveMore);
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
static enum XML_Error
2003-01-25 18:41:29 -04:00
doIgnoreSection(XML_Parser parser, const ENCODING *, const char **startPtr,
2004-08-03 04:06:22 -03:00
const char *end, const char **nextPtr, XML_Bool haveMore);
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
static enum XML_Error
storeAtts(XML_Parser parser, const ENCODING *, const char *s,
TAG_NAME *tagNamePtr, BINDING **bindingsPtr);
2003-01-25 18:41:29 -04:00
static enum XML_Error
addBinding(XML_Parser parser, PREFIX *prefix, const ATTRIBUTE_ID *attId,
const XML_Char *uri, BINDING **bindingsPtr);
2002-02-11 19:13:04 -04:00
static int
2004-08-03 04:06:22 -03:00
defineAttribute(ELEMENT_TYPE *type, ATTRIBUTE_ID *, XML_Bool isCdata,
XML_Bool isId, const XML_Char *dfltValue, XML_Parser parser);
2002-02-11 19:13:04 -04:00
static enum XML_Error
2003-01-25 18:41:29 -04:00
storeAttributeValue(XML_Parser parser, const ENCODING *, XML_Bool isCdata,
const char *, const char *, STRING_POOL *);
2002-02-11 19:13:04 -04:00
static enum XML_Error
2003-01-25 18:41:29 -04:00
appendAttributeValue(XML_Parser parser, const ENCODING *, XML_Bool isCdata,
const char *, const char *, STRING_POOL *);
2002-02-11 19:13:04 -04:00
static ATTRIBUTE_ID *
2003-01-25 18:41:29 -04:00
getAttributeId(XML_Parser parser, const ENCODING *enc, const char *start,
const char *end);
static int
setElementTypePrefix(XML_Parser parser, ELEMENT_TYPE *);
2002-02-11 19:13:04 -04:00
static enum XML_Error
2003-01-25 18:41:29 -04:00
storeEntityValue(XML_Parser parser, const ENCODING *enc, const char *start,
const char *end);
2002-02-11 19:13:04 -04:00
static int
2003-01-25 18:41:29 -04:00
reportProcessingInstruction(XML_Parser parser, const ENCODING *enc,
const char *start, const char *end);
2002-02-11 19:13:04 -04:00
static int
2003-01-25 18:41:29 -04:00
reportComment(XML_Parser parser, const ENCODING *enc, const char *start,
const char *end);
2002-02-11 19:13:04 -04:00
static void
2003-01-25 18:41:29 -04:00
reportDefault(XML_Parser parser, const ENCODING *enc, const char *start,
const char *end);
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
static const XML_Char * getContext(XML_Parser parser);
static XML_Bool
setContext(XML_Parser parser, const XML_Char *context);
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
static void FASTCALL normalizePublicId(XML_Char *s);
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
static DTD * dtdCreate(const XML_Memory_Handling_Suite *ms);
/* do not call if parentParser != NULL */
static void dtdReset(DTD *p, const XML_Memory_Handling_Suite *ms);
static void
dtdDestroy(DTD *p, XML_Bool isDocEntity, const XML_Memory_Handling_Suite *ms);
static int
dtdCopy(DTD *newDtd, const DTD *oldDtd, const XML_Memory_Handling_Suite *ms);
static int
copyEntityTable(HASH_TABLE *, STRING_POOL *, const HASH_TABLE *);
static NAMED *
lookup(HASH_TABLE *table, KEY name, size_t createSize);
static void FASTCALL
hashTableInit(HASH_TABLE *, const XML_Memory_Handling_Suite *ms);
static void FASTCALL hashTableClear(HASH_TABLE *);
static void FASTCALL hashTableDestroy(HASH_TABLE *);
static void FASTCALL
hashTableIterInit(HASH_TABLE_ITER *, const HASH_TABLE *);
static NAMED * FASTCALL hashTableIterNext(HASH_TABLE_ITER *);
static void FASTCALL
poolInit(STRING_POOL *, const XML_Memory_Handling_Suite *ms);
static void FASTCALL poolClear(STRING_POOL *);
static void FASTCALL poolDestroy(STRING_POOL *);
static XML_Char *
poolAppend(STRING_POOL *pool, const ENCODING *enc,
const char *ptr, const char *end);
static XML_Char *
poolStoreString(STRING_POOL *pool, const ENCODING *enc,
const char *ptr, const char *end);
static XML_Bool FASTCALL poolGrow(STRING_POOL *pool);
static const XML_Char * FASTCALL
poolCopyString(STRING_POOL *pool, const XML_Char *s);
static const XML_Char *
poolCopyStringN(STRING_POOL *pool, const XML_Char *s, int n);
static const XML_Char * FASTCALL
poolAppendString(STRING_POOL *pool, const XML_Char *s);
static int FASTCALL nextScaffoldPart(XML_Parser parser);
static XML_Content * build_model(XML_Parser parser);
static ELEMENT_TYPE *
getElementType(XML_Parser parser, const ENCODING *enc,
const char *ptr, const char *end);
static XML_Parser
parserCreate(const XML_Char *encodingName,
const XML_Memory_Handling_Suite *memsuite,
const XML_Char *nameSep,
DTD *dtd);
static void
parserInit(XML_Parser parser, const XML_Char *encodingName);
2002-02-11 19:13:04 -04:00
#define poolStart(pool) ((pool)->start)
#define poolEnd(pool) ((pool)->ptr)
#define poolLength(pool) ((pool)->ptr - (pool)->start)
#define poolChop(pool) ((void)--(pool->ptr))
#define poolLastChar(pool) (((pool)->ptr)[-1])
#define poolDiscard(pool) ((pool)->ptr = (pool)->start)
#define poolFinish(pool) ((pool)->start = (pool)->ptr)
#define poolAppendChar(pool, c) \
(((pool)->ptr == (pool)->end && !poolGrow(pool)) \
? 0 \
: ((*((pool)->ptr)++ = c), 1))
2003-01-25 18:41:29 -04:00
struct XML_ParserStruct {
/* The first member must be userData so that the XML_GetUserData
macro works. */
2002-02-11 19:13:04 -04:00
void *m_userData;
void *m_handlerArg;
char *m_buffer;
2003-01-25 18:41:29 -04:00
const XML_Memory_Handling_Suite m_mem;
2002-02-11 19:13:04 -04:00
/* first character to be parsed */
const char *m_bufferPtr;
/* past last character to be parsed */
char *m_bufferEnd;
/* allocated end of buffer */
const char *m_bufferLim;
long m_parseEndByteIndex;
const char *m_parseEndPtr;
XML_Char *m_dataBuf;
XML_Char *m_dataBufEnd;
XML_StartElementHandler m_startElementHandler;
XML_EndElementHandler m_endElementHandler;
XML_CharacterDataHandler m_characterDataHandler;
XML_ProcessingInstructionHandler m_processingInstructionHandler;
XML_CommentHandler m_commentHandler;
XML_StartCdataSectionHandler m_startCdataSectionHandler;
XML_EndCdataSectionHandler m_endCdataSectionHandler;
XML_DefaultHandler m_defaultHandler;
XML_StartDoctypeDeclHandler m_startDoctypeDeclHandler;
XML_EndDoctypeDeclHandler m_endDoctypeDeclHandler;
XML_UnparsedEntityDeclHandler m_unparsedEntityDeclHandler;
XML_NotationDeclHandler m_notationDeclHandler;
XML_StartNamespaceDeclHandler m_startNamespaceDeclHandler;
XML_EndNamespaceDeclHandler m_endNamespaceDeclHandler;
XML_NotStandaloneHandler m_notStandaloneHandler;
XML_ExternalEntityRefHandler m_externalEntityRefHandler;
2003-01-25 18:41:29 -04:00
XML_Parser m_externalEntityRefHandlerArg;
XML_SkippedEntityHandler m_skippedEntityHandler;
2002-02-11 19:13:04 -04:00
XML_UnknownEncodingHandler m_unknownEncodingHandler;
XML_ElementDeclHandler m_elementDeclHandler;
XML_AttlistDeclHandler m_attlistDeclHandler;
XML_EntityDeclHandler m_entityDeclHandler;
XML_XmlDeclHandler m_xmlDeclHandler;
const ENCODING *m_encoding;
INIT_ENCODING m_initEncoding;
const ENCODING *m_internalEncoding;
const XML_Char *m_protocolEncodingName;
2003-01-25 18:41:29 -04:00
XML_Bool m_ns;
XML_Bool m_ns_triplets;
2002-02-11 19:13:04 -04:00
void *m_unknownEncodingMem;
void *m_unknownEncodingData;
void *m_unknownEncodingHandlerData;
2004-08-03 04:06:22 -03:00
void (XMLCALL *m_unknownEncodingRelease)(void *);
2002-02-11 19:13:04 -04:00
PROLOG_STATE m_prologState;
Processor *m_processor;
enum XML_Error m_errorCode;
const char *m_eventPtr;
const char *m_eventEndPtr;
const char *m_positionPtr;
OPEN_INTERNAL_ENTITY *m_openInternalEntities;
2004-08-03 04:06:22 -03:00
OPEN_INTERNAL_ENTITY *m_freeInternalEntities;
2003-01-25 18:41:29 -04:00
XML_Bool m_defaultExpandInternalEntities;
2002-02-11 19:13:04 -04:00
int m_tagLevel;
ENTITY *m_declEntity;
const XML_Char *m_doctypeName;
const XML_Char *m_doctypeSysid;
const XML_Char *m_doctypePubid;
const XML_Char *m_declAttributeType;
const XML_Char *m_declNotationName;
const XML_Char *m_declNotationPublicId;
ELEMENT_TYPE *m_declElementType;
ATTRIBUTE_ID *m_declAttributeId;
2003-01-25 18:41:29 -04:00
XML_Bool m_declAttributeIsCdata;
XML_Bool m_declAttributeIsId;
DTD *m_dtd;
2002-02-11 19:13:04 -04:00
const XML_Char *m_curBase;
TAG *m_tagStack;
TAG *m_freeTagList;
BINDING *m_inheritedBindings;
BINDING *m_freeBindingList;
int m_attsSize;
int m_nSpecifiedAtts;
int m_idAttIndex;
ATTRIBUTE *m_atts;
NS_ATT *m_nsAtts;
unsigned long m_nsAttsVersion;
unsigned char m_nsAttsPower;
2002-02-11 19:13:04 -04:00
POSITION m_position;
STRING_POOL m_tempPool;
STRING_POOL m_temp2Pool;
char *m_groupConnector;
unsigned int m_groupSize;
2002-02-11 19:13:04 -04:00
XML_Char m_namespaceSeparator;
2003-01-25 18:41:29 -04:00
XML_Parser m_parentParser;
2004-08-03 04:06:22 -03:00
XML_ParsingStatus m_parsingStatus;
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
XML_Bool m_isParamEntity;
XML_Bool m_useForeignDTD;
2002-02-11 19:13:04 -04:00
enum XML_ParamEntityParsing m_paramEntityParsing;
#endif
2003-01-25 18:41:29 -04:00
};
#define MALLOC(s) (parser->m_mem.malloc_fcn((s)))
#define REALLOC(p,s) (parser->m_mem.realloc_fcn((p),(s)))
#define FREE(p) (parser->m_mem.free_fcn((p)))
#define userData (parser->m_userData)
#define handlerArg (parser->m_handlerArg)
#define startElementHandler (parser->m_startElementHandler)
#define endElementHandler (parser->m_endElementHandler)
#define characterDataHandler (parser->m_characterDataHandler)
#define processingInstructionHandler \
(parser->m_processingInstructionHandler)
#define commentHandler (parser->m_commentHandler)
#define startCdataSectionHandler \
(parser->m_startCdataSectionHandler)
#define endCdataSectionHandler (parser->m_endCdataSectionHandler)
#define defaultHandler (parser->m_defaultHandler)
#define startDoctypeDeclHandler (parser->m_startDoctypeDeclHandler)
#define endDoctypeDeclHandler (parser->m_endDoctypeDeclHandler)
#define unparsedEntityDeclHandler \
(parser->m_unparsedEntityDeclHandler)
#define notationDeclHandler (parser->m_notationDeclHandler)
#define startNamespaceDeclHandler \
(parser->m_startNamespaceDeclHandler)
#define endNamespaceDeclHandler (parser->m_endNamespaceDeclHandler)
#define notStandaloneHandler (parser->m_notStandaloneHandler)
#define externalEntityRefHandler \
(parser->m_externalEntityRefHandler)
#define externalEntityRefHandlerArg \
(parser->m_externalEntityRefHandlerArg)
#define internalEntityRefHandler \
(parser->m_internalEntityRefHandler)
#define skippedEntityHandler (parser->m_skippedEntityHandler)
#define unknownEncodingHandler (parser->m_unknownEncodingHandler)
#define elementDeclHandler (parser->m_elementDeclHandler)
#define attlistDeclHandler (parser->m_attlistDeclHandler)
#define entityDeclHandler (parser->m_entityDeclHandler)
#define xmlDeclHandler (parser->m_xmlDeclHandler)
#define encoding (parser->m_encoding)
#define initEncoding (parser->m_initEncoding)
#define internalEncoding (parser->m_internalEncoding)
#define unknownEncodingMem (parser->m_unknownEncodingMem)
#define unknownEncodingData (parser->m_unknownEncodingData)
2002-02-11 19:13:04 -04:00
#define unknownEncodingHandlerData \
2003-01-25 18:41:29 -04:00
(parser->m_unknownEncodingHandlerData)
#define unknownEncodingRelease (parser->m_unknownEncodingRelease)
#define protocolEncodingName (parser->m_protocolEncodingName)
#define ns (parser->m_ns)
#define ns_triplets (parser->m_ns_triplets)
#define prologState (parser->m_prologState)
#define processor (parser->m_processor)
#define errorCode (parser->m_errorCode)
#define eventPtr (parser->m_eventPtr)
#define eventEndPtr (parser->m_eventEndPtr)
#define positionPtr (parser->m_positionPtr)
#define position (parser->m_position)
#define openInternalEntities (parser->m_openInternalEntities)
2004-08-03 04:06:22 -03:00
#define freeInternalEntities (parser->m_freeInternalEntities)
2003-01-25 18:41:29 -04:00
#define defaultExpandInternalEntities \
(parser->m_defaultExpandInternalEntities)
#define tagLevel (parser->m_tagLevel)
#define buffer (parser->m_buffer)
#define bufferPtr (parser->m_bufferPtr)
#define bufferEnd (parser->m_bufferEnd)
#define parseEndByteIndex (parser->m_parseEndByteIndex)
#define parseEndPtr (parser->m_parseEndPtr)
#define bufferLim (parser->m_bufferLim)
#define dataBuf (parser->m_dataBuf)
#define dataBufEnd (parser->m_dataBufEnd)
#define _dtd (parser->m_dtd)
#define curBase (parser->m_curBase)
#define declEntity (parser->m_declEntity)
#define doctypeName (parser->m_doctypeName)
#define doctypeSysid (parser->m_doctypeSysid)
#define doctypePubid (parser->m_doctypePubid)
#define declAttributeType (parser->m_declAttributeType)
#define declNotationName (parser->m_declNotationName)
#define declNotationPublicId (parser->m_declNotationPublicId)
#define declElementType (parser->m_declElementType)
#define declAttributeId (parser->m_declAttributeId)
#define declAttributeIsCdata (parser->m_declAttributeIsCdata)
#define declAttributeIsId (parser->m_declAttributeIsId)
#define freeTagList (parser->m_freeTagList)
#define freeBindingList (parser->m_freeBindingList)
#define inheritedBindings (parser->m_inheritedBindings)
#define tagStack (parser->m_tagStack)
#define atts (parser->m_atts)
#define attsSize (parser->m_attsSize)
#define nSpecifiedAtts (parser->m_nSpecifiedAtts)
#define idAttIndex (parser->m_idAttIndex)
#define nsAtts (parser->m_nsAtts)
#define nsAttsVersion (parser->m_nsAttsVersion)
#define nsAttsPower (parser->m_nsAttsPower)
2003-01-25 18:41:29 -04:00
#define tempPool (parser->m_tempPool)
#define temp2Pool (parser->m_temp2Pool)
#define groupConnector (parser->m_groupConnector)
#define groupSize (parser->m_groupSize)
#define namespaceSeparator (parser->m_namespaceSeparator)
#define parentParser (parser->m_parentParser)
2004-08-03 04:06:22 -03:00
#define parsing (parser->m_parsingStatus.parsing)
#define finalBuffer (parser->m_parsingStatus.finalBuffer)
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
#define isParamEntity (parser->m_isParamEntity)
#define useForeignDTD (parser->m_useForeignDTD)
#define paramEntityParsing (parser->m_paramEntityParsing)
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
XML_Parser XMLCALL
2003-01-25 18:41:29 -04:00
XML_ParserCreate(const XML_Char *encodingName)
2002-02-11 19:13:04 -04:00
{
return XML_ParserCreate_MM(encodingName, NULL, NULL);
}
XML_Parser XMLCALL
2003-01-25 18:41:29 -04:00
XML_ParserCreateNS(const XML_Char *encodingName, XML_Char nsSep)
2002-02-11 19:13:04 -04:00
{
XML_Char tmp[2];
*tmp = nsSep;
return XML_ParserCreate_MM(encodingName, NULL, tmp);
}
2003-01-25 18:41:29 -04:00
static const XML_Char implicitContext[] = {
'x', 'm', 'l', '=', 'h', 't', 't', 'p', ':', '/', '/',
'w', 'w', 'w', '.', 'w', '3', '.', 'o', 'r', 'g', '/',
'X', 'M', 'L', '/', '1', '9', '9', '8', '/',
'n', 'a', 'm', 'e', 's', 'p', 'a', 'c', 'e', '\0'
};
XML_Parser XMLCALL
2002-02-11 19:13:04 -04:00
XML_ParserCreate_MM(const XML_Char *encodingName,
2003-01-25 18:41:29 -04:00
const XML_Memory_Handling_Suite *memsuite,
const XML_Char *nameSep)
{
XML_Parser parser = parserCreate(encodingName, memsuite, nameSep, NULL);
if (parser != NULL && ns) {
/* implicit context only set for root parser, since child
parsers (i.e. external entity parsers) will inherit it
*/
if (!setContext(parser, implicitContext)) {
XML_ParserFree(parser);
return NULL;
}
}
return parser;
}
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
static XML_Parser
parserCreate(const XML_Char *encodingName,
const XML_Memory_Handling_Suite *memsuite,
const XML_Char *nameSep,
DTD *dtd)
{
XML_Parser parser;
2002-02-11 19:13:04 -04:00
if (memsuite) {
XML_Memory_Handling_Suite *mtemp;
2003-01-25 18:41:29 -04:00
parser = (XML_Parser)
memsuite->malloc_fcn(sizeof(struct XML_ParserStruct));
if (parser != NULL) {
mtemp = (XML_Memory_Handling_Suite *)&(parser->m_mem);
mtemp->malloc_fcn = memsuite->malloc_fcn;
mtemp->realloc_fcn = memsuite->realloc_fcn;
mtemp->free_fcn = memsuite->free_fcn;
}
2002-02-11 19:13:04 -04:00
}
else {
XML_Memory_Handling_Suite *mtemp;
2003-01-25 18:41:29 -04:00
parser = (XML_Parser)malloc(sizeof(struct XML_ParserStruct));
if (parser != NULL) {
mtemp = (XML_Memory_Handling_Suite *)&(parser->m_mem);
mtemp->malloc_fcn = malloc;
mtemp->realloc_fcn = realloc;
mtemp->free_fcn = free;
}
2002-02-11 19:13:04 -04:00
}
if (!parser)
return parser;
2003-01-25 18:41:29 -04:00
buffer = NULL;
bufferLim = NULL;
2002-02-11 19:13:04 -04:00
attsSize = INIT_ATTS_SIZE;
2003-01-25 18:41:29 -04:00
atts = (ATTRIBUTE *)MALLOC(attsSize * sizeof(ATTRIBUTE));
if (atts == NULL) {
FREE(parser);
return NULL;
}
dataBuf = (XML_Char *)MALLOC(INIT_DATA_BUF_SIZE * sizeof(XML_Char));
if (dataBuf == NULL) {
FREE(atts);
FREE(parser);
return NULL;
}
dataBufEnd = dataBuf + INIT_DATA_BUF_SIZE;
if (dtd)
_dtd = dtd;
else {
_dtd = dtdCreate(&parser->m_mem);
if (_dtd == NULL) {
FREE(dataBuf);
FREE(atts);
FREE(parser);
return NULL;
}
}
freeBindingList = NULL;
freeTagList = NULL;
2004-08-03 04:06:22 -03:00
freeInternalEntities = NULL;
2003-01-25 18:41:29 -04:00
2002-02-11 19:13:04 -04:00
groupSize = 0;
2003-01-25 18:41:29 -04:00
groupConnector = NULL;
unknownEncodingHandler = NULL;
unknownEncodingHandlerData = NULL;
2002-02-11 19:13:04 -04:00
namespaceSeparator = '!';
2003-01-25 18:41:29 -04:00
ns = XML_FALSE;
ns_triplets = XML_FALSE;
nsAtts = NULL;
nsAttsVersion = 0;
nsAttsPower = 0;
2003-01-25 18:41:29 -04:00
poolInit(&tempPool, &(parser->m_mem));
poolInit(&temp2Pool, &(parser->m_mem));
parserInit(parser, encodingName);
if (encodingName && !protocolEncodingName) {
2002-02-11 19:13:04 -04:00
XML_ParserFree(parser);
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
}
if (nameSep) {
2003-01-25 18:41:29 -04:00
ns = XML_TRUE;
2002-02-11 19:13:04 -04:00
internalEncoding = XmlGetInternalEncodingNS();
namespaceSeparator = *nameSep;
}
else {
internalEncoding = XmlGetInternalEncoding();
}
return parser;
2003-01-25 18:41:29 -04:00
}
static void
parserInit(XML_Parser parser, const XML_Char *encodingName)
{
processor = prologInitProcessor;
XmlPrologStateInit(&prologState);
protocolEncodingName = (encodingName != NULL
? poolCopyString(&tempPool, encodingName)
: NULL);
curBase = NULL;
XmlInitEncoding(&initEncoding, &encoding, 0);
userData = NULL;
handlerArg = NULL;
startElementHandler = NULL;
endElementHandler = NULL;
characterDataHandler = NULL;
processingInstructionHandler = NULL;
commentHandler = NULL;
startCdataSectionHandler = NULL;
endCdataSectionHandler = NULL;
defaultHandler = NULL;
startDoctypeDeclHandler = NULL;
endDoctypeDeclHandler = NULL;
unparsedEntityDeclHandler = NULL;
notationDeclHandler = NULL;
startNamespaceDeclHandler = NULL;
endNamespaceDeclHandler = NULL;
notStandaloneHandler = NULL;
externalEntityRefHandler = NULL;
externalEntityRefHandlerArg = parser;
skippedEntityHandler = NULL;
elementDeclHandler = NULL;
attlistDeclHandler = NULL;
entityDeclHandler = NULL;
xmlDeclHandler = NULL;
bufferPtr = buffer;
bufferEnd = buffer;
parseEndByteIndex = 0;
parseEndPtr = NULL;
declElementType = NULL;
declAttributeId = NULL;
declEntity = NULL;
doctypeName = NULL;
doctypeSysid = NULL;
doctypePubid = NULL;
declAttributeType = NULL;
declNotationName = NULL;
declNotationPublicId = NULL;
declAttributeIsCdata = XML_FALSE;
declAttributeIsId = XML_FALSE;
memset(&position, 0, sizeof(POSITION));
errorCode = XML_ERROR_NONE;
eventPtr = NULL;
eventEndPtr = NULL;
positionPtr = NULL;
2004-08-03 04:06:22 -03:00
openInternalEntities = NULL;
2003-01-25 18:41:29 -04:00
defaultExpandInternalEntities = XML_TRUE;
tagLevel = 0;
tagStack = NULL;
inheritedBindings = NULL;
nSpecifiedAtts = 0;
unknownEncodingMem = NULL;
unknownEncodingRelease = NULL;
unknownEncodingData = NULL;
parentParser = NULL;
2004-08-03 04:06:22 -03:00
parsing = XML_INITIALIZED;
2003-01-25 18:41:29 -04:00
#ifdef XML_DTD
isParamEntity = XML_FALSE;
useForeignDTD = XML_FALSE;
paramEntityParsing = XML_PARAM_ENTITY_PARSING_NEVER;
#endif
}
/* moves list of bindings to freeBindingList */
static void FASTCALL
moveToFreeBindingList(XML_Parser parser, BINDING *bindings)
{
while (bindings) {
BINDING *b = bindings;
bindings = bindings->nextTagBinding;
b->nextTagBinding = freeBindingList;
freeBindingList = b;
}
}
XML_Bool XMLCALL
2003-01-25 18:41:29 -04:00
XML_ParserReset(XML_Parser parser, const XML_Char *encodingName)
{
TAG *tStk;
2004-08-03 04:06:22 -03:00
OPEN_INTERNAL_ENTITY *openEntityList;
2003-01-25 18:41:29 -04:00
if (parentParser)
return XML_FALSE;
/* move tagStack to freeTagList */
tStk = tagStack;
while (tStk) {
TAG *tag = tStk;
tStk = tStk->parent;
tag->parent = freeTagList;
moveToFreeBindingList(parser, tag->bindings);
tag->bindings = NULL;
freeTagList = tag;
}
2004-08-03 04:06:22 -03:00
/* move openInternalEntities to freeInternalEntities */
openEntityList = openInternalEntities;
while (openEntityList) {
OPEN_INTERNAL_ENTITY *openEntity = openEntityList;
openEntityList = openEntity->next;
openEntity->next = freeInternalEntities;
freeInternalEntities = openEntity;
}
2003-01-25 18:41:29 -04:00
moveToFreeBindingList(parser, inheritedBindings);
FREE(unknownEncodingMem);
2003-01-25 18:41:29 -04:00
if (unknownEncodingRelease)
unknownEncodingRelease(unknownEncodingData);
poolClear(&tempPool);
poolClear(&temp2Pool);
parserInit(parser, encodingName);
dtdReset(_dtd, &parser->m_mem);
return setContext(parser, implicitContext);
}
2002-02-11 19:13:04 -04:00
enum XML_Status XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetEncoding(XML_Parser parser, const XML_Char *encodingName)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
/* Block after XML_Parse()/XML_ParseBuffer() has been called.
XXX There's no way for the caller to determine which of the
XXX possible error cases caused the XML_STATUS_ERROR return.
*/
2004-08-03 04:06:22 -03:00
if (parsing == XML_PARSING || parsing == XML_SUSPENDED)
2003-01-25 18:41:29 -04:00
return XML_STATUS_ERROR;
if (encodingName == NULL)
protocolEncodingName = NULL;
2002-02-11 19:13:04 -04:00
else {
protocolEncodingName = poolCopyString(&tempPool, encodingName);
if (!protocolEncodingName)
2003-01-25 18:41:29 -04:00
return XML_STATUS_ERROR;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
return XML_STATUS_OK;
2002-02-11 19:13:04 -04:00
}
XML_Parser XMLCALL
2003-01-25 18:41:29 -04:00
XML_ExternalEntityParserCreate(XML_Parser oldParser,
const XML_Char *context,
const XML_Char *encodingName)
2002-02-11 19:13:04 -04:00
{
XML_Parser parser = oldParser;
2003-01-25 18:41:29 -04:00
DTD *newDtd = NULL;
DTD *oldDtd = _dtd;
2002-02-11 19:13:04 -04:00
XML_StartElementHandler oldStartElementHandler = startElementHandler;
XML_EndElementHandler oldEndElementHandler = endElementHandler;
XML_CharacterDataHandler oldCharacterDataHandler = characterDataHandler;
2003-01-25 18:41:29 -04:00
XML_ProcessingInstructionHandler oldProcessingInstructionHandler
= processingInstructionHandler;
2002-02-11 19:13:04 -04:00
XML_CommentHandler oldCommentHandler = commentHandler;
2003-01-25 18:41:29 -04:00
XML_StartCdataSectionHandler oldStartCdataSectionHandler
= startCdataSectionHandler;
XML_EndCdataSectionHandler oldEndCdataSectionHandler
= endCdataSectionHandler;
2002-02-11 19:13:04 -04:00
XML_DefaultHandler oldDefaultHandler = defaultHandler;
2003-01-25 18:41:29 -04:00
XML_UnparsedEntityDeclHandler oldUnparsedEntityDeclHandler
= unparsedEntityDeclHandler;
2002-02-11 19:13:04 -04:00
XML_NotationDeclHandler oldNotationDeclHandler = notationDeclHandler;
2003-01-25 18:41:29 -04:00
XML_StartNamespaceDeclHandler oldStartNamespaceDeclHandler
= startNamespaceDeclHandler;
XML_EndNamespaceDeclHandler oldEndNamespaceDeclHandler
= endNamespaceDeclHandler;
2002-02-11 19:13:04 -04:00
XML_NotStandaloneHandler oldNotStandaloneHandler = notStandaloneHandler;
2003-01-25 18:41:29 -04:00
XML_ExternalEntityRefHandler oldExternalEntityRefHandler
= externalEntityRefHandler;
XML_SkippedEntityHandler oldSkippedEntityHandler = skippedEntityHandler;
XML_UnknownEncodingHandler oldUnknownEncodingHandler
= unknownEncodingHandler;
2002-02-11 19:13:04 -04:00
XML_ElementDeclHandler oldElementDeclHandler = elementDeclHandler;
XML_AttlistDeclHandler oldAttlistDeclHandler = attlistDeclHandler;
XML_EntityDeclHandler oldEntityDeclHandler = entityDeclHandler;
XML_XmlDeclHandler oldXmlDeclHandler = xmlDeclHandler;
ELEMENT_TYPE * oldDeclElementType = declElementType;
void *oldUserData = userData;
void *oldHandlerArg = handlerArg;
2003-01-25 18:41:29 -04:00
XML_Bool oldDefaultExpandInternalEntities = defaultExpandInternalEntities;
XML_Parser oldExternalEntityRefHandlerArg = externalEntityRefHandlerArg;
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
enum XML_ParamEntityParsing oldParamEntityParsing = paramEntityParsing;
int oldInEntityValue = prologState.inEntityValue;
2002-02-11 19:13:04 -04:00
#endif
2003-01-25 18:41:29 -04:00
XML_Bool oldns_triplets = ns_triplets;
#ifdef XML_DTD
if (!context)
newDtd = oldDtd;
#endif /* XML_DTD */
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
/* Note that the magical uses of the pre-processor to make field
access look more like C++ require that `parser' be overwritten
here. This makes this function more painful to follow than it
would be otherwise.
*/
2002-02-11 19:13:04 -04:00
if (ns) {
XML_Char tmp[2];
*tmp = namespaceSeparator;
2003-01-25 18:41:29 -04:00
parser = parserCreate(encodingName, &parser->m_mem, tmp, newDtd);
2002-02-11 19:13:04 -04:00
}
else {
2003-01-25 18:41:29 -04:00
parser = parserCreate(encodingName, &parser->m_mem, NULL, newDtd);
2002-02-11 19:13:04 -04:00
}
if (!parser)
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
startElementHandler = oldStartElementHandler;
endElementHandler = oldEndElementHandler;
characterDataHandler = oldCharacterDataHandler;
processingInstructionHandler = oldProcessingInstructionHandler;
commentHandler = oldCommentHandler;
startCdataSectionHandler = oldStartCdataSectionHandler;
endCdataSectionHandler = oldEndCdataSectionHandler;
defaultHandler = oldDefaultHandler;
unparsedEntityDeclHandler = oldUnparsedEntityDeclHandler;
notationDeclHandler = oldNotationDeclHandler;
startNamespaceDeclHandler = oldStartNamespaceDeclHandler;
endNamespaceDeclHandler = oldEndNamespaceDeclHandler;
notStandaloneHandler = oldNotStandaloneHandler;
externalEntityRefHandler = oldExternalEntityRefHandler;
2003-01-25 18:41:29 -04:00
skippedEntityHandler = oldSkippedEntityHandler;
2002-02-11 19:13:04 -04:00
unknownEncodingHandler = oldUnknownEncodingHandler;
elementDeclHandler = oldElementDeclHandler;
attlistDeclHandler = oldAttlistDeclHandler;
entityDeclHandler = oldEntityDeclHandler;
xmlDeclHandler = oldXmlDeclHandler;
declElementType = oldDeclElementType;
userData = oldUserData;
if (oldUserData == oldHandlerArg)
handlerArg = userData;
else
handlerArg = parser;
if (oldExternalEntityRefHandlerArg != oldParser)
externalEntityRefHandlerArg = oldExternalEntityRefHandlerArg;
defaultExpandInternalEntities = oldDefaultExpandInternalEntities;
ns_triplets = oldns_triplets;
2003-01-25 18:41:29 -04:00
parentParser = oldParser;
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
paramEntityParsing = oldParamEntityParsing;
2003-01-25 18:41:29 -04:00
prologState.inEntityValue = oldInEntityValue;
2002-02-11 19:13:04 -04:00
if (context) {
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
if (!dtdCopy(_dtd, oldDtd, &parser->m_mem)
|| !setContext(parser, context)) {
2002-02-11 19:13:04 -04:00
XML_ParserFree(parser);
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
}
processor = externalEntityInitProcessor;
#ifdef XML_DTD
}
else {
2003-01-25 18:41:29 -04:00
/* The DTD instance referenced by _dtd is shared between the document's
root parser and external PE parsers, therefore one does not need to
call setContext. In addition, one also *must* not call setContext,
because this would overwrite existing prefix->binding pointers in
_dtd with ones that get destroyed with the external PE parser.
This would leave those prefixes with dangling pointers.
*/
isParamEntity = XML_TRUE;
2002-02-11 19:13:04 -04:00
XmlPrologStateInitExternalEntity(&prologState);
2003-01-25 18:41:29 -04:00
processor = externalParEntInitProcessor;
2002-02-11 19:13:04 -04:00
}
#endif /* XML_DTD */
return parser;
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
destroyBindings(BINDING *bindings, XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
for (;;) {
BINDING *b = bindings;
if (!b)
break;
bindings = b->nextTagBinding;
FREE(b->uri);
FREE(b);
}
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_ParserFree(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
TAG *tagList;
OPEN_INTERNAL_ENTITY *entityList;
if (parser == NULL)
return;
/* free tagStack and freeTagList */
tagList = tagStack;
2002-02-11 19:13:04 -04:00
for (;;) {
TAG *p;
2004-08-03 04:06:22 -03:00
if (tagList == NULL) {
2003-01-25 18:41:29 -04:00
if (freeTagList == NULL)
break;
2004-08-03 04:06:22 -03:00
tagList = freeTagList;
2003-01-25 18:41:29 -04:00
freeTagList = NULL;
2002-02-11 19:13:04 -04:00
}
2004-08-03 04:06:22 -03:00
p = tagList;
tagList = tagList->parent;
2002-02-11 19:13:04 -04:00
FREE(p->buf);
destroyBindings(p->bindings, parser);
FREE(p);
}
2004-08-03 04:06:22 -03:00
/* free openInternalEntities and freeInternalEntities */
entityList = openInternalEntities;
for (;;) {
OPEN_INTERNAL_ENTITY *openEntity;
if (entityList == NULL) {
if (freeInternalEntities == NULL)
break;
entityList = freeInternalEntities;
freeInternalEntities = NULL;
}
openEntity = entityList;
entityList = entityList->next;
FREE(openEntity);
}
2002-02-11 19:13:04 -04:00
destroyBindings(freeBindingList, parser);
destroyBindings(inheritedBindings, parser);
poolDestroy(&tempPool);
poolDestroy(&temp2Pool);
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
/* external parameter entity parsers share the DTD structure
parser->m_dtd with the root parser, so we must not destroy it
*/
if (!isParamEntity && _dtd)
#else
if (_dtd)
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
dtdDestroy(_dtd, (XML_Bool)!parentParser, &parser->m_mem);
2002-02-11 19:13:04 -04:00
FREE((void *)atts);
FREE(groupConnector);
FREE(buffer);
2002-02-11 19:13:04 -04:00
FREE(dataBuf);
FREE(nsAtts);
FREE(unknownEncodingMem);
2002-02-11 19:13:04 -04:00
if (unknownEncodingRelease)
unknownEncodingRelease(unknownEncodingData);
FREE(parser);
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_UseParserAsHandlerArg(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
handlerArg = parser;
}
enum XML_Error XMLCALL
2003-01-25 18:41:29 -04:00
XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD)
{
#ifdef XML_DTD
/* block after XML_Parse()/XML_ParseBuffer() has been called */
2004-08-03 04:06:22 -03:00
if (parsing == XML_PARSING || parsing == XML_SUSPENDED)
2003-01-25 18:41:29 -04:00
return XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING;
useForeignDTD = useDTD;
return XML_ERROR_NONE;
#else
return XML_ERROR_FEATURE_REQUIRES_XML_DTD;
#endif
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetReturnNSTriplet(XML_Parser parser, int do_nst)
{
/* block after XML_Parse()/XML_ParseBuffer() has been called */
2004-08-03 04:06:22 -03:00
if (parsing == XML_PARSING || parsing == XML_SUSPENDED)
2003-01-25 18:41:29 -04:00
return;
ns_triplets = do_nst ? XML_TRUE : XML_FALSE;
2002-02-11 19:13:04 -04:00
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetUserData(XML_Parser parser, void *p)
2002-02-11 19:13:04 -04:00
{
if (handlerArg == userData)
handlerArg = userData = p;
else
userData = p;
}
enum XML_Status XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetBase(XML_Parser parser, const XML_Char *p)
2002-02-11 19:13:04 -04:00
{
if (p) {
2003-01-25 18:41:29 -04:00
p = poolCopyString(&_dtd->pool, p);
2002-02-11 19:13:04 -04:00
if (!p)
2003-01-25 18:41:29 -04:00
return XML_STATUS_ERROR;
2002-02-11 19:13:04 -04:00
curBase = p;
}
else
2003-01-25 18:41:29 -04:00
curBase = NULL;
return XML_STATUS_OK;
2002-02-11 19:13:04 -04:00
}
const XML_Char * XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetBase(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
return curBase;
}
int XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetSpecifiedAttributeCount(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
return nSpecifiedAtts;
}
int XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetIdAttributeIndex(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
return idAttIndex;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetElementHandler(XML_Parser parser,
XML_StartElementHandler start,
XML_EndElementHandler end)
2002-02-11 19:13:04 -04:00
{
startElementHandler = start;
endElementHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetStartElementHandler(XML_Parser parser,
XML_StartElementHandler start) {
2002-02-11 19:13:04 -04:00
startElementHandler = start;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetEndElementHandler(XML_Parser parser,
XML_EndElementHandler end) {
2002-02-11 19:13:04 -04:00
endElementHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetCharacterDataHandler(XML_Parser parser,
XML_CharacterDataHandler handler)
2002-02-11 19:13:04 -04:00
{
characterDataHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetProcessingInstructionHandler(XML_Parser parser,
XML_ProcessingInstructionHandler handler)
2002-02-11 19:13:04 -04:00
{
processingInstructionHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetCommentHandler(XML_Parser parser,
XML_CommentHandler handler)
2002-02-11 19:13:04 -04:00
{
commentHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetCdataSectionHandler(XML_Parser parser,
XML_StartCdataSectionHandler start,
XML_EndCdataSectionHandler end)
2002-02-11 19:13:04 -04:00
{
startCdataSectionHandler = start;
endCdataSectionHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetStartCdataSectionHandler(XML_Parser parser,
XML_StartCdataSectionHandler start) {
2002-02-11 19:13:04 -04:00
startCdataSectionHandler = start;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetEndCdataSectionHandler(XML_Parser parser,
XML_EndCdataSectionHandler end) {
2002-02-11 19:13:04 -04:00
endCdataSectionHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetDefaultHandler(XML_Parser parser,
XML_DefaultHandler handler)
2002-02-11 19:13:04 -04:00
{
defaultHandler = handler;
2003-01-25 18:41:29 -04:00
defaultExpandInternalEntities = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetDefaultHandlerExpand(XML_Parser parser,
XML_DefaultHandler handler)
2002-02-11 19:13:04 -04:00
{
defaultHandler = handler;
2003-01-25 18:41:29 -04:00
defaultExpandInternalEntities = XML_TRUE;
2002-02-11 19:13:04 -04:00
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetDoctypeDeclHandler(XML_Parser parser,
XML_StartDoctypeDeclHandler start,
XML_EndDoctypeDeclHandler end)
2002-02-11 19:13:04 -04:00
{
startDoctypeDeclHandler = start;
endDoctypeDeclHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetStartDoctypeDeclHandler(XML_Parser parser,
XML_StartDoctypeDeclHandler start) {
2002-02-11 19:13:04 -04:00
startDoctypeDeclHandler = start;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetEndDoctypeDeclHandler(XML_Parser parser,
XML_EndDoctypeDeclHandler end) {
2002-02-11 19:13:04 -04:00
endDoctypeDeclHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetUnparsedEntityDeclHandler(XML_Parser parser,
XML_UnparsedEntityDeclHandler handler)
2002-02-11 19:13:04 -04:00
{
unparsedEntityDeclHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetNotationDeclHandler(XML_Parser parser,
XML_NotationDeclHandler handler)
2002-02-11 19:13:04 -04:00
{
notationDeclHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetNamespaceDeclHandler(XML_Parser parser,
XML_StartNamespaceDeclHandler start,
XML_EndNamespaceDeclHandler end)
2002-02-11 19:13:04 -04:00
{
startNamespaceDeclHandler = start;
endNamespaceDeclHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetStartNamespaceDeclHandler(XML_Parser parser,
XML_StartNamespaceDeclHandler start) {
2002-02-11 19:13:04 -04:00
startNamespaceDeclHandler = start;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetEndNamespaceDeclHandler(XML_Parser parser,
XML_EndNamespaceDeclHandler end) {
2002-02-11 19:13:04 -04:00
endNamespaceDeclHandler = end;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetNotStandaloneHandler(XML_Parser parser,
XML_NotStandaloneHandler handler)
2002-02-11 19:13:04 -04:00
{
notStandaloneHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetExternalEntityRefHandler(XML_Parser parser,
XML_ExternalEntityRefHandler handler)
2002-02-11 19:13:04 -04:00
{
externalEntityRefHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetExternalEntityRefHandlerArg(XML_Parser parser, void *arg)
2002-02-11 19:13:04 -04:00
{
if (arg)
2003-01-25 18:41:29 -04:00
externalEntityRefHandlerArg = (XML_Parser)arg;
2002-02-11 19:13:04 -04:00
else
externalEntityRefHandlerArg = parser;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetSkippedEntityHandler(XML_Parser parser,
XML_SkippedEntityHandler handler)
{
skippedEntityHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetUnknownEncodingHandler(XML_Parser parser,
XML_UnknownEncodingHandler handler,
void *data)
2002-02-11 19:13:04 -04:00
{
unknownEncodingHandler = handler;
unknownEncodingHandlerData = data;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetElementDeclHandler(XML_Parser parser,
XML_ElementDeclHandler eldecl)
2002-02-11 19:13:04 -04:00
{
elementDeclHandler = eldecl;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetAttlistDeclHandler(XML_Parser parser,
XML_AttlistDeclHandler attdecl)
2002-02-11 19:13:04 -04:00
{
attlistDeclHandler = attdecl;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetEntityDeclHandler(XML_Parser parser,
XML_EntityDeclHandler handler)
2002-02-11 19:13:04 -04:00
{
entityDeclHandler = handler;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetXmlDeclHandler(XML_Parser parser,
XML_XmlDeclHandler handler) {
2002-02-11 19:13:04 -04:00
xmlDeclHandler = handler;
}
int XMLCALL
2003-01-25 18:41:29 -04:00
XML_SetParamEntityParsing(XML_Parser parser,
enum XML_ParamEntityParsing peParsing)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
/* block after XML_Parse()/XML_ParseBuffer() has been called */
2004-08-03 04:06:22 -03:00
if (parsing == XML_PARSING || parsing == XML_SUSPENDED)
2003-01-25 18:41:29 -04:00
return 0;
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
paramEntityParsing = peParsing;
2002-02-11 19:13:04 -04:00
return 1;
#else
2003-01-25 18:41:29 -04:00
return peParsing == XML_PARAM_ENTITY_PARSING_NEVER;
2002-02-11 19:13:04 -04:00
#endif
}
enum XML_Status XMLCALL
2003-01-25 18:41:29 -04:00
XML_Parse(XML_Parser parser, const char *s, int len, int isFinal)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
errorCode = XML_ERROR_SUSPENDED;
return XML_STATUS_ERROR;
case XML_FINISHED:
errorCode = XML_ERROR_FINISHED;
return XML_STATUS_ERROR;
default:
parsing = XML_PARSING;
}
2002-02-11 19:13:04 -04:00
if (len == 0) {
2004-08-03 04:06:22 -03:00
finalBuffer = (XML_Bool)isFinal;
2002-02-11 19:13:04 -04:00
if (!isFinal)
2003-01-25 18:41:29 -04:00
return XML_STATUS_OK;
2002-02-11 19:13:04 -04:00
positionPtr = bufferPtr;
2004-08-03 04:06:22 -03:00
parseEndPtr = bufferEnd;
/* If data are left over from last buffer, and we now know that these
data are the final chunk of input, then we have to check them again
to detect errors based on this information.
*/
errorCode = processor(parser, bufferPtr, parseEndPtr, &bufferPtr);
if (errorCode == XML_ERROR_NONE) {
switch (parsing) {
case XML_SUSPENDED:
XmlUpdatePosition(encoding, positionPtr, bufferPtr, &position);
positionPtr = bufferPtr;
return XML_STATUS_SUSPENDED;
case XML_INITIALIZED:
case XML_PARSING:
parsing = XML_FINISHED;
/* fall through */
default:
return XML_STATUS_OK;
}
}
2002-02-11 19:13:04 -04:00
eventEndPtr = eventPtr;
processor = errorProcessor;
2003-01-25 18:41:29 -04:00
return XML_STATUS_ERROR;
2002-02-11 19:13:04 -04:00
}
#ifndef XML_CONTEXT_BYTES
else if (bufferPtr == bufferEnd) {
const char *end;
int nLeftOver;
2004-08-03 04:06:22 -03:00
enum XML_Error result;
2002-02-11 19:13:04 -04:00
parseEndByteIndex += len;
positionPtr = s;
2004-08-03 04:06:22 -03:00
finalBuffer = (XML_Bool)isFinal;
2002-02-11 19:13:04 -04:00
errorCode = processor(parser, s, parseEndPtr = s + len, &end);
2004-08-03 04:06:22 -03:00
2002-02-11 19:13:04 -04:00
if (errorCode != XML_ERROR_NONE) {
eventEndPtr = eventPtr;
processor = errorProcessor;
2003-01-25 18:41:29 -04:00
return XML_STATUS_ERROR;
2002-02-11 19:13:04 -04:00
}
2004-08-03 04:06:22 -03:00
else {
switch (parsing) {
case XML_SUSPENDED:
result = XML_STATUS_SUSPENDED;
break;
case XML_INITIALIZED:
case XML_PARSING:
result = XML_STATUS_OK;
if (isFinal) {
parsing = XML_FINISHED;
return result;
}
}
}
2002-02-11 19:13:04 -04:00
XmlUpdatePosition(encoding, positionPtr, end, &position);
2003-01-25 18:41:29 -04:00
positionPtr = end;
2002-02-11 19:13:04 -04:00
nLeftOver = s + len - end;
if (nLeftOver) {
2003-01-25 18:41:29 -04:00
if (buffer == NULL || nLeftOver > bufferLim - buffer) {
/* FIXME avoid integer overflow */
char *temp;
temp = (buffer == NULL
? (char *)MALLOC(len * 2)
: (char *)REALLOC(buffer, len * 2));
if (temp == NULL) {
errorCode = XML_ERROR_NO_MEMORY;
return XML_STATUS_ERROR;
}
buffer = temp;
if (!buffer) {
errorCode = XML_ERROR_NO_MEMORY;
eventPtr = eventEndPtr = NULL;
processor = errorProcessor;
return XML_STATUS_ERROR;
}
bufferLim = buffer + len * 2;
2002-02-11 19:13:04 -04:00
}
memcpy(buffer, end, nLeftOver);
bufferPtr = buffer;
bufferEnd = buffer + nLeftOver;
}
2004-08-03 04:06:22 -03:00
return result;
2002-02-11 19:13:04 -04:00
}
#endif /* not defined XML_CONTEXT_BYTES */
else {
2003-01-25 18:41:29 -04:00
void *buff = XML_GetBuffer(parser, len);
if (buff == NULL)
return XML_STATUS_ERROR;
else {
memcpy(buff, s, len);
return XML_ParseBuffer(parser, len, isFinal);
}
2002-02-11 19:13:04 -04:00
}
}
enum XML_Status XMLCALL
2003-01-25 18:41:29 -04:00
XML_ParseBuffer(XML_Parser parser, int len, int isFinal)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
const char *start;
enum XML_Status result = XML_STATUS_OK;
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
errorCode = XML_ERROR_SUSPENDED;
return XML_STATUS_ERROR;
case XML_FINISHED:
errorCode = XML_ERROR_FINISHED;
return XML_STATUS_ERROR;
default:
parsing = XML_PARSING;
}
start = bufferPtr;
2002-02-11 19:13:04 -04:00
positionPtr = start;
bufferEnd += len;
2004-08-03 04:06:22 -03:00
parseEndPtr = bufferEnd;
2002-02-11 19:13:04 -04:00
parseEndByteIndex += len;
2004-08-03 04:06:22 -03:00
finalBuffer = (XML_Bool)isFinal;
errorCode = processor(parser, start, parseEndPtr, &bufferPtr);
if (errorCode != XML_ERROR_NONE) {
2002-02-11 19:13:04 -04:00
eventEndPtr = eventPtr;
processor = errorProcessor;
2003-01-25 18:41:29 -04:00
return XML_STATUS_ERROR;
2002-02-11 19:13:04 -04:00
}
2004-08-03 04:06:22 -03:00
else {
switch (parsing) {
case XML_SUSPENDED:
result = XML_STATUS_SUSPENDED;
break;
case XML_INITIALIZED:
case XML_PARSING:
if (isFinal) {
parsing = XML_FINISHED;
return result;
}
default: ; /* should not happen */
}
}
XmlUpdatePosition(encoding, positionPtr, bufferPtr, &position);
positionPtr = bufferPtr;
return result;
2002-02-11 19:13:04 -04:00
}
void * XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetBuffer(XML_Parser parser, int len)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
errorCode = XML_ERROR_SUSPENDED;
return NULL;
case XML_FINISHED:
errorCode = XML_ERROR_FINISHED;
return NULL;
default: ;
}
2002-02-11 19:13:04 -04:00
if (len > bufferLim - bufferEnd) {
/* FIXME avoid integer overflow */
int neededSize = len + (bufferEnd - bufferPtr);
#ifdef XML_CONTEXT_BYTES
int keep = bufferPtr - buffer;
if (keep > XML_CONTEXT_BYTES)
keep = XML_CONTEXT_BYTES;
neededSize += keep;
#endif /* defined XML_CONTEXT_BYTES */
if (neededSize <= bufferLim - buffer) {
#ifdef XML_CONTEXT_BYTES
if (keep < bufferPtr - buffer) {
2003-01-25 18:41:29 -04:00
int offset = (bufferPtr - buffer) - keep;
memmove(buffer, &buffer[offset], bufferEnd - bufferPtr + keep);
bufferEnd -= offset;
bufferPtr -= offset;
2002-02-11 19:13:04 -04:00
}
#else
memmove(buffer, bufferPtr, bufferEnd - bufferPtr);
bufferEnd = buffer + (bufferEnd - bufferPtr);
bufferPtr = buffer;
#endif /* not defined XML_CONTEXT_BYTES */
}
else {
char *newBuf;
int bufferSize = bufferLim - bufferPtr;
if (bufferSize == 0)
2003-01-25 18:41:29 -04:00
bufferSize = INIT_BUFFER_SIZE;
2002-02-11 19:13:04 -04:00
do {
2003-01-25 18:41:29 -04:00
bufferSize *= 2;
2002-02-11 19:13:04 -04:00
} while (bufferSize < neededSize);
2003-01-25 18:41:29 -04:00
newBuf = (char *)MALLOC(bufferSize);
2002-02-11 19:13:04 -04:00
if (newBuf == 0) {
2003-01-25 18:41:29 -04:00
errorCode = XML_ERROR_NO_MEMORY;
return NULL;
2002-02-11 19:13:04 -04:00
}
bufferLim = newBuf + bufferSize;
#ifdef XML_CONTEXT_BYTES
if (bufferPtr) {
2003-01-25 18:41:29 -04:00
int keep = bufferPtr - buffer;
if (keep > XML_CONTEXT_BYTES)
keep = XML_CONTEXT_BYTES;
memcpy(newBuf, &bufferPtr[-keep], bufferEnd - bufferPtr + keep);
FREE(buffer);
buffer = newBuf;
bufferEnd = buffer + (bufferEnd - bufferPtr) + keep;
bufferPtr = buffer + keep;
2002-02-11 19:13:04 -04:00
}
else {
2003-01-25 18:41:29 -04:00
bufferEnd = newBuf + (bufferEnd - bufferPtr);
bufferPtr = buffer = newBuf;
2002-02-11 19:13:04 -04:00
}
#else
if (bufferPtr) {
2003-01-25 18:41:29 -04:00
memcpy(newBuf, bufferPtr, bufferEnd - bufferPtr);
FREE(buffer);
2002-02-11 19:13:04 -04:00
}
bufferEnd = newBuf + (bufferEnd - bufferPtr);
bufferPtr = buffer = newBuf;
#endif /* not defined XML_CONTEXT_BYTES */
}
}
return bufferEnd;
}
2004-08-03 04:06:22 -03:00
enum XML_Status XMLCALL
XML_StopParser(XML_Parser parser, XML_Bool resumable)
{
switch (parsing) {
case XML_SUSPENDED:
if (resumable) {
errorCode = XML_ERROR_SUSPENDED;
return XML_STATUS_ERROR;
}
parsing = XML_FINISHED;
break;
case XML_FINISHED:
errorCode = XML_ERROR_FINISHED;
return XML_STATUS_ERROR;
default:
if (resumable) {
#ifdef XML_DTD
if (isParamEntity) {
errorCode = XML_ERROR_SUSPEND_PE;
return XML_STATUS_ERROR;
}
#endif
parsing = XML_SUSPENDED;
}
else
parsing = XML_FINISHED;
}
return XML_STATUS_OK;
}
enum XML_Status XMLCALL
XML_ResumeParser(XML_Parser parser)
{
enum XML_Status result = XML_STATUS_OK;
2004-08-03 04:06:22 -03:00
if (parsing != XML_SUSPENDED) {
errorCode = XML_ERROR_NOT_SUSPENDED;
return XML_STATUS_ERROR;
}
parsing = XML_PARSING;
errorCode = processor(parser, bufferPtr, parseEndPtr, &bufferPtr);
if (errorCode != XML_ERROR_NONE) {
eventEndPtr = eventPtr;
processor = errorProcessor;
return XML_STATUS_ERROR;
}
else {
switch (parsing) {
case XML_SUSPENDED:
result = XML_STATUS_SUSPENDED;
break;
case XML_INITIALIZED:
case XML_PARSING:
if (finalBuffer) {
parsing = XML_FINISHED;
return result;
}
default: ;
}
}
XmlUpdatePosition(encoding, positionPtr, bufferPtr, &position);
positionPtr = bufferPtr;
return result;
}
void XMLCALL
XML_GetParsingStatus(XML_Parser parser, XML_ParsingStatus *status)
{
assert(status != NULL);
*status = parser->m_parsingStatus;
}
enum XML_Error XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetErrorCode(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
return errorCode;
}
long XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetCurrentByteIndex(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
if (eventPtr)
return parseEndByteIndex - (parseEndPtr - eventPtr);
return -1;
}
int XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetCurrentByteCount(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
if (eventEndPtr && eventPtr)
return eventEndPtr - eventPtr;
return 0;
}
const char * XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetInputContext(XML_Parser parser, int *offset, int *size)
2002-02-11 19:13:04 -04:00
{
#ifdef XML_CONTEXT_BYTES
if (eventPtr && buffer) {
*offset = eventPtr - buffer;
*size = bufferEnd - buffer;
return buffer;
}
#endif /* defined XML_CONTEXT_BYTES */
return (char *) 0;
}
int XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetCurrentLineNumber(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
if (eventPtr && eventPtr >= positionPtr) {
2002-02-11 19:13:04 -04:00
XmlUpdatePosition(encoding, positionPtr, eventPtr, &position);
positionPtr = eventPtr;
}
return position.lineNumber + 1;
}
int XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetCurrentColumnNumber(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
if (eventPtr && eventPtr >= positionPtr) {
2002-02-11 19:13:04 -04:00
XmlUpdatePosition(encoding, positionPtr, eventPtr, &position);
positionPtr = eventPtr;
}
return position.columnNumber;
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_FreeContentModel(XML_Parser parser, XML_Content *model)
{
FREE(model);
}
void * XMLCALL
2003-01-25 18:41:29 -04:00
XML_MemMalloc(XML_Parser parser, size_t size)
{
return MALLOC(size);
}
void * XMLCALL
2003-01-25 18:41:29 -04:00
XML_MemRealloc(XML_Parser parser, void *ptr, size_t size)
{
return REALLOC(ptr, size);
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_MemFree(XML_Parser parser, void *ptr)
{
FREE(ptr);
}
void XMLCALL
2003-01-25 18:41:29 -04:00
XML_DefaultCurrent(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
if (defaultHandler) {
if (openInternalEntities)
reportDefault(parser,
2003-01-25 18:41:29 -04:00
internalEncoding,
openInternalEntities->internalEventPtr,
openInternalEntities->internalEventEndPtr);
2002-02-11 19:13:04 -04:00
else
reportDefault(parser, encoding, eventPtr, eventEndPtr);
}
}
const XML_LChar * XMLCALL
2003-01-25 18:41:29 -04:00
XML_ErrorString(enum XML_Error code)
2002-02-11 19:13:04 -04:00
{
static const XML_LChar *message[] = {
0,
2003-01-25 18:41:29 -04:00
XML_L("out of memory"),
XML_L("syntax error"),
XML_L("no element found"),
XML_L("not well-formed (invalid token)"),
XML_L("unclosed token"),
XML_L("partial character"),
XML_L("mismatched tag"),
XML_L("duplicate attribute"),
XML_L("junk after document element"),
XML_L("illegal parameter entity reference"),
XML_L("undefined entity"),
XML_L("recursive entity reference"),
XML_L("asynchronous entity"),
XML_L("reference to invalid character number"),
XML_L("reference to binary entity"),
XML_L("reference to external entity in attribute"),
XML_L("xml declaration not at start of external entity"),
XML_L("unknown encoding"),
XML_L("encoding specified in XML declaration is incorrect"),
XML_L("unclosed CDATA section"),
XML_L("error in processing external entity reference"),
XML_L("document is not standalone"),
XML_L("unexpected parser state - please send a bug report"),
XML_L("entity declared in parameter entity"),
XML_L("requested feature requires XML_DTD support in Expat"),
XML_L("cannot change setting once parsing has begun"),
2004-08-03 04:06:22 -03:00
XML_L("unbound prefix"),
XML_L("must not undeclare prefix"),
XML_L("incomplete markup in parameter entity"),
XML_L("XML declaration not well-formed"),
XML_L("text declaration not well-formed"),
XML_L("illegal character(s) in public id"),
XML_L("parser suspended"),
XML_L("parser not suspended"),
XML_L("parsing aborted"),
XML_L("parsing finished"),
XML_L("cannot suspend in external parameter entity")
2002-02-11 19:13:04 -04:00
};
if (code > 0 && code < sizeof(message)/sizeof(message[0]))
return message[code];
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
}
const XML_LChar * XMLCALL
2002-02-11 19:13:04 -04:00
XML_ExpatVersion(void) {
2003-01-25 18:41:29 -04:00
/* V1 is used to string-ize the version number. However, it would
string-ize the actual version macro *names* unless we get them
substituted before being passed to V1. CPP is defined to expand
a macro, then rescan for more expansions. Thus, we use V2 to expand
the version macros, then CPP will expand the resulting V1() macro
with the correct numerals. */
/* ### I'm assuming cpp is portable in this respect... */
#define V1(a,b,c) XML_L(#a)XML_L(".")XML_L(#b)XML_L(".")XML_L(#c)
#define V2(a,b,c) XML_L("expat_")V1(a,b,c)
return V2(XML_MAJOR_VERSION, XML_MINOR_VERSION, XML_MICRO_VERSION);
#undef V1
#undef V2
2002-02-11 19:13:04 -04:00
}
XML_Expat_Version XMLCALL
2003-01-25 18:41:29 -04:00
XML_ExpatVersionInfo(void)
{
2002-02-11 19:13:04 -04:00
XML_Expat_Version version;
version.major = XML_MAJOR_VERSION;
version.minor = XML_MINOR_VERSION;
version.micro = XML_MICRO_VERSION;
return version;
}
const XML_Feature * XMLCALL
2003-01-25 18:41:29 -04:00
XML_GetFeatureList(void)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
static XML_Feature features[] = {
{XML_FEATURE_SIZEOF_XML_CHAR, XML_L("sizeof(XML_Char)"), 0},
{XML_FEATURE_SIZEOF_XML_LCHAR, XML_L("sizeof(XML_LChar)"), 0},
2003-01-25 18:41:29 -04:00
#ifdef XML_UNICODE
{XML_FEATURE_UNICODE, XML_L("XML_UNICODE"), 0},
2003-01-25 18:41:29 -04:00
#endif
#ifdef XML_UNICODE_WCHAR_T
{XML_FEATURE_UNICODE_WCHAR_T, XML_L("XML_UNICODE_WCHAR_T"), 0},
2003-01-25 18:41:29 -04:00
#endif
#ifdef XML_DTD
{XML_FEATURE_DTD, XML_L("XML_DTD"), 0},
2003-01-25 18:41:29 -04:00
#endif
#ifdef XML_CONTEXT_BYTES
{XML_FEATURE_CONTEXT_BYTES, XML_L("XML_CONTEXT_BYTES"),
XML_CONTEXT_BYTES},
#endif
#ifdef XML_MIN_SIZE
{XML_FEATURE_MIN_SIZE, XML_L("XML_MIN_SIZE"), 0},
2003-01-25 18:41:29 -04:00
#endif
{XML_FEATURE_END, NULL, 0}
2003-01-25 18:41:29 -04:00
};
features[0].value = sizeof(XML_Char);
features[1].value = sizeof(XML_LChar);
return features;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
/* Initially tag->rawName always points into the parse buffer;
for those TAG instances opened while the current parse buffer was
processed, and not yet closed, we need to store tag->rawName in a more
permanent location, since the parse buffer is about to be discarded.
*/
static XML_Bool
storeRawNames(XML_Parser parser)
{
TAG *tag = tagStack;
while (tag) {
int bufSize;
int nameLen = sizeof(XML_Char) * (tag->name.strLen + 1);
char *rawNameBuf = tag->buf + nameLen;
/* Stop if already stored. Since tagStack is a stack, we can stop
at the first entry that has already been copied; everything
below it in the stack is already been accounted for in a
previous call to this function.
*/
if (tag->rawName == rawNameBuf)
break;
/* For re-use purposes we need to ensure that the
size of tag->buf is a multiple of sizeof(XML_Char).
*/
bufSize = nameLen + ROUND_UP(tag->rawNameLength, sizeof(XML_Char));
if (bufSize > tag->bufEnd - tag->buf) {
char *temp = (char *)REALLOC(tag->buf, bufSize);
if (temp == NULL)
return XML_FALSE;
/* if tag->name.str points to tag->buf (only when namespace
processing is off) then we have to update it
*/
if (tag->name.str == (XML_Char *)tag->buf)
tag->name.str = (XML_Char *)temp;
/* if tag->name.localPart is set (when namespace processing is on)
then update it as well, since it will always point into tag->buf
*/
if (tag->name.localPart)
tag->name.localPart = (XML_Char *)temp + (tag->name.localPart -
(XML_Char *)tag->buf);
tag->buf = temp;
tag->bufEnd = temp + bufSize;
rawNameBuf = temp + nameLen;
}
memcpy(rawNameBuf, tag->rawName, tag->rawNameLength);
tag->rawName = rawNameBuf;
tag = tag->parent;
}
return XML_TRUE;
}
static enum XML_Error PTRCALL
contentProcessor(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr)
{
2004-08-03 04:06:22 -03:00
enum XML_Error result = doContent(parser, 0, encoding, start, end,
endPtr, (XML_Bool)!finalBuffer);
if (result == XML_ERROR_NONE) {
if (!storeRawNames(parser))
return XML_ERROR_NO_MEMORY;
}
2003-01-25 18:41:29 -04:00
return result;
}
static enum XML_Error PTRCALL
externalEntityInitProcessor(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr)
2002-02-11 19:13:04 -04:00
{
enum XML_Error result = initializeEncoding(parser);
if (result != XML_ERROR_NONE)
return result;
processor = externalEntityInitProcessor2;
return externalEntityInitProcessor2(parser, start, end, endPtr);
}
2003-01-25 18:41:29 -04:00
static enum XML_Error PTRCALL
externalEntityInitProcessor2(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
const char *next = start; /* XmlContentTok doesn't always set the last arg */
2002-02-11 19:13:04 -04:00
int tok = XmlContentTok(encoding, start, end, &next);
switch (tok) {
case XML_TOK_BOM:
2003-01-25 18:41:29 -04:00
/* If we are at the end of the buffer, this would cause the next stage,
i.e. externalEntityInitProcessor3, to pass control directly to
doContent (by detecting XML_TOK_NONE) without processing any xml text
declaration - causing the error XML_ERROR_MISPLACED_XML_PI in doContent.
*/
2004-08-03 04:06:22 -03:00
if (next == end && !finalBuffer) {
2003-01-25 18:41:29 -04:00
*endPtr = next;
return XML_ERROR_NONE;
}
2002-02-11 19:13:04 -04:00
start = next;
break;
case XML_TOK_PARTIAL:
2004-08-03 04:06:22 -03:00
if (!finalBuffer) {
2002-02-11 19:13:04 -04:00
*endPtr = start;
return XML_ERROR_NONE;
}
eventPtr = start;
return XML_ERROR_UNCLOSED_TOKEN;
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
if (!finalBuffer) {
2002-02-11 19:13:04 -04:00
*endPtr = start;
return XML_ERROR_NONE;
}
eventPtr = start;
return XML_ERROR_PARTIAL_CHAR;
}
processor = externalEntityInitProcessor3;
return externalEntityInitProcessor3(parser, start, end, endPtr);
}
2003-01-25 18:41:29 -04:00
static enum XML_Error PTRCALL
externalEntityInitProcessor3(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
int tok;
2003-01-25 18:41:29 -04:00
const char *next = start; /* XmlContentTok doesn't always set the last arg */
2004-08-03 04:06:22 -03:00
eventPtr = start;
tok = XmlContentTok(encoding, start, end, &next);
eventEndPtr = next;
2002-02-11 19:13:04 -04:00
switch (tok) {
case XML_TOK_XML_DECL:
{
2004-08-03 04:06:22 -03:00
enum XML_Error result;
result = processXmlDecl(parser, 1, start, next);
2002-02-11 19:13:04 -04:00
if (result != XML_ERROR_NONE)
2003-01-25 18:41:29 -04:00
return result;
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
*endPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
default:
start = next;
}
2002-02-11 19:13:04 -04:00
}
break;
case XML_TOK_PARTIAL:
2004-08-03 04:06:22 -03:00
if (!finalBuffer) {
2002-02-11 19:13:04 -04:00
*endPtr = start;
return XML_ERROR_NONE;
}
return XML_ERROR_UNCLOSED_TOKEN;
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
if (!finalBuffer) {
2002-02-11 19:13:04 -04:00
*endPtr = start;
return XML_ERROR_NONE;
}
return XML_ERROR_PARTIAL_CHAR;
}
processor = externalEntityContentProcessor;
tagLevel = 1;
2003-01-25 18:41:29 -04:00
return externalEntityContentProcessor(parser, start, end, endPtr);
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static enum XML_Error PTRCALL
externalEntityContentProcessor(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
enum XML_Error result = doContent(parser, 1, encoding, start, end,
endPtr, (XML_Bool)!finalBuffer);
if (result == XML_ERROR_NONE) {
if (!storeRawNames(parser))
return XML_ERROR_NO_MEMORY;
}
2003-01-25 18:41:29 -04:00
return result;
2002-02-11 19:13:04 -04:00
}
static enum XML_Error
doContent(XML_Parser parser,
2003-01-25 18:41:29 -04:00
int startTagLevel,
const ENCODING *enc,
const char *s,
const char *end,
2004-08-03 04:06:22 -03:00
const char **nextPtr,
XML_Bool haveMore)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
/* save one level of indirection */
DTD * const dtd = _dtd;
2002-02-11 19:13:04 -04:00
const char **eventPP;
const char **eventEndPP;
if (enc == encoding) {
eventPP = &eventPtr;
eventEndPP = &eventEndPtr;
}
else {
eventPP = &(openInternalEntities->internalEventPtr);
eventEndPP = &(openInternalEntities->internalEventEndPtr);
}
*eventPP = s;
2004-08-03 04:06:22 -03:00
2002-02-11 19:13:04 -04:00
for (;;) {
const char *next = s; /* XmlContentTok doesn't always set the last arg */
int tok = XmlContentTok(enc, s, end, &next);
*eventEndPP = next;
switch (tok) {
case XML_TOK_TRAILING_CR:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
*eventEndPP = end;
if (characterDataHandler) {
2003-01-25 18:41:29 -04:00
XML_Char c = 0xA;
characterDataHandler(handlerArg, &c, 1);
2002-02-11 19:13:04 -04:00
}
else if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, end);
2004-08-03 04:06:22 -03:00
/* We are at the end of the final buffer, should we check for
XML_SUSPENDED, XML_FINISHED?
*/
2002-02-11 19:13:04 -04:00
if (startTagLevel == 0)
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_ELEMENTS;
2002-02-11 19:13:04 -04:00
if (tagLevel != startTagLevel)
2003-01-25 18:41:29 -04:00
return XML_ERROR_ASYNC_ENTITY;
2004-08-03 04:06:22 -03:00
*nextPtr = end;
2002-02-11 19:13:04 -04:00
return XML_ERROR_NONE;
case XML_TOK_NONE:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
if (startTagLevel > 0) {
2003-01-25 18:41:29 -04:00
if (tagLevel != startTagLevel)
return XML_ERROR_ASYNC_ENTITY;
2004-08-03 04:06:22 -03:00
*nextPtr = s;
2003-01-25 18:41:29 -04:00
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
return XML_ERROR_NO_ELEMENTS;
case XML_TOK_INVALID:
*eventPP = next;
return XML_ERROR_INVALID_TOKEN;
case XML_TOK_PARTIAL:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
return XML_ERROR_UNCLOSED_TOKEN;
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
return XML_ERROR_PARTIAL_CHAR;
case XML_TOK_ENTITY_REF:
{
2003-01-25 18:41:29 -04:00
const XML_Char *name;
ENTITY *entity;
XML_Char ch = (XML_Char) XmlPredefinedEntityName(enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (ch) {
if (characterDataHandler)
characterDataHandler(handlerArg, &ch, 1);
else if (defaultHandler)
reportDefault(parser, enc, s, next);
break;
}
name = poolStoreString(&dtd->pool, enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!name)
return XML_ERROR_NO_MEMORY;
entity = (ENTITY *)lookup(&dtd->generalEntities, name, 0);
poolDiscard(&dtd->pool);
/* First, determine if a check for an existing declaration is needed;
if yes, check that the entity exists, and that it is internal,
otherwise call the skipped entity or default handler.
*/
if (!dtd->hasParamEntityRefs || dtd->standalone) {
if (!entity)
return XML_ERROR_UNDEFINED_ENTITY;
else if (!entity->is_internal)
return XML_ERROR_ENTITY_DECLARED_IN_PE;
}
else if (!entity) {
if (skippedEntityHandler)
skippedEntityHandler(handlerArg, name, 0);
else if (defaultHandler)
reportDefault(parser, enc, s, next);
break;
}
if (entity->open)
return XML_ERROR_RECURSIVE_ENTITY_REF;
if (entity->notation)
return XML_ERROR_BINARY_ENTITY_REF;
if (entity->textPtr) {
enum XML_Error result;
if (!defaultExpandInternalEntities) {
if (skippedEntityHandler)
skippedEntityHandler(handlerArg, entity->name, 0);
else if (defaultHandler)
reportDefault(parser, enc, s, next);
break;
}
2004-08-03 04:06:22 -03:00
result = processInternalEntity(parser, entity, XML_FALSE);
if (result != XML_ERROR_NONE)
2003-01-25 18:41:29 -04:00
return result;
}
else if (externalEntityRefHandler) {
const XML_Char *context;
entity->open = XML_TRUE;
context = getContext(parser);
entity->open = XML_FALSE;
if (!context)
return XML_ERROR_NO_MEMORY;
2004-08-03 04:06:22 -03:00
if (!externalEntityRefHandler(externalEntityRefHandlerArg,
2003-01-25 18:41:29 -04:00
context,
entity->base,
entity->systemId,
entity->publicId))
return XML_ERROR_EXTERNAL_ENTITY_HANDLING;
poolDiscard(&tempPool);
}
else if (defaultHandler)
reportDefault(parser, enc, s, next);
break;
2002-02-11 19:13:04 -04:00
}
case XML_TOK_START_TAG_NO_ATTS:
2003-01-25 18:41:29 -04:00
/* fall through */
case XML_TOK_START_TAG_WITH_ATTS:
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
TAG *tag;
enum XML_Error result;
XML_Char *toPtr;
if (freeTagList) {
tag = freeTagList;
freeTagList = freeTagList->parent;
}
else {
tag = (TAG *)MALLOC(sizeof(TAG));
if (!tag)
return XML_ERROR_NO_MEMORY;
tag->buf = (char *)MALLOC(INIT_TAG_BUF_SIZE);
if (!tag->buf) {
FREE(tag);
return XML_ERROR_NO_MEMORY;
}
tag->bufEnd = tag->buf + INIT_TAG_BUF_SIZE;
}
tag->bindings = NULL;
tag->parent = tagStack;
tagStack = tag;
tag->name.localPart = NULL;
tag->name.prefix = NULL;
tag->rawName = s + enc->minBytesPerChar;
tag->rawNameLength = XmlNameLength(enc, tag->rawName);
++tagLevel;
{
const char *rawNameEnd = tag->rawName + tag->rawNameLength;
const char *fromPtr = tag->rawName;
toPtr = (XML_Char *)tag->buf;
for (;;) {
int bufSize;
int convLen;
XmlConvert(enc,
&fromPtr, rawNameEnd,
(ICHAR **)&toPtr, (ICHAR *)tag->bufEnd - 1);
convLen = toPtr - (XML_Char *)tag->buf;
if (fromPtr == rawNameEnd) {
tag->name.strLen = convLen;
break;
}
bufSize = (tag->bufEnd - tag->buf) << 1;
{
char *temp = (char *)REALLOC(tag->buf, bufSize);
if (temp == NULL)
return XML_ERROR_NO_MEMORY;
tag->buf = temp;
tag->bufEnd = temp + bufSize;
toPtr = (XML_Char *)temp + convLen;
}
}
}
tag->name.str = (XML_Char *)tag->buf;
*toPtr = XML_T('\0');
result = storeAtts(parser, enc, s, &(tag->name), &(tag->bindings));
if (result)
return result;
if (startElementHandler)
startElementHandler(handlerArg, tag->name.str,
(const XML_Char **)atts);
else if (defaultHandler)
reportDefault(parser, enc, s, next);
poolClear(&tempPool);
break;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
case XML_TOK_EMPTY_ELEMENT_NO_ATTS:
2002-02-11 19:13:04 -04:00
/* fall through */
2003-01-25 18:41:29 -04:00
case XML_TOK_EMPTY_ELEMENT_WITH_ATTS:
{
const char *rawName = s + enc->minBytesPerChar;
enum XML_Error result;
BINDING *bindings = NULL;
XML_Bool noElmHandlers = XML_TRUE;
TAG_NAME name;
name.str = poolStoreString(&tempPool, enc, rawName,
rawName + XmlNameLength(enc, rawName));
if (!name.str)
return XML_ERROR_NO_MEMORY;
poolFinish(&tempPool);
result = storeAtts(parser, enc, s, &name, &bindings);
if (result)
return result;
poolFinish(&tempPool);
2003-01-25 18:41:29 -04:00
if (startElementHandler) {
startElementHandler(handlerArg, name.str, (const XML_Char **)atts);
noElmHandlers = XML_FALSE;
}
if (endElementHandler) {
if (startElementHandler)
*eventPP = *eventEndPP;
endElementHandler(handlerArg, name.str);
noElmHandlers = XML_FALSE;
}
if (noElmHandlers && defaultHandler)
reportDefault(parser, enc, s, next);
poolClear(&tempPool);
while (bindings) {
BINDING *b = bindings;
if (endNamespaceDeclHandler)
endNamespaceDeclHandler(handlerArg, b->prefix->name);
bindings = bindings->nextTagBinding;
b->nextTagBinding = freeBindingList;
freeBindingList = b;
b->prefix->binding = b->prevPrefixBinding;
}
2002-02-11 19:13:04 -04:00
}
if (tagLevel == 0)
2003-01-25 18:41:29 -04:00
return epilogProcessor(parser, next, end, nextPtr);
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_END_TAG:
if (tagLevel == startTagLevel)
return XML_ERROR_ASYNC_ENTITY;
else {
2003-01-25 18:41:29 -04:00
int len;
const char *rawName;
TAG *tag = tagStack;
tagStack = tag->parent;
tag->parent = freeTagList;
freeTagList = tag;
rawName = s + enc->minBytesPerChar*2;
len = XmlNameLength(enc, rawName);
if (len != tag->rawNameLength
|| memcmp(tag->rawName, rawName, len) != 0) {
*eventPP = rawName;
return XML_ERROR_TAG_MISMATCH;
}
--tagLevel;
if (endElementHandler) {
const XML_Char *localPart;
const XML_Char *prefix;
XML_Char *uri;
localPart = tag->name.localPart;
if (ns && localPart) {
/* localPart and prefix may have been overwritten in
tag->name.str, since this points to the binding->uri
buffer which gets re-used; so we have to add them again
*/
uri = (XML_Char *)tag->name.str + tag->name.uriLen;
/* don't need to check for space - already done in storeAtts() */
while (*localPart) *uri++ = *localPart++;
prefix = (XML_Char *)tag->name.prefix;
if (ns_triplets && prefix) {
*uri++ = namespaceSeparator;
while (*prefix) *uri++ = *prefix++;
}
*uri = XML_T('\0');
}
endElementHandler(handlerArg, tag->name.str);
}
else if (defaultHandler)
reportDefault(parser, enc, s, next);
while (tag->bindings) {
BINDING *b = tag->bindings;
if (endNamespaceDeclHandler)
endNamespaceDeclHandler(handlerArg, b->prefix->name);
tag->bindings = tag->bindings->nextTagBinding;
b->nextTagBinding = freeBindingList;
freeBindingList = b;
b->prefix->binding = b->prevPrefixBinding;
}
if (tagLevel == 0)
return epilogProcessor(parser, next, end, nextPtr);
2002-02-11 19:13:04 -04:00
}
break;
case XML_TOK_CHAR_REF:
{
2003-01-25 18:41:29 -04:00
int n = XmlCharRefNumber(enc, s);
if (n < 0)
return XML_ERROR_BAD_CHAR_REF;
if (characterDataHandler) {
XML_Char buf[XML_ENCODE_MAX];
characterDataHandler(handlerArg, buf, XmlEncode(n, (ICHAR *)buf));
}
else if (defaultHandler)
reportDefault(parser, enc, s, next);
2002-02-11 19:13:04 -04:00
}
break;
case XML_TOK_XML_DECL:
return XML_ERROR_MISPLACED_XML_PI;
case XML_TOK_DATA_NEWLINE:
if (characterDataHandler) {
2003-01-25 18:41:29 -04:00
XML_Char c = 0xA;
characterDataHandler(handlerArg, &c, 1);
2002-02-11 19:13:04 -04:00
}
else if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, next);
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_CDATA_SECT_OPEN:
{
2003-01-25 18:41:29 -04:00
enum XML_Error result;
if (startCdataSectionHandler)
startCdataSectionHandler(handlerArg);
2002-02-11 19:13:04 -04:00
#if 0
2003-01-25 18:41:29 -04:00
/* Suppose you doing a transformation on a document that involves
changing only the character data. You set up a defaultHandler
and a characterDataHandler. The defaultHandler simply copies
characters through. The characterDataHandler does the
transformation and writes the characters out escaping them as
necessary. This case will fail to work if we leave out the
following two lines (because & and < inside CDATA sections will
be incorrectly escaped).
However, now we have a start/endCdataSectionHandler, so it seems
easier to let the user deal with this.
*/
else if (characterDataHandler)
characterDataHandler(handlerArg, dataBuf, 0);
2002-02-11 19:13:04 -04:00
#endif
2003-01-25 18:41:29 -04:00
else if (defaultHandler)
reportDefault(parser, enc, s, next);
2004-08-03 04:06:22 -03:00
result = doCdataSection(parser, enc, &next, end, nextPtr, haveMore);
if (result != XML_ERROR_NONE)
return result;
else if (!next) {
2003-01-25 18:41:29 -04:00
processor = cdataSectionProcessor;
return result;
}
2002-02-11 19:13:04 -04:00
}
break;
case XML_TOK_TRAILING_RSQB:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
if (characterDataHandler) {
2003-01-25 18:41:29 -04:00
if (MUST_CONVERT(enc, s)) {
ICHAR *dataPtr = (ICHAR *)dataBuf;
XmlConvert(enc, &s, end, &dataPtr, (ICHAR *)dataBufEnd);
characterDataHandler(handlerArg, dataBuf,
dataPtr - (ICHAR *)dataBuf);
}
else
characterDataHandler(handlerArg,
(XML_Char *)s,
(XML_Char *)end - (XML_Char *)s);
2002-02-11 19:13:04 -04:00
}
else if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, end);
2004-08-03 04:06:22 -03:00
/* We are at the end of the final buffer, should we check for
XML_SUSPENDED, XML_FINISHED?
*/
2002-02-11 19:13:04 -04:00
if (startTagLevel == 0) {
*eventPP = end;
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_ELEMENTS;
2002-02-11 19:13:04 -04:00
}
if (tagLevel != startTagLevel) {
2003-01-25 18:41:29 -04:00
*eventPP = end;
return XML_ERROR_ASYNC_ENTITY;
2002-02-11 19:13:04 -04:00
}
2004-08-03 04:06:22 -03:00
*nextPtr = end;
2002-02-11 19:13:04 -04:00
return XML_ERROR_NONE;
case XML_TOK_DATA_CHARS:
if (characterDataHandler) {
2003-01-25 18:41:29 -04:00
if (MUST_CONVERT(enc, s)) {
for (;;) {
ICHAR *dataPtr = (ICHAR *)dataBuf;
XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd);
*eventEndPP = s;
characterDataHandler(handlerArg, dataBuf,
dataPtr - (ICHAR *)dataBuf);
if (s == next)
break;
*eventPP = s;
}
}
else
characterDataHandler(handlerArg,
(XML_Char *)s,
(XML_Char *)next - (XML_Char *)s);
2002-02-11 19:13:04 -04:00
}
else if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, next);
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_PI:
if (!reportProcessingInstruction(parser, enc, s, next))
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_COMMENT:
if (!reportComment(parser, enc, s, next))
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
break;
default:
if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, next);
2002-02-11 19:13:04 -04:00
break;
}
*eventPP = s = next;
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
default: ;
}
2002-02-11 19:13:04 -04:00
}
/* not reached */
}
/* Precondition: all arguments must be non-NULL;
Purpose:
- normalize attributes
- check attributes for well-formedness
- generate namespace aware attribute names (URI, prefix)
- build list of attributes for startElementHandler
- default attributes
- process namespace declarations (check and report them)
- generate namespace aware element name (URI, prefix)
2003-01-25 18:41:29 -04:00
*/
static enum XML_Error
storeAtts(XML_Parser parser, const ENCODING *enc,
const char *attStr, TAG_NAME *tagNamePtr,
BINDING **bindingsPtr)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
ELEMENT_TYPE *elementType;
int nDefaultAtts;
2003-01-25 18:41:29 -04:00
const XML_Char **appAtts; /* the attribute list for the application */
2002-02-11 19:13:04 -04:00
int attIndex = 0;
2003-01-25 18:41:29 -04:00
int prefixLen;
2002-02-11 19:13:04 -04:00
int i;
int n;
2003-01-25 18:41:29 -04:00
XML_Char *uri;
2002-02-11 19:13:04 -04:00
int nPrefixes = 0;
BINDING *binding;
const XML_Char *localPart;
/* lookup the element type name */
elementType = (ELEMENT_TYPE *)lookup(&dtd->elementTypes, tagNamePtr->str,0);
if (!elementType) {
const XML_Char *name = poolCopyString(&dtd->pool, tagNamePtr->str);
if (!name)
return XML_ERROR_NO_MEMORY;
elementType = (ELEMENT_TYPE *)lookup(&dtd->elementTypes, name,
sizeof(ELEMENT_TYPE));
if (!elementType)
return XML_ERROR_NO_MEMORY;
if (ns && !setElementTypePrefix(parser, elementType))
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
}
nDefaultAtts = elementType->nDefaultAtts;
2002-02-11 19:13:04 -04:00
/* get the attributes from the tokenizer */
n = XmlGetAttributes(enc, attStr, attsSize, atts);
if (n + nDefaultAtts > attsSize) {
int oldAttsSize = attsSize;
2003-01-25 18:41:29 -04:00
ATTRIBUTE *temp;
2002-02-11 19:13:04 -04:00
attsSize = n + nDefaultAtts + INIT_ATTS_SIZE;
2003-01-25 18:41:29 -04:00
temp = (ATTRIBUTE *)REALLOC((void *)atts, attsSize * sizeof(ATTRIBUTE));
if (temp == NULL)
2002-02-11 19:13:04 -04:00
return XML_ERROR_NO_MEMORY;
2003-01-25 18:41:29 -04:00
atts = temp;
2002-02-11 19:13:04 -04:00
if (n > oldAttsSize)
XmlGetAttributes(enc, attStr, n, atts);
}
2002-02-11 19:13:04 -04:00
appAtts = (const XML_Char **)atts;
for (i = 0; i < n; i++) {
/* add the name and value to the attribute list */
ATTRIBUTE_ID *attId = getAttributeId(parser, enc, atts[i].name,
2003-01-25 18:41:29 -04:00
atts[i].name
+ XmlNameLength(enc, atts[i].name));
2002-02-11 19:13:04 -04:00
if (!attId)
return XML_ERROR_NO_MEMORY;
/* Detect duplicate attributes by their QNames. This does not work when
namespace processing is turned on and different prefixes for the same
namespace are used. For this case we have a check further down.
*/
2002-02-11 19:13:04 -04:00
if ((attId->name)[-1]) {
if (enc == encoding)
2003-01-25 18:41:29 -04:00
eventPtr = atts[i].name;
2002-02-11 19:13:04 -04:00
return XML_ERROR_DUPLICATE_ATTRIBUTE;
}
(attId->name)[-1] = 1;
appAtts[attIndex++] = attId->name;
if (!atts[i].normalized) {
enum XML_Error result;
2003-01-25 18:41:29 -04:00
XML_Bool isCdata = XML_TRUE;
2002-02-11 19:13:04 -04:00
/* figure out whether declared as other than CDATA */
if (attId->maybeTokenized) {
2003-01-25 18:41:29 -04:00
int j;
for (j = 0; j < nDefaultAtts; j++) {
if (attId == elementType->defaultAtts[j].id) {
isCdata = elementType->defaultAtts[j].isCdata;
break;
}
}
2002-02-11 19:13:04 -04:00
}
/* normalize the attribute value */
result = storeAttributeValue(parser, enc, isCdata,
2003-01-25 18:41:29 -04:00
atts[i].valuePtr, atts[i].valueEnd,
&tempPool);
2002-02-11 19:13:04 -04:00
if (result)
2003-01-25 18:41:29 -04:00
return result;
appAtts[attIndex] = poolStart(&tempPool);
poolFinish(&tempPool);
2002-02-11 19:13:04 -04:00
}
else {
2002-02-11 19:13:04 -04:00
/* the value did not need normalizing */
2003-01-25 18:41:29 -04:00
appAtts[attIndex] = poolStoreString(&tempPool, enc, atts[i].valuePtr,
atts[i].valueEnd);
2002-02-11 19:13:04 -04:00
if (appAtts[attIndex] == 0)
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
poolFinish(&tempPool);
}
/* handle prefixed attribute names */
if (attId->prefix) {
2002-02-11 19:13:04 -04:00
if (attId->xmlns) {
2003-01-25 18:41:29 -04:00
/* deal with namespace declarations here */
enum XML_Error result = addBinding(parser, attId->prefix, attId,
appAtts[attIndex], bindingsPtr);
if (result)
return result;
2002-02-11 19:13:04 -04:00
--attIndex;
}
else {
2003-01-25 18:41:29 -04:00
/* deal with other prefixed names later */
2002-02-11 19:13:04 -04:00
attIndex++;
nPrefixes++;
(attId->name)[-1] = 2;
}
}
else
attIndex++;
}
/* set-up for XML_GetSpecifiedAttributeCount and XML_GetIdAttributeIndex */
nSpecifiedAtts = attIndex;
if (elementType->idAtt && (elementType->idAtt->name)[-1]) {
for (i = 0; i < attIndex; i += 2)
if (appAtts[i] == elementType->idAtt->name) {
idAttIndex = i;
break;
}
}
else
idAttIndex = -1;
/* do attribute defaulting */
for (i = 0; i < nDefaultAtts; i++) {
const DEFAULT_ATTRIBUTE *da = elementType->defaultAtts + i;
if (!(da->id->name)[-1] && da->value) {
if (da->id->prefix) {
if (da->id->xmlns) {
enum XML_Error result = addBinding(parser, da->id->prefix, da->id,
da->value, bindingsPtr);
if (result)
return result;
2003-01-25 18:41:29 -04:00
}
else {
(da->id->name)[-1] = 2;
nPrefixes++;
2003-01-25 18:41:29 -04:00
appAtts[attIndex++] = da->id->name;
appAtts[attIndex++] = da->value;
}
2002-02-11 19:13:04 -04:00
}
else {
(da->id->name)[-1] = 1;
appAtts[attIndex++] = da->id->name;
appAtts[attIndex++] = da->value;
}
2002-02-11 19:13:04 -04:00
}
}
appAtts[attIndex] = 0;
/* expand prefixed attribute names, check for duplicates,
and clear flags that say whether attributes were specified */
2002-02-11 19:13:04 -04:00
i = 0;
if (nPrefixes) {
int j; /* hash table index */
unsigned long version = nsAttsVersion;
int nsAttsSize = (int)1 << nsAttsPower;
/* size of hash table must be at least 2 * (# of prefixed attributes) */
if ((nPrefixes << 1) >> nsAttsPower) { /* true for nsAttsPower = 0 */
NS_ATT *temp;
/* hash table size must also be a power of 2 and >= 8 */
while (nPrefixes >> nsAttsPower++);
if (nsAttsPower < 3)
nsAttsPower = 3;
nsAttsSize = (int)1 << nsAttsPower;
temp = (NS_ATT *)REALLOC(nsAtts, nsAttsSize * sizeof(NS_ATT));
if (!temp)
return XML_ERROR_NO_MEMORY;
nsAtts = temp;
version = 0; /* force re-initialization of nsAtts hash table */
}
/* using a version flag saves us from initializing nsAtts every time */
if (!version) { /* initialize version flags when version wraps around */
version = INIT_ATTS_VERSION;
for (j = nsAttsSize; j != 0; )
nsAtts[--j].version = version;
}
nsAttsVersion = --version;
/* expand prefixed names and check for duplicates */
2002-02-11 19:13:04 -04:00
for (; i < attIndex; i += 2) {
const XML_Char *s = appAtts[i];
if (s[-1] == 2) { /* prefixed */
2002-02-11 19:13:04 -04:00
ATTRIBUTE_ID *id;
const BINDING *b;
unsigned long uriHash = 0;
((XML_Char *)s)[-1] = 0; /* clear flag */
id = (ATTRIBUTE_ID *)lookup(&dtd->attributeIds, s, 0);
b = id->prefix->binding;
if (!b)
return XML_ERROR_UNBOUND_PREFIX;
/* as we expand the name we also calculate its hash value */
for (j = 0; j < b->uriLen; j++) {
const XML_Char c = b->uri[j];
if (!poolAppendChar(&tempPool, c))
return XML_ERROR_NO_MEMORY;
uriHash = CHAR_HASH(uriHash, c);
}
while (*s++ != XML_T(':'))
;
do { /* copies null terminator */
const XML_Char c = *s;
if (!poolAppendChar(&tempPool, *s))
return XML_ERROR_NO_MEMORY;
uriHash = CHAR_HASH(uriHash, c);
} while (*s++);
{ /* Check hash table for duplicate of expanded name (uriName).
Derived from code in lookup(HASH_TABLE *table, ...).
*/
unsigned char step = 0;
unsigned long mask = nsAttsSize - 1;
j = uriHash & mask; /* index into hash table */
while (nsAtts[j].version == version) {
/* for speed we compare stored hash values first */
if (uriHash == nsAtts[j].hash) {
const XML_Char *s1 = poolStart(&tempPool);
const XML_Char *s2 = nsAtts[j].uriName;
/* s1 is null terminated, but not s2 */
for (; *s1 == *s2 && *s1 != 0; s1++, s2++);
if (*s1 == 0)
return XML_ERROR_DUPLICATE_ATTRIBUTE;
}
if (!step)
step = PROBE_STEP(uriHash, mask, nsAttsPower);
j < step ? ( j += nsAttsSize - step) : (j -= step);
2003-01-25 18:41:29 -04:00
}
}
if (ns_triplets) { /* append namespace separator and prefix */
tempPool.ptr[-1] = namespaceSeparator;
s = b->prefix->name;
2003-01-25 18:41:29 -04:00
do {
if (!poolAppendChar(&tempPool, *s))
return XML_ERROR_NO_MEMORY;
} while (*s++);
}
/* store expanded name in attribute list */
s = poolStart(&tempPool);
poolFinish(&tempPool);
appAtts[i] = s;
/* fill empty slot with new version, uriName and hash value */
nsAtts[j].version = version;
nsAtts[j].hash = uriHash;
nsAtts[j].uriName = s;
2003-01-25 18:41:29 -04:00
if (!--nPrefixes)
break;
2002-02-11 19:13:04 -04:00
}
else /* not prefixed */
((XML_Char *)s)[-1] = 0; /* clear flag */
2002-02-11 19:13:04 -04:00
}
}
/* clear flags for the remaining attributes */
2002-02-11 19:13:04 -04:00
for (; i < attIndex; i += 2)
((XML_Char *)(appAtts[i]))[-1] = 0;
for (binding = *bindingsPtr; binding; binding = binding->nextTagBinding)
binding->attId->name[-1] = 0;
if (!ns)
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
/* expand the element type name */
if (elementType->prefix) {
binding = elementType->prefix->binding;
if (!binding)
return XML_ERROR_UNBOUND_PREFIX;
2002-02-11 19:13:04 -04:00
localPart = tagNamePtr->str;
while (*localPart++ != XML_T(':'))
;
}
2003-01-25 18:41:29 -04:00
else if (dtd->defaultPrefix.binding) {
binding = dtd->defaultPrefix.binding;
2002-02-11 19:13:04 -04:00
localPart = tagNamePtr->str;
}
else
return XML_ERROR_NONE;
2003-01-25 18:41:29 -04:00
prefixLen = 0;
if (ns_triplets && binding->prefix->name) {
2003-01-25 18:41:29 -04:00
for (; binding->prefix->name[prefixLen++];)
;
}
2002-02-11 19:13:04 -04:00
tagNamePtr->localPart = localPart;
tagNamePtr->uriLen = binding->uriLen;
2003-01-25 18:41:29 -04:00
tagNamePtr->prefix = binding->prefix->name;
tagNamePtr->prefixLen = prefixLen;
2002-02-11 19:13:04 -04:00
for (i = 0; localPart[i++];)
;
2003-01-25 18:41:29 -04:00
n = i + binding->uriLen + prefixLen;
2002-02-11 19:13:04 -04:00
if (n > binding->uriAlloc) {
TAG *p;
2003-01-25 18:41:29 -04:00
uri = (XML_Char *)MALLOC((n + EXPAND_SPARE) * sizeof(XML_Char));
2002-02-11 19:13:04 -04:00
if (!uri)
return XML_ERROR_NO_MEMORY;
binding->uriAlloc = n + EXPAND_SPARE;
memcpy(uri, binding->uri, binding->uriLen * sizeof(XML_Char));
for (p = tagStack; p; p = p->parent)
if (p->name.str == binding->uri)
2003-01-25 18:41:29 -04:00
p->name.str = uri;
2002-02-11 19:13:04 -04:00
FREE(binding->uri);
binding->uri = uri;
}
2003-01-25 18:41:29 -04:00
uri = binding->uri + binding->uriLen;
memcpy(uri, localPart, i * sizeof(XML_Char));
if (prefixLen) {
uri = uri + (i - 1);
if (namespaceSeparator)
*uri = namespaceSeparator;
2003-01-25 18:41:29 -04:00
memcpy(uri + 1, binding->prefix->name, prefixLen * sizeof(XML_Char));
}
2002-02-11 19:13:04 -04:00
tagNamePtr->str = binding->uri;
return XML_ERROR_NONE;
}
2003-01-25 18:41:29 -04:00
/* addBinding() overwrites the value of prefix->binding without checking.
Therefore one must keep track of the old value outside of addBinding().
*/
static enum XML_Error
addBinding(XML_Parser parser, PREFIX *prefix, const ATTRIBUTE_ID *attId,
const XML_Char *uri, BINDING **bindingsPtr)
2002-02-11 19:13:04 -04:00
{
BINDING *b;
int len;
2003-01-25 18:41:29 -04:00
2004-08-03 04:06:22 -03:00
/* empty URI is only valid for default namespace per XML NS 1.0 (not 1.1) */
2003-01-25 18:41:29 -04:00
if (*uri == XML_T('\0') && prefix->name)
2004-08-03 04:06:22 -03:00
return XML_ERROR_UNDECLARING_PREFIX;
2003-01-25 18:41:29 -04:00
2002-02-11 19:13:04 -04:00
for (len = 0; uri[len]; len++)
;
if (namespaceSeparator)
len++;
if (freeBindingList) {
b = freeBindingList;
if (len > b->uriAlloc) {
2003-01-25 18:41:29 -04:00
XML_Char *temp = (XML_Char *)REALLOC(b->uri,
sizeof(XML_Char) * (len + EXPAND_SPARE));
if (temp == NULL)
return XML_ERROR_NO_MEMORY;
b->uri = temp;
2002-02-11 19:13:04 -04:00
b->uriAlloc = len + EXPAND_SPARE;
}
freeBindingList = b->nextTagBinding;
}
else {
2003-01-25 18:41:29 -04:00
b = (BINDING *)MALLOC(sizeof(BINDING));
2002-02-11 19:13:04 -04:00
if (!b)
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
b->uri = (XML_Char *)MALLOC(sizeof(XML_Char) * (len + EXPAND_SPARE));
2002-02-11 19:13:04 -04:00
if (!b->uri) {
FREE(b);
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
}
b->uriAlloc = len + EXPAND_SPARE;
}
b->uriLen = len;
memcpy(b->uri, uri, len * sizeof(XML_Char));
if (namespaceSeparator)
b->uri[len - 1] = namespaceSeparator;
b->prefix = prefix;
b->attId = attId;
b->prevPrefixBinding = prefix->binding;
/* NULL binding when default namespace undeclared */
2003-01-25 18:41:29 -04:00
if (*uri == XML_T('\0') && prefix == &_dtd->defaultPrefix)
prefix->binding = NULL;
2002-02-11 19:13:04 -04:00
else
prefix->binding = b;
b->nextTagBinding = *bindingsPtr;
*bindingsPtr = b;
2004-08-03 04:06:22 -03:00
/* if attId == NULL then we are not starting a namespace scope */
if (attId && startNamespaceDeclHandler)
2002-02-11 19:13:04 -04:00
startNamespaceDeclHandler(handlerArg, prefix->name,
2003-01-25 18:41:29 -04:00
prefix->binding ? uri : 0);
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
/* The idea here is to avoid using stack for each CDATA section when
2003-01-25 18:41:29 -04:00
the whole file is parsed with one call.
*/
static enum XML_Error PTRCALL
cdataSectionProcessor(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
enum XML_Error result = doCdataSection(parser, encoding, &start, end,
endPtr, (XML_Bool)!finalBuffer);
if (result != XML_ERROR_NONE)
return result;
2002-02-11 19:13:04 -04:00
if (start) {
2003-01-25 18:41:29 -04:00
if (parentParser) { /* we are parsing an external entity */
processor = externalEntityContentProcessor;
return externalEntityContentProcessor(parser, start, end, endPtr);
}
else {
processor = contentProcessor;
return contentProcessor(parser, start, end, endPtr);
}
2002-02-11 19:13:04 -04:00
}
return result;
}
2004-08-03 04:06:22 -03:00
/* startPtr gets set to non-null if the section is closed, and to null if
2003-01-25 18:41:29 -04:00
the section is not yet closed.
*/
static enum XML_Error
doCdataSection(XML_Parser parser,
const ENCODING *enc,
const char **startPtr,
const char *end,
2004-08-03 04:06:22 -03:00
const char **nextPtr,
XML_Bool haveMore)
2002-02-11 19:13:04 -04:00
{
const char *s = *startPtr;
const char **eventPP;
const char **eventEndPP;
if (enc == encoding) {
eventPP = &eventPtr;
*eventPP = s;
eventEndPP = &eventEndPtr;
}
else {
eventPP = &(openInternalEntities->internalEventPtr);
eventEndPP = &(openInternalEntities->internalEventEndPtr);
}
*eventPP = s;
2003-01-25 18:41:29 -04:00
*startPtr = NULL;
2004-08-03 04:06:22 -03:00
2002-02-11 19:13:04 -04:00
for (;;) {
const char *next;
int tok = XmlCdataSectionTok(enc, s, end, &next);
*eventEndPP = next;
switch (tok) {
case XML_TOK_CDATA_SECT_CLOSE:
if (endCdataSectionHandler)
2003-01-25 18:41:29 -04:00
endCdataSectionHandler(handlerArg);
2002-02-11 19:13:04 -04:00
#if 0
/* see comment under XML_TOK_CDATA_SECT_OPEN */
else if (characterDataHandler)
2003-01-25 18:41:29 -04:00
characterDataHandler(handlerArg, dataBuf, 0);
2002-02-11 19:13:04 -04:00
#endif
else if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, next);
2002-02-11 19:13:04 -04:00
*startPtr = next;
2004-08-03 04:06:22 -03:00
*nextPtr = next;
if (parsing == XML_FINISHED)
return XML_ERROR_ABORTED;
else
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
case XML_TOK_DATA_NEWLINE:
if (characterDataHandler) {
2003-01-25 18:41:29 -04:00
XML_Char c = 0xA;
characterDataHandler(handlerArg, &c, 1);
2002-02-11 19:13:04 -04:00
}
else if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, next);
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_DATA_CHARS:
if (characterDataHandler) {
2003-01-25 18:41:29 -04:00
if (MUST_CONVERT(enc, s)) {
for (;;) {
ICHAR *dataPtr = (ICHAR *)dataBuf;
XmlConvert(enc, &s, next, &dataPtr, (ICHAR *)dataBufEnd);
*eventEndPP = next;
characterDataHandler(handlerArg, dataBuf,
dataPtr - (ICHAR *)dataBuf);
if (s == next)
break;
*eventPP = s;
}
}
else
characterDataHandler(handlerArg,
(XML_Char *)s,
(XML_Char *)next - (XML_Char *)s);
2002-02-11 19:13:04 -04:00
}
else if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, enc, s, next);
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_INVALID:
*eventPP = next;
return XML_ERROR_INVALID_TOKEN;
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
return XML_ERROR_PARTIAL_CHAR;
case XML_TOK_PARTIAL:
case XML_TOK_NONE:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
return XML_ERROR_UNCLOSED_CDATA_SECTION;
default:
*eventPP = next;
return XML_ERROR_UNEXPECTED_STATE;
}
2004-08-03 04:06:22 -03:00
2002-02-11 19:13:04 -04:00
*eventPP = s = next;
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
default: ;
}
2002-02-11 19:13:04 -04:00
}
/* not reached */
}
#ifdef XML_DTD
/* The idea here is to avoid using stack for each IGNORE section when
2003-01-25 18:41:29 -04:00
the whole file is parsed with one call.
*/
static enum XML_Error PTRCALL
ignoreSectionProcessor(XML_Parser parser,
const char *start,
const char *end,
const char **endPtr)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
enum XML_Error result = doIgnoreSection(parser, encoding, &start, end,
endPtr, (XML_Bool)!finalBuffer);
if (result != XML_ERROR_NONE)
return result;
2002-02-11 19:13:04 -04:00
if (start) {
processor = prologProcessor;
return prologProcessor(parser, start, end, endPtr);
}
return result;
}
2003-01-25 18:41:29 -04:00
/* startPtr gets set to non-null is the section is closed, and to null
if the section is not yet closed.
*/
static enum XML_Error
doIgnoreSection(XML_Parser parser,
const ENCODING *enc,
const char **startPtr,
const char *end,
2004-08-03 04:06:22 -03:00
const char **nextPtr,
XML_Bool haveMore)
2002-02-11 19:13:04 -04:00
{
const char *next;
int tok;
const char *s = *startPtr;
const char **eventPP;
const char **eventEndPP;
if (enc == encoding) {
eventPP = &eventPtr;
*eventPP = s;
eventEndPP = &eventEndPtr;
}
else {
eventPP = &(openInternalEntities->internalEventPtr);
eventEndPP = &(openInternalEntities->internalEventEndPtr);
}
*eventPP = s;
2003-01-25 18:41:29 -04:00
*startPtr = NULL;
2002-02-11 19:13:04 -04:00
tok = XmlIgnoreSectionTok(enc, s, end, &next);
*eventEndPP = next;
switch (tok) {
case XML_TOK_IGNORE_SECT:
if (defaultHandler)
reportDefault(parser, enc, s, next);
*startPtr = next;
2004-08-03 04:06:22 -03:00
*nextPtr = next;
if (parsing == XML_FINISHED)
return XML_ERROR_ABORTED;
else
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
case XML_TOK_INVALID:
*eventPP = next;
return XML_ERROR_INVALID_TOKEN;
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2002-02-11 19:13:04 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
}
return XML_ERROR_PARTIAL_CHAR;
case XML_TOK_PARTIAL:
case XML_TOK_NONE:
2004-08-03 04:06:22 -03:00
if (haveMore) {
2002-02-11 19:13:04 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
}
return XML_ERROR_SYNTAX; /* XML_ERROR_UNCLOSED_IGNORE_SECTION */
default:
*eventPP = next;
return XML_ERROR_UNEXPECTED_STATE;
}
/* not reached */
}
#endif /* XML_DTD */
static enum XML_Error
initializeEncoding(XML_Parser parser)
{
const char *s;
#ifdef XML_UNICODE
char encodingBuf[128];
if (!protocolEncodingName)
2003-01-25 18:41:29 -04:00
s = NULL;
2002-02-11 19:13:04 -04:00
else {
int i;
for (i = 0; protocolEncodingName[i]; i++) {
if (i == sizeof(encodingBuf) - 1
2003-01-25 18:41:29 -04:00
|| (protocolEncodingName[i] & ~0x7f) != 0) {
encodingBuf[0] = '\0';
break;
2002-02-11 19:13:04 -04:00
}
encodingBuf[i] = (char)protocolEncodingName[i];
}
encodingBuf[i] = '\0';
s = encodingBuf;
}
#else
s = protocolEncodingName;
#endif
if ((ns ? XmlInitEncodingNS : XmlInitEncoding)(&initEncoding, &encoding, s))
return XML_ERROR_NONE;
return handleUnknownEncoding(parser, protocolEncodingName);
}
static enum XML_Error
processXmlDecl(XML_Parser parser, int isGeneralTextEntity,
2003-01-25 18:41:29 -04:00
const char *s, const char *next)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
const char *encodingName = NULL;
const XML_Char *storedEncName = NULL;
const ENCODING *newEncoding = NULL;
const char *version = NULL;
2002-02-11 19:13:04 -04:00
const char *versionend;
2003-01-25 18:41:29 -04:00
const XML_Char *storedversion = NULL;
2002-02-11 19:13:04 -04:00
int standalone = -1;
if (!(ns
? XmlParseXmlDeclNS
2003-01-25 18:41:29 -04:00
: XmlParseXmlDecl)(isGeneralTextEntity,
encoding,
s,
next,
&eventPtr,
&version,
&versionend,
&encodingName,
&newEncoding,
2004-08-03 04:06:22 -03:00
&standalone)) {
if (isGeneralTextEntity)
return XML_ERROR_TEXT_DECL;
else
return XML_ERROR_XML_DECL;
}
2002-02-11 19:13:04 -04:00
if (!isGeneralTextEntity && standalone == 1) {
2003-01-25 18:41:29 -04:00
_dtd->standalone = XML_TRUE;
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
if (paramEntityParsing == XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE)
paramEntityParsing = XML_PARAM_ENTITY_PARSING_NEVER;
#endif /* XML_DTD */
}
if (xmlDeclHandler) {
2003-01-25 18:41:29 -04:00
if (encodingName != NULL) {
2002-02-11 19:13:04 -04:00
storedEncName = poolStoreString(&temp2Pool,
2003-01-25 18:41:29 -04:00
encoding,
encodingName,
encodingName
+ XmlNameLength(encoding, encodingName));
if (!storedEncName)
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
poolFinish(&temp2Pool);
}
if (version) {
storedversion = poolStoreString(&temp2Pool,
2003-01-25 18:41:29 -04:00
encoding,
version,
versionend - encoding->minBytesPerChar);
if (!storedversion)
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
}
xmlDeclHandler(handlerArg, storedversion, storedEncName, standalone);
}
else if (defaultHandler)
reportDefault(parser, encoding, s, next);
2003-01-25 18:41:29 -04:00
if (protocolEncodingName == NULL) {
2002-02-11 19:13:04 -04:00
if (newEncoding) {
if (newEncoding->minBytesPerChar != encoding->minBytesPerChar) {
2003-01-25 18:41:29 -04:00
eventPtr = encodingName;
return XML_ERROR_INCORRECT_ENCODING;
2002-02-11 19:13:04 -04:00
}
encoding = newEncoding;
}
else if (encodingName) {
enum XML_Error result;
2003-01-25 18:41:29 -04:00
if (!storedEncName) {
storedEncName = poolStoreString(
&temp2Pool, encoding, encodingName,
encodingName + XmlNameLength(encoding, encodingName));
if (!storedEncName)
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
}
result = handleUnknownEncoding(parser, storedEncName);
2003-01-25 18:41:29 -04:00
poolClear(&temp2Pool);
2002-02-11 19:13:04 -04:00
if (result == XML_ERROR_UNKNOWN_ENCODING)
2003-01-25 18:41:29 -04:00
eventPtr = encodingName;
2002-02-11 19:13:04 -04:00
return result;
}
}
if (storedEncName || storedversion)
poolClear(&temp2Pool);
return XML_ERROR_NONE;
}
static enum XML_Error
handleUnknownEncoding(XML_Parser parser, const XML_Char *encodingName)
{
if (unknownEncodingHandler) {
XML_Encoding info;
int i;
for (i = 0; i < 256; i++)
info.map[i] = -1;
2003-01-25 18:41:29 -04:00
info.convert = NULL;
info.data = NULL;
info.release = NULL;
if (unknownEncodingHandler(unknownEncodingHandlerData, encodingName,
&info)) {
2002-02-11 19:13:04 -04:00
ENCODING *enc;
unknownEncodingMem = MALLOC(XmlSizeOfUnknownEncoding());
if (!unknownEncodingMem) {
2003-01-25 18:41:29 -04:00
if (info.release)
info.release(info.data);
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
}
enc = (ns
2003-01-25 18:41:29 -04:00
? XmlInitUnknownEncodingNS
: XmlInitUnknownEncoding)(unknownEncodingMem,
info.map,
info.convert,
info.data);
2002-02-11 19:13:04 -04:00
if (enc) {
2003-01-25 18:41:29 -04:00
unknownEncodingData = info.data;
unknownEncodingRelease = info.release;
encoding = enc;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
}
2003-01-25 18:41:29 -04:00
if (info.release != NULL)
info.release(info.data);
}
return XML_ERROR_UNKNOWN_ENCODING;
}
static enum XML_Error PTRCALL
prologInitProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
{
enum XML_Error result = initializeEncoding(parser);
if (result != XML_ERROR_NONE)
return result;
processor = prologProcessor;
return prologProcessor(parser, s, end, nextPtr);
}
#ifdef XML_DTD
static enum XML_Error PTRCALL
externalParEntInitProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
{
enum XML_Error result = initializeEncoding(parser);
if (result != XML_ERROR_NONE)
return result;
/* we know now that XML_Parse(Buffer) has been called,
so we consider the external parameter entity read */
_dtd->paramEntityRead = XML_TRUE;
if (prologState.inEntityValue) {
processor = entityValueInitProcessor;
return entityValueInitProcessor(parser, s, end, nextPtr);
}
else {
processor = externalParEntProcessor;
return externalParEntProcessor(parser, s, end, nextPtr);
}
}
static enum XML_Error PTRCALL
entityValueInitProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
{
int tok;
2004-08-03 04:06:22 -03:00
const char *start = s;
const char *next = start;
eventPtr = start;
2003-01-25 18:41:29 -04:00
2004-08-03 04:06:22 -03:00
for (;;) {
2003-01-25 18:41:29 -04:00
tok = XmlPrologTok(encoding, start, end, &next);
2004-08-03 04:06:22 -03:00
eventEndPtr = next;
2003-01-25 18:41:29 -04:00
if (tok <= 0) {
2004-08-03 04:06:22 -03:00
if (!finalBuffer && tok != XML_TOK_INVALID) {
*nextPtr = s;
return XML_ERROR_NONE;
2003-01-25 18:41:29 -04:00
}
switch (tok) {
case XML_TOK_INVALID:
2004-08-03 04:06:22 -03:00
return XML_ERROR_INVALID_TOKEN;
2003-01-25 18:41:29 -04:00
case XML_TOK_PARTIAL:
2004-08-03 04:06:22 -03:00
return XML_ERROR_UNCLOSED_TOKEN;
2003-01-25 18:41:29 -04:00
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
return XML_ERROR_PARTIAL_CHAR;
2003-01-25 18:41:29 -04:00
case XML_TOK_NONE: /* start == end */
default:
break;
}
2004-08-03 04:06:22 -03:00
/* found end of entity value - can store it now */
2003-01-25 18:41:29 -04:00
return storeEntityValue(parser, encoding, s, end);
}
else if (tok == XML_TOK_XML_DECL) {
2004-08-03 04:06:22 -03:00
enum XML_Error result;
result = processXmlDecl(parser, 0, start, next);
if (result != XML_ERROR_NONE)
return result;
switch (parsing) {
case XML_SUSPENDED:
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
default:
*nextPtr = next;
}
2003-01-25 18:41:29 -04:00
/* stop scanning for text declaration - we found one */
processor = entityValueProcessor;
return entityValueProcessor(parser, next, end, nextPtr);
}
/* If we are at the end of the buffer, this would cause XmlPrologTok to
return XML_TOK_NONE on the next call, which would then cause the
function to exit with *nextPtr set to s - that is what we want for other
tokens, but not for the BOM - we would rather like to skip it;
then, when this routine is entered the next time, XmlPrologTok will
return XML_TOK_INVALID, since the BOM is still in the buffer
*/
2004-08-03 04:06:22 -03:00
else if (tok == XML_TOK_BOM && next == end && !finalBuffer) {
2003-01-25 18:41:29 -04:00
*nextPtr = next;
return XML_ERROR_NONE;
}
start = next;
2004-08-03 04:06:22 -03:00
eventPtr = start;
2003-01-25 18:41:29 -04:00
}
}
static enum XML_Error PTRCALL
externalParEntProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
{
const char *next = s;
int tok;
2004-08-03 04:06:22 -03:00
tok = XmlPrologTok(encoding, s, end, &next);
2003-01-25 18:41:29 -04:00
if (tok <= 0) {
2004-08-03 04:06:22 -03:00
if (!finalBuffer && tok != XML_TOK_INVALID) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
}
switch (tok) {
case XML_TOK_INVALID:
return XML_ERROR_INVALID_TOKEN;
case XML_TOK_PARTIAL:
return XML_ERROR_UNCLOSED_TOKEN;
case XML_TOK_PARTIAL_CHAR:
return XML_ERROR_PARTIAL_CHAR;
case XML_TOK_NONE: /* start == end */
default:
break;
}
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
/* This would cause the next stage, i.e. doProlog to be passed XML_TOK_BOM.
However, when parsing an external subset, doProlog will not accept a BOM
as valid, and report a syntax error, so we have to skip the BOM
*/
else if (tok == XML_TOK_BOM) {
s = next;
tok = XmlPrologTok(encoding, s, end, &next);
}
processor = prologProcessor;
2004-08-03 04:06:22 -03:00
return doProlog(parser, encoding, s, end, tok, next,
nextPtr, (XML_Bool)!finalBuffer);
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static enum XML_Error PTRCALL
entityValueProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
const char *start = s;
const char *next = s;
const ENCODING *enc = encoding;
int tok;
for (;;) {
tok = XmlPrologTok(enc, start, end, &next);
if (tok <= 0) {
2004-08-03 04:06:22 -03:00
if (!finalBuffer && tok != XML_TOK_INVALID) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
}
switch (tok) {
case XML_TOK_INVALID:
2004-08-03 04:06:22 -03:00
return XML_ERROR_INVALID_TOKEN;
2003-01-25 18:41:29 -04:00
case XML_TOK_PARTIAL:
2004-08-03 04:06:22 -03:00
return XML_ERROR_UNCLOSED_TOKEN;
2003-01-25 18:41:29 -04:00
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
return XML_ERROR_PARTIAL_CHAR;
2003-01-25 18:41:29 -04:00
case XML_TOK_NONE: /* start == end */
default:
break;
}
2004-08-03 04:06:22 -03:00
/* found end of entity value - can store it now */
2003-01-25 18:41:29 -04:00
return storeEntityValue(parser, enc, s, end);
}
start = next;
}
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
#endif /* XML_DTD */
static enum XML_Error PTRCALL
2002-02-11 19:13:04 -04:00
prologProcessor(XML_Parser parser,
2003-01-25 18:41:29 -04:00
const char *s,
const char *end,
const char **nextPtr)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
const char *next = s;
2002-02-11 19:13:04 -04:00
int tok = XmlPrologTok(encoding, s, end, &next);
2004-08-03 04:06:22 -03:00
return doProlog(parser, encoding, s, end, tok, next,
nextPtr, (XML_Bool)!finalBuffer);
2002-02-11 19:13:04 -04:00
}
static enum XML_Error
doProlog(XML_Parser parser,
2003-01-25 18:41:29 -04:00
const ENCODING *enc,
const char *s,
const char *end,
int tok,
const char *next,
2004-08-03 04:06:22 -03:00
const char **nextPtr,
XML_Bool haveMore)
2002-02-11 19:13:04 -04:00
{
#ifdef XML_DTD
static const XML_Char externalSubsetName[] = { '#' , '\0' };
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
static const XML_Char atypeCDATA[] = { 'C', 'D', 'A', 'T', 'A', '\0' };
static const XML_Char atypeID[] = { 'I', 'D', '\0' };
static const XML_Char atypeIDREF[] = { 'I', 'D', 'R', 'E', 'F', '\0' };
static const XML_Char atypeIDREFS[] = { 'I', 'D', 'R', 'E', 'F', 'S', '\0' };
static const XML_Char atypeENTITY[] = { 'E', 'N', 'T', 'I', 'T', 'Y', '\0' };
static const XML_Char atypeENTITIES[] =
{ 'E', 'N', 'T', 'I', 'T', 'I', 'E', 'S', '\0' };
static const XML_Char atypeNMTOKEN[] = {
'N', 'M', 'T', 'O', 'K', 'E', 'N', '\0' };
static const XML_Char atypeNMTOKENS[] = {
'N', 'M', 'T', 'O', 'K', 'E', 'N', 'S', '\0' };
static const XML_Char notationPrefix[] = {
'N', 'O', 'T', 'A', 'T', 'I', 'O', 'N', '(', '\0' };
static const XML_Char enumValueSep[] = { '|', '\0' };
static const XML_Char enumValueStart[] = { '(', '\0' };
2004-08-03 04:06:22 -03:00
/* save one level of indirection */
DTD * const dtd = _dtd;
2002-02-11 19:13:04 -04:00
const char **eventPP;
const char **eventEndPP;
enum XML_Content_Quant quant;
if (enc == encoding) {
eventPP = &eventPtr;
eventEndPP = &eventEndPtr;
}
else {
eventPP = &(openInternalEntities->internalEventPtr);
eventEndPP = &(openInternalEntities->internalEventEndPtr);
}
2004-08-03 04:06:22 -03:00
2002-02-11 19:13:04 -04:00
for (;;) {
int role;
2003-01-25 18:41:29 -04:00
XML_Bool handleDefault = XML_TRUE;
2002-02-11 19:13:04 -04:00
*eventPP = s;
*eventEndPP = next;
if (tok <= 0) {
2004-08-03 04:06:22 -03:00
if (haveMore && tok != XML_TOK_INVALID) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
switch (tok) {
case XML_TOK_INVALID:
2003-01-25 18:41:29 -04:00
*eventPP = next;
return XML_ERROR_INVALID_TOKEN;
2002-02-11 19:13:04 -04:00
case XML_TOK_PARTIAL:
2003-01-25 18:41:29 -04:00
return XML_ERROR_UNCLOSED_TOKEN;
2002-02-11 19:13:04 -04:00
case XML_TOK_PARTIAL_CHAR:
2003-01-25 18:41:29 -04:00
return XML_ERROR_PARTIAL_CHAR;
2002-02-11 19:13:04 -04:00
case XML_TOK_NONE:
#ifdef XML_DTD
2004-08-03 04:06:22 -03:00
/* for internal PE NOT referenced between declarations */
if (enc != encoding && !openInternalEntities->betweenDecl) {
*nextPtr = s;
2003-01-25 18:41:29 -04:00
return XML_ERROR_NONE;
2004-08-03 04:06:22 -03:00
}
/* WFC: PE Between Declarations - must check that PE contains
complete markup, not only for external PEs, but also for
internal PEs if the reference occurs between declarations.
*/
if (isParamEntity || enc != encoding) {
2003-01-25 18:41:29 -04:00
if (XmlTokenRole(&prologState, XML_TOK_NONE, end, end, enc)
== XML_ROLE_ERROR)
2004-08-03 04:06:22 -03:00
return XML_ERROR_INCOMPLETE_PE;
*nextPtr = s;
2003-01-25 18:41:29 -04:00
return XML_ERROR_NONE;
}
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_ELEMENTS;
2002-02-11 19:13:04 -04:00
default:
2003-01-25 18:41:29 -04:00
tok = -tok;
next = end;
break;
2002-02-11 19:13:04 -04:00
}
}
role = XmlTokenRole(&prologState, tok, s, next, enc);
switch (role) {
case XML_ROLE_XML_DECL:
{
2003-01-25 18:41:29 -04:00
enum XML_Error result = processXmlDecl(parser, 0, s, next);
if (result != XML_ERROR_NONE)
return result;
enc = encoding;
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_DOCTYPE_NAME:
if (startDoctypeDeclHandler) {
2003-01-25 18:41:29 -04:00
doctypeName = poolStoreString(&tempPool, enc, s, next);
if (!doctypeName)
return XML_ERROR_NO_MEMORY;
poolFinish(&tempPool);
doctypePubid = NULL;
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
doctypeSysid = NULL; /* always initialize to NULL */
2002-02-11 19:13:04 -04:00
break;
case XML_ROLE_DOCTYPE_INTERNAL_SUBSET:
if (startDoctypeDeclHandler) {
2003-01-25 18:41:29 -04:00
startDoctypeDeclHandler(handlerArg, doctypeName, doctypeSysid,
doctypePubid, 1);
doctypeName = NULL;
poolClear(&tempPool);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
#ifdef XML_DTD
case XML_ROLE_TEXT_DECL:
{
2003-01-25 18:41:29 -04:00
enum XML_Error result = processXmlDecl(parser, 1, s, next);
if (result != XML_ERROR_NONE)
return result;
enc = encoding;
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
#endif /* XML_DTD */
case XML_ROLE_DOCTYPE_PUBLIC_ID:
2003-01-25 18:41:29 -04:00
#ifdef XML_DTD
useForeignDTD = XML_FALSE;
2004-08-03 04:06:22 -03:00
declEntity = (ENTITY *)lookup(&dtd->paramEntities,
externalSubsetName,
sizeof(ENTITY));
if (!declEntity)
return XML_ERROR_NO_MEMORY;
2003-01-25 18:41:29 -04:00
#endif /* XML_DTD */
dtd->hasParamEntityRefs = XML_TRUE;
2002-02-11 19:13:04 -04:00
if (startDoctypeDeclHandler) {
2004-08-03 04:06:22 -03:00
if (!XmlIsPublicId(enc, s, next, eventPP))
return XML_ERROR_PUBLICID;
2003-01-25 18:41:29 -04:00
doctypePubid = poolStoreString(&tempPool, enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!doctypePubid)
return XML_ERROR_NO_MEMORY;
2004-08-03 04:06:22 -03:00
normalizePublicId((XML_Char *)doctypePubid);
2003-01-25 18:41:29 -04:00
poolFinish(&tempPool);
handleDefault = XML_FALSE;
2004-08-03 04:06:22 -03:00
goto alreadyChecked;
2002-02-11 19:13:04 -04:00
}
/* fall through */
case XML_ROLE_ENTITY_PUBLIC_ID:
if (!XmlIsPublicId(enc, s, next, eventPP))
2004-08-03 04:06:22 -03:00
return XML_ERROR_PUBLICID;
alreadyChecked:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing && declEntity) {
XML_Char *tem = poolStoreString(&dtd->pool,
enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!tem)
return XML_ERROR_NO_MEMORY;
normalizePublicId(tem);
declEntity->publicId = tem;
poolFinish(&dtd->pool);
if (entityDeclHandler)
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_DOCTYPE_CLOSE:
if (doctypeName) {
2003-01-25 18:41:29 -04:00
startDoctypeDeclHandler(handlerArg, doctypeName,
doctypeSysid, doctypePubid, 0);
poolClear(&tempPool);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
/* doctypeSysid will be non-NULL in the case of a previous
XML_ROLE_DOCTYPE_SYSTEM_ID, even if startDoctypeDeclHandler
was not set, indicating an external subset
*/
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
if (doctypeSysid || useForeignDTD) {
dtd->hasParamEntityRefs = XML_TRUE; /* when docTypeSysid == NULL */
if (paramEntityParsing && externalEntityRefHandler) {
ENTITY *entity = (ENTITY *)lookup(&dtd->paramEntities,
externalSubsetName,
sizeof(ENTITY));
if (!entity)
return XML_ERROR_NO_MEMORY;
if (useForeignDTD)
entity->base = curBase;
dtd->paramEntityRead = XML_FALSE;
if (!externalEntityRefHandler(externalEntityRefHandlerArg,
0,
entity->base,
entity->systemId,
entity->publicId))
return XML_ERROR_EXTERNAL_ENTITY_HANDLING;
if (dtd->paramEntityRead &&
!dtd->standalone &&
notStandaloneHandler &&
!notStandaloneHandler(handlerArg))
return XML_ERROR_NOT_STANDALONE;
/* end of DTD - no need to update dtd->keepProcessing */
}
useForeignDTD = XML_FALSE;
}
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
if (endDoctypeDeclHandler) {
endDoctypeDeclHandler(handlerArg);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_INSTANCE_START:
2003-01-25 18:41:29 -04:00
#ifdef XML_DTD
/* if there is no DOCTYPE declaration then now is the
last chance to read the foreign DTD
*/
if (useForeignDTD) {
dtd->hasParamEntityRefs = XML_TRUE;
if (paramEntityParsing && externalEntityRefHandler) {
ENTITY *entity = (ENTITY *)lookup(&dtd->paramEntities,
externalSubsetName,
sizeof(ENTITY));
if (!entity)
return XML_ERROR_NO_MEMORY;
entity->base = curBase;
dtd->paramEntityRead = XML_FALSE;
if (!externalEntityRefHandler(externalEntityRefHandlerArg,
0,
entity->base,
entity->systemId,
entity->publicId))
return XML_ERROR_EXTERNAL_ENTITY_HANDLING;
if (dtd->paramEntityRead &&
!dtd->standalone &&
notStandaloneHandler &&
!notStandaloneHandler(handlerArg))
return XML_ERROR_NOT_STANDALONE;
/* end of DTD - no need to update dtd->keepProcessing */
}
}
#endif /* XML_DTD */
2002-02-11 19:13:04 -04:00
processor = contentProcessor;
return contentProcessor(parser, s, end, nextPtr);
case XML_ROLE_ATTLIST_ELEMENT_NAME:
declElementType = getElementType(parser, enc, s, next);
if (!declElementType)
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_NAME:
declAttributeId = getAttributeId(parser, enc, s, next);
if (!declAttributeId)
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
declAttributeIsCdata = XML_FALSE;
declAttributeType = NULL;
declAttributeIsId = XML_FALSE;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_CDATA:
2003-01-25 18:41:29 -04:00
declAttributeIsCdata = XML_TRUE;
declAttributeType = atypeCDATA;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_ID:
2003-01-25 18:41:29 -04:00
declAttributeIsId = XML_TRUE;
declAttributeType = atypeID;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_IDREF:
2003-01-25 18:41:29 -04:00
declAttributeType = atypeIDREF;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_IDREFS:
2003-01-25 18:41:29 -04:00
declAttributeType = atypeIDREFS;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_ENTITY:
2003-01-25 18:41:29 -04:00
declAttributeType = atypeENTITY;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_ENTITIES:
2003-01-25 18:41:29 -04:00
declAttributeType = atypeENTITIES;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_NMTOKEN:
2003-01-25 18:41:29 -04:00
declAttributeType = atypeNMTOKEN;
goto checkAttListDeclHandler;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ATTRIBUTE_TYPE_NMTOKENS:
2003-01-25 18:41:29 -04:00
declAttributeType = atypeNMTOKENS;
checkAttListDeclHandler:
if (dtd->keepProcessing && attlistDeclHandler)
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
break;
case XML_ROLE_ATTRIBUTE_ENUM_VALUE:
case XML_ROLE_ATTRIBUTE_NOTATION_VALUE:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing && attlistDeclHandler) {
const XML_Char *prefix;
if (declAttributeType) {
prefix = enumValueSep;
}
else {
prefix = (role == XML_ROLE_ATTRIBUTE_NOTATION_VALUE
? notationPrefix
: enumValueStart);
}
if (!poolAppendString(&tempPool, prefix))
return XML_ERROR_NO_MEMORY;
if (!poolAppend(&tempPool, enc, s, next))
return XML_ERROR_NO_MEMORY;
declAttributeType = tempPool.start;
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_IMPLIED_ATTRIBUTE_VALUE:
case XML_ROLE_REQUIRED_ATTRIBUTE_VALUE:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing) {
if (!defineAttribute(declElementType, declAttributeId,
declAttributeIsCdata, declAttributeIsId,
0, parser))
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
if (attlistDeclHandler && declAttributeType) {
if (*declAttributeType == XML_T('(')
|| (*declAttributeType == XML_T('N')
&& declAttributeType[1] == XML_T('O'))) {
/* Enumerated or Notation type */
if (!poolAppendChar(&tempPool, XML_T(')'))
|| !poolAppendChar(&tempPool, XML_T('\0')))
return XML_ERROR_NO_MEMORY;
declAttributeType = tempPool.start;
poolFinish(&tempPool);
}
*eventEndPP = s;
attlistDeclHandler(handlerArg, declElementType->name,
declAttributeId->name, declAttributeType,
0, role == XML_ROLE_REQUIRED_ATTRIBUTE_VALUE);
poolClear(&tempPool);
handleDefault = XML_FALSE;
}
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_DEFAULT_ATTRIBUTE_VALUE:
case XML_ROLE_FIXED_ATTRIBUTE_VALUE:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing) {
const XML_Char *attVal;
enum XML_Error result =
storeAttributeValue(parser, enc, declAttributeIsCdata,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar,
&dtd->pool);
2003-01-25 18:41:29 -04:00
if (result)
return result;
attVal = poolStart(&dtd->pool);
poolFinish(&dtd->pool);
/* ID attributes aren't allowed to have a default */
if (!defineAttribute(declElementType, declAttributeId,
declAttributeIsCdata, XML_FALSE, attVal, parser))
return XML_ERROR_NO_MEMORY;
if (attlistDeclHandler && declAttributeType) {
if (*declAttributeType == XML_T('(')
|| (*declAttributeType == XML_T('N')
&& declAttributeType[1] == XML_T('O'))) {
/* Enumerated or Notation type */
if (!poolAppendChar(&tempPool, XML_T(')'))
|| !poolAppendChar(&tempPool, XML_T('\0')))
return XML_ERROR_NO_MEMORY;
declAttributeType = tempPool.start;
poolFinish(&tempPool);
}
*eventEndPP = s;
attlistDeclHandler(handlerArg, declElementType->name,
declAttributeId->name, declAttributeType,
attVal,
role == XML_ROLE_FIXED_ATTRIBUTE_VALUE);
poolClear(&tempPool);
handleDefault = XML_FALSE;
}
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
break;
2002-02-11 19:13:04 -04:00
case XML_ROLE_ENTITY_VALUE:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing) {
enum XML_Error result = storeEntityValue(parser, enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (declEntity) {
declEntity->textPtr = poolStart(&dtd->entityValuePool);
declEntity->textLen = poolLength(&dtd->entityValuePool);
poolFinish(&dtd->entityValuePool);
if (entityDeclHandler) {
*eventEndPP = s;
entityDeclHandler(handlerArg,
declEntity->name,
declEntity->is_param,
declEntity->textPtr,
declEntity->textLen,
curBase, 0, 0, 0);
handleDefault = XML_FALSE;
}
}
else
poolDiscard(&dtd->entityValuePool);
if (result != XML_ERROR_NONE)
return result;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_DOCTYPE_SYSTEM_ID:
2003-01-25 18:41:29 -04:00
#ifdef XML_DTD
useForeignDTD = XML_FALSE;
#endif /* XML_DTD */
dtd->hasParamEntityRefs = XML_TRUE;
2002-02-11 19:13:04 -04:00
if (startDoctypeDeclHandler) {
2003-01-25 18:41:29 -04:00
doctypeSysid = poolStoreString(&tempPool, enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (doctypeSysid == NULL)
return XML_ERROR_NO_MEMORY;
poolFinish(&tempPool);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
else
/* use externalSubsetName to make doctypeSysid non-NULL
for the case where no startDoctypeDeclHandler is set */
doctypeSysid = externalSubsetName;
#endif /* XML_DTD */
if (!dtd->standalone
#ifdef XML_DTD
&& !paramEntityParsing
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
&& notStandaloneHandler
&& !notStandaloneHandler(handlerArg))
return XML_ERROR_NOT_STANDALONE;
2002-02-11 19:13:04 -04:00
#ifndef XML_DTD
break;
#else /* XML_DTD */
if (!declEntity) {
2003-01-25 18:41:29 -04:00
declEntity = (ENTITY *)lookup(&dtd->paramEntities,
externalSubsetName,
sizeof(ENTITY));
if (!declEntity)
return XML_ERROR_NO_MEMORY;
declEntity->publicId = NULL;
2002-02-11 19:13:04 -04:00
}
/* fall through */
#endif /* XML_DTD */
case XML_ROLE_ENTITY_SYSTEM_ID:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing && declEntity) {
declEntity->systemId = poolStoreString(&dtd->pool, enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!declEntity->systemId)
return XML_ERROR_NO_MEMORY;
declEntity->base = curBase;
poolFinish(&dtd->pool);
if (entityDeclHandler)
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_ENTITY_COMPLETE:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing && declEntity && entityDeclHandler) {
*eventEndPP = s;
entityDeclHandler(handlerArg,
declEntity->name,
declEntity->is_param,
0,0,
declEntity->base,
declEntity->systemId,
declEntity->publicId,
0);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_ENTITY_NOTATION_NAME:
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing && declEntity) {
declEntity->notation = poolStoreString(&dtd->pool, enc, s, next);
if (!declEntity->notation)
return XML_ERROR_NO_MEMORY;
poolFinish(&dtd->pool);
if (unparsedEntityDeclHandler) {
*eventEndPP = s;
unparsedEntityDeclHandler(handlerArg,
declEntity->name,
declEntity->base,
declEntity->systemId,
declEntity->publicId,
declEntity->notation);
handleDefault = XML_FALSE;
}
else if (entityDeclHandler) {
*eventEndPP = s;
entityDeclHandler(handlerArg,
declEntity->name,
0,0,0,
declEntity->base,
declEntity->systemId,
declEntity->publicId,
declEntity->notation);
handleDefault = XML_FALSE;
}
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_GENERAL_ENTITY_NAME:
{
2003-01-25 18:41:29 -04:00
if (XmlPredefinedEntityName(enc, s, next)) {
declEntity = NULL;
break;
}
if (dtd->keepProcessing) {
const XML_Char *name = poolStoreString(&dtd->pool, enc, s, next);
if (!name)
return XML_ERROR_NO_MEMORY;
declEntity = (ENTITY *)lookup(&dtd->generalEntities, name,
sizeof(ENTITY));
if (!declEntity)
return XML_ERROR_NO_MEMORY;
if (declEntity->name != name) {
poolDiscard(&dtd->pool);
declEntity = NULL;
}
else {
poolFinish(&dtd->pool);
declEntity->publicId = NULL;
declEntity->is_param = XML_FALSE;
/* if we have a parent parser or are reading an internal parameter
entity, then the entity declaration is not considered "internal"
*/
declEntity->is_internal = !(parentParser || openInternalEntities);
if (entityDeclHandler)
handleDefault = XML_FALSE;
}
}
else {
poolDiscard(&dtd->pool);
declEntity = NULL;
}
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_PARAM_ENTITY_NAME:
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
if (dtd->keepProcessing) {
const XML_Char *name = poolStoreString(&dtd->pool, enc, s, next);
if (!name)
return XML_ERROR_NO_MEMORY;
declEntity = (ENTITY *)lookup(&dtd->paramEntities,
name, sizeof(ENTITY));
if (!declEntity)
return XML_ERROR_NO_MEMORY;
if (declEntity->name != name) {
poolDiscard(&dtd->pool);
declEntity = NULL;
}
else {
poolFinish(&dtd->pool);
declEntity->publicId = NULL;
declEntity->is_param = XML_TRUE;
/* if we have a parent parser or are reading an internal parameter
entity, then the entity declaration is not considered "internal"
*/
declEntity->is_internal = !(parentParser || openInternalEntities);
if (entityDeclHandler)
handleDefault = XML_FALSE;
}
}
else {
poolDiscard(&dtd->pool);
declEntity = NULL;
2002-02-11 19:13:04 -04:00
}
#else /* not XML_DTD */
2003-01-25 18:41:29 -04:00
declEntity = NULL;
#endif /* XML_DTD */
2002-02-11 19:13:04 -04:00
break;
case XML_ROLE_NOTATION_NAME:
2003-01-25 18:41:29 -04:00
declNotationPublicId = NULL;
declNotationName = NULL;
2002-02-11 19:13:04 -04:00
if (notationDeclHandler) {
2003-01-25 18:41:29 -04:00
declNotationName = poolStoreString(&tempPool, enc, s, next);
if (!declNotationName)
return XML_ERROR_NO_MEMORY;
poolFinish(&tempPool);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_NOTATION_PUBLIC_ID:
if (!XmlIsPublicId(enc, s, next, eventPP))
2004-08-03 04:06:22 -03:00
return XML_ERROR_PUBLICID;
2003-01-25 18:41:29 -04:00
if (declNotationName) { /* means notationDeclHandler != NULL */
XML_Char *tem = poolStoreString(&tempPool,
enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!tem)
return XML_ERROR_NO_MEMORY;
normalizePublicId(tem);
declNotationPublicId = tem;
poolFinish(&tempPool);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_NOTATION_SYSTEM_ID:
if (declNotationName && notationDeclHandler) {
2003-01-25 18:41:29 -04:00
const XML_Char *systemId
= poolStoreString(&tempPool, enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!systemId)
return XML_ERROR_NO_MEMORY;
*eventEndPP = s;
notationDeclHandler(handlerArg,
declNotationName,
curBase,
systemId,
declNotationPublicId);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
poolClear(&tempPool);
break;
case XML_ROLE_NOTATION_NO_SYSTEM_ID:
if (declNotationPublicId && notationDeclHandler) {
2003-01-25 18:41:29 -04:00
*eventEndPP = s;
notationDeclHandler(handlerArg,
declNotationName,
curBase,
0,
declNotationPublicId);
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
poolClear(&tempPool);
break;
case XML_ROLE_ERROR:
switch (tok) {
case XML_TOK_PARAM_ENTITY_REF:
2004-08-03 04:06:22 -03:00
/* PE references in internal subset are
not allowed within declarations. */
2003-01-25 18:41:29 -04:00
return XML_ERROR_PARAM_ENTITY_REF;
2002-02-11 19:13:04 -04:00
case XML_TOK_XML_DECL:
2003-01-25 18:41:29 -04:00
return XML_ERROR_MISPLACED_XML_PI;
2002-02-11 19:13:04 -04:00
default:
2003-01-25 18:41:29 -04:00
return XML_ERROR_SYNTAX;
2002-02-11 19:13:04 -04:00
}
#ifdef XML_DTD
case XML_ROLE_IGNORE_SECT:
{
2003-01-25 18:41:29 -04:00
enum XML_Error result;
if (defaultHandler)
reportDefault(parser, enc, s, next);
handleDefault = XML_FALSE;
2004-08-03 04:06:22 -03:00
result = doIgnoreSection(parser, enc, &next, end, nextPtr, haveMore);
if (result != XML_ERROR_NONE)
return result;
else if (!next) {
2003-01-25 18:41:29 -04:00
processor = ignoreSectionProcessor;
return result;
}
2002-02-11 19:13:04 -04:00
}
break;
#endif /* XML_DTD */
case XML_ROLE_GROUP_OPEN:
if (prologState.level >= groupSize) {
2003-01-25 18:41:29 -04:00
if (groupSize) {
char *temp = (char *)REALLOC(groupConnector, groupSize *= 2);
if (temp == NULL)
return XML_ERROR_NO_MEMORY;
groupConnector = temp;
if (dtd->scaffIndex) {
int *temp = (int *)REALLOC(dtd->scaffIndex,
groupSize * sizeof(int));
if (temp == NULL)
return XML_ERROR_NO_MEMORY;
dtd->scaffIndex = temp;
}
}
else {
groupConnector = (char *)MALLOC(groupSize = 32);
if (!groupConnector)
return XML_ERROR_NO_MEMORY;
}
2002-02-11 19:13:04 -04:00
}
groupConnector[prologState.level] = 0;
2003-01-25 18:41:29 -04:00
if (dtd->in_eldecl) {
int myindex = nextScaffoldPart(parser);
if (myindex < 0)
return XML_ERROR_NO_MEMORY;
dtd->scaffIndex[dtd->scaffLevel] = myindex;
dtd->scaffLevel++;
dtd->scaffold[myindex].type = XML_CTYPE_SEQ;
if (elementDeclHandler)
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_GROUP_SEQUENCE:
if (groupConnector[prologState.level] == '|')
2003-01-25 18:41:29 -04:00
return XML_ERROR_SYNTAX;
2002-02-11 19:13:04 -04:00
groupConnector[prologState.level] = ',';
2003-01-25 18:41:29 -04:00
if (dtd->in_eldecl && elementDeclHandler)
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
break;
case XML_ROLE_GROUP_CHOICE:
if (groupConnector[prologState.level] == ',')
2003-01-25 18:41:29 -04:00
return XML_ERROR_SYNTAX;
if (dtd->in_eldecl
&& !groupConnector[prologState.level]
&& (dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel - 1]].type
!= XML_CTYPE_MIXED)
) {
dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel - 1]].type
= XML_CTYPE_CHOICE;
if (elementDeclHandler)
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
groupConnector[prologState.level] = '|';
break;
case XML_ROLE_PARAM_ENTITY_REF:
#ifdef XML_DTD
case XML_ROLE_INNER_PARAM_ENTITY_REF:
2003-01-25 18:41:29 -04:00
dtd->hasParamEntityRefs = XML_TRUE;
if (!paramEntityParsing)
dtd->keepProcessing = dtd->standalone;
else {
const XML_Char *name;
ENTITY *entity;
name = poolStoreString(&dtd->pool, enc,
s + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!name)
return XML_ERROR_NO_MEMORY;
entity = (ENTITY *)lookup(&dtd->paramEntities, name, 0);
poolDiscard(&dtd->pool);
/* first, determine if a check for an existing declaration is needed;
if yes, check that the entity exists, and that it is internal,
otherwise call the skipped entity handler
*/
if (prologState.documentEntity &&
(dtd->standalone
? !openInternalEntities
: !dtd->hasParamEntityRefs)) {
if (!entity)
return XML_ERROR_UNDEFINED_ENTITY;
else if (!entity->is_internal)
return XML_ERROR_ENTITY_DECLARED_IN_PE;
}
else if (!entity) {
dtd->keepProcessing = dtd->standalone;
/* cannot report skipped entities in declarations */
if ((role == XML_ROLE_PARAM_ENTITY_REF) && skippedEntityHandler) {
skippedEntityHandler(handlerArg, name, 1);
handleDefault = XML_FALSE;
}
break;
}
if (entity->open)
return XML_ERROR_RECURSIVE_ENTITY_REF;
if (entity->textPtr) {
enum XML_Error result;
2004-08-03 04:06:22 -03:00
XML_Bool betweenDecl =
(role == XML_ROLE_PARAM_ENTITY_REF ? XML_TRUE : XML_FALSE);
result = processInternalEntity(parser, entity, betweenDecl);
2003-01-25 18:41:29 -04:00
if (result != XML_ERROR_NONE)
return result;
handleDefault = XML_FALSE;
break;
}
if (externalEntityRefHandler) {
dtd->paramEntityRead = XML_FALSE;
entity->open = XML_TRUE;
if (!externalEntityRefHandler(externalEntityRefHandlerArg,
0,
entity->base,
entity->systemId,
entity->publicId)) {
entity->open = XML_FALSE;
return XML_ERROR_EXTERNAL_ENTITY_HANDLING;
}
entity->open = XML_FALSE;
handleDefault = XML_FALSE;
if (!dtd->paramEntityRead) {
dtd->keepProcessing = dtd->standalone;
break;
}
}
else {
dtd->keepProcessing = dtd->standalone;
break;
}
2002-02-11 19:13:04 -04:00
}
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
if (!dtd->standalone &&
notStandaloneHandler &&
!notStandaloneHandler(handlerArg))
return XML_ERROR_NOT_STANDALONE;
2002-02-11 19:13:04 -04:00
break;
2003-01-25 18:41:29 -04:00
/* Element declaration stuff */
2002-02-11 19:13:04 -04:00
case XML_ROLE_ELEMENT_NAME:
if (elementDeclHandler) {
2003-01-25 18:41:29 -04:00
declElementType = getElementType(parser, enc, s, next);
if (!declElementType)
return XML_ERROR_NO_MEMORY;
dtd->scaffLevel = 0;
dtd->scaffCount = 0;
dtd->in_eldecl = XML_TRUE;
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_CONTENT_ANY:
case XML_ROLE_CONTENT_EMPTY:
2003-01-25 18:41:29 -04:00
if (dtd->in_eldecl) {
if (elementDeclHandler) {
XML_Content * content = (XML_Content *) MALLOC(sizeof(XML_Content));
if (!content)
return XML_ERROR_NO_MEMORY;
content->quant = XML_CQUANT_NONE;
content->name = NULL;
content->numchildren = 0;
content->children = NULL;
content->type = ((role == XML_ROLE_CONTENT_ANY) ?
XML_CTYPE_ANY :
XML_CTYPE_EMPTY);
*eventEndPP = s;
elementDeclHandler(handlerArg, declElementType->name, content);
handleDefault = XML_FALSE;
}
dtd->in_eldecl = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
2003-01-25 18:41:29 -04:00
2002-02-11 19:13:04 -04:00
case XML_ROLE_CONTENT_PCDATA:
2003-01-25 18:41:29 -04:00
if (dtd->in_eldecl) {
dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel - 1]].type
= XML_CTYPE_MIXED;
if (elementDeclHandler)
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
}
break;
case XML_ROLE_CONTENT_ELEMENT:
quant = XML_CQUANT_NONE;
goto elementContent;
case XML_ROLE_CONTENT_ELEMENT_OPT:
quant = XML_CQUANT_OPT;
goto elementContent;
case XML_ROLE_CONTENT_ELEMENT_REP:
quant = XML_CQUANT_REP;
goto elementContent;
case XML_ROLE_CONTENT_ELEMENT_PLUS:
quant = XML_CQUANT_PLUS;
elementContent:
2003-01-25 18:41:29 -04:00
if (dtd->in_eldecl) {
ELEMENT_TYPE *el;
const XML_Char *name;
int nameLen;
const char *nxt = (quant == XML_CQUANT_NONE
? next
: next - enc->minBytesPerChar);
int myindex = nextScaffoldPart(parser);
if (myindex < 0)
return XML_ERROR_NO_MEMORY;
dtd->scaffold[myindex].type = XML_CTYPE_NAME;
dtd->scaffold[myindex].quant = quant;
el = getElementType(parser, enc, s, nxt);
if (!el)
return XML_ERROR_NO_MEMORY;
name = el->name;
dtd->scaffold[myindex].name = name;
nameLen = 0;
for (; name[nameLen++]; );
dtd->contentStringLen += nameLen;
if (elementDeclHandler)
handleDefault = XML_FALSE;
}
2002-02-11 19:13:04 -04:00
break;
case XML_ROLE_GROUP_CLOSE:
quant = XML_CQUANT_NONE;
goto closeGroup;
case XML_ROLE_GROUP_CLOSE_OPT:
quant = XML_CQUANT_OPT;
goto closeGroup;
case XML_ROLE_GROUP_CLOSE_REP:
quant = XML_CQUANT_REP;
goto closeGroup;
case XML_ROLE_GROUP_CLOSE_PLUS:
quant = XML_CQUANT_PLUS;
closeGroup:
2003-01-25 18:41:29 -04:00
if (dtd->in_eldecl) {
if (elementDeclHandler)
handleDefault = XML_FALSE;
dtd->scaffLevel--;
dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel]].quant = quant;
if (dtd->scaffLevel == 0) {
if (!handleDefault) {
XML_Content *model = build_model(parser);
if (!model)
return XML_ERROR_NO_MEMORY;
*eventEndPP = s;
elementDeclHandler(handlerArg, declElementType->name, model);
}
dtd->in_eldecl = XML_FALSE;
dtd->contentStringLen = 0;
}
2002-02-11 19:13:04 -04:00
}
break;
/* End element declaration stuff */
2003-01-25 18:41:29 -04:00
case XML_ROLE_PI:
if (!reportProcessingInstruction(parser, enc, s, next))
return XML_ERROR_NO_MEMORY;
handleDefault = XML_FALSE;
2002-02-11 19:13:04 -04:00
break;
2003-01-25 18:41:29 -04:00
case XML_ROLE_COMMENT:
if (!reportComment(parser, enc, s, next))
return XML_ERROR_NO_MEMORY;
handleDefault = XML_FALSE;
break;
case XML_ROLE_NONE:
2002-02-11 19:13:04 -04:00
switch (tok) {
case XML_TOK_BOM:
2003-01-25 18:41:29 -04:00
handleDefault = XML_FALSE;
break;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
break;
case XML_ROLE_DOCTYPE_NONE:
if (startDoctypeDeclHandler)
handleDefault = XML_FALSE;
break;
case XML_ROLE_ENTITY_NONE:
if (dtd->keepProcessing && entityDeclHandler)
handleDefault = XML_FALSE;
break;
case XML_ROLE_NOTATION_NONE:
if (notationDeclHandler)
handleDefault = XML_FALSE;
break;
case XML_ROLE_ATTLIST_NONE:
if (dtd->keepProcessing && attlistDeclHandler)
handleDefault = XML_FALSE;
break;
case XML_ROLE_ELEMENT_NONE:
if (elementDeclHandler)
handleDefault = XML_FALSE;
break;
} /* end of big switch */
if (handleDefault && defaultHandler)
reportDefault(parser, enc, s, next);
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
default:
s = next;
tok = XmlPrologTok(enc, s, end, &next);
}
2002-02-11 19:13:04 -04:00
}
/* not reached */
}
2003-01-25 18:41:29 -04:00
static enum XML_Error PTRCALL
epilogProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
2002-02-11 19:13:04 -04:00
{
processor = epilogProcessor;
eventPtr = s;
for (;;) {
2003-01-25 18:41:29 -04:00
const char *next = NULL;
2002-02-11 19:13:04 -04:00
int tok = XmlPrologTok(encoding, s, end, &next);
eventEndPtr = next;
switch (tok) {
2003-01-25 18:41:29 -04:00
/* report partial linebreak - it might be the last token */
2002-02-11 19:13:04 -04:00
case -XML_TOK_PROLOG_S:
if (defaultHandler) {
2003-01-25 18:41:29 -04:00
reportDefault(parser, encoding, s, next);
2004-08-03 04:06:22 -03:00
if (parsing == XML_FINISHED)
return XML_ERROR_ABORTED;
2002-02-11 19:13:04 -04:00
}
2004-08-03 04:06:22 -03:00
*nextPtr = next;
2003-01-25 18:41:29 -04:00
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
case XML_TOK_NONE:
2004-08-03 04:06:22 -03:00
*nextPtr = s;
2002-02-11 19:13:04 -04:00
return XML_ERROR_NONE;
case XML_TOK_PROLOG_S:
if (defaultHandler)
2003-01-25 18:41:29 -04:00
reportDefault(parser, encoding, s, next);
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_PI:
if (!reportProcessingInstruction(parser, encoding, s, next))
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_COMMENT:
if (!reportComment(parser, encoding, s, next))
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_INVALID:
eventPtr = next;
return XML_ERROR_INVALID_TOKEN;
case XML_TOK_PARTIAL:
2004-08-03 04:06:22 -03:00
if (!finalBuffer) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
return XML_ERROR_UNCLOSED_TOKEN;
case XML_TOK_PARTIAL_CHAR:
2004-08-03 04:06:22 -03:00
if (!finalBuffer) {
2003-01-25 18:41:29 -04:00
*nextPtr = s;
return XML_ERROR_NONE;
2002-02-11 19:13:04 -04:00
}
return XML_ERROR_PARTIAL_CHAR;
default:
return XML_ERROR_JUNK_AFTER_DOC_ELEMENT;
}
eventPtr = s = next;
2004-08-03 04:06:22 -03:00
switch (parsing) {
case XML_SUSPENDED:
*nextPtr = next;
return XML_ERROR_NONE;
case XML_FINISHED:
return XML_ERROR_ABORTED;
default: ;
}
2002-02-11 19:13:04 -04:00
}
}
static enum XML_Error
2004-08-03 04:06:22 -03:00
processInternalEntity(XML_Parser parser, ENTITY *entity,
XML_Bool betweenDecl)
2002-02-11 19:13:04 -04:00
{
2004-08-03 04:06:22 -03:00
const char *textStart, *textEnd;
const char *next;
2002-02-11 19:13:04 -04:00
enum XML_Error result;
2004-08-03 04:06:22 -03:00
OPEN_INTERNAL_ENTITY *openEntity;
if (freeInternalEntities) {
openEntity = freeInternalEntities;
freeInternalEntities = openEntity->next;
}
else {
openEntity = (OPEN_INTERNAL_ENTITY *)MALLOC(sizeof(OPEN_INTERNAL_ENTITY));
if (!openEntity)
return XML_ERROR_NO_MEMORY;
}
2003-01-25 18:41:29 -04:00
entity->open = XML_TRUE;
2004-08-03 04:06:22 -03:00
entity->processed = 0;
openEntity->next = openInternalEntities;
openInternalEntities = openEntity;
openEntity->entity = entity;
openEntity->startTagLevel = tagLevel;
openEntity->betweenDecl = betweenDecl;
openEntity->internalEventPtr = NULL;
openEntity->internalEventEndPtr = NULL;
textStart = (char *)entity->textPtr;
textEnd = (char *)(entity->textPtr + entity->textLen);
#ifdef XML_DTD
if (entity->is_param) {
int tok = XmlPrologTok(internalEncoding, textStart, textEnd, &next);
result = doProlog(parser, internalEncoding, textStart, textEnd, tok,
next, &next, XML_FALSE);
}
else
#endif /* XML_DTD */
result = doContent(parser, tagLevel, internalEncoding, textStart,
textEnd, &next, XML_FALSE);
if (result == XML_ERROR_NONE) {
if (textEnd != next && parsing == XML_SUSPENDED) {
entity->processed = next - textStart;
processor = internalEntityProcessor;
}
else {
entity->open = XML_FALSE;
openInternalEntities = openEntity->next;
/* put openEntity back in list of free instances */
openEntity->next = freeInternalEntities;
freeInternalEntities = openEntity;
}
}
2002-02-11 19:13:04 -04:00
return result;
}
2004-08-03 04:06:22 -03:00
static enum XML_Error PTRCALL
internalEntityProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
{
ENTITY *entity;
const char *textStart, *textEnd;
const char *next;
enum XML_Error result;
OPEN_INTERNAL_ENTITY *openEntity = openInternalEntities;
if (!openEntity)
return XML_ERROR_UNEXPECTED_STATE;
entity = openEntity->entity;
textStart = ((char *)entity->textPtr) + entity->processed;
textEnd = (char *)(entity->textPtr + entity->textLen);
#ifdef XML_DTD
if (entity->is_param) {
int tok = XmlPrologTok(internalEncoding, textStart, textEnd, &next);
result = doProlog(parser, internalEncoding, textStart, textEnd, tok,
next, &next, XML_FALSE);
}
else
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
2004-08-03 04:06:22 -03:00
result = doContent(parser, openEntity->startTagLevel, internalEncoding,
textStart, textEnd, &next, XML_FALSE);
if (result != XML_ERROR_NONE)
return result;
else if (textEnd != next && parsing == XML_SUSPENDED) {
entity->processed = next - (char *)entity->textPtr;
return result;
}
else {
entity->open = XML_FALSE;
openInternalEntities = openEntity->next;
/* put openEntity back in list of free instances */
openEntity->next = freeInternalEntities;
freeInternalEntities = openEntity;
}
#ifdef XML_DTD
if (entity->is_param) {
int tok;
processor = prologProcessor;
tok = XmlPrologTok(encoding, s, end, &next);
return doProlog(parser, encoding, s, end, tok, next, nextPtr,
(XML_Bool)!finalBuffer);
}
else
#endif /* XML_DTD */
{
processor = contentProcessor;
/* see externalEntityContentProcessor vs contentProcessor */
return doContent(parser, parentParser ? 1 : 0, encoding, s, end,
nextPtr, (XML_Bool)!finalBuffer);
}
}
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
static enum XML_Error PTRCALL
errorProcessor(XML_Parser parser,
const char *s,
const char *end,
const char **nextPtr)
2002-02-11 19:13:04 -04:00
{
return errorCode;
}
static enum XML_Error
2003-01-25 18:41:29 -04:00
storeAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
const char *ptr, const char *end,
STRING_POOL *pool)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
enum XML_Error result = appendAttributeValue(parser, enc, isCdata, ptr,
end, pool);
2002-02-11 19:13:04 -04:00
if (result)
return result;
if (!isCdata && poolLength(pool) && poolLastChar(pool) == 0x20)
poolChop(pool);
if (!poolAppendChar(pool, XML_T('\0')))
return XML_ERROR_NO_MEMORY;
return XML_ERROR_NONE;
}
static enum XML_Error
2003-01-25 18:41:29 -04:00
appendAttributeValue(XML_Parser parser, const ENCODING *enc, XML_Bool isCdata,
const char *ptr, const char *end,
STRING_POOL *pool)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
2002-02-11 19:13:04 -04:00
for (;;) {
const char *next;
int tok = XmlAttributeValueTok(enc, ptr, end, &next);
switch (tok) {
case XML_TOK_NONE:
return XML_ERROR_NONE;
case XML_TOK_INVALID:
if (enc == encoding)
2003-01-25 18:41:29 -04:00
eventPtr = next;
2002-02-11 19:13:04 -04:00
return XML_ERROR_INVALID_TOKEN;
case XML_TOK_PARTIAL:
if (enc == encoding)
2003-01-25 18:41:29 -04:00
eventPtr = ptr;
2002-02-11 19:13:04 -04:00
return XML_ERROR_INVALID_TOKEN;
case XML_TOK_CHAR_REF:
{
2003-01-25 18:41:29 -04:00
XML_Char buf[XML_ENCODE_MAX];
int i;
int n = XmlCharRefNumber(enc, ptr);
if (n < 0) {
if (enc == encoding)
eventPtr = ptr;
return XML_ERROR_BAD_CHAR_REF;
}
if (!isCdata
&& n == 0x20 /* space */
&& (poolLength(pool) == 0 || poolLastChar(pool) == 0x20))
break;
n = XmlEncode(n, (ICHAR *)buf);
if (!n) {
if (enc == encoding)
eventPtr = ptr;
return XML_ERROR_BAD_CHAR_REF;
}
for (i = 0; i < n; i++) {
if (!poolAppendChar(pool, buf[i]))
return XML_ERROR_NO_MEMORY;
}
2002-02-11 19:13:04 -04:00
}
break;
case XML_TOK_DATA_CHARS:
if (!poolAppend(pool, enc, ptr, next))
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_TRAILING_CR:
next = ptr + enc->minBytesPerChar;
/* fall through */
case XML_TOK_ATTRIBUTE_VALUE_S:
case XML_TOK_DATA_NEWLINE:
if (!isCdata && (poolLength(pool) == 0 || poolLastChar(pool) == 0x20))
2003-01-25 18:41:29 -04:00
break;
2002-02-11 19:13:04 -04:00
if (!poolAppendChar(pool, 0x20))
2003-01-25 18:41:29 -04:00
return XML_ERROR_NO_MEMORY;
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_ENTITY_REF:
{
2003-01-25 18:41:29 -04:00
const XML_Char *name;
ENTITY *entity;
char checkEntityDecl;
XML_Char ch = (XML_Char) XmlPredefinedEntityName(enc,
ptr + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (ch) {
if (!poolAppendChar(pool, ch))
return XML_ERROR_NO_MEMORY;
break;
}
name = poolStoreString(&temp2Pool, enc,
ptr + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!name)
return XML_ERROR_NO_MEMORY;
entity = (ENTITY *)lookup(&dtd->generalEntities, name, 0);
poolDiscard(&temp2Pool);
/* first, determine if a check for an existing declaration is needed;
if yes, check that the entity exists, and that it is internal,
otherwise call the default handler (if called from content)
*/
if (pool == &dtd->pool) /* are we called from prolog? */
checkEntityDecl =
#ifdef XML_DTD
prologState.documentEntity &&
#endif /* XML_DTD */
(dtd->standalone
? !openInternalEntities
: !dtd->hasParamEntityRefs);
else /* if (pool == &tempPool): we are called from content */
checkEntityDecl = !dtd->hasParamEntityRefs || dtd->standalone;
if (checkEntityDecl) {
if (!entity)
return XML_ERROR_UNDEFINED_ENTITY;
else if (!entity->is_internal)
return XML_ERROR_ENTITY_DECLARED_IN_PE;
}
else if (!entity) {
/* cannot report skipped entity here - see comments on
skippedEntityHandler
if (skippedEntityHandler)
skippedEntityHandler(handlerArg, name, 0);
*/
if ((pool == &tempPool) && defaultHandler)
reportDefault(parser, enc, ptr, next);
break;
}
if (entity->open) {
if (enc == encoding)
eventPtr = ptr;
return XML_ERROR_RECURSIVE_ENTITY_REF;
}
if (entity->notation) {
if (enc == encoding)
eventPtr = ptr;
return XML_ERROR_BINARY_ENTITY_REF;
}
if (!entity->textPtr) {
if (enc == encoding)
eventPtr = ptr;
return XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF;
}
else {
enum XML_Error result;
const XML_Char *textEnd = entity->textPtr + entity->textLen;
entity->open = XML_TRUE;
result = appendAttributeValue(parser, internalEncoding, isCdata,
(char *)entity->textPtr,
(char *)textEnd, pool);
entity->open = XML_FALSE;
if (result)
return result;
}
2002-02-11 19:13:04 -04:00
}
break;
default:
if (enc == encoding)
2003-01-25 18:41:29 -04:00
eventPtr = ptr;
2002-02-11 19:13:04 -04:00
return XML_ERROR_UNEXPECTED_STATE;
}
ptr = next;
}
/* not reached */
}
2003-01-25 18:41:29 -04:00
static enum XML_Error
storeEntityValue(XML_Parser parser,
const ENCODING *enc,
const char *entityTextPtr,
const char *entityTextEnd)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
STRING_POOL *pool = &(dtd->entityValuePool);
enum XML_Error result = XML_ERROR_NONE;
#ifdef XML_DTD
int oldInEntityValue = prologState.inEntityValue;
prologState.inEntityValue = 1;
#endif /* XML_DTD */
/* never return Null for the value argument in EntityDeclHandler,
since this would indicate an external entity; therefore we
have to make sure that entityValuePool.start is not null */
if (!pool->blocks) {
if (!poolGrow(pool))
return XML_ERROR_NO_MEMORY;
}
2002-02-11 19:13:04 -04:00
for (;;) {
const char *next;
int tok = XmlEntityValueTok(enc, entityTextPtr, entityTextEnd, &next);
switch (tok) {
case XML_TOK_PARAM_ENTITY_REF:
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
if (isParamEntity || enc != encoding) {
const XML_Char *name;
ENTITY *entity;
name = poolStoreString(&tempPool, enc,
entityTextPtr + enc->minBytesPerChar,
next - enc->minBytesPerChar);
if (!name) {
result = XML_ERROR_NO_MEMORY;
goto endEntityValue;
}
entity = (ENTITY *)lookup(&dtd->paramEntities, name, 0);
poolDiscard(&tempPool);
if (!entity) {
/* not a well-formedness error - see XML 1.0: WFC Entity Declared */
/* cannot report skipped entity here - see comments on
skippedEntityHandler
if (skippedEntityHandler)
skippedEntityHandler(handlerArg, name, 0);
*/
dtd->keepProcessing = dtd->standalone;
goto endEntityValue;
}
if (entity->open) {
if (enc == encoding)
eventPtr = entityTextPtr;
result = XML_ERROR_RECURSIVE_ENTITY_REF;
goto endEntityValue;
}
if (entity->systemId) {
if (externalEntityRefHandler) {
dtd->paramEntityRead = XML_FALSE;
entity->open = XML_TRUE;
if (!externalEntityRefHandler(externalEntityRefHandlerArg,
0,
entity->base,
entity->systemId,
entity->publicId)) {
entity->open = XML_FALSE;
result = XML_ERROR_EXTERNAL_ENTITY_HANDLING;
goto endEntityValue;
}
entity->open = XML_FALSE;
if (!dtd->paramEntityRead)
dtd->keepProcessing = dtd->standalone;
}
else
dtd->keepProcessing = dtd->standalone;
}
else {
entity->open = XML_TRUE;
result = storeEntityValue(parser,
internalEncoding,
(char *)entity->textPtr,
(char *)(entity->textPtr
+ entity->textLen));
entity->open = XML_FALSE;
if (result)
goto endEntityValue;
}
break;
2002-02-11 19:13:04 -04:00
}
#endif /* XML_DTD */
2004-08-03 04:06:22 -03:00
/* In the internal subset, PE references are not legal
within markup declarations, e.g entity values in this case. */
2002-02-11 19:13:04 -04:00
eventPtr = entityTextPtr;
2003-01-25 18:41:29 -04:00
result = XML_ERROR_PARAM_ENTITY_REF;
goto endEntityValue;
2002-02-11 19:13:04 -04:00
case XML_TOK_NONE:
2003-01-25 18:41:29 -04:00
result = XML_ERROR_NONE;
goto endEntityValue;
2002-02-11 19:13:04 -04:00
case XML_TOK_ENTITY_REF:
case XML_TOK_DATA_CHARS:
2003-01-25 18:41:29 -04:00
if (!poolAppend(pool, enc, entityTextPtr, next)) {
result = XML_ERROR_NO_MEMORY;
goto endEntityValue;
}
2002-02-11 19:13:04 -04:00
break;
case XML_TOK_TRAILING_CR:
next = entityTextPtr + enc->minBytesPerChar;
/* fall through */
case XML_TOK_DATA_NEWLINE:
2003-01-25 18:41:29 -04:00
if (pool->end == pool->ptr && !poolGrow(pool)) {
result = XML_ERROR_NO_MEMORY;
goto endEntityValue;
}
2002-02-11 19:13:04 -04:00
*(pool->ptr)++ = 0xA;
break;
case XML_TOK_CHAR_REF:
{
2003-01-25 18:41:29 -04:00
XML_Char buf[XML_ENCODE_MAX];
int i;
int n = XmlCharRefNumber(enc, entityTextPtr);
if (n < 0) {
if (enc == encoding)
eventPtr = entityTextPtr;
result = XML_ERROR_BAD_CHAR_REF;
goto endEntityValue;
}
n = XmlEncode(n, (ICHAR *)buf);
if (!n) {
if (enc == encoding)
eventPtr = entityTextPtr;
result = XML_ERROR_BAD_CHAR_REF;
goto endEntityValue;
}
for (i = 0; i < n; i++) {
if (pool->end == pool->ptr && !poolGrow(pool)) {
result = XML_ERROR_NO_MEMORY;
goto endEntityValue;
}
*(pool->ptr)++ = buf[i];
}
2002-02-11 19:13:04 -04:00
}
break;
case XML_TOK_PARTIAL:
if (enc == encoding)
2003-01-25 18:41:29 -04:00
eventPtr = entityTextPtr;
result = XML_ERROR_INVALID_TOKEN;
goto endEntityValue;
2002-02-11 19:13:04 -04:00
case XML_TOK_INVALID:
if (enc == encoding)
2003-01-25 18:41:29 -04:00
eventPtr = next;
result = XML_ERROR_INVALID_TOKEN;
goto endEntityValue;
2002-02-11 19:13:04 -04:00
default:
if (enc == encoding)
2003-01-25 18:41:29 -04:00
eventPtr = entityTextPtr;
result = XML_ERROR_UNEXPECTED_STATE;
goto endEntityValue;
2002-02-11 19:13:04 -04:00
}
entityTextPtr = next;
}
2003-01-25 18:41:29 -04:00
endEntityValue:
#ifdef XML_DTD
prologState.inEntityValue = oldInEntityValue;
#endif /* XML_DTD */
return result;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
2002-02-11 19:13:04 -04:00
normalizeLines(XML_Char *s)
{
XML_Char *p;
for (;; s++) {
if (*s == XML_T('\0'))
return;
if (*s == 0xD)
break;
}
p = s;
do {
if (*s == 0xD) {
*p++ = 0xA;
if (*++s == 0xA)
s++;
}
else
*p++ = *s++;
} while (*s);
*p = XML_T('\0');
}
static int
2003-01-25 18:41:29 -04:00
reportProcessingInstruction(XML_Parser parser, const ENCODING *enc,
const char *start, const char *end)
2002-02-11 19:13:04 -04:00
{
const XML_Char *target;
XML_Char *data;
const char *tem;
if (!processingInstructionHandler) {
if (defaultHandler)
reportDefault(parser, enc, start, end);
return 1;
}
start += enc->minBytesPerChar * 2;
tem = start + XmlNameLength(enc, start);
target = poolStoreString(&tempPool, enc, start, tem);
if (!target)
return 0;
poolFinish(&tempPool);
data = poolStoreString(&tempPool, enc,
2003-01-25 18:41:29 -04:00
XmlSkipS(enc, tem),
end - enc->minBytesPerChar*2);
2002-02-11 19:13:04 -04:00
if (!data)
return 0;
normalizeLines(data);
processingInstructionHandler(handlerArg, target, data);
poolClear(&tempPool);
return 1;
}
static int
2003-01-25 18:41:29 -04:00
reportComment(XML_Parser parser, const ENCODING *enc,
const char *start, const char *end)
2002-02-11 19:13:04 -04:00
{
XML_Char *data;
if (!commentHandler) {
if (defaultHandler)
reportDefault(parser, enc, start, end);
return 1;
}
data = poolStoreString(&tempPool,
enc,
2003-01-25 18:41:29 -04:00
start + enc->minBytesPerChar * 4,
end - enc->minBytesPerChar * 3);
2002-02-11 19:13:04 -04:00
if (!data)
return 0;
normalizeLines(data);
commentHandler(handlerArg, data);
poolClear(&tempPool);
return 1;
}
static void
2003-01-25 18:41:29 -04:00
reportDefault(XML_Parser parser, const ENCODING *enc,
const char *s, const char *end)
2002-02-11 19:13:04 -04:00
{
if (MUST_CONVERT(enc, s)) {
const char **eventPP;
const char **eventEndPP;
if (enc == encoding) {
eventPP = &eventPtr;
eventEndPP = &eventEndPtr;
}
else {
eventPP = &(openInternalEntities->internalEventPtr);
eventEndPP = &(openInternalEntities->internalEventEndPtr);
}
do {
ICHAR *dataPtr = (ICHAR *)dataBuf;
XmlConvert(enc, &s, end, &dataPtr, (ICHAR *)dataBufEnd);
*eventEndPP = s;
defaultHandler(handlerArg, dataBuf, dataPtr - (ICHAR *)dataBuf);
*eventPP = s;
} while (s != end);
}
else
defaultHandler(handlerArg, (XML_Char *)s, (XML_Char *)end - (XML_Char *)s);
}
static int
2003-01-25 18:41:29 -04:00
defineAttribute(ELEMENT_TYPE *type, ATTRIBUTE_ID *attId, XML_Bool isCdata,
XML_Bool isId, const XML_Char *value, XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
DEFAULT_ATTRIBUTE *att;
if (value || isId) {
/* The handling of default attributes gets messed up if we have
a default which duplicates a non-default. */
int i;
for (i = 0; i < type->nDefaultAtts; i++)
if (attId == type->defaultAtts[i].id)
2003-01-25 18:41:29 -04:00
return 1;
2002-02-11 19:13:04 -04:00
if (isId && !type->idAtt && !attId->xmlns)
type->idAtt = attId;
}
if (type->nDefaultAtts == type->allocDefaultAtts) {
if (type->allocDefaultAtts == 0) {
type->allocDefaultAtts = 8;
type->defaultAtts = (DEFAULT_ATTRIBUTE *)MALLOC(type->allocDefaultAtts
2003-01-25 18:41:29 -04:00
* sizeof(DEFAULT_ATTRIBUTE));
if (!type->defaultAtts)
return 0;
2002-02-11 19:13:04 -04:00
}
else {
2003-01-25 18:41:29 -04:00
DEFAULT_ATTRIBUTE *temp;
int count = type->allocDefaultAtts * 2;
temp = (DEFAULT_ATTRIBUTE *)
REALLOC(type->defaultAtts, (count * sizeof(DEFAULT_ATTRIBUTE)));
if (temp == NULL)
return 0;
type->allocDefaultAtts = count;
type->defaultAtts = temp;
2002-02-11 19:13:04 -04:00
}
}
att = type->defaultAtts + type->nDefaultAtts;
att->id = attId;
att->value = value;
att->isCdata = isCdata;
if (!isCdata)
2003-01-25 18:41:29 -04:00
attId->maybeTokenized = XML_TRUE;
2002-02-11 19:13:04 -04:00
type->nDefaultAtts += 1;
return 1;
}
2003-01-25 18:41:29 -04:00
static int
setElementTypePrefix(XML_Parser parser, ELEMENT_TYPE *elementType)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
2002-02-11 19:13:04 -04:00
const XML_Char *name;
for (name = elementType->name; *name; name++) {
if (*name == XML_T(':')) {
PREFIX *prefix;
const XML_Char *s;
for (s = elementType->name; s != name; s++) {
2003-01-25 18:41:29 -04:00
if (!poolAppendChar(&dtd->pool, *s))
return 0;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
if (!poolAppendChar(&dtd->pool, XML_T('\0')))
return 0;
prefix = (PREFIX *)lookup(&dtd->prefixes, poolStart(&dtd->pool),
sizeof(PREFIX));
2002-02-11 19:13:04 -04:00
if (!prefix)
2003-01-25 18:41:29 -04:00
return 0;
if (prefix->name == poolStart(&dtd->pool))
poolFinish(&dtd->pool);
2002-02-11 19:13:04 -04:00
else
2003-01-25 18:41:29 -04:00
poolDiscard(&dtd->pool);
2002-02-11 19:13:04 -04:00
elementType->prefix = prefix;
}
}
return 1;
}
static ATTRIBUTE_ID *
2003-01-25 18:41:29 -04:00
getAttributeId(XML_Parser parser, const ENCODING *enc,
const char *start, const char *end)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
2002-02-11 19:13:04 -04:00
ATTRIBUTE_ID *id;
const XML_Char *name;
2003-01-25 18:41:29 -04:00
if (!poolAppendChar(&dtd->pool, XML_T('\0')))
return NULL;
name = poolStoreString(&dtd->pool, enc, start, end);
2002-02-11 19:13:04 -04:00
if (!name)
2003-01-25 18:41:29 -04:00
return NULL;
/* skip quotation mark - its storage will be re-used (like in name[-1]) */
2002-02-11 19:13:04 -04:00
++name;
2003-01-25 18:41:29 -04:00
id = (ATTRIBUTE_ID *)lookup(&dtd->attributeIds, name, sizeof(ATTRIBUTE_ID));
2002-02-11 19:13:04 -04:00
if (!id)
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
if (id->name != name)
2003-01-25 18:41:29 -04:00
poolDiscard(&dtd->pool);
2002-02-11 19:13:04 -04:00
else {
2003-01-25 18:41:29 -04:00
poolFinish(&dtd->pool);
2002-02-11 19:13:04 -04:00
if (!ns)
;
2003-01-25 18:41:29 -04:00
else if (name[0] == XML_T('x')
&& name[1] == XML_T('m')
&& name[2] == XML_T('l')
&& name[3] == XML_T('n')
&& name[4] == XML_T('s')
&& (name[5] == XML_T('\0') || name[5] == XML_T(':'))) {
if (name[5] == XML_T('\0'))
id->prefix = &dtd->defaultPrefix;
2002-02-11 19:13:04 -04:00
else
2003-01-25 18:41:29 -04:00
id->prefix = (PREFIX *)lookup(&dtd->prefixes, name + 6, sizeof(PREFIX));
id->xmlns = XML_TRUE;
2002-02-11 19:13:04 -04:00
}
else {
int i;
for (i = 0; name[i]; i++) {
/* attributes without prefix are *not* in the default namespace */
2003-01-25 18:41:29 -04:00
if (name[i] == XML_T(':')) {
int j;
for (j = 0; j < i; j++) {
if (!poolAppendChar(&dtd->pool, name[j]))
return NULL;
}
if (!poolAppendChar(&dtd->pool, XML_T('\0')))
return NULL;
id->prefix = (PREFIX *)lookup(&dtd->prefixes, poolStart(&dtd->pool),
sizeof(PREFIX));
if (id->prefix->name == poolStart(&dtd->pool))
poolFinish(&dtd->pool);
else
poolDiscard(&dtd->pool);
break;
}
2002-02-11 19:13:04 -04:00
}
}
}
return id;
}
#define CONTEXT_SEP XML_T('\f')
2003-01-25 18:41:29 -04:00
static const XML_Char *
getContext(XML_Parser parser)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
2002-02-11 19:13:04 -04:00
HASH_TABLE_ITER iter;
2003-01-25 18:41:29 -04:00
XML_Bool needSep = XML_FALSE;
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
if (dtd->defaultPrefix.binding) {
2002-02-11 19:13:04 -04:00
int i;
int len;
if (!poolAppendChar(&tempPool, XML_T('=')))
2003-01-25 18:41:29 -04:00
return NULL;
len = dtd->defaultPrefix.binding->uriLen;
2002-02-11 19:13:04 -04:00
if (namespaceSeparator != XML_T('\0'))
len--;
for (i = 0; i < len; i++)
2003-01-25 18:41:29 -04:00
if (!poolAppendChar(&tempPool, dtd->defaultPrefix.binding->uri[i]))
return NULL;
needSep = XML_TRUE;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
hashTableIterInit(&iter, &(dtd->prefixes));
2002-02-11 19:13:04 -04:00
for (;;) {
int i;
int len;
const XML_Char *s;
PREFIX *prefix = (PREFIX *)hashTableIterNext(&iter);
if (!prefix)
break;
if (!prefix->binding)
continue;
if (needSep && !poolAppendChar(&tempPool, CONTEXT_SEP))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
for (s = prefix->name; *s; s++)
if (!poolAppendChar(&tempPool, *s))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
if (!poolAppendChar(&tempPool, XML_T('=')))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
len = prefix->binding->uriLen;
if (namespaceSeparator != XML_T('\0'))
len--;
for (i = 0; i < len; i++)
if (!poolAppendChar(&tempPool, prefix->binding->uri[i]))
2003-01-25 18:41:29 -04:00
return NULL;
needSep = XML_TRUE;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
hashTableIterInit(&iter, &(dtd->generalEntities));
2002-02-11 19:13:04 -04:00
for (;;) {
const XML_Char *s;
ENTITY *e = (ENTITY *)hashTableIterNext(&iter);
if (!e)
break;
if (!e->open)
continue;
if (needSep && !poolAppendChar(&tempPool, CONTEXT_SEP))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
for (s = e->name; *s; s++)
if (!poolAppendChar(&tempPool, *s))
return 0;
2003-01-25 18:41:29 -04:00
needSep = XML_TRUE;
2002-02-11 19:13:04 -04:00
}
if (!poolAppendChar(&tempPool, XML_T('\0')))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
return tempPool.start;
}
2003-01-25 18:41:29 -04:00
static XML_Bool
setContext(XML_Parser parser, const XML_Char *context)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
2002-02-11 19:13:04 -04:00
const XML_Char *s = context;
while (*context != XML_T('\0')) {
if (*s == CONTEXT_SEP || *s == XML_T('\0')) {
ENTITY *e;
if (!poolAppendChar(&tempPool, XML_T('\0')))
2003-01-25 18:41:29 -04:00
return XML_FALSE;
e = (ENTITY *)lookup(&dtd->generalEntities, poolStart(&tempPool), 0);
2002-02-11 19:13:04 -04:00
if (e)
2003-01-25 18:41:29 -04:00
e->open = XML_TRUE;
2002-02-11 19:13:04 -04:00
if (*s != XML_T('\0'))
2003-01-25 18:41:29 -04:00
s++;
2002-02-11 19:13:04 -04:00
context = s;
poolDiscard(&tempPool);
}
2003-01-25 18:41:29 -04:00
else if (*s == XML_T('=')) {
2002-02-11 19:13:04 -04:00
PREFIX *prefix;
if (poolLength(&tempPool) == 0)
2003-01-25 18:41:29 -04:00
prefix = &dtd->defaultPrefix;
2002-02-11 19:13:04 -04:00
else {
2003-01-25 18:41:29 -04:00
if (!poolAppendChar(&tempPool, XML_T('\0')))
return XML_FALSE;
prefix = (PREFIX *)lookup(&dtd->prefixes, poolStart(&tempPool),
sizeof(PREFIX));
if (!prefix)
return XML_FALSE;
2002-02-11 19:13:04 -04:00
if (prefix->name == poolStart(&tempPool)) {
2003-01-25 18:41:29 -04:00
prefix->name = poolCopyString(&dtd->pool, prefix->name);
if (!prefix->name)
return XML_FALSE;
}
poolDiscard(&tempPool);
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
for (context = s + 1;
*context != CONTEXT_SEP && *context != XML_T('\0');
context++)
2002-02-11 19:13:04 -04:00
if (!poolAppendChar(&tempPool, *context))
2003-01-25 18:41:29 -04:00
return XML_FALSE;
2002-02-11 19:13:04 -04:00
if (!poolAppendChar(&tempPool, XML_T('\0')))
2003-01-25 18:41:29 -04:00
return XML_FALSE;
2004-08-03 04:06:22 -03:00
if (addBinding(parser, prefix, NULL, poolStart(&tempPool),
2003-01-25 18:41:29 -04:00
&inheritedBindings) != XML_ERROR_NONE)
return XML_FALSE;
2002-02-11 19:13:04 -04:00
poolDiscard(&tempPool);
if (*context != XML_T('\0'))
2003-01-25 18:41:29 -04:00
++context;
2002-02-11 19:13:04 -04:00
s = context;
}
else {
if (!poolAppendChar(&tempPool, *s))
2003-01-25 18:41:29 -04:00
return XML_FALSE;
2002-02-11 19:13:04 -04:00
s++;
}
}
2003-01-25 18:41:29 -04:00
return XML_TRUE;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
normalizePublicId(XML_Char *publicId)
2002-02-11 19:13:04 -04:00
{
XML_Char *p = publicId;
XML_Char *s;
for (s = publicId; *s; s++) {
switch (*s) {
case 0x20:
case 0xD:
case 0xA:
if (p != publicId && p[-1] != 0x20)
2003-01-25 18:41:29 -04:00
*p++ = 0x20;
2002-02-11 19:13:04 -04:00
break;
default:
*p++ = *s;
}
}
if (p != publicId && p[-1] == 0x20)
--p;
*p = XML_T('\0');
}
2003-01-25 18:41:29 -04:00
static DTD *
dtdCreate(const XML_Memory_Handling_Suite *ms)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD *p = (DTD *)ms->malloc_fcn(sizeof(DTD));
if (p == NULL)
return p;
2002-02-11 19:13:04 -04:00
poolInit(&(p->pool), ms);
2003-01-25 18:41:29 -04:00
poolInit(&(p->entityValuePool), ms);
2002-02-11 19:13:04 -04:00
hashTableInit(&(p->generalEntities), ms);
hashTableInit(&(p->elementTypes), ms);
hashTableInit(&(p->attributeIds), ms);
hashTableInit(&(p->prefixes), ms);
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
p->paramEntityRead = XML_FALSE;
2002-02-11 19:13:04 -04:00
hashTableInit(&(p->paramEntities), ms);
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
p->defaultPrefix.name = NULL;
p->defaultPrefix.binding = NULL;
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
p->in_eldecl = XML_FALSE;
p->scaffIndex = NULL;
p->scaffold = NULL;
2002-02-11 19:13:04 -04:00
p->scaffLevel = 0;
p->scaffSize = 0;
p->scaffCount = 0;
2003-01-25 18:41:29 -04:00
p->contentStringLen = 0;
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
p->keepProcessing = XML_TRUE;
p->hasParamEntityRefs = XML_FALSE;
p->standalone = XML_FALSE;
return p;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static void
dtdReset(DTD *p, const XML_Memory_Handling_Suite *ms)
{
HASH_TABLE_ITER iter;
hashTableIterInit(&iter, &(p->elementTypes));
for (;;) {
ELEMENT_TYPE *e = (ELEMENT_TYPE *)hashTableIterNext(&iter);
if (!e)
break;
if (e->allocDefaultAtts != 0)
ms->free_fcn(e->defaultAtts);
}
hashTableClear(&(p->generalEntities));
2002-02-11 19:13:04 -04:00
#ifdef XML_DTD
2003-01-25 18:41:29 -04:00
p->paramEntityRead = XML_FALSE;
hashTableClear(&(p->paramEntities));
#endif /* XML_DTD */
hashTableClear(&(p->elementTypes));
hashTableClear(&(p->attributeIds));
hashTableClear(&(p->prefixes));
poolClear(&(p->pool));
poolClear(&(p->entityValuePool));
p->defaultPrefix.name = NULL;
p->defaultPrefix.binding = NULL;
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
p->in_eldecl = XML_FALSE;
ms->free_fcn(p->scaffIndex);
p->scaffIndex = NULL;
ms->free_fcn(p->scaffold);
p->scaffold = NULL;
2003-01-25 18:41:29 -04:00
p->scaffLevel = 0;
p->scaffSize = 0;
p->scaffCount = 0;
p->contentStringLen = 0;
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
p->keepProcessing = XML_TRUE;
p->hasParamEntityRefs = XML_FALSE;
p->standalone = XML_FALSE;
}
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
static void
dtdDestroy(DTD *p, XML_Bool isDocEntity, const XML_Memory_Handling_Suite *ms)
2002-02-11 19:13:04 -04:00
{
HASH_TABLE_ITER iter;
hashTableIterInit(&iter, &(p->elementTypes));
for (;;) {
ELEMENT_TYPE *e = (ELEMENT_TYPE *)hashTableIterNext(&iter);
if (!e)
break;
if (e->allocDefaultAtts != 0)
2003-01-25 18:41:29 -04:00
ms->free_fcn(e->defaultAtts);
2002-02-11 19:13:04 -04:00
}
hashTableDestroy(&(p->generalEntities));
#ifdef XML_DTD
hashTableDestroy(&(p->paramEntities));
#endif /* XML_DTD */
hashTableDestroy(&(p->elementTypes));
hashTableDestroy(&(p->attributeIds));
hashTableDestroy(&(p->prefixes));
poolDestroy(&(p->pool));
2003-01-25 18:41:29 -04:00
poolDestroy(&(p->entityValuePool));
if (isDocEntity) {
ms->free_fcn(p->scaffIndex);
ms->free_fcn(p->scaffold);
2003-01-25 18:41:29 -04:00
}
ms->free_fcn(p);
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
/* Do a deep copy of the DTD. Return 0 for out of memory, non-zero otherwise.
The new DTD has already been initialized.
*/
static int
dtdCopy(DTD *newDtd, const DTD *oldDtd, const XML_Memory_Handling_Suite *ms)
2002-02-11 19:13:04 -04:00
{
HASH_TABLE_ITER iter;
/* Copy the prefix table. */
hashTableIterInit(&iter, &(oldDtd->prefixes));
for (;;) {
const XML_Char *name;
const PREFIX *oldP = (PREFIX *)hashTableIterNext(&iter);
if (!oldP)
break;
name = poolCopyString(&(newDtd->pool), oldP->name);
if (!name)
return 0;
if (!lookup(&(newDtd->prefixes), name, sizeof(PREFIX)))
return 0;
}
hashTableIterInit(&iter, &(oldDtd->attributeIds));
/* Copy the attribute id table. */
for (;;) {
ATTRIBUTE_ID *newA;
const XML_Char *name;
const ATTRIBUTE_ID *oldA = (ATTRIBUTE_ID *)hashTableIterNext(&iter);
if (!oldA)
break;
/* Remember to allocate the scratch byte before the name. */
if (!poolAppendChar(&(newDtd->pool), XML_T('\0')))
return 0;
name = poolCopyString(&(newDtd->pool), oldA->name);
if (!name)
return 0;
++name;
2003-01-25 18:41:29 -04:00
newA = (ATTRIBUTE_ID *)lookup(&(newDtd->attributeIds), name,
sizeof(ATTRIBUTE_ID));
2002-02-11 19:13:04 -04:00
if (!newA)
return 0;
newA->maybeTokenized = oldA->maybeTokenized;
if (oldA->prefix) {
newA->xmlns = oldA->xmlns;
if (oldA->prefix == &oldDtd->defaultPrefix)
2003-01-25 18:41:29 -04:00
newA->prefix = &newDtd->defaultPrefix;
2002-02-11 19:13:04 -04:00
else
2003-01-25 18:41:29 -04:00
newA->prefix = (PREFIX *)lookup(&(newDtd->prefixes),
oldA->prefix->name, 0);
2002-02-11 19:13:04 -04:00
}
}
/* Copy the element type table. */
hashTableIterInit(&iter, &(oldDtd->elementTypes));
for (;;) {
int i;
ELEMENT_TYPE *newE;
const XML_Char *name;
const ELEMENT_TYPE *oldE = (ELEMENT_TYPE *)hashTableIterNext(&iter);
if (!oldE)
break;
name = poolCopyString(&(newDtd->pool), oldE->name);
if (!name)
return 0;
2003-01-25 18:41:29 -04:00
newE = (ELEMENT_TYPE *)lookup(&(newDtd->elementTypes), name,
sizeof(ELEMENT_TYPE));
2002-02-11 19:13:04 -04:00
if (!newE)
return 0;
if (oldE->nDefaultAtts) {
2003-01-25 18:41:29 -04:00
newE->defaultAtts = (DEFAULT_ATTRIBUTE *)
ms->malloc_fcn(oldE->nDefaultAtts * sizeof(DEFAULT_ATTRIBUTE));
if (!newE->defaultAtts) {
ms->free_fcn(newE);
return 0;
}
2002-02-11 19:13:04 -04:00
}
if (oldE->idAtt)
2003-01-25 18:41:29 -04:00
newE->idAtt = (ATTRIBUTE_ID *)
lookup(&(newDtd->attributeIds), oldE->idAtt->name, 0);
2002-02-11 19:13:04 -04:00
newE->allocDefaultAtts = newE->nDefaultAtts = oldE->nDefaultAtts;
if (oldE->prefix)
2003-01-25 18:41:29 -04:00
newE->prefix = (PREFIX *)lookup(&(newDtd->prefixes),
oldE->prefix->name, 0);
2002-02-11 19:13:04 -04:00
for (i = 0; i < newE->nDefaultAtts; i++) {
2003-01-25 18:41:29 -04:00
newE->defaultAtts[i].id = (ATTRIBUTE_ID *)
lookup(&(newDtd->attributeIds), oldE->defaultAtts[i].id->name, 0);
2002-02-11 19:13:04 -04:00
newE->defaultAtts[i].isCdata = oldE->defaultAtts[i].isCdata;
if (oldE->defaultAtts[i].value) {
2003-01-25 18:41:29 -04:00
newE->defaultAtts[i].value
= poolCopyString(&(newDtd->pool), oldE->defaultAtts[i].value);
if (!newE->defaultAtts[i].value)
return 0;
2002-02-11 19:13:04 -04:00
}
else
2003-01-25 18:41:29 -04:00
newE->defaultAtts[i].value = NULL;
2002-02-11 19:13:04 -04:00
}
}
/* Copy the entity tables. */
if (!copyEntityTable(&(newDtd->generalEntities),
2003-01-25 18:41:29 -04:00
&(newDtd->pool),
&(oldDtd->generalEntities)))
2002-02-11 19:13:04 -04:00
return 0;
#ifdef XML_DTD
if (!copyEntityTable(&(newDtd->paramEntities),
2003-01-25 18:41:29 -04:00
&(newDtd->pool),
&(oldDtd->paramEntities)))
2002-02-11 19:13:04 -04:00
return 0;
2003-01-25 18:41:29 -04:00
newDtd->paramEntityRead = oldDtd->paramEntityRead;
2002-02-11 19:13:04 -04:00
#endif /* XML_DTD */
2003-01-25 18:41:29 -04:00
newDtd->keepProcessing = oldDtd->keepProcessing;
newDtd->hasParamEntityRefs = oldDtd->hasParamEntityRefs;
2002-02-11 19:13:04 -04:00
newDtd->standalone = oldDtd->standalone;
/* Don't want deep copying for scaffolding */
newDtd->in_eldecl = oldDtd->in_eldecl;
newDtd->scaffold = oldDtd->scaffold;
newDtd->contentStringLen = oldDtd->contentStringLen;
newDtd->scaffSize = oldDtd->scaffSize;
newDtd->scaffLevel = oldDtd->scaffLevel;
newDtd->scaffIndex = oldDtd->scaffIndex;
return 1;
} /* End dtdCopy */
2003-01-25 18:41:29 -04:00
static int
copyEntityTable(HASH_TABLE *newTable,
STRING_POOL *newPool,
const HASH_TABLE *oldTable)
2002-02-11 19:13:04 -04:00
{
HASH_TABLE_ITER iter;
2003-01-25 18:41:29 -04:00
const XML_Char *cachedOldBase = NULL;
const XML_Char *cachedNewBase = NULL;
2002-02-11 19:13:04 -04:00
hashTableIterInit(&iter, oldTable);
for (;;) {
ENTITY *newE;
const XML_Char *name;
const ENTITY *oldE = (ENTITY *)hashTableIterNext(&iter);
if (!oldE)
break;
name = poolCopyString(newPool, oldE->name);
if (!name)
return 0;
newE = (ENTITY *)lookup(newTable, name, sizeof(ENTITY));
if (!newE)
return 0;
if (oldE->systemId) {
const XML_Char *tem = poolCopyString(newPool, oldE->systemId);
if (!tem)
2003-01-25 18:41:29 -04:00
return 0;
2002-02-11 19:13:04 -04:00
newE->systemId = tem;
if (oldE->base) {
2003-01-25 18:41:29 -04:00
if (oldE->base == cachedOldBase)
newE->base = cachedNewBase;
else {
cachedOldBase = oldE->base;
tem = poolCopyString(newPool, cachedOldBase);
if (!tem)
return 0;
cachedNewBase = newE->base = tem;
}
}
if (oldE->publicId) {
tem = poolCopyString(newPool, oldE->publicId);
if (!tem)
return 0;
newE->publicId = tem;
2002-02-11 19:13:04 -04:00
}
}
else {
2003-01-25 18:41:29 -04:00
const XML_Char *tem = poolCopyStringN(newPool, oldE->textPtr,
oldE->textLen);
2002-02-11 19:13:04 -04:00
if (!tem)
2003-01-25 18:41:29 -04:00
return 0;
2002-02-11 19:13:04 -04:00
newE->textPtr = tem;
newE->textLen = oldE->textLen;
}
if (oldE->notation) {
const XML_Char *tem = poolCopyString(newPool, oldE->notation);
if (!tem)
2003-01-25 18:41:29 -04:00
return 0;
2002-02-11 19:13:04 -04:00
newE->notation = tem;
}
2003-01-25 18:41:29 -04:00
newE->is_param = oldE->is_param;
newE->is_internal = oldE->is_internal;
2002-02-11 19:13:04 -04:00
}
return 1;
}
#define INIT_POWER 6
2002-02-11 19:13:04 -04:00
static XML_Bool FASTCALL
2003-01-25 18:41:29 -04:00
keyeq(KEY s1, KEY s2)
2002-02-11 19:13:04 -04:00
{
for (; *s1 == *s2; s1++, s2++)
if (*s1 == 0)
return XML_TRUE;
return XML_FALSE;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static unsigned long FASTCALL
hash(KEY s)
2002-02-11 19:13:04 -04:00
{
unsigned long h = 0;
while (*s)
h = CHAR_HASH(h, *s++);
2002-02-11 19:13:04 -04:00
return h;
}
2003-01-25 18:41:29 -04:00
static NAMED *
lookup(HASH_TABLE *table, KEY name, size_t createSize)
2002-02-11 19:13:04 -04:00
{
size_t i;
if (table->size == 0) {
size_t tsize;
if (!createSize)
2003-01-25 18:41:29 -04:00
return NULL;
table->power = INIT_POWER;
/* table->size is a power of 2 */
table->size = (size_t)1 << INIT_POWER;
tsize = table->size * sizeof(NAMED *);
2003-01-25 18:41:29 -04:00
table->v = (NAMED **)table->mem->malloc_fcn(tsize);
2004-08-03 04:06:22 -03:00
if (!table->v) {
table->size = 0;
2003-01-25 18:41:29 -04:00
return NULL;
2004-08-03 04:06:22 -03:00
}
2002-02-11 19:13:04 -04:00
memset(table->v, 0, tsize);
i = hash(name) & ((unsigned long)table->size - 1);
2002-02-11 19:13:04 -04:00
}
else {
unsigned long h = hash(name);
unsigned long mask = (unsigned long)table->size - 1;
unsigned char step = 0;
i = h & mask;
while (table->v[i]) {
2002-02-11 19:13:04 -04:00
if (keyeq(name, table->v[i]->name))
2003-01-25 18:41:29 -04:00
return table->v[i];
if (!step)
step = PROBE_STEP(h, mask, table->power);
i < step ? (i += table->size - step) : (i -= step);
2002-02-11 19:13:04 -04:00
}
if (!createSize)
2003-01-25 18:41:29 -04:00
return NULL;
/* check for overflow (table is half full) */
if (table->used >> (table->power - 1)) {
unsigned char newPower = table->power + 1;
size_t newSize = (size_t)1 << newPower;
unsigned long newMask = (unsigned long)newSize - 1;
2002-02-11 19:13:04 -04:00
size_t tsize = newSize * sizeof(NAMED *);
2003-01-25 18:41:29 -04:00
NAMED **newV = (NAMED **)table->mem->malloc_fcn(tsize);
2002-02-11 19:13:04 -04:00
if (!newV)
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
memset(newV, 0, tsize);
for (i = 0; i < table->size; i++)
2003-01-25 18:41:29 -04:00
if (table->v[i]) {
unsigned long newHash = hash(table->v[i]->name);
size_t j = newHash & newMask;
step = 0;
while (newV[j]) {
if (!step)
step = PROBE_STEP(newHash, newMask, newPower);
j < step ? (j += newSize - step) : (j -= step);
}
2003-01-25 18:41:29 -04:00
newV[j] = table->v[i];
}
2002-02-11 19:13:04 -04:00
table->mem->free_fcn(table->v);
table->v = newV;
table->power = newPower;
2002-02-11 19:13:04 -04:00
table->size = newSize;
i = h & newMask;
step = 0;
while (table->v[i]) {
if (!step)
step = PROBE_STEP(h, newMask, newPower);
i < step ? (i += newSize - step) : (i -= step);
}
2002-02-11 19:13:04 -04:00
}
}
2003-01-25 18:41:29 -04:00
table->v[i] = (NAMED *)table->mem->malloc_fcn(createSize);
2002-02-11 19:13:04 -04:00
if (!table->v[i])
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
memset(table->v[i], 0, createSize);
table->v[i]->name = name;
(table->used)++;
return table->v[i];
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
hashTableClear(HASH_TABLE *table)
{
size_t i;
for (i = 0; i < table->size; i++) {
table->mem->free_fcn(table->v[i]);
table->v[i] = NULL;
2003-01-25 18:41:29 -04:00
}
table->used = 0;
}
static void FASTCALL
hashTableDestroy(HASH_TABLE *table)
2002-02-11 19:13:04 -04:00
{
size_t i;
for (i = 0; i < table->size; i++)
table->mem->free_fcn(table->v[i]);
table->mem->free_fcn(table->v);
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
hashTableInit(HASH_TABLE *p, const XML_Memory_Handling_Suite *ms)
2002-02-11 19:13:04 -04:00
{
p->power = 0;
2002-02-11 19:13:04 -04:00
p->size = 0;
p->used = 0;
2003-01-25 18:41:29 -04:00
p->v = NULL;
2002-02-11 19:13:04 -04:00
p->mem = ms;
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
hashTableIterInit(HASH_TABLE_ITER *iter, const HASH_TABLE *table)
2002-02-11 19:13:04 -04:00
{
iter->p = table->v;
iter->end = iter->p + table->size;
}
2003-01-25 18:41:29 -04:00
static NAMED * FASTCALL
hashTableIterNext(HASH_TABLE_ITER *iter)
2002-02-11 19:13:04 -04:00
{
while (iter->p != iter->end) {
NAMED *tem = *(iter->p)++;
if (tem)
return tem;
}
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
poolInit(STRING_POOL *pool, const XML_Memory_Handling_Suite *ms)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
pool->blocks = NULL;
pool->freeBlocks = NULL;
pool->start = NULL;
pool->ptr = NULL;
pool->end = NULL;
2002-02-11 19:13:04 -04:00
pool->mem = ms;
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
poolClear(STRING_POOL *pool)
2002-02-11 19:13:04 -04:00
{
if (!pool->freeBlocks)
pool->freeBlocks = pool->blocks;
else {
BLOCK *p = pool->blocks;
while (p) {
BLOCK *tem = p->next;
p->next = pool->freeBlocks;
pool->freeBlocks = p;
p = tem;
}
}
2003-01-25 18:41:29 -04:00
pool->blocks = NULL;
pool->start = NULL;
pool->ptr = NULL;
pool->end = NULL;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static void FASTCALL
poolDestroy(STRING_POOL *pool)
2002-02-11 19:13:04 -04:00
{
BLOCK *p = pool->blocks;
while (p) {
BLOCK *tem = p->next;
pool->mem->free_fcn(p);
p = tem;
}
p = pool->freeBlocks;
while (p) {
BLOCK *tem = p->next;
pool->mem->free_fcn(p);
p = tem;
}
}
2003-01-25 18:41:29 -04:00
static XML_Char *
poolAppend(STRING_POOL *pool, const ENCODING *enc,
const char *ptr, const char *end)
2002-02-11 19:13:04 -04:00
{
if (!pool->ptr && !poolGrow(pool))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
for (;;) {
XmlConvert(enc, &ptr, end, (ICHAR **)&(pool->ptr), (ICHAR *)pool->end);
if (ptr == end)
break;
if (!poolGrow(pool))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
}
return pool->start;
}
2003-01-25 18:41:29 -04:00
static const XML_Char * FASTCALL
poolCopyString(STRING_POOL *pool, const XML_Char *s)
2002-02-11 19:13:04 -04:00
{
do {
if (!poolAppendChar(pool, *s))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
} while (*s++);
s = pool->start;
poolFinish(pool);
return s;
}
2003-01-25 18:41:29 -04:00
static const XML_Char *
poolCopyStringN(STRING_POOL *pool, const XML_Char *s, int n)
2002-02-11 19:13:04 -04:00
{
if (!pool->ptr && !poolGrow(pool))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
for (; n > 0; --n, s++) {
if (!poolAppendChar(pool, *s))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
}
s = pool->start;
poolFinish(pool);
return s;
}
2003-01-25 18:41:29 -04:00
static const XML_Char * FASTCALL
poolAppendString(STRING_POOL *pool, const XML_Char *s)
2002-02-11 19:13:04 -04:00
{
while (*s) {
if (!poolAppendChar(pool, *s))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
s++;
2003-01-25 18:41:29 -04:00
}
2002-02-11 19:13:04 -04:00
return pool->start;
2003-01-25 18:41:29 -04:00
}
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
static XML_Char *
poolStoreString(STRING_POOL *pool, const ENCODING *enc,
const char *ptr, const char *end)
2002-02-11 19:13:04 -04:00
{
if (!poolAppend(pool, enc, ptr, end))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
if (pool->ptr == pool->end && !poolGrow(pool))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
*(pool->ptr)++ = 0;
return pool->start;
}
2003-01-25 18:41:29 -04:00
static XML_Bool FASTCALL
poolGrow(STRING_POOL *pool)
2002-02-11 19:13:04 -04:00
{
if (pool->freeBlocks) {
if (pool->start == 0) {
pool->blocks = pool->freeBlocks;
pool->freeBlocks = pool->freeBlocks->next;
2003-01-25 18:41:29 -04:00
pool->blocks->next = NULL;
2002-02-11 19:13:04 -04:00
pool->start = pool->blocks->s;
pool->end = pool->start + pool->blocks->size;
pool->ptr = pool->start;
2003-01-25 18:41:29 -04:00
return XML_TRUE;
2002-02-11 19:13:04 -04:00
}
if (pool->end - pool->start < pool->freeBlocks->size) {
BLOCK *tem = pool->freeBlocks->next;
pool->freeBlocks->next = pool->blocks;
pool->blocks = pool->freeBlocks;
pool->freeBlocks = tem;
2003-01-25 18:41:29 -04:00
memcpy(pool->blocks->s, pool->start,
(pool->end - pool->start) * sizeof(XML_Char));
2002-02-11 19:13:04 -04:00
pool->ptr = pool->blocks->s + (pool->ptr - pool->start);
pool->start = pool->blocks->s;
pool->end = pool->start + pool->blocks->size;
2003-01-25 18:41:29 -04:00
return XML_TRUE;
2002-02-11 19:13:04 -04:00
}
}
if (pool->blocks && pool->start == pool->blocks->s) {
int blockSize = (pool->end - pool->start)*2;
2003-01-25 18:41:29 -04:00
pool->blocks = (BLOCK *)
pool->mem->realloc_fcn(pool->blocks,
(offsetof(BLOCK, s)
+ blockSize * sizeof(XML_Char)));
2003-01-25 18:41:29 -04:00
if (pool->blocks == NULL)
return XML_FALSE;
2002-02-11 19:13:04 -04:00
pool->blocks->size = blockSize;
pool->ptr = pool->blocks->s + (pool->ptr - pool->start);
pool->start = pool->blocks->s;
pool->end = pool->start + blockSize;
}
else {
BLOCK *tem;
int blockSize = pool->end - pool->start;
if (blockSize < INIT_BLOCK_SIZE)
blockSize = INIT_BLOCK_SIZE;
else
blockSize *= 2;
2003-01-25 18:41:29 -04:00
tem = (BLOCK *)pool->mem->malloc_fcn(offsetof(BLOCK, s)
+ blockSize * sizeof(XML_Char));
2002-02-11 19:13:04 -04:00
if (!tem)
2003-01-25 18:41:29 -04:00
return XML_FALSE;
2002-02-11 19:13:04 -04:00
tem->size = blockSize;
tem->next = pool->blocks;
pool->blocks = tem;
if (pool->ptr != pool->start)
2003-01-25 18:41:29 -04:00
memcpy(tem->s, pool->start,
(pool->ptr - pool->start) * sizeof(XML_Char));
2002-02-11 19:13:04 -04:00
pool->ptr = tem->s + (pool->ptr - pool->start);
pool->start = tem->s;
pool->end = tem->s + blockSize;
}
2003-01-25 18:41:29 -04:00
return XML_TRUE;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
static int FASTCALL
2002-02-11 19:13:04 -04:00
nextScaffoldPart(XML_Parser parser)
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
2002-02-11 19:13:04 -04:00
CONTENT_SCAFFOLD * me;
int next;
2003-01-25 18:41:29 -04:00
if (!dtd->scaffIndex) {
dtd->scaffIndex = (int *)MALLOC(groupSize * sizeof(int));
if (!dtd->scaffIndex)
2002-02-11 19:13:04 -04:00
return -1;
2003-01-25 18:41:29 -04:00
dtd->scaffIndex[0] = 0;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
if (dtd->scaffCount >= dtd->scaffSize) {
CONTENT_SCAFFOLD *temp;
if (dtd->scaffold) {
temp = (CONTENT_SCAFFOLD *)
REALLOC(dtd->scaffold, dtd->scaffSize * 2 * sizeof(CONTENT_SCAFFOLD));
if (temp == NULL)
return -1;
dtd->scaffSize *= 2;
2002-02-11 19:13:04 -04:00
}
else {
2003-01-25 18:41:29 -04:00
temp = (CONTENT_SCAFFOLD *)MALLOC(INIT_SCAFFOLD_ELEMENTS
* sizeof(CONTENT_SCAFFOLD));
if (temp == NULL)
return -1;
dtd->scaffSize = INIT_SCAFFOLD_ELEMENTS;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
dtd->scaffold = temp;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
next = dtd->scaffCount++;
me = &dtd->scaffold[next];
if (dtd->scaffLevel) {
CONTENT_SCAFFOLD *parent = &dtd->scaffold[dtd->scaffIndex[dtd->scaffLevel-1]];
2002-02-11 19:13:04 -04:00
if (parent->lastchild) {
2003-01-25 18:41:29 -04:00
dtd->scaffold[parent->lastchild].nextsib = next;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
if (!parent->childcnt)
2002-02-11 19:13:04 -04:00
parent->firstchild = next;
parent->lastchild = next;
parent->childcnt++;
}
me->firstchild = me->lastchild = me->childcnt = me->nextsib = 0;
return next;
2003-01-25 18:41:29 -04:00
}
2002-02-11 19:13:04 -04:00
static void
2003-01-25 18:41:29 -04:00
build_node(XML_Parser parser,
int src_node,
XML_Content *dest,
XML_Content **contpos,
XML_Char **strpos)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
dest->type = dtd->scaffold[src_node].type;
dest->quant = dtd->scaffold[src_node].quant;
2002-02-11 19:13:04 -04:00
if (dest->type == XML_CTYPE_NAME) {
2003-01-25 18:41:29 -04:00
const XML_Char *src;
2002-02-11 19:13:04 -04:00
dest->name = *strpos;
2003-01-25 18:41:29 -04:00
src = dtd->scaffold[src_node].name;
2002-02-11 19:13:04 -04:00
for (;;) {
*(*strpos)++ = *src;
2003-01-25 18:41:29 -04:00
if (!*src)
break;
2002-02-11 19:13:04 -04:00
src++;
}
dest->numchildren = 0;
2003-01-25 18:41:29 -04:00
dest->children = NULL;
2002-02-11 19:13:04 -04:00
}
else {
unsigned int i;
int cn;
2003-01-25 18:41:29 -04:00
dest->numchildren = dtd->scaffold[src_node].childcnt;
2002-02-11 19:13:04 -04:00
dest->children = *contpos;
*contpos += dest->numchildren;
2003-01-25 18:41:29 -04:00
for (i = 0, cn = dtd->scaffold[src_node].firstchild;
i < dest->numchildren;
i++, cn = dtd->scaffold[cn].nextsib) {
2002-02-11 19:13:04 -04:00
build_node(parser, cn, &(dest->children[i]), contpos, strpos);
}
2003-01-25 18:41:29 -04:00
dest->name = NULL;
2002-02-11 19:13:04 -04:00
}
2003-01-25 18:41:29 -04:00
}
2002-02-11 19:13:04 -04:00
static XML_Content *
build_model (XML_Parser parser)
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
2002-02-11 19:13:04 -04:00
XML_Content *ret;
XML_Content *cpos;
2003-01-25 18:41:29 -04:00
XML_Char * str;
int allocsize = (dtd->scaffCount * sizeof(XML_Content)
+ (dtd->contentStringLen * sizeof(XML_Char)));
ret = (XML_Content *)MALLOC(allocsize);
if (!ret)
return NULL;
2002-02-11 19:13:04 -04:00
2003-01-25 18:41:29 -04:00
str = (XML_Char *) (&ret[dtd->scaffCount]);
2002-02-11 19:13:04 -04:00
cpos = &ret[1];
build_node(parser, 0, ret, &cpos, &str);
return ret;
2003-01-25 18:41:29 -04:00
}
2002-02-11 19:13:04 -04:00
static ELEMENT_TYPE *
getElementType(XML_Parser parser,
2003-01-25 18:41:29 -04:00
const ENCODING *enc,
const char *ptr,
const char *end)
2002-02-11 19:13:04 -04:00
{
2003-01-25 18:41:29 -04:00
DTD * const dtd = _dtd; /* save one level of indirection */
const XML_Char *name = poolStoreString(&dtd->pool, enc, ptr, end);
2002-02-11 19:13:04 -04:00
ELEMENT_TYPE *ret;
2003-01-25 18:41:29 -04:00
if (!name)
return NULL;
ret = (ELEMENT_TYPE *) lookup(&dtd->elementTypes, name, sizeof(ELEMENT_TYPE));
if (!ret)
return NULL;
2002-02-11 19:13:04 -04:00
if (ret->name != name)
2003-01-25 18:41:29 -04:00
poolDiscard(&dtd->pool);
2002-02-11 19:13:04 -04:00
else {
2003-01-25 18:41:29 -04:00
poolFinish(&dtd->pool);
2002-02-11 19:13:04 -04:00
if (!setElementTypePrefix(parser, ret))
2003-01-25 18:41:29 -04:00
return NULL;
2002-02-11 19:13:04 -04:00
}
return ret;
2003-01-25 18:41:29 -04:00
}