cpython

Commit Graph

Author	SHA1	Message	Date
Fred Drake	2a3d7db93e	Added character data buffering to pyexpat parser objects. Setting the buffer_text attribute to true causes the parser to collect character data, waiting as long as possible to report it to the Python callback. This can save an enormous number of callbacks from C to Python, which can be a substantial performance improvement. buffer_text defaults to false.	2002-06-28 22:56:48 +00:00
Fred Drake	1add023b88	Integrate the tests for name interning from PyXML (test_pyexpat.py revision 1.12 in PyXML).	2002-06-27 19:41:51 +00:00
Jeremy Hylton	3c19ec4eab	Fix when pyexpat not built. Import pyexpat first so that import error occurs when it is not available.	2001-07-30 21:47:25 +00:00
Tim Peters	2f228e75e4	Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask". The comment following used to say: /* We use ~hash instead of hash, as degenerate hash functions, such as for ints <sigh>, can have lots of leading zeros. It's not really a performance risk, but better safe than sorry. 12-Dec-00 tim: so ~hash produces lots of leading ones instead -- what's the gain? */ That is, there was never a good reason for doing it. And to the contrary, as explained on Python-Dev last December, it tended to make the *sum* (i + incr) & mask (which is the first table index examined in case of collision) the same "too often" across distinct hashes. Changing to the simpler "i = hash & mask" reduced the number of string-dict collisions (== # number of times we go around the lookup for-loop) from about 6 million to 5 million during a full run of the test suite (these are approximate because the test suite does some random stuff from run to run). The number of collisions in non-string dicts also decreased, but not as dramatically. Note that this may, for a given dict, change the order (wrt previous releases) of entries exposed by .keys(), .values() and .items(). A number of std tests suffered bogus failures as a result. For dicts keyed by small ints, or (less so) by characters, the order is much more likely to be in increasing order of key now; e.g., >>> d = {} >>> for i in range(10): ... d[i] = i ... >>> d {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9} >>> Unfortunately, people may latch on to that in small examples and draw a bogus conclusion. test_support.py: Moved test_extcall's sortdict() into test_support, made it stronger, and imported sortdict into other std tests that needed it. test_unicode.py: Excluded cp875 from the "roundtrip over range(128)" test, because cp875 doesn't have a well-defined inverse for unicode("?", "cp875"). See Python-Dev for excruciating details. Cookie.py: Changed various output functions to sort dicts before building strings from them. test_extcall: Fiddled the expected-result file. This remains sensitive to native dict ordering, because, e.g., if there are multiple errors in a keyword-arg dict (and test_extcall sets up many cases like that), the specific error Python complains about first depends on native dict ordering.	2001-05-13 00:19:31 +00:00
Fred Drake	8f42e2b1fa	Update test to accommodate the change to the namespace_separator parameter of ParserCreate(). Added assignment tests for the ordered_attributes and specified_attributes values, similar to the checks for the returns_unicode attribute.	2001-04-25 16:03:54 +00:00
Fred Drake	1e0611b208	The "context" parameter to the ExternalEntityRefParameter exposes internal information from the Expat library that is not part of its public API. Do not print this information as the format of the string may (and will) change as Expat evolves. Add additional tests to make sure the ParserCreate() function raises the right exceptions on illegal parameters.	2000-12-23 22:12:07 +00:00
Fred Drake	004d5e6880	Make reindent.py happy (convert everything to 4-space indents!).	2000-10-23 17:22:08 +00:00
Fred Drake	7fbc85c5c5	Rename the public interface from "pyexpat" to "xml.parsers.expat".	2000-09-23 04:47:56 +00:00
Fred Drake	265a804af2	Revise the test case for pyexpat to avoid using asserts. Conform better to the Python style guide, and remove unneeded imports.	2000-09-21 20:32:13 +00:00
Andrew M. Kuchling	7fd7e36b08	Change pyexpat test suite to exercise the .returns_unicode attribute, parsing the sample data once with 8-bit strings and once with Unicode.	2000-06-27 00:37:25 +00:00
Andrew M. Kuchling	e188d52a7e	Untabified file to fix problems reported by tabnanny	2000-04-02 05:15:38 +00:00
Andrew M. Kuchling	b17664ddf0	Added test case for pyexpat module that tries to exercise all the handlers	2000-03-31 15:44:52 +00:00

12 Commits
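
The buffer_text behavior described in commit 2a3d7db93e can be illustrated with a minimal sketch against the modern xml.parsers.expat module. The helper name collect_character_data and the sample document are assumptions made for illustration; they do not come from the commits listed above. The sketch records every chunk passed to CharacterDataHandler, once with buffering off and once with it on.

```python
import xml.parsers.expat


def collect_character_data(document, buffer_text):
    """Parse `document` and return the list of chunks passed to the
    character data callback (hypothetical helper, for illustration only)."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # buffer_text defaults to false; setting it to true lets the parser
    # accumulate character data before calling back into Python.
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)  # True marks this as the final chunk of input
    return chunks


if __name__ == "__main__":
    # Entity references typically split the text into several callbacks
    # when buffering is off; the exact split is up to Expat.
    doc = "<root>one&amp;two&amp;three</root>"
    unbuffered = collect_character_data(doc, buffer_text=False)
    buffered = collect_character_data(doc, buffer_text=True)
    print("callbacks without buffering:", len(unbuffered))
    print("callbacks with buffering:   ", len(buffered))
```

Both runs deliver the same text overall; only the number of callbacks differs. How the unbuffered text is split is an Expat implementation detail, so code should not depend on particular chunk boundaries.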