variant that never needs to "search from the right".
Also fixed an unlikely memory leak in get_line, if the string size overflows INT_MAX.
Also added a new std test, test_bufio, to make sure .readline() works.
the mapping dictionaries can now contain 1-n mappings, meaning
that character ordinals may be mapped to strings or Unicode objects,
e.g. 0x0078 ('x') -> u"abc", causing the ordinal to be replaced by
the complete string or Unicode object instead of just one character.
Another feature introduced by the patch is that of mapping ordinals to
the empty string. This allows removing characters.
The patch is different from patch #103100 in that it does not cause a
performance hit for the normal use case of 1-1 mappings.
Written by Marc-Andre Lemburg, copyright assigned to Guido van Rossum.
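
A minimal sketch of the 1-n and empty-string mappings described above,
assuming a current Python 3 interpreter (the patch itself targeted the
Unicode objects of its day). The decoding_map dictionary is hypothetical
and only serves the illustration.

```python
import codecs

# Hypothetical decoding map: byte value -> replacement text.
decoding_map = {
    0x78: "abc",  # 1-n mapping: 'x' is replaced by the complete string
    0x79: "",     # mapping to the empty string removes 'y'
    0x7A: "z",    # ordinary 1-1 mapping
}

# charmap_decode returns the decoded text and the number of bytes consumed.
decoded, consumed = codecs.charmap_decode(b"xyz", "strict", decoding_map)
print(decoded, consumed)
```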
the urljoin() function, which exercises the urlparse() and urlunparse()
functions as side effects.
(Moshe, why did we have perfectly empty tests checked in for this?)
urljoin(): Make this conform to RFC 1808 for all examples given in that
RFC (both "Normal" and "Abnormal"), so long as that RFC does
not conflict with the older RFC 1630, which also specified
relative URL resolution.
This closes SF bug #110832 (Jitterbug PR#194).
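
For illustration, a few of the "Normal" examples from RFC 1808 can be
checked as below. This is a sketch that assumes Python 3, where the
function lives in urllib.parse rather than in the urlparse module the
checkin refers to.

```python
from urllib.parse import urljoin

# Base URL used by the examples in RFC 1808.
base = "http://a/b/c/d;p?q#f"

# Relative reference -> expected absolute URL, per the RFC's "Normal" examples.
normal_examples = {
    "g": "http://a/b/c/g",
    "./g": "http://a/b/c/g",
    "g/": "http://a/b/c/g/",
    "/g": "http://a/g",
    "//g": "http://g",
    "../g": "http://a/b/g",
    "../../g": "http://a/g",
}

resolved = {rel: urljoin(base, rel) for rel in normal_examples}
for rel, url in resolved.items():
    print(rel, "->", url)
```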
an empty keywords dictionary (via apply() or the extended call syntax),
the keywords dict should be ignored. If the keywords dict is not empty,
TypeError should be raised. (Between the restructuring of the call
machinery and this patch, an empty dict in this situation would trigger
a SystemError via PyErr_BadInternalCall().)
Added regression tests to detect errors for this.
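
The intended behaviour can be sketched with the extended call syntax,
assuming Python 3 and using len() as a convenient stand-in for any
callable that accepts no keyword arguments.

```python
# An empty keywords dict is ignored: the call behaves like len("abc").
empty_keywords = {}
length = len("abc", **empty_keywords)

# A non-empty keywords dict must raise TypeError.
try:
    len("abc", **{"bogus": 1})
except TypeError as exc:
    error = exc
else:
    error = None

print(length, type(error).__name__)
```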
codec to not apply Latin-1 mappings for keys which are not found
in the mapping dictionaries, but instead treat them as undefined
mappings.
The patch was originally written by Martin v. Loewis with some
additional (cosmetic) changes and an updated test script
by Marc-Andre Lemburg.
The standard codecs were recreated from the most current files
available at the Unicode.org site using the Tools/scripts/gencodec.py
tool.
This patch closes the bugs #116285 and #119960.
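
A small sketch of the "undefined mapping" behaviour, assuming Python 3
and a hypothetical decoding_map that defines only one byte. A key that
is missing from the dictionary is not passed through as Latin-1; it is
reported as an error instead.

```python
import codecs

# Hypothetical map: only byte 0x41 is defined.
decoding_map = {0x41: "A"}

# With "strict" error handling, the undefined byte 0x42 is an error.
try:
    codecs.charmap_decode(b"AB", "strict", decoding_map)
except UnicodeDecodeError as exc:
    strict_error = exc
else:
    strict_error = None

# With "replace", the undefined byte becomes U+FFFD instead of 'B'.
replaced, consumed = codecs.charmap_decode(b"AB", "replace", decoding_map)
print(repr(replaced), consumed)
```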
1. When running in verbose mode, if any test happens to pass, print
a warning that the apparent success may be bogus (stdout isn't
compared in verbose mode). Been fooled by that too often.
2. When a test fails because the expected stdout doesn't match the
actual stdout, print as much of stdout as did match before the
first failing write. Else we get failures of the form "expected
'a', got 'b'" and a glance at the expected output file shows
500 instances of 'a' -- no idea where it failed, and, as in #1,
trying to run in verbose mode instead doesn't help because
stdout isn't compared then.
the logic. That resulted in a bug. My previous getopt checkin repaired
the bug but left the sorting. The solution is significantly simpler if
we don't bother sorting at all, so this checkin gets rid of the sort and
the code that relied on it.
Christmas present to myself: the bisect module didn't define what
happened if the new element was already in the list. It so happens
that it inserted the new element "to the right" of all equal elements.
Since it wasn't defined, among other bad implications it was a mystery
how to use bisect to determine whether an element was already in the
list (I've seen code that *assumed* "to the right" without justification).
Added new methods bisect_left and insort_left that insert "to the left"
instead; made the old names bisect and insort aliases for the new names
bisect_right and insort_right; beefed up docstrings to explain what
these actually do; and added a std test for the bisect module.
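
To make the left/right distinction concrete, here is a short sketch,
assuming Python 3. The contains() helper is hypothetical; it shows one
way to answer the "is the element already in the list" question using
bisect_left.

```python
import bisect

data = [10, 20, 20, 20, 30]

left = bisect.bisect_left(data, 20)    # index of the first 20
right = bisect.bisect_right(data, 20)  # index just past the last 20
alias = bisect.bisect(data, 20)        # old name, same as bisect_right


def contains(sorted_list, item):
    """Hypothetical helper: membership test on a sorted list."""
    i = bisect.bisect_left(sorted_list, item)
    return i < len(sorted_list) and sorted_list[i] == item


# insort_left puts the new element before any equal elements.
values = [1.0, 2.0, 3.0]
bisect.insort_left(values, 2)
print(left, right, alias, contains(data, 20), contains(data, 25), values)
```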
- implement hasAttribute and hasAttributeNS (1.7)
- Node.replaceChild(): Update the sibling nodes to point to newChild. Set
the .nextSibling attribute on oldChild instead of adding a .newChild
attribute (1.9).
information from the Expat library that is not part of its public API.
Do not print this information as the format of the string may (and will)
change as Expat evolves.
Add additional tests to make sure the ParserCreate() function raises the
right exceptions on illegal parameters.
give minidom.py behaviour that complies with the DOM Level 1 REC,
which says that when a node newChild is added to the tree, "if the
newChild is already in the tree, it is first removed."
pulldom.py is patched to use the public minidom interface instead
of setting .parentNode itself. Possibly this reduces pulldom's
efficiency; someone else will have to pronounce on that.
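
The DOM Level 1 rule quoted above can be observed with a short sketch,
assuming Python 3 and a made-up three-element document.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><a><x/></a><b/></root>")
a, b = doc.documentElement.childNodes
x = a.firstChild

# x is already in the tree (under a); appending it to b removes it from a first.
b.appendChild(x)

print(doc.documentElement.toxml())
```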
so we can't use it.
While I'm at it, got rid of string module use. (Found several new
hard special cases for a hypothetical conversion tool: from string
import join, find, rfind; and a local assignment "find=string.find".)
required to work around restrictions on the arguments of
u.translate():
1) don't pass the deletions argument if it's empty;
2) convert table to Unicode if s is Unicode.
This fixes SF bug #124060.
bugs #126161 and 123634).
The solution doesn't use the unicode-escape encoding; that has other
problems (it seems not 100% reversible). Rather, it transforms the
input Unicode object slightly before encoding it using
raw-unicode-escape, so that the decoding will reconstruct the original
string: backslash and newline characters are translated into their
\uXXXX counterparts.
This is backwards incompatible for strings containing backslashes, but
for some of those strings, the pickling was already broken.
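
The transformation can be sketched as follows, assuming Python 3. The
function name escape_for_pickle is hypothetical and this is not the
pickle module's own code, only an illustration of why the round trip
works.

```python
def escape_for_pickle(text):
    """Hypothetical helper: pre-escape, then encode with raw-unicode-escape."""
    # Backslashes first, so the escapes added for newlines are left alone.
    text = text.replace("\\", "\\u005c")
    text = text.replace("\n", "\\u000a")
    return text.encode("raw-unicode-escape")


original = "path\\to\nfile \u20ac"
encoded = escape_for_pickle(original)

# Plain raw-unicode-escape decoding reconstructs the original string.
restored = encoded.decode("raw-unicode-escape")
print(encoded)
```

Because the encoded form contains no literal newline, it can be written
as a single line of a text-mode pickle.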
obsolete!).
Fix a bug in ftpwrapper.retrfile() where somehow ftplib.error_perm was
assumed to be a string. (The fix applies str().)
Also break some long lines and change the output from test() slightly.
Make Node inherit from xml.dom.Node to pick up the NodeType values
defined by the W3C recommendation.
When raising AttributeError, be sure to provide the name of the attribute
that does not exist.
Node.normalize(): Make sure we do not allow an empty text node to survive
as the first child; update the sibling links properly.
_getElementsByTagNameNSHelper(): Make recursive calls using the right
number of parameters.
Attr.__setattr__(): Be sure to update name and nodeName at the same time
since they are synonyms for this node type.
AttributeList: Renamed to NamedNodeMap (AttributeList maintained as an
alias). Compute the length attribute dynamically to allow
the underlying structures to mutate.
AttributeList.item(): Call .keys() on the dictionary rather than using
self.keys() for performance.
AttributeList.setNamedItem(), .setNamedItemNS():
Added methods.
Text.splitText():
Added method.
DocumentType:
Added implementation class.
DOMImplementation:
Added implementation class.
Document.appendChild(): Do not allow a second document element to be added.
Document.documentElement: Find this dynamically, so that one can be
removed and another added.
Document.unlink(): Clear the doctype attribute.
_get_StringIO(): Only use the StringIO module; cStringIO does not support
Unicode.
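
A brief sketch of the Document.appendChild() and documentElement items
above, assuming Python 3 and the minidom that ships with it.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<first/>")

# A second document element is refused.
try:
    doc.appendChild(doc.createElement("second"))
except xml.dom.HierarchyRequestErr:
    rejected = True
else:
    rejected = False

# documentElement is found dynamically, so one element can be removed
# and another added in its place.
removed = doc.removeChild(doc.documentElement)
doc.appendChild(doc.createElement("second"))
print(rejected, removed.tagName, doc.documentElement.tagName)
```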
objects; uses minidom if one is not provided to the constructor.
parse(): Pick up the default_bufsize default value dynamically so that
the value in the module may be (meaningfully) changed at runtime.
This (partially) closes patch #102477.
can't be imported. This makes StringIO.py work with Jython.
Also, get rid of the string module by converting to string methods.
Shorten some lines by using augmented assignment where appropriate.
encodings package aliases mapping dictionary rather than in the
internal cache used by the search function.
This enables aliases to take advantage of the full normalization
process applied to encoding names which was previously not available.
The patch restricts alias registration to new aliases. Existing
aliases cannot be overridden anymore.
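
As a rough illustration of aliases going through name normalization,
assuming Python 3 (where the aliases dictionary is
encodings.aliases.aliases), several spellings of one alias resolve to
the same codec.

```python
import codecs
import encodings.aliases

# The alias table maps a normalized alias to the codec module name.
target = encodings.aliases.aliases["latin1"]

# Differently spelled names are normalized before the alias lookup.
spellings = ("latin1", "LATIN1", "Latin-1", "latin_1")
resolved = {codecs.lookup(name).name for name in spellings}
print(target, resolved)
```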
roundtrip(): Show the offending syntax tree when things break; this makes
it a little easier to debug the module by adding test cases.
(Still need better tests for this module, but there's not enough time
today.)
socket in httplib.py.
The bug reports that on Windows, you must pass sock._sock to the
socket.ssl() call. But on Unix, you must pass sock itself. (sock is
a wrapper on Windows but not on Unix; the ssl() call wants the real
socket object, not the wrapper.)
So we see if sock has an _sock attribute and if so, extract it.
Unfortunately, the submitter of the bug didn't confirm that this patch
works, so I'll just have to believe it (can't test it myself since I
don't have OpenSSL on Windows set up, and that's a nontrivial thing I
believe).
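
The check described above amounts to something like the following
sketch. RawSocket, WrappedSocket and real_socket are hypothetical
stand-ins so the example runs without a network or OpenSSL; they are
not the httplib code.

```python
class RawSocket:
    """Hypothetical stand-in for the real socket object."""


class WrappedSocket:
    """Hypothetical stand-in for a wrapper that keeps the real socket in _sock."""

    def __init__(self, raw):
        self._sock = raw


def real_socket(sock):
    # If sock is a wrapper, extract the real socket; otherwise use sock itself.
    if hasattr(sock, "_sock"):
        return sock._sock
    return sock


raw = RawSocket()
from_wrapper = real_socket(WrappedSocket(raw))
from_raw = real_socket(raw)
```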
- Use new Error class (subclass of RuntimeError so is backward
compatible) which is raised when RuntimeError used to be raised.
- Report original attribute name in error messages instead of name
mangled with namespace URL.
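
Why subclassing RuntimeError keeps old callers working can be sketched
as below. The check_document() function is hypothetical and exists only
to have something that raises the new exception.

```python
class Error(RuntimeError):
    """Module-specific error; still a RuntimeError for older callers."""


def check_document(text):
    # Hypothetical check, only here to trigger the error.
    if not text.lstrip().startswith("<"):
        raise Error("document does not start with a tag")
    return True


# Code written against the old behaviour catches RuntimeError and still works.
try:
    check_document("not markup")
except RuntimeError as exc:
    caught = exc

print(type(caught).__name__)
```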
also test join method of 8-bit strings.
Also changed the test() function to (1) compare the types of the
expected and actual result, and (2) in verbose mode, print the repr()
of the output.
testAAA(),
testAAB(): Added checks that the results are right.
testTooManyDocumentElements(): Added code to actually test this.
testCloneElementDeep()
testCloneElementShallow(): Filled these in with test code.
_testCloneElementCopiesAttributes(),
_setupCloneElement(): Helper functions used with the other
testCloneElement*() functions.
testCloneElementShallowCopiesAttributes(): No longer a separate test;
_setupCloneElement() uses _testCloneElementCopiesAttributes() to
test that this is always done.
testNormalize(): Added to check Node.normalize().
behavior.
Added support for the Attr.ownerElement attribute.
Everywhere: Define constant object attributes in the classes rather than
on the instances during object construction. This reduces the amount of
work needed for object construction and destruction; these need to be
lightweight operations on a DOM.
Node._get_firstChild(),
Node._get_lastChild(): Return None if there are no children (required for
compliance with DOM level 1).
Node.insertBefore(): If refChild is None, append the new node instead of
failing (required for compliance). Also, update the sibling
relationships. Return the inserted node (required for compliance).
Node.appendChild(): Update the parent of the appended node.
Node.replaceChild(): Actually replace the old child! Update the parent
and sibling relationships of both the old and new children. Return
the replaced child (required for compliance).
Node.normalize(): Implemented the normalize() method. Required for
compliance, but missing from the release. Useful for joining
adjacent Text nodes into a single node for easier processing.
Node.cloneNode(): Actually make this work. Don't let the new node share
the instance __dict__ with the original. Do proper recursion if
doing a "deep" clone. Move the attribute cloning out of the base
class, since only Element is supposed to have attributes.
Node.unlink(): Simplify handling of child nodes for efficiency, and
remove the attribute handling since only Element nodes support
attributes.
Attr.cloneNode(): Extend this to clear the ownerElement attribute in
the clone.
AttributeList.items(),
AttributeList.itemsNS(): Slight performance improvement (avoid lambda).
Element.cloneNode(): Extend Node.cloneNode() with support for the
attributes. Clone the Attr objects after creating the underlying
clone.
Element.unlink(): Clean out the attributes here instead of in the base
class, since this is the only class that will have them.
Element.toxml(): Adjust to create only one AttributeList instance; minor
efficiency improvement.
_nssplit(): No need to re-import string.
Document.__init__(): No longer needed once constant attributes are
initialized in the class itself.
Document.createElementNS(),
Document.createAttributeNS(): Use the defined constructors rather than
directly access the classes.
_get_StringIO(): New function. Create an output StringIO using the most
efficient available flavor.
parse(),
parseString(): Import pulldom here instead of in the public namespace of
the module.
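
Several of the compliance fixes listed above can be exercised with a
short sketch, assuming Python 3 and its bundled minidom. The element
names are made up for the example.

```python
from xml.dom import minidom

doc = minidom.parseString("<root/>")
root = doc.documentElement

# An element with no children reports None for firstChild and lastChild.
no_children = (root.firstChild, root.lastChild)

# normalize() joins adjacent Text nodes into a single node.
root.appendChild(doc.createTextNode("Hello, "))
root.appendChild(doc.createTextNode("world"))
count_before = len(root.childNodes)
root.normalize()
count_after = len(root.childNodes)

# insertBefore() with refChild None appends and returns the inserted node.
item = doc.createElement("item")
inserted = root.insertBefore(item, None)

# replaceChild() returns the replaced child and fixes up parent links.
new = doc.createElement("new")
replaced = root.replaceChild(new, item)

# A deep clone is a separate tree with the same content.
clone = root.cloneNode(True)
print(root.toxml())
```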
file uploads.
In response to SF bugs 110674 and 119806, and discussions on
python-dev, we are removing the self.lines attribute from the
FieldStorage class. Specifically, the methods touched were __init__(),
read_lines_to_eof(), and skip_lines().
No one can remember why self.lines was added. Technically, it's part
of the public interface for the class, but it was never documented.
It's possible clever or nosy code will break because of this, but it
was decided to remove it and see who complains.
This resolution also closes the second half of the cgi.py entry in PEP
42. The first half of that PEP concerns specifically binary file
uploads, where there may be no end-of-line marker for a very long
time. This patch does not address that issue.
embedded code objects (e.g. functions) rather than the generated code
object. This change means that the compiler generates code for
everything at the end, rather than generating code for each function
as it finds it. Implementation note: _convert_LOAD_CONST in
pyassem.py must be changed to call getCode().
Other changes follow. Several changes create extra edges between
basic blocks to reflect control flow for loops and exceptions. These
missing edges had gone unnoticed because they do not affect the
current compilation process.
pyassem.py:
Add _enable_debug() and _disable_debug() methods that print
instructions and blocks to stdout as they are generated.
Add edges between blocks for instructions like SETUP_LOOP,
FOR_LOOP, etc.
Add pruneNext to get rid of bogus edges remaining after
unconditional transfer ops (e.g. JUMP_FORWARD)
Change repr of Block to omit block length.
pycodegen.py:
Make sure a new block is started after FOR_LOOP, etc.
Change assert implementation to use RAISE_VARARGS 1 when there is
no user-specified failure output.
misc.py:
Implement __contains__ and copy for Set.
When a method is called with no regular arguments and * args, defer
the "first arg is subclass" check until after the * args have been
expanded.
N.B. The CALL_FUNCTION implementation is getting really hairy; should
review it to see if it can be simplified.
-- fixed negative lookbehind to work correctly at the beginning
of the target string (bug #117242)
-- improved syntax check; you can no longer refer to a group
inside itself (bug #110866)
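
Both fixes can be checked with a short sketch, assuming Python 3, where
the engine is reached through the re module.

```python
import re

# Negative lookbehind at the very beginning of the target string:
# nothing precedes "b", so "(?<!a)" succeeds and the match is at 0.
at_start = re.search(r"(?<!a)b", "bcd")

# When "b" is preceded by "a", the lookbehind rejects it.
after_a = re.search(r"(?<!a)b", "ab")

# A group may not refer to itself while it is still open.
try:
    re.compile(r"(a\1)")
except re.error as exc:
    self_reference_error = exc
else:
    self_reference_error = None

print(at_start.start(), after_a, self_reference_error)
```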
Reformatting -- long lines, "[ ]" -> "[]", a few indentation nits.
Replace calls to Node function (which constructed ast nodes) with
calls to actual constructors imported from ast module.
Optimize com_node (most frequently used method) for the common case --
the appropriate method is found in _dispatch.
Fix com_augassign to use class objects rather than node names
(rendered invalid by recent changes to ast).
Remove expensive tests for sequence-ness in com_stmt and
com_append_stmt. These tests should never fail; if they do, something
is really broken and an exception will be raised elsewhere.
Fix com_stmt and com_append_stmt to use isinstance rather than
testing the type slot of the ast node (this slot disappeared with recent
changes to ast).
Betlehem, verified by Peter Funk. Fixes preservation of language
search order lost due to use of dictionary keys instead of a list.
Closes SF bug #116964.