give minidom.py behaviour that complies with the DOM Level 1 REC,
which says that when a node newChild is added to the tree, "if the
newChild is already in the tree, it is first removed."
pulldom.py is patched to use the public minidom interface instead
of setting .parentNode itself. Possibly this reduces pulldom's
efficiency; someone else will have to pronounce on that.
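
A minimal sketch of the DOM Level 1 rule quoted above, using the public minidom interface of a current Python 3 (an assumption; the patch itself targeted the minidom of its day). The element names are purely illustrative.

```python
from xml.dom.minidom import Document

doc = Document()
root = doc.createElement("root")
doc.appendChild(root)

a = doc.createElement("a")
b = doc.createElement("b")
root.appendChild(a)
root.appendChild(b)

child = doc.createElement("child")
a.appendChild(child)

# Appending a node that is already in the tree moves it:
# it is first removed from its old parent, then added to the new one.
b.appendChild(child)

print(child.parentNode.tagName)
print(len(a.childNodes), len(b.childNodes))
```

Going through appendChild() keeps parentNode and childNodes consistent, which is presumably why pulldom was switched away from setting .parentNode by hand.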
so we can't use it.
While I'm at it, got rid of string module use. (Found several new
hard special cases for a hypothetical conversion tool: from string
import join, find, rfind; and a local assignment "find=string.find".)
required to work around restrictions on the arguments of
u.translate():
1) don't pass the deletions argument if it's empty;
2) convert table to Unicode if s is Unicode.
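
A rough Python 3 analogue of the two workarounds listed above, offered only as a sketch: the original code dealt with 8-bit and Unicode strings, while here the split is bytes versus str. The helper name translate() and its signature are hypothetical.

```python
def translate(s, table, deletions=b""):
    """Translate s through a 256-entry byte table (hypothetical helper)."""
    if isinstance(s, bytes):
        # 1) only pass the deletions argument when it is non-empty
        if deletions:
            return s.translate(table, deletions)
        return s.translate(table)
    # 2) s is text: convert the byte table to a text table first.
    # Characters outside the table's range are left unchanged.
    if deletions:
        raise TypeError("deletions are not supported for text input")
    return s.translate(table.decode("latin-1"))


table = bytes.maketrans(b"o", b"0")
print(translate(b"foo", table))
print(translate(b"foo", table, b"f"))
print(translate("foo\u20ac", table))
```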
This fixes SF bug #124060.
bugs #126161 and #123634).
The solution doesn't use the unicode-escape encoding; that has other
problems (it seems not 100% reversible). Rather, it transforms the
input Unicode object slightly before encoding it using
raw-unicode-escape, so that the decoding will reconstruct the original
string: backslash and newline characters are translated into their
\uXXXX counterparts.
This is backwards incompatible for strings containing backslashes, but
for some of those strings, the pickling was already broken.
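
The transformation described above can be sketched in Python 3 as follows. The function names are hypothetical and this is not the committed code; it only illustrates the idea of rewriting backslash and newline as \uXXXX before applying raw-unicode-escape.

```python
def encode_for_pickle(s):
    # Rewrite backslash first, then newline, as \uXXXX escapes so that
    # the raw-unicode-escape decoder can reconstruct them unambiguously.
    s = s.replace("\\", "\\u005c")
    s = s.replace("\n", "\\u000a")
    return s.encode("raw-unicode-escape")


def decode_from_pickle(data):
    return data.decode("raw-unicode-escape")


original = "a\\b\nc\u20ac"
encoded = encode_for_pickle(original)
assert b"\n" not in encoded          # safe to terminate the record with a newline
assert decode_from_pickle(encoded) == original

# Without the transformation, a literal backslash followed by "u0041"
# does not survive the round trip: it decodes to a different string.
tricky = "\\u0041"
naive = tricky.encode("raw-unicode-escape").decode("raw-unicode-escape")
assert naive != tricky
assert decode_from_pickle(encode_for_pickle(tricky)) == tricky
```

The last case is an example of a string with backslashes for which plain raw-unicode-escape was already not reversible.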
Note that SF bug #123634 complains specifically that cPickle fails to
unpickle the pickle for u'' (the empty Unicode string) correctly.
This was an off-by-one error in load_unicode().
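
A quick regression check for the empty-string case, written against the pickle module of a current Python 3 (an assumption; the bug report concerned cPickle). Protocol 0 is the text protocol that uses the escaped Unicode form.

```python
import pickle

# Round-trip the empty Unicode string through the text protocol.
data = pickle.dumps("", protocol=0)
restored = pickle.loads(data)

assert restored == ""
assert isinstance(restored, str)
```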
XXX Ugliness: in order to do the modified raw-unicode-escape, I've
cut-and-pasted a copy of PyUnicode_EncodeRawUnicodeEscape() into this
file that also encodes '\\' and '\n'. It might be nice to migrate
this into the Unicode implementation and give this encoding a new name
('half-raw-unicode-escape'? 'pickle-unicode-escape'?); that would help
pickle.py too. But right now I can't be bothered with the necessary
infrastructural changes.
numbers" instead; we have not described "reals" anywhere else in the
documentation, and this is not the place to change the story!
Reported by Keith Briggs <keith.briggs@bt.com>.