number of tests, all because of the codecs/_multibytecodecs issue described
here (it's not a Py3K issue, just something Py3K discovers):
http://mail.python.org/pipermail/python-dev/2006-April/064051.html
Hye-Shik Chang promised to look for a fix, so no need to fix it here. The
tests that are expected to break are:
test_codecencodings_cn
test_codecencodings_hk
test_codecencodings_jp
test_codecencodings_kr
test_codecencodings_tw
test_codecs
test_multibytecodec
This merge fixes an actual test failure (test_weakref) in this branch,
though, so I believe merging is the right thing to do anyway.
(branch-creation time) up to 43067. 43068 and 43069 contain a little
swapping action between re.py and sre.py, and this mightily confuses svn
merge, so later changes are going in separately.
This merge should break no additional tests.
The last-merged revision is going in a 'last_merge' property on '.' (the
branch directory.) Arbitrarily chosen, really; if there's a BCP for this, I
couldn't find it, but we can easily change it afterwards ;)
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
def width(u):
w = 0
for c in unicodedata.normalize('NFC', u):
cwidth = unicodedata.east_asian_width(c)
if cwidth in ('W', 'F'): w += 2
else: w += 1
return w
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
(with one small bugfix in bgen/bgen/scantools.py)
This replaces string module functions with string methods
for the stuff in the Tools directory. Several uses of
string.letters etc. are still remaining.
- use unidb compression for the unicodectype module. smaller,
faster, and slightly more portable...
- also mention the unicode directory in Tools/README
- fixed attributions
- moved decomposition data to a separate table, in preparation
for step 3 (which won't happen before 2.0 final, promise!)
- use relative paths in the generator script
I have a lot more stuff in the works for 2.1, but let's leave
that for another day...
a bit, sped it a lot primarily by removing the unused assumption that None was
a legit bin entry (the function doesn't really need to assume that there's
anything special about 0), added an optional "trace" argument, and in __debug__
mode added exhaustive verification that the decomposition is both correct and
doesn't overstep any array bounds (which wasn't obvious to me from staring at the
generated C code -- now I feel safe!). Did not commit a new unicodedata_db.h, as
the one produced by this version is identical to the one already checked in.