When the UTF-8 conversion to Unicode fails, return an 8-bit string
instead.  This seems more robust than returning a Unicode string with
some unconverted characters in it.

This still doesn't support getting truly binary data out of Tcl, since
we look for the trailing null byte; but the old (pre-Unicode) code did
this too, so apparently there's no need.  (Plus, I really don't feel
like finding out how Tcl deals with this in each version.)
Guido van Rossum 2000-05-04 15:55:17 +00:00
parent 03e29f1ae9
commit 69529ad0cc
1 changed files with 5 additions and 1 deletions


@@ -654,7 +654,11 @@ Tkapp_Call(self, args)
         else {
             /* Convert UTF-8 to Unicode string */
             p = strchr(p, '\0');
-            res = PyUnicode_DecodeUTF8(s, (int)(p-s), "ignore");
+            res = PyUnicode_DecodeUTF8(s, (int)(p-s), "strict");
+            if (res == NULL) {
+                PyErr_Clear();
+                res = PyString_FromStringAndSize(s, (int)(p-s));
+            }
         }
     }
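
The effect of the change can be modelled in a few lines of modern Python. This is only a sketch of the behaviour, not the code in the commit: `decode_tcl_result` is a hypothetical name, `bytes` stands in for the old 8-bit string type, and splitting on the first null byte approximates what `strchr(p, '\0')` does with a C string.

```python
def decode_tcl_result(raw):
    """Hypothetical model of the patched branch: strict decode, bytes fallback."""
    # Like the C code, stop at the first null byte, so binary data
    # containing embedded nulls is still truncated.
    data = raw.split(b"\0", 1)[0]
    try:
        # "strict" raises on malformed input instead of silently dropping bytes.
        return data.decode("utf-8", "strict")
    except UnicodeDecodeError:
        # Counterpart of PyErr_Clear() followed by PyString_FromStringAndSize():
        # discard the error and hand back the raw bytes unchanged.
        return data


valid = decode_tcl_result(b"caf\xc3\xa9")    # well-formed UTF-8, returned as text
invalid = decode_tcl_result(b"caf\xe9")      # malformed UTF-8, returned as bytes
lossy = b"caf\xe9".decode("utf-8", "ignore")  # old behaviour: the bad byte is dropped
```

With the old `"ignore"` handler the malformed input came back as text with the offending byte missing. With the fallback the caller gets every byte, at the cost of having to handle two return types.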