to avoid confusing situations like:
>>> int("")
ValueError: invalid literal for int():
>>> int("2\n\n2")
ValueError: invalid literal for int(): 2
2
Also report the base used, to avoid:
ValueError: invalid literal for int(): 2
They now report:
>>> int("")
ValueError: invalid literal for int() with base 10: ''
>>> int("2\n\n2")
ValueError: invalid literal for int() with base 10: '2\n\n2'
>>> int("2", 2)
ValueError: invalid literal for int() with base 2: '2'
(Reporting the base could be avoided when base is 10, which is the default,
but hrm.) Another effect of these changes is that the errormessage can be
longer; before, it was cut off at about 250 characters. Now, it can be up to
four times as long, as the unrepr'ed string is cut off at 200 characters,
instead.
No tests were added or changed, since testing for exact errormsgs is (pardon
the pun) somewhat errorprone, and I consider not testing the exact text
preferable. The actually changed code is tested frequent enough in the
test_builtin test as it is (120 runs for each of ints and longs.)
interpolate PY_FORMAT_SIZE_T for refcount display
instead of casting refcounts to long.
I understand that gcc on some boxes delivers
nuisance warnings about this, but if any new ones
appear because of this they'll get fixed by magic
when the others get fixed.
mismatches. At least I hope this fixes them all.
This reverts part of my change from yesterday that converted everything
in Parser/*.c to use PyObject_* API. The encoding doesn't really need
to use PyMem_*, however, it uses new_string() which must return PyMem_*
for handling the result of PyOS_Readline() which returns PyMem_* memory.
If there were 2 versions of new_string() one that returned PyMem_*
for tokens and one that return PyObject_* for encodings that could
also fix this problem. I'm not sure which version would be clearer.
This seems to fix both Guido's and Phillip's problems, so it's good enough
for now. After this change, it would be good to review Parser/*.c
for consistent use of the 2 memory APIs.
PyTypeObject structures, I had to make prototypes for the functions, and
move the structure definition ahead of the functions. I'd dearly like a better
way to do this - to change this would make for a massive set of changes to
the codebase.
There's still some warnings - this is purely to get rid of errors first.
malloc/realloc type functions, as well as renaming one variable called 'new'
in tokensizer.c. Still lots more to be done, going to be checking in one
chunk at a time or the patch will be massively huge. Still compiles ok with
gcc.
"x86 OpenBSD trunk" buildbot due to changing Python so that
Python-exposed addresses are always non-negative.
test_int_pointer_arg(): This line failed now whenever the
box happened to assign an address to `ci` "with the sign
bit set":
self.failUnlessEqual(addressof(ci), func(byref(ci)))
The problem is that the ctypes addressof() inherited "all
addresses are non-negative now" from changes to
PyLong_FromVoidPtr(), but byref() did not inherit that
change and can still return a negative int.
I don't know whether, or what, the ctypes implementation wants
to do about that (possibly nothing), but in the meantime
the test fails frequently.
So, introduced a Python positive_address() function in
the test module, that takes a purported machine address and,
if negative, converts it to a non-negative value "with the
same bits". This should leave the test passing under all
versions of Python.
Belated thanks to Armin Rigo for teaching me the sick trick ;-)
for determining the # of bits in a machine pointer via abuse
of the struct module.
tests. Alas, because only the "x86 OpenBSD trunk" buildbot fails
these tests, and test_descr stops after the first failure, there's
no sane way for me to fix these short of fixing one and then
waiting for the buildbot to reveal the next one.
failing on one of the 32-bit buildbot boxes because of it,
due to tempting but always-wrong Python code. Users
probably have code like this too (I know I did ...).
to that id() can now return a Python long on a 32-bit box that allocates
addresses "with the sign bit set".
test_set.py test_subclass_with_custom_hash(): it's never been portably
legal for a __hash__() method to return id(self), but on 32-bit boxes
that never caused a problem before it became possible for id() to
return a Python long. Changed __hash__ here to return a Python int
regardless of platform.
test_descr.py specials():
vereq(hash(c1), id(c1))
has never been a correct test -- just removed it (hash() is always
a Python int; id() may be a Python long).