Close #19378: address flaws in the new dis module APIs

- confusing line_offset parameter -> first_line parameter
- systematically test and fix new file parameter
- remove redundant Bytecode.show_info() API
- rename Bytecode.display_code() to Bytecode.dis() and have it
  return the multi-line string rather than printing it directly
- eliminated some not-so-helpful helpers from the bytecode_helper
  test support module

Also fixed a longstanding defect (worked around in the test suite)
where lines emitted by the dis module could include trailing white
space. That no longer happens, allowing the formatting tests to be
simplified to use plain string comparisons.
This commit is contained in:
Nick Coghlan 2013-11-06 22:08:36 +10:00
parent e0881f464c
commit 90b8e7d2bc
5 changed files with 167 additions and 122 deletions

View File

@ -44,36 +44,39 @@ The bytecode analysis API allows pieces of Python code to be wrapped in a
:class:`Bytecode` object that provides easy access to details of the :class:`Bytecode` object that provides easy access to details of the
compiled code. compiled code.
.. class:: Bytecode .. class:: Bytecode(x, *, first_line=None)
The bytecode operations of a piece of code Analyse the bytecode corresponding to a function, method, string of
source code, or a code object (as returned by :func:`compile`).
This is a convenient wrapper around many of the functions listed below. This is a convenience wrapper around many of the functions listed below,
Instantiate it with a function, method, string of code, or a code object most notably :func:`get_instructions`, as iterating over a
(as returned by :func:`compile`). :class:`ByteCode` instance yields the bytecode operations as
:class:`Instruction` instances.
Iterating over this yields the bytecode operations as :class:`Instruction` If *first_line* is not None, it indicates the line number that should
instances. be reported for the first source line in the disassembled code.
Otherwise, the source line information (if any) is taken directly from
the disassembled code object.
.. data:: codeobj .. data:: codeobj
The compiled code object. The compiled code object.
.. method:: display_code(*, file=None) .. data:: first_line
Print a formatted view of the bytecode operations, like :func:`dis`. The first source line of the code object (if available)
.. method:: dis()
Return a formatted view of the bytecode operations (the same as
printed by :func:`dis`, but returned as a multi-line string).
.. method:: info() .. method:: info()
Return a formatted multi-line string with detailed information about the Return a formatted multi-line string with detailed information about the
code object, like :func:`code_info`. code object, like :func:`code_info`.
.. method:: show_info(*, file=None)
Print the information about the code object as returned by :meth:`info`.
.. versionadded:: 3.4
Example:: Example::
>>> bytecode = dis.Bytecode(myfunc) >>> bytecode = dis.Bytecode(myfunc)
@ -176,7 +179,7 @@ object isn't useful:
Added ``file`` parameter Added ``file`` parameter
.. function:: get_instructions(x, *, line_offset=0) .. function:: get_instructions(x, *, first_line=None)
Return an iterator over the instructions in the supplied function, method, Return an iterator over the instructions in the supplied function, method,
source code string or code object. source code string or code object.
@ -184,8 +187,10 @@ object isn't useful:
The iterator generates a series of :class:`Instruction` named tuples The iterator generates a series of :class:`Instruction` named tuples
giving the details of each operation in the supplied code. giving the details of each operation in the supplied code.
The given *line_offset* is added to the ``starts_line`` attribute of any If *first_line* is not None, it indicates the line number that should
instructions that start a new line. be reported for the first source line in the disassembled code.
Otherwise, the source line information (if any) is taken directly from
the disassembled code object.
.. versionadded:: 3.4 .. versionadded:: 3.4

View File

@ -3,6 +3,7 @@
import sys import sys
import types import types
import collections import collections
import io
from opcode import * from opcode import *
from opcode import __all__ as _opcodes_all from opcode import __all__ as _opcodes_all
@ -34,7 +35,7 @@ def dis(x=None, *, file=None):
""" """
if x is None: if x is None:
distb() distb(file=file)
return return
if hasattr(x, '__func__'): # Method if hasattr(x, '__func__'): # Method
x = x.__func__ x = x.__func__
@ -46,7 +47,7 @@ def dis(x=None, *, file=None):
if isinstance(x1, _have_code): if isinstance(x1, _have_code):
print("Disassembly of %s:" % name, file=file) print("Disassembly of %s:" % name, file=file)
try: try:
dis(x1) dis(x1, file=file)
except TypeError as msg: except TypeError as msg:
print("Sorry:", msg, file=file) print("Sorry:", msg, file=file)
print(file=file) print(file=file)
@ -203,21 +204,27 @@ class Instruction(_Instruction):
# Column: Opcode argument details # Column: Opcode argument details
if self.argrepr: if self.argrepr:
fields.append('(' + self.argrepr + ')') fields.append('(' + self.argrepr + ')')
return ' '.join(fields) return ' '.join(fields).rstrip()
def get_instructions(x, *, line_offset=0): def get_instructions(x, *, first_line=None):
"""Iterator for the opcodes in methods, functions or code """Iterator for the opcodes in methods, functions or code
Generates a series of Instruction named tuples giving the details of Generates a series of Instruction named tuples giving the details of
each operations in the supplied code. each operations in the supplied code.
The given line offset is added to the 'starts_line' attribute of any If *first_line* is not None, it indicates the line number that should
instructions that start a new line. be reported for the first source line in the disassembled code.
Otherwise, the source line information (if any) is taken directly from
the disassembled code object.
""" """
co = _get_code_object(x) co = _get_code_object(x)
cell_names = co.co_cellvars + co.co_freevars cell_names = co.co_cellvars + co.co_freevars
linestarts = dict(findlinestarts(co)) linestarts = dict(findlinestarts(co))
if first_line is not None:
line_offset = first_line - co.co_firstlineno
else:
line_offset = 0
return _get_instructions_bytes(co.co_code, co.co_varnames, co.co_names, return _get_instructions_bytes(co.co_code, co.co_varnames, co.co_names,
co.co_consts, cell_names, linestarts, co.co_consts, cell_names, linestarts,
line_offset) line_offset)
@ -320,13 +327,14 @@ def disassemble(co, lasti=-1, *, file=None):
def _disassemble_bytes(code, lasti=-1, varnames=None, names=None, def _disassemble_bytes(code, lasti=-1, varnames=None, names=None,
constants=None, cells=None, linestarts=None, constants=None, cells=None, linestarts=None,
*, file=None): *, file=None, line_offset=0):
# Omit the line number column entirely if we have no line number info # Omit the line number column entirely if we have no line number info
show_lineno = linestarts is not None show_lineno = linestarts is not None
# TODO?: Adjust width upwards if max(linestarts.values()) >= 1000? # TODO?: Adjust width upwards if max(linestarts.values()) >= 1000?
lineno_width = 3 if show_lineno else 0 lineno_width = 3 if show_lineno else 0
for instr in _get_instructions_bytes(code, varnames, names, for instr in _get_instructions_bytes(code, varnames, names,
constants, cells, linestarts): constants, cells, linestarts,
line_offset=line_offset):
new_source_line = (show_lineno and new_source_line = (show_lineno and
instr.starts_line is not None and instr.starts_line is not None and
instr.offset > 0) instr.offset > 0)
@ -398,40 +406,44 @@ class Bytecode:
Iterating over this yields the bytecode operations as Instruction instances. Iterating over this yields the bytecode operations as Instruction instances.
""" """
def __init__(self, x): def __init__(self, x, *, first_line=None):
self.codeobj = _get_code_object(x) self.codeobj = co = _get_code_object(x)
self.cell_names = self.codeobj.co_cellvars + self.codeobj.co_freevars if first_line is None:
self.linestarts = dict(findlinestarts(self.codeobj)) self.first_line = co.co_firstlineno
self.line_offset = 0 self._line_offset = 0
self.original_object = x else:
self.first_line = first_line
self._line_offset = first_line - co.co_firstlineno
self._cell_names = co.co_cellvars + co.co_freevars
self._linestarts = dict(findlinestarts(co))
self._original_object = x
def __iter__(self): def __iter__(self):
co = self.codeobj co = self.codeobj
return _get_instructions_bytes(co.co_code, co.co_varnames, co.co_names, return _get_instructions_bytes(co.co_code, co.co_varnames, co.co_names,
co.co_consts, self.cell_names, co.co_consts, self._cell_names,
self.linestarts, self.line_offset) self._linestarts,
line_offset=self._line_offset)
def __repr__(self): def __repr__(self):
return "{}({!r})".format(self.__class__.__name__, self.original_object) return "{}({!r})".format(self.__class__.__name__,
self._original_object)
def info(self): def info(self):
"""Return formatted information about the code object.""" """Return formatted information about the code object."""
return _format_code_info(self.codeobj) return _format_code_info(self.codeobj)
def show_info(self, *, file=None): def dis(self):
"""Print the information about the code object as returned by info().""" """Return a formatted view of the bytecode operations."""
print(self.info(), file=file)
def display_code(self, *, file=None):
"""Print a formatted view of the bytecode operations.
"""
co = self.codeobj co = self.codeobj
return _disassemble_bytes(co.co_code, varnames=co.co_varnames, with io.StringIO() as output:
names=co.co_names, constants=co.co_consts, _disassemble_bytes(co.co_code, varnames=co.co_varnames,
cells=self.cell_names, names=co.co_names, constants=co.co_consts,
linestarts=self.linestarts, cells=self._cell_names,
file=file linestarts=self._linestarts,
) line_offset=self._line_offset,
file=output)
return output.getvalue()
def _test(): def _test():

View File

@ -14,37 +14,6 @@ class BytecodeTestCase(unittest.TestCase):
dis.dis(co, file=s) dis.dis(co, file=s)
return s.getvalue() return s.getvalue()
def assertInstructionMatches(self, instr, expected, *, line_offset=0):
# Deliberately test opname first, since that gives a more
# meaningful error message than testing opcode
self.assertEqual(instr.opname, expected.opname)
self.assertEqual(instr.opcode, expected.opcode)
self.assertEqual(instr.arg, expected.arg)
self.assertEqual(instr.argval, expected.argval)
self.assertEqual(instr.argrepr, expected.argrepr)
self.assertEqual(instr.offset, expected.offset)
if expected.starts_line is None:
self.assertIsNone(instr.starts_line)
else:
self.assertEqual(instr.starts_line,
expected.starts_line + line_offset)
self.assertEqual(instr.is_jump_target, expected.is_jump_target)
def assertBytecodeExactlyMatches(self, x, expected, *, line_offset=0):
"""Throws AssertionError if any discrepancy is found in bytecode
*x* is the object to be introspected
*expected* is a list of dis.Instruction objects
Set *line_offset* as appropriate to adjust for the location of the
object to be disassembled within the test file. If the expected list
assumes the first line is line 1, then an appropriate offset would be
``1 - f.__code__.co_firstlineno``.
"""
actual = dis.get_instructions(x, line_offset=line_offset)
self.assertEqual(list(actual), expected)
def assertInBytecode(self, x, opname, argval=_UNSPECIFIED): def assertInBytecode(self, x, opname, argval=_UNSPECIFIED):
"""Returns instr if op is found, otherwise throws AssertionError""" """Returns instr if op is found, otherwise throws AssertionError"""
for instr in dis.get_instructions(x): for instr in dis.get_instructions(x):

View File

@ -8,6 +8,7 @@ import sys
import dis import dis
import io import io
import types import types
import contextlib
class _C: class _C:
def __init__(self, x): def __init__(self, x):
@ -176,30 +177,20 @@ dis_compound_stmt_str = """\
class DisTests(unittest.TestCase): class DisTests(unittest.TestCase):
def get_disassembly(self, func, lasti=-1, wrapper=True): def get_disassembly(self, func, lasti=-1, wrapper=True):
s = io.StringIO() # We want to test the default printing behaviour, not the file arg
save_stdout = sys.stdout output = io.StringIO()
sys.stdout = s with contextlib.redirect_stdout(output):
try:
if wrapper: if wrapper:
dis.dis(func) dis.dis(func)
else: else:
dis.disassemble(func, lasti) dis.disassemble(func, lasti)
finally: return output.getvalue()
sys.stdout = save_stdout
# Trim trailing blanks (if any).
return [line.rstrip() for line in s.getvalue().splitlines()]
def get_disassemble_as_string(self, func, lasti=-1): def get_disassemble_as_string(self, func, lasti=-1):
return '\n'.join(self.get_disassembly(func, lasti, False)) return self.get_disassembly(func, lasti, False)
def do_disassembly_test(self, func, expected): def do_disassembly_test(self, func, expected):
lines = self.get_disassembly(func) self.assertEqual(self.get_disassembly(func), expected)
expected = expected.splitlines()
if expected != lines:
self.fail(
"events did not match expectation:\n" +
"\n".join(difflib.ndiff(expected,
lines)))
def test_opmap(self): def test_opmap(self):
self.assertEqual(dis.opmap["NOP"], 9) self.assertEqual(dis.opmap["NOP"], 9)
@ -290,6 +281,20 @@ class DisTests(unittest.TestCase):
def test_dis_object(self): def test_dis_object(self):
self.assertRaises(TypeError, dis.dis, object()) self.assertRaises(TypeError, dis.dis, object())
class DisWithFileTests(DisTests):
# Run the tests again, using the file arg instead of print
def get_disassembly(self, func, lasti=-1, wrapper=True):
# We want to test the default printing behaviour, not the file arg
output = io.StringIO()
if wrapper:
dis.dis(func, file=output)
else:
dis.disassemble(func, lasti, file=output)
return output.getvalue()
code_info_code_info = """\ code_info_code_info = """\
Name: code_info Name: code_info
Filename: (.*) Filename: (.*)
@ -482,26 +487,29 @@ def jumpy():
print("OK, now we're done") print("OK, now we're done")
# End fodder for opinfo generation tests # End fodder for opinfo generation tests
expected_outer_offset = 1 - outer.__code__.co_firstlineno expected_outer_line = 1
expected_jumpy_offset = 1 - jumpy.__code__.co_firstlineno _line_offset = outer.__code__.co_firstlineno - 1
code_object_f = outer.__code__.co_consts[3] code_object_f = outer.__code__.co_consts[3]
expected_f_line = code_object_f.co_firstlineno - _line_offset
code_object_inner = code_object_f.co_consts[3] code_object_inner = code_object_f.co_consts[3]
expected_inner_line = code_object_inner.co_firstlineno - _line_offset
expected_jumpy_line = 1
# The following lines are useful to regenerate the expected results after # The following lines are useful to regenerate the expected results after
# either the fodder is modified or the bytecode generation changes # either the fodder is modified or the bytecode generation changes
# After regeneration, update the references to code_object_f and # After regeneration, update the references to code_object_f and
# code_object_inner before rerunning the tests # code_object_inner before rerunning the tests
#_instructions = dis.get_instructions(outer, line_offset=expected_outer_offset) #_instructions = dis.get_instructions(outer, first_line=expected_outer_line)
#print('expected_opinfo_outer = [\n ', #print('expected_opinfo_outer = [\n ',
#',\n '.join(map(str, _instructions)), ',\n]', sep='') #',\n '.join(map(str, _instructions)), ',\n]', sep='')
#_instructions = dis.get_instructions(outer(), line_offset=expected_outer_offset) #_instructions = dis.get_instructions(outer(), first_line=expected_outer_line)
#print('expected_opinfo_f = [\n ', #print('expected_opinfo_f = [\n ',
#',\n '.join(map(str, _instructions)), ',\n]', sep='') #',\n '.join(map(str, _instructions)), ',\n]', sep='')
#_instructions = dis.get_instructions(outer()(), line_offset=expected_outer_offset) #_instructions = dis.get_instructions(outer()(), first_line=expected_outer_line)
#print('expected_opinfo_inner = [\n ', #print('expected_opinfo_inner = [\n ',
#',\n '.join(map(str, _instructions)), ',\n]', sep='') #',\n '.join(map(str, _instructions)), ',\n]', sep='')
#_instructions = dis.get_instructions(jumpy, line_offset=expected_jumpy_offset) #_instructions = dis.get_instructions(jumpy, first_line=expected_jumpy_line)
#print('expected_opinfo_jumpy = [\n ', #print('expected_opinfo_jumpy = [\n ',
#',\n '.join(map(str, _instructions)), ',\n]', sep='') #',\n '.join(map(str, _instructions)), ',\n]', sep='')
@ -671,42 +679,75 @@ expected_opinfo_jumpy = [
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=243, starts_line=None, is_jump_target=False), Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=243, starts_line=None, is_jump_target=False),
] ]
# One last piece of inspect fodder to check the default line number handling
def simple(): pass
expected_opinfo_simple = [
Instruction(opname='LOAD_CONST', opcode=100, arg=0, argval=None, argrepr='None', offset=0, starts_line=simple.__code__.co_firstlineno, is_jump_target=False),
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=3, starts_line=None, is_jump_target=False)
]
class InstructionTests(BytecodeTestCase): class InstructionTests(BytecodeTestCase):
def test_default_first_line(self):
actual = dis.get_instructions(simple)
self.assertEqual(list(actual), expected_opinfo_simple)
def test_first_line_set_to_None(self):
actual = dis.get_instructions(simple, first_line=None)
self.assertEqual(list(actual), expected_opinfo_simple)
def test_outer(self): def test_outer(self):
self.assertBytecodeExactlyMatches(outer, expected_opinfo_outer, actual = dis.get_instructions(outer, first_line=expected_outer_line)
line_offset=expected_outer_offset) self.assertEqual(list(actual), expected_opinfo_outer)
def test_nested(self): def test_nested(self):
with captured_stdout(): with captured_stdout():
f = outer() f = outer()
self.assertBytecodeExactlyMatches(f, expected_opinfo_f, actual = dis.get_instructions(f, first_line=expected_f_line)
line_offset=expected_outer_offset) self.assertEqual(list(actual), expected_opinfo_f)
def test_doubly_nested(self): def test_doubly_nested(self):
with captured_stdout(): with captured_stdout():
inner = outer()() inner = outer()()
self.assertBytecodeExactlyMatches(inner, expected_opinfo_inner, actual = dis.get_instructions(inner, first_line=expected_inner_line)
line_offset=expected_outer_offset) self.assertEqual(list(actual), expected_opinfo_inner)
def test_jumpy(self): def test_jumpy(self):
self.assertBytecodeExactlyMatches(jumpy, expected_opinfo_jumpy, actual = dis.get_instructions(jumpy, first_line=expected_jumpy_line)
line_offset=expected_jumpy_offset) self.assertEqual(list(actual), expected_opinfo_jumpy)
# get_instructions has its own tests above, so can rely on it to validate
# the object oriented API
class BytecodeTests(unittest.TestCase): class BytecodeTests(unittest.TestCase):
def test_instantiation(self): def test_instantiation(self):
# Test with function, method, code string and code object # Test with function, method, code string and code object
for obj in [_f, _C(1).__init__, "a=1", _f.__code__]: for obj in [_f, _C(1).__init__, "a=1", _f.__code__]:
b = dis.Bytecode(obj) with self.subTest(obj=obj):
self.assertIsInstance(b.codeobj, types.CodeType) b = dis.Bytecode(obj)
self.assertIsInstance(b.codeobj, types.CodeType)
self.assertRaises(TypeError, dis.Bytecode, object()) self.assertRaises(TypeError, dis.Bytecode, object())
def test_iteration(self): def test_iteration(self):
b = dis.Bytecode(_f) for obj in [_f, _C(1).__init__, "a=1", _f.__code__]:
for instr in b: with self.subTest(obj=obj):
self.assertIsInstance(instr, dis.Instruction) via_object = list(dis.Bytecode(obj))
via_generator = list(dis.get_instructions(obj))
self.assertEqual(via_object, via_generator)
assert len(list(b)) > 0 # Iterating should yield at least 1 instruction def test_explicit_first_line(self):
actual = dis.Bytecode(outer, first_line=expected_outer_line)
self.assertEqual(list(actual), expected_opinfo_outer)
def test_source_line_in_disassembly(self):
# Use the line in the source code
actual = dis.Bytecode(simple).dis()[:3]
expected = "{:>3}".format(simple.__code__.co_firstlineno)
self.assertEqual(actual, expected)
# Use an explicit first line number
actual = dis.Bytecode(simple, first_line=350).dis()[:3]
self.assertEqual(actual, "350")
def test_info(self): def test_info(self):
self.maxDiff = 1000 self.maxDiff = 1000
@ -714,16 +755,14 @@ class BytecodeTests(unittest.TestCase):
b = dis.Bytecode(x) b = dis.Bytecode(x)
self.assertRegex(b.info(), expected) self.assertRegex(b.info(), expected)
def test_display_code(self): def test_disassembled(self):
b = dis.Bytecode(_f) actual = dis.Bytecode(_f).dis()
output = io.StringIO() self.assertEqual(actual, dis_f)
b.display_code(file=output)
result = [line.rstrip() for line in output.getvalue().splitlines()]
self.assertEqual(result, dis_f.splitlines())
def test_main(): def test_main():
run_unittest(DisTests, CodeInfoTests, InstructionTests, BytecodeTests) run_unittest(DisTests, DisWithFileTests, CodeInfoTests,
InstructionTests, BytecodeTests)
if __name__ == "__main__": if __name__ == "__main__":
test_main() test_main()

View File

@ -31,6 +31,20 @@ Core and Builtins
Library Library
------- -------
- Issue #19378: Fixed a number of cases in the dis module where the new
"file" parameter was not being honoured correctly
- Issue #19378: Removed the "dis.Bytecode.show_info" method
- Issue #19378: Renamed the "dis.Bytecode.display_code" method to
"dis.Bytecode.dis" and converted it to returning a string rather than
printing output.
- Issue #19378: the "line_offset" parameter in the new "dis.get_instructions"
API has been renamed to "first_line" (and the default value and usage
changed accordingly). This should reduce confusion with the more common use
of "offset" in the dis docs to refer to bytecode offsets.
- Issue #18678: Corrected spwd struct member names in spwd module: - Issue #18678: Corrected spwd struct member names in spwd module:
sp_nam->sp_namp, and sp_pwd->sp_pwdp. The old names are kept as extra sp_nam->sp_namp, and sp_pwd->sp_pwdp. The old names are kept as extra
structseq members, for backward compatibility. structseq members, for backward compatibility.
@ -169,6 +183,12 @@ Library
Tests Tests
----- -----
- Issue #19378: the main dis module tests are now run with both stdout
redirection *and* passing an explicit file parameter
- Issue #19378: removed the not-actually-helpful assertInstructionMatches
and assertBytecodeExactlyMatches helpers from bytecode_helper
- Issue #18702: All skipped tests now reported as skipped. - Issue #18702: All skipped tests now reported as skipped.
- Issue #19439: interpreter embedding tests are now executed on Windows - Issue #19439: interpreter embedding tests are now executed on Windows