mirror of https://github.com/python/cpython
gh-119786: move exception handling doc to InternalDocs (#119815)
This commit is contained in:
parent
cae4c80714
commit
367adc91fb
|
@ -10,3 +10,7 @@ to hold for other implementations of the Python language.
|
||||||
The core dev team attempts to keep this documentation up to date. If
|
The core dev team attempts to keep this documentation up to date. If
|
||||||
it is not, please report that through the
|
it is not, please report that through the
|
||||||
[issue tracker](https://github.com/python/cpython/issues).
|
[issue tracker](https://github.com/python/cpython/issues).
|
||||||
|
|
||||||
|
|
||||||
|
[Exception Handling](exception_handling.md)
|
||||||
|
|
|
@ -0,0 +1,201 @@
|
||||||
|
Description of exception handling
|
||||||
|
---------------------------------
|
||||||
|
|
||||||
|
Python uses a technique known as "zero-cost" exception handling, which
|
||||||
|
minimizes the cost of supporting exceptions. In the common case (where
|
||||||
|
no exception is raised) the cost is reduced to zero (or close to zero).
|
||||||
|
The cost of raising an exception is increased, but not by much.
|
||||||
|
|
||||||
|
The following code:
|
||||||
|
|
||||||
|
```
|
||||||
|
try:
|
||||||
|
g(0)
|
||||||
|
except:
|
||||||
|
res = "fail"
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
compiles into intermediate code like the following:
|
||||||
|
|
||||||
|
```
|
||||||
|
RESUME 0
|
||||||
|
|
||||||
|
1 SETUP_FINALLY 8 (to L1)
|
||||||
|
|
||||||
|
2 LOAD_NAME 0 (g)
|
||||||
|
PUSH_NULL
|
||||||
|
LOAD_CONST 0 (0)
|
||||||
|
CALL 1
|
||||||
|
POP_TOP
|
||||||
|
POP_BLOCK
|
||||||
|
|
||||||
|
-- L1: PUSH_EXC_INFO
|
||||||
|
|
||||||
|
3 POP_TOP
|
||||||
|
|
||||||
|
4 LOAD_CONST 1 ('fail')
|
||||||
|
STORE_NAME 1 (res)
|
||||||
|
```
|
||||||
|
|
||||||
|
`SETUP_FINALLY` and `POP_BLOCK` are pseudo-instructions. This means
|
||||||
|
that they can appear in intermediate code but they are not bytecode
|
||||||
|
instructions. `SETUP_FINALLY` specifies that henceforth, exceptions
|
||||||
|
are handled by the code at label L1. The `POP_BLOCK` instruction
|
||||||
|
reverses the effect of the last `SETUP` instruction, so that the
|
||||||
|
active exception handler reverts to what it was before.
|
||||||
|
|
||||||
|
`SETUP_FINALLY` and `POP_BLOCK` have no effect when no exceptions
|
||||||
|
are raised. The idea of zero-cost exception handling is to replace
|
||||||
|
these pseudo-instructions by metadata which is stored alongside the
|
||||||
|
bytecode, and which is inspected only when an exception occurs.
|
||||||
|
This metadata is the exception table, and it is stored in the code
|
||||||
|
object's `co_exceptiontable` field.
|
||||||
|
|
||||||
|
When the pseudo-instructions are translated into bytecode,
|
||||||
|
`SETUP_FINALLY` and `POP_BLOCK` are removed, and the exception
|
||||||
|
table is constructed, mapping each instruction to the exception
|
||||||
|
handler that covers it, if any. Instructions which are not
|
||||||
|
covered by any exception handler within the same code object's
|
||||||
|
bytecode, do not appear in the exception table at all.
|
||||||
|
|
||||||
|
For the code object in our example above, the table has a single
|
||||||
|
entry specifying that all instructions that were between the
|
||||||
|
`SETUP_FINALLY` and the `POP_BLOCK` are covered by the exception
|
||||||
|
handler located at label `L1`.
|
||||||
|
|
||||||
|
Handling Exceptions
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
At runtime, when an exception occurs, the interpreter looks up
|
||||||
|
the offset of the current instruction in the exception table. If
|
||||||
|
it finds a handler, control flow transfers to it. Otherwise, the
|
||||||
|
exception bubbles up to the caller, and the caller's frame is
|
||||||
|
checked for a handler covering the `CALL` instruction. This
|
||||||
|
repeats until a handler is found or the topmost frame is reached.
|
||||||
|
If no handler is found, the program terminates. During unwinding,
|
||||||
|
the traceback is constructed as each frame is added to it.
|
||||||
|
|
||||||
|
Along with the location of an exception handler, each entry of the
|
||||||
|
exception table also contains the stack depth of the `try` instruction
|
||||||
|
and a boolean `lasti` value, which indicates whether the instruction
|
||||||
|
offset of the raising instruction should be pushed to the stack.
|
||||||
|
|
||||||
|
Handling an exception, once an exception table entry is found, consists
|
||||||
|
of the following steps:
|
||||||
|
|
||||||
|
1. pop values from the stack until it matches the stack depth for the handler.
|
||||||
|
2. if `lasti` is true, then push the offset that the exception was raised at.
|
||||||
|
3. push the exception to the stack.
|
||||||
|
4. jump to the target offset and resume execution.
|
||||||
|
|
||||||
|
|
||||||
|
Reraising Exceptions and `lasti`
|
||||||
|
--------------------------------
|
||||||
|
|
||||||
|
The purpose of pushing `lasti` to the stack is for cases where an exception
|
||||||
|
needs to be re-raised, and be associated with the original instruction that
|
||||||
|
raised it. This happens, for example, at the end of a `finally` block, when
|
||||||
|
any in-flight exception needs to be propagated on. As the frame's instruction
|
||||||
|
pointer now points into the finally block, a `RERAISE` instruction
|
||||||
|
(with `oparg > 0`) sets it to the `lasti` value from the stack.
|
||||||
|
|
||||||
|
Format of the exception table
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
Conceptually, the exception table consists of a sequence of 5-tuples:
|
||||||
|
```
|
||||||
|
1. `start-offset` (inclusive)
|
||||||
|
2. `end-offset` (exclusive)
|
||||||
|
3. `target`
|
||||||
|
4. `stack-depth`
|
||||||
|
5. `push-lasti` (boolean)
|
||||||
|
```
|
||||||
|
|
||||||
|
All offsets and lengths are in code units, not bytes.
|
||||||
|
|
||||||
|
We want the format to be compact, but quickly searchable.
|
||||||
|
For it to be compact, it needs to have variable sized entries so that we can store common (small) offsets compactly, but handle large offsets if needed.
|
||||||
|
For it to be searchable quickly, we need to support binary search giving us log(n) performance in all cases.
|
||||||
|
Binary search typically assumes fixed size entries, but that is not necessary, as long as we can identify the start of an entry.
|
||||||
|
|
||||||
|
It is worth noting that the size (end-start) is always smaller than the end, so we encode the entries as:
|
||||||
|
`start, size, target, depth, push-lasti`.
|
||||||
|
|
||||||
|
Also, sizes are limited to 2**30 as the code length cannot exceed 2**31 and each code unit takes 2 bytes.
|
||||||
|
It also happens that depth is generally quite small.
|
||||||
|
|
||||||
|
So, we need to encode:
|
||||||
|
```
|
||||||
|
`start` (up to 30 bits)
|
||||||
|
`size` (up to 30 bits)
|
||||||
|
`target` (up to 30 bits)
|
||||||
|
`depth` (up to ~8 bits)
|
||||||
|
`lasti` (1 bit)
|
||||||
|
```
|
||||||
|
|
||||||
|
We need a marker for the start of the entry, so the first byte of entry will have the most significant bit set.
|
||||||
|
Since the most significant bit is reserved for marking the start of an entry, we have 7 bits per byte to encode offsets.
|
||||||
|
Encoding uses a standard varint encoding, but with only 7 bits instead of the usual 8.
|
||||||
|
The 8 bits of a byte are (msb left) SXdddddd where S is the start bit. X is the extend bit meaning that the next byte is required to extend the offset.
|
||||||
|
|
||||||
|
In addition, we combine `depth` and `lasti` into a single value, `((depth<<1)+lasti)`, before encoding.
|
||||||
|
|
||||||
|
For example, the exception entry:
|
||||||
|
```
|
||||||
|
`start`: 20
|
||||||
|
`end`: 28
|
||||||
|
`target`: 100
|
||||||
|
`depth`: 3
|
||||||
|
`lasti`: False
|
||||||
|
```
|
||||||
|
|
||||||
|
is encoded by first converting to the more compact four value form:
|
||||||
|
```
|
||||||
|
`start`: 20
|
||||||
|
`size`: 8
|
||||||
|
`target`: 100
|
||||||
|
`depth<<1+lasti`: 6
|
||||||
|
```
|
||||||
|
|
||||||
|
which is then encoded as:
|
||||||
|
```
|
||||||
|
148 (MSB + 20 for start)
|
||||||
|
8 (size)
|
||||||
|
65 (Extend bit + 1)
|
||||||
|
36 (Remainder of target, 100 == (1<<6)+36)
|
||||||
|
6
|
||||||
|
```
|
||||||
|
|
||||||
|
for a total of five bytes.
|
||||||
|
|
||||||
|
|
||||||
|
Script to parse the exception table
|
||||||
|
-----------------------------------
|
||||||
|
|
||||||
|
```
|
||||||
|
def parse_varint(iterator):
|
||||||
|
b = next(iterator)
|
||||||
|
val = b & 63
|
||||||
|
while b&64:
|
||||||
|
val <<= 6
|
||||||
|
b = next(iterator)
|
||||||
|
val |= b&63
|
||||||
|
return val
|
||||||
|
```
|
||||||
|
```
|
||||||
|
def parse_exception_table(code):
|
||||||
|
iterator = iter(code.co_exceptiontable)
|
||||||
|
try:
|
||||||
|
while True:
|
||||||
|
start = parse_varint(iterator)*2
|
||||||
|
length = parse_varint(iterator)*2
|
||||||
|
end = start + length - 2 # Present as inclusive, not exclusive
|
||||||
|
target = parse_varint(iterator)*2
|
||||||
|
dl = parse_varint(iterator)
|
||||||
|
depth = dl >> 1
|
||||||
|
lasti = bool(dl&1)
|
||||||
|
yield start, end, target, depth, lasti
|
||||||
|
except StopIteration:
|
||||||
|
return
|
||||||
|
```
|
|
@ -1,182 +0,0 @@
|
||||||
Description of exception handling in Python 3.11
|
|
||||||
------------------------------------------------
|
|
||||||
|
|
||||||
Python 3.11 uses what is known as "zero-cost" exception handling.
|
|
||||||
Prior to 3.11, exceptions were handled by a runtime stack of "blocks".
|
|
||||||
|
|
||||||
In zero-cost exception handling, the cost of supporting exceptions is minimized.
|
|
||||||
In the common case (where no exception is raised) the cost is reduced
|
|
||||||
to zero (or close to zero).
|
|
||||||
The cost of raising an exception is increased, but not by much.
|
|
||||||
|
|
||||||
The following code:
|
|
||||||
|
|
||||||
def f():
|
|
||||||
try:
|
|
||||||
g(0)
|
|
||||||
except:
|
|
||||||
return "fail"
|
|
||||||
|
|
||||||
compiles as follows in 3.10:
|
|
||||||
|
|
||||||
2 0 SETUP_FINALLY 7 (to 16)
|
|
||||||
|
|
||||||
3 2 LOAD_GLOBAL 0 (g)
|
|
||||||
4 LOAD_CONST 1 (0)
|
|
||||||
6 CALL_NO_KW 1
|
|
||||||
8 POP_TOP
|
|
||||||
10 POP_BLOCK
|
|
||||||
12 LOAD_CONST 0 (None)
|
|
||||||
14 RETURN_VALUE
|
|
||||||
|
|
||||||
4 >> 16 POP_TOP
|
|
||||||
18 POP_TOP
|
|
||||||
20 POP_TOP
|
|
||||||
|
|
||||||
5 22 POP_EXCEPT
|
|
||||||
24 LOAD_CONST 3 ('fail')
|
|
||||||
26 RETURN_VALUE
|
|
||||||
|
|
||||||
Note the explicit instructions to push and pop from the "block" stack:
|
|
||||||
SETUP_FINALLY and POP_BLOCK.
|
|
||||||
|
|
||||||
In 3.11, the SETUP_FINALLY and POP_BLOCK are eliminated, replaced with
|
|
||||||
a table to determine where to jump to when an exception is raised.
|
|
||||||
|
|
||||||
1 0 RESUME 0
|
|
||||||
|
|
||||||
2 2 NOP
|
|
||||||
|
|
||||||
3 4 LOAD_GLOBAL 1 (g + NULL)
|
|
||||||
16 LOAD_CONST 1 (0)
|
|
||||||
18 PRECALL 1
|
|
||||||
22 CALL 1
|
|
||||||
32 POP_TOP
|
|
||||||
34 LOAD_CONST 0 (None)
|
|
||||||
36 RETURN_VALUE
|
|
||||||
>> 38 PUSH_EXC_INFO
|
|
||||||
|
|
||||||
4 40 POP_TOP
|
|
||||||
|
|
||||||
5 42 POP_EXCEPT
|
|
||||||
44 LOAD_CONST 2 ('fail')
|
|
||||||
46 RETURN_VALUE
|
|
||||||
>> 48 COPY 3
|
|
||||||
50 POP_EXCEPT
|
|
||||||
52 RERAISE 1
|
|
||||||
ExceptionTable:
|
|
||||||
4 to 32 -> 38 [0]
|
|
||||||
38 to 40 -> 48 [1] lasti
|
|
||||||
|
|
||||||
(Note this code is from 3.11, later versions may have slightly different bytecode.)
|
|
||||||
|
|
||||||
If an instruction raises an exception then its offset is used to find the target to jump to.
|
|
||||||
For example, the CALL at offset 22, falls into the range 4 to 32.
|
|
||||||
So, if g() raises an exception, then control jumps to offset 38.
|
|
||||||
|
|
||||||
|
|
||||||
Unwinding
|
|
||||||
---------
|
|
||||||
|
|
||||||
When an exception is raised, the current instruction offset is used to find following:
|
|
||||||
target to jump to, stack depth, and 'lasti', which determines whether the instruction
|
|
||||||
offset of the raising instruction should be pushed.
|
|
||||||
|
|
||||||
This information is stored in the exception table, described below.
|
|
||||||
|
|
||||||
If there is no relevant entry, the exception bubbles up to the caller.
|
|
||||||
|
|
||||||
If there is an entry, then:
|
|
||||||
1. pop values from the stack until it matches the stack depth for the handler.
|
|
||||||
2. if 'lasti' is true, then push the offset that the exception was raised at.
|
|
||||||
3. push the exception to the stack.
|
|
||||||
4. jump to the target offset and resume execution.
|
|
||||||
|
|
||||||
|
|
||||||
Format of the exception table
|
|
||||||
-----------------------------
|
|
||||||
|
|
||||||
Conceptually, the exception table consists of a sequence of 5-tuples:
|
|
||||||
1. start-offset (inclusive)
|
|
||||||
2. end-offset (exclusive)
|
|
||||||
3. target
|
|
||||||
4. stack-depth
|
|
||||||
5. push-lasti (boolean)
|
|
||||||
|
|
||||||
All offsets and lengths are in instructions, not bytes.
|
|
||||||
|
|
||||||
We want the format to be compact, but quickly searchable.
|
|
||||||
For it to be compact, it needs to have variable sized entries so that we can store common (small) offsets compactly, but handle large offsets if needed.
|
|
||||||
For it to be searchable quickly, we need to support binary search giving us log(n) performance in all cases.
|
|
||||||
Binary search typically assumes fixed size entries, but that is not necessary, as long as we can identify the start of an entry.
|
|
||||||
|
|
||||||
It is worth noting that the size (end-start) is always smaller than the end, so we encode the entries as:
|
|
||||||
start, size, target, depth, push-lasti
|
|
||||||
|
|
||||||
Also, sizes are limited to 2**30 as the code length cannot exceed 2**31 and each instruction takes 2 bytes.
|
|
||||||
It also happens that depth is generally quite small.
|
|
||||||
|
|
||||||
So, we need to encode:
|
|
||||||
start (up to 30 bits)
|
|
||||||
size (up to 30 bits)
|
|
||||||
target (up to 30 bits)
|
|
||||||
depth (up to ~8 bits)
|
|
||||||
lasti (1 bit)
|
|
||||||
|
|
||||||
We need a marker for the start of the entry, so the first byte of entry will have the most significant bit set.
|
|
||||||
Since the most significant bit is reserved for marking the start of an entry, we have 7 bits per byte to encode offsets.
|
|
||||||
Encoding uses a standard varint encoding, but with only 7 bits instead of the usual 8.
|
|
||||||
The 8 bits of a bit are (msb left) SXdddddd where S is the start bit. X is the extend bit meaning that the next byte is required to extend the offset.
|
|
||||||
|
|
||||||
In addition, we will combine depth and lasti into a single value, ((depth<<1)+lasti), before encoding.
|
|
||||||
|
|
||||||
For example, the exception entry:
|
|
||||||
start: 20
|
|
||||||
end: 28
|
|
||||||
target: 100
|
|
||||||
depth: 3
|
|
||||||
lasti: False
|
|
||||||
|
|
||||||
is encoded first by converting to the more compact four value form:
|
|
||||||
start: 20
|
|
||||||
size: 8
|
|
||||||
target: 100
|
|
||||||
depth<<1+lasti: 6
|
|
||||||
|
|
||||||
which is then encoded as:
|
|
||||||
148 (MSB + 20 for start)
|
|
||||||
8 (size)
|
|
||||||
65 (Extend bit + 1)
|
|
||||||
36 (Remainder of target, 100 == (1<<6)+36)
|
|
||||||
6
|
|
||||||
|
|
||||||
for a total of five bytes.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Script to parse the exception table
|
|
||||||
-----------------------------------
|
|
||||||
|
|
||||||
def parse_varint(iterator):
|
|
||||||
b = next(iterator)
|
|
||||||
val = b & 63
|
|
||||||
while b&64:
|
|
||||||
val <<= 6
|
|
||||||
b = next(iterator)
|
|
||||||
val |= b&63
|
|
||||||
return val
|
|
||||||
|
|
||||||
def parse_exception_table(code):
|
|
||||||
iterator = iter(code.co_exceptiontable)
|
|
||||||
try:
|
|
||||||
while True:
|
|
||||||
start = parse_varint(iterator)*2
|
|
||||||
length = parse_varint(iterator)*2
|
|
||||||
end = start + length - 2 # Present as inclusive, not exclusive
|
|
||||||
target = parse_varint(iterator)*2
|
|
||||||
dl = parse_varint(iterator)
|
|
||||||
depth = dl >> 1
|
|
||||||
lasti = bool(dl&1)
|
|
||||||
yield start, end, target, depth, lasti
|
|
||||||
except StopIteration:
|
|
||||||
return
|
|
Loading…
Reference in New Issue