This fixes several connected bugs. It is very hard to provide an
independent test case for each fix, because the bugs are interconnected
and sometimes one masks another. The provided test case triggers some,
but not all, of the bugs below.
- Background:
placeBlockMarker takes a BB, and if the BB is a destination of some
branch, it places end_block marker there, and computes the nearest
common dominator of all predecessors (what we call 'header') and places
a block marker there.
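The header computation described above can be sketched roughly as follows. This is a simplified, standalone illustration, not the code in this patch: the `DomTree` struct, `nearestCommonDominator`, and `findHeader` are hypothetical names, and BBs are represented by plain integers instead of LLVM's machine basic blocks and dominator tree.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical dominator tree over BB numbers (an assumption for this
// sketch): IDom[N] is the immediate dominator of N (the entry BB is its
// own IDom) and Depth[N] is N's depth in the dominator tree.
struct DomTree {
  std::vector<int> IDom;
  std::vector<int> Depth;
};

// Walks the deeper node up the dominator tree until both nodes meet.
int nearestCommonDominator(const DomTree &DT, int A, int B) {
  while (A != B) {
    if (DT.Depth[A] < DT.Depth[B])
      std::swap(A, B);
    A = DT.IDom[A]; // A is at least as deep as B here; move it up.
  }
  return A;
}

// The 'header' is the nearest common dominator of all predecessors of a
// branch destination.
int findHeader(const DomTree &DT, const std::vector<int> &Preds) {
  assert(!Preds.empty() && "a branch destination needs a predecessor");
  int Header = Preds.front();
  for (int Pred : Preds)
    Header = nearestCommonDominator(DT, Header, Pred);
  return Header;
}
```

For example, with an entry BB 0 that immediately dominates BBs 1 and 2, and BB 1 immediately dominating BBs 3 and 4, the header for predecessors {3, 4} is BB 1, while adding BB 2 as a predecessor moves the header up to BB 0.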
When we first place markers, we traverse BBs from top to bottom. For
example, when there are 5 BBs A, B, C, D, and E and B, D, and E are
branch destinations, if we mark the BB given to placeBlockMarker with *
and draw a rectangle representing the border of block and end_block
markers, the process is going to look like
                                    -------
                  -----             |-----|
    ---           |---|             ||---||
    |A|           ||A||             |||A|||
    ---    -->    |---|     -->     ||---||
    *B            | B |             || B ||
     C            | C |             || C ||
     D            -----             |-----|
     E            *D                |  D  |
                   E                -------
                                    *E
which means when we first place markers, we go from inner to outer
scopes. So when we place a block marker, if the header already
contains another block or try marker, that marker has to belong to an
inner scope, so the existing block/try markers should go _after_ the
new marker. This was the assumption we had.
But after placing all markers we run the fixUnwindMismatches function.
There we do some control flow transformation and create some branches,
and we call placeBlockMarker again to place block/end_block
markers for those newly created branches. We can't assume that we are
traversing branch destination BBs from top to bottom now because we are
basically inserting some new markers in the middle of existing markers.
Fix:
placeBlockMarker no longer assumes that the BBs it is given arrive in
top-to-bottom order. When placing a block marker, it now computes
whether each existing block or try marker in the header belongs to an
inner or an outer scope with respect to the current scope.
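One way to picture the inner/outer decision is the sketch below. It is only an illustration under stated assumptions, not the patch itself: `Marker`, `Placement`, `classify`, and `insertionIndex` are hypothetical names, and the criterion used (comparing the layout number of the BB that holds the matching end marker) is an assumption about how scopes that start in the same header can be ordered.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a block/try marker already in the header BB.
struct Marker {
  int EndBBNumber; // layout number of the BB holding its matching end marker
};

enum class Placement { BeforeNew, AfterNew };

// Assumption: a scope that starts in the same header and ends at or
// before the new scope's end is nested inside it (inner), so its marker
// goes after the new marker. A scope that ends later encloses the new
// one (outer), so its marker goes before the new marker.
Placement classify(const Marker &Existing, int NewEndBBNumber) {
  return Existing.EndBBNumber <= NewEndBBNumber ? Placement::AfterNew
                                                : Placement::BeforeNew;
}

// Position for the new marker in the header's marker list, assuming the
// list is ordered from outermost to innermost scope.
std::size_t insertionIndex(const std::vector<Marker> &HeaderMarkers,
                           int NewEndBBNumber) {
  std::size_t Index = 0;
  for (const Marker &M : HeaderMarkers)
    if (classify(M, NewEndBBNumber) == Placement::BeforeNew)
      ++Index;
  assert(Index <= HeaderMarkers.size());
  return Index;
}
```

With existing markers whose scopes end in BBs 8, 6, and 2 and a new scope ending in BB 4, the new marker lands at index 2: after the two outer scopes and before the inner one. Under the old top-to-bottom assumption, the new marker would always have been placed first.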
- Background:
In fixUnwindMismatches, when there is a call whose correct unwind
destination mismatches the current destination after initially placing
try markers, we wrap that with a new nested try/catch/end and
jump to the correct handler within the new catch. The correct handler
code is split as a separate BB from its original EH pad so it can be
branched to. Here's an example:
- Before
  mbb:
    call @foo       <- Unwind destination mismatch!
  wrong-ehpad:
    catch
    ...
  cont:
    end_try
    ...
  correct-ehpad:
    catch
    [handler code]
- After
  mbb:
    try                (new)
    call @foo
  nested-ehpad:        (new)
    catch              (new)
    local.set n / drop (new)
    br %handler        (new)
  nested-end:          (new)
    end_try            (new)
  wrong-ehpad:
    catch
    ...
  cont:
    end_try
    ...
  correct-ehpad:
    catch
    local.set n / drop (new)
  handler:             (new)
    end_try
    [handler code]
Note that after this transformation, it is possible there are no calls
to actually unwind to correct-ehpad here. call @foo now
branches to handler, and there can be no other calls to unwind to
correct-ehpad. In this case correct-ehpad does not have any
predecessors anymore.
This can cause a bug in placeBlockMarker, because we may need to place
an end_block marker in handler, and placeBlockMarker computes the
nearest common dominator of all predecessors. If one of handler's
predecessors (here correct-ehpad) does not have any predecessors itself,
i.e., there is no way of reaching it, we cannot correctly compute the
common dominator of handler's predecessors, and we end up placing no
block/end markers. This bug sometimes masks bug 1.
Fix:
When an EH pad has no predecessors left after this transformation, we
remove it from the predecessor lists of all its successors (i.e., we
delete its successor edges), so that its successors don't have any
dangling predecessors.
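The cleanup can be sketched on a toy CFG as follows. This is a minimal illustration, not the code in this patch: `Block`, `addEdge`, `removeEdge`, and `dropSuccessorsIfUnreachable` are hypothetical names standing in for LLVM's machine basic block API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy CFG node; it only tracks edges.
struct Block {
  std::vector<Block *> Preds;
  std::vector<Block *> Succs;
};

void addEdge(Block &From, Block &To) {
  From.Succs.push_back(&To);
  To.Preds.push_back(&From);
}

// Removes the edge From -> To from both endpoints' lists.
void removeEdge(Block &From, Block &To) {
  auto &S = From.Succs;
  S.erase(std::remove(S.begin(), S.end(), &To), S.end());
  auto &P = To.Preds;
  P.erase(std::remove(P.begin(), P.end(), &From), P.end());
}

// If the EH pad is no longer reachable (it has no predecessors), drop
// its outgoing edges so its successors keep only reachable predecessors.
void dropSuccessorsIfUnreachable(Block &EHPad) {
  if (!EHPad.Preds.empty())
    return;
  while (!EHPad.Succs.empty())
    removeEdge(EHPad, *EHPad.Succs.back());
  assert(EHPad.Succs.empty());
}
```

In the example from this bug, handler initially has two predecessors, nested-ehpad and correct-ehpad. After the cleanup runs on the now-unreachable correct-ehpad, handler's only predecessor is nested-ehpad, so a common dominator of its predecessors can be computed again.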
- Background:
Actually, the handler BB in the example shown in bug 2 doesn't need an
end_block marker, despite being a new branch destination, because it
already has an end_try marker, which can serve the same purpose. I just
put that example there for illustration. There is a case where we
actually need to place an end_block marker: when the branch destination
is the appendix BB. The appendix BB is created when a call that is
supposed to unwind to the caller ends up unwinding to a wrong EH pad. In
this case we also wrap the call with a nested try/catch/end,
create an 'appendix' BB at the very end of the function, and branch to
that BB, where we rethrow the exception to the caller.
Fix:
When we don't actually need to place block markers, we don't.
- In case we fall through to the continuation BB after the catch block,
after extracting handler code in fixUnwindMismatches (refer to bug 2
for an example), we now have to add a branch to it to bypass the
handler.
- Before
  try
    ...
    (falls through to 'cont')
  catch
    handler body
  end
                 <-- cont
- After
  try
    ...
    br %cont    (new)
  catch
  end
  handler body
                 <-- cont
The problem is, we haven't been placing a new end_block marker in the
cont BB in this case. We should, and this fixes it. But it is hard to
provide a test case that triggers this bug, because the current
compilation pipeline from .ll to .s does not generate this kind of code;
we always have a br after invoke. But code without br is still
valid, and we can have that kind of code if we have some pipeline
changes or optimizations later. Even MIR test cases cannot trigger this
part for now, because we don't currently encode auxiliary EH-related
data structures (such as [[ https://github.com/llvm/llvm-project/blob/19f5da9c1d698653f942b504544a73b85b1e703c/llvm/include/llvm/CodeGen/WasmEHFuncInfo.h#L29-L54 | WasmEHFuncInfo ]]) in MIR. That
functionality can be added later, but I don't think we should block this
fix on that.
Fix for bug 1