A WebAssemblyException object contains BBs that belong to a 'catch' part
of the try-catch-end structure. Because CFGSort requires all the BBs
within a catch part to be sorted together as it does for loops, this
pass calculates the nesting structure of catch part of exceptions in a
function. Now this assumes the use of Windows EH instructions.
Details
Diff Detail
- Repository
- rL LLVM
- Build Status
Buildable 18482 Build 18482: arc lint + arc unit
Event Timeline
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
26–27 | What happens if some BBs belong to two different catch blocks? |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
26–27 | Like loop nesting structure, catch block structure should form a nested structure, so a BB belongs to both catch1 and catch2, catch1 should contain catch2 or vice versa. |
To handle if-else structure as well, I will create WebAssemblySortingBlock or something that can calculate the nesting structure of all exceptions, loops and if-elses.
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
26–27 | This is not so in LLVM IR. For example: define dso_local void @f() local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) { invoke void @x0() to label %5 unwind label %1 ; <label>:1: ; preds = %0 %2 = landingpad { i8*, i32 } catch i8* null %3 = extractvalue { i8*, i32 } %2, 0 %4 = tail call i8* @__cxa_begin_catch(i8* %3) #3 br label %11 ; <label>:5: ; preds = %0 invoke void @x1() to label %10 unwind label %6 ; <label>:6: ; preds = %5 %7 = landingpad { i8*, i32 } catch i8* null %8 = extractvalue { i8*, i32 } %7, 0 %9 = tail call i8* @__cxa_begin_catch(i8* %8) #3 br label %11 ; <label>:10: ; preds = %5 ret void ; <label>:11: ; preds = %6, %1 tail call void @abort() #4 unreachable } declare dso_local void @x0() local_unnamed_addr #1 declare i32 @__gxx_personality_v0(...) declare i8* @__cxa_begin_catch(i8*) local_unnamed_addr declare dso_local void @x1() local_unnamed_addr #1 ; Function Attrs: noreturn nounwind declare dso_local void @abort() local_unnamed_addr #2 In this example, BB#11 belongs to two different catch blocks. |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
26–27 | An exception catch part is defined as a BB with catch instruction and all other BBs dominated by this BB. In this case BB#11 is neither in the two catch blocks. Catch#1 contains BB#6 and BB#11, and Catch#2 contains BB#2 only. |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
26–27 | Why doesn't Catch#2 contain BB#11? The two catches are identical, they just jump to a shared common block after executing __cxa_begin_catch. |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
26–27 | What I meant by BB#11 belongs to neither of the catches was more about the real wasm code structure. Yes, it belongs to the catches semantically. But the wasm code will be like (block (try ... catch ... br 1 ) ) ... (try ... catch ... br 1 ) ) call @abort This code is not specific to your code example; this is trying to show the simplest case that two catch blocks jump to the common code. This text format is called the folded text format in wasm, and the real wasm code does not have parentheses of course, but this format is easier to show scope information. br is an unconditional branch instruction in wasm. The argument 1 means it breaks out of 2 scopes. (br 0 means it breaks out of 1 scope.) So br 1 within a try breaks out both the try scope and the outer block scope to get to the call @abort instruction. So, unless you want to duplicate call @abort part, in should be located outside of the two catch blocks, and there should be a branch within each of the catches to jump to the shared code. An instance of WebAssemblyExceptionInfo represents BBs within a single catch in wasm code (try is not there yet, so just BBs starting from catch). This is the reason why it is defined as 'the landing pad BB and all other BBs dominated by it`. |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
61–73 | Similar concern here. | |
80–91 | Is it not possible for the MBB to branch to block which then calls std::terminate? | |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.h | ||
39–40 | DeleteContainerPointers(SubExceptions); Even better would be to use a std::vector<unique_ptr<WebAssemblyException>> |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | Hmm, good point... I assumed a terminatepad would be always a single BB but it may not hold. Do you think I should DFS all descendants from here? | |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.h | ||
39–40 | To change this to std::vector<unique_ptr<WebAssemblyException>>, should I RAII WebAssemblyException object at the first creation time and use std::move whenever I pass it to a function until I figure out which container to insert this? |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | We handled problems like this one by using getFuncletMembership on the MachineFunction. This does a DFS to let you know which MBBs are part of which catch scopes. I think I'd find a way to reuse it for this. |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | Hmm, no, I guess I shouldn't.. doing DFS from all catchpads or cleanuppads sounds like unnecessarily expensive. Should I search for a function call to __clang_call_terminate and search upward from there..? |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | IIRC, getFuncletMembership is O(# MBBs) which is no worse than the fundamental amount of work you are doing here. It shouldn't be a considerable amount of overhead. |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | Oh, I added the last comment before seeing your new comments about getFuncletMembership, sorry. You mean, I can replace the whole analysis (not only terminate pad thing) with getFuncletMembership? Or are you talking only about this terminatepad detection? If you meant the former, I actually thought about not doing this DFS myself but just use the result from getFuncletMembership, but I couldn't find a way to create parent-child relationship between exception objects with getFuncletMembership. That function just returns a map of <BB, funclet number>, which does not say anything about whether a funclet is included in another funclet or is a top-level funclet. Can I reconstruct that information using getFuncletMembership? And by the way I changed the function name to getEHScopeMembership in D47005 after you accepted it, per @dschuff's request. PTAL if it looks OK. |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | Oh, maybe I can just search for a call to __clang_call_terminate() and figure out which WebAssemblyException object it belongs to, and search for BBs that are contained in that exception object... |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | As I said above, can I reconstruct LoopInfo-like parent-child relationship with getFuncletMembership? |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.h | ||
---|---|---|
39–40 | Come to think of it, this can be complicated. WebAssemblyException object is created in recalculate, which calls discoverAndMapException. Within discoverAndMapException, if a given exception is found to be a subexception of some other exception, it is inserted into a SubExceptions vector in the parent exception. If not, it is not inserted into anywhere in discoverAndMapException, but it is inserted into WebAssemblyExceptionInfo::TopLevelExceptions after returning from discoverAndMapException. So the code should look something like void WebAssembly::discoverAndMapException(std::unique_ptr<WebAssemblyException> WE) { ... if (WE is a subexception of a parent exception P) add WE to P.SubExceptions; ... } void addTopLeveException(std::unique_ptr<WebAssemblyException> WE); void WebAssemblyExceptinInfo::recalculate() { ... std::unique_ptr<WebAssemblyException> o(args); for (...) { auto *WE = new WeAssemblyException(EHPad); discoverAndMapException(std::move(WE), MDT); ... } ... if (top level exception) addTopLevelException(std::move(WE)); ... } But the problem is, once we move the object to discoverAndMapException, we can't dereference it or move it again after returning from it. So it is complicated because
I think DeleteContainerPointers(SubExceptions); is a simple and sane solution. What do you think? |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | I don't think you can do it with getEHScopeMembership, it just tells you which MBBs are in the same EH scope... However, testing if something is a top-level scope should be possible I think. You should be able to go from the MBB which starts the scope back to the llvm::BasicBlock and inspect its pad to see if it has a parent. | |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.h | ||
39–40 | SGTM |
- Use DeleteContainerPointers
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
80–91 | Actually we should reconsider the terminatepad-related transformations that happen in WebAssemblyExceptionPrepare pass, before this pass runs. WebAssemblyExceptionPrepare is not WasmEHPrepare that runs before instruction selection; it is a MachineFunctionPass that runs before we run WebAssemblyExceptionInfo. WebAssemblyExceptionInfo is in D46803 and I added you as a reviewer there just now. I'll continue this discussion there. |
Updated exception grouping algorithm: now it simply gather all the BBs
dominated by an EH pad, rather than stopping at catchret/cleanupret/unreachable.
CFGStackify seems to work either way, and this makes things simpler:
- We don't need to store frontier BBs ourselves and be able to just use existing MachineDominanceFrontier
- We can replace catchret/cleanupret earlier in ExceptionPrepare, so we need a single pass rather than two passes.
Looks pretty reasonable otherwise. I like the MIR unittest 👌
lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp | ||
---|---|---|
15 | Ah, this is the explanation I was looking for before. I would put it (at least the part that describes the class) in the header where the class is defined. | |
19 | I would reverse it and say something like "A WebAssemblyException represents/contains/consists of these blocks/this region" | |
lib/Target/WebAssembly/WebAssemblyExceptionInfo.h | ||
28 | Per our discussion on the previous CL, a big comment here explaining what this class represents, and how it is used. |
Per our discussion on the previous CL, a big comment here explaining what this class represents, and how it is used.