Implements direct and indirect tail calls enabled by the 'tail-call'
feature. Updates existing call tests and adds new tests including
a binary encoding test.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 33039 Build 33038: arc lint + arc unit
Event Timeline
Nice!
- Does the LLVM guarantee tail calls to be placed in the positions in which they can be really tail calls? Can this property change when we move code around in our backend, so tail calls cannot be executed as tail calls anymore?
- It seems we need to update the Tail call optimization section in the Code Generator doc too.
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
788 |
| |
llvm/lib/Target/WebAssembly/WebAssemblyInstrCall.td | ||
26–35 |
| |
71 | Probably not related to this CL, but I wonder maybe we should also merge this into a unified CALL instruction when we have calls that return multiple values. | |
82 |
| |
82 | Oh I checked the spec and it says return_call. Nevermind the first thing | |
llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td | ||
102 | Nit: stray indentations | |
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
2 | It looks like we don't support tail calls in fastisel yet, right?
| |
14 | How is this musttail supported in fastisel, while we have no support yet? Does that fall back to the normal isel? | |
30 | Maybe I'm missing something, but can this return_call return a value, while it does not have typed versions? | |
62 | Nit: Stray space after return_call_indirect, and double spaces between args. The same for other functions below too. | |
116 | How about adding a test for notail too for coverage? |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
651 | Probably should update this error message (e.g. "tail calls aren't enabled") |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
788 | I initially tried to handle this as part of the existing logic, but kept running into assertion failures in the SelectionDAGBuilder. Eventually I just cargo culted what AArch64 was doing for its tail call lowering (including the MVT::Glue) and everything worked. According to MachineValueType.h Glue "glues nodes together during pre-RA sched" so maybe it's for gluing the call to a subsequent return node? I'm not entirely sure. I'll try playing around with this a bit more to see if I can get rid of the glue. | |
llvm/lib/Target/WebAssembly/WebAssemblyInstrCall.td | ||
26–35 | yes, for simd. vt is v16i8, v4i32, etc. but rt is always V128. prefix is actually rt in lowercase, but there is no tablegen for lowercasing a string :(. We could actually clean a lot of SIMD tablegen code up if we could properly manipulate strings, but I'm not sure that would be a welcome tablegen feature. | |
71 | Yes, something like that is going to need to happen. It will probably be rather tricky, given the rigidity of the SelectionDAG type system. | |
82 | In wasm, return_call is not just a call that happens to be in a tail position, but actually a return and a call rolled into a single action. So a return_call can be moved by optimizations in any way that a return could be moved. Since the instruction performs a return, it is impossible for it to produce a value that is consumed in the current context, which is why it is not modeled as returning a value. In Binaryen this is clearer because return_call would have type unreachable. | |
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
14 | Yes, musttail causes fastisel to fall back to normal isel. I tried adding fastisel support for tail call but it didn't work out. Maybe I was missing some glue... | |
30 | It returns a value to the caller of the current function but does not return a value to the current function itself. | |
62 | The space is not stray, unfortunately. There is a FIXME comment in the instruction printer to remove the leading comma. I have fixed the double spaces. |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
788 | I see. It sounds requiring MVT::Glue actually makes sense in that case. But I still wonder why can't we handle this in the existing logic? Aren't the opcode and MVT::Glue only different things from the normal workflow? And don't we need InTyList as well, as in the normal flow? | |
llvm/lib/Target/WebAssembly/WebAssemblyInstrCall.td | ||
82 | Hmm, I see. Then I think we'd need isReturn = 1 for these instructions. | |
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
2 | ping | |
14 | Then how about putting a TODO comment in WebAssemblyFastISel.cpp or somewhere? |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
788 | I integrated the IsTailCall path into the existing logic flow. I think this makes the code more complicated, but perhaps it is less surprising now. | |
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
2 | I suppose we should probably support tail calls in FastISel. I started doing that but ran into issues so I gave up. Probably worth another shot in a separate PR. I've added a TODO. FastISel is not the same as DAG ISel because it lowers optional tail calls to normal calls while DAG ISel lowers them to tail calls. Since there is a difference I thought it worth a test. |
llvm/docs/CodeGenerator.rst | ||
---|---|---|
2107 | Shouldn't we have the same restrictions on PIC code? |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
790 |
|
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
790 | The terminology here is super confusing, but everything named In* refers to values coming back into the current frame once the call finishes, i.e. the return values. return_call has no return values except for the chain and glue, so InTys gets cleared. We would need to duplicate the condition for the clear and the glue push if we hoisted the other push, so I don't think it's worth it. |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
790 | Oh I see.... I thought InTys is the input parameters, and that was one of the reasons I suggested merging into the existing workflow. Yeah, the benefits of merging look a bit less clear then. If you think the previous code is more clear I think it's good to revert to it. Sorry for my confusion. |
llvm/docs/CodeGenerator.rst | ||
---|---|---|
2107 | I'm not totally clear on the details of our PIC code, but any direct or indirect call could be lowered to a tail call, so I'm not sure why there should be a restriction. I'll add tests for byval, varargs, etc. |
llvm/docs/CodeGenerator.rst | ||
---|---|---|
2107 | IIRC PIC shouldn't effect direct calls. It effects indirect calls since function addresses aren't statically known. |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
796–800 | Now we have reverted to the previous logic, we don't need RET_CALL here | |
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
120 | Not sure what this tests..? The function name is mismatched_prototypes but the prototypes here match. And the comment says musttail requires ... but this is not a musttail but a tail. The same for other methods below. |
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
---|---|---|
120 | The prototypes of the caller and callee do not match. Using musttail here would be an LLVM validation error, but we can still test with tail, and similar for the following tests. I'll clarify the comments. |
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
---|---|---|
120 | Sorry, but don't they match..? declare i32 @baz(i32, i32, i32) Call: tail call i32 @baz(i32 0, i32 42, i32 6) |
LGTM!
llvm/test/CodeGen/WebAssembly/tailcall.ll | ||
---|---|---|
120 | Talked offline and turned out I didn't correctly understand what the comment meant |
Shouldn't we have the same restrictions on PIC code?
Also, while it seems like va_arg and byval should work, if we're going to promise to optimize those too, we should have some tests.