Also make return calls terminator instructions so epilogues are
inserted before them rather than after them. Together, these changes
make WebAssembly's tail call optimization more stack-safe.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Unit tests: fail. 62433 tests passed, 1 failed and 845 were skipped.
failed: libc++.std/thread/thread_mutex/thread_mutex_requirements/thread_timedmutex_requirements/thread_timedmutex_class/try_lock_for.pass.cpp
clang-tidy: fail. clang-tidy found 0 errors and 1 warnings. 0 of them are added as review comments below (why?).
clang-format: pass.
Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml
Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.
It doesn't look like there's anything wasm-specific here. Surely this also inhibits tail calling in other backends too? Are frontends suppost to avoid putting 'tail call' in the IR in this case?
I took a look at some of the other backends and their logic is much more complicated than what we have here. I didn't see any code that did exactly what I'm doing here, but I assume that they are doing something else to the same effect that is possibly more conservative. It's also possible that they do some magic to move stack arguments before tail calling. Experimentation shows that clang does not add tail to the call when a stack argument is passed anyway, so perhaps other frontends do that work themselves as well.
The langref says (https://llvm.org/docs/LangRef.html#call-instruction) that the tail call marker implies that the callee does not access allocas from the caller. So it seems like it *should* mean that the backend can depend on this property (that you're checking for here). It also means that the frontend should guarantee it as best it knows how, and optimizations should not introduce it (or remove the attribute if they do?). There are lots of ways to sneak pointers into places (aliasing, going through memory, etc etc) so I'd expect the check in this CL to be brittle.
But if other backends are checking for this kind of property anyway that would be fishy, so now I'm a bit confused. Are they just checking for particular target-specific (or calling-convention-specific) properties?
I also did a quick search for what other backends are doing, and they are all kind of different. Many targets seem to implement some function called [[ https://github.com/llvm/llvm-project/search?q=isEligibleForTailCallOptimization&unscoped_q=isEligibleForTailCallOptimization | isEligibleForTailCallOptimization ]], even though the superclass TargetLowering does not have that method. Anyway, I think it would be fine if we add things we need on an as-needed basis.
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
691 | I think there can be other cases other than geps and aliases; looking at this function might give some ideas. Skimming at the code, it looks [[ https://github.com/llvm/llvm-project/blob/0c3b2986ac6b71abc649811c3ec9cb0bf073c7d8/llvm/lib/IR/Value.cpp#L605-L607 | stripPointerCastsAndOffsets ]] does the most comprehensive checking we need. |
I agree with your interpretation, but there doesn't seem to be any validation that the callee doesn't access allocas of the caller, so this change makes the backend more robust in the presence of incorrect IR that nonetheless passes IR validation. That seems like a useful but non-essential property to have, so I would be ok not merging this except that I don't think there are any downsides.
There are lots of ways to sneak pointers into places (aliasing, going through memory, etc etc) so I'd expect the check in this CL to be brittle.
The check does as much as it can, but you're right that it can't look through operations that lose type information. That means it is possible to generate IR that hides its reference to caller allocas in its tail call, but it's ok if we sometimes generate bad code in that case because the LangRef said you weren't supposed to do that.
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
691 | stripPointerCastsAndOffsets doesn't look through aliases, so it doesn't replace stripPointerCastsAndAliases. I could use both in the loop, but I think manually fetching the GEP pointer is clearer. I could probably be convinced otherwise, though. |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
691 | Ah, you're right that it doesn't look through aliases. But the current implementation here does not look through bitcasts or call return args that stripPointerCastsAndOffsets. How about something like this? Value *Val = Arg.get(), *NewVal = nullptr; while (Val != NewVal) { NewVal = Val->stripPoinsterCastsAndOffsets()->stripPointerCastsAndAliases(); } |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
691 | ('checks' was missing at the end of first sentence) |
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp | ||
---|---|---|
691 | Talked in person, I was incorrect that this misses bitcasts and such; it includes checks for all that because this manually goes through geps. |
I think there can be other cases other than geps and aliases; looking at this function might give some ideas. Skimming at the code, it looks [[ https://github.com/llvm/llvm-project/blob/0c3b2986ac6b71abc649811c3ec9cb0bf073c7d8/llvm/lib/IR/Value.cpp#L605-L607 | stripPointerCastsAndOffsets ]] does the most comprehensive checking we need.