The existing logic in determining whether an alloca should live on the frame only looks explicit def-use relationships. However a value defined by an alloca may be implicitly needed across suspension points, either because an alias has across-suspension-point def-use relationship, or escaped by store/call/memory intrinsics. To properly handle all these cases, we have to properly visit the alloca pointer up-front. Thie patch extends the exisiting alloca use visitor to determine whether an alloca should live on the frame.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
There's an existing StackLifetime analysis that does some of this work and also considers the lifetime intrinsics; can we take advantage of that?
My understanding is quite the opposite, that StackLifetime analysis purely relies on lifetime intrinsics, and does not look at how an alloca is being used.
https://github.com/llvm/llvm-project/blob/e10e7829bf6f10c053c05e42b676d7acaf54a221/llvm/include/llvm/Analysis/StackLifetime.h#L111-L112
Okay. Well, it does seem to me that we need to be considering lifetimes. If we can also optimize when we don't have lifetime information and the variable doesn't escape, that's great, but we often have pretty good lifetime information that we can use. And in optimized builds we'll usually have already eliminated allocas that don't escape, so the remaining allocas are very likely to have all escaped.
Yes we are considering lifetimes. The algorithms by first checking to see if there are lifetimes, and if there are use them. After that we use the pointer visitor to do more precise tracking.
Great! To my understanding, if there are lifetime informations, the behavior of this patch is same with the behavior of previous implementation. Is my understanding right?
Thanks for the patch! I think I generally agree with this patch.
One thing is that the aliases seems to be over-analyzed. Can we just leave aliases on frame? Cause I have no idea about this effect on real workloads.
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
923–925 | this function implies dominate relation for arg1 and arg2, so we can not use hasPathCrossingSuspendPoint for two user basic blocks directly. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
923–925 | hmm if that's the case, I think we might already have some buggy code here. The checks on lifetime markers don't necessary have dominance relationships. I saw you removed the dominance assertion in https://reviews.llvm.org/D75664. Do you remember why? |
Just setEscaped for pointer cast used before coro.begin, or can we do something like llvm::PointerMayBeCapturedBefore to check the pointer directly? I'm not sure about this.
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
923–925 | The bitcast users hit the assertion, however, they do not across the suspend point due to rewriteMaterializableInstructions. Except that, each lifetime marker has dominance relationship with some of the users. Iterate all the lifetime marker has effect as same as def |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
2044 | we can call this at the beginning of this function, and then sink the aliases after coro.begin as we do in rewriteMaterializableInstructions, keep the original logic unchanged. I think we can even deal with unknown offset. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
2044 | I am not exactly sure what you mean, but let me explain a bit more about the alias analysis here. Which case do you think we can/should simplify? |
Thank you for the explanation!
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | Still need call Checker.hasPathCrossingSuspendPoint(AllocaDef, Pointeruser) |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
2044 | This makes me more clear! LGTM. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | We cannot do that because the whole purpose of this rewrite is to check if uses of the alloca belong to different suspend regions, which causes the spill. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | let's say : void foo () { alloca a; if (cond) { co_await use1 a; }else use2 a; } in such case, hasPathCrossingSuspendPoint(use1, use2) return false which means a is not keep on frame, this is wrong. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | In this case, "a" is just defined but not used at all before the co_await, so it would be correct to not keep it on frame, right? Could you explain why in this case "a" needs to be on the frame? |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 |
I do not understand why there is a user bb happends before coro.suspend? |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | It may used before the co_await, as long as initial_suspend never suspend and cond is false. This pattern can also be put in a loop. we do not know whether part will be executed. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | If an alloca needs to live across a suspension, there must exists two uses of the alloca (or through an alias) that one happens before a coro.suspend and one happens after that same coro.suspend. If no two uses of alloca can cross a suspend, the alloca does not ever need to live on the frame. Does this part sound right? Next we need to prove that if there exists such a pair of use, one of them will return true on hasPathCrossingSuspendPoint. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 |
In your code example, "a" is not used before co_await, neither initialized. So it doesn't matter if it lives on the frame or not. If it is used, then there will exists a user-pair that crosses suspensions. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 |
There is no before/after relation. as long as two uses used in different suspend region. the case can be changed to void foo () { alloca a; if (cond) { co_await use1 a; }else{ co_await use2 a; } }
|
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | Could you explain what could go wrong in this example if "a" is not on the frame? void foo () { alloca a; if (cond) { co_await use1 a; }else{ co_await use2 a; } } behaves exactly as void foo () { if (cond) { co_await alloca a1 use1 a1; }else{ co_await alloca a2 use2 a2; } } |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
925 | hmm... get your point. make sense to me. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
2026 | The lifetime checks may provide more accurate information in the case where allocas seem escaped. So I think it's helpful to keep it. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
898 | I prefer this to be report_fatal_error. |
llvm/lib/Transforms/Coroutines/CoroFrame.cpp | ||
---|---|---|
916–921 | not needed |
Hello,
A git bisect has identified this change as the likely candidate for a new set of asserts / crashes in SwiftShader when attempting to use the coroutine passes. This is having knock-on issues with internal Google projects.
When passing this IR to ./bin/opt crash.ir -coro-early -coro-split -coro-elide -S with this change, we now get this crash.
Running the same command on the parent change does not crash, and behaves as expected.
I'd like to file a bug, but LLVM's Bugzilla is not letting me sign in, possibly due to spam restrictions. :-/
I'll do some investigation myself tomorrow, but any assistance here would be gratefully appreciated.
Many thanks,
Ben
I can confirm that this patch introduced a bug.
It can be triggered when there is an alloca instruction defined after coro.begin and used after a suspension. These allocas are not properly moved to the .resume function.
I will think about how to fix this.
clang-tidy: warning: invalid case style for variable 'i' [readability-identifier-naming]
not useful
clang-tidy: warning: invalid case style for variable 'e' [readability-identifier-naming]
not useful