Prior to D89768, any alloca used after a suspension point was put on the coroutine frame, and hence was always reloaded in the resume function.
However, D89768 introduced a more precise way to determine whether an alloca should live on the frame. Allocas that are only used within one suspension region (and hence do not need to live across suspension points) are not put on the frame. They remain local to the resume function.
When creating the new entry for the .resume function, the existing logic only moved the allocas from the old entry to the new entry. This covers every alloca from the old entry. However, allocas that are defined after coro.begin are put into a separate basic block during CoroSplit (the PostSpill basic block). We need to make sure these allocas are moved to the new entry as well if they are used.
This patch walks through all allocas and checks whether each one is still used but not reachable from the new entry; if so, it moves the alloca to the new entry.
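The walk described above can be sketched on a toy model. This is not the code from the patch and does not use the LLVM API: ToyAlloca, ToyFunction, reachableFrom and hoistStrandedAllocas are hypothetical names made up for illustration, blocks are plain indices, and "used" is reduced to a boolean flag. It only shows the shape of the check (still used, but in a block not reachable from the new entry) and the resulting move.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Toy stand-ins for a basic block graph and alloca instructions.
// These are illustrative types, not LLVM classes.
struct ToyAlloca {
  std::string name;
  std::size_t block; // index of the block that currently holds the alloca
  bool hasUses;      // whether anything still uses the alloca
};

struct ToyFunction {
  std::vector<std::vector<std::size_t>> successors; // CFG edges per block
  std::vector<ToyAlloca> allocas;
};

// Mark every block reachable from `entry` with a breadth-first walk.
std::vector<bool> reachableFrom(const ToyFunction &f, std::size_t entry) {
  assert(entry < f.successors.size());
  std::vector<bool> seen(f.successors.size(), false);
  std::queue<std::size_t> work;
  seen[entry] = true;
  work.push(entry);
  while (!work.empty()) {
    std::size_t bb = work.front();
    work.pop();
    for (std::size_t succ : f.successors[bb]) {
      if (!seen[succ]) {
        seen[succ] = true;
        work.push(succ);
      }
    }
  }
  return seen;
}

// Move every alloca that is still used but lives in a block that cannot be
// reached from `newEntry` into `newEntry`. Returns how many were moved.
std::size_t hoistStrandedAllocas(ToyFunction &f, std::size_t newEntry) {
  const std::vector<bool> reachable = reachableFrom(f, newEntry);
  std::size_t moved = 0;
  for (ToyAlloca &a : f.allocas) {
    assert(a.block < reachable.size());
    if (!a.hasUses || reachable[a.block])
      continue; // dead, or already visible from the new entry
    a.block = newEntry;
    ++moved;
  }
  return moved;
}
```

In this model, a used alloca sitting in a PostSpill-like block that the new entry never branches to is moved, while unused allocas and allocas in reachable blocks are left where they are.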
This should fix the bug reported in D89768.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Cool. I also noticed that your code has quite a lot of allocas, which makes it a great example for benchmarking the memory savings from D89768.
I was wondering: comparing before D89768 with now, how much of a reduction do you see in the coroutine frame size in your codebase (i.e. the eventual argument to @coroutine_alloc_frame)? I would really appreciate it if you could share some insight into the magnitude of that.