This is a prototype heap-to-stack conversion pass. It's not really ready to be reviewed yet. Posting this now because I'm going to talk about it tomorrow at the dev meeting (lightning talk).
Details
- Reviewers
- None
Diff Detail
Event Timeline
Not sure if enough, since @efriedma noted in D47345:
"Your "capture" function captures the pointer, but it's possible to write a function which takes the pointer as a parameter but doesn't capture it. For example, given the function void f(void *p) { free(p); }, LLVM will mark the parameter "p" nocapture."
free doesn't capture, but that's okay. I'm not depending on the capture check for that because, for free, the path reachability check takes care of that. For everything else, that's exactly correct, and in fact, I want to be able to handle cases where functions take the pointers and don't capture them (the tricky part there is actually that, if the pointer is captured, we also need to prove eventual termination).
I need to update the description to describe the algorithm and why I think that it works. ;) -- I'll do that soon.
I think you have to be careful about something like the following:
void f() { void *p = malloc(4); nocapture_func_frees_pointer(p); func_throws(); free(p); }
I don't think it's possible to trigger the issue at the moment, though, because we don't have a mustreturn attribute (so isGuaranteedToTransferExecutionToSuccessor() will return false for nocapture_func_frees_pointer).
lib/Transforms/Scalar/HeapToStack.cpp | ||
---|---|---|
139 | isGuaranteedToTransferExecutionToSuccessor is true for atomics. |
Add some optimization remarks. Now we'll get things like his:
In file included from /src/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp:31: In file included from /src/llvm/lib/Target/AArch64/AArch64.h:19: In file included from /src/llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h:23: In file included from /src/llvm/include/llvm/ADT/STLExtras.h:21: In file included from /src/llvm/include/llvm/ADT/SmallVector.h:21: /src/llvm/include/llvm/Support/MemAlloc.h:27:18: remark: removed memory allocation (placed on stack) [-Rpass=heap-to-stack] void *Result = std::malloc(Sz); ^ In file included from /src/llvm/lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp:32: In file included from /src/llvm/lib/Target/AArch64/AArch64InstrInfo.h:20: In file included from /src/llvm/include/llvm/CodeGen/TargetInstrInfo.h:21: In file included from /src/llvm/include/llvm/CodeGen/LiveRegUnits.h:18: /src/llvm/include/llvm/ADT/BitVector.h:164:18: remark: removed memory deallocation (placed on stack) [-Rpass=heap-to-stack] ~BitVector() { std::free(Bits.data()); } ^
During a check-clang self host, we currently remove 79 allocations (using the "very aggressive mode, at least").
Thanks for stats, interesting! Maybe try with some big C benchmarks/codebases? SPECs? :)
I think you're right. Thanks! -- I also rebased my nofree-attribute patch the other day. I'll post that update too. I think that can be used to solve this problem because in cases where we can infer nocapture we should also be able to infer nofree (unless, of course, as in this case, where the function actually does free the data).
lib/Transforms/Scalar/HeapToStack.cpp | ||
---|---|---|
139 | Indeed (at least some of them)... bool llvm::isGuaranteedToTransferExecutionToSuccessor(const Instruction *I) { // A memory operation returns normally if it isn't volatile. A volatile // operation is allowed to trap. // // An atomic operation isn't guaranteed to return in a reasonable amount of // time because it's possible for another thread to interfere with it for an // arbitrary length of time, but programs aren't allowed to rely on that. if (const LoadInst *LI = dyn_cast<LoadInst>(I)) return !LI->isVolatile(); if (const StoreInst *SI = dyn_cast<StoreInst>(I)) return !SI->isVolatile(); if (const AtomicCmpXchgInst *CXI = dyn_cast<AtomicCmpXchgInst>(I)) return !CXI->isVolatile(); if (const AtomicRMWInst *RMWI = dyn_cast<AtomicRMWInst>(I)) return !RMWI->isVolatile(); if (const MemIntrinsic *MII = dyn_cast<MemIntrinsic>(I)) return !MII->isVolatile(); |
Hi Hal! Are you planning on working on this one? If not maybe I can take over? Now that we have nofree & nosync added and deduced in the Attributor, maybe this can be revisited.
I would certainly like to see progress on this, and it's great that we can now get back to this. If you have time in the short term to make progress here, I'm happy for you to do so (and, of course, I'll help review).
isGuaranteedToTransferExecutionToSuccessor is true for atomics.