Transforms:
strlen(calloc(...)) -> 0
Details
Diff Detail
Event Timeline
lib/Analysis/MemoryLocation.cpp | ||
---|---|---|
24 | To add support for other ops, maybe extend function to | |
lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
1108 | eliminateNoopStore uses "isCallocLikeFn" (does it check if any of the arguments is zero?), but I believe I could not use it here. Suggestions? |
lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
---|---|---|
1409–1410 | Sorry for removal, my debug lines were here. I will restore these empty lines in the next patch update. |
Is this a common pattern, or are there real-world benchmarks this improves?
This adds a fair bit of code to a hot part of the compiler, so it'd be worth sharing LNT and CTMark data to make sure there aren't any regressions (http://llvm.org/docs/lnt/quickstart.html, https://github.com/llvm-mirror/test-suite/tree/master/CTMark).
I will try to run that benchmarks, but I see no reasons for regressions, but I will check it :)
This patch also introduces "base" in DSE for more future transformations like:
memset(s, 0, len)
strlen(s) --> 0
And also in the future we can finally do proper malloc+memset folding to calloc (currently there is suboptimal code for it in InstCombine)
Edit: data from CTMark: https://www.diffchecker.com/Zs5boksn
lib/Analysis/MemoryLocation.cpp | ||
---|---|---|
22 | Self note: add assert for nonnull Op. | |
lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
1125 | I would like to also reduce number of args here, since we don't need some of them. new name: optimizeLibCalls? | |
1133 | I should check if CI->isNoBuiltin, I will fix it later | |
1136 | I would like to do here a similar switch like the one in SimplifyLibCalls Value *NewInst = nullptr; case LibFunc_strlen: NewInst = optimizeStrlen(CI, BBI, AA, MD, DL, TLI, IOL, InstrOrdering); break; .... } if (NewInst) { CI->replaceAllUsesWith(NewInst); CI->eraseFromParent(); return true; } Suggestions? |
Thanks. For patches which touch hot and/or performance-sensitive parts of the compiler, it's standard practice to provide benchmark results. This is a basic way to communicate improvements and to catch regressions (both performance & compile-time).
This patch also introduces "base" in DSE for more future transformations like:
memset(s, 0, len)
strlen(s) --> 0And also in the future we can finally do proper malloc+memset folding to calloc (currently there is suboptimal code for it in InstCombine)
Edit: data from CTMark: https://www.diffchecker.com/Zs5boksn
Could you aggregate this data over multiple runs, and report some statistics so we can compare the baseline versus the patched compiler at a high-level? The same goes for LNT. It would be useful to see if/how the geomean execution time & compile time change. For individual benchmarks, it's standard practice to share which benchmarks improve the most (& by how much), and which benchmarks regress.
Thanks, yes, sure!
But I would like to fix things from the code review, maybe @efriedma can look at it? He already gave me some advices related to DSE. After major notes would be addressed, I would do benchmarks :)
I can't imagine a situation where this transform would trigger in practice... but maybe I don't write enough C code. How often does this trigger on some large codebase, like the LLVM testsuite?
include/llvm/Analysis/MemoryLocation.h | ||
---|---|---|
82 | I'd rather not mess with this interface... if the instruction is a call, the caller needs to be more aware of what exactly it's asking for (since it could access any number of different memory locations). | |
lib/Analysis/MemoryLocation.cpp | ||
23 | TBAA metadata doesn't get applied to calls (I'm not sure what the semantics would be), so you probably don't need to call getAAMetadata. | |
lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
581 | Can you just make the MemoryLocation a parameter here? I don't think your changes to MemoryLocation.h are useful. | |
1085 | There's no point to checking whether one of the arguments is zero. We can assume they're non-zero because strlen(nullptr) has undefined behavior. |
Currently, this is rare, but folding malloc + memset to calloc will bring more hints. This patch also introduces "support for CallInst" in MemoryLocation - what is a nice thing for any future patches.
lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
---|---|---|
581 | Good point! Thanks. |
lib/Analysis/MemoryLocation.cpp | ||
---|---|---|
23 | Thanks for info.. So things will become simpler. |
I'd rather not mess with this interface... if the instruction is a call, the caller needs to be more aware of what exactly it's asking for (since it could access any number of different memory locations).