This is an archive of the discontinued LLVM Phabricator instance.

[DSE] Calloc-strlen optimizations
AbandonedPublic

Authored by xbolva00 on May 22 2018, 4:53 PM.

Details

Reviewers
efriedma
Summary

Transforms:
strlen(calloc(...)) -> 0

Diff Detail

Repository
rL LLVM

Event Timeline

xbolva00 created this revision.May 22 2018, 4:53 PM
xbolva00 added inline comments.May 22 2018, 5:01 PM
lib/Analysis/MemoryLocation.cpp
24

To add support for other ops, maybe extend function to
MemoryLocation::get(const CallInst *CI, const Value *Op)?

lib/Transforms/Scalar/DeadStoreElimination.cpp
1108

eliminateNoopStore uses "isCallocLikeFn" (does it check if any of the arguments is zero?), but I believe I could not use it here. Suggestions?

xbolva00 retitled this revision from [DSE] Calloc optimizations to [DSE] Calloc-strlen optimizations.May 22 2018, 5:02 PM
xbolva00 added inline comments.May 22 2018, 5:08 PM
lib/Transforms/Scalar/DeadStoreElimination.cpp
1404

Sorry for removal, my debug lines were here.

I will restore these empty lines in the next patch update.

xbolva00 updated this revision to Diff 148141.May 22 2018, 6:32 PM
  • Restored blank lines
  • Better get for CallInst
  • Other improvements
xbolva00 marked 2 inline comments as done.May 22 2018, 6:33 PM
vsk added a subscriber: vsk.May 22 2018, 8:34 PM

Is this a common pattern, or are there real-world benchmarks this improves?

This adds a fair bit of code to a hot part of the compiler, so it'd be worth sharing LNT and CTMark data to make sure there aren't any regressions (http://llvm.org/docs/lnt/quickstart.html, https://github.com/llvm-mirror/test-suite/tree/master/CTMark).

xbolva00 added a comment.EditedMay 22 2018, 8:41 PM
In D47237#1109008, @vsk wrote:

Is this a common pattern, or are there real-world benchmarks this improves?

This adds a fair bit of code to a hot part of the compiler, so it'd be worth sharing LNT and CTMark data to make sure there aren't any regressions (http://llvm.org/docs/lnt/quickstart.html, https://github.com/llvm-mirror/test-suite/tree/master/CTMark).

I will try to run that benchmarks, but I see no reasons for regressions, but I will check it :)

This patch also introduces "base" in DSE for more future transformations like:
memset(s, 0, len)
strlen(s) --> 0

And also in the future we can finally do proper malloc+memset folding to calloc (currently there is suboptimal code for it in InstCombine)

Edit: data from CTMark: https://www.diffchecker.com/Zs5boksn

xbolva00 added inline comments.May 23 2018, 7:02 AM
lib/Analysis/MemoryLocation.cpp
22

Self note: add assert for nonnull Op.

lib/Transforms/Scalar/DeadStoreElimination.cpp
1125

I would like to also reduce number of args here, since we don't need some of them.

new name: optimizeLibCalls?

1133

I should check if CI->isNoBuiltin, I will fix it later

1136

I would like to do here a similar switch like the one in SimplifyLibCalls

Value *NewInst = nullptr;
switch (...) {

case LibFunc_strlen:
     NewInst = optimizeStrlen(CI, BBI, AA, MD, DL, TLI, IOL, InstrOrdering);
     break;
 ....

}

if (NewInst) {

CI->replaceAllUsesWith(NewInst);
CI->eraseFromParent();
return true;

}

Suggestions?

vsk added a comment.May 24 2018, 4:49 PM
In D47237#1109008, @vsk wrote:

Is this a common pattern, or are there real-world benchmarks this improves?

This adds a fair bit of code to a hot part of the compiler, so it'd be worth sharing LNT and CTMark data to make sure there aren't any regressions (http://llvm.org/docs/lnt/quickstart.html, https://github.com/llvm-mirror/test-suite/tree/master/CTMark).

I will try to run that benchmarks, but I see no reasons for regressions, but I will check it :)

Thanks. For patches which touch hot and/or performance-sensitive parts of the compiler, it's standard practice to provide benchmark results. This is a basic way to communicate improvements and to catch regressions (both performance & compile-time).

This patch also introduces "base" in DSE for more future transformations like:
memset(s, 0, len)
strlen(s) --> 0

And also in the future we can finally do proper malloc+memset folding to calloc (currently there is suboptimal code for it in InstCombine)

Edit: data from CTMark: https://www.diffchecker.com/Zs5boksn

Could you aggregate this data over multiple runs, and report some statistics so we can compare the baseline versus the patched compiler at a high-level? The same goes for LNT. It would be useful to see if/how the geomean execution time & compile time change. For individual benchmarks, it's standard practice to share which benchmarks improve the most (& by how much), and which benchmarks regress.

xbolva00 added a comment.EditedMay 24 2018, 5:01 PM

Thanks, yes, sure!

But I would like to fix things from the code review, maybe @efriedma can look at it? He already gave me some advices related to DSE. After major notes would be addressed, I would do benchmarks :)

I can't imagine a situation where this transform would trigger in practice... but maybe I don't write enough C code. How often does this trigger on some large codebase, like the LLVM testsuite?

include/llvm/Analysis/MemoryLocation.h
82

I'd rather not mess with this interface... if the instruction is a call, the caller needs to be more aware of what exactly it's asking for (since it could access any number of different memory locations).

lib/Analysis/MemoryLocation.cpp
23

TBAA metadata doesn't get applied to calls (I'm not sure what the semantics would be), so you probably don't need to call getAAMetadata.

lib/Transforms/Scalar/DeadStoreElimination.cpp
582

Can you just make the MemoryLocation a parameter here? I don't think your changes to MemoryLocation.h are useful.

1085

There's no point to checking whether one of the arguments is zero. We can assume they're non-zero because strlen(nullptr) has undefined behavior.

I can't imagine a situation where this transform would trigger in practice... but maybe I don't write enough C code. How often does this trigger on some large codebase, like the LLVM testsuite?

Currently, this is rare, but folding malloc + memset to calloc will bring more hints. This patch also introduces "support for CallInst" in MemoryLocation - what is a nice thing for any future patches.

xbolva00 added inline comments.May 24 2018, 5:42 PM
lib/Transforms/Scalar/DeadStoreElimination.cpp
582

Good point! Thanks.

xbolva00 added inline comments.May 24 2018, 5:44 PM
lib/Analysis/MemoryLocation.cpp
23

Thanks for info.. So things will become simpler.

xbolva00 abandoned this revision.Jun 12 2018, 12:26 PM