This is an archive of the discontinued LLVM Phabricator instance.

Do actual DCE in LoopUnroll
ClosedPublic

Authored by reames on May 14 2021, 9:52 AM.

Details

Summary

LoopUnroll does a limited DCE pass after unrolling, but if you have a chain of dead instructions, it only deletes the last one. Improve the code to recursively delete all trivially dead instructions.

Diff Detail

Event Timeline

reames created this revision.May 14 2021, 9:52 AM
reames requested review of this revision.May 14 2021, 9:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 14 2021, 9:52 AM
Whitney accepted this revision.May 14 2021, 9:59 AM
This revision is now accepted and ready to land.May 14 2021, 9:59 AM
nikic added inline comments.May 14 2021, 10:04 AM
llvm/lib/Transforms/Utils/LoopUnroll.cpp
226

Why do we need WeakTrackingVH here?

239–243

Why not use the RecursivelyDeleteTriviallyDeadInstructions(DeadInsts) overload here?

This revision was landed with ongoing or failed builds.May 14 2021, 10:05 AM
This revision was automatically updated to reflect the committed changes.
nikic added inline comments.May 14 2021, 10:07 AM
llvm/lib/Transforms/Utils/LoopUnroll.cpp
226

Err, ignore this comment, I submitted it by accident -- it's incompatible with the other one, which requires WeakTrackingVH in the API.

@nikic Didn't see your comments until after pushing, will address post-commit shortly.

llvm/lib/Transforms/Utils/LoopUnroll.cpp
226

I was concerned about visit order issues. Taking another look, I think this is redundant.

239–243

Good question, will fix this and other site in same file.

fhahn added a subscriber: fhahn.May 16 2021, 3:43 AM

It looks like this is causing Clang to crash when building the C source below with -O3 on X86 (https://bugs.llvm.org/show_bug.cgi?id=50354)

int a, b, c;
int main () {
  while (b)
    for (c = -9; c; c++)
      a = (a ^ 2) == 0;
  return 0;
}
lebedev.ri added inline comments.
llvm/lib/Transforms/Utils/LoopUnroll.cpp
212

Bikeshedding question: *why* do we do this here?
I'm looking at a test case where we spend most of the time (60%+, minutes) in LoopUnroll,
apparently in simplifyLoopIVs() specifically.

reames added inline comments.May 17 2021, 3:03 PM
llvm/lib/Transforms/Utils/LoopUnroll.cpp
212

Presumably there are some tests that fail or some git history which can give a clue. Not sure off the top of my head.

If you see 60% of time, that's probably a SCEV bug as a guess. :)

lebedev.ri added inline comments.May 18 2021, 3:39 AM
llvm/lib/Transforms/Utils/LoopUnroll.cpp
212

Hmm, i was hoping this was simply resulting in bad SCEV forget-compute cache hammering,
but even if i completely drop this function, and run a standalone IndVars pass afterwards,
it also exhibits the same perf problem.