This is an archive of the discontinued LLVM Phabricator instance.

[PM/LoopUnswitch] Teach the new unswitch to handle nontrivial unswitching of switches.
ClosedPublic

Authored by chandlerc on Jun 3 2018, 3:06 AM.

Details

Summary

This works much like trivial unswitching of switches in that it reliably
moves the switch out of the loop. Here we potentially clone the entire
loop into each successor of the switch and re-point the cases at these
clones.

Due to the complexity of actually doing nontrivial unswitching, this
patch doesn't create a dedicated routine for handling switches -- it
would duplicate far too much code. Instead, it generalizes the existing
routine to handle both branches and switches as it largely reduces to
looping in a few places instead of doing something once. This actually
improves the results in some cases with branches due to being much more
careful about how dead regions of code are managed. With branches,
because exactly one clone is created and there are exactly two edges
considered, somewhat sloppy handling of the dead regions of code was
sufficient in most cases. But with switches, there are much more
complicated patterns of dead code and so I've had to move to a more
robust model generally. We still do as much pruning of the dead code
early as possible because that allows us to avoid even cloning the code.

This also surfaced another problem with nontrivial unswitching before
which is that we weren't as precise in reconstructing loops as we could
have been. This seems to have been mostly harmless, but resulted in
pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we
have to get this *right*, and everything benefits from it.

While the testing may seem a bit light here because we only have two
real cases with actual switches, they do a surprisingly good job of
exercising numerous edge cases here. Also, because we share the logic
with branches, most of the changes in this patch are reasonably well
covered by existing tests. Still, I'm going to look at some potentially
more interesting test coverage as well but didn't want to delay sending
this out.

The new unswitch now has all of the same fundamental power as the old
one with the exception of the single unsound case of *partial* switch
unswitching -- that really is just loop specialization and not
unswitching at all. It doesn't fit into the canonicalization model in
any way. We can add a loop specialization pass that runs late based on
profile data if important test cases ever come up here.

Depends on D47522.

Diff Detail

Event Timeline

chandlerc created this revision.Jun 3 2018, 3:06 AM
hfinkel added a subscriber: hfinkel.Jun 3 2018, 6:41 AM

Can you please update the comment at the top of include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h to define "full vs. partial" unswitching and "trivial vs. non-trivial" unswitching?

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
1797–1798

Dead code?

1990–1991

This FIXME is now out of date (switches are handled above)?

asbirlea added inline comments.Jun 5 2018, 3:47 PM
llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
735

Could you update the above comment to clarify who DominatingSucc is?
e.g. "It skips loop and exit blocks if their dominating successor (DominatingSucc) is not the block being unswitched." ?

1751

clang-format?

chandlerc updated this revision to Diff 150145.Jun 6 2018, 8:24 AM
chandlerc marked 3 inline comments as done.

Update to address code review comments.

chandlerc updated this revision to Diff 150146.Jun 6 2018, 8:29 AM
chandlerc marked an inline comment as done.

Another fixe.

Thanks for comments, should all be addressed

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
735

I used different phrasing, but let me know if it didn't work.

1751

Fixed.

1797–1798

Doh! sorry i didn't delete this.

chandlerc updated this revision to Diff 150190.Jun 6 2018, 1:20 PM

This time *actually* inculde the updated comments in the header file that Hal requested. Soryr this were missing in the prior update.

Not sure where to put this into... but I ran an experiment by adding -passes=loop(unswitch) to the LoopUnswitch tests and here is the result
with the build of all the three dependent fixes:
Failing Tests (11):

  LLVM :: Transforms/LoopUnswitch/2011-11-18-TwoSwitches.ll
  LLVM :: Transforms/LoopUnswitch/basictest.ll
  LLVM :: Transforms/LoopUnswitch/copy-metadata.ll
  LLVM :: Transforms/LoopUnswitch/elseif-non-exponential-behavior.ll
  LLVM :: Transforms/LoopUnswitch/guards.ll
  LLVM :: Transforms/LoopUnswitch/infinite-loop.ll
  LLVM :: Transforms/LoopUnswitch/msan.ll
  LLVM :: Transforms/LoopUnswitch/simplify-with-nonvalness.ll
  LLVM :: Transforms/LoopUnswitch/trivial-unswitch.ll
  LLVM :: Transforms/LoopUnswitch/unswitch-equality-undef.ll
  LLVM :: Transforms/LoopUnswitch/unswitch-select.ll

Expected Passes    : 30
Unsupported Tests  : 1
Unexpected Failures: 11

Some tests involve selects and guards, which I gather are not yet supported, some involve multiple unswitches over the same loop etc.
IMO it makes sense to go through these results and see if there are any surprises there.
I can submit a diff (and/or log of the run) if needed.

Not sure where to put this into... but I ran an experiment by adding -passes=loop(unswitch) to the LoopUnswitch tests and here is the result
with the build of all the three dependent fixes:
Failing Tests (11):

  LLVM :: Transforms/LoopUnswitch/2011-11-18-TwoSwitches.ll
  LLVM :: Transforms/LoopUnswitch/basictest.ll
  LLVM :: Transforms/LoopUnswitch/copy-metadata.ll
  LLVM :: Transforms/LoopUnswitch/elseif-non-exponential-behavior.ll
  LLVM :: Transforms/LoopUnswitch/guards.ll
  LLVM :: Transforms/LoopUnswitch/infinite-loop.ll
  LLVM :: Transforms/LoopUnswitch/msan.ll
  LLVM :: Transforms/LoopUnswitch/simplify-with-nonvalness.ll
  LLVM :: Transforms/LoopUnswitch/trivial-unswitch.ll
  LLVM :: Transforms/LoopUnswitch/unswitch-equality-undef.ll
  LLVM :: Transforms/LoopUnswitch/unswitch-select.ll

Expected Passes    : 30
Unsupported Tests  : 1
Unexpected Failures: 11

Some tests involve selects and guards, which I gather are not yet supported, some involve multiple unswitches over the same loop etc.
IMO it makes sense to go through these results and see if there are any surprises there.
I can submit a diff (and/or log of the run) if needed.

Most of these test files were ported to the new pass. Some of them aren't relevant any more.

I did in fact go through these and ported as many as I could when doing the initial work. But the old passes's testing is .... not great. I don't think there is much more value in re-analyzing these, but if you see specific things that you'd like to port over, go for it.

On the flip side, if you see specific logic you'd like better testing of, I'm more than happy to just directly add the testing. I think the tests we will get from that will be substantially higher quality.

ONe thing that makes it very expensive is that the results of unswitching are substantially different between the passes meaning that most of the CHECK lines have to be manually rewritten anyways.

Rebase now that all the prior patches are landed and ping.

asbirlea accepted this revision.Jun 22 2018, 2:32 PM

LGTM.

This revision is now accepted and ready to land.Jun 22 2018, 2:32 PM
This revision was automatically updated to reflect the committed changes.