This is an archive of the discontinued LLVM Phabricator instance.

[NewPM] Disable non-trivial loop-unswitch on targets with divergence
ClosedPublic

Authored by sameerds on Mar 19 2021, 8:52 AM.

Details

Summary

Unswitching a loop on a non-trivial divergent branch is expensive
since it serializes the execution of both version of the
loop. But identifying a divergent branch needs divergence analysis,
which is a function level analysis.

The legacy pass manager handles this dependency by isolating such a
loop transform and rerunning the required function analyses. This
functionality is currently missing in the new pass manager, and there
is no safe way for the SimpleLoopUnswitch pass to depend on
DivergenceAnalysis. So we conservatively assume that all non-trivial
branches are divergent if the target has divergence.

Diff Detail

Event Timeline

sameerds created this revision.Mar 19 2021, 8:52 AM
sameerds requested review of this revision.Mar 19 2021, 8:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2021, 8:52 AM
rampitec added inline comments.Mar 19 2021, 9:10 AM
llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
2907

Is this a correct condition? Double negative "!NonTrivial" = Trivial, right? Are we not doing trivial loop unswitching?

tra added inline comments.Mar 19 2021, 10:01 AM
llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
2907

This should be a bit easier to read:

if (!(EnableNonTrivialUnswitch || 
      (NonTrivial && !TTI.hasBranchDivergence())))

Or, maybe extract part of the condition into a bool ShouldUnswitch = NonTrivial && !TTI.hasBranchDivergence().

sameerds updated this revision to Diff 332196.Mar 21 2021, 10:50 PM

Improved the readability of a potentially triple-negative boolean condition

Improved the readability of a potentially triple-negative boolean condition

The linter pointed out two overlong lines in the new comment. I've fixed that locally in my working directory. But I didn't want to trigger yet another build by uploading that NFC change.

tra added inline comments.Mar 22 2021, 10:18 AM
llvm/test/Transforms/LoopUnswitch/AMDGPU/divergent-unswitch.ll
16

Perhaps the test should be replaced with a test that verifies that unswitch does *not* happen.

LGTM modulo Artem's comment about test for unswitch to not happen.

sameerds added inline comments.Mar 22 2021, 5:38 PM
llvm/test/Transforms/LoopUnswitch/AMDGPU/divergent-unswitch.ll
16

I am not sure what you mean. I am guessing that the way phab presents this change might be to blame. There are two files: divergent-unswitch.ll (this file) and a new uniform-unswitch.ll. This file checks that the divergent unswitch does not happen, and passes.

The part that carries this comment is moved to uniform-unswitch.ll, where it is checked to ensure that the uniform unswitch does happen. But the new behaviour conservatily treats all non-trivial branches as divergent, and hence the new test is marked as XFAIL.

tra added inline comments.Mar 23 2021, 9:54 AM
llvm/test/Transforms/LoopUnswitch/AMDGPU/divergent-unswitch.ll
16

I've missed that you've split the files.

the new behaviour conservatily treats all non-trivial branches as divergent, and hence the new test is marked as XFAIL.

You may want to add a comment explaining that next to that XFAIL. Tests that test something that does not happen are hard to understand without additional details.

I guess I can rephrase my suggestion as "could we replace XFAIL with a positive test that unswitching does not happen on AMDGPU unless it's force-enabled by the flag"?

If this is a temporary arrangement and we do plan to eventually allow unswitching for uniform loops, XFAIL+a comment may do.

sameerds added inline comments.Mar 23 2021, 10:08 PM
llvm/test/Transforms/LoopUnswitch/AMDGPU/divergent-unswitch.ll
16

If this is a temporary arrangement and we do plan to eventually allow unswitching for uniform loops, XFAIL+a comment may do.

Correct. The test uniform-unswitch.ll does have a comment point to a bug report after the XFAIL line. It is not highlighted in this diff because Phabricator treats it as existing code that got moved to a new file.

But one can ask how long is "eventually". The deeper issue here is that this can be classified as a functional regression in the new pass manager. The old manager correctly handled the dependence of a loop transform on a function analysis, but the new pass manager has no such infrastructure. This test is supposed to pass (uniform non-trivial branches should get unswitched) when we bring that functionality to the new pass manager.

tra accepted this revision.Mar 24 2021, 9:40 AM
tra added inline comments.
llvm/test/Transforms/LoopUnswitch/AMDGPU/divergent-unswitch.ll
16

SGTM.

llvm/test/Transforms/LoopUnswitch/AMDGPU/uniform-unswitch.ll
2

Please add a comment/TODO explaining that the unswitch is expected to happen in principle, but we still need <something> to make it work.

This revision is now accepted and ready to land.Mar 24 2021, 9:40 AM
This revision was landed with ongoing or failed builds.Mar 25 2021, 4:28 AM
This revision was automatically updated to reflect the committed changes.
RKSimon added inline comments.
llvm/test/Transforms/LoopUnswitch/AMDGPU/uniform-unswitch.ll
2

@sameerds This is breaking a windows buildbot as its actually passing.....

http://lab.llvm.org:8011/#/builders/123/builds/3602

aeubanks added inline comments.Mar 25 2021, 10:15 AM
llvm/test/Transforms/LoopUnswitch/AMDGPU/uniform-unswitch.ll
2

If I had to guess, some bots are still using the legacy PM.
This test should either use -passes=default<O3> (new PM) or -O3 -enable-new-pm=0 (legacy PM).

Actually test unexpectedly passes with my local Linux build too:

********************
Unexpectedly Passed Tests (1):
  LLVM :: Transforms/LoopUnswitch/AMDGPU/uniform-unswitch.ll
sameerds added inline comments.
llvm/test/Transforms/LoopUnswitch/AMDGPU/uniform-unswitch.ll
2

Thanks to Richard Smith (@rsmith ?) for fixing this. The test was updated to force -enable-new-pm=1.