This is an archive of the discontinued LLVM Phabricator instance.

[Sink] allow sinking convergent operations across uniform branches
AbandonedPublic

Authored by sameerds on Jul 27 2021, 3:32 AM.

Details

Summary

This is a preparatory patch for fixing the semantics of attribute
"convergent" in terms of sets of threads determined by divergent
control flow.

See https://reviews.llvm.org/D104504 for the latest discussion.

Diff Detail

Event Timeline

sameerds created this revision.Jul 27 2021, 3:32 AM
sameerds requested review of this revision.Jul 27 2021, 3:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2021, 3:32 AM
sameerds updated this revision to Diff 362011.Jul 27 2021, 6:44 AM

Fixed a gross misunderstanding about the term "successor"

foad added inline comments.Jul 28 2021, 3:51 AM
llvm/lib/Analysis/DivergenceAnalysis.cpp
357–360

I think it would be less confusing to put the !DA check before the ContainsIrreducible check. I realise you have initialised ContainsIrreducible to false so that this still works, even for an "empty" analysis with no DA or LegacyDA, but that seems like a lie since we don't actually know whether the function contains irreducible regions or not.

I would hope for a structure like:

if (LegacyDA)
  return (something based on LegacyDA);
if (DA)
  return (something based on DA);
return false;

Same for the other functions below.
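The suggested structure can be sketched as follows, with hypothetical stand-in types (the real code would consult LegacyDivergenceAnalysis and DivergenceAnalysis objects, not these structs):

```cpp
// Hypothetical stand-ins for the two analyses; names and fields are
// illustrative only.
struct LegacyDA { bool Divergent; };
struct NewDA { bool Divergent; };

// Early-return shape suggested above: consult whichever analysis is
// present, and fall back to "not divergent" only when neither exists,
// instead of relying on a default-initialised flag.
bool isDivergent(const LegacyDA *Legacy, const NewDA *DA) {
  if (Legacy)
    return Legacy->Divergent;
  if (DA)
    return DA->Divergent;
  return false;
}
```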

llvm/lib/Transforms/Scalar/Sink.cpp
247

It seems odd that you explicitly choose the new DA here but the legacy DA in SinkingLegacyPass. I know they both have "legacy" in their name but that doesn't seem like a good reason. SinkingLegacyPass is just a wrapper for use with the legacy pass manager, but I don't see why it should necessarily use the legacy DA.

Or is there some problem like the legacy DA doesn't work with the new PM, which has forced you to do this?

sameerds updated this revision to Diff 362995.Jul 30 2021, 1:45 AM

The change was rebased, which may produce a very noisy diff relative to
the previous version of the change.

Replaced the bool ContainsIrreducible with a new enum that is always
consistent, irrespective of whether we are using legacy DA or new DA.
This also allows a command-line option to force "divergent" control
flow, so that we can test changes on the default target for better
coverage.
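A minimal sketch of what such a tri-state enum might look like; all names here are hypothetical, and the actual patch may carve up the states differently:

```cpp
// Hypothetical tri-state replacing the bool ContainsIrreducible: it is
// filled in consistently whether the legacy DA or the new DA ran, and
// one state models the command-line override that forces "divergent"
// control flow for test coverage.
enum class ControlFlowKind {
  Uniform,        // no divergent branches found by the analysis
  Divergent,      // divergent branches (or irreducible regions) present
  ForcedDivergent // forced via a command-line option for testing
};

// Queries then branch on the enum rather than on a possibly-stale bool.
bool treatAsDivergent(ControlFlowKind K) {
  return K != ControlFlowKind::Uniform;
}
```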

sameerds added inline comments.Jul 30 2021, 1:51 AM
llvm/lib/Analysis/DivergenceAnalysis.cpp
357–360

Addressed with a new enum that always has a consistent value. It has other uses too, so it is checked first.

llvm/lib/Transforms/Scalar/Sink.cpp
247

The legacy DA was never ported to the new pass manager. It can only be invoked from the old pass manager, where it is the default DA. See the RUN lines in the updated lit test to see all the valid combinations.

llvm/test/Transforms/Sink/convergent.ll
6

The earlier attempt had copied the test into two versions, one with the default target and one with AMDGPU. But this does not scale, since more changes in the pipeline will touch many other tests. Instead, the new option forces the new DA to report "divergent" for everything, which allows us to test convergent operations even on the default target. The tests can then run in every build, ensuring good coverage.
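The override described above might take roughly this shape; this is a sketch with invented names, not the patch's actual flag or query:

```cpp
// Stands in for a cl::opt<bool> that forces the divergence analysis to
// report "divergent" for every value, so convergent-operation tests can
// run on the default target instead of requiring an AMDGPU build.
static bool ForceDivergence = false;

// Sketch of a divergence query that short-circuits when the flag is set.
bool isDivergentUse(bool AnalysisSaysDivergent) {
  if (ForceDivergence)
    return true; // everything is treated as divergent under the option
  return AnalysisSaysDivergent;
}
```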

arsenm added inline comments.Jul 30 2021, 5:23 PM
llvm/lib/Transforms/Scalar/Sink.cpp
58–64

This is reinterpreting the IR semantics based on target information. There is no defined link between divergence and convergent, and I'm not sure there should even be one. There are multiple target-defined concepts of uniformity, and not all of them matter for convergent on AMDGPU. I think any kind of transform that wants to make this kind of assumption needs to be a specific divergence-aware control flow optimizer.

sameerds added inline comments.Aug 1 2021, 4:14 AM
llvm/lib/Transforms/Scalar/Sink.cpp
58–64

There is nothing target-specific about this change. The optimization is treating the convergent operation exactly the way it should be. And yes, there is a link between divergence and convergent operations. The current "definition" of convergent operations in the LangRef does not amount to much, and the actual implementation of it clearly reflects the link with divergence.

The link between convergent operations and divergence is being formalized,
here: https://reviews.llvm.org/D85603
and further made explicit here: https://reviews.llvm.org/D104504

There may be multiple concepts of uniformity, but that's an entirely separate enhancement. Independent of how many ways in which threads can diverge (wave, work-group, grid, whatever), convergent operations are all linked to the fact that threads diverge.

Every control flow transform in the optimizer should be aware of divergence. Divergence and convergent operations are two sides of the same coin, and the above linked reviews expose them everywhere in LLVM. There is no reason to sequester these concepts into a corner called a "divergence aware optimizer". There need not be any such thing.

sameerds updated this revision to Diff 364354.Aug 4 2021, 10:48 PM

Replaced the full-fledged divergence analysis with a trivial
DivergenceInfo object instead. The change also allows short-circuiting
the divergence analysis in lit tests, as demonstrated by changes to
LoopUnswitch.

Bump!

sameerds abandoned this revision.Jul 28 2022, 11:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 28 2022, 11:47 PM
Herald added a subscriber: kosarev. · View Herald Transcript