This is an archive of the discontinued LLVM Phabricator instance.

[Loop Fusion] Sink/hoist memory instructions between loop fusion candidates
ClosedPublic

Authored by aaronkintel on Aug 10 2022, 11:20 AM.

Details

Summary

Currently, instructions in the preheader of the second of two fusion candidates are sunk and hoisted whenever possible, to try to allow the loops to fuse. Memory instructions are skipped, and are never sunk or hoisted. This change adds memory instructions for sinking/hoisting consideration.

This change uses DependenceAnalysis to check if a mem inst in the preheader of FC1 depends on an instruction in FC0's header, across which it will be hoisted, or FC1's header, across which it will be sunk. We reject cases where the dependency is a data hazard.

Diff Detail

Event Timeline

aaronkintel created this revision.Aug 10 2022, 11:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2022, 11:20 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
aaronkintel requested review of this revision.Aug 10 2022, 11:20 AM
Whitney added a reviewer: Restricted Project.Aug 19 2022, 7:09 AM
Whitney added inline comments.Aug 22 2022, 9:48 AM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
1042

Please add a description of this function and the expected result/meaning of the arguments.

1046

This comment no longer needed in this context, as this function is for checking if the instruction can be hoisted.

1065

No need braces for single instruction block.

1082

Why only considering instructions in the header and not any other blocks in FC0 loop?

1097

Please add a description of this function.

1111

Please remove braces.

1115

Why only considering instructions in the header and not any other blocks in FC1 loop?

1152

nit: add a period at the end of a sentence.

1170–1174

Should we also skip other non-store instructions if they are volatile or atomic?

I can see that LLVM::Instruction has function isVolatile() and isAtomic() to check those.

llvm/test/Transforms/LoopFusion/no_sink_hoist_load.ll
2

Do we have test case for volatile or atomic?

llvm/test/Transforms/LoopFusion/no_sink_hoist_unknown_function.ll
13–14

; CHECK-NOT: call void @unknown_func()?

27–28

; CHECK-NOT: call void @unknown_func()?

llvm/test/Transforms/LoopFusion/simple.ll
512 ↗(On Diff #451576)

How is this change related to this patch?

llvm/test/Transforms/LoopFusion/sink_load.ll
2

do we have test cases for hoist memory instructions?

bmahjour added inline comments.
llvm/lib/Transforms/Scalar/LoopFuse.cpp
1065

why the call to mayHaveSideEffects has been removed?

1111

why the call to mayHaveSideEffects has been removed?

Responding to comments

aaronkintel marked 13 inline comments as done.Aug 24 2022, 12:43 PM
aaronkintel added inline comments.
llvm/lib/Transforms/Scalar/LoopFuse.cpp
1065

mayHaveSideEffects() returns mayWriteToMemory() || mayThrow() || !willReturn(), but we don't want to reject cases that may read to memory anymore.

1082

In a canonical LLVM loop,

1111

As above.

1170–1174

sUnordered() calls isVolatile() and checks atomicity under the hood.

llvm/test/Transforms/LoopFusion/simple.ll
512 ↗(On Diff #451576)

Lit test was otherwise broken.

Whitney added inline comments.Aug 24 2022, 1:23 PM
llvm/lib/Transforms/Scalar/LoopFuse.cpp
1042

please add \p before parameters. e.g., instruction \p I.

1042

You may want to specific is the end of the preheader.

1065

Can combine !I.mayReadFromMemory() && !I.mayWriteToMemory() to !I.mayReadOrWriteMemory().

1082

In a canonical LLVM loop,

Not sure what you mean by that. I can see that you changed it to traverse memory reads and writes, LGTM.

1103

You may want to specify is to the beginning of the exit block.

1170–1174

Right, but isUnordered() is not a member function of LLVM::Instruction. Your new change for this comment LGTM.

llvm/test/Transforms/LoopFusion/simple.ll
512 ↗(On Diff #451576)

Can you please elaborate more on why it was broken?

The changes made in this patch, notably canHoistInst() and canSinkInst() seems to have duplicate functionality from CodeMoverUtils.cpp. I'm wondering if we can reuse the code from CodeMoverUtils.cpp? Or is there something that canHoistInst() and canSinkInst() can do but functions from CodeMoverUtils.cpp cannot do?

llvm/test/Transforms/LoopFusion/simple.ll
537 ↗(On Diff #455332)

Why change from adding 1 to adding %add?

aaronkintel marked 2 inline comments as done.Aug 24 2022, 1:47 PM

The changes made in this patch, notably canHoistInst() and canSinkInst() seems to have duplicate functionality from CodeMoverUtils.cpp. I'm wondering if we can reuse the code from CodeMoverUtils.cpp? Or is there something that canHoistInst() and canSinkInst() can do but functions from CodeMoverUtils.cpp cannot do?

The difficulty I see with utilizing isSafeToMove{Before, After}() from CodeMoverUtils in this context is that when determining if an instruction in the preheader of FC1 can be sunk/hoisted, we need to take into account other instructions in the preheader of FC1 which are eligible for sinking/hoisting.

Consider this example:

define void @sink_preheader(i32 %N) {
pre1:
  br label %body1

body1:  ; preds = %pre1, %body1
  ...
  br i1 %cond, label %body1, label %pre2

pre2:
  %i1 = add i32 1, %N
  %i2 = add i32 1, %i1
  br label %body2

....
}

Both %i1 and %i2 can be hoisted, though isSafeToMoveBefore() would presumably not believe %i2 should be moved to the end of pre1. If you're seeing something I'm not, let me know and I'd be happy to simplify the code.

llvm/test/Transforms/LoopFusion/simple.ll
512 ↗(On Diff #451576)

In fact, it should still work without any changes. Will fix

Responding to comments.

The changes made in this patch, notably canHoistInst() and canSinkInst() seems to have duplicate functionality from CodeMoverUtils.cpp. I'm wondering if we can reuse the code from CodeMoverUtils.cpp? Or is there something that canHoistInst() and canSinkInst() can do but functions from CodeMoverUtils.cpp cannot do?

The difficulty I see with utilizing isSafeToMove{Before, After}() from CodeMoverUtils in this context is that when determining if an instruction in the preheader of FC1 can be sunk/hoisted, we need to take into account other instructions in the preheader of FC1 which are eligible for sinking/hoisting.

Consider this example:

define void @sink_preheader(i32 %N) {
pre1:
  br label %body1

body1:  ; preds = %pre1, %body1
  ...
  br i1 %cond, label %body1, label %pre2

pre2:
  %i1 = add i32 1, %N
  %i2 = add i32 1, %i1
  br label %body2

....
}

Both %i1 and %i2 can be hoisted, though isSafeToMoveBefore() would presumably not believe %i2 should be moved to the end of pre1. If you're seeing something I'm not, let me know and I'd be happy to simplify the code.

IIUC the problem you described has been addressed and handled in CodeMoverUtils already.

The function isSafeToMoveBefore() has the following signature where the last argument CheckForEntireBlock can be set to true such that previous instructions (before the current instruction that is under consideration) would be considered when checking if it is legal to move it. If I'm not mistaken, for the IR you posted we can actually move both %i1 and %i2 (by setting CheckForEntireBlock=true). You might want to check out this patch https://reviews.llvm.org/D110378, which introduced CheckForEntireBlock.

bool isSafeToMoveBefore(Instruction &I, Instruction &InsertPoint,
                        DominatorTree &DT,
                        const PostDominatorTree *PDT = nullptr,
                        DependenceInfo *DI = nullptr,
                        bool CheckForEntireBlock = false);

If I'm not mistaken (please correct me if I'm wrong), I wish we could reuse CodeMoverUtils since it could simplify things.

IIUC the problem you described has been addressed and handled in CodeMoverUtils already.

The function isSafeToMoveBefore() has the following signature where the last argument CheckForEntireBlock can be set to true such that previous instructions (before the current instruction that is under consideration) would be considered when checking if it is legal to move it. If I'm not mistaken, for the IR you posted we can actually move both %i1 and %i2 (by setting CheckForEntireBlock=true). You might want to check out this patch https://reviews.llvm.org/D110378, which introduced CheckForEntireBlock.

bool isSafeToMoveBefore(Instruction &I, Instruction &InsertPoint,
                        DominatorTree &DT,
                        const PostDominatorTree *PDT = nullptr,
                        DependenceInfo *DI = nullptr,
                        bool CheckForEntireBlock = false);

If I'm not mistaken (please correct me if I'm wrong), I wish we could reuse CodeMoverUtils since it could simplify things.

I tried some tests with isSafeToMoveBefore(). The sink_preheader.ll test fails, since %sinkme can be sunk but then %sinkme2 cannot sink, because isSafeToMoveBefore() apparently does not ignore users in the same block like it ignores operands in the same block when hoisting. The condition CheckForEntireBlock for sinking could be added, to mirror the case with hoisting, but I foresee other issues. For example, %sinkme3 in sink_preheader.ll should not be hoisted, though the only obstacle for hoisting it (%sinkme1 ) is within the same block as %sinkme3 itself; we need to know that %sinkme1 also cannot be hoisted, and it is insufficient to just ignore variable.

IIUC the problem you described has been addressed and handled in CodeMoverUtils already.

The function isSafeToMoveBefore() has the following signature where the last argument CheckForEntireBlock can be set to true such that previous instructions (before the current instruction that is under consideration) would be considered when checking if it is legal to move it. If I'm not mistaken, for the IR you posted we can actually move both %i1 and %i2 (by setting CheckForEntireBlock=true). You might want to check out this patch https://reviews.llvm.org/D110378, which introduced CheckForEntireBlock.

bool isSafeToMoveBefore(Instruction &I, Instruction &InsertPoint,
                        DominatorTree &DT,
                        const PostDominatorTree *PDT = nullptr,
                        DependenceInfo *DI = nullptr,
                        bool CheckForEntireBlock = false);

If I'm not mistaken (please correct me if I'm wrong), I wish we could reuse CodeMoverUtils since it could simplify things.

I tried some tests with isSafeToMoveBefore(). The sink_preheader.ll test fails, since %sinkme can be sunk but then %sinkme2 cannot sink, because isSafeToMoveBefore() apparently does not ignore users in the same block like it ignores operands in the same block when hoisting. The condition CheckForEntireBlock for sinking could be added, to mirror the case with hoisting, but I foresee other issues. For example, %sinkme3 in sink_preheader.ll should not be hoisted, though the only obstacle for hoisting it (%sinkme1 ) is within the same block as %sinkme3 itself; we need to know that %sinkme1 also cannot be hoisted, and it is insufficient to just ignore variable.

IMHO I don't think %sinkme3 will be hoisted. Maybe I should have made it more clear in my last comment: in loop fusion we would usually want to move an entire BB, for example moving the preheader of the second loop to the first loop in order to create an empty preheader. The API from CodeMoverUtils for this purpose is the following, where the first argument is BasicBlock&. Maybe it should be renamed to something like "isSafeToMoveBBBefore()" to be more clear.

bool llvm::isSafeToMoveBefore(BasicBlock &BB, Instruction &InsertPoint,
                              DominatorTree &DT, const PostDominatorTree *PDT,
                              DependenceInfo *DI) {
  return llvm::all_of(BB, [&](Instruction &I) {
    if (BB.getTerminator() == &I)
      return true;

    return isSafeToMoveBefore(I, InsertPoint, DT, PDT, DI,
                              /*CheckForEntireBlock=*/true);
  });
}

%sinkme3 won't be hoisted because %sinkme1 won't be hoisted at the first place since %barrier does not dominate the insertion point, and llvm::all_of() would bail out early. Therefore IIUC this would not be an issue.

I did not mean to block or suspend this patch, although I think it might be cleaner and simpler to add some code in isSafeToMoveBefore() to mirror the case with hoisting and then reuse isSafeToMoveBefore(), which seems sufficient to fullfil your purposes. I quickly looked at the logic in isSafeToMoveBefore() and it looks like adding some code under the if (isReachedBefore(&I, &InsertPoint, &DT, PDT)) condition to enable stronger sinking capability might just do the work, similar to the code added under if (isReachedBefore(&InsertPoint, &I, &DT, PDT)) that enabled stronger hoisting.

Again I'm open to any possible solution, let it be this patch or other approaches - just trying to provide some input.

IMHO I don't think %sinkme3 will be hoisted. Maybe I should have made it more clear in my last comment: in loop fusion we would usually want to move an entire BB, for example moving the preheader of the second loop to the first loop in order to create an empty preheader. The API from CodeMoverUtils for this purpose is the following, where the first argument is BasicBlock&. Maybe it should be renamed to something like "isSafeToMoveBBBefore()" to be more clear.

bool llvm::isSafeToMoveBefore(BasicBlock &BB, Instruction &InsertPoint,
                              DominatorTree &DT, const PostDominatorTree *PDT,
                              DependenceInfo *DI) {
  return llvm::all_of(BB, [&](Instruction &I) {
    if (BB.getTerminator() == &I)
      return true;

    return isSafeToMoveBefore(I, InsertPoint, DT, PDT, DI,
                              /*CheckForEntireBlock=*/true);
  });
}

%sinkme3 won't be hoisted because %sinkme1 won't be hoisted at the first place since %barrier does not dominate the insertion point, and llvm::all_of() would bail out early. Therefore IIUC this would not be an issue.

I did not mean to block or suspend this patch, although I think it might be cleaner and simpler to add some code in isSafeToMoveBefore() to mirror the case with hoisting and then reuse isSafeToMoveBefore(), which seems sufficient to fullfil your purposes. I quickly looked at the logic in isSafeToMoveBefore() and it looks like adding some code under the if (isReachedBefore(&I, &InsertPoint, &DT, PDT)) condition to enable stronger sinking capability might just do the work, similar to the code added under if (isReachedBefore(&InsertPoint, &I, &DT, PDT)) that enabled stronger hoisting.

Again I'm open to any possible solution, let it be this patch or other approaches - just trying to provide some input.

Ok, I think I see what you're getting at. The function isSafeToMoveBefore() could be used to sink/hoist the entire preheader of FC1 atomically, but it cannot handle the situation where some instructions can be sunk and some can be hoisted safely.

Fixing clang-format issues

aaronkintel marked 4 inline comments as done.Aug 31 2022, 8:29 AM

This change is building successfully. Are there any more reviewer comments?

Adding TODO to add the canSinkInst/canHoistInsts to codeMoverUtils.

Like I said I won't block the patch, I'm fine if other reviewers accept this patch.

For me personally I would suggest the // TODO: Move functionality into CodeMoverUtils really be done before I would accept it myself, since CodeMoverUtils looks like the place that fits better for the functionality that is added in this patch.

Whitney accepted this revision as: Whitney.Sep 6 2022, 12:49 PM

I agree with @congzhe that CodeMoverUtils fits better for the functionality that is added in this patch. I am ok that this being a 2-steps process, where you first add the functionality in loop fusion pass, and then improve CodeMoverUtils.
Given that @aaronkintel added the TODOs, I can approve it, but please do improve CodeMoverUtils.

This revision is now accepted and ready to land.Sep 6 2022, 12:49 PM
eopXD added a subscriber: eopXD.Sep 6 2022, 7:01 PM

May you have the new added tests be separated to a pre-commit patch so the effect of this patch can be more clear?

This revision was landed with ongoing or failed builds.Sep 7 2022, 4:59 AM
This revision was automatically updated to reflect the committed changes.