This is an archive of the discontinued LLVM Phabricator instance.

LICM: Hoist LOAD without STORE
ClosedPublic

Authored by djtodoro on Nov 5 2021, 8:24 AM.

Details

Summary

When doing load/store promotion within LICM, if we cannot prove that it is safe to sink the store we won't hoist the load, even though we can prove the load could be dereferenced and moved outside the loop. This patch implements the load promotion by moving it in the loop preheader by inserting proper PHI in the loop. The store is kept as is in the loop. By doing this, we avoid doing the load from a memory location in each iteration.

Please consider this small example:

loop {
  var = *ptr;
  if (var) break;
  *ptr= var + 1;
}

After this patch, it will be:

var0 = *ptr;
loop {
  var1 = phi (var0, var2);
  if (var1) break;
  var2 = var1 + 1;
  *ptr = var2;
}

This addresses some problems from [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=51193

Diff Detail

Event Timeline

djtodoro created this revision.Nov 5 2021, 8:24 AM
djtodoro requested review of this revision.Nov 5 2021, 8:24 AM
djtodoro retitled this revision from [WIP] LICM: Hoist LOAD without STORE to LICM: Hoist LOAD without STORE.Nov 16 2021, 2:06 AM
djtodoro added a reviewer: efriedma.
lebedev.ri requested changes to this revision.EditedNov 16 2021, 10:48 AM
lebedev.ri added a reviewer: reames.
lebedev.ri added subscribers: reames, lebedev.ri.

This seems to be in-line with our recent discussions about whether on not LICM is a general canonicalization pass,
or should it have target-specific configurations. @reames thoughts? (to be noted, this more applies to D113290, if it was done without a target hook it would not be a question)

This revision now requires changes to proceed.Nov 16 2021, 10:48 AM
efriedma added inline comments.Nov 16 2021, 10:55 AM
llvm/lib/Transforms/Scalar/LICM.cpp
2001

Maybe move this explanation to the definition of the variable "ShouldTryLoadHoistingOnly", since that's where it's relevant.

2300

Not sure I understand what you're doing with SavedInsts here.

If I'm following correctly, SavedInsts is empty the first time this is called from LoopInvariantCodeMotion::runOnLoop. On subsequent iterations, it contains the stores that correspond to loads that were optimized.

If you're trying to prevent an infinite loop, this isn't the way to do it; promoteLoopAccessesToScalars should just return false if there aren't any loads to promote. If you're just trying to save a little work in subsequent iterations, please leave it out for now; if there's actually some significant compile-time gain here, we can look into it as a followup.

llvm/test/Transforms/LICM/AArch64/hoist-load-without-store.ll
20 ↗(On Diff #385082)

I'd like to see check lines for the PHI nodes.

Do we want any other tests here? Doesn't feel right to have just one test. I guess the cases where this transforms fails are generally covered by other tests, though, so not sure what we'd need to cover.

lebedev.ri added inline comments.Nov 16 2021, 11:06 AM
llvm/lib/Transforms/Scalar/LICM.cpp
120–125

This is really ambiguous to me. Is this talking about
*only* about hoisting loads, and preventing hoisting stores,
or hoisting loads even if we can't hoist the store?

reames requested changes to this revision.Nov 16 2021, 11:11 AM

This needs much clearer motivation - specifically a clear explanation of what you're trying to achieve and why. Please consider that motivation blocking for further review.

This needs much clearer motivation - specifically a clear explanation of what you're trying to achieve and why. Please consider that motivation blocking for further review.

FWIW, i think the general idea behind this is *really* important, "partial" mem2reg is still a win ; D113520 is somewhat related.

djtodoro edited the summary of this revision. (Show Details)Nov 19 2021, 3:05 AM

Thanks a lot for your comments!

This needs much clearer motivation - specifically a clear explanation of what you're trying to achieve and why. Please consider that motivation blocking for further review.

I've updated the summary as a first step.

llvm/lib/Transforms/Scalar/LICM.cpp
120–125

Oh, I agree...

2001

Makes sense.

2300

Hmm, actually it doesn't improve anything, we should simply return false if there is no load to promote.

llvm/test/Transforms/LICM/AArch64/hoist-load-without-store.ll
20 ↗(On Diff #385082)

The more tests the better. I'll try to make some other tests as well. As a starting point, I'll update the CHECKS for this one.

djtodoro updated this revision to Diff 388452.Nov 19 2021, 3:14 AM
  • Addressing comments
  • Improving && reducing the test

The new summary describing the optimization is a bit confusing to try to parse. I'd suggest splitting into three paragraphs:

  1. What LLVM currently does on the testcase.
  2. How the current patch changes the code generation, and why it's an improvement.
  3. Potential future directions (fallow-store-data-races etc.)
djtodoro edited the summary of this revision. (Show Details)Nov 22 2021, 6:35 AM
djtodoro edited the summary of this revision. (Show Details)Nov 22 2021, 6:40 AM

Ok, let me describe back to you what you're trying to do. Give a case like:

loop {
  a = *g;
  if (a) break;
  *g = a + 1
}

Where g is some loop invariant address, you want to produce a form which looks like this:

a0 = *g
loop {
  a1 = phi (a0, a2)
  if (a1) break;
  a2 = a1 + 1
  *g = a2;
}

Assuming I've correctly described that, I agree this is valuable. I'm honestly a bit shocked we didn't already handle it. However, this should absolutely not be target dependent. It should simply be always enabled.

llvm/lib/Transforms/Scalar/LICM.cpp
1863

Please name this CanInsertStoresInExitBlocks and invert the code appropriately.

2118

Please move this above the if block. Next to the SawAtomic/SawNotAtomic is a good place.

2215–2218

This block simplifies with the removal of the flag.

llvm/lib/Transforms/Utils/SSAUpdater.cpp
446

I see what you're doing here, but this is really ugly. I think we need to separate the set of uses being visited from the set of uses being *deleted*.

Please add a "shouldDelete(Inst)" callback on the LoadAndStorePromoter, default it to answering yes, then implement that in the sub-class. Still not great, but at least a bit better.

llvm/test/Transforms/LICM/AArch64/hoist-load-without-store.ll
31 ↗(On Diff #388452)

Please use the simplest test possible. An IR form of my example from the review is probably good. You must strip all irrelevant IR details (e.g. metadata arguments, globals, etc..).

reames requested changes to this revision.Nov 22 2021, 3:57 PM
This revision now requires changes to proceed.Nov 22 2021, 3:57 PM

Ok, let me describe back to you what you're trying to do. Give a case like:

loop {
  a = *g;
  if (a) break;
  *g = a + 1
}

Where g is some loop invariant address, you want to produce a form which looks like this:

a0 = *g
loop {
  a1 = phi (a0, a2)
  if (a1) break;
  a2 = a1 + 1
  *g = a2;
}

Yep, that's it!

Assuming I've correctly described that, I agree this is valuable. I'm honestly a bit shocked we didn't already handle it. However, this should absolutely not be target dependent. It should simply be always enabled.

Thanks! I totally agree, so I removed the flag and abandoned the second patch from the stack that was target-dependent.

djtodoro updated this revision to Diff 389491.Nov 24 2021, 7:18 AM
  • removed the option (so this is enabled always)
  • reduced the test
  • refactored the code a bit
reames requested changes to this revision.Nov 24 2021, 11:32 AM

Heading in the right direction.

In addition to the inline comments, please update and simplify the submission comment. The comment should not explain the motivating workload (except maybe as a link to a bug), and should instead focus on explaining the effect of the transform. i.e. use a more minimal example, and explain *why* we're implementing the transform.

llvm/lib/Transforms/Scalar/LICM.cpp
1863

Comment is stale.

1952

The whole preserveduses mechanism seems like overkill. The preserveuses set should simply be every store in the use set when CanInstertStoreInExistingBlocks is false. If it's not, I'm missing something and we probably need some comments/tests.

2217

You've already proven dereferenceability (i.e. load legality) above. This check should be:
if (!SafeToInsertStore && !FoundLoadToPromote)

return false;
llvm/lib/Transforms/Utils/SSAUpdater.cpp
450

Drop isa<StoreInst> from this check. The shouldDelete interface should handle this.

This revision now requires changes to proceed.Nov 24 2021, 11:32 AM

Heading in the right direction.

In addition to the inline comments, please update and simplify the submission comment. The comment should not explain the motivating workload (except maybe as a link to a bug), and should instead focus on explaining the effect of the transform. i.e. use a more minimal example, and explain *why* we're implementing the transform.

It makes sense to me.

llvm/lib/Transforms/Scalar/LICM.cpp
1952

Yes... I've had some tests failing due to some other reason, but this shouldn't have been included. Thanks.

djtodoro updated this revision to Diff 389963.Nov 26 2021, 2:46 AM
  • Simplify the code a bit
  • Address comments
djtodoro edited the summary of this revision. (Show Details)Nov 26 2021, 3:00 AM
djtodoro edited the summary of this revision. (Show Details)
reames accepted this revision.Nov 29 2021, 12:10 PM

LGTM w/required changes before submission.

llvm/include/llvm/Transforms/Utils/SSAUpdater.h
173

Make this comment generic.

e.g. Will return true if a sub-class wants to keep one of the loads or stores after SSA construction.

llvm/lib/Transforms/Scalar/LICM.cpp
469

Whitespace only change - please remove.

llvm/test/Transforms/LICM/scalar-promote.ll
2

Please autogen this in a separate commit, then rebase your change over it so that diff is visible in commit.

djtodoro added a subscriber: nikic.Nov 30 2021, 5:23 AM

@reames Thanks a lot for the comments!

llvm/include/llvm/Transforms/Utils/SSAUpdater.h
173

Actually opposite, but it makes sense.

llvm/test/Transforms/LICM/scalar-promote.ll
2

Sure. @nikic already addressed that with eee035235eb.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 1 2021, 4:28 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
djtodoro updated this revision to Diff 391009.Dec 1 2021, 6:46 AM
  • Update the llvm/test/Transforms/LICM/scalar-promote-opaque-ptrs.ll

@lebedev.ri is this ok to go as is?

lebedev.ri added inline comments.Dec 1 2021, 6:52 AM
llvm/test/Transforms/LICM/hoist-load-without-store.ll
1

I mean, can you just commit this test file first, with proper check lines,
so that they are only changed by the following patch?

djtodoro added inline comments.Dec 1 2021, 6:55 AM
llvm/test/Transforms/LICM/hoist-load-without-store.ll
1

Got it! Sure. I'll do that.

I thought you wanted to double check the new changes in the failing tests.

nikic added inline comments.Dec 1 2021, 7:06 AM
llvm/test/Transforms/LICM/scalar-promote.ll
312

Comment should probably be updated.

djtodoro added inline comments.Dec 1 2021, 7:09 AM
llvm/test/Transforms/LICM/scalar-promote.ll
312

I'll do that before commiting. Thanks!

This revision was landed with ongoing or failed builds.Dec 2 2021, 3:54 AM
uabelho added a subscriber: uabelho.Jun 7 2022, 5:09 AM

I wrote an issue about LICM behaving differently with/without debug information with this patch:
https://github.com/llvm/llvm-project/issues/55915

Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2022, 5:09 AM