This is an archive of the discontinued LLVM Phabricator instance.

Differential D123473

[LICM] Only create load in pre-header when promoting load.
ClosedPublic

Authored by fhahn on Apr 10 2022, 2:26 PM.

Details

Reviewers

nikic
nlopes
reames
aqjune

Commits

rG42229b96bf94: [LICM] Only create load in pre-header when promoting load.

Summary

When only a store is sunk, there is no need to create a load in the
pre-header, as the result of the load will never get used.

The dead load can introduce UB if the function is marked as
writeonly.

Fixes #51248.
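
A minimal source-level sketch of the situation the summary describes, written in C++ rather than IR. The names `storeInLoop` and `storeSunk` are hypothetical; `storeSunk` only models what promotion amounts to when the loop contains nothing but a store that is guaranteed to execute, so the sunk store never needs the initial value of `*p`.

```cpp
#include <cassert>

// Original form: the loop only ever writes *p. The body runs at least once
// (do-while), so the store is guaranteed to execute.
void storeInLoop(int *p, int n) {
  assert(p != nullptr);
  int i = 0;
  do {
    *p = i * 2;
    ++i;
  } while (i < n);
}

// Promoted form: the value lives in a local and is stored once after the
// loop. No initial read of *p is needed, because 'promoted' is always
// overwritten before the sunk store uses it.
void storeSunk(int *p, int n) {
  assert(p != nullptr);
  int promoted; // deliberately not initialised from *p
  int i = 0;
  do {
    promoted = i * 2;
    ++i;
  } while (i < n);
  *p = promoted;
}
```

Both functions leave the same value in `*p` for any `n`, which is why a load of `*p` ahead of the loop would be dead in this case.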

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
60,030 ms	x64 debian > libFuzzer.libFuzzer::large.test

Event Timeline

fhahn created this revision. Apr 10 2022, 2:26 PM

Herald added a project: Restricted Project. Apr 10 2022, 2:26 PM

Herald added subscribers: asbirlea, hiraditya.

fhahn requested review of this revision. Apr 10 2022, 2:26 PM

Harbormaster completed remote builds in B158928: Diff 421812. Apr 10 2022, 3:11 PM

nikic added inline comments. Apr 11 2022, 2:18 AM

llvm/lib/Transforms/Scalar/LICM.cpp
2180	Do we need to add the poison value here at all? If we don't add it, shouldn't SSAUpdater avoid creating the phi altogether and directly use the value stored in the loop?

Update patch to only add available value when needed.

fhahn added inline comments. Apr 11 2022, 4:30 AM

llvm/lib/Transforms/Scalar/LICM.cpp
2180	We don't need to add the poison value. We still need the phi node in the test case, because the store doesn't dominate the exit. The SSA updater uses undef instead of poison.

Harbormaster completed remote builds in B158973: Diff 421874. Apr 11 2022, 5:32 AM

LGTM

This revision is now accepted and ready to land. Apr 11 2022, 5:35 AM

I think the patch is not the full fix, as far as I understand. It's a good improvement, perf & correctness wise, though, but doesn't fix #51248.

There's one missing case: the load is needed and the function can't load memory (writeonly, readnone, etc?). In that case the optimization should bail out.

In D123473#3442753, @nlopes wrote:

I think the patch is not the full fix, as far as I understand. It's a good improvement, perf & correctness wise, though, but doesn't fix #51248.

There's one missing case: the load is needed and the function can't load memory (writeonly, readnone, etc?). In that case the optimization should bail out.

I think at the moment, the load will only be generated if the source already had a load that is guaranteed to execute. So hoisting that in a write only function should be fine, as it is already UB.

In D123473#3442779, @fhahn wrote:

In D123473#3442753, @nlopes wrote:

I think the patch is not the full fix, as far as I understand. It's a good improvement, perf & correctness wise, though, but doesn't fix #51248.

There's one missing case: the load is needed and the function can't load memory (writeonly, readnone, etc?). In that case the optimization should bail out.

I think at the moment, the load will only be generated if the source already had a load that is guaranteed to execute. So hoisting that in a write only function should be fine, as it is already UB.

OK then, thank you!
LGTM

This revision was landed with ongoing or failed builds. Apr 11 2022, 7:45 AM

Closed by commit rG42229b96bf94: [LICM] Only create load in pre-header when promoting load. (authored by fhahn).

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rG42229b96bf94: [LICM] Only create load in pre-header when promoting load.

fhahn added a reverting change: rG1ddc719680c2: Revert "[LICM] Only create load in pre-header when promoting load.". Apr 11 2022, 8:38 AM

fhahn added inline comments. Apr 11 2022, 10:36 AM

llvm/lib/Transforms/Scalar/LICM.cpp
2180	Hm, it looks like adding the available value unconditionally was needed for the code later on to preserve LCSSA form. Otherwise SSAUpdater may introduce non-LCSSA uses later on. I am planning on re-committing with a poison value for the pre-header to fix this.

After this got recommitted, our bot has an issue during stage2 with llvm-tblgen crashing. Have you seen this elsewhere?

Oh, I see you already reverted the recommit, never mind.

In case it helps, I was diagnosing a regression in Rust that bisected to the recommit of this when the re-revert landed. https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/9758 has the failure, though I imagine you'd want some un-optimized IR for the test case or something. If you'd like that please let me know and I can get something together. Thanks!

If you're looking for a testcase for the latest failure, this change appears to cause a miscompile in llvm::VersionTuple::tryParse, at least in our internal bootstrap builds.

fhahn mentioned this in rG5e54a413de1f: [LICM] Add additional writeonly tests, check attributes. Apr 20 2022, 10:49 AM

fhahn reopened this revision. Apr 20 2022, 11:48 AM

This revision is now accepted and ready to land. Apr 20 2022, 11:48 AM

I think this needs another look. AFAICT the original version was causing some miscompiles, because *if* the object we store to is function-local (alloca), we sink the store even if the store is not guaranteed to execute (tests added in 5e54a413de1f803).

I updated the patch to introduce the load, unless the store is guaranteed to execute. If the load is introduced, the writeonly attribute is removed. This isn't ideal, because the attribute could have knock-on effects for functions calling the modified function.

Maybe a better solution would be to clarify the wording for writeonly in langref to be clear that the attribute only considers writes to memory visible to its callers. Then we can retain writeonly even if loads to local objects are introduced. I doubt considering reads to function-local objects for writeonly adds any value.
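
The decision described in this update can be sketched as a small standalone C++ function. This is only a model of the condition `FoundLoadToPromote || !StoreIsGuanteedToExecute` that appears in Diff 423986; the enum `PreheaderValue` and the function `choosePreheaderValue` are hypothetical names introduced for illustration and are not part of LLVM.

```cpp
#include <cassert>

// Hypothetical stand-in for the value seeded into the preheader.
enum class PreheaderValue { Load, Poison };

// Mirrors the condition from the diff: a real load is created when a load
// is being promoted, or when no store is known to execute on every path
// through the loop. Otherwise a poison placeholder is enough.
constexpr PreheaderValue choosePreheaderValue(bool foundLoadToPromote,
                                              bool storeIsGuaranteedToExecute) {
  return (foundLoadToPromote || !storeIsGuaranteedToExecute)
             ? PreheaderValue::Load
             : PreheaderValue::Poison;
}
```

Only the combination "no load to promote, store guaranteed to execute" avoids the load, which matches the store-only case from the summary.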

Maybe a better solution would be to clarify the wording for writeonly in langref to be clear that the attribute only considers writes to memory visible to its callers. Then we can retain writeonly even if loads to local objects are introduced. I doubt considering reads to function-local objects for writeonly adds any value.

These are already accepted semantics for memory attributes, see e.g. the code in checkFunctionMemoryAccess(), which ignores local memory. The wording in LangRef is "If a writeonly function reads memory visible to the program", which is a bit unclear, but probably the intent of "visible to the program" here is "visible to the caller".
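
As a rough illustration of "ignores local memory", the following C++ sketch classifies a list of accesses. It is not the code of checkFunctionMemoryAccess(); the types `Target` and `Access` and the function `compatibleWithWriteOnly` are hypothetical and assume each access has already been classified as local or caller-visible.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of the memory an access touches.
enum class Target { Local, CallerVisible }; // Local: e.g. an alloca

struct Access {
  bool isRead;
  Target target;
};

// Under the "visible to the caller" reading, a function body is compatible
// with writeonly if it never reads caller-visible memory. Reads and writes
// of function-local objects are ignored.
bool compatibleWithWriteOnly(const std::vector<Access> &accesses) {
  for (const Access &a : accesses)
    if (a.isRead && a.target == Target::CallerVisible)
      return false;
  return true;
}
```

With this reading, a load introduced for a promoted alloca would not conflict with the attribute, while a load from a global or an argument pointer would.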

Harbormaster completed remote builds in B160500: Diff 423986. Apr 20 2022, 12:54 PM

fhahn mentioned this in D124124: [LangRef] Limit read/writeonly attrs to memory visible to caller. Apr 20 2022, 2:07 PM

In D123473#3462792, @nikic wrote:

Maybe a better solution would be to clarify the wording for writeonly in langref to be clear that the attribute only considers writes to memory visible to its callers. Then we can retain writeonly even if loads to local objects are introduced. I doubt considering reads to function-local objects for writeonly adds any value.

These are already accepted semantics for memory attributes, see e.g. the code in checkFunctionMemoryAccess(), which ignores local memory. The wording in LangRef is "If a writeonly function reads memory visible to the program", which is a bit unclear, but probably the intent of "visible to the program" here is "visible to the caller".

Right, Alive2 implements that semantics. Memory accesses to local allocas/mallocs are ok in functions that restrict loads and/or stores.

In D123473#3462792, @nikic wrote:

Maybe a better solution would be to clarify the wording for writeonly in langref to be clear that the attribute only considers writes to memory visible to its callers. Then we can retain writeonly even if loads to local objects are introduced. I doubt considering reads to function-local objects for writeonly adds any value.

These are already accepted semantics for memory attributes, see e.g. the code in checkFunctionMemoryAccess(), which ignores local memory. The wording in LangRef is "If a writeonly function reads memory visible to the program", which is a bit unclear, but probably the intent of "visible to the program" here is "visible to the caller".

Yep, that's the ambiguity that would be good to clarify. I put up D124124.

nlopes added inline comments. Apr 20 2022, 2:10 PM

llvm/lib/Transforms/Scalar/LICM.cpp
2165	we can't remove the writeonly attribute locally. We would need to remove it transitively as it's illegal for a writeonly function to call a non-writeonly function. We need to bail out if a non-local store is needed.
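
To illustrate why a local removal is not enough, here is a small C++ sketch that computes which functions would be affected if one function lost the attribute. The call-graph representation (`CallerMap`) and the function `functionsLosingAttribute` are hypothetical; the sketch conservatively assumes every transitive caller carries the attribute.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical reverse call graph: function name -> names of its callers.
using CallerMap = std::map<std::string, std::vector<std::string>>;

// Returns 'start' plus all of its transitive callers, i.e. every function
// that would have to drop the attribute if 'start' dropped it (assuming,
// for this sketch, that all of them are marked with it).
std::set<std::string> functionsLosingAttribute(const CallerMap &callers,
                                               const std::string &start) {
  std::set<std::string> affected{start};
  std::queue<std::string> worklist;
  worklist.push(start);
  while (!worklist.empty()) {
    std::string fn = worklist.front();
    worklist.pop();
    auto it = callers.find(fn);
    if (it == callers.end())
      continue; // no known callers
    for (const std::string &caller : it->second)
      if (affected.insert(caller).second) // visit each caller once
        worklist.push(caller);
  }
  return affected;
}
```

A function pass such as LICM only sees one function at a time, so it cannot perform this kind of whole-program update, which is the motivation for bailing out instead.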

fhahn mentioned this in rGce3bb82e4503: [LICM] Add test for writeonly fn with noalias call. Apr 22 2022, 1:46 PM

Update the patch to not remove the writeonly attribute, but instead verify that the writeonly semantics are not violated if we need to introduce a load, as clarified in D124124. If the store is not guaranteed to execute, a new load should only be introduced for non-escaping noalias calls or allocas.

fhahn marked an inline comment as done. Apr 22 2022, 2:12 PM

fhahn added inline comments.

llvm/lib/Transforms/Scalar/LICM.cpp
2165	Yeah, I tried to allude to that in the message for the update. Updated to assert no problematic loads will be introduced in write only functions as per clarification in D124124.

Harbormaster completed remote builds in B160951: Diff 424602. Apr 22 2022, 3:35 PM

I'm surprised the assertion is not an if. But if it doesn't trigger, then LGTM.

fhahn mentioned this in rGb00fd352777d: [LangRef] Limit readnone,read/writeonly to memory visible outside the fn. Apr 25 2022, 3:33 AM

In D123473#3470417, @nlopes wrote:

I'm surprised the assertion is not an if. But if it doesn't trigger, then LGTM.

Yeah, I couldn't find an example where this would trigger. The hope is that if it triggers we will soon have a test case.

@regehr If you have time, it would be nice to fuzz LICM to check if this assertion can be triggered (after the patch lands).
It needs a function with the writeonly attribute + a pointer as input (or a global variable), a loop with a store to said pointer, a phi, and that's it (i.e., the test cases here as seed). Then some random stuff to make it crash.
Thanks!

Coming into this quite late, but I'm concerned about this patch and the reasoning discussed in this thread.

Consider the fact that the original program may have contained a load along a dynamically dead path inside a loop contained by a writeonly function. Even if we interpret the load as being UB when executed, the original program is well defined. If we then speculatively insert a load into the preheader - which we do based on speculation safety, even with this patch - then the load will execute in a writeonly function.

Before:

loop { 
  v = 0
  if (never_taken) {
    v = load addr
  }
  store v, addr
}

After:

v1 = load addr
loop { 
  v = 0
  if (never_taken) {
    v = v1
  }
  store v, addr
}

If we consider such a load immediate UB, then this is not a valid transform. However, consider that this is the same transform that LICM performs when hoisting (i.e., not store promotion).

If we actually want to prevent this transform, I think we'd have to consider loads in writeonly functions not safe to speculate.

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

If we accept that semantic, then this patch becomes completely unnecessary.
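
The before/after pseudocode earlier in this comment can be written as a small C++ sketch. The function names `before` and `after` and the loop bound `n` are assumptions added for illustration; plain C++ has no writeonly attribute, so the sketch only shows that the hoisted load executes on a path where the original load did not.

```cpp
#include <cassert>

// 'Before': the load of *addr sits on a path guarded by neverTaken.
// When neverTaken is false, *addr is never read.
void before(int *addr, bool neverTaken, int n) {
  assert(addr != nullptr);
  for (int i = 0; i < n; ++i) {
    int v = 0;
    if (neverTaken)
      v = *addr;
    *addr = v;
  }
}

// 'After': the load is hoisted ahead of the loop and executes
// unconditionally, even when neverTaken is false or n is zero.
void after(int *addr, bool neverTaken, int n) {
  assert(addr != nullptr);
  int v1 = *addr; // speculated load
  for (int i = 0; i < n; ++i) {
    int v = 0;
    if (neverTaken)
      v = v1;
    *addr = v;
  }
}
```

The two functions store the same values, so the only observable difference is the extra read in `after`. Whether that read is immediate UB or merely yields poison in a writeonly function is exactly the question raised here.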

In D123473#3484702, @reames wrote:

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

You've a point. My concern is that it's surprising to have LLVM introducing a load in a function that's not supposed to do any load. I was thinking about I/O, but those accesses should be volatile. For kernel code it may not be ok, but probably they also use volatile to prevent the compiler messing around instead of relying on writeonly.
I can give it a try. I will run Alive2 on the LLVM test suite and see if anything fails.
Nevertheless, as mentioned below, whether the load is UB or returns poison is orthogonal to this patch.

In D123473#3484702, @reames wrote:

If we accept that semantic, then this patch becomes completely unnecessary.

Not true. The loop may not execute and thus you need the previous value. If the loaded value is poison, you still can't do the transformation.
Plus this patch removes useless loads; it's a perf win.

nlopes added a reviewer: aqjune. May 2 2022, 4:02 AM

In D123473#3484702, @reames wrote:

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

If there is a transformation that turns load in a writeonly function into unreachable, defining it as UB would be necessary.
However, my guess is that existing optimizations won't be that aggressive; perhaps for existing opts, poison might be enough. This is still my guess, though.

In D123473#3485355, @nlopes wrote:

In D123473#3484702, @reames wrote:

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

You've a point. My concern is that it's surprising to have LLVM introducing a load in a function that's not supposed to do any load. I was thinking about I/O, but those accesses should be volatile. For kernel code it may not be ok, but probably they also use volatile to prevent the compiler messing around instead of relying on writeonly

Something can be surprising, and yet a natural consequence of other reasonable choices. I think this is one of them.

On the topic of volatile, you raise an interesting point. We don't seem to be suppressing speculation of volatile loads in the common utility. That sounds like a bug, and likely worth a patch of its own. It's much broader than LICM though.

In D123473#3484702, @reames wrote:

If we accept that semantic, then this patch becomes completely unnecessary.

Not true. The loop may not execute and thus you need the previous value. If the loaded value is poison, you still can't do the transformation.

Yes, you can. That's my point. If the code which branches on it executes, then you have UB. Not before.
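
A toy C++ model of the "load yields poison" reading may help here. It is a deliberate simplification: `std::nullopt` stands in for poison, and the names `MaybePoison`, `speculatedLoad` and `useIfTaken` are hypothetical. In LLVM only certain uses of poison (such as branching on it) are UB; the sketch collapses that to "observing the value".

```cpp
#include <cassert>
#include <optional>

// Hypothetical model: std::nullopt stands in for a poison value.
using MaybePoison = std::optional<int>;

// Models a speculated load under the poison reading: it always produces
// something (possibly poison) and is never an error by itself.
MaybePoison speculatedLoad(bool readAllowed, int memoryValue) {
  if (!readAllowed)
    return std::nullopt; // the load yields poison instead of being UB
  return memoryValue;
}

// The loaded value is only observed when 'taken' is true, so a poison
// result on a dynamically dead path is harmless.
int useIfTaken(bool taken, const MaybePoison &loaded) {
  if (!taken)
    return 0; // poison is never observed on this path
  assert(loaded.has_value() && "observing poison would be UB");
  return *loaded;
}
```

In this model the error is attached to the use, not to the load, which is the distinction being argued in the reply above.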

Plus this patch removes useless loads; it's a perf win.

Hm, this is a minor argument for folding load of caller accessible memory in writeonly to poison as an instcombine rule, but is pretty unconvincing here.

In D123473#3485641, @aqjune wrote:

In D123473#3484702, @reames wrote:

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

If there is a transformation that turns load in a writeonly function into unreachable, defining it as UB would be necessary.
However, my guess is that existing optimizations won't be that aggressive; perhaps for existing opts, poison might be enough. This is still my guess, though.

If you find it, let me know. We can debate removing it.

In D123473#3485355, @nlopes wrote:

In D123473#3484702, @reames wrote:

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

You've a point. My concern is that it's surprising to have LLVM introducing a load in a function that's not supposed to do any load. I was thinking about I/O, but those accesses should be volatile. For kernel code it may not be ok, but probably they also use volatile to prevent the compiler messing around instead of relying on writeonly

I ran LLVM's test suite under Alive2 and found no issues with the load-is-poison semantics for writeonly & argmemonly functions.

Giving it a second thought, I agree with you. These attributes are not there to prevent optimizations, but to enable them. The compiler can do whatever it wants as long as it respects the memory model. The kernel may have a different idea about what the memory model looks like, but for now we don't support such special casing like gcc does (or used to).

So I think we can change it. It's generally better to keep UB usage as low as possible, and yield poison whenever possible. So I favor the change.

As for this patch, it optimizes the case where the loop & store are guaranteed to execute, so no load is needed. This is independent of writeonly.
See this example:

define void @test(i8 %var, ptr %ptr) {
entry:
  br label %for.cond

for.cond:
  %i = phi i64 [ 0, %entry ], [ %add, %cond.end ]
  %cmp = icmp ult i64 %i, 2
  br i1 %cmp, label %for.body39, label %for.end

for.body39:
  %div = sdiv i8 %var, 3
  %cmp2 = icmp slt i8 %div, 0
  br i1 %cmp2, label %cond.true, label %cond.false

cond.true:
  br label %cond.end

cond.false:
  br label %cond.end

cond.end:
  %merge = phi i8 [ %div, %cond.true ], [ 0, %cond.false ]
  store i8 %merge, ptr %ptr, align 1
  %add = add i64 %i, 4
  br label %for.cond

for.end:
  ret void
}

LICM currently produces a load that is replaced with poison with this patch:

define void @test(i8 %var, ptr %ptr) {
entry:
  %div = sdiv i8 %var, 3
  %cmp2 = icmp slt i8 %div, 0
  %ptr.promoted = load i8, ptr %ptr, align 1
  br label %for.cond
...
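
For readers less familiar with IR, the example above corresponds roughly to the following C++ sketch. The names `testOriginal` and `testPromoted` are hypothetical, and `testPromoted` assumes, as the comment states, that the loop and the store are guaranteed to execute; the zero initialiser merely stands in for the poison value, since reading an uninitialised variable would itself be UB in C++.

```cpp
#include <cassert>
#include <cstdint>

// Source-level rendering of the IR: the loop body runs exactly once
// (i = 0 passes i < 2, then i = 4 fails), so the store always executes.
void testOriginal(std::int8_t var, std::int8_t *ptr) {
  assert(ptr != nullptr);
  for (std::uint64_t i = 0; i < 2; i += 4) {
    std::int8_t div = static_cast<std::int8_t>(var / 3);
    std::int8_t merge = (div < 0) ? div : std::int8_t{0};
    *ptr = merge;
  }
}

// Promotion without a preheader load: the division is hoisted, the stored
// value is kept in a local, and one store happens after the loop.
void testPromoted(std::int8_t var, std::int8_t *ptr) {
  assert(ptr != nullptr);
  std::int8_t div = static_cast<std::int8_t>(var / 3); // hoisted
  std::int8_t promoted = 0; // stands in for poison; overwritten below
  for (std::uint64_t i = 0; i < 2; i += 4)
    promoted = (div < 0) ? div : std::int8_t{0};
  *ptr = promoted;
}
```

Neither version ever needs the value that `*ptr` held on entry, which is why the `%ptr.promoted` load can be dropped regardless of writeonly.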

fhahn mentioned this in rG3497a4f39601: [LICM] Add test to exercise assertion from D123473. May 5 2022, 2:50 AM

In D123473#3491746, @nlopes wrote:

In D123473#3485355, @nlopes wrote:

In D123473#3484702, @reames wrote:

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

You've a point. My concern is that it's surprising to have LLVM introducing a load in a function that's not supposed to do any load. I was thinking about I/O, but those accesses should be volatile. For kernel code it may not be ok, but probably they also use volatile to prevent the compiler messing around instead of relying on writeonly

While I think that's a great first data point, the coverage of writeonly functions is probably not too extensive unfortunately. Another interesting data point may be to (randomly?) add writeonly attributes to the existing tests and verify that.

I ran LLVM's test suite under Alive2 and found no issues with the load-is-poison semantics for writeonly & argmemonly functions.

Giving it a second thought, I agree with you. These attributes are not there to prevent optimizations, but to enable them. The compiler can do whatever it wants as long as it respects the memory model. The kernel may have a different idea about what the memory model looks like, but for now we don't support such special casing like gcc does (or used to).

So I think we can change it. It's generally better to keep UB usage as low as possible, and yield poison whenever possible. So I favor the change.

I think it makes sense in isolation. But I think we should be careful to consider consistency with other/related attributes. At the moment, violations of a set of related memory attributes all have the same effect, UB. If we decide to change that, I think we should update at least all related attributes (readnone, inaccessiblememonly, inaccessiblemem_or_argmemonly, argmemonly, !dereferenceable metadata,...). For attributes that forbid writing to memory, we would need to retain UB?

Changing the langref is straightforward, but enforcing it is difficult. Is there anything we could do to be more confident when adjusting the semantics?

I added a test that triggers the assertion in the current version of the patch (3497a4f39601).

As for this patch, it optimizes the case where the loop & store are guaranteed to execute, so no load is needed. This is independent of writeonly.

Yeah, this should be an independent improvement. I am leaning towards landing the improvement without the assertion. And leave resolution of #51248 until we either adjust LangRef or add a bailout. I probably won't have time in the immediate future to work on updating the langref and possibly audit existing code.

In D123473#3493474, @fhahn wrote:

In D123473#3491746, @nlopes wrote:

In D123473#3485355, @nlopes wrote:

In D123473#3484702, @reames wrote:

I don't think we actually want to do that. Instead, I think we're making a mistake by considering the load to be immediate UB as opposed to simply returning poison in this case.

You've a point. My concern is that it's surprising to have LLVM introducing a load in a function that's not supposed to do any load. I was thinking about I/O, but those accesses should be volatile. For kernel code it may not be ok, but probably they also use volatile to prevent the compiler messing around instead of relying on writeonly

While I think that's a great first data point, the coverage of writeonly functions is probably not too extensive unfortunately. Another interesting data point may be to (randomly?) add writeonly attributes to the existing tests and verify that.

Agreed. There are very few tests for writeonly. The good news is that it isn't heavily used either.

I ran LLVM's test suite under Alive2 and found no issues with the load-is-poison semantics for writeonly & argmemonly functions.

Giving it a second thought, I agree with you. These attributes are not there to prevent optimizations, but to enable them. The compiler can do whatever it wants as long as it respects the memory model. The kernel may have a different idea about what the memory model looks like, but for now we don't support such special casing like gcc does (or used to).

So I think we can change it. It's generally better to keep UB usage as low as possible, and yield poison whenever possible. So I favor the change.

I think it makes sense in isolation. But I think we should be careful to consider consistency with other/related attributes. At the moment, violations of a set of related memory attributes all have the same effect, UB. If we decide to change that, I think we should update at least all related attributes (readnone, inaccessiblememonly, inaccessiblemem_or_argmemonly, argmemonly, !dereferenceable metadata,...).

These kinds of attributes are usually added by LLVM itself; they are not specified by hand by users. Hence we are freer to select the semantics.
There are valid reasons to stop the compiler from hoisting loads, but I don't think writeonly is the right attribute. It was probably added by LLVM to denote a function that is functional: it may write to memory, but it will always write the same thing. So repeated calls can be removed. Introducing loads that yield poison doesn't change a thing, as long as their result is not used at run-time.
So for the uses of writeonly that we have today, I think yielding poison is sufficient. Nevertheless, (part of) your patch is still useful regardless of the semantics to avoid creating useless loads.

For attributes that forbid writing to memory, we would need to retain UB?

Yes, as we use the "nowrite" attributes to determine whether a function can be hoisted, etc., so we can't introduce a store. Also, introducing stores is already forbidden in many cases.

Changing the langref is straight-forward, but enforcing it is difficult. Is there anything we could do to be more confident when adjusting the semantics?

The best thing we have today is fuzzing + Alive2. @regehr has been working on the fuzzing side of things. This strategy has been reasonably successful in testing semantics. The only caveat is that Alive2 doesn't support IPO, so we disable it for the Attributor and related things to avoid false-positives. For attributes that's not ideal.

I went ahead and recommitted the patch as 0776c48f9b7e. As per earlier discussions, the committed version doesn't fix the Github issue.

uabelho added a subscriber: uabelho. May 30 2022, 2:08 AM

Revision Contents

Path	Size
llvm/lib/Transforms/Scalar/LICM.cpp	46 lines
llvm/test/Transforms/LICM/scalar-promote.ll	12 lines

Diff 423986

llvm/lib/Transforms/Scalar/LICM.cpp

Show First 20 Lines • Show All 1,953 Lines • ▼ Show 20 Lines	bool llvm::promoteLoopAccessesToScalars(
// blocks is reached, the original dynamic path would have taken us through		// blocks is reached, the original dynamic path would have taken us through
// the store, so inserting a store into the exit block is safe. Note that this		// the store, so inserting a store into the exit block is safe. Note that this
// is different from the store being guaranteed to execute. For instance,		// is different from the store being guaranteed to execute. For instance,
// if an exception is thrown on the first iteration of the loop, the original		// if an exception is thrown on the first iteration of the loop, the original
// store is never executed, but the exit blocks are not executed either.		// store is never executed, but the exit blocks are not executed either.

bool DereferenceableInPH = false;		bool DereferenceableInPH = false;
bool SafeToInsertStore = false;		bool SafeToInsertStore = false;
		bool StoreIsGuanteedToExecute = false;
bool FoundLoadToPromote = false;		bool FoundLoadToPromote = false;

SmallVector<Instruction *, 64> LoopUses;		SmallVector<Instruction *, 64> LoopUses;

// We start with an alignment of one and try to find instructions that allow		// We start with an alignment of one and try to find instructions that allow
// us to prove better alignment.		// us to prove better alignment.
Align Alignment;		Align Alignment;
// Keep track of which types of access we see		// Keep track of which types of access we see
▲ Show 20 Lines • Show All 64 Lines • ▼ Show 20 Lines	for (Use &U : ASIV->uses()) {
SawNotAtomic \|= !Store->isAtomic();		SawNotAtomic \|= !Store->isAtomic();

// If the store is guaranteed to execute, both properties are satisfied.		// If the store is guaranteed to execute, both properties are satisfied.
// We may want to check if a store is guaranteed to execute even if we		// We may want to check if a store is guaranteed to execute even if we
// already know that promotion is safe, since it may have higher		// already know that promotion is safe, since it may have higher
// alignment than any other guaranteed stores, in which case we can		// alignment than any other guaranteed stores, in which case we can
// raise the alignment on the promoted store.		// raise the alignment on the promoted store.
Align InstAlignment = Store->getAlign();		Align InstAlignment = Store->getAlign();
		bool GuaranteedToExecute =
		SafetyInfo->isGuaranteedToExecute(*UI, DT, CurLoop);
		StoreIsGuanteedToExecute \|= GuaranteedToExecute;
if (!DereferenceableInPH \|\| !SafeToInsertStore \|\|		if (!DereferenceableInPH \|\| !SafeToInsertStore \|\|
(InstAlignment > Alignment)) {		(InstAlignment > Alignment)) {
if (SafetyInfo->isGuaranteedToExecute(*UI, DT, CurLoop)) {		if (GuaranteedToExecute) {
DereferenceableInPH = true;		DereferenceableInPH = true;
SafeToInsertStore = true;		SafeToInsertStore = true;
Alignment = std::max(Alignment, InstAlignment);		Alignment = std::max(Alignment, InstAlignment);
}		}
}		}

// If a store dominates all exit blocks, it is safe to sink.		// If a store dominates all exit blocks, it is safe to sink.
// As explained above, if an exit block was executed, a dominating		// As explained above, if an exit block was executed, a dominating
▲ Show 20 Lines • Show All 97 Lines • ▼ Show 20 Lines	bool llvm::promoteLoopAccessesToScalars(
SSAUpdater SSA(&NewPHIs);		SSAUpdater SSA(&NewPHIs);
LoopPromoter Promoter(SomePtr, LoopUses, SSA, PointerMustAliases, ExitBlocks,		LoopPromoter Promoter(SomePtr, LoopUses, SSA, PointerMustAliases, ExitBlocks,
InsertPts, MSSAInsertPts, PIC, MSSAU, *LI, DL,		InsertPts, MSSAInsertPts, PIC, MSSAU, *LI, DL,
Alignment, SawUnorderedAtomic, AATags, *SafetyInfo,		Alignment, SawUnorderedAtomic, AATags, *SafetyInfo,
SafeToInsertStore);		SafeToInsertStore);

// Set up the preheader to have a definition of the value. It is the live-out		// Set up the preheader to have a definition of the value. It is the live-out
// value from the preheader that uses in the loop will use.		// value from the preheader that uses in the loop will use.
LoadInst *PreheaderLoad = new LoadInst(		LoadInst *PreheaderLoad = nullptr;
AccessTy, SomePtr, SomePtr->getName() + ".promoted",		if (FoundLoadToPromote \|\| !StoreIsGuanteedToExecute) {
		// A new load is introduced, remove WriteOnly attribute from the function.
		Preheader->getParent()->removeFnAttr(Attribute::WriteOnly);
		nlopesUnsubmitted Done Reply Inline Actions we can't remove the writeonly attribute locally. We would need to remove it transitively as it's illegal for a writeonly function to call a non-writeonly function. We need to bail out if a non-local store is needed. nlopes: we can't remove the writeonly attribute locally. We would need to remove it transitively as…
		fhahnAuthorUnsubmitted Done Reply Inline Actions Yeah, I tried to allude to that in the message for the update. Updated to assert no problematic loads will be introduced in write only functions as per clarification in D124124. fhahn: Yeah, I tried to allude to that in the message for the update. Updated to assert no problematic…
		PreheaderLoad =
		new LoadInst(AccessTy, SomePtr, SomePtr->getName() + ".promoted",
Preheader->getTerminator());		Preheader->getTerminator());
if (SawUnorderedAtomic)		if (SawUnorderedAtomic)
PreheaderLoad->setOrdering(AtomicOrdering::Unordered);		PreheaderLoad->setOrdering(AtomicOrdering::Unordered);
PreheaderLoad->setAlignment(Alignment);		PreheaderLoad->setAlignment(Alignment);
PreheaderLoad->setDebugLoc(DebugLoc());		PreheaderLoad->setDebugLoc(DebugLoc());
if (AATags)		if (AATags)
PreheaderLoad->setAAMetadata(AATags);		PreheaderLoad->setAAMetadata(AATags);
SSA.AddAvailableValue(Preheader, PreheaderLoad);

MemoryAccess *PreheaderLoadMemoryAccess = MSSAU.createMemoryAccessInBB(		MemoryAccess *PreheaderLoadMemoryAccess = MSSAU.createMemoryAccessInBB(
PreheaderLoad, nullptr, PreheaderLoad->getParent(), MemorySSA::End);		PreheaderLoad, nullptr, PreheaderLoad->getParent(), MemorySSA::End);
MemoryUse *NewMemUse = cast<MemoryUse>(PreheaderLoadMemoryAccess);		MemoryUse *NewMemUse = cast<MemoryUse>(PreheaderLoadMemoryAccess);
MSSAU.insertUse(NewMemUse, /RenameUses=/true);		MSSAU.insertUse(NewMemUse, /RenameUses=/true);
		SSA.AddAvailableValue(Preheader, PreheaderLoad);
		nikicUnsubmitted Done Reply Inline Actions Do we need to add the poison value here at all? If we don't add it, shouldn't SSAUpdater avoid creating the phi altogether and directly use the value stored in the loop? nikic: Do we need to add the poison value here at all? If we don't add it, shouldn't SSAUpdater avoid…
		fhahnAuthorUnsubmitted Done Reply Inline Actions We don't need to add the poison value. We still need the phi node in the test case, because the store doesn't dominate the exit. The SSA updater uses undef instead of poison. fhahn: We don't need to add the poison value. We still need the phi node in the test case, because the…
		fhahnAuthorUnsubmitted Done Reply Inline Actions Hm, it looks like adding the available value unconditionally was needed for the code later on to preserve LCSSA form. Otherwise SSAUpdater may introduce non-LCSSA uses later on. I am planning on re-committing with a poison value for the pre-header to fix this. fhahn: Hm, it looks like adding the available value unconditionally was needed for the code later on…
		} else {
		SSA.AddAvailableValue(Preheader, PoisonValue::get(AccessTy));
		}

if (VerifyMemorySSA)		if (VerifyMemorySSA)
MSSAU.getMemorySSA()->verifyMemorySSA();		MSSAU.getMemorySSA()->verifyMemorySSA();
// Rewrite all the loads in the loop and remember all the definitions from		// Rewrite all the loads in the loop and remember all the definitions from
// stores in the loop.		// stores in the loop.
Promoter.run(LoopUses);		Promoter.run(LoopUses);

if (VerifyMemorySSA)		if (VerifyMemorySSA)
MSSAU.getMemorySSA()->verifyMemorySSA();		MSSAU.getMemorySSA()->verifyMemorySSA();
// If the SSAUpdater didn't use the load in the preheader, just zap it now.		// If the SSAUpdater didn't use the load in the preheader, just zap it now.
if (PreheaderLoad->use_empty())		if (PreheaderLoad && PreheaderLoad->use_empty())
eraseInstruction(PreheaderLoad, SafetyInfo, MSSAU);		eraseInstruction(PreheaderLoad, SafetyInfo, MSSAU);

return true;		return true;
}		}
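
The inline thread and the hunk above come down to one decision: which value to register with the SSA updater for the pre-header. The following is a minimal, standalone sketch of that decision as described in the summary and the test comments, not the actual LICM code. `PromotionInfo`, `PreheaderSeed`, and `choosePreheaderSeed` are hypothetical names introduced only for illustration.

```cpp
// Hypothetical stand-in for the facts gathered about a promotion candidate.
// These field names are assumptions and do not appear in the patch.
struct PromotionInfo {
  bool HasLoadInLoop;            // a load of the location is being promoted
  bool StoreGuaranteedToExecute; // the sunk store runs whenever the loop is entered
};

// What gets registered as the available value for the pre-header.
enum class PreheaderSeed { Load, Poison };

// A real load is only needed if something can observe the old memory
// contents: either a promoted load, or an exit reached without the store
// having executed. Otherwise a poison placeholder avoids reading memory.
PreheaderSeed choosePreheaderSeed(const PromotionInfo &Info) {
  if (Info.HasLoadInLoop || !Info.StoreGuaranteedToExecute)
    return PreheaderSeed::Load;
  return PreheaderSeed::Poison;
}
```

Registering some value in both branches matches the last inline comment: the pre-header always gets an available value, and only the store-only case substitutes poison for the load.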

static void foreachMemoryAccess(MemorySSA *MSSA, Loop *L,		static void foreachMemoryAccess(MemorySSA *MSSA, Loop *L,
function_ref<void(Instruction *)> Fn) {		function_ref<void(Instruction *)> Fn) {
for (const BasicBlock *BB : L->blocks())		for (const BasicBlock *BB : L->blocks())
(119 lines not shown)

llvm/test/Transforms/LICM/scalar-promote.ll

(600 lines not shown)

@glb = external global i8, align 1		@glb = external global i8, align 1

; Test case for PR51248.		; Test case for PR51248.
define void @test_sink_store_only() writeonly {		define void @test_sink_store_only() writeonly {
; CHECK: Function Attrs: writeonly		; CHECK: Function Attrs: writeonly
; CHECK-LABEL: @test_sink_store_only(		; CHECK-LABEL: @test_sink_store_only(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[GLB_PROMOTED:%.*]] = load i8, i8* @glb, align 1
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[DIV1:%.*]] = phi i8 [ [[GLB_PROMOTED]], [[ENTRY:%.*]] ], [ [[DIV:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[DIV1:%.*]] = phi i8 [ poison, [[ENTRY:%.*]] ], [ [[DIV:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[I:%.*]] = phi i8 [ 0, [[ENTRY]] ], [ [[ADD:%.*]], [[LOOP_LATCH]] ]		; CHECK-NEXT: [[I:%.*]] = phi i8 [ 0, [[ENTRY]] ], [ [[ADD:%.*]], [[LOOP_LATCH]] ]
; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[I]], 4		; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[I]], 4
; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[EXIT:%.*]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[DIV]] = sdiv i8 [[I]], 3		; CHECK-NEXT: [[DIV]] = sdiv i8 [[I]], 3
; CHECK-NEXT: [[ADD]] = add i8 [[I]], 4		; CHECK-NEXT: [[ADD]] = add i8 [[I]], 4
; CHECK-NEXT: br label [[LOOP_HEADER]]		; CHECK-NEXT: br label [[LOOP_HEADER]]
; CHECK: exit:		; CHECK: exit:
(19 lines not shown)
		exit:
ret void		ret void
}		}

define void @test_sink_store_to_local_object_only_loop_must_execute() writeonly {		define void @test_sink_store_to_local_object_only_loop_must_execute() writeonly {
; CHECK: Function Attrs: writeonly		; CHECK: Function Attrs: writeonly
; CHECK-LABEL: @test_sink_store_to_local_object_only_loop_must_execute(		; CHECK-LABEL: @test_sink_store_to_local_object_only_loop_must_execute(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[A:%.*]] = alloca i8, align 1		; CHECK-NEXT: [[A:%.*]] = alloca i8, align 1
; CHECK-NEXT: [[A_PROMOTED:%.*]] = load i8, i8* [[A]], align 1
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[DIV1:%.*]] = phi i8 [ [[A_PROMOTED]], [[ENTRY:%.*]] ], [ [[DIV:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[DIV1:%.*]] = phi i8 [ poison, [[ENTRY:%.*]] ], [ [[DIV:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[I:%.*]] = phi i8 [ 0, [[ENTRY]] ], [ [[ADD:%.*]], [[LOOP_LATCH]] ]		; CHECK-NEXT: [[I:%.*]] = phi i8 [ 0, [[ENTRY]] ], [ [[ADD:%.*]], [[LOOP_LATCH]] ]
; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[I]], 4		; CHECK-NEXT: [[CMP:%.*]] = icmp ult i8 [[I]], 4
; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[CMP]], label [[LOOP_LATCH]], label [[EXIT:%.*]]
; CHECK: loop.latch:		; CHECK: loop.latch:
; CHECK-NEXT: [[DIV]] = sdiv i8 [[I]], 3		; CHECK-NEXT: [[DIV]] = sdiv i8 [[I]], 3
; CHECK-NEXT: [[ADD]] = add i8 [[I]], 4		; CHECK-NEXT: [[ADD]] = add i8 [[I]], 4
; CHECK-NEXT: br label [[LOOP_HEADER]]		; CHECK-NEXT: br label [[LOOP_HEADER]]
; CHECK: exit:		; CHECK: exit:
(18 lines not shown)

exit:		exit:
ret void		ret void
}		}

; The store in the loop may not execute, so we need to introduce a load in the		; The store in the loop may not execute, so we need to introduce a load in the
; pre-header. Make sure the writeonly attribute is dropped.		; pre-header. Make sure the writeonly attribute is dropped.
define void @test_sink_store_to_local_object_only_loop_may_not_execute(i8 %n) writeonly {		define void @test_sink_store_to_local_object_only_loop_may_not_execute(i8 %n) writeonly {
; CHECK: Function Attrs: writeonly
; CHECK-LABEL: @test_sink_store_to_local_object_only_loop_may_not_execute(		; CHECK-LABEL: @test_sink_store_to_local_object_only_loop_may_not_execute(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[A:%.*]] = alloca i8, align 1		; CHECK-NEXT: [[A:%.*]] = alloca i8, align 1
; CHECK-NEXT: [[A_PROMOTED:%.*]] = load i8, i8* [[A]], align 1		; CHECK-NEXT: [[A_PROMOTED:%.*]] = load i8, i8* [[A]], align 1
; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
; CHECK: loop.header:		; CHECK: loop.header:
; CHECK-NEXT: [[DIV1:%.*]] = phi i8 [ [[A_PROMOTED]], [[ENTRY:%.*]] ], [ [[DIV:%.*]], [[LOOP_LATCH:%.*]] ]		; CHECK-NEXT: [[DIV1:%.*]] = phi i8 [ [[A_PROMOTED]], [[ENTRY:%.*]] ], [ [[DIV:%.*]], [[LOOP_LATCH:%.*]] ]
; CHECK-NEXT: [[I:%.*]] = phi i8 [ 0, [[ENTRY]] ], [ [[ADD:%.*]], [[LOOP_LATCH]] ]		; CHECK-NEXT: [[I:%.*]] = phi i8 [ 0, [[ENTRY]] ], [ [[ADD:%.*]], [[LOOP_LATCH]] ]
(62 lines not shown)
; CHECK-LABEL: @sink_store_lcssa_phis(		; CHECK-LABEL: @sink_store_lcssa_phis(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: br label [[LOOP_1_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_1_HEADER:%.*]]
; CHECK: loop.1.header:		; CHECK: loop.1.header:
; CHECK-NEXT: br label [[LOOP_2_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_2_HEADER:%.*]]
; CHECK: loop.2.header:		; CHECK: loop.2.header:
; CHECK-NEXT: br i1 false, label [[LOOP_3_HEADER_PREHEADER:%.*]], label [[LOOP_1_LATCH:%.*]]		; CHECK-NEXT: br i1 false, label [[LOOP_3_HEADER_PREHEADER:%.*]], label [[LOOP_1_LATCH:%.*]]
; CHECK: loop.3.header.preheader:		; CHECK: loop.3.header.preheader:
; CHECK-NEXT: [[PTR_PROMOTED:%.*]] = load i32, i32* [[PTR:%.*]], align 4
; CHECK-NEXT: br label [[LOOP_3_HEADER:%.*]]		; CHECK-NEXT: br label [[LOOP_3_HEADER:%.*]]
; CHECK: loop.3.header:		; CHECK: loop.3.header:
; CHECK-NEXT: [[I_11:%.*]] = phi i32 [ [[I_1:%.*]], [[LOOP_3_LATCH:%.*]] ], [ [[PTR_PROMOTED]], [[LOOP_3_HEADER_PREHEADER]] ]		; CHECK-NEXT: [[I_11:%.*]] = phi i32 [ [[I_1:%.*]], [[LOOP_3_LATCH:%.*]] ], [ poison, [[LOOP_3_HEADER_PREHEADER]] ]
; CHECK-NEXT: [[I_1]] = phi i32 [ 1, [[LOOP_3_LATCH]] ], [ 0, [[LOOP_3_HEADER_PREHEADER]] ]		; CHECK-NEXT: [[I_1]] = phi i32 [ 1, [[LOOP_3_LATCH]] ], [ 0, [[LOOP_3_HEADER_PREHEADER]] ]
; CHECK-NEXT: br i1 true, label [[LOOP_3_LATCH]], label [[LOOP_2_LATCH:%.*]]		; CHECK-NEXT: br i1 true, label [[LOOP_3_LATCH]], label [[LOOP_2_LATCH:%.*]]
; CHECK: loop.3.latch:		; CHECK: loop.3.latch:
; CHECK-NEXT: br label [[LOOP_3_HEADER]]		; CHECK-NEXT: br label [[LOOP_3_HEADER]]
; CHECK: loop.2.latch:		; CHECK: loop.2.latch:
; CHECK-NEXT: [[I_11_LCSSA:%.*]] = phi i32 [ [[I_11]], [[LOOP_3_HEADER]] ]		; CHECK-NEXT: [[I_11_LCSSA:%.*]] = phi i32 [ [[I_11]], [[LOOP_3_HEADER]] ]
; CHECK-NEXT: store i32 [[I_11_LCSSA]], i32* [[PTR]], align 4		; CHECK-NEXT: store i32 [[I_11_LCSSA]], i32* [[PTR:%.*]], align 4
; CHECK-NEXT: br label [[LOOP_2_HEADER]]		; CHECK-NEXT: br label [[LOOP_2_HEADER]]
; CHECK: loop.1.latch:		; CHECK: loop.1.latch:
; CHECK-NEXT: br i1 [[C:%.*]], label [[LOOP_1_HEADER]], label [[EXIT:%.*]]		; CHECK-NEXT: br i1 [[C:%.*]], label [[LOOP_1_HEADER]], label [[EXIT:%.*]]
; CHECK: exit:		; CHECK: exit:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
br label %loop.1.header		br label %loop.1.header
(31 lines not shown)