
Details

Reviewers

jdoerfert
nikic
Prazek
asbirlea
aeubanks

Commits

rG58eac856ccc0: [LICM] Ensure LICM can hoist invariant.group

Summary

LICM does not handle loads with !invariant.group metadata well enough.
Specifically, if the pointer read by such a load is not overwritten
between the start of the loop and the load itself, the load can be
hoisted. The invariant.group metadata (on an already loop-invariant
pointer operand) ensures the load returns the same value on every
iteration, so as long as nothing overwrites the pointer between the
start of the loop and the load, hoisting is legal.
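As a source-level illustration of what the hoist achieves, here is a minimal C++ sketch. The function names are hypothetical and C++ has no direct spelling of !invariant.group; the sketch only shows that the reloading and hoisted forms agree when nothing overwrites the loaded location. Without extra information the compiler must assume the store through `out` may alias `*src`, which is the situation the metadata resolves.

```cpp
#include <cassert>

// Reloads *src on every iteration. The store to out[i] may alias *src,
// so the load cannot be moved without additional guarantees.
void fillReloading(int *out, const int *src, int n) {
  for (int i = 0; i < n; ++i)
    out[i] = *src;
}

// Hand-hoisted form: *src is read once before the loop, which is what
// LICM would produce once the load is known to be invariant.
void fillHoisted(int *out, const int *src, int n) {
  const int value = *src;
  for (int i = 0; i < n; ++i)
    out[i] = value;
}
```

Both functions fill the output identically as long as `out` and `src` do not overlap.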

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
60,060 ms	x64 debian > libFuzzer.libFuzzer::value-profile-load.test

Event Timeline

wsmoses created this revision. · Feb 14 2023, 3:37 PM

Herald added a project: Restricted Project. · Feb 14 2023, 3:37 PM

Herald added subscribers: StephenFan, hiraditya.

wsmoses requested review of this revision. · Feb 14 2023, 3:37 PM

Herald added a project: Restricted Project. · Feb 14 2023, 3:37 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B213749: Diff 497462. · Feb 14 2023, 4:37 PM

I think this is correct, but I'm not very familiar with invariant.group nuances.

llvm/lib/Transforms/Scalar/LICM.cpp
1206	Can we move this logic into pointerInvalidatedByLoop? It looks like the only difference for the invariant.group case is the last check for the clobber being the loop header MemoryPhi, right?
llvm/test/Transforms/LICM/invariant.group.ll
4	Second RUN line is unnecessary.
26	`i32*` -> `ptr`

vchuravy added a subscriber: vchuravy. · Feb 15 2023, 5:36 AM

Address feedback

Run linter

Actually run linter

Fix test

wsmoses marked 3 inline comments as done. · Feb 15 2023, 11:10 AM

Looks reasonable to me

llvm/lib/Transforms/Scalar/LICM.cpp
167	Nit: Style, also below.

Harbormaster completed remote builds in B213955: Diff 497753. · Feb 15 2023, 11:59 AM

Fix style

wsmoses marked an inline comment as done. · Feb 15 2023, 12:53 PM

Harbormaster completed remote builds in B213963: Diff 497766. · Feb 15 2023, 1:14 PM

LGTM. Possibly there is some way to nicely integrate this directly into the MemorySSA clobber walker (which already has invariant.group support, but doesn't handle this case), but I think this is fine for now.

llvm/lib/Transforms/Scalar/LICM.cpp
1180	Doesn't seem like the variable is needed anymore. If you want to keep it, use `auto *`.

This revision is now accepted and ready to land. · Feb 16 2023, 12:00 AM

I have a question: how does the compiler behave if the semantics of !invariant.group are violated? Consider this case:

define i32 @foo(ptr nocapture %p, ptr nocapture %q) {
entry:
  %0 = load i32, ptr %p, align 4, !invariant.load !0
  store i8 2, ptr %p, align 1
  %1 = load i32, ptr %p, align 4, !invariant.load !0
  %add = add nsw i32 %1, %0
  ret i32 %add
}

!0 = !{}

If GVN is run on it, the program is not changed.

But this patch only checks whether the load's clobbering def is a MemoryPhi located in the loop header. For this case:

define void @test(i64 %v, ptr %arg, ptr %arg1) {
bb2:                                              ; preds = %bb
  br label %bb5

bb5:                                              ; preds = %bb5, %bb2
  %tmp6 = phi i64 [ 0, %bb2 ], [ %tmp10, %bb5 ]
  %tmp3 = load i64, ptr %arg1, align 4, !invariant.group !0
  store i64 %tmp6, ptr %arg1, align 8
  %tmp10 = add nuw nsw i64 %tmp6, %tmp3
  %tmp11 = icmp eq i64 %tmp10, 200
  br i1 %tmp11, label %bb12, label %bb5

bb12:                                             ; preds = %bb5, %bb
  ret void
}

It hoists the load even though there is a store that writes a new value through the %arg1 pointer operand.
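To make the concern concrete, here is a rough C++ sketch of that loop, adapted to run a fixed number of iterations and return the accumulator so the two forms can be compared. The function names and the bounded trip count are assumptions made for illustration; they are not part of the IR above.

```cpp
#include <cassert>
#include <cstdint>

// Reloads the cell on every iteration, as the original loop body does.
std::int64_t runReloading(std::int64_t *cell, int iters) {
  std::int64_t acc = 0;            // plays the role of %tmp6 / %tmp10
  for (int i = 0; i < iters; ++i) {
    std::int64_t loaded = *cell;   // %tmp3 = load i64, ptr %arg1
    *cell = acc;                   // store i64 %tmp6, ptr %arg1
    acc += loaded;                 // %tmp10 = add %tmp6, %tmp3
  }
  return acc;
}

// Same loop with the load moved in front of the loop.
std::int64_t runHoisted(std::int64_t *cell, int iters) {
  const std::int64_t loaded = *cell;
  std::int64_t acc = 0;
  for (int i = 0; i < iters; ++i) {
    *cell = acc;
    acc += loaded;
  }
  return acc;
}
```

Starting from a cell holding 1 and running six iterations, the reloading form returns 8 and the hoisted form returns 6. The two only agree when the store never changes the value the load observes, which is the promise the metadata is supposed to encode.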

And I found that for

define i32 @foo(ptr nocapture %p, ptr nocapture %q) {
entry:
  %0 = load i32, ptr %p, align 4, !invariant.load !3
  %conv = trunc i32 %0 to i8
  store i8 %conv, ptr %q, align 1
  %1 = load i32, ptr %p, align 4, !invariant.load !3
  %add = add nsw i32 %1, 1
  ret i32 %add
}

!3 = !{}

GVN removes %1 = load i32, ptr %p, align 4, !invariant.load !3. The reason is that although %q may alias %p, they are not the same pointer operand. According to the docs, the value at %p will not change. So I think that in this patch we can simply check whether the clobbering def's pointer operand is the same as the load's pointer operand to decide whether we can hoist it.
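The suggested check could look roughly like the following toy C++ sketch. It is not LLVM code and not what landed: `ToyStore` and `blockedBySamePointerStore` are hypothetical names, and pointer operands are modeled as SSA value names compared by string equality.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a store in the loop; only the pointer operand matters.
struct ToyStore {
  std::string Ptr; // SSA name of the pointer operand, e.g. "%arg1"
};

// Returns true if some store in the loop writes through exactly the same
// pointer operand as the load, in which case hoisting would be refused.
bool blockedBySamePointerStore(const std::vector<ToyStore> &LoopStores,
                               const std::string &LoadPtr) {
  for (const ToyStore &S : LoopStores)
    if (S.Ptr == LoadPtr)
      return true;
  return false;
}
```

Under this model a store through %arg1 blocks hoisting a load from %arg1, while a store through a different (possibly aliasing) pointer such as %tmp7 does not.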

This revision was landed with ongoing or failed builds. · Feb 26 2023, 9:41 AM

Closed by commit rG58eac856ccc0: [LICM] Ensure LICM can hoist invariant.group (authored by wsmoses).

This revision was automatically updated to reflect the committed changes.

wsmoses added a commit: rG58eac856ccc0: [LICM] Ensure LICM can hoist invariant.group.

I'm personally of the opinion that building a special case for an illegal use of the invariant metadata (e.g. where it would guarantee the memory is not changed) would produce more fragile code, since it would be harder for someone who accidentally creates such illegal metadata to find the actual source of the bug.

Diff 497753

llvm/lib/Transforms/Scalar/LICM.cpp

@@ 157 lines not shown @@ static bool sink(Instruction &I, LoopInfo *LI, DominatorTree *DT,
                  MemorySSAUpdater &MSSAU, OptimizationRemarkEmitter *ORE);
 static bool isSafeToExecuteUnconditionally(
     Instruction &Inst, const DominatorTree *DT, const TargetLibraryInfo *TLI,
     const Loop *CurLoop, const LoopSafetyInfo *SafetyInfo,
     OptimizationRemarkEmitter *ORE, const Instruction *CtxI,
     AssumptionCache *AC, bool AllowSpeculation);
 static bool pointerInvalidatedByLoop(MemorySSA *MSSA, MemoryUse *MU,
                                      Loop *CurLoop, Instruction &I,
-                                     SinkAndHoistLICMFlags &Flags);
+                                     SinkAndHoistLICMFlags &Flags,
+                                     bool invariant_group);
    Inline comment (jdoerfert, done): Nit: Style, also below.
 static bool pointerInvalidatedByBlock(BasicBlock &BB, MemorySSA &MSSA,
                                       MemoryUse &MU);
 static Instruction *cloneInstructionInExitBlock(
     Instruction &I, BasicBlock &ExitBlock, PHINode &PN, const LoopInfo *LI,
     const LoopSafetyInfo *SafetyInfo, MemorySSAUpdater &MSSAU);

 static void eraseInstruction(Instruction &I, ICFLoopSafetyInfo &SafetyInfo,
                              MemorySSAUpdater &MSSAU);
@@ 996 lines not shown @@ if (LoadInst *LI = dyn_cast<LoadInst>(&I)) {

     if (LI->isAtomic() && !TargetExecutesOncePerLoop)
       return false; // Don't risk duplicating unordered loads

     // This checks for an invariant.start dominating the load.
     if (isLoadInvariantInLoop(LI, DT, CurLoop))
       return true;

+    auto MU = cast<MemoryUse>(MSSA->getMemoryAccess(LI));
    Inline comment (nikic, not done): Doesn't seem like the variable is needed anymore. If you want to keep it, use `auto *`.
+
+    bool invariant_group = LI->hasMetadata(LLVMContext::MD_invariant_group);
+
     bool Invalidated = pointerInvalidatedByLoop(
-        MSSA, cast<MemoryUse>(MSSA->getMemoryAccess(LI)), CurLoop, I, Flags);
+        MSSA, MU, CurLoop, I, Flags, invariant_group);
     // Check loop-invariant address because this may also be a sinkable load
     // whose address is not necessarily loop-invariant.
     if (ORE && Invalidated && CurLoop->isLoopInvariant(LI->getPointerOperand()))
       ORE->emit([&]() {
         return OptimizationRemarkMissed(
                    DEBUG_TYPE, "LoadWithLoopInvariantAddressInvalidated", LI)
                << "failed to move load with loop-invariant address "
                   "because the loop may invalidate its value";
       });

     return !Invalidated;
   } else if (CallInst *CI = dyn_cast<CallInst>(&I)) {
     // Don't sink or hoist dbg info; it's legal, but not useful.
     if (isa<DbgInfoIntrinsic>(I))
       return false;

     // Don't sink calls which can throw.
     if (CI->mayThrow())
       return false;

     // Convergent attribute has been used on operations that involve
    Inline comment (nikic, done): Can we move this logic into pointerInvalidatedByLoop? It looks like the only difference for the invariant.group case is the last check for the clobber being the loop header MemoryPhi, right?
     // inter-thread communication which results are implicitly affected by the
     // enclosing control flows. It is not safe to hoist or sink such operations
     // across control flow.
     if (CI->isConvergent())
       return false;

     using namespace PatternMatch;
     if (match(CI, m_Intrinsic<Intrinsic::assume>()))
@@ 13 lines not shown @@ if (Behavior.onlyReadsMemory()) {
       // it's arguments with arbitrary offsets. If we can prove there are no
       // writes to this memory in the loop, we can hoist or sink.
       if (Behavior.onlyAccessesArgPointees()) {
         // TODO: expand to writeable arguments
         for (Value *Op : CI->args())
           if (Op->getType()->isPointerTy() &&
               pointerInvalidatedByLoop(
                   MSSA, cast<MemoryUse>(MSSA->getMemoryAccess(CI)), CurLoop, I,
-                  Flags))
+                  Flags, /*invariant_group=*/false))
             return false;
         return true;
       }

       // If this call only reads from memory and there are no writes to memory
       // in the loop, we can hoist or sink the call as appropriate.
       if (isReadOnly(MSSAU, CurLoop))
         return true;
@@ 1,085 lines not shown @@ for (auto [Set, HasReadsOutsideSet] : Sets) {
     Result.emplace_back(std::move(PointerMustAliases), HasReadsOutsideSet);
   }

   return Result;
 }

 static bool pointerInvalidatedByLoop(MemorySSA *MSSA, MemoryUse *MU,
                                      Loop *CurLoop, Instruction &I,
-                                     SinkAndHoistLICMFlags &Flags) {
+                                     SinkAndHoistLICMFlags &Flags,
+                                     bool invariant_group) {
   // For hoisting, use the walker to determine safety
   if (!Flags.getIsSink()) {
     MemoryAccess *Source;
     // See declaration of SetLicmMssaOptCap for usage details.
     if (Flags.tooManyClobberingCalls())
       Source = MU->getDefiningAccess();
     else {
       Source = MSSA->getSkipSelfWalker()->getClobberingMemoryAccess(MU);
       Flags.incrementClobberingCalls();
     }
+    // If hoisting an invariant group, we only need to check that there
+    // is no store to the loaded pointer between the start of the loop,
+    // and the load (since all values must be the same).
+
+    // This can be checked in two conditions:
+    // 1) if the memoryaccess is outside the loop
+    // 2) the earliest access is at the loop header,
+    // if the memory loaded is the phi node
+
     return !MSSA->isLiveOnEntryDef(Source) &&
-           CurLoop->contains(Source->getBlock());
+           CurLoop->contains(Source->getBlock()) &&
+           !(invariant_group && Source->getBlock() == CurLoop->getHeader() && isa<MemoryPhi>(Source));
   }

   // For sinking, we'd need to check all Defs below this use. The getClobbering
   // call will look on the backedge of the loop, but will check aliasing with
   // the instructions on the previous iteration.
   // For example:
   //   for (i ... )
   //     load a[i] ( Use (LoE)
@@ 39 lines not shown @@
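
The hoisting branch of pointerInvalidatedByLoop in this diff can be modeled with a small self-contained C++ sketch. The types below (`AccessKind`, `ToyAccess`, `ToyLoop`) and the function `invalidatedForHoist` are toy stand-ins invented for illustration, not LLVM's MemorySSA classes; the loop is assumed to be a contiguous range of block ids.

```cpp
#include <cassert>

// Toy stand-ins for MemorySSA concepts (hypothetical, not the LLVM API).
enum class AccessKind { LiveOnEntry, Def, Phi };

struct ToyAccess {
  AccessKind Kind;
  int Block; // id of the block that owns the access
};

struct ToyLoop {
  int Header;
  int FirstBlock, LastBlock; // loop body is [FirstBlock, LastBlock]
  bool contains(int B) const { return B >= FirstBlock && B <= LastBlock; }
};

// Mirrors the return expression in the hoisting branch: the load is
// invalidated if its clobber lives inside the loop, unless the load is in
// an invariant group and the clobber is the MemoryPhi in the loop header.
bool invalidatedForHoist(const ToyAccess &Source, const ToyLoop &L,
                         bool InvariantGroup) {
  bool ClobberInLoop =
      Source.Kind != AccessKind::LiveOnEntry && L.contains(Source.Block);
  bool IsHeaderPhi =
      Source.Kind == AccessKind::Phi && Source.Block == L.Header;
  return ClobberInLoop && !(InvariantGroup && IsHeaderPhi);
}
```

In this model, a header MemoryPhi clobber corresponds to @test (the load is hoisted only when it carries !invariant.group), while an in-loop MemoryDef clobber corresponds to @test_fail (the load stays in the loop either way).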

llvm/test/Transforms/LICM/invariant.group.ll

This file was added.

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt -passes=licm < %s -S | FileCheck %s

define void @test(ptr %arg, ptr %arg1) {
    Inline comment (nikic, done): Second RUN line is unnecessary.
; CHECK-LABEL: @test(
; CHECK-NEXT:  bb2:
; CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[ARG1:%.*]], align 4, !invariant.group !0
; CHECK-NEXT:    br label [[BB5:%.*]]
; CHECK:       bb5:
; CHECK-NEXT:    [[TMP6:%.*]] = phi i64 [ 0, [[BB2:%.*]] ], [ [[TMP10:%.*]], [[BB5]] ]
; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[ARG:%.*]], i64 [[TMP6]]
; CHECK-NEXT:    store i32 [[TMP3]], ptr [[TMP7]], align 8
; CHECK-NEXT:    [[TMP10]] = add nuw nsw i64 [[TMP6]], 1
; CHECK-NEXT:    [[TMP11:%.*]] = icmp eq i64 [[TMP10]], 200
; CHECK-NEXT:    br i1 [[TMP11]], label [[BB12:%.*]], label [[BB5]]
; CHECK:       bb12:
; CHECK-NEXT:    ret void
;
bb2:                                              ; preds = %bb
  br label %bb5

bb5:                                              ; preds = %bb5, %bb2
  %tmp6 = phi i64 [ 0, %bb2 ], [ %tmp10, %bb5 ]
  %tmp3 = load i32, ptr %arg1, align 4, !invariant.group !0
  %tmp7 = getelementptr inbounds i32, ptr %arg, i64 %tmp6
  store i32 %tmp3, ptr %tmp7, align 8
    Inline comment (nikic, done): `i32*` -> `ptr`
  %tmp10 = add nuw nsw i64 %tmp6, 1
  %tmp11 = icmp eq i64 %tmp10, 200
  br i1 %tmp11, label %bb12, label %bb5

bb12:                                             ; preds = %bb5, %bb
  ret void
}


define void @test_fail(ptr %arg, ptr %arg1) {
; CHECK-LABEL: @test_fail(
; CHECK-NEXT:  bb2:
; CHECK-NEXT:    br label [[BB5:%.*]]
; CHECK:       bb5:
; CHECK-NEXT:    [[TMP6:%.*]] = phi i64 [ 0, [[BB2:%.*]] ], [ [[TMP10:%.*]], [[BB5]] ]
; CHECK-NEXT:    store i32 3, ptr [[ARG1:%.*]], align 4
; CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[ARG1]], align 4, !invariant.group !0
; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[ARG:%.*]], i64 [[TMP6]]
; CHECK-NEXT:    store i32 [[TMP3]], ptr [[TMP7]], align 8
; CHECK-NEXT:    [[TMP10]] = add nuw nsw i64 [[TMP6]], 1
; CHECK-NEXT:    [[TMP11:%.*]] = icmp eq i64 [[TMP10]], 200
; CHECK-NEXT:    br i1 [[TMP11]], label [[BB12:%.*]], label [[BB5]]
; CHECK:       bb12:
; CHECK-NEXT:    ret void
;
bb2:                                              ; preds = %bb
  br label %bb5

bb5:                                              ; preds = %bb5, %bb2
  %tmp6 = phi i64 [ 0, %bb2 ], [ %tmp10, %bb5 ]
  store i32 3, ptr %arg1
  %tmp3 = load i32, ptr %arg1, align 4, !invariant.group !0
  %tmp7 = getelementptr inbounds i32, ptr %arg, i64 %tmp6
  store i32 %tmp3, ptr %tmp7, align 8
  %tmp10 = add nuw nsw i64 %tmp6, 1
  %tmp11 = icmp eq i64 %tmp10, 200
  br i1 %tmp11, label %bb12, label %bb5

bb12:                                             ; preds = %bb5, %bb
  ret void
}

!0 = !{}

This is an archive of the discontinued LLVM Phabricator instance.

[LICM] Ensure LICM can hoist invariant.group
Closed · Public