This is an archive of the discontinued LLVM Phabricator instance.

[MemorySSA] Support lazy use optimization
ClosedPublic

Authored by nikic on Mar 10 2022, 8:21 AM.

Details

Summary

This changes MemorySSA to be constructed in unoptimized form. MemorySSA::ensureOptimized() can be called to optimize all uses (once). This should be done by passes where having optimized uses is beneficial, either because we're going to query all uses anyway, or because we're doing def-use walks.

This should help reduce the compile-time impact of MemorySSA for some use cases (the reason I started looking into this is D117926), which can avoid optimizing all uses upfront and instead optimize only those that are actually queried.

Actually, we have an existing use case for this: EarlyCSE. Disabling eager use optimization there gives a significant compile-time improvement: http://llvm-compile-time-tracker.com/compare.php?from=59191057243e34d85b644716ef2811bfea8efd1e&to=1dd3499045716e71bad61ccd70700e734a74d350&stat=instructions. This is because EarlyCSE will generally only query clobbers for a subset of all uses.

To be more conservative I haven't included the EarlyCSE change here, but I can add it.
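The deferred, once-only optimization described in the summary can be sketched as a standalone toy. This is not the actual LLVM API; the class name, members, and bool-flag bookkeeping are invented for illustration of the pattern (construct unoptimized, optimize all uses at most once, on demand):

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for MemorySSA's lazy use optimization: uses are
// left unoptimized at construction time, and ensureOptimizedUses()
// performs the optimization exactly once, on first demand.
class LazyMemorySSA {
public:
  explicit LazyMemorySSA(unsigned NumUses) : Uses(NumUses, false) {}

  // Passes that benefit from optimized uses (because they query all
  // uses anyway, or do def-use walks) call this before querying.
  void ensureOptimizedUses() {
    if (IsOptimized)
      return; // Already done; the walk happens at most once.
    for (auto &&U : Uses)
      U = true; // Stand-in for computing the optimized defining access.
    IsOptimized = true;
    ++OptimizationRuns;
  }

  bool isUseOptimized(unsigned I) const { return Uses[I]; }
  unsigned optimizationRuns() const { return OptimizationRuns; }

private:
  std::vector<bool> Uses;
  bool IsOptimized = false;
  unsigned OptimizationRuns = 0;
};
```

A pass that never calls `ensureOptimizedUses()` simply pays nothing for use optimization, which is the compile-time win the patch is after.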

Diff Detail

Event Timeline

nikic created this revision. Mar 10 2022, 8:21 AM
Herald added a project: Restricted Project. Mar 10 2022, 8:21 AM
Herald added a subscriber: hiraditya.
nikic requested review of this revision. Mar 10 2022, 8:21 AM

Thank you for sending this out!

Broadly the changes look good. The patch as is can also give some compile-time improvements for cases where MSSA is built and preserved in loop pipelines and needs to be optimized only before LICM, not before the first loop pass in the pipeline. This also means results may change based on the delayed optimization.
This should hold true for both Legacy and NPM, but I see in practice there's a difference only for LPM (http://llvm-compile-time-tracker.com/compare.php?from=067c035012fc061ad6378458774ac2df117283c6&to=59191057243e34d85b644716ef2811bfea8efd1e&stat=instructions)

Let me test this out as is first. Then, separately, with the EarlyCSE ensureOptimized() call removed, to check that the compile-time gains are not coupled with run-time regressions.

nikic added a comment. Mar 10 2022, 1:56 PM

> Thank you for sending this out!
>
> Broadly the changes look good. The patch as is can also give some compile-time improvements for cases where MSSA is built and preserved in loop pipelines and needs to be optimized only before LICM, not before the first loop pass in the pipeline. This also means results may change based on the delayed optimization.
> This should hold true for both Legacy and NPM, but I see in practice there's a difference only for LPM (http://llvm-compile-time-tracker.com/compare.php?from=067c035012fc061ad6378458774ac2df117283c6&to=59191057243e34d85b644716ef2811bfea8efd1e&stat=instructions)

I think this is because with the LegacyPM, loop analyses (including MSSA) will get scheduled unconditionally, even if there are no loops, so we sometimes compute MSSA unnecessarily. With the NewPM, analyses are only computed if there are loops, so we generally don't perform unnecessary MSSA constructions there.

> Let me test this out as is first. Then, separately, with the EarlyCSE ensureOptimized() call removed, to check that the compile-time gains are not coupled with run-time regressions.

I should mention here that this does impact EarlyCSE results due to this fallback: https://github.com/llvm/llvm-project/blob/54d7fde46e8a0e425245e18732c2a78e64fa7b35/llvm/lib/Transforms/Scalar/EarlyCSE.cpp#L1030-L1034. Without eagerly optimizing uses, the getDefiningAccess() fallback will return an unoptimized defining access. But I think we can compensate for this by increasing EarlyCSEMssaOptCap if necessary.
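The capped-fallback behavior referred to here can be illustrated with a standalone sketch (not the actual EarlyCSE code; the struct and member names are invented): a budget of expensive clobber walks per run, after which queries fall back to the plain defining access. With lazy use optimization, that fallback now yields an unoptimized access, which is why raising a cap like EarlyCSEMssaOptCap could compensate:

```cpp
#include <cassert>

// Illustrative sketch of a capped clobber query: up to Cap full walks
// are performed, then we fall back to the cheap defining-access answer.
struct CappedClobberWalker {
  unsigned Cap;           // Budget of expensive walks per run.
  unsigned WalksDone = 0; // Walks performed so far.

  // Returns true if a (simulated) full clobber walk was performed,
  // false if we fell back to the getDefiningAccess()-style answer.
  bool getClobber() {
    if (WalksDone >= Cap)
      return false; // Budget exhausted: cheap fallback.
    ++WalksDone;
    return true; // Full walk, independent of use optimization.
  }
};
```

Increasing `Cap` shifts more queries onto the precise walk, making the quality of the fallback answer (optimized vs. unoptimized uses) matter less.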

Conceptually, I think this makes a lot of sense.

The biggest thing I see in this patch is a strong need to update comments and APIs. This introduces the possibility of clients seeing unoptimized uses, which is new and not currently reflected in the API documentation. Please address this before landing.

llvm/include/llvm/Analysis/MemorySSA.h
805

Strong comment: ensureOptimizedLoads.

This does *NOT* optimize stores, and that's important to distinguish in comments and naming.

LGTM for this patch on the impact side, testing looks good.

+1 on improving documentation for future uses, to include that by default uses are no longer optimized. Also rename the new method to ensureOptimizedUses() per @reames's suggestion.
(docs can be landed separately as long as they come shortly after)

On the impact of the EarlyCSE follow-up, I'm seeing a couple of small regressions in an FDO configuration that I would not block on. I'd land the EarlyCSE change separately from this and evaluate whether increasing the cap makes sense based on whether other folks notice any performance impact from the change.

llvm/include/llvm/Analysis/MemorySSA.h
804

Please expand the doc for this API here.

nikic updated this revision to Diff 415859. Mar 16 2022, 9:10 AM

Rename to ensureOptimizedUses(), some comment updates.

fhahn added a comment. Mar 16 2022, 9:14 AM

Thanks, this looks like a nice improvement in combination with D121740!

asbirlea accepted this revision. Mar 17 2022, 3:02 PM

LGTM with a couple nits.

llvm/include/llvm/Analysis/MemorySSA.h
350

This is not necessarily true due to the cap set by MaxCheckLimit (memssa-check-limit).
Also not true if a MemoryDef was added that invalidated optimizations (OptimizedID will no longer match).
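The OptimizedID invalidation mentioned in this inline comment follows a version-stamp pattern, which can be sketched standalone (names invented for illustration; not the actual MemorySSA data structures): a use caches the generation at which it was optimized, and adding a MemoryDef bumps the generation so the cached result reads as stale:

```cpp
#include <cassert>

// Generation counter standing in for MemorySSA's internal ID source.
struct Walker {
  unsigned Generation = 0;
  void addMemoryDef() { ++Generation; } // Invalidates cached results.
};

// A use records the generation at which its optimized access was
// computed; a mismatch means the cached optimization is stale.
struct MemoryUseSketch {
  unsigned OptimizedID = ~0u; // Sentinel: never optimized.

  bool isOptimized(const Walker &W) const {
    return OptimizedID == W.Generation;
  }
  void setOptimized(const Walker &W) { OptimizedID = W.Generation; }
};
```

The point of the review comment is that documentation claiming uses "are optimized" must hedge for both the MaxCheckLimit cap and this kind of invalidation.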

806

Add: By default, during MemorySSA build, uses are *not* optimized

This revision is now accepted and ready to land. Mar 17 2022, 3:02 PM
This revision was landed with ongoing or failed builds. Mar 18 2022, 1:56 AM
This revision was automatically updated to reflect the committed changes.

I noticed by accident that MemCpyOpt does not call ensureOptimizedUses, and thus changes behavior after this change. Not sure if that's good or bad, but it doesn't seem to have been explicitly noted in the review. For call slot optimization, we do perform a lot of clobber walks from loads.

(Separately, we could rework the code to perform far fewer clobber walks; e.g., there's no point in searching for the clobber of any non-alloca location.)

> I noticed by accident that MemCpyOpt does not call ensureOptimizedUses, and thus changes behavior after this change. Not sure if that's good or bad, but it doesn't seem to have been explicitly noted in the review. For call slot optimization, we do perform a lot of clobber walks from loads.

I'm surprised that this would cause a change in behavior. MemCpyOpt always does clobber walks (without any limit that falls back to getDefiningAccess), so it really shouldn't care whether uses are optimized or not. Possibly our optimization cutoffs work slightly differently between the eager use optimization and the clobber walk, and that could cause differences?

>> I noticed by accident that MemCpyOpt does not call ensureOptimizedUses, and thus changes behavior after this change. Not sure if that's good or bad, but it doesn't seem to have been explicitly noted in the review. For call slot optimization, we do perform a lot of clobber walks from loads.
>
> I'm surprised that this would cause a change in behavior. MemCpyOpt always does clobber walks (without any limit that falls back to getDefiningAccess), so it really shouldn't care whether uses are optimized or not. Possibly our optimization cutoffs work slightly differently between the eager use optimization and the clobber walk, and that could cause differences?

Er, sorry, I was apparently sloppy in my wording.

I did *not* see a functional difference. I saw a compile-time difference. On at least one example, not having the eager use optimization caused a significant shift in where we spend time. I'm not even saying it slowed down (the results were too noisy to tell), just that it was different, and that doesn't seem obviously expected from the review.