This is an archive of the discontinued LLVM Phabricator instance.


Differential D139582

[GVN] Improve PRE on load instructions
ClosedPublic

Authored by Carrot on Dec 7 2022, 2:31 PM.

Details

Reviewers

mkazantsev
fhahn
nikic

Commits

rG1f1d501843e5: [GVN] Improve PRE on load instructions

Summary

This patch implements the enhancement proposed by https://github.com/llvm/llvm-project/issues/59312.

Suppose we have the following code:

   v0 = load %addr
   br %LoadBB

LoadBB:
   v1 = load %addr
   ...

PredBB:
   ...
   br %cond, label %LoadBB, label %SuccBB

SuccBB:
   v2 = load %addr
   ...

Instruction v1 in LoadBB is partially redundant, and the edge (PredBB, LoadBB) is a critical edge. SuccBB is the other successor of PredBB, and it contains another load, v2, which is identical to v1. Current GVN splits the critical edge (PredBB, LoadBB) and inserts a new load into the new block. A better approach is to move the load v2 into PredBB; v1 can then be replaced with a PHI instruction.

If there are two or more such predecessors, as in the test case in the bug report, current GVN simply gives up, because otherwise it would need to split multiple critical edges. With this patch we can instead move the loads in the successor blocks into the predecessors.
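
To make the transformation concrete, here is a small source-level model of the CFG above. It is only an illustrative sketch, not code from the patch: `Counter`, `Path`, `loadFrom`, `runBefore` and `runAfter` are hypothetical names, and the arithmetic on the loaded values is made up so that the two versions can be compared. It counts how many loads of `Addr` execute on each path before and after moving v2 into PredBB.

```cpp
#include <cassert>

// Hypothetical toy model of the CFG in the summary; not LLVM code.
struct Counter {
  int Loads = 0; // number of loads executed so far
};

// Models "load %addr" and records that a load happened.
static int loadFrom(const int *Addr, Counter &C) {
  ++C.Loads;
  return *Addr;
}

// The three ways control can flow through the example.
enum class Path { EntryToLoadBB, PredToLoadBB, PredToSuccBB };

// Before: LoadBB always reloads v1, even when v0 already holds the value.
int runBefore(const int *Addr, Path P, Counter &C) {
  switch (P) {
  case Path::EntryToLoadBB: {
    int V0 = loadFrom(Addr, C);
    int V1 = loadFrom(Addr, C); // partially redundant load in LoadBB
    return V0 + V1;
  }
  case Path::PredToLoadBB:
    return loadFrom(Addr, C); // v1 in LoadBB
  case Path::PredToSuccBB:
    return -loadFrom(Addr, C); // v2 in SuccBB
  }
  return 0;
}

// After: v2 is hoisted into PredBB and v1 becomes a PHI of v0 and v2.
int runAfter(const int *Addr, Path P, Counter &C) {
  if (P == Path::EntryToLoadBB) {
    int V0 = loadFrom(Addr, C);
    int Phi = V0; // incoming value from the entry edge
    return V0 + Phi;
  }
  int V2 = loadFrom(Addr, C); // hoisted load, now in PredBB
  if (P == Path::PredToLoadBB) {
    int Phi = V2; // incoming value from the PredBB edge
    return Phi;
  }
  return -V2; // SuccBB reuses the hoisted value
}
```

In this model no path executes more loads than before, the entry path executes one fewer, and no critical edge has to be split.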

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Carrot created this revision. Dec 7 2022, 2:31 PM

Herald added a project: Restricted Project. Dec 7 2022, 2:31 PM

Herald added a subscriber: hiraditya.

Carrot requested review of this revision. Dec 7 2022, 2:31 PM

Herald added a project: Restricted Project. Dec 7 2022, 2:31 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B201818: Diff 481039. Dec 7 2022, 2:32 PM

nikic added a reviewer: nikic. Dec 7 2022, 11:56 PM

Compile-time: http://llvm-compile-time-tracker.com/compare.php?from=fe2ca62e92c82efcbbd10916e352bdfeddb80e19&to=1d81778eee4d4c38db40ec50a5e18ac76e251b44&stat=instructions:u Seems to be all over the place, probably due to second-order effects.

Seems like this patch keeps searching for values in vectors, and that might be a reason for the CT degradation. Maybe refactor them to be maps instead?

llvm/lib/Transforms/Scalar/GVN.cpp
931	`AvailValInBlkVect` ?

In D139582#3981256, @mkazantsev wrote:

Seems like this patch keeps searching for values in vectors, and that might be a reason for the CT degradation. Maybe refactor them to be maps instead?

Only the function ReplaceValuesPerBlockEntry searches for values in a vector, and it is called only in the final transformation step. So it is unlikely to cause the CT issue unless a lot of these optimizations are actually triggered. I would rather look for the CT issue in the earlier steps.

llvm/lib/Transforms/Scalar/GVN.cpp
931	Similar to ConstructSSAForLoadSet, this function is not a member of GVNPass, and AvailValInBlkVect is private in GVNPass. So I can't use it directly here.
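
For reference, the trade-off discussed here can be sketched outside of LLVM. The following is an illustrative model only: `Entry`, `replaceEntryLinear` and `replaceEntryMapped` are hypothetical names, `std::vector` stands in for `SmallVectorImpl<AvailableValueInBlock>`, and plain integers and strings stand in for blocks and values. It contrasts a linear scan of the kind the patch uses with a map-based lookup of the kind suggested in the review.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one (block, available value) pair.
using Entry = std::pair<int, std::string>;

// Linear scan: O(n) per call, but needs no extra data structure.
bool replaceEntryLinear(std::vector<Entry> &Entries, int Block,
                        const std::string &NewValue) {
  for (Entry &E : Entries) {
    if (E.first == Block) {
      E.second = NewValue;
      return true;
    }
  }
  return false;
}

// Map lookup: O(1) on average, but the map must be built and kept in sync.
bool replaceEntryMapped(std::unordered_map<int, std::string> &Entries,
                        int Block, const std::string &NewValue) {
  auto It = Entries.find(Block);
  if (It == Entries.end())
    return false;
  It->second = NewValue;
  return true;
}
```

If the replacement runs only once per successful transformation, as stated in the reply, the linear scan is unlikely to dominate compile time.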

Compile time improvement.

The original code checks all predecessors of LoadBB and finds a possible solution for each of them, and only later checks whether the solution is profitable. This version checks whether the current solution is still profitable while checking the predecessors, and exits early as soon as it is no longer profitable.
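
The early-exit idea can be sketched in isolation as follows. This is a simplified model rather than the patch itself: `PredKind` and `stillProfitable` are hypothetical names, and the rule "at most one predecessor may need a newly inserted load" is taken from the `NumInsertPreds > 1` check in the diff.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of an unavailable predecessor of LoadBB.
enum class PredKind {
  InsertLoad,    // non-critical edge: a new load goes into the predecessor
  SplitEdge,     // critical edge that must be split to host a new load
  HoistFromSucc, // critical edge, identical load found in the other successor
};

// Returns true if PRE still looks profitable, i.e. at most one predecessor
// needs a newly inserted load. Visited reports how many predecessors were
// examined before the answer was known.
bool stillProfitable(const std::vector<PredKind> &Preds, unsigned &Visited) {
  unsigned NumInsertPreds = 0;
  Visited = 0;
  for (PredKind K : Preds) {
    ++Visited;
    if (K == PredKind::InsertLoad || K == PredKind::SplitEdge)
      ++NumInsertPreds;
    // Early exit: no need to classify the remaining predecessors.
    if (NumInsertPreds > 1)
      return false;
  }
  return true;
}
```

Predecessors whose load can be hoisted from another successor do not count against the limit, which is why a block with several such predecessors can still be handled.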

Harbormaster completed remote builds in B202116: Diff 481483. Dec 8 2022, 9:41 PM

The compile time regression on ClamAV is in the file libclamav_htmlnorm.c. The increased compile time is entirely in the GVN pass, which goes from 4.0s to 5.3s. This file contains a huge function, cli_html_normalise, which is more than 6600 lines in the generated assembly file. The NumPRELoad statistic increased from 1 to 10 because of this patch. In function GVNPass::runImpl we have

unsigned Iteration = 0;
while (ShouldContinue) {
  LLVM_DEBUG(dbgs() << "GVN iteration: " << Iteration << "\n");
  (void) Iteration;
  ShouldContinue = iterateOnFunction(F);
  Changed |= ShouldContinue;
  ++Iteration;
}

This means we repeatedly run GVN on a function until no more optimizations are applicable. This patch enables more optimizations, so it may also cause more iterations on a function. In this case, the loop executes 4 times without this patch but 5 times with it. These numbers correlate closely with the increased compile time.

So the extra compile time is caused by more iterations on a huge function, and the extra iterations are caused by the additional optimizations enabled by this patch.

Thanks for the analysis. I think this is fine then. I expect https://reviews.llvm.org/D140097 to mitigate this somewhat.

llvm/lib/Transforms/Scalar/GVN.cpp
930	nit: `replaceValuesPerBlockEntry`
1399	Block scan needs a cutoff.
1404	Do we need to be concerned about speculating the load? You do perform a speculation check, but I think it does not cover the speculation of this load.
1405	`cast<>`
2703	I am not sure this assertion is safe to remove. I think a problem with your current code is that it may try to remove a load that is part of the leader table for that block, if that block has been processed before the current one.
llvm/test/Transforms/GVN/PRE/pre-load.ll
766	Please pre-commit the test.
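
The cutoff requested for the block scan can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `findWithCutoff` is a hypothetical helper, a block is modelled as a list of instruction strings, and string equality stands in for `isIdenticalTo`. The committed diff bounds the scan with the `gvn-max-num-insns` option (default 100).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the first instruction equal to Target, looking at no
// more than MaxScan instructions from the start of the block. Returns -1 if
// there is no match within that window.
int findWithCutoff(const std::vector<std::string> &Block,
                   const std::string &Target, std::size_t MaxScan) {
  for (std::size_t I = 0; I < Block.size() && I < MaxScan; ++I) {
    if (Block[I] == Target)
      return static_cast<int>(I);
  }
  return -1;
}
```

The bound keeps the cost of each scan constant even for very large blocks, at the price of missing a matching load that sits beyond the window.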

Carrot mentioned this in D140234: [TEST] Pre-commit test for GVN PRE load. Dec 16 2022, 10:55 AM

Carrot mentioned this in rG4c13af22b4d4: [TEST] Pre-commit test for GVN PRE load. Dec 20 2022, 10:46 AM

Carrot updated this revision to Diff 484624. Dec 21 2022, 11:01 AM

Carrot marked 4 inline comments as done.

Carrot added inline comments.

llvm/lib/Transforms/Scalar/GVN.cpp
1404	Do you mean the following code in function PerformLoadPRE?

+    for (auto &CEP : CriticalEdgePredAndLoad)
+      if (!isSafeToSpeculativelyExecute(Load, CEP.first->getTerminator(), AC,
+                                        DT))
+        return false;

It's specifically for this purpose. I used the original Load instruction instead of CEP.second because all of them are identical.
2703	Added code into function eliminatePartiallyRedundantLoad to delete the corresponding leader table entry.

Harbormaster completed remote builds in B204417: Diff 484624. Dec 21 2022, 11:50 AM

I think this is fine. Please wait a few more days in case someone else has concerns.

llvm/lib/Transforms/Scalar/GVN.cpp
1840	nit: `/*CriticalEdgePredAndLoad*/ nullptr`

This revision is now accepted and ready to land. Dec 25 2022, 8:27 PM

Thanks for the review! Will commit this version.

Harbormaster completed remote builds in B205993: Diff 486682. Jan 5 2023, 4:08 PM

lkail added a subscriber: lkail. Jan 5 2023, 6:50 PM

This revision was landed with ongoing or failed builds. Jan 9 2023, 3:09 PM

Closed by commit rG1f1d501843e5: [GVN] Improve PRE on load instructions (authored by Carrot).

This revision was automatically updated to reflect the committed changes.

Carrot added a commit: rG1f1d501843e5: [GVN] Improve PRE on load instructions.

Herald added a subscriber: StephenFan. Jan 9 2023, 3:09 PM

Seems this causes miscompilation. Investigating.

@Carrot This change seems to be causing some weird unit test failures that I cannot explain. Can you take a look and revert if you need time to investigate?

https://lab.llvm.org/buildbot/#/builders/247/builds/294
https://lab.llvm.org/buildbot/#/builders/75/builds/25910

SixWeining added a subscriber: SixWeining. Jan 9 2023, 6:31 PM

Carrot added a reverting change: rG9852941f0138: Revert "[GVN] Improve PRE on load instructions". Jan 9 2023, 6:40 PM

@dyung @chapuni thank you for the report, I have reverted it.
How did you investigate it? I tried to reproduce it following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild, but I always got the following error:

...
+ BUILDBOT_MONO_REPO_PATH=
+ BUILDBOT_REVISION=llvmorg-
+ buildbot_update
+ echo @@@BUILD_STEP update llvmorg-@@@
@@@BUILD_STEP update llvmorg-@@@
+ [[ -d '' ]]
+ local DEPTH=100
+ [[ -d llvm-project ]]
+ cd llvm-project
+ git fetch origin
+ git clean -fd
+ local REV=llvmorg-
+ git checkout -f llvmorg-
error: pathspec 'llvmorg-' did not match any file(s) known to git
+ git status
On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)
+ git rev-list --pretty --max-count=1 HEAD
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
+ build_exception
+ echo

+ echo 'How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild'
How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild
+ echo
...

With this change, https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c fails at -O2/-O3 on many platforms, including x86-64.

I think it is easy to reproduce:

clang -O2 pr59747.c
./a.out

I can't speak for the sanitizer bot, but for llvm-clang-x86_64-gcc-ubuntu you should be able to reproduce some of the failures by running:

cmake ../llvm-project/llvm -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_BUILD_TYPE=Release -DCLANG_ENABLE_CLANGD=OFF -DLLVM_BUILD_RUNTIME=ON -DLLVM_BUILD_TESTS=ON -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_INCLUDE_EXAMPLES=OFF '-DLLVM_LIT_ARGS=--verbose -j48' -DLLVM_USE_LINKER=gold '-DLLVM_ENABLE_PROJECTS=compiler-rt;clang;lld;cross-project-tests;llvm;clang-tools-extra' -GNinja
ninja check-fuzzer

The 11 unexpected failures should be caused by this change.

@SixWeining @dyung, both reproductions work for me, thanks a lot!

Carrot mentioned this in D141712: [GVN] Improve PRE on load instructions. Jan 13 2023, 11:12 AM

Revision Contents

Path                                                                   Size
llvm/include/llvm/Transforms/Scalar/GVN.h                              8 lines
llvm/lib/Transforms/Scalar/GVN.cpp                                     131 lines
llvm/test/Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll    2 lines
llvm/test/Transforms/GVN/PRE/2017-06-28-pre-load-dbgloc.ll             13 lines
llvm/test/Transforms/GVN/PRE/pre-load.ll                               30 lines
llvm/test/Transforms/GVN/PRE/volatile.ll                               10 lines
llvm/test/Transforms/GVN/condprop.ll                                   12 lines

Diff 487573

llvm/include/llvm/Transforms/Scalar/GVN.h

@@ 324 lines not shown @@ private:

   /// Given a list of non-local dependencies, determine if a value is
   /// available for the load in each specified block. If it is, add it to
   /// ValuesPerBlock. If not, add it to UnavailableBlocks.
   void AnalyzeLoadAvailability(LoadInst *Load, LoadDepVect &Deps,
                                AvailValInBlkVect &ValuesPerBlock,
                                UnavailBlkVect &UnavailableBlocks);

+  /// Given a critical edge from Pred to LoadBB, find a load instruction
+  /// which is identical to Load from another successor of Pred.
+  LoadInst *findLoadToHoistIntoPred(BasicBlock *Pred, BasicBlock *LoadBB,
+                                    LoadInst *Load);
+
   bool PerformLoadPRE(LoadInst *Load, AvailValInBlkVect &ValuesPerBlock,
                       UnavailBlkVect &UnavailableBlocks);

   /// Try to replace a load which executes on each loop iteraiton with Phi
   /// translation of load in preheader and load(s) in conditionally executed
   /// paths.
   bool performLoopLoadPRE(LoadInst *Load, AvailValInBlkVect &ValuesPerBlock,
                           UnavailBlkVect &UnavailableBlocks);

   /// Eliminates partially redundant \p Load, replacing it with \p
   /// AvailableLoads (connected by Phis if needed).
   void eliminatePartiallyRedundantLoad(
       LoadInst *Load, AvailValInBlkVect &ValuesPerBlock,
-      MapVector<BasicBlock *, Value *> &AvailableLoads);
+      MapVector<BasicBlock *, Value *> &AvailableLoads,
+      MapVector<BasicBlock *, LoadInst *> *CriticalEdgePredAndLoad);

   // Other helper routines
   bool processInstruction(Instruction *I);
   bool processBlock(BasicBlock *BB);
   void dump(DenseMap<uint32_t, Value *> &d) const;
   bool iterateOnFunction(Function &F);
   bool performPRE(Function &F);
   bool performScalarPRE(Instruction *I);
@@ 37 lines not shown @@

llvm/lib/Transforms/Scalar/GVN.cpp

@@ 116 lines not shown @@

 // This is based on IsValueFullyAvailableInBlockNumSpeculationsMax stat.
 static cl::opt<uint32_t> MaxBBSpeculations(
     "gvn-max-block-speculations", cl::Hidden, cl::init(600),
     cl::desc("Max number of blocks we're willing to speculate on (and recurse "
              "into) when deducing if a value is fully available or not in GVN "
              "(default = 600)"));

+static cl::opt<uint32_t> MaxNumInsnsPerBlock(
+    "gvn-max-num-insns", cl::Hidden, cl::init(100),
+    cl::desc("Max number of instructions to scan in each basic block in GVN "
+             "(default = 100)"));
+
 struct llvm::GVNPass::Expression {
   uint32_t opcode;
   bool commutative = false;
   // The type is not necessarily the result type of the expression, it may be
   // any additional type needed to disambiguate the expression.
   Type *type = nullptr;
   SmallVector<uint32_t, 4> varargs;

@@ 782 lines not shown @@ #ifndef NDEBUG

   assert(NewSpeculativelyAvailableBBs.empty() &&
          "Must have fixed all the new speculatively available blocks.");
 #endif

   return !UnavailableBB;
 }

+/// If the specified BB exists in ValuesPerBlock, replace its value with
+/// NewValue.
+static void replaceValuesPerBlockEntry(
+    SmallVectorImpl<AvailableValueInBlock> &ValuesPerBlock, BasicBlock *BB,
+    Value *NewValue) {
+  for (AvailableValueInBlock &V : ValuesPerBlock) {
+    if (V.BB == BB) {
+      V = AvailableValueInBlock::get(BB, NewValue);
+      return;
+    }
+  }
+}
+
 /// Given a set of loads specified by ValuesPerBlock,
 /// construct SSA form, allowing us to eliminate Load. This returns the value
 /// that should be used at Load's definition site.
 static Value *
 ConstructSSAForLoadSet(LoadInst *Load,
                        SmallVectorImpl<AvailableValueInBlock> &ValuesPerBlock,
                        GVNPass &gvn) {
   // Check for the fully redundant, dominating load case. In this case, we can
@@ 411 lines not shown @@ if (AnalyzeLoadAvailability(Load, DepInfo, Address, AV)) {
       UnavailableBlocks.push_back(DepBB);
     }
   }

   assert(NumDeps == ValuesPerBlock.size() + UnavailableBlocks.size() &&
          "post condition violation");
 }

+/// Given the following code, v1 is partially available on some edges, but not
+/// available on the edge from PredBB. This function tries to find if there is
+/// another identical load in the other successor of PredBB.
+///
+///    v0 = load %addr
+///    br %LoadBB
+///
+/// LoadBB:
+///    v1 = load %addr
+///    ...
+///
+/// PredBB:
+///    ...
+///    br %cond, label %LoadBB, label %SuccBB
+///
+/// SuccBB:
+///    v2 = load %addr
+///    ...
+///
+LoadInst *GVNPass::findLoadToHoistIntoPred(BasicBlock *Pred, BasicBlock *LoadBB,
+                                           LoadInst *Load) {
+  // For simplicity we handle a Pred has 2 successors only.
+  auto *Term = Pred->getTerminator();
+  if (Term->getNumSuccessors() != 2)
+    return nullptr;
+  auto *SuccBB = Term->getSuccessor(0);
+  if (SuccBB == LoadBB)
+    SuccBB = Term->getSuccessor(1);
+  if (!SuccBB->getSinglePredecessor())
+    return nullptr;
+
+  int NumInsts = MaxNumInsnsPerBlock;
+  for (Instruction &Inst : *SuccBB) {
+    if (Inst.isIdenticalTo(Load)) {
+      MemDepResult Dep = MD->getDependency(&Inst);
+      // If an identical load doesn't depends on any local instructions, it can
+      // be safely moved to PredBB.
+      if (Dep.isNonLocal())
+        return cast<LoadInst>(&Inst);
+
+      return nullptr;
+    }
+
+    if (--NumInsts == 0)
+      return nullptr;
+  }
+
+  return nullptr;
+}

 void GVNPass::eliminatePartiallyRedundantLoad(
     LoadInst *Load, AvailValInBlkVect &ValuesPerBlock,
-    MapVector<BasicBlock *, Value *> &AvailableLoads) {
+    MapVector<BasicBlock *, Value *> &AvailableLoads,
+    MapVector<BasicBlock *, LoadInst *> *CriticalEdgePredAndLoad) {
   for (const auto &AvailableLoad : AvailableLoads) {
     BasicBlock *UnavailableBlock = AvailableLoad.first;
     Value *LoadPtr = AvailableLoad.second;

     auto *NewLoad =
         new LoadInst(Load->getType(), LoadPtr, Load->getName() + ".pre",
                      Load->isVolatile(), Load->getAlign(), Load->getOrdering(),
                      Load->getSyncScopeID(), UnavailableBlock->getTerminator());
@@ 37 lines not shown @@ for (const auto &AvailableLoad : AvailableLoads) {
     // FIXME: How do we retain source locations without causing poor debugging
     // behavior?

     // Add the newly created load.
     ValuesPerBlock.push_back(
         AvailableValueInBlock::get(UnavailableBlock, NewLoad));
     MD->invalidateCachedPointerInfo(LoadPtr);
     LLVM_DEBUG(dbgs() << "GVN INSERTED " << *NewLoad << '\n');
+
+    // For PredBB in CriticalEdgePredAndLoad we need to delete the already found
+    // load instruction which is now redundant.
+    if (CriticalEdgePredAndLoad) {
+      auto I = CriticalEdgePredAndLoad->find(UnavailableBlock);
+      if (I != CriticalEdgePredAndLoad->end()) {
+        LoadInst *OldLoad = I->second;
+        OldLoad->replaceAllUsesWith(NewLoad);
+        replaceValuesPerBlockEntry(ValuesPerBlock, OldLoad->getParent(),
+                                   NewLoad);
+        markInstructionForDeletion(OldLoad);
+        if (uint32_t ValNo = VN.lookup(OldLoad, false))
+          removeFromLeaderTable(ValNo, OldLoad, OldLoad->getParent());
+      }
+    }
   }

   // Perform PHI construction.
   Value *V = ConstructSSAForLoadSet(Load, ValuesPerBlock, *this);
   Load->replaceAllUsesWith(V);
   if (isa<PHINode>(V))
     V->takeName(Load);
   if (Instruction *I = dyn_cast<Instruction>(V))
@@ 70 lines not shown @@ bool GVNPass::PerformLoadPRE(LoadInst *Load, AvailValInBlkVect &ValuesPerBlock,
   // available.
   MapVector<BasicBlock *, Value *> PredLoads;
   DenseMap<BasicBlock *, AvailabilityState> FullyAvailableBlocks;
   for (const AvailableValueInBlock &AV : ValuesPerBlock)
     FullyAvailableBlocks[AV.BB] = AvailabilityState::Available;
   for (BasicBlock *UnavailableBB : UnavailableBlocks)
     FullyAvailableBlocks[UnavailableBB] = AvailabilityState::Unavailable;

-  SmallVector<BasicBlock *, 4> CriticalEdgePred;
+  // The edge from Pred to LoadBB is a critical edge will be splitted.
+  SmallVector<BasicBlock *, 4> CriticalEdgePredSplit;
+  // The edge from Pred to LoadBB is a critical edge, another successor of Pred
+  // contains a load can be moved to Pred. This data structure maps the Pred to
+  // the movable load.
+  MapVector<BasicBlock *, LoadInst *> CriticalEdgePredAndLoad;
   for (BasicBlock *Pred : predecessors(LoadBB)) {
     // If any predecessor block is an EH pad that does not allow non-PHI
     // instructions before the terminator, we can't PRE the load.
     if (Pred->getTerminator()->isEHPad()) {
       LLVM_DEBUG(
           dbgs() << "COULD NOT PRE LOAD BECAUSE OF AN EH PAD PREDECESSOR '"
                  << Pred->getName() << "': " << *Load << '\n');
       return false;
@@ 23 lines not shown @@ if (Pred->getTerminator()->getNumSuccessors() != 1) {
       if (DT->dominates(LoadBB, Pred)) {
         LLVM_DEBUG(
             dbgs()
             << "COULD NOT PRE LOAD BECAUSE OF A BACKEDGE CRITICAL EDGE '"
             << Pred->getName() << "': " << *Load << '\n');
         return false;
       }

-      CriticalEdgePred.push_back(Pred);
+      if (LoadInst *LI = findLoadToHoistIntoPred(Pred, LoadBB, Load)) {
+        CriticalEdgePredAndLoad[Pred] = LI;
+      } else
+        CriticalEdgePredSplit.push_back(Pred);
     } else {
       // Only add the predecessors that will not be split for now.
       PredLoads[Pred] = nullptr;
     }
+
+    // Early check for non profitable PRE load.
+    unsigned NumInsertPreds = PredLoads.size() + CriticalEdgePredSplit.size();
+    if (NumInsertPreds > 1)
+      return false;
   }

   // Decide whether PRE is profitable for this load.
-  unsigned NumUnavailablePreds = PredLoads.size() + CriticalEdgePred.size();
+  unsigned NumInsertPreds = PredLoads.size() + CriticalEdgePredSplit.size();
+  unsigned NumUnavailablePreds = NumInsertPreds +
+      CriticalEdgePredAndLoad.size();
   assert(NumUnavailablePreds != 0 &&
          "Fully available value should already be eliminated!");

-  // If this load is unavailable in multiple predecessors, reject it.
+  // If we need to insert new load in multiple predecessors, reject it.
   // FIXME: If we could restructure the CFG, we could make a common pred with
   // all the preds that don't have an available Load and insert a new load into
   // that one block.
-  if (NumUnavailablePreds != 1)
+  if (NumInsertPreds > 1)
     return false;

   // Now we know where we will insert load. We must ensure that it is safe
   // to speculatively execute the load at that points.
   if (MustEnsureSafetyOfSpeculativeExecution) {
-    if (CriticalEdgePred.size())
+    if (CriticalEdgePredSplit.size())
       if (!isSafeToSpeculativelyExecute(Load, LoadBB->getFirstNonPHI(), AC, DT))
         return false;
     for (auto &PL : PredLoads)
       if (!isSafeToSpeculativelyExecute(Load, PL.first->getTerminator(), AC,
                                         DT))
         return false;
+    for (auto &CEP : CriticalEdgePredAndLoad)
+      if (!isSafeToSpeculativelyExecute(Load, CEP.first->getTerminator(), AC,
+                                        DT))
+        return false;
   }

   // Split critical edges, and update the unavailable predecessors accordingly.
-  for (BasicBlock *OrigPred : CriticalEdgePred) {
+  for (BasicBlock *OrigPred : CriticalEdgePredSplit) {
     BasicBlock *NewPred = splitCriticalEdges(OrigPred, LoadBB);
     assert(!PredLoads.count(OrigPred) && "Split edges shouldn't be in map!");
     PredLoads[NewPred] = nullptr;
     LLVM_DEBUG(dbgs() << "Split critical edge " << OrigPred->getName() << "->"
                       << LoadBB->getName() << '\n');
   }

+  for (auto &CEP : CriticalEdgePredAndLoad)
+    PredLoads[CEP.first] = nullptr;
+
   // Check if the load can safely be moved to all the unavailable predecessors.
   bool CanDoPRE = true;
   const DataLayout &DL = Load->getModule()->getDataLayout();
   SmallVector<Instruction*, 8> NewInsts;
   for (auto &PredLoad : PredLoads) {
     BasicBlock *UnavailablePred = PredLoad.first;

     // Do PHI translation to get its value in the predecessor if necessary. The
@@ 40 lines not shown @@ while (!NewInsts.empty()) {
       // trying to number them. PHI translation might insert instructions
       // in basic blocks other than the current one, and we delete them
       // directly, as markInstructionForDeletion only allows removing from the
       // current basic block.
       NewInsts.pop_back_val()->eraseFromParent();
     }
     // HINT: Don't revert the edge-splitting as following transformation may
     // also need to split these critical edges.
-    return !CriticalEdgePred.empty();
+    return !CriticalEdgePredSplit.empty();
   }

   // Okay, we can eliminate this load by inserting a reload in the predecessor
   // and using PHI construction to get the value in the other predecessors, do
   // it.
   LLVM_DEBUG(dbgs() << "GVN REMOVING PRE LOAD: " << *Load << '\n');
   LLVM_DEBUG(if (!NewInsts.empty()) dbgs() << "INSERTED " << NewInsts.size()
                                            << " INSTS: " << *NewInsts.back()
                                            << '\n');

   // Assign value numbers to the new instructions.
   for (Instruction *I : NewInsts) {
     // Instructions that have been inserted in predecessor(s) to materialize
     // the load address do not retain their original debug locations. Doing
     // so could lead to confusing (but correct) source attributions.
     I->updateLocationAfterHoist();

     // FIXME: We really _ought_ to insert these value numbers into their
     // parent's availability map. However, in doing so, we risk getting into
     // ordering issues. If a block hasn't been processed yet, we would be
     // marking a value as AVAIL-IN, which isn't what we intend.
     VN.lookupOrAdd(I);
}		}

eliminatePartiallyRedundantLoad(Load, ValuesPerBlock, PredLoads);		eliminatePartiallyRedundantLoad(Load, ValuesPerBlock, PredLoads,
		&CriticalEdgePredAndLoad);
++NumPRELoad;		++NumPRELoad;
return true;		return true;
}		}

bool GVNPass::performLoopLoadPRE(LoadInst *Load,		bool GVNPass::performLoopLoadPRE(LoadInst *Load,
AvailValInBlkVect &ValuesPerBlock,		AvailValInBlkVect &ValuesPerBlock,
UnavailBlkVect &UnavailableBlocks) {		UnavailBlkVect &UnavailableBlocks) {
if (!LI)		if (!LI)
[... 62 lines not shown ...]	if (LoadPtr->canBeFreed())
return false;		return false;

// TODO: Support critical edge splitting if blocker has more than 1 successor.		// TODO: Support critical edge splitting if blocker has more than 1 successor.
MapVector<BasicBlock *, Value *> AvailableLoads;		MapVector<BasicBlock *, Value *> AvailableLoads;
AvailableLoads[LoopBlock] = LoadPtr;		AvailableLoads[LoopBlock] = LoadPtr;
AvailableLoads[Preheader] = LoadPtr;		AvailableLoads[Preheader] = LoadPtr;

LLVM_DEBUG(dbgs() << "GVN REMOVING PRE LOOP LOAD: " << *Load << '\n');		LLVM_DEBUG(dbgs() << "GVN REMOVING PRE LOOP LOAD: " << *Load << '\n');
eliminatePartiallyRedundantLoad(Load, ValuesPerBlock, AvailableLoads);		eliminatePartiallyRedundantLoad(Load, ValuesPerBlock, AvailableLoads,
		/*CriticalEdgePredAndLoad*/ nullptr);
		mkazantsev (inline comment, Done): nit: `/*CriticalEdgePredAndLoad*/ nullptr`
++NumPRELoopLoad;		++NumPRELoopLoad;
return true;		return true;
}		}

static void reportLoadElim(LoadInst *Load, Value *AvailableValue,		static void reportLoadElim(LoadInst *Load, Value *AvailableValue,
OptimizationRemarkEmitter *ORE) {		OptimizationRemarkEmitter *ORE) {
using namespace ore;		using namespace ore;

[... 954 lines not shown ...]	for (BasicBlock::iterator BI = BB->begin(), BE = BB->end();
NumGVNInstr += InstrsToErase.size();		NumGVNInstr += InstrsToErase.size();

// Avoid iterator invalidation.		// Avoid iterator invalidation.
bool AtStart = BI == BB->begin();		bool AtStart = BI == BB->begin();
if (!AtStart)		if (!AtStart)
--BI;		--BI;

for (auto *I : InstrsToErase) {		for (auto *I : InstrsToErase) {
assert(I->getParent() == BB && "Removing instruction from wrong block?");
nikic (inline comment, Not Done): I am not sure this assertion is safe to remove. I think a problem with your current code is that it may try to remove a load that is part of the leader table for that block, if that block has been processed before the current one.
Carrot (author, inline reply, Done): Added code to eliminatePartiallyRedundantLoad to delete the corresponding leader table entry.
LLVM_DEBUG(dbgs() << "GVN removed: " << *I << '\n');		LLVM_DEBUG(dbgs() << "GVN removed: " << *I << '\n');
salvageKnowledge(I, AC);		salvageKnowledge(I, AC);
salvageDebugInfo(*I);		salvageDebugInfo(*I);
if (MD) MD->removeInstruction(I);		if (MD) MD->removeInstruction(I);
if (MSSAU)		if (MSSAU)
MSSAU->removeMemoryAccess(I);		MSSAU->removeMemoryAccess(I);
LLVM_DEBUG(verifyRemoved(I));		LLVM_DEBUG(verifyRemoved(I));
ICF->removeInstruction(I);		ICF->removeInstruction(I);
[... 536 lines not shown ...]
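
The inline exchange above is about stale leader table entries: if a load was already recorded as a leader for its block and is later erased from a different block, the table has to forget that entry first. The following is a toy model of that bookkeeping, offered as a minimal sketch. `ToyLeaderTable`, `LeaderEntry`, and their members are hypothetical names for illustration only and are not the data structures used in GVN.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one leader: a value name and its defining block.
struct LeaderEntry {
  std::string Value;
  std::string Block;
};

// Toy leader table: maps a value number to the leaders known for it.
class ToyLeaderTable {
  std::unordered_map<uint32_t, std::vector<LeaderEntry>> Table;

public:
  void add(uint32_t VN, std::string Value, std::string Block) {
    Table[VN].push_back({std::move(Value), std::move(Block)});
  }

  // Drop the entry for Value in Block. Returns true if an entry was removed.
  bool remove(uint32_t VN, const std::string &Value,
              const std::string &Block) {
    auto It = Table.find(VN);
    if (It == Table.end())
      return false;
    auto &Entries = It->second;
    auto Pos = std::find_if(Entries.begin(), Entries.end(),
                            [&](const LeaderEntry &E) {
                              return E.Value == Value && E.Block == Block;
                            });
    if (Pos == Entries.end())
      return false;
    Entries.erase(Pos);
    // Keep the invariant that every key has at least one entry.
    if (Entries.empty())
      Table.erase(It);
    return true;
  }

  // Return the first recorded leader for VN, or an empty string if none.
  std::string findLeader(uint32_t VN) const {
    auto It = Table.find(VN);
    return It == Table.end() ? std::string() : It->second.front().Value;
  }
};
```

In this model, a pass that erases `%v2` from `SuccBB` would call `remove` before deleting the instruction, so that a later `findLeader` cannot hand back a dangling value.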

llvm/test/Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll

	[... 51 lines not shown ...]

	; CHECK-LABEL: bb6:			; CHECK-LABEL: bb6:
	; CHECK: br i1 undef, label %bb15split, label %bb10			; CHECK: br i1 undef, label %bb15split, label %bb10

	; CHECK-LABEL: bb15split: ; preds = %bb6			; CHECK-LABEL: bb15split: ; preds = %bb6
	; CHECK-NEXT: br label %bb15			; CHECK-NEXT: br label %bb15

	; CHECK-LABEL: bb15:			; CHECK-LABEL: bb15:
	; CHECK: %tmp17 = phi i8 [ %tmp8, %bb15split ], [ %tmp17.pre, %bb1.bb15_crit_edge ]			; CHECK: %tmp17 = phi i8 [ %tmp12.pre3, %bb15split ], [ %tmp17.pre, %bb1.bb15_crit_edge ]

	bb19: ; preds = %bb15			bb19: ; preds = %bb15
	ret i1 %tmp18			ret i1 %tmp18
	}			}

	declare void @isalnum() nounwind inlinehint ssp			declare void @isalnum() nounwind inlinehint ssp

llvm/test/Transforms/GVN/PRE/2017-06-28-pre-load-dbgloc.ll

	; This test checks if debug loc is propagated to load/store created by GVN/Instcombine.			; This test checks if debug loc is propagated to load/store created by GVN/Instcombine.
	; RUN: opt < %s -passes=gvn -S | FileCheck %s --check-prefixes=ALL,GVN			; RUN: opt < %s -passes=gvn -S | FileCheck %s --check-prefixes=ALL
	; RUN: opt < %s -passes=gvn,instcombine -S | FileCheck %s --check-prefixes=ALL,INSTCOMBINE			; RUN: opt < %s -passes=gvn,instcombine -S | FileCheck %s --check-prefixes=ALL

	; struct node {			; struct node {
	; int *v;			; int *v;
	; struct desc *descs;			; struct desc *descs;
	; };			; };

	; struct desc {			; struct desc {
	; struct node *node;			; struct node *node;
	[... 18 lines not shown ...]

	%struct.desc = type { ptr }			%struct.desc = type { ptr }
	%struct.node = type { ptr, ptr }			%struct.node = type { ptr, ptr }

	define i32 @test(ptr readonly %desc) local_unnamed_addr #0 !dbg !4 {			define i32 @test(ptr readonly %desc) local_unnamed_addr #0 !dbg !4 {
	entry:			entry:
	%tobool = icmp eq ptr %desc, null			%tobool = icmp eq ptr %desc, null
	br i1 %tobool, label %cond.end, label %cond.false, !dbg !9			br i1 %tobool, label %cond.end, label %cond.false, !dbg !9
	; ALL: br i1 %tobool, label %entry.cond.end_crit_edge, label %cond.false, !dbg [[LOC_15_6:![0-9]+]]			; ALL: %.pre = load ptr, ptr %desc, align 8, !dbg [[LOC_16_13:![0-9]+]]
	; ALL: entry.cond.end_crit_edge:			; ALL: br i1 %tobool, label %cond.end, label %cond.false, !dbg [[LOC_15_6:![0-9]+]]
	; GVN: %.pre = load ptr, ptr null, align 8, !dbg [[LOC_16_13:![0-9]+]]			; ALL: cond.false:
	; INSTCOMBINE:store ptr poison, ptr null, align 4294967296, !dbg [[LOC_16_13:![0-9]+]]

	cond.false:			cond.false:
	%0 = load ptr, ptr %desc, align 8, !dbg !11			%0 = load ptr, ptr %desc, align 8, !dbg !11
	%1 = load ptr, ptr %0, align 8			%1 = load ptr, ptr %0, align 8
	br label %cond.end, !dbg !9			br label %cond.end, !dbg !9

	cond.end:			cond.end:
	%2 = phi ptr [ %1, %cond.false ], [ null, %entry ], !dbg !9			%2 = phi ptr [ %1, %cond.false ], [ null, %entry ], !dbg !9
	[... 17 lines not shown ...]
	!6 = !{!7}			!6 = !{!7}
	!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)			!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
	!8 = !{}			!8 = !{}
	!9 = !DILocation(line: 15, column: 6, scope: !4)			!9 = !DILocation(line: 15, column: 6, scope: !4)
	!10 = !DILocation(line: 16, column: 13, scope: !4)			!10 = !DILocation(line: 16, column: 13, scope: !4)
	!11 = !DILocation(line: 15, column: 34, scope: !4)			!11 = !DILocation(line: 15, column: 34, scope: !4)

	;ALL: [[SCOPE:![0-9]+]] = distinct !DISubprogram(name: "test",{{.*}}			;ALL: [[SCOPE:![0-9]+]] = distinct !DISubprogram(name: "test",{{.*}}
	;ALL: [[LOC_15_6]] = !DILocation(line: 15, column: 6, scope: [[SCOPE]])
	;ALL: [[LOC_16_13]] = !DILocation(line: 16, column: 13, scope: [[SCOPE]])			;ALL: [[LOC_16_13]] = !DILocation(line: 16, column: 13, scope: [[SCOPE]])
				;ALL: [[LOC_15_6]] = !DILocation(line: 15, column: 6, scope: [[SCOPE]])

llvm/test/Transforms/GVN/PRE/pre-load.ll

[... 681 lines not shown ...]
; Same as test13, but %x here is dereferenceable. A pointer that is		; Same as test13, but %x here is dereferenceable. A pointer that is
; dereferenceable can be loaded from speculatively without a risk of trapping.		; dereferenceable can be loaded from speculatively without a risk of trapping.
; Since it is OK to speculate, PRE is allowed.		; Since it is OK to speculate, PRE is allowed.

define i32 @test15(ptr noalias nocapture readonly dereferenceable(8) align 4 %x, ptr noalias nocapture %r, i32 %a) nofree nosync {		define i32 @test15(ptr noalias nocapture readonly dereferenceable(8) align 4 %x, ptr noalias nocapture %r, i32 %a) nofree nosync {
; CHECK-LABEL: @test15(		; CHECK-LABEL: @test15(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[A:%.*]], 0		; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[A:%.*]], 0
; CHECK-NEXT: br i1 [[TOBOOL]], label [[ENTRY_IF_END_CRIT_EDGE:%.*]], label [[IF_THEN:%.*]]
; CHECK: entry.if.end_crit_edge:
; CHECK-NEXT: [[VV_PRE:%.*]] = load i32, ptr [[X:%.*]], align 4		; CHECK-NEXT: [[VV_PRE:%.*]] = load i32, ptr [[X:%.*]], align 4
; CHECK-NEXT: br label [[IF_END:%.*]]		; CHECK-NEXT: br i1 [[TOBOOL]], label [[IF_END:%.*]], label [[IF_THEN:%.*]]
; CHECK: if.then:		; CHECK: if.then:
; CHECK-NEXT: [[UU:%.*]] = load i32, ptr [[X]], align 4		; CHECK-NEXT: store i32 [[VV_PRE]], ptr [[R:%.*]], align 4
; CHECK-NEXT: store i32 [[UU]], ptr [[R:%.*]], align 4
; CHECK-NEXT: br label [[IF_END]]		; CHECK-NEXT: br label [[IF_END]]
; CHECK: if.end:		; CHECK: if.end:
; CHECK-NEXT: [[VV:%.*]] = phi i32 [ [[VV_PRE]], [[ENTRY_IF_END_CRIT_EDGE]] ], [ [[UU]], [[IF_THEN]] ]
; CHECK-NEXT: call void @f()		; CHECK-NEXT: call void @f()
; CHECK-NEXT: ret i32 [[VV]]		; CHECK-NEXT: ret i32 [[VV_PRE]]
;		;

entry:		entry:
%tobool = icmp eq i32 %a, 0		%tobool = icmp eq i32 %a, 0
br i1 %tobool, label %if.end, label %if.then		br i1 %tobool, label %if.end, label %if.then


if.then:		if.then:
[... 13 lines not shown ...]
; Same as test14, but %x here is dereferenceable. A pointer that is		; Same as test14, but %x here is dereferenceable. A pointer that is
; dereferenceable can be loaded from speculatively without a risk of trapping.		; dereferenceable can be loaded from speculatively without a risk of trapping.
; Since it is OK to speculate, PRE is allowed.		; Since it is OK to speculate, PRE is allowed.

define i32 @test16(ptr noalias nocapture readonly dereferenceable(8) align 4 %x, ptr noalias nocapture %r, i32 %a) nofree nosync {		define i32 @test16(ptr noalias nocapture readonly dereferenceable(8) align 4 %x, ptr noalias nocapture %r, i32 %a) nofree nosync {
; CHECK-LABEL: @test16(		; CHECK-LABEL: @test16(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[A:%.*]], 0		; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[A:%.*]], 0
; CHECK-NEXT: br i1 [[TOBOOL]], label [[ENTRY_IF_END_CRIT_EDGE:%.*]], label [[IF_THEN:%.*]]
; CHECK: entry.if.end_crit_edge:
; CHECK-NEXT: [[VV_PRE:%.*]] = load i32, ptr [[X:%.*]], align 4		; CHECK-NEXT: [[VV_PRE:%.*]] = load i32, ptr [[X:%.*]], align 4
; CHECK-NEXT: br label [[IF_END:%.*]]		; CHECK-NEXT: br i1 [[TOBOOL]], label [[IF_END:%.*]], label [[IF_THEN:%.*]]
; CHECK: if.then:		; CHECK: if.then:
; CHECK-NEXT: [[UU:%.*]] = load i32, ptr [[X]], align 4		; CHECK-NEXT: store i32 [[VV_PRE]], ptr [[R:%.*]], align 4
; CHECK-NEXT: store i32 [[UU]], ptr [[R:%.*]], align 4
; CHECK-NEXT: br label [[IF_END]]		; CHECK-NEXT: br label [[IF_END]]
; CHECK: if.end:		; CHECK: if.end:
; CHECK-NEXT: [[VV:%.*]] = phi i32 [ [[VV_PRE]], [[ENTRY_IF_END_CRIT_EDGE]] ], [ [[UU]], [[IF_THEN]] ]
; CHECK-NEXT: call void @f()		; CHECK-NEXT: call void @f()
; CHECK-NEXT: ret i32 [[VV]]		; CHECK-NEXT: ret i32 [[VV_PRE]]
;		;

entry:		entry:
%tobool = icmp eq i32 %a, 0		%tobool = icmp eq i32 %a, 0
br i1 %tobool, label %if.end, label %if.then		br i1 %tobool, label %if.end, label %if.then


if.then:		if.then:
[... 15 lines not shown ...]	follow_2:
ret i32 %vv		ret i32 %vv
}		}

declare i1 @foo()		declare i1 @foo()
declare i1 @bar()		declare i1 @bar()

; %v3 is partially redundant, bb3 has multiple predecessors coming through		; %v3 is partially redundant, bb3 has multiple predecessors coming through
; critical edges. The other successors of those predecessors have same loads.		; critical edges. The other successors of those predecessors have same loads.
; We can move all loads into predecessors.		; We can move all loads into predecessors.
		nikic (inline comment, Done): Please pre-commit the test.

define void @test17(ptr %p1, ptr %p2, ptr %p3, ptr %p4)		define void @test17(ptr %p1, ptr %p2, ptr %p3, ptr %p4)
; CHECK-LABEL: @test17(		; CHECK-LABEL: @test17(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: [[V1:%.*]] = load i64, ptr [[P1:%.*]], align 8		; CHECK-NEXT: [[V1:%.*]] = load i64, ptr [[P1:%.*]], align 8
; CHECK-NEXT: [[COND1:%.*]] = icmp sgt i64 [[V1]], 200		; CHECK-NEXT: [[COND1:%.*]] = icmp sgt i64 [[V1]], 200
; CHECK-NEXT: br i1 [[COND1]], label [[BB200:%.*]], label [[BB1:%.*]]		; CHECK-NEXT: br i1 [[COND1]], label [[BB200:%.*]], label [[BB1:%.*]]
; CHECK: bb1:		; CHECK: bb1:
; CHECK-NEXT: [[COND2:%.*]] = icmp sgt i64 [[V1]], 100		; CHECK-NEXT: [[COND2:%.*]] = icmp sgt i64 [[V1]], 100
; CHECK-NEXT: br i1 [[COND2]], label [[BB100:%.*]], label [[BB2:%.*]]		; CHECK-NEXT: br i1 [[COND2]], label [[BB100:%.*]], label [[BB2:%.*]]
; CHECK: bb2:		; CHECK: bb2:
; CHECK-NEXT: [[V2:%.*]] = add nsw i64 [[V1]], 1		; CHECK-NEXT: [[V2:%.*]] = add nsw i64 [[V1]], 1
; CHECK-NEXT: store i64 [[V2]], ptr [[P1]], align 8		; CHECK-NEXT: store i64 [[V2]], ptr [[P1]], align 8
; CHECK-NEXT: br label [[BB3:%.*]]		; CHECK-NEXT: br label [[BB3:%.*]]
; CHECK: bb3:		; CHECK: bb3:
; CHECK-NEXT: [[V3:%.*]] = load i64, ptr [[P1]], align 8		; CHECK-NEXT: [[V3:%.*]] = phi i64 [ [[V3_PRE:%.*]], [[BB200]] ], [ [[V3_PRE1:%.*]], [[BB100]] ], [ [[V2]], [[BB2]] ]
; CHECK-NEXT: store i64 [[V3]], ptr [[P2:%.*]], align 8		; CHECK-NEXT: store i64 [[V3]], ptr [[P2:%.*]], align 8
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
; CHECK: bb100:		; CHECK: bb100:
; CHECK-NEXT: [[COND3:%.*]] = call i1 @foo()		; CHECK-NEXT: [[COND3:%.*]] = call i1 @foo()
		; CHECK-NEXT: [[V3_PRE1]] = load i64, ptr [[P1]], align 8
; CHECK-NEXT: br i1 [[COND3]], label [[BB3]], label [[BB101:%.*]]		; CHECK-NEXT: br i1 [[COND3]], label [[BB3]], label [[BB101:%.*]]
; CHECK: bb101:		; CHECK: bb101:
; CHECK-NEXT: [[V4:%.*]] = load i64, ptr [[P1]], align 8		; CHECK-NEXT: store i64 [[V3_PRE1]], ptr [[P3:%.*]], align 8
; CHECK-NEXT: store i64 [[V4]], ptr [[P3:%.*]], align 8
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
; CHECK: bb200:		; CHECK: bb200:
; CHECK-NEXT: [[COND4:%.*]] = call i1 @bar()		; CHECK-NEXT: [[COND4:%.*]] = call i1 @bar()
		; CHECK-NEXT: [[V3_PRE]] = load i64, ptr [[P1]], align 8
; CHECK-NEXT: br i1 [[COND4]], label [[BB3]], label [[BB201:%.*]]		; CHECK-NEXT: br i1 [[COND4]], label [[BB3]], label [[BB201:%.*]]
; CHECK: bb201:		; CHECK: bb201:
; CHECK-NEXT: [[V5:%.*]] = load i64, ptr [[P1]], align 8		; CHECK-NEXT: store i64 [[V3_PRE]], ptr [[P4:%.*]], align 8
; CHECK-NEXT: store i64 [[V5]], ptr [[P4:%.*]], align 8
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
{		{
entry:		entry:
%v1 = load i64, ptr %p1, align 8		%v1 = load i64, ptr %p1, align 8
%cond1 = icmp sgt i64 %v1, 200		%cond1 = icmp sgt i64 %v1, 200
br i1 %cond1, label %bb200, label %bb1		br i1 %cond1, label %bb200, label %bb1

[... 32 lines not shown ...]
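
The test17 comment describes the situation the patch targets: each predecessor that reaches bb3 through a critical edge has another successor that already performs the same load. As a rough illustration of how such a sibling could be located, here is a minimal sketch over a toy CFG. `ToyBlock`, `ToyCFG`, `addBlock`, `findSiblingWithLoad`, and `makeTest17LikeCFG` are hypothetical names, and the sketch deliberately ignores the safety checks a real pass needs (for example, clobbering instructions between the branch and the load). It is not the logic in GVN.cpp.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical block model: successor names and the addresses it loads from.
struct ToyBlock {
  std::vector<std::string> Succs;
  std::set<std::string> LoadedAddrs;
};
using ToyCFG = std::map<std::string, ToyBlock>;

void addBlock(ToyCFG &CFG, const std::string &Name,
              std::vector<std::string> Succs,
              std::set<std::string> LoadedAddrs) {
  CFG[Name] = ToyBlock{std::move(Succs), std::move(LoadedAddrs)};
}

// Return a successor of Pred, other than LoadBB, that loads from Addr.
// Returns an empty string when no such sibling exists.
std::string findSiblingWithLoad(const ToyCFG &CFG, const std::string &Pred,
                                const std::string &LoadBB,
                                const std::string &Addr) {
  auto PredIt = CFG.find(Pred);
  if (PredIt == CFG.end())
    return "";
  for (const std::string &Succ : PredIt->second.Succs) {
    if (Succ == LoadBB)
      continue;
    auto SuccIt = CFG.find(Succ);
    if (SuccIt != CFG.end() && SuccIt->second.LoadedAddrs.count(Addr))
      return Succ;
  }
  return "";
}

// CFG shaped like test17: bb100 and bb200 reach bb3 through critical edges,
// and their other successors (bb101, bb201) load from the same address.
ToyCFG makeTest17LikeCFG() {
  ToyCFG CFG;
  addBlock(CFG, "entry", {"bb200", "bb1"}, {"%p1"});
  addBlock(CFG, "bb1", {"bb100", "bb2"}, {});
  addBlock(CFG, "bb2", {"bb3"}, {});
  addBlock(CFG, "bb3", {}, {"%p1"});
  addBlock(CFG, "bb100", {"bb3", "bb101"}, {});
  addBlock(CFG, "bb101", {}, {"%p1"});
  addBlock(CFG, "bb200", {"bb3", "bb201"}, {});
  addBlock(CFG, "bb201", {}, {"%p1"});
  return CFG;
}
```

With this model, a query for bb100 or bb200 finds the sibling load that could be moved into the predecessor, while bb2 has no sibling because bb3 is its only successor.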

llvm/test/Transforms/GVN/PRE/volatile.ll

	[... 116 lines not shown ...]
	exit:			exit:
	ret i32 %add			ret i32 %add
	}			}

	; Does cross block PRE work with volatiles?			; Does cross block PRE work with volatiles?
	define i32 @test7(i1 %c, ptr noalias nocapture %p, ptr noalias nocapture %q) {			define i32 @test7(i1 %c, ptr noalias nocapture %p, ptr noalias nocapture %q) {
	; CHECK-LABEL: @test7(			; CHECK-LABEL: @test7(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br i1 [[C:%.*]], label [[ENTRY_HEADER_CRIT_EDGE:%.*]], label [[SKIP:%.*]]
	; CHECK: entry.header_crit_edge:
	; CHECK-NEXT: [[Y_PRE:%.*]] = load i32, ptr [[P:%.*]], align 4			; CHECK-NEXT: [[Y_PRE:%.*]] = load i32, ptr [[P:%.*]], align 4
	; CHECK-NEXT: br label [[HEADER:%.*]]			; CHECK-NEXT: br i1 [[C:%.*]], label [[HEADER:%.*]], label [[SKIP:%.*]]
	; CHECK: skip:			; CHECK: skip:
	; CHECK-NEXT: [[Y1:%.*]] = load i32, ptr [[P]], align 4			; CHECK-NEXT: call void @use(i32 [[Y_PRE]])
	; CHECK-NEXT: call void @use(i32 [[Y1]])
	; CHECK-NEXT: br label [[HEADER]]			; CHECK-NEXT: br label [[HEADER]]
	; CHECK: header:			; CHECK: header:
	; CHECK-NEXT: [[Y:%.*]] = phi i32 [ [[Y_PRE]], [[ENTRY_HEADER_CRIT_EDGE]] ], [ [[Y]], [[HEADER]] ], [ [[Y1]], [[SKIP]] ]
	; CHECK-NEXT: [[X:%.*]] = load volatile i32, ptr [[Q:%.*]], align 4			; CHECK-NEXT: [[X:%.*]] = load volatile i32, ptr [[Q:%.*]], align 4
	; CHECK-NEXT: [[ADD:%.*]] = sub i32 [[Y]], [[X]]			; CHECK-NEXT: [[ADD:%.*]] = sub i32 [[Y_PRE]], [[X]]
	; CHECK-NEXT: [[CND:%.*]] = icmp eq i32 [[ADD]], 0			; CHECK-NEXT: [[CND:%.*]] = icmp eq i32 [[ADD]], 0
	; CHECK-NEXT: br i1 [[CND]], label [[EXIT:%.*]], label [[HEADER]]			; CHECK-NEXT: br i1 [[CND]], label [[EXIT:%.*]], label [[HEADER]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret i32 0			; CHECK-NEXT: ret i32 0
	;			;
	entry:			entry:
	br i1 %c, label %header, label %skip			br i1 %c, label %header, label %skip
	skip:			skip:
	[... 73 lines not shown ...]

llvm/test/Transforms/GVN/condprop.ll

	[... 515 lines not shown ...]
	; that gep2 does not alias ptr1 on that path (as it would require that			; that gep2 does not alias ptr1 on that path (as it would require that
	; ptr2==ptr2+2), so we can perform PRE of the load.			; ptr2==ptr2+2), so we can perform PRE of the load.
	define i32 @test13(ptr %ptr1, ptr %ptr2) {			define i32 @test13(ptr %ptr1, ptr %ptr2) {
	; CHECK-LABEL: @test13(			; CHECK-LABEL: @test13(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[GEP1:%.*]] = getelementptr i32, ptr [[PTR2:%.*]], i32 1			; CHECK-NEXT: [[GEP1:%.*]] = getelementptr i32, ptr [[PTR2:%.*]], i32 1
	; CHECK-NEXT: [[GEP2:%.*]] = getelementptr i32, ptr [[PTR2]], i32 2			; CHECK-NEXT: [[GEP2:%.*]] = getelementptr i32, ptr [[PTR2]], i32 2
	; CHECK-NEXT: [[CMP:%.*]] = icmp eq ptr [[PTR1:%.*]], [[PTR2]]			; CHECK-NEXT: [[CMP:%.*]] = icmp eq ptr [[PTR1:%.*]], [[PTR2]]
	; CHECK-NEXT: br i1 [[CMP]], label [[IF:%.*]], label [[ENTRY_END_CRIT_EDGE:%.*]]
	; CHECK: entry.end_crit_edge:
	; CHECK-NEXT: [[VAL2_PRE:%.*]] = load i32, ptr [[GEP2]], align 4			; CHECK-NEXT: [[VAL2_PRE:%.*]] = load i32, ptr [[GEP2]], align 4
	; CHECK-NEXT: br label [[END:%.*]]			; CHECK-NEXT: br i1 [[CMP]], label [[IF:%.*]], label [[END:%.*]]
	; CHECK: if:			; CHECK: if:
	; CHECK-NEXT: [[VAL1:%.*]] = load i32, ptr [[GEP2]], align 4
	; CHECK-NEXT: br label [[END]]			; CHECK-NEXT: br label [[END]]
	; CHECK: end:			; CHECK: end:
	; CHECK-NEXT: [[VAL2:%.*]] = phi i32 [ [[VAL1]], [[IF]] ], [ [[VAL2_PRE]], [[ENTRY_END_CRIT_EDGE]] ]			; CHECK-NEXT: [[PHI1:%.*]] = phi ptr [ [[PTR2]], [[IF]] ], [ [[GEP1]], [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[PHI1:%.*]] = phi ptr [ [[PTR2]], [[IF]] ], [ [[GEP1]], [[ENTRY_END_CRIT_EDGE]] ]			; CHECK-NEXT: [[PHI2:%.*]] = phi i32 [ [[VAL2_PRE]], [[IF]] ], [ 0, [[ENTRY]] ]
	; CHECK-NEXT: [[PHI2:%.*]] = phi i32 [ [[VAL1]], [[IF]] ], [ 0, [[ENTRY_END_CRIT_EDGE]] ]
	; CHECK-NEXT: store i32 0, ptr [[PHI1]], align 4			; CHECK-NEXT: store i32 0, ptr [[PHI1]], align 4
	; CHECK-NEXT: [[RET:%.*]] = add i32 [[PHI2]], [[VAL2]]			; CHECK-NEXT: [[RET:%.*]] = add i32 [[PHI2]], [[VAL2_PRE]]
	; CHECK-NEXT: ret i32 [[RET]]			; CHECK-NEXT: ret i32 [[RET]]
	;			;
	entry:			entry:
	%gep1 = getelementptr i32, ptr %ptr2, i32 1			%gep1 = getelementptr i32, ptr %ptr2, i32 1
	%gep2 = getelementptr i32, ptr %ptr2, i32 2			%gep2 = getelementptr i32, ptr %ptr2, i32 2
	%cmp = icmp eq ptr %ptr1, %ptr2			%cmp = icmp eq ptr %ptr1, %ptr2
	br i1 %cmp, label %if, label %end			br i1 %cmp, label %if, label %end

	[... 71 lines not shown ...]

Revision Contents

Diff 487573
