This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE
ClosedPublic

Authored by bmakam on Nov 3 2014, 3:25 PM.

Details

Reviewers

• HaoLiu
jmolloy
• dberlin
resistor
Jiangning
hfinkel
mcrosier
apazos

Summary

All,

This patch addresses the missing PRE opportunities initially reported by James Molloy in 450.soplex.

This patch refactors James' patch to "Make GVN more iterative", based on Daniel's comment that iterating all of GVN over again is a pretty big hammer.
Instead of iterating GVN all over, this patch performs ScalarPRE on any scalar instructions that a load depends on before performing LoadPRE on that load.

When tested on a Cortex-A57, James' initial patch to make GVN more iterative improved 450.soplex by 3%. This patch improves 450.soplex by 7% without iterating GVN all over again.
To achieve this I had to enable the reverse post-order traversal in iterateOnFunction, because the dependent scalar instructions must be value numbered before ScalarPRE is performed on them. Traversing in reverse post order is costly in terms of compile time, but it may be cheaper than iterating GVN all over again, and it also results in better performance. What do you guys think?
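
The ordering argument can be illustrated with a small, self-contained sketch. It is not LLVM code: `CFG`, `reversePostOrder` and `makeExampleCFG` are hypothetical names, and the block numbering is an assumption modelled on the shape of the regression test in this revision (`entry`, `sw.bb`, `if.end`, `sw.bb2`, and so on). In a reverse post-order walk, every predecessor of a block is visited before the block itself unless the edge is a back edge, so a scalar computed in `sw.bb` would already have been seen by the time `sw.bb2` is processed.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Toy CFG (hypothetical, not an LLVM type): Succs[B] lists the successors of B.
using CFG = std::vector<std::vector<int>>;

// Returns the blocks reachable from Entry in reverse post order.
std::vector<int> reversePostOrder(const CFG &Succs, int Entry) {
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<int> PostOrder;
  std::function<void(int)> DFS = [&](int B) {
    Visited[B] = true;
    for (int S : Succs[B])
      if (!Visited[S])
        DFS(S);
    PostOrder.push_back(B); // a block is emitted after all of its successors
  };
  DFS(Entry);
  std::reverse(PostOrder.begin(), PostOrder.end());
  return PostOrder;
}

// Block ids assumed for illustration:
// 0 entry, 1 sw.default, 2 sw.bb, 3 sw.bb2, 4 if.then, 5 if.end, 6 return
CFG makeExampleCFG() {
  CFG G = {{1, 2, 3}, {6}, {4, 5}, {6}, {6}, {3}, {}};
  return G;
}
```

Calling `reversePostOrder(makeExampleCFG(), 0)` places block 2 (`sw.bb`) and block 5 (`if.end`) ahead of block 3 (`sw.bb2`). A depth-first walk of the dominator tree gives no such guarantee here, since `sw.bb` and `sw.bb2` would both be children of `entry` and could be visited in either order.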

Event Timeline

bmakam updated this revision to Diff 15736.Nov 3 2014, 3:25 PM

bmakam retitled this revision from to [GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE.

bmakam updated this object.

bmakam edited the test plan for this revision.

bmakam added reviewers: jmolloy, • dberlin, apazos, Jiangning, • HaoLiu, mcrosier, hfinkel.

bmakam added a subscriber: Unknown Object (MLST).

Thanks for continuing to work on this!

Generally, please post patches with full context. For instructions on how to do this, see: http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface

I think that a 7% speedup sounds nice. Does anything else improve? Please also provide some compile-time slowdown numbers, so that we can get a better handle on the cost/benefit analysis.

lib/Transforms/Scalar/GVN.cpp
2655	If the patch turns this on, please just remove the #if.

I turned on the slow path and commented out the fast path. If we can decide we no longer need to keep the fast path around I will clean it up.

I am running a perf run to gather compile times and will update the comment once I get back the results.

[Update1]
While I am still waiting for perf data on other benchmarks in Spec2k/2k6, here is the data I got so far:

a) Compilation times:

compiling clang.bc (Thanks to Jiangning for running the tests)

With the patch,

real 19m56.978s
user 141m16.602s
sys 2m59.942s

Without the patch, (original)

real 19m58.099s
user 141m21.219s
sys 2m58.493s

which is 2s (0.85%) slower, so the slowdown is in the noise range.

On Spec, 433.milc slowed down the most, by 5%. The other slowdowns were:

179.art - 2%
164.gzip - 2%
445.gobmk - 2%

b) Runtime Performance:
Other benchmarks whose performance improved:
447.dealII - 6%
403.gcc - 14%

I will update data on other benchmarks later.

[Update2]
401.bzip2 - 4%
464.h264ref - 4%
186.crafty - 7%

The only regression above the noise range was in 181.mcf, at -3%.

bmakam added inline comments.Nov 4 2014, 6:09 AM

lib/Transforms/Scalar/GVN.cpp
2655	I will turn on the slow path and turn off the fast path in my next patch that I will upload with full context. I am not sure if we still want to keep the fast path commented out in the code or clean it up.

hfinkel added inline comments.Nov 4 2014, 6:39 AM

lib/Transforms/Scalar/GVN.cpp
2655	If you want to still keep it, add a command-line flag to enable it. We don't generally keep commented-out code, as a policy.

bmakam added a reviewer: resistor.Nov 6 2014, 1:42 AM

All,

I updated my comment with the compile-time slowdowns and the other benchmarks that improve with this patch. The compile-time slowdowns are in the noise range, and I do not see any performance regressions greater than 3% in Spec. Based on this data, I would like to get rid of the fast path and, if you agree, will upload a patch removing the commented-out code. Thanks for reviewing.

Sounds good to me.
I'll review in detail later today

Cleaned up dead code.

Ping.

LGTM modulo one comment

lib/Transforms/Scalar/GVN.cpp
2447–2452	If you are going to add LoadInst, you might as well add all the memory ops (anything where getOpcode() >= MemoryOpsBegin && getOpcode() < MemoryOpsEnd). If you do this, I'd add isMemoryOp to Instruction.h alongside isBinaryOp, etc.

bmakam added inline comments.Nov 13 2014, 1:05 PM

lib/Transforms/Scalar/GVN.cpp
2447–2452	Thanks for catching this Daniel. The LoadInst was unintentional, it is already covered by CurInst->mayReadFromMemory(). I will prepare a patch removing the LoadInst. If LGTM, please feel free to +2 it since I do not have commit rights.
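
A range-based predicate of the kind Daniel describes could look roughly like the sketch below. The opcode numbering here is hypothetical and exists only to make the example compile on its own; LLVM defines the real values in its own headers. The sketch assumes a half-open range, with `MemoryOpsBegin` equal to the first memory opcode and `MemoryOpsEnd` one past the last, in the style of `isBinaryOp`.

```cpp
#include <cassert>

// Hypothetical opcode numbering, for illustration only.
enum Opcode : unsigned {
  Add,
  Sub,
  MemoryOpsBegin,          // equal to the first memory opcode
  Alloca = MemoryOpsBegin,
  Load,
  Store,
  GetElementPtr,
  MemoryOpsEnd,            // one past the last memory opcode
  ICmp = MemoryOpsEnd
};

// True for every opcode in the half-open range [MemoryOpsBegin, MemoryOpsEnd).
inline bool isMemoryOp(unsigned Opc) {
  return Opc >= MemoryOpsBegin && Opc < MemoryOpsEnd;
}
```

With this numbering, `isMemoryOp(Load)` and `isMemoryOp(Alloca)` hold, while `isMemoryOp(Add)` and `isMemoryOp(ICmp)` do not.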

Addressed Daniel's comment and also rebased.

Approved based on Daniel's review.

This revision is now accepted and ready to land.Nov 13 2014, 1:16 PM

Committed r221924.

Revision Contents

Path	Size
lib/Transforms/Scalar/GVN.cpp	332 lines
test/Transforms/GVN/pre-gep-load.ll	49 lines

Diff 16174

lib/Transforms/Scalar/GVN.cpp

... 14 lines not shown ...
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DepthFirstIterator.h"		#include "llvm/ADT/DepthFirstIterator.h"
#include "llvm/ADT/Hashing.h"		#include "llvm/ADT/Hashing.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
		#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/AssumptionTracker.h"		#include "llvm/Analysis/AssumptionTracker.h"
#include "llvm/Analysis/CFG.h"		#include "llvm/Analysis/CFG.h"
#include "llvm/Analysis/ConstantFolding.h"		#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
... 673 lines not shown ...	bool PerformLoadPRE(LoadInst *LI, AvailValInBlkVect &ValuesPerBlock,
UnavailBlkVect &UnavailableBlocks);		UnavailBlkVect &UnavailableBlocks);

// Other helper routines		// Other helper routines
bool processInstruction(Instruction *I);		bool processInstruction(Instruction *I);
bool processBlock(BasicBlock *BB);		bool processBlock(BasicBlock *BB);
void dump(DenseMap<uint32_t, Value*> &d);		void dump(DenseMap<uint32_t, Value*> &d);
bool iterateOnFunction(Function &F);		bool iterateOnFunction(Function &F);
bool performPRE(Function &F);		bool performPRE(Function &F);
		bool performScalarPRE(Instruction *I);
Value *findLeader(const BasicBlock *BB, uint32_t num);		Value *findLeader(const BasicBlock *BB, uint32_t num);
void cleanupGlobalSets();		void cleanupGlobalSets();
void verifyRemoved(const Instruction *I) const;		void verifyRemoved(const Instruction *I) const;
bool splitCriticalEdges();		bool splitCriticalEdges();
BasicBlock *splitCriticalEdges(BasicBlock *Pred, BasicBlock *Succ);		BasicBlock *splitCriticalEdges(BasicBlock *Pred, BasicBlock *Succ);
unsigned replaceAllDominatedUsesWith(Value *From, Value *To,		unsigned replaceAllDominatedUsesWith(Value *From, Value *To,
const BasicBlockEdge &Root);		const BasicBlockEdge &Root);
bool propagateEquality(Value *LHS, Value *RHS, const BasicBlockEdge &Root);		bool propagateEquality(Value *LHS, Value *RHS, const BasicBlockEdge &Root);
... 1,004 lines not shown ...	if (NumDeps == 1 &&
DEBUG(		DEBUG(
dbgs() << "GVN: non-local load ";		dbgs() << "GVN: non-local load ";
LI->printAsOperand(dbgs());		LI->printAsOperand(dbgs());
dbgs() << " has unknown dependencies\n";		dbgs() << " has unknown dependencies\n";
);		);
return false;		return false;
}		}

		// If this load follows a GEP, see if we can PRE the indices before analyzing.
		if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(LI->getOperand(0))) {
		for(GetElementPtrInst::op_iterator OI = GEP->idx_begin(),
		OE = GEP->idx_end(); OI != OE; ++OI)
		if (Instruction *I = dyn_cast<Instruction>(OI->get()))
		performScalarPRE(I);
		}

// Step 2: Analyze the availability of the load		// Step 2: Analyze the availability of the load
AvailValInBlkVect ValuesPerBlock;		AvailValInBlkVect ValuesPerBlock;
UnavailBlkVect UnavailableBlocks;		UnavailBlkVect UnavailableBlocks;
AnalyzeLoadAvailability(LI, Deps, ValuesPerBlock, UnavailableBlocks);		AnalyzeLoadAvailability(LI, Deps, ValuesPerBlock, UnavailableBlocks);

// If we have no predecessors that produce a known value for this load, exit		// If we have no predecessors that produce a known value for this load, exit
// early.		// early.
if (ValuesPerBlock.empty())		if (ValuesPerBlock.empty())
... 686 lines not shown ...	if (AtStart)
BI = BB->begin();		BI = BB->begin();
else		else
++BI;		++BI;
}		}

return ChangedFunction;		return ChangedFunction;
}		}

/// performPRE - Perform a purely local form of PRE that looks for diamond		bool GVN::performScalarPRE(Instruction *CurInst) {
/// control flow patterns and attempts to perform simple PRE at the join point.
bool GVN::performPRE(Function &F) {
bool Changed = false;
SmallVector<std::pair<Value*, BasicBlock*>, 8> predMap;		SmallVector<std::pair<Value*, BasicBlock*>, 8> predMap;
for (BasicBlock *CurrentBlock : depth_first(&F.getEntryBlock())) {
// Nothing to PRE in the entry block.
if (CurrentBlock == &F.getEntryBlock()) continue;

// Don't perform PRE on a landing pad.
if (CurrentBlock->isLandingPad()) continue;

for (BasicBlock::iterator BI = CurrentBlock->begin(),
BE = CurrentBlock->end(); BI != BE; ) {
Instruction *CurInst = BI++;

if (isa<AllocaInst>(CurInst) ||		if (isa<AllocaInst>(CurInst) ||
isa<TerminatorInst>(CurInst) || isa<PHINode>(CurInst) ||		isa<TerminatorInst>(CurInst) || isa<PHINode>(CurInst) ||
CurInst->getType()->isVoidTy() ||		CurInst->getType()->isVoidTy() ||
CurInst->mayReadFromMemory() || CurInst->mayHaveSideEffects() ||		CurInst->mayReadFromMemory() || CurInst->mayHaveSideEffects() ||
isa<DbgInfoIntrinsic>(CurInst))		isa<DbgInfoIntrinsic>(CurInst))
continue;		return false;

// Don't do PRE on compares. The PHI would prevent CodeGenPrepare from		// Don't do PRE on compares. The PHI would prevent CodeGenPrepare from
// sinking the compare again, and it would force the code generator to		// sinking the compare again, and it would force the code generator to
// move the i1 from processor flags or predicate registers into a general		// move the i1 from processor flags or predicate registers into a general
// purpose register.		// purpose register.
if (isa<CmpInst>(CurInst))		if (isa<CmpInst>(CurInst))
continue;		return false;

// We don't currently value number ANY inline asm calls.		// We don't currently value number ANY inline asm calls.
if (CallInst *CallI = dyn_cast<CallInst>(CurInst))		if (CallInst *CallI = dyn_cast<CallInst>(CurInst))
if (CallI->isInlineAsm())		if (CallI->isInlineAsm())
continue;		return false;

uint32_t ValNo = VN.lookup(CurInst);		uint32_t ValNo = VN.lookup(CurInst);

// Look for the predecessors for PRE opportunities. We're		// Look for the predecessors for PRE opportunities. We're
// only trying to solve the basic diamond case, where		// only trying to solve the basic diamond case, where
// a value is computed in the successor and one predecessor,		// a value is computed in the successor and one predecessor,
// but not the other. We also explicitly disallow cases		// but not the other. We also explicitly disallow cases
// where the successor is its own predecessor, because they're		// where the successor is its own predecessor, because they're
// more complicated to get right.		// more complicated to get right.
unsigned NumWith = 0;		unsigned NumWith = 0;
unsigned NumWithout = 0;		unsigned NumWithout = 0;
BasicBlock *PREPred = nullptr;		BasicBlock *PREPred = nullptr;
		BasicBlock *CurrentBlock = CurInst->getParent();
predMap.clear();		predMap.clear();

for (pred_iterator PI = pred_begin(CurrentBlock),		for (pred_iterator PI = pred_begin(CurrentBlock),
PE = pred_end(CurrentBlock); PI != PE; ++PI) {		PE = pred_end(CurrentBlock); PI != PE; ++PI) {
BasicBlock *P = *PI;		BasicBlock *P = *PI;
// We're not interested in PRE where the block is its		// We're not interested in PRE where the block is its
// own predecessor, or in blocks with predecessors		// own predecessor, or in blocks with predecessors
// that are not reachable.		// that are not reachable.
if (P == CurrentBlock) {		if (P == CurrentBlock) {
NumWithout = 2;		NumWithout = 2;
break;		break;
} else if (!DT->isReachableFromEntry(P)) {		} else if (!DT->isReachableFromEntry(P)) {
NumWithout = 2;		NumWithout = 2;
break;		break;
}		}

Value* predV = findLeader(P, ValNo);		Value* predV = findLeader(P, ValNo);
if (!predV) {		if (!predV) {
predMap.push_back(std::make_pair(static_cast<Value *>(nullptr), P));		predMap.push_back(std::make_pair(static_cast<Value *>(nullptr), P));
PREPred = P;		PREPred = P;
++NumWithout;		++NumWithout;
} else if (predV == CurInst) {		} else if (predV == CurInst) {
/* CurInst dominates this predecessor. */		/* CurInst dominates this predecessor. */
NumWithout = 2;		NumWithout = 2;
break;		break;
} else {		} else {
predMap.push_back(std::make_pair(predV, P));		predMap.push_back(std::make_pair(predV, P));
++NumWith;		++NumWith;
}		}
}		}

// Don't do PRE when it might increase code size, i.e. when		// Don't do PRE when it might increase code size, i.e. when
// we would need to insert instructions in more than one pred.		// we would need to insert instructions in more than one pred.
if (NumWithout != 1 || NumWith == 0)		if (NumWithout != 1 || NumWith == 0)
continue;		return false;

// Don't do PRE across indirect branch.		// Don't do PRE across indirect branch.
if (isa<IndirectBrInst>(PREPred->getTerminator()))		if (isa<IndirectBrInst>(PREPred->getTerminator()))
continue;		return false;

// We can't do PRE safely on a critical edge, so instead we schedule		// We can't do PRE safely on a critical edge, so instead we schedule
// the edge to be split and perform the PRE the next time we iterate		// the edge to be split and perform the PRE the next time we iterate
// on the function.		// on the function.
unsigned SuccNum = GetSuccessorNumber(PREPred, CurrentBlock);		unsigned SuccNum = GetSuccessorNumber(PREPred, CurrentBlock);
if (isCriticalEdge(PREPred->getTerminator(), SuccNum)) {		if (isCriticalEdge(PREPred->getTerminator(), SuccNum)) {
toSplit.push_back(std::make_pair(PREPred->getTerminator(), SuccNum));		toSplit.push_back(std::make_pair(PREPred->getTerminator(), SuccNum));
continue;		return false;
}		}

// Instantiate the expression in the predecessor that lacked it.		// Instantiate the expression in the predecessor that lacked it.
// Because we are going top-down through the block, all value numbers		// Because we are going top-down through the block, all value numbers
// will be available in the predecessor by the time we need them. Any		// will be available in the predecessor by the time we need them. Any
// that weren't originally present will have been instantiated earlier		// that weren't originally present will have been instantiated earlier
// in this loop.		// in this loop.
Instruction *PREInstr = CurInst->clone();		Instruction *PREInstr = CurInst->clone();
bool success = true;		bool success = true;
for (unsigned i = 0, e = CurInst->getNumOperands(); i != e; ++i) {		for (unsigned i = 0, e = CurInst->getNumOperands(); i != e; ++i) {
Value *Op = PREInstr->getOperand(i);		Value *Op = PREInstr->getOperand(i);
if (isa<Argument>(Op) || isa<Constant>(Op) || isa<GlobalValue>(Op))		if (isa<Argument>(Op) || isa<Constant>(Op) || isa<GlobalValue>(Op))
continue;		continue;

if (Value *V = findLeader(PREPred, VN.lookup(Op))) {		if (Value *V = findLeader(PREPred, VN.lookup(Op))) {
PREInstr->setOperand(i, V);		PREInstr->setOperand(i, V);
} else {		} else {
success = false;		success = false;
break;		break;
}		}
}		}

// Fail out if we encounter an operand that is not available in		// Fail out if we encounter an operand that is not available in
// the PRE predecessor. This is typically because of loads which		// the PRE predecessor. This is typically because of loads which
// are not value numbered precisely.		// are not value numbered precisely.
if (!success) {		if (!success) {
DEBUG(verifyRemoved(PREInstr));		DEBUG(verifyRemoved(PREInstr));
delete PREInstr;		delete PREInstr;
continue;		return false;
}		}

PREInstr->insertBefore(PREPred->getTerminator());		PREInstr->insertBefore(PREPred->getTerminator());
PREInstr->setName(CurInst->getName() + ".pre");		PREInstr->setName(CurInst->getName() + ".pre");
PREInstr->setDebugLoc(CurInst->getDebugLoc());		PREInstr->setDebugLoc(CurInst->getDebugLoc());
VN.add(PREInstr, ValNo);		VN.add(PREInstr, ValNo);
++NumGVNPRE;		++NumGVNPRE;

// Update the availability map to include the new instruction.		// Update the availability map to include the new instruction.
addToLeaderTable(ValNo, PREInstr, PREPred);		addToLeaderTable(ValNo, PREInstr, PREPred);

// Create a PHI to make the value available in this block.		// Create a PHI to make the value available in this block.
PHINode* Phi = PHINode::Create(CurInst->getType(), predMap.size(),		PHINode* Phi = PHINode::Create(CurInst->getType(), predMap.size(),
CurInst->getName() + ".pre-phi",		CurInst->getName() + ".pre-phi",
CurrentBlock->begin());		CurrentBlock->begin());
for (unsigned i = 0, e = predMap.size(); i != e; ++i) {		for (unsigned i = 0, e = predMap.size(); i != e; ++i) {
if (Value *V = predMap[i].first)		if (Value *V = predMap[i].first)
Phi->addIncoming(V, predMap[i].second);		Phi->addIncoming(V, predMap[i].second);
else		else
Phi->addIncoming(PREInstr, PREPred);		Phi->addIncoming(PREInstr, PREPred);
}		}

VN.add(Phi, ValNo);		VN.add(Phi, ValNo);
addToLeaderTable(ValNo, Phi, CurrentBlock);		addToLeaderTable(ValNo, Phi, CurrentBlock);
Phi->setDebugLoc(CurInst->getDebugLoc());		Phi->setDebugLoc(CurInst->getDebugLoc());
CurInst->replaceAllUsesWith(Phi);		CurInst->replaceAllUsesWith(Phi);
if (Phi->getType()->getScalarType()->isPointerTy()) {		if (Phi->getType()->getScalarType()->isPointerTy()) {
// Because we have added a PHI-use of the pointer value, it has now		// Because we have added a PHI-use of the pointer value, it has now
// "escaped" from alias analysis' perspective. We need to inform		// "escaped" from alias analysis' perspective. We need to inform
// AA of this.		// AA of this.
for (unsigned ii = 0, ee = Phi->getNumIncomingValues(); ii != ee;		for (unsigned ii = 0, ee = Phi->getNumIncomingValues(); ii != ee;
++ii) {		++ii) {
unsigned jj = PHINode::getOperandNumForIncomingValue(ii);		unsigned jj = PHINode::getOperandNumForIncomingValue(ii);
VN.getAliasAnalysis()->addEscapingUse(Phi->getOperandUse(jj));		VN.getAliasAnalysis()->addEscapingUse(Phi->getOperandUse(jj));
}		}

if (MD)		if (MD)
MD->invalidateCachedPointerInfo(Phi);		MD->invalidateCachedPointerInfo(Phi);
}		}
VN.erase(CurInst);		VN.erase(CurInst);
removeFromLeaderTable(ValNo, CurInst, CurrentBlock);		removeFromLeaderTable(ValNo, CurInst, CurrentBlock);

DEBUG(dbgs() << "GVN PRE removed: " << *CurInst << '\n');		DEBUG(dbgs() << "GVN PRE removed: " << *CurInst << '\n');
if (MD) MD->removeInstruction(CurInst);		if (MD) MD->removeInstruction(CurInst);
DEBUG(verifyRemoved(CurInst));		DEBUG(verifyRemoved(CurInst));
CurInst->eraseFromParent();		CurInst->eraseFromParent();
Changed = true;		return true;
		}

		/// performPRE - Perform a purely local form of PRE that looks for diamond
		/// control flow patterns and attempts to perform simple PRE at the join point.
		bool GVN::performPRE(Function &F) {
		bool Changed = false;
		for (BasicBlock *CurrentBlock : depth_first(&F.getEntryBlock())) {
		// Nothing to PRE in the entry block.
		if (CurrentBlock == &F.getEntryBlock()) continue;

		// Don't perform PRE on a landing pad.
		if (CurrentBlock->isLandingPad()) continue;

		for (BasicBlock::iterator BI = CurrentBlock->begin(),
		BE = CurrentBlock->end(); BI != BE; ) {
		Instruction *CurInst = BI++;
		Changed |= performScalarPRE(CurInst);
}		}
}		}

if (splitCriticalEdges())		if (splitCriticalEdges())
Changed = true;		Changed = true;

return Changed;		return Changed;
}		}
... 21 lines not shown ...
}		}

/// iterateOnFunction - Executes one iteration of GVN		/// iterateOnFunction - Executes one iteration of GVN
bool GVN::iterateOnFunction(Function &F) {		bool GVN::iterateOnFunction(Function &F) {
cleanupGlobalSets();		cleanupGlobalSets();

// Top-down walk of the dominator tree		// Top-down walk of the dominator tree
bool Changed = false;		bool Changed = false;
#if 0
// Needed for value numbering with phi construction to work.		// Needed for value numbering with phi construction to work.
ReversePostOrderTraversal<Function*> RPOT(&F);		ReversePostOrderTraversal<Function*> RPOT(&F);
for (ReversePostOrderTraversal<Function*>::rpo_iterator RI = RPOT.begin(),		for (ReversePostOrderTraversal<Function*>::rpo_iterator RI = RPOT.begin(),
RE = RPOT.end(); RI != RE; ++RI)		RE = RPOT.end(); RI != RE; ++RI)
Changed |= processBlock(*RI);		Changed |= processBlock(*RI);
#else
// Save the blocks this function have before transformation begins. GVN may
// split critical edge, and hence may invalidate the RPO/DT iterator.
//
std::vector<BasicBlock *> BBVect;
BBVect.reserve(256);
for (DomTreeNode *X : depth_first(DT->getRootNode()))
BBVect.push_back(X->getBlock());

for (std::vector<BasicBlock *>::iterator I = BBVect.begin(), E = BBVect.end();
I != E; I++)
Changed |= processBlock(*I);
#endif

return Changed;		return Changed;
}		}

void GVN::cleanupGlobalSets() {		void GVN::cleanupGlobalSets() {
VN.clear();		VN.clear();
LeaderTable.clear();		LeaderTable.clear();
TableAllocator.Reset();		TableAllocator.Reset();
}		}
... 149 lines not shown ...
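
The predecessor counting at the heart of performScalarPRE can be summarised with a standalone sketch. `PredInfo` and `pickPREPred` are hypothetical names introduced for illustration, and the sketch deliberately leaves out the other bail-outs in the real function (unreachable predecessors, a leader that is the instruction itself, indirect branches, and critical edges). It only shows the basic diamond rule: clone the instruction into exactly one predecessor, and only when at least one other predecessor already provides the value.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one predecessor: its block id and whether a leader
// for the value number under consideration is already available there.
struct PredInfo {
  int Block;
  bool HasLeader;
};

// Returns the id of the single predecessor that lacks the value and should
// receive a clone of the instruction, or -1 when PRE is not attempted.
int pickPREPred(const std::vector<PredInfo> &Preds, int CurrentBlock) {
  unsigned NumWith = 0, NumWithout = 0;
  int PREPred = -1;
  for (const PredInfo &P : Preds) {
    if (P.Block == CurrentBlock)
      return -1; // the block is its own predecessor: not handled
    if (P.HasLeader) {
      ++NumWith;
    } else {
      ++NumWithout;
      PREPred = P.Block;
    }
  }
  // Inserting into more than one predecessor could grow code size, and with
  // no predecessor providing the value there is nothing to merge in a PHI.
  if (NumWithout != 1 || NumWith == 0)
    return -1;
  return PREPred;
}
```

For a join block with two predecessors where only one already computes the value, the function returns the other predecessor. If neither or both compute it, the function returns -1 and nothing is inserted.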

test/Transforms/GVN/pre-gep-load.ll

This file was added.

				; RUN: opt < %s -basicaa -gvn -enable-load-pre -S | FileCheck %s
				target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64--linux-gnu"

				define double @foo(i32 %stat, i32 %i, double** %p) {
				; CHECK-LABEL: @foo(
				entry:
				switch i32 %stat, label %sw.default [
				i32 0, label %sw.bb
				i32 1, label %sw.bb
				i32 2, label %sw.bb2
				]

				sw.bb: ; preds = %entry, %entry
				%idxprom = sext i32 %i to i64
				%arrayidx = getelementptr inbounds double** %p, i64 0
				%0 = load double** %arrayidx, align 8
				%arrayidx1 = getelementptr inbounds double* %0, i64 %idxprom
				%1 = load double* %arrayidx1, align 8
				%sub = fsub double %1, 1.000000e+00
				%cmp = fcmp olt double %sub, 0.000000e+00
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %sw.bb
				br label %return

				if.end: ; preds = %sw.bb
				br label %sw.bb2

				sw.bb2: ; preds = %if.end, %entry
				%idxprom3 = sext i32 %i to i64
				%arrayidx4 = getelementptr inbounds double** %p, i64 0
				%2 = load double** %arrayidx4, align 8
				%arrayidx5 = getelementptr inbounds double* %2, i64 %idxprom3
				%3 = load double* %arrayidx5, align 8
				; CHECK: sw.bb2:
				; CHECK-NEXT-NOT: sext
				; CHECK-NEXT: phi double [
				; CHECK-NOT: load
				%sub6 = fsub double 3.000000e+00, %3
				br label %return

				sw.default: ; preds = %entry
				br label %return

				return: ; preds = %sw.default, %sw.bb2, %if.then
				%retval.0 = phi double [ 0.000000e+00, %sw.default ], [ %sub6, %sw.bb2 ], [ %sub, %if.then ]
				ret double %retval.0
				}
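
For orientation, the IR in this test has roughly the shape that a C++ function like the one below could lower to. This is a reconstruction from the IR and an assumption, not the source the test was reduced from. The point of interest is the index `i`: it is sign-extended once on the path through cases 0 and 1 and again in case 2, and case 2 is also reached directly from the switch, so the second extension and the load it feeds are only partially redundant.

```cpp
#include <cassert>

// Hypothetical source-level equivalent of @foo in pre-gep-load.ll.
double foo(int stat, int i, double **p) {
  switch (stat) {
  case 0:
  case 1: {
    double sub = p[0][i] - 1.0; // first use of the index: sext, gep, load
    if (sub < 0.0)
      return sub;
  }
  // falls through to case 2
  case 2:
    return 3.0 - p[0][i]; // same index and same load, recomputed here
  default:
    return 0.0;
  }
}
```

As a usage example, with `p[0]` pointing at the values `{0.5, 2.5}`, `foo(0, 1, p)` takes the fall-through path and evaluates `3.0 - 2.5`, while `foo(0, 0, p)` returns early with `0.5 - 1.0`.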