
Details

Reviewers

tejohnson
spatel

Commits

rGce3be45cacc1: [CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst
rL355751: [CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst

Summary

r344412 fixed a huge compile-time regression, but it requires the ModifiedDT flag
to be maintained correctly by the optimizations in optimizeBlock() and optimizeInst().

optimizeSelectInst() does not update the flag (it updates a different ModifiedDT, a member of the CodeGenPrepare class).

This patch propagates the flag from optimizeSelectInst() back to optimizeBlock().

I also changed the name of ModifiedDT in the CodeGenPrepare class to avoid confusion. It seems that this member is not used anywhere, so we may want to delete it.
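
To illustrate the problem described in the summary, here is a minimal, self-contained sketch. It is not LLVM code: ToyPass, ModifiedDTMember, optimizeSelectOld, optimizeSelectNew, and callerSeesInvalidation are hypothetical names that only mimic the shape of the bug, where a callee records a CFG change in a class member while the caller checks a flag passed by reference.

```cpp
// Illustrative sketch only: hypothetical stand-ins, not the CodeGenPrepare code.
struct ToyPass {
  // Stale class member that no caller ever reads (hypothetical name).
  bool ModifiedDTMember = false;

  // Shape of the bug: the CFG change is recorded only in the member.
  bool optimizeSelectOld() {
    ModifiedDTMember = true;
    return true; // a change was made
  }

  // Shape of the fix: the CFG change is reported through the caller's flag.
  bool optimizeSelectNew(bool &ModifiedDT) {
    ModifiedDT = true;
    return true; // a change was made
  }

  // Stand-in for a caller such as optimizeBlock(): returns whether the
  // caller's own flag observed the invalidation.
  bool callerSeesInvalidation(bool UseFix) {
    bool ModifiedDT = false; // local flag owned by the caller
    if (UseFix)
      optimizeSelectNew(ModifiedDT);
    else
      optimizeSelectOld();
    return ModifiedDT;
  }
};
```

Calling callerSeesInvalidation(false) on a ToyPass leaves the caller's flag unset even though the toy "CFG" changed, while callerSeesInvalidation(true) sets it.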

Diff Detail

Event Timeline

xur created this revision. Mar 8 2019, 9:48 AM

Herald added a subscriber: jdoerfert. Mar 8 2019, 9:48 AM

Thanks for the fix. A couple comments below.

lib/CodeGen/CodeGenPrepare.cpp
293–294	I'd either remove this completely, since it isn't used, or migrate to using this instead of passing around a flag parameter.
7248	I think this should still set ModifiedDT so that the caller handles it appropriately (see callsite in optimizeBlock). The comment doesn't make sense to me...might be stale?

xur marked 2 inline comments as done. Mar 8 2019, 10:11 AM

xur added inline comments.

lib/CodeGen/CodeGenPrepare.cpp
293–294	I think removing this field is better. The existing uses of this field will be kept in a ref parameter.
7248	We can remove all this.

Integrated Teresa's comments.

Also switched to spatel's test from Bug 41004, which is much cleaner.
My test was produced by bugpoint (reduced from a clang PGO bootstrap) and needs branch weights.

Just got a report of an internal test failure from my commit, let me see if your patch fixes it.

lib/CodeGen/CodeGenPrepare.cpp
2045	Should this set the new ModifiedDT?

In D59139#1422924, @tejohnson wrote:

Just got a report of an internal test failure from my commit, let me see if your patch fixes it.

This patch fixes the test failure.

xur added inline comments. Mar 8 2019, 10:32 AM

lib/CodeGen/CodeGenPrepare.cpp
2045	Sorry for missing this one. We should. I will update the patch.

Added one update that was missing from the last version of the patch.

LGTM

This revision is now accepted and ready to land. Mar 8 2019, 10:42 AM

Minor: it would be better to make the test in IR (opt -codegenprepare), so it would live under test/Transforms/CodeGenPrepare/X86/. That's because this isn't really an x86 bug. It could affect any target, and the bug is really one of an IR transform. Along those lines, I would include auto-generated FileCheck output too even if the IR for this example is semi-useless, just so we know exactly what is coming out of CGP.

changed test case based on spatel's comments.

spatel added inline comments. Mar 8 2019, 12:38 PM

test/Transforms/CodeGenPrepare/X86/optimizeSelect-DT.ll
24	typo - CEECK You could avoid this kind of bug by using the script at utils/update_test_checks.py.

Now using utils/update_test_checks.py to generate the test checks.

Thanks to spatel for the tip!

Closed by commit rL355751: [CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst (authored by xur). Mar 8 2019, 2:48 PM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Mar 8 2019, 2:48 PM

This change caused a compile time regression (that I referenced at the end of my summary in D59696). In this case, there are huge numbers of select instructions. After this change, since we now update the ModifiedDT correctly, the loop over the function in CodeGenPrepare::runOnFunction will break after each select is optimized, due to ModifiedDT being set. As mentioned in D59696, even after making the DT build lazy, there is still a large regression because of the number of times we re-walk the function.
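
As a rough illustration of why restarting the walk is costly, here is a toy model (not the CGP code; countVisits and its parameters are made up for this sketch). It assumes a function that consists only of selects, and it compares a walk that is abandoned and restarted after every rewritten select with a walk that keeps going.

```cpp
#include <cstddef>
#include <vector>

// Counts instruction visits for a toy "function" holding NumSelects selects.
// If RestartOnSelect is true, the walk is abandoned and restarted from the
// top after each select is rewritten, mimicking a break on ModifiedDT.
std::size_t countVisits(std::size_t NumSelects, bool RestartOnSelect) {
  std::vector<bool> IsSelect(NumSelects, true);
  std::size_t Visits = 0;
  bool MadeChange = true;
  while (MadeChange) {
    MadeChange = false;
    for (std::size_t I = 0; I < IsSelect.size(); ++I) {
      ++Visits;
      if (!IsSelect[I])
        continue;          // already rewritten, nothing to do
      IsSelect[I] = false; // "optimize" the select
      MadeChange = true;
      if (RestartOnSelect)
        break;             // abandon this walk and start over
    }
  }
  return Visits;
}
```

In this model, with N = 4 the restarting walk performs 14 visits and the non-restarting walk performs 8. In general the restarting walk grows as N(N+1)/2 + N, while the other stays at 2N. The real pass is more involved (rewriting a select also splits blocks), so this only conveys the growth rate.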

I have a couple possible solutions, and am looking for guidance on what is preferable.

1. Reset the DT instead of exiting the function walk. Now that D59696 is in and the DT is built lazily, instead of setting ModifiedDT to true and breaking out of the function walk in runOnFunction() (which was previously required because the DT was built for each function walk), we can simply reset the DT unique_ptr, which will force a rebuild of the DT when it is next needed:

diff --git a/lib/CodeGen/CodeGenPrepare.cpp b/lib/CodeGen/CodeGenPrepare.cpp
index 9d642ba245c..bd9cc5e4158 100644
--- a/lib/CodeGen/CodeGenPrepare.cpp
+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -5962,7 +5962,7 @@ bool CodeGenPrepare::optimizeSelectInst(SelectInst *SI, bool &ModifiedDT) {
       !isFormingBranchFromSelectProfitable(TTI, TLI, SI))
     return false;

-  ModifiedDT = true;
+  DT.reset();

   // Transform a sequence like this:
   //    start:

In my test case, since apparently there are many more select instructions (which don't use the DT to optimize) than instructions of the type that need the DT, this fixes the regression. It may not always have an effect, say if selects were interspersed with instructions that need the DT to optimize, but it won't make matters worse than the current status quo (which would cause the function iteration to exit and the DT to be reset there). It's possible that other places that set ModifiedDT could get the same treatment; I only tried optimizeSelectInst since that was the place affected by this patch and the one that caused my regression.

2. Optimize select insts before the iterative function walk. Similar to how some of the other instructions are optimized before this iterative walk (e.g. splitBranchCondition()), since select instructions don't need the DT, and presumably don't need iterative optimization, we could do the select optimizations earlier:

diff --git a/lib/CodeGen/CodeGenPrepare.cpp b/lib/CodeGen/CodeGenPrepare.cpp
index 9d642ba245c..b2a0c8a9570 100644
--- a/lib/CodeGen/CodeGenPrepare.cpp
+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -465,6 +465,15 @@ bool CodeGenPrepare::runOnFunction(Function &F) {
   // to help generate sane code for PHIs involving such edges.
   EverMadeChange |= SplitIndirectBrCriticalEdges(F);

+  for (Function::iterator I = F.begin(); I != F.end(); ) {
+    BasicBlock *BB = &*I++;
+    CurInstIterator = BB->begin();
+    while (CurInstIterator != BB->end()) {
+      if (SelectInst *SI = dyn_cast<SelectInst>(&*CurInstIterator++))
+        EverMadeChange |= optimizeSelectInst(SI, ModifiedDT);
+    }
+  }
+
   bool MadeChange = true;
   while (MadeChange) {
     MadeChange = false;

Presumably this would be refactored into a separate function, similar to splitBranchCondition; the above was just a quick hack to test. The advantage of this second approach is that it will improve compile time in any other cases involving select instructions interspersed with instructions that need the DT to optimize.

However - this only eliminates 2/3 of the additional CGP overhead. Since I didn't remove the call to optimizeSelectInst from optimizeInst in this case, we may be finding additional selects to optimize during the iterative function optimization stage.

3. Similar to #2, but do the select optimization once per function walk:

diff --git a/lib/CodeGen/CodeGenPrepare.cpp b/lib/CodeGen/CodeGenPrepare.cpp
index 9d642ba245c..2241d4af206 100644
--- a/lib/CodeGen/CodeGenPrepare.cpp
+++ b/lib/CodeGen/CodeGenPrepare.cpp
@@ -468,6 +468,14 @@ bool CodeGenPrepare::runOnFunction(Function &F) {
   bool MadeChange = true;
   while (MadeChange) {
     MadeChange = false;
+    for (Function::iterator I = F.begin(); I != F.end(); ) {
+      BasicBlock *BB = &*I++;
+      CurInstIterator = BB->begin();
+      while (CurInstIterator != BB->end()) {
+        if (SelectInst *SI = dyn_cast<SelectInst>(&*CurInstIterator++))
+          MadeChange |= optimizeSelectInst(SI, ModifiedDT);
+      }
+    }
     DT.reset();
     for (Function::iterator I = F.begin(); I != F.end(); ) {
       BasicBlock *BB = &*I++;

This removes the remaining overhead.
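
The first option above relies on the dominator tree being rebuilt lazily, so invalidating it is cheap as long as nothing asks for it in between. A minimal sketch of that pattern follows; ToyDomTree and LazyDT are hypothetical stand-ins for illustration, not the classes used in CodeGenPrepare.

```cpp
#include <memory>

// Hypothetical stand-in for an expensive analysis such as a dominator tree.
struct ToyDomTree {};

// Owns the analysis through a unique_ptr and rebuilds it only on demand.
class LazyDT {
  std::unique_ptr<ToyDomTree> DT;
  int Builds = 0; // number of (expensive) rebuilds performed so far

public:
  // Builds the tree only when a consumer actually asks for it.
  ToyDomTree &get() {
    if (!DT) {
      DT = std::make_unique<ToyDomTree>();
      ++Builds;
    }
    return *DT;
  }

  // Called by a transform that changes the CFG: drops the stale tree
  // without rebuilding anything.
  void invalidate() { DT.reset(); }

  int buildCount() const { return Builds; }
};
```

With this shape, a long run of transforms that each call invalidate() costs no rebuilds at all; the tree is rebuilt once, the next time get() is called. That matches the observation that the approach helps most when selects greatly outnumber the instructions that need the DT.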

If #1 is a one-liner that doesn't change any existing tests, that sounds like a good choice.
#2 / #3 also sound like good options. IMO, the fact that we have or need to make that kind of structural change points out that CGP itself has gotten too big. According to its lead comment, this is supposed to be a temporary spot for hacks prohibited by SDAG (i.e., once we switch to global-isel, CGP should go away)...but CGP has become several independent passes in 1 file. The bigger change to correct these kinds of problems would be to split things into multiple passes.

In D59139#1444332, @spatel wrote:

If #1 is a one-liner that doesn't change any existing tests, that sounds like a good choice.

Basically, yes (I'd also remove the ModifiedDT parameter to that routine).

#2 / #3 also sound like good options. IMO, the fact that we have or need to make that kind of structural change points out that CGP itself has gotten too big. According to its lead comment, this is supposed to be a temporary spot for hacks prohibited by SDAG (i.e., once we switch to global-isel, CGP should go away)...but CGP has become several independent passes in 1 file. The bigger change to correct these kinds of problems would be to split things into multiple passes.

Yeah, it seems that the structure has gotten really unwieldy and haphazard (i.e., when the function walk needs to be completely restarted and which parts need to be iterative isn't clear to me).

I will go ahead and send a patch for #1 for now since it is very simple and addresses this regression specifically, by undoing the iteration change provoked by this fix.

tejohnson mentioned this in D59696: [CGP] Build the DominatorTree lazily. Mar 29 2019, 8:55 AM

Diff 189923

lib/CodeGen/CodeGenPrepare.cpp


@@ 284 lines not shown @@ class CodeGenPrepare : public FunctionPass {
   /// Keep track of new GEP base after splitting the GEPs having large offset.
   SmallSet<AssertingVH<Value>, 2> NewGEPBases;

   /// Map serial numbers to Large offset GEPs.
   DenseMap<AssertingVH<GetElementPtrInst>, int> LargeOffsetGEPID;

   /// Keep track of SExt promoted.
   ValueToSExts ValToSExtendedUses;

-  /// True if CFG is modified in any way.
-  bool ModifiedDT;
-
   /// True if optimizing for size.
   bool OptSize;

   /// DataLayout for the Function being processed.
   const DataLayout *DL = nullptr;

 public:
   static char ID; // Pass identification, replacement for typeid

@@ 43 lines not shown @@ private:
   bool optimizeInst(Instruction *I, DominatorTree &DT, bool &ModifiedDT);
   bool optimizeMemoryInst(Instruction *MemoryInst, Value *Addr,
                           Type *AccessTy, unsigned AddrSpace);
   bool optimizeInlineAsmInst(CallInst *CS);
   bool optimizeCallInst(CallInst *CI, bool &ModifiedDT);
   bool optimizeExt(Instruction *&I);
   bool optimizeExtUses(Instruction *I);
   bool optimizeLoadExt(LoadInst *Load);
-  bool optimizeSelectInst(SelectInst *SI);
+  bool optimizeSelectInst(SelectInst *SI, bool &ModifiedDT);
   bool optimizeShuffleVectorInst(ShuffleVectorInst *SVI);
   bool optimizeSwitchInst(SwitchInst *SI);
   bool optimizeExtractElementInst(Instruction *Inst);
-  bool dupRetToEnableTailCallOpts(BasicBlock *BB);
+  bool dupRetToEnableTailCallOpts(BasicBlock *BB, bool &ModifiedDT);
   bool placeDbgValues(Function &F);
   bool canFormExtLd(const SmallVectorImpl<Instruction *> &MovedExts,
                     LoadInst &LI, Instruction &Inst, bool HasPromoted);
   bool tryToPromoteExts(TypePromotionTransaction &TPT,
                         const SmallVectorImpl<Instruction *> &Exts,
                         SmallVectorImpl<Instruction *> &ProfitablyMovedExts,
                         unsigned CreatedInstsCost = 0);
   bool mergeSExts(Function &F, DominatorTree &DT);
   bool splitLargeGEPOffsets();
   bool performAddressTypePromotion(
       Instruction *&Inst,
       bool AllowPromotionWithoutCommonHeader,
       bool HasPromoted, TypePromotionTransaction &TPT,
       SmallVectorImpl<Instruction *> &SpeculativelyMovedExts);
-  bool splitBranchCondition(Function &F);
+  bool splitBranchCondition(Function &F, bool &ModifiedDT);
   bool simplifyOffsetableRelocate(Instruction &I);

   bool tryToSinkFreeOperands(Instruction *I);
 };

 } // end anonymous namespace

 char CodeGenPrepare::ID = 0;
@@ 12 lines not shown @@ bool CodeGenPrepare::runOnFunction(Function &F) {

   DL = &F.getParent()->getDataLayout();

   bool EverMadeChange = false;
   // Clear per function information.
   InsertedInsts.clear();
   PromotedInsts.clear();

-  ModifiedDT = false;
   if (auto *TPC = getAnalysisIfAvailable<TargetPassConfig>()) {
     TM = &TPC->getTM<TargetMachine>();
     SubtargetInfo = TM->getSubtargetImpl(F);
     TLI = SubtargetInfo->getTargetLowering();
     TRI = SubtargetInfo->getRegisterInfo();
   }
   TLInfo = &getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();
   TTI = &getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
@@ 26 lines not shown @@ while (BB != nullptr) {
       BB = Next;
     }
   }

   // Eliminate blocks that contain only PHI nodes and an
   // unconditional branch.
   EverMadeChange |= eliminateMostlyEmptyBlocks(F);

+  bool ModifiedDT = false;
   if (!DisableBranchOpts)
-    EverMadeChange |= splitBranchCondition(F);
+    EverMadeChange |= splitBranchCondition(F, ModifiedDT);

   // Split some critical edges where one of the sources is an indirect branch,
   // to help generate sane code for PHIs involving such edges.
   EverMadeChange |= SplitIndirectBrCriticalEdges(F);

   bool MadeChange = true;
   while (MadeChange) {
     MadeChange = false;
@@ 1,493 lines not shown @@
 /// ret i32 %tmp0
 /// bb1:
 /// %tmp1 = tail call i32 @f1()
 /// ret i32 %tmp1
 /// bb2:
 /// %tmp2 = tail call i32 @f2()
 /// ret i32 %tmp2
 /// @endcode
-bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB) {
+bool CodeGenPrepare::dupRetToEnableTailCallOpts(BasicBlock *BB, bool &ModifiedDT) {
   if (!TLI)
     return false;

   ReturnInst *RetI = dyn_cast<ReturnInst>(BB->getTerminator());
   if (!RetI)
     return false;

   PHINode *PN = nullptr;
@@ 69 lines not shown @@ for (unsigned i = 0, e = TailCalls.size(); i != e; ++i) {
     // the return block.
     BasicBlock *CallBB = CI->getParent();
     BranchInst *BI = dyn_cast<BranchInst>(CallBB->getTerminator());
     if (!BI || !BI->isUnconditional() || BI->getSuccessor(0) != BB)
       continue;

     // Duplicate the return into CallBB.
     (void)FoldReturnIntoUncondBranch(RetI, BB, CallBB);
     ModifiedDT = Changed = true;
     ++NumRetsDup;
   }

   // If we eliminated all predecessors of the block, delete the block now.
   if (Changed && !BB->hasAddressTaken() && pred_begin(BB) == pred_end(BB))
     BB->eraseFromParent();

   return Changed;
@@ 3,804 lines not shown @@ assert(DefSI->getCondition() == SI->getCondition() &&
            "The condition of DefSI does not match with SI");
     V = (isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
   }
   return V;
 }

 /// If we have a SelectInst that will likely profit from branch prediction,
 /// turn it into a branch.
-bool CodeGenPrepare::optimizeSelectInst(SelectInst *SI) {
+bool CodeGenPrepare::optimizeSelectInst(SelectInst *SI, bool &ModifiedDT) {
   // If branch conversion isn't desirable, exit early.
   if (DisableSelectToBranch || OptSize || !TLI)
     return false;

   // Find all consecutive select instructions that share the same condition.
   SmallVector<SelectInst *, 2> ASI;
   ASI.push_back(SI);
   for (BasicBlock::iterator It = ++BasicBlock::iterator(SI);
@@ 1,077 lines not shown @@ bool CodeGenPrepare::optimizeInst(Instruction *I, DominatorTree &DT,

   if (tryToSinkFreeOperands(I))
     return true;

   if (CallInst *CI = dyn_cast<CallInst>(I))
     return optimizeCallInst(CI, ModifiedDT);

   if (SelectInst *SI = dyn_cast<SelectInst>(I))
-    return optimizeSelectInst(SI);
+    return optimizeSelectInst(SI, ModifiedDT);

   if (ShuffleVectorInst *SVI = dyn_cast<ShuffleVectorInst>(I))
     return optimizeShuffleVectorInst(SVI);

   if (auto *Switch = dyn_cast<SwitchInst>(I))
     return optimizeSwitchInst(Switch);

   if (isa<ExtractElementInst>(I))
@@ 41 lines not shown @@ while (TLI && MadeBitReverse) {
     for (auto &I : reverse(BB)) {
       if (makeBitReverse(I, DL, TLI)) {
         MadeBitReverse = MadeChange = true;
         ModifiedDT = true;
         break;
       }
     }
   }
-  MadeChange |= dupRetToEnableTailCallOpts(&BB);
+  MadeChange |= dupRetToEnableTailCallOpts(&BB, ModifiedDT);

   return MadeChange;
 }

 // llvm.dbg.value is far away from the value then iSel may not be able
 // handle it properly. iSel will drop llvm.dbg.value if it can not
 // find a node corresponding to the value.
 bool CodeGenPrepare::placeDbgValues(Function &F) {
@@ 59 lines not shown @@
 /// br i1 %1, label %TrueBB, label %FalseBB
 /// \endcode
 /// This usually allows instruction selection to do even further optimizations
 /// and combine the compare with the branch instruction. Currently this is
 /// applied for targets which have "cheap" jump instructions.
 ///
 /// FIXME: Remove the (equivalent?) implementation in SelectionDAG.
 ///
-bool CodeGenPrepare::splitBranchCondition(Function &F) {
+bool CodeGenPrepare::splitBranchCondition(Function &F, bool &ModifiedDT) {
   if (!TM || !TM->Options.EnableFastISel || !TLI || TLI->isJumpExpensive())
     return false;

   bool MadeChange = false;
   for (auto &BB : F) {
     // Does this BB end with the following?
     // %cond1 = icmp|fcmp|binary instruction ...
     // %cond2 = icmp|fcmp|binary instruction ...
@@ 140 lines not shown @@ if (Opc == Instruction::Or) {
         NewTrueWeight = 2 * TrueWeight;
         NewFalseWeight = FalseWeight;
         scaleWeights(NewTrueWeight, NewFalseWeight);
         Br2->setMetadata(LLVMContext::MD_prof, MDBuilder(Br2->getContext())
                          .createBranchWeights(TrueWeight, FalseWeight));
       }
     }

-    // Note: No point in getting fancy here, since the DT info is never
-    // available to CodeGenPrepare.
     ModifiedDT = true;

     MadeChange = true;

     LLVM_DEBUG(dbgs() << "After branch condition splitting\n"; BB.dump();
                TmpBB->dump());
   }
   return MadeChange;
 }

test/Transforms/CodeGenPrepare/X86/optimizeSelect-DT.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt -S -codegenprepare < %s | FileCheck %s

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i1 @PR41004(i32 %x, i32 %y, i32 %t1) {
; CHECK-LABEL: @PR41004(
; CHECK-NEXT:  entry:
; CHECK-NEXT:    [[T0:%.*]] = icmp eq i32 [[Y:%.*]], 1
; CHECK-NEXT:    br i1 [[T0]], label [[SELECT_TRUE_SINK:%.*]], label [[SELECT_END:%.*]]
; CHECK:       select.true.sink:
; CHECK-NEXT:    [[REM:%.*]] = srem i32 [[X:%.*]], 2
; CHECK-NEXT:    br label [[SELECT_END]]
; CHECK:       select.end:
; CHECK-NEXT:    [[MUL:%.*]] = phi i32 [ [[REM]], [[SELECT_TRUE_SINK]] ], [ 0, [[ENTRY:%.*]] ]
; CHECK-NEXT:    [[TMP0:%.*]] = call { i32, i1 } @llvm.usub.with.overflow.i32(i32 [[T1:%.*]], i32 1)
; CHECK-NEXT:    [[MATH:%.*]] = extractvalue { i32, i1 } [[TMP0]], 0
; CHECK-NEXT:    [[OV:%.*]] = extractvalue { i32, i1 } [[TMP0]], 1
; CHECK-NEXT:    [[ADD:%.*]] = add i32 [[MATH]], [[MUL]]
; CHECK-NEXT:    ret i1 [[OV]]
;
entry:
  %rem = srem i32 %x, 2
  %t0 = icmp eq i32 %y, 1
  %mul = select i1 %t0, i32 %rem, i32 0
  %neg = add i32 %t1, -1
  %add = add i32 %neg, %mul
  br label %if

if:
  %tobool = icmp eq i32 %t1, 0
  ret i1 %tobool
}

This is an archive of the discontinued LLVM Phabricator instance.

[CodeGenPrepare] Fix ModifiedDT flag in optimizeSelectInst
ClosedPublic
