This is an archive of the discontinued LLVM Phabricator instance.


Differential D70852

[InstCombine] Guard maxnum/minnum conversions with a TTI query
Abandoned · Public

Authored by jonpa on Nov 29 2019, 5:52 AM.

Details

Reviewers

cameron.mcinally
spatel
uweigand

Summary

After https://reviews.llvm.org/D62414 "[InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics", the test-suite no longer builds on earlier subtargets. This was commented on here: https://github.com/llvm/llvm-project/commit/ebf9bf2cbc8fa68d536e481e370c4ba40ce61a8a.

This patch attempts to fix that by guarding the transformation with a TTI query, so that it is skipped whenever it would result in a libcall.

I was expecting InstCombine to already use TargetTransformInfo, but found that I had to add the TTI member. Is there a reason it should not be used?

With this patch the test-suite now builds again on z10...

Event Timeline

jonpa created this revision. Nov 29 2019, 5:52 AM

Herald added a project: Restricted Project. Nov 29 2019, 5:52 AM

Herald added a subscriber: hiraditya.

This seems like a lowering problem in SystemZ backend.

In D70852#1763820, @lebedev.ri wrote:

This seems like a lowering problem in SystemZ backend.

So is the backend supposed to lower minnum/maxnum always (to the obvious select sequence, I guess?), even if there's no instruction for it? Why would that be the job of the backend?

In D70852#1763861, @uweigand wrote:

In D70852#1763820, @lebedev.ri wrote:

This seems like a lowering problem in SystemZ backend.

So is the backend supposed to lower minnum/maxnum always (to the obvious select sequence, I guess?), even if there's no instruction for it? Why would that be the job of the backend?

Because it is a target-independent LLVM intrinsic (i.e., not something like @llvm.x86.).
Such intrinsics are considered canonical and are supposed to be handled by all backends,
even if via expansion into some other instructions.

For other examples, I think pretty much no backend has a native funnel shift
(not rotate!) instruction, many don't have saturating math,
and probably most don't have native fixed-point math; I'm not sure about strict FP.
Treating them as legal *in the middle end* only if a particular backend supports
them is somewhat contrary to having a canonical representation.
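
As an illustration of what "expansion into some other instructions" means for a funnel shift, here is a minimal C++ sketch of a 32-bit funnel shift left built from plain shifts and an OR. The function name `fshl32` is made up for this example and is not an LLVM API; the shift amount is taken modulo the bit width, matching the documented semantics of `llvm.fshl`.

```cpp
#include <cstdint>

// Illustrative only: expand a 32-bit funnel shift left into shifts and an OR.
// Conceptually, hi:lo is a 64-bit value shifted left by (amt % 32), and the
// upper 32 bits of the result are returned.
std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo, std::uint32_t amt) {
  amt %= 32;
  if (amt == 0)
    return hi; // avoid an undefined shift by 32 in the expression below
  return (hi << amt) | (lo >> (32 - amt));
}
```

Passing the same value for both operands gives a rotate, which is why a rotate instruction alone does not cover the general case.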

But in those cases like funnel shifts, common code (specifically SelectionDAGBuilder::visitIntrinsicCall) will handle the expansion if the target doesn't have an instruction.

So if we do need to do that for minnum/maxnum, that would probably be the better place.

However, this is still a bit different, because minnum/maxnum can also be the result of an explicit call to fmin/fmax in the source, and it would seem surprising to have that translated into a select sequence as well. If we originally had an fmin/fmax call, which was translated into minnum/maxnum that we then couldn't further optimize, the current expansion (back to the fmin/fmax call) seems correct to me.

It's only the case where we originally had a conditional that was translated into minnum/maxnum that expansion into fmin/fmax is incorrect (because those routines are only available when linking with -lm which the user might not do since there's no actual call to a libm routine in the original source).

The problem really is IMO that there's no way to tell these two cases apart at instruction selection time.
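
To make the two source-level cases concrete, here is a minimal C++ sketch. The function names `select_max` and `libm_max` are illustrative and do not come from the patch. The first is a plain conditional with no libm reference in the source; the second is an explicit `fmax` call. They agree on ordinary inputs but can differ when an operand is NaN, which is why the IR canonicalization is restricted to selects carrying no-NaNs flags.

```cpp
#include <cmath>

// Illustrative: a plain conditional; the source contains no call to libm.
float select_max(float a, float b) { return a >= b ? a : b; }

// Illustrative: an explicit libm call; linking against libm is expected here.
float libm_max(float a, float b) { return std::fmax(a, b); }
```

With `a = 1.0f` and `b = NaN`, the comparison in `select_max` is false, so it returns NaN, while `fmax` returns the non-NaN operand.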

In D70852#1763946, @uweigand wrote:

But in those cases like funnel shifts, common code (specifically SelectionDAGBuilder::visitIntrinsicCall) will handle the expansion if the target doesn't have an instruction.

So if we do need to do that for minnum/maxnum, that would probably be the better place.

However, this is still a bit different, because minnum/maxnum can also be the result of an explicit call to fmin/fmax in the source, and it would seem surprising to have that translated into a select sequence as well. If we originally had an fmin/fmax call, which was translated into minnum/maxnum that we then couldn't further optimize, the current expansion (back to the fmin/fmax call) seems correct to me.

It's only the case where we originally had a conditional that was translated into minnum/maxnum that expansion into fmin/fmax is incorrect (because those routines are only available when linking with -lm which the user might not do since there's no actual call to a libm routine in the original source).

The problem really is IMO that there's no way to tell these two cases apart at instruction selection time.

FWIW, I agree with Ulrich here: either we lower this in the middle end, or we depend on libc/compiler-rt/libgcc/etc. Depending on libm is historically different and would require some additional discussion.

In D70852#1763946, @uweigand wrote:

But in those cases like funnel shifts, common code (specifically SelectionDAGBuilder::visitIntrinsicCall) will handle the expansion if the target doesn't have an instruction.

So if we do need to do that for minnum/maxnum, that would probably be the better place.

However, this is still a bit different, because minnum/maxnum can also be the result of an explicit call to fmin/fmax in the source, and it would seem surprising to have that translated into a select sequence as well. If we originally had an fmin/fmax call, which was translated into minnum/maxnum that we then couldn't further optimize, the current expansion (back to the fmin/fmax call) seems correct to me.

Handling this in SDAG sounds like the right solution to me. We don't want instcombine to rely on TTI because it's supposed to be canonicalizing target-independently.
It's worth noting that the transform that is being done in IR depends on FMF:

// Canonicalize select of FP values where NaN and -0.0 are not valid as
// minnum/maxnum intrinsics.

So I think SDAG can also detect the difference between IEEE-compliant source code that uses libm calls vs. loose/fast source code that got converted to the LLVM intrinsic. I.e., we can expand using fcmp+select vs. libcall based on the FMF.
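
The choice described here can be modeled roughly as follows. This is a hypothetical sketch, not SelectionDAG code: the enum, the function name `chooseExpansion`, and both parameters are invented for illustration, and the decision is keyed on the no-NaNs flag only as an assumption.

```cpp
// Hypothetical model of the lowering decision; none of these names are LLVM APIs.
enum class MaxNumExpansion { NativeInstruction, CompareAndSelect, LibCall };

MaxNumExpansion chooseExpansion(bool targetHasNative, bool hasNoNaNs) {
  if (targetHasNative)
    return MaxNumExpansion::NativeInstruction;
  // Assumption: with no-NaNs, compare+select is a valid expansion and
  // introduces no dependency on libm.
  if (hasNoNaNs)
    return MaxNumExpansion::CompareAndSelect;
  // IEEE-compliant case: fall back to the fmax/fmin libcall.
  return MaxNumExpansion::LibCall;
}
```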

I was worried that our FMF plumbing in DAG was wrong, but I just checked how x86 deals with this, and it seems to be working:

declare float @llvm.maxnum.f32(float, float) #0

define float @maxnum_fast(float %arg1, float %arg2) {
  %r = call fast float @llvm.maxnum.f32(float %arg1, float %arg2)
  ret float %r
}

define float @maxnum_strict(float %arg1, float %arg2) {
  %r = call float @llvm.maxnum.f32(float %arg1, float %arg2)
  ret float %r
}

Debug output for llc shows that we propagated FMF from the calls to the DAG nodes correctly:

t5: f32 = fmaxnum nnan ninf nsz arcp contract afn reassoc t2, t4

vs.

t5: f32 = fmaxnum t2, t4

And x86 then converts the generic nodes to target-specific nodes and instruction selection produces the right asm:

maxss	%xmm1, %xmm0

vs.

movaps	%xmm0, %xmm2
cmpunordss	%xmm0, %xmm2
movaps	%xmm2, %xmm3
andps	%xmm1, %xmm3
maxss	%xmm0, %xmm1
andnps	%xmm1, %xmm2
orps	%xmm3, %xmm2
movaps	%xmm2, %xmm0

x86 is choosing not to use a libcall because inline code should be faster, but if you add 'minsize', you'll get a libcall to 'fmaxf'. So all possibilities should be covered, and SystemZ should be able to copy/share that logic.
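
For readers less familiar with the SSE sequence above, the following C++ sketch models what the longer (non-fast) instruction sequence computes. It is a scalar approximation written for this explanation, not code from LLVM, and the name `maxnum_inline_model` is made up. It relies on `maxss` returning its source operand when the comparison is false or unordered.

```cpp
#include <cmath>

// Scalar model of the inline x86 sequence for the non-fast maxnum case.
// a corresponds to %xmm0 (first argument), b to %xmm1 (second argument).
float maxnum_inline_model(float a, float b) {
  // maxss %xmm0, %xmm1: result is b if b > a, otherwise a
  // (including when either operand is NaN).
  float m = (b > a) ? b : a;
  // cmpunordss + andps/andnps/orps: if a is NaN pick b, otherwise keep m.
  return std::isnan(a) ? b : m;
}
```

If only one operand is NaN, the other operand is returned, which matches the `fmax`-style behavior the intrinsic needs without calling into libm.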

uweigand mentioned this in D70965: [SelectionDAG] Expand nnan FMINNUM/FMAXNUM to select sequence. Dec 3 2019, 8:11 AM

Indeed, it looks like this will work; thanks for the suggestion!

However, I still think that this should be done by default in common code; if targets prefer something else they can always override it, but the default behavior of common code should be to not create dependencies on libm out of thin air ...

I've now implemented this as D70965, and it seems to work for me.

I've now committed D70965, so I think this can be closed.

jonpa abandoned this revision. Dec 5 2019, 5:16 PM

spatel mentioned this in D122610: [SDAG] avoid libcalls to fmin/fmax for soft-float targets. Mar 28 2022, 12:37 PM

spatel mentioned this in rG436b875e49ec: [SDAG] avoid libcalls to fmin/fmax for soft-float targets. Mar 30 2022, 8:22 AM

Revision Contents

Path                                                        Size
llvm/lib/Transforms/InstCombine/InstCombineInternal.h       7 lines
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp       13 lines
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp    15 lines
llvm/test/Transforms/InstCombine/maxnum-02.ll               16 lines
llvm/test/Transforms/InstCombine/minnum-02.ll               16 lines

Diff 231523

llvm/lib/Transforms/InstCombine/InstCombineInternal.h

[... 321 lines elided ...]	private:
AssumptionCache &AC;		AssumptionCache &AC;
TargetLibraryInfo &TLI;		TargetLibraryInfo &TLI;
DominatorTree &DT;		DominatorTree &DT;
const DataLayout &DL;		const DataLayout &DL;
const SimplifyQuery SQ;		const SimplifyQuery SQ;
OptimizationRemarkEmitter &ORE;		OptimizationRemarkEmitter &ORE;
BlockFrequencyInfo *BFI;		BlockFrequencyInfo *BFI;
ProfileSummaryInfo *PSI;		ProfileSummaryInfo *PSI;
		TargetTransformInfo *TTI;

// Optional analyses. When non-null, these can both be used to do better		// Optional analyses. When non-null, these can both be used to do better
// combining and will be updated to reflect any changes.		// combining and will be updated to reflect any changes.
LoopInfo *LI;		LoopInfo *LI;

bool MadeIRChange = false;		bool MadeIRChange = false;

public:		public:
InstCombiner(InstCombineWorklist &Worklist, BuilderTy &Builder,		InstCombiner(InstCombineWorklist &Worklist, BuilderTy &Builder,
bool MinimizeSize, bool ExpensiveCombines, AliasAnalysis *AA,		bool MinimizeSize, bool ExpensiveCombines, AliasAnalysis *AA,
AssumptionCache &AC, TargetLibraryInfo &TLI, DominatorTree &DT,		AssumptionCache &AC, TargetLibraryInfo &TLI, DominatorTree &DT,
OptimizationRemarkEmitter &ORE, BlockFrequencyInfo *BFI,		OptimizationRemarkEmitter &ORE, BlockFrequencyInfo *BFI,
ProfileSummaryInfo *PSI, const DataLayout &DL, LoopInfo *LI)		ProfileSummaryInfo *PSI, TargetTransformInfo *TTI,
		const DataLayout &DL, LoopInfo *LI)
: Worklist(Worklist), Builder(Builder), MinimizeSize(MinimizeSize),		: Worklist(Worklist), Builder(Builder), MinimizeSize(MinimizeSize),
ExpensiveCombines(ExpensiveCombines), AA(AA), AC(AC), TLI(TLI), DT(DT),		ExpensiveCombines(ExpensiveCombines), AA(AA), AC(AC), TLI(TLI), DT(DT),
DL(DL), SQ(DL, &TLI, &DT, &AC), ORE(ORE), BFI(BFI), PSI(PSI), LI(LI) {}		DL(DL), SQ(DL, &TLI, &DT, &AC), ORE(ORE), BFI(BFI), PSI(PSI), TTI(TTI),
		LI(LI) {}

/// Run the combiner over the entire worklist until it is empty.		/// Run the combiner over the entire worklist until it is empty.
///		///
/// \returns true if the IR is changed.		/// \returns true if the IR is changed.
bool run();		bool run();

AssumptionCache &getAssumptionCache() const { return AC; }		AssumptionCache &getAssumptionCache() const { return AC; }

[... 654 lines elided ...]

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

[... 12 lines elided ...]
#include "InstCombineInternal.h"		#include "InstCombineInternal.h"
#include "llvm/ADT/APInt.h"		#include "llvm/ADT/APInt.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/AssumptionCache.h"		#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/CmpInstAnalysis.h"		#include "llvm/Analysis/CmpInstAnalysis.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/BasicBlock.h"		#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InstrTypes.h"		#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"		#include "llvm/IR/Instruction.h"
[... 2,585 lines elided ...]	if (SelectPatternResult::isMinOrMax(SPF)) {
return I;		return I;
}		}
}		}

// Canonicalize select of FP values where NaN and -0.0 are not valid as		// Canonicalize select of FP values where NaN and -0.0 are not valid as
// minnum/maxnum intrinsics.		// minnum/maxnum intrinsics.
if (isa<FPMathOperator>(SI) && SI.hasNoNaNs() && SI.hasNoSignedZeros()) {		if (isa<FPMathOperator>(SI) && SI.hasNoNaNs() && SI.hasNoSignedZeros()) {
Value *X, *Y;		Value *X, *Y;
if (match(&SI, m_OrdFMax(m_Value(X), m_Value(Y))))		auto FMF = cast<FPMathOperator>(SI).getFastMathFlags();

		if (TTI->getIntrinsicInstrCost(Intrinsic::maxnum, SelType,
		{SelType, SelType},
		FMF) < TargetTransformInfo::TCC_Expensive &&
		match(&SI, m_OrdFMax(m_Value(X), m_Value(Y))))
return replaceInstUsesWith(		return replaceInstUsesWith(
SI, Builder.CreateBinaryIntrinsic(Intrinsic::maxnum, X, Y, &SI));		SI, Builder.CreateBinaryIntrinsic(Intrinsic::maxnum, X, Y, &SI));

if (match(&SI, m_OrdFMin(m_Value(X), m_Value(Y))))		if (TTI->getIntrinsicInstrCost(Intrinsic::minnum, SelType,
		{SelType, SelType},
		FMF) < TargetTransformInfo::TCC_Expensive &&
		match(&SI, m_OrdFMin(m_Value(X), m_Value(Y))))
return replaceInstUsesWith(		return replaceInstUsesWith(
SI, Builder.CreateBinaryIntrinsic(Intrinsic::minnum, X, Y, &SI));		SI, Builder.CreateBinaryIntrinsic(Intrinsic::minnum, X, Y, &SI));
}		}

// See if we can fold the select into a phi node if the condition is a select.		// See if we can fold the select into a phi node if the condition is a select.
if (auto *PN = dyn_cast<PHINode>(SI.getCondition()))		if (auto *PN = dyn_cast<PHINode>(SI.getCondition()))
// The true/false values have to be live in the PHI predecessor's blocks.		// The true/false values have to be live in the PHI predecessor's blocks.
if (canSelectOperandBeMappingIntoPredBlock(TrueVal, SI) &&		if (canSelectOperandBeMappingIntoPredBlock(TrueVal, SI) &&
[... 147 lines elided ...]

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

[... 53 lines elided ...]
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/LazyBlockFrequencyInfo.h"		#include "llvm/Analysis/LazyBlockFrequencyInfo.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"		#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/Analysis/ProfileSummaryInfo.h"		#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/Analysis/TargetFolder.h"		#include "llvm/Analysis/TargetFolder.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/BasicBlock.h"		#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CFG.h"		#include "llvm/IR/CFG.h"
#include "llvm/IR/Constant.h"		#include "llvm/IR/Constant.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DIBuilder.h"		#include "llvm/IR/DIBuilder.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
[... 3,482 lines elided ...]	static bool prepareICWorklistFromFunction(Function &F, const DataLayout &DL,

return MadeIRChange;		return MadeIRChange;
}		}

static bool combineInstructionsOverFunction(		static bool combineInstructionsOverFunction(
Function &F, InstCombineWorklist &Worklist, AliasAnalysis *AA,		Function &F, InstCombineWorklist &Worklist, AliasAnalysis *AA,
AssumptionCache &AC, TargetLibraryInfo &TLI, DominatorTree &DT,		AssumptionCache &AC, TargetLibraryInfo &TLI, DominatorTree &DT,
OptimizationRemarkEmitter &ORE, BlockFrequencyInfo *BFI,		OptimizationRemarkEmitter &ORE, BlockFrequencyInfo *BFI,
ProfileSummaryInfo *PSI, bool ExpensiveCombines = true,		ProfileSummaryInfo *PSI, TargetTransformInfo *TTI,
LoopInfo *LI = nullptr) {		bool ExpensiveCombines = true, LoopInfo *LI = nullptr) {
auto &DL = F.getParent()->getDataLayout();		auto &DL = F.getParent()->getDataLayout();
ExpensiveCombines |= EnableExpensiveCombines;		ExpensiveCombines |= EnableExpensiveCombines;

/// Builder - This is an IRBuilder that automatically inserts new		/// Builder - This is an IRBuilder that automatically inserts new
/// instructions into the worklist when they are created.		/// instructions into the worklist when they are created.
IRBuilder<TargetFolder, IRBuilderCallbackInserter> Builder(		IRBuilder<TargetFolder, IRBuilderCallbackInserter> Builder(
F.getContext(), TargetFolder(DL),		F.getContext(), TargetFolder(DL),
IRBuilderCallbackInserter([&Worklist, &AC](Instruction *I) {		IRBuilderCallbackInserter([&Worklist, &AC](Instruction *I) {
[... 13 lines elided ...]	static bool combineInstructionsOverFunction(
while (true) {		while (true) {
++Iteration;		++Iteration;
LLVM_DEBUG(dbgs() << "\n\nINSTCOMBINE ITERATION #" << Iteration << " on "		LLVM_DEBUG(dbgs() << "\n\nINSTCOMBINE ITERATION #" << Iteration << " on "
<< F.getName() << "\n");		<< F.getName() << "\n");

MadeIRChange |= prepareICWorklistFromFunction(F, DL, &TLI, Worklist);		MadeIRChange |= prepareICWorklistFromFunction(F, DL, &TLI, Worklist);

InstCombiner IC(Worklist, Builder, F.hasMinSize(), ExpensiveCombines, AA,		InstCombiner IC(Worklist, Builder, F.hasMinSize(), ExpensiveCombines, AA,
AC, TLI, DT, ORE, BFI, PSI, DL, LI);		AC, TLI, DT, ORE, BFI, PSI, TTI, DL, LI);
IC.MaxArraySizeForCombine = MaxArraySize;		IC.MaxArraySizeForCombine = MaxArraySize;

if (!IC.run())		if (!IC.run())
break;		break;
}		}

return MadeIRChange || Iteration > 1;		return MadeIRChange || Iteration > 1;
}		}
[... 9 lines elided ...]	PreservedAnalyses InstCombinePass::run(Function &F,

auto *AA = &AM.getResult<AAManager>(F);		auto *AA = &AM.getResult<AAManager>(F);
const ModuleAnalysisManager &MAM =		const ModuleAnalysisManager &MAM =
AM.getResult<ModuleAnalysisManagerFunctionProxy>(F).getManager();		AM.getResult<ModuleAnalysisManagerFunctionProxy>(F).getManager();
ProfileSummaryInfo *PSI =		ProfileSummaryInfo *PSI =
MAM.getCachedResult<ProfileSummaryAnalysis>(*F.getParent());		MAM.getCachedResult<ProfileSummaryAnalysis>(*F.getParent());
auto *BFI = (PSI && PSI->hasProfileSummary()) ?		auto *BFI = (PSI && PSI->hasProfileSummary()) ?
&AM.getResult<BlockFrequencyAnalysis>(F) : nullptr;		&AM.getResult<BlockFrequencyAnalysis>(F) : nullptr;
		auto &TTI = AM.getResult<TargetIRAnalysis>(F);

if (!combineInstructionsOverFunction(F, Worklist, AA, AC, TLI, DT, ORE,		if (!combineInstructionsOverFunction(F, Worklist, AA, AC, TLI, DT, ORE,
BFI, PSI, ExpensiveCombines, LI))		BFI, PSI, &TTI, ExpensiveCombines, LI))
// No changes, all analyses are preserved.		// No changes, all analyses are preserved.
return PreservedAnalyses::all();		return PreservedAnalyses::all();

// Mark all the analyses that instcombine updates as preserved.		// Mark all the analyses that instcombine updates as preserved.
PreservedAnalyses PA;		PreservedAnalyses PA;
PA.preserveSet<CFGAnalyses>();		PA.preserveSet<CFGAnalyses>();
PA.preserve<AAManager>();		PA.preserve<AAManager>();
PA.preserve<BasicAA>();		PA.preserve<BasicAA>();
PA.preserve<GlobalsAA>();		PA.preserve<GlobalsAA>();
return PA;		return PA;
}		}

void InstructionCombiningPass::getAnalysisUsage(AnalysisUsage &AU) const {		void InstructionCombiningPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesCFG();		AU.setPreservesCFG();
AU.addRequired<AAResultsWrapperPass>();		AU.addRequired<AAResultsWrapperPass>();
AU.addRequired<AssumptionCacheTracker>();		AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<OptimizationRemarkEmitterWrapperPass>();		AU.addRequired<OptimizationRemarkEmitterWrapperPass>();
AU.addPreserved<DominatorTreeWrapperPass>();		AU.addPreserved<DominatorTreeWrapperPass>();
AU.addPreserved<AAResultsWrapperPass>();		AU.addPreserved<AAResultsWrapperPass>();
AU.addPreserved<BasicAAWrapperPass>();		AU.addPreserved<BasicAAWrapperPass>();
AU.addPreserved<GlobalsAAWrapperPass>();		AU.addPreserved<GlobalsAAWrapperPass>();
AU.addRequired<ProfileSummaryInfoWrapperPass>();		AU.addRequired<ProfileSummaryInfoWrapperPass>();
		AU.addRequired<TargetTransformInfoWrapperPass>();
LazyBlockFrequencyInfoPass::getLazyBFIAnalysisUsage(AU);		LazyBlockFrequencyInfoPass::getLazyBFIAnalysisUsage(AU);
}		}

bool InstructionCombiningPass::runOnFunction(Function &F) {		bool InstructionCombiningPass::runOnFunction(Function &F) {
if (skipFunction(F))		if (skipFunction(F))
return false;		return false;

// Required analyses.		// Required analyses.
auto AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();		auto AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
auto &AC = getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);		auto &AC = getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
auto &TLI = getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);		auto &TLI = getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();		auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
auto &ORE = getAnalysis<OptimizationRemarkEmitterWrapperPass>().getORE();		auto &ORE = getAnalysis<OptimizationRemarkEmitterWrapperPass>().getORE();

// Optional analyses.		// Optional analyses.
auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();		auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();
auto *LI = LIWP ? &LIWP->getLoopInfo() : nullptr;		auto *LI = LIWP ? &LIWP->getLoopInfo() : nullptr;
ProfileSummaryInfo *PSI =		ProfileSummaryInfo *PSI =
&getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();		&getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
BlockFrequencyInfo *BFI =		BlockFrequencyInfo *BFI =
(PSI && PSI->hasProfileSummary()) ?		(PSI && PSI->hasProfileSummary()) ?
&getAnalysis<LazyBlockFrequencyInfoPass>().getBFI() :		&getAnalysis<LazyBlockFrequencyInfoPass>().getBFI() :
nullptr;		nullptr;
		auto &TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);

return combineInstructionsOverFunction(F, Worklist, AA, AC, TLI, DT, ORE,		return combineInstructionsOverFunction(F, Worklist, AA, AC, TLI, DT, ORE,
BFI, PSI, ExpensiveCombines, LI);		BFI, PSI, &TTI, ExpensiveCombines, LI);
}		}

char InstructionCombiningPass::ID = 0;		char InstructionCombiningPass::ID = 0;

InstructionCombiningPass::InstructionCombiningPass(bool ExpensiveCombines)		InstructionCombiningPass::InstructionCombiningPass(bool ExpensiveCombines)
: FunctionPass(ID), ExpensiveCombines(ExpensiveCombines) {		: FunctionPass(ID), ExpensiveCombines(ExpensiveCombines) {
initializeInstructionCombiningPassPass(*PassRegistry::getPassRegistry());		initializeInstructionCombiningPassPass(*PassRegistry::getPassRegistry());
}		}

INITIALIZE_PASS_BEGIN(InstructionCombiningPass, "instcombine",		INITIALIZE_PASS_BEGIN(InstructionCombiningPass, "instcombine",
"Combine redundant instructions", false, false)		"Combine redundant instructions", false, false)
INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)		INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)		INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)		INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)		INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)		INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)
INITIALIZE_PASS_DEPENDENCY(LazyBlockFrequencyInfoPass)		INITIALIZE_PASS_DEPENDENCY(LazyBlockFrequencyInfoPass)
INITIALIZE_PASS_DEPENDENCY(ProfileSummaryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(ProfileSummaryInfoWrapperPass)
		INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
INITIALIZE_PASS_END(InstructionCombiningPass, "instcombine",		INITIALIZE_PASS_END(InstructionCombiningPass, "instcombine",
"Combine redundant instructions", false, false)		"Combine redundant instructions", false, false)

// Initialization Routines		// Initialization Routines
void llvm::initializeInstCombine(PassRegistry &Registry) {		void llvm::initializeInstCombine(PassRegistry &Registry) {
initializeInstructionCombiningPassPass(Registry);		initializeInstructionCombiningPassPass(Registry);
}		}

[... 11 lines elided ...]

llvm/test/Transforms/InstCombine/maxnum-02.ll

This file was added.

				; RUN: opt < %s -instcombine -S -mtriple=systemz-linux-gnu -mcpu=z13 \
				; RUN: | FileCheck %s -check-prefix=CHECK-Z13
				; RUN: opt < %s -instcombine -S -mtriple=systemz-linux-gnu -mcpu=z14 \
				; RUN: | FileCheck %s -check-prefix=CHECK-Z14
				; REQUIRES: systemz-registered-target
				;
				; Check that maxnum/minnum intrinsics are not created without fmax/fmin support.

				define float @f0(float %arg1, float %arg2) {
				; CHECK-Z13-NOT: call fast float @llvm.maxnum.f32
				; CHECK-Z14: call fast float @llvm.maxnum.f32
				bb:
				%tmp5 = fcmp fast oge float %arg1, %arg2
				%arg1.arg2 = select fast i1 %tmp5, float %arg1, float %arg2
				ret float %arg1.arg2
				}

llvm/test/Transforms/InstCombine/minnum-02.ll

This file was added.

				; RUN: opt < %s -instcombine -S -mtriple=systemz-linux-gnu -mcpu=z13 \
				; RUN: | FileCheck %s -check-prefix=CHECK-Z13
				; RUN: opt < %s -instcombine -S -mtriple=systemz-linux-gnu -mcpu=z14 \
				; RUN: | FileCheck %s -check-prefix=CHECK-Z14
				; REQUIRES: systemz-registered-target
				;
				; Check that maxnum/minnum intrinsics are not created without fmax/fmin support.

				define float @f0(float %arg1, float %arg2) {
				; CHECK-Z13-NOT: call fast float @llvm.minnum.f32
				; CHECK-Z14: call fast float @llvm.minnum.f32
				bb:
				%tmp5 = fcmp fast ole float %arg1, %arg2
				%arg1.arg2 = select fast i1 %tmp5, float %arg1, float %arg2
				ret float %arg1.arg2
				}