This is an archive of the discontinued LLVM Phabricator instance.

[CostModel] Replace getUserCost with getInstructionCost.
ClosedPublic

Authored by RKSimon on May 6 2020, 5:27 AM.

Details

Reviewers

craig.topper
spatel
lebedev.ri
samparker
apostolakis
reames

Commits

rGfdec50182d85: [CostModel] Replace getUserCost with getInstructionCost

Summary

Replace getUserCost with getInstructionCost, covering all cost kinds.
Remove getInstructionLatency: no backend implements it, and its functionality should be folded into getUserCost (now getInstructionCost) so that targets can handle all cost kinds with their existing cost callbacks.

Original Patch by @samparker (Sam Parker)
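
As a rough illustration of the summary above, the following standalone C++ sketch models a single cost entry point that switches on the cost kind, rather than routing latency queries through a separate hook. `ToyCostKind`, `ToyOpcode`, `toyInstructionCost` and `toyExample` are hypothetical names, not LLVM APIs; the load latency of 4 only mirrors the placeholder value that appears in this diff.

```cpp
#include <cassert>

// Hypothetical stand-ins for TTI::TargetCostKind and IR opcodes (not LLVM types).
enum class ToyCostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };
enum class ToyOpcode { Add, Load };

// One callback answers every cost kind, so no separate latency hook is needed.
int toyInstructionCost(ToyOpcode Op, ToyCostKind Kind) {
  switch (Kind) {
  case ToyCostKind::Latency:
    // Placeholder load latency taken from this diff; other ops cost 1 here.
    return Op == ToyOpcode::Load ? 4 : 1;
  case ToyCostKind::RecipThroughput:
  case ToyCostKind::CodeSize:
  case ToyCostKind::SizeAndLatency:
    return 1; // "basic" cost for everything in this toy model
  }
  return 1; // not reached; keeps compilers quiet about missing returns
}

// Example use: two different cost kinds queried through the same entry point.
void toyExample() {
  assert(toyInstructionCost(ToyOpcode::Load, ToyCostKind::Latency) == 4);
  assert(toyInstructionCost(ToyOpcode::Load, ToyCostKind::CodeSize) == 1);
}
```

The point of the single switch is that a target which already distinguishes cost kinds in its callbacks would not need a second, latency-only code path.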

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

samparker created this revision.May 6 2020, 5:27 AM

Herald added a project: Restricted Project. May 6 2020, 5:27 AM

Herald added a subscriber: hiraditya.

samparker mentioned this in D78651: [TTI] Devirtualize getInstructionLatency.May 12 2020, 11:50 PM

samparker mentioned this in D79848: [CostModel] Unify getCastInstrCost.May 13 2020, 4:13 AM

I'm beginning to change getInstructionThroughput so that it can just call into getUserCost.

First is cast instructions: D79848.

Next is:

control-flow ops: D79849.

Merge two of the intrinsic cost APIs: D79941

Unify intrinsic costs: D80012

Updated with final patch now that all the other pieces are in.

ping

Is getUserCost a better name than getInstructionCost? I wonder if we're moving in the wrong direction.

FWIW, when someone gets around to wanting to fix the vectorizer's interleaving heuristic, we'll need instruction latencies.

Is getUserCost a better name than getInstructionCost? I wonder if we're moving in the wrong direction.

I'd prefer to use getInstructionCost too.

Renamed getUserCost to getInstructionCost.

Herald added subscribers: asbirlea, zzheng, haicheng and 3 others. Jul 2 2020, 4:05 AM

dfukalov added a subscriber: dfukalov.Jul 10 2020, 1:02 AM

Herald added a subscriber: wuzish. Jul 10 2020, 1:02 AM

@samparker Are you intending to look at this again?

Herald added a subscriber: pengfei. Aug 21 2021, 10:03 AM

@RKSimon I'm not really doing much work on LLVM currently, so no.... But I'd be happy to help out with reviews if someone else wanted to pick it up.

reames resigned from this revision.Nov 30 2021, 9:55 AM

Commandeering from @samparker - this has been on my backlog for ages, but you never know, I might get around to it soon...

cheers!

RKSimon planned changes to this revision.Dec 10 2021, 2:19 AM

RKSimon mentioned this in rG4178e3347076: [CostModel] Update RUN -passes=* to double quotes to appease update scripts on….Aug 10 2022, 9:54 AM

rebase - (WIP) still plenty of cost cleanup todo - it looks like some targets aren't correctly filtering costs by cost kind

Herald added a project: Restricted Project. Aug 10 2022, 10:17 AM

Herald added subscribers: pcwang-thead, snehasish, ormris and 24 others.

Harbormaster completed remote builds in B180454: Diff 451541.Aug 10 2022, 1:26 PM

looks like there's a codegen regression as well that needs addressing

Please shout when you're ready for review.

Harbormaster completed remote builds in B180626: Diff 451787.Aug 11 2022, 3:26 AM

Cleaned up default fp arithmetic / select costs - still investigating the remaining regressions......

RKSimon retitled this revision from [CostModel] Replace getUserCost with getInstructionCost. to [CostModel] Replace getUserCost with getInstructionCost. WIP.Aug 13 2022, 10:36 AM

Harbormaster completed remote builds in B181105: Diff 452438.Aug 13 2022, 11:13 AM

RKSimon added a reviewer: apostolakis.Aug 16 2022, 10:41 AM

RKSimon added a subscriber: apostolakis.

RKSimon added inline comments.

llvm/test/CodeGen/X86/select-optimize.ll
395 ↗	(On Diff #452438)	@apostolakis Please would you take a look at this? We're about to start work on improving the cost numbers for latency/size, and just this initial cleanup is causing this test to fail, which suggests you're relying on some very inaccurate costs.

RKSimon mentioned this in rG1d522a39f7ed: [TTI] Remove getInstructionThroughput cost helper..Aug 17 2022, 3:49 AM

rebase

RKSimon edited the summary of this revision. (Show Details)Aug 17 2022, 4:12 AM

Harbormaster completed remote builds in B181714: Diff 453255.Aug 17 2022, 4:48 AM

RKSimon edited the summary of this revision. (Show Details)Aug 17 2022, 5:48 AM

rebase

apostolakis added inline comments.Aug 17 2022, 7:17 AM

llvm/test/CodeGen/X86/select-optimize.ll
395 ↗	(On Diff #452438)	I will remove this test. It was a bit flaky to begin with, given that it depended on the latency of the instructions. At least it served the purpose of notifying me of changes in the underlying cost modeling. Just to clarify the use in the select-optimize pass: the goal was to approximate TargetSchedModel::computeInstrLatency and compute the length of dependence chains. getInstructionCost seemed the best approximation based on the documentation. getUserCost had an unclear purpose to me, did not account for latency queries (I guess this will change now), and takes the operands as input, which is worrisome since I just wanted the cost of the instructions on their own regardless of their operands (in practice the operands are mostly ignored, so that might not be an issue). The biggest change in cost modeling from switching from getInstructionCost to getUserCost seems to be the function calls, which might be for the better.
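
For readers unfamiliar with the idea, here is a minimal standalone sketch of computing the length of the longest dependence chain from per-instruction latencies. It is not the select-optimize implementation; `ToyInst` and `criticalPathLength` are hypothetical names, and the latency field stands in for whatever a latency cost query would return.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction: a latency plus the indices of the earlier
// instructions (in the same block) whose results it consumes.
struct ToyInst {
  std::uint64_t Latency;
  std::vector<std::size_t> Operands;
};

// Returns the length of the longest latency-weighted dependence chain.
// Instructions must be listed in definition order (operands come first).
std::uint64_t criticalPathLength(const std::vector<ToyInst> &Block) {
  std::vector<std::uint64_t> Depth(Block.size(), 0);
  std::uint64_t Longest = 0;
  for (std::size_t I = 0; I < Block.size(); ++I) {
    // An instruction can start once its slowest operand has finished.
    std::uint64_t Ready = 0;
    for (std::size_t Op : Block[I].Operands) {
      assert(Op < I && "operands must be defined earlier in the block");
      Ready = std::max(Ready, Depth[Op]);
    }
    Depth[I] = Ready + Block[I].Latency;
    Longest = std::max(Longest, Depth[I]);
  }
  return Longest;
}
```

With two independent loads of latency 4 feeding an fp add of latency 3, followed by an integer add of latency 1, the chain length is 4 + 3 + 1 = 8. This is why inaccurate latency costs directly skew the pass's decisions.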

Harbormaster completed remote builds in B181722: Diff 453266.Aug 17 2022, 7:19 AM

RKSimon added inline comments.Aug 17 2022, 7:30 AM

llvm/test/CodeGen/X86/select-optimize.ll
395 ↗	(On Diff #452438)	Thanks - if you're wanting to perform this with IR then getUserCost (-> getInstructionCost) will be your best bet - but it does rely on us getting all the costs to be more accurate, which will take time. I'm intending to fix the bitrot (we changed the way cost-model values are reported) on the script from D103695 which allows us to compare TTI costs vs various scheduler models (via llvm-mca), so eventually most instructions/intrinsics should have values similar to TargetSchedModel::computeInstrLatency. The script just helped check for throughput mismatches, but I did have the other cost kinds in mind as well. I'd recommend that you do take operands into account where possible, as at least on X86 there are some attempts to use them to match likely codegen (e.g. sign/zero-extended integers that can use smaller arithmetic ops).

apostolakis added inline comments.Aug 17 2022, 7:34 AM

llvm/test/CodeGen/X86/select-optimize.ll
395 ↗	(On Diff #452438)	Sounds great! Thanks for improving this cost modeling. It is much needed. For the operands, yes it seems that the purpose is to better predict how they will be lowered which indeed serves to improve cost modeling accuracy.

apostolakis mentioned this in D132029: [SelectOpti] Remove test on loop-level analysis.Aug 17 2022, 7:41 AM

apostolakis added inline comments.Aug 17 2022, 7:44 AM

llvm/test/CodeGen/X86/select-optimize.ll
395 ↗	(On Diff #452438)	Test removed in D132029.

apostolakis mentioned this in rG848e9e454fe9: [SelectOpti] Remove test on loop-level analysis.Aug 17 2022, 9:14 AM

rebase - we can now make Instruction::Select latency cost override to be x86-only.

I think this is ready for general review now.

Harbormaster completed remote builds in B181761: Diff 453330.Aug 17 2022, 11:43 AM

mingmingl added a subscriber: mingmingl.Aug 17 2022, 2:09 PM

This is mostly expected to have minimal impact on the decisions the compiler makes, right? Most of the changes are in latency, which are not used very much in heuristics.

I ran some benchmarks and (after fixing what I got wrong the first time), it looks OK.

llvm/lib/Transforms/Scalar/LICM.cpp
1324–1325	getUserCost -> getInstructionCost

In D79483#3731290, @dmgreen wrote:

This is mostly expected to have minimal impact on the decisions the compiler makes, right? Most of the changes are in latency, which are not used very much in heuristics.

That's correct - the select-optimize pass is the only one that really tries to use it so far, and that's still disabled by default until the costs have been improved.

I ran some benchmarks and (after fixing what I got wrong the first time), it looks OK.

👍

RKSimon edited the summary of this revision. (Show Details)Aug 18 2022, 2:20 AM

RKSimon mentioned this in rGb994f8718409: [Analysis] CostModel.cpp - merge isa<IntrinsicInst> and dyn_cast<IntrinsicInst>….Aug 18 2022, 2:44 AM

RKSimon mentioned this in rGe48892ee4230: [Transforms] LICM.cpp - pull out repeated getUserCost call.

samparker added inline comments.Aug 18 2022, 2:46 AM

llvm/include/llvm/Analysis/TargetTransformInfo.h
303–304	nit: three-argument
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
505	Would it be better, for now, to keep this in the X86 backend, if you need it there?

rebase

RKSimon added inline comments.Aug 18 2022, 2:58 AM

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
505	getInstructionLatency did have this as the default for all fp results. As well as x86, it causes one aarch64 sve fadd cost test to change from latency cost = 3 to 1 - not sure if that's enough of a reason to keep it generic or not?

Make the default fp latency = 3 to be x86 only

LGTM, and thanks for doing this.

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
505	In reply to "getInstructionLatency did have this as the default for all fp results": Okay, fine then.

This revision is now accepted and ready to land.Aug 18 2022, 3:10 AM

RKSimon added inline comments.Aug 18 2022, 3:19 AM

llvm/test/Analysis/CostModel/AArch64/sve-math.ll
15	@samparker Just to be clear - you're happy for me to make the latency=3 default x86-only?

dmgreen added inline comments.Aug 18 2022, 3:31 AM

llvm/test/Analysis/CostModel/AArch64/sve-math.ll
15	I think 3 sounds pretty sensible as a first approximation - it might not be precise but I've commonly seen fp operations in that ballpark. If it was used for all targets before, then I would keep it for all targets now.

Default fp arithmetic latency = 3
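
The default discussed above can be pictured with the following standalone sketch, which mirrors the shape of the fallback in the diff (latency queries on floating-point arithmetic return 3, everything else returns 1). `CostKindSketch`, `ToyType` and `toyDefaultArithmeticCost` are hypothetical names used only for illustration, not LLVM types or functions.

```cpp
#include <cassert>

// Hypothetical stand-in for TTI::TargetCostKind.
enum class CostKindSketch { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// Hypothetical, simplified type description (not llvm::Type). For a vector,
// the flag describes its element type, similar in spirit to getScalarType().
struct ToyType {
  bool ScalarIsFloatingPoint;
};

// Fallback arithmetic cost when no target-specific answer exists.
int toyDefaultArithmeticCost(const ToyType &Ty, CostKindSketch Kind) {
  // Assume a 3-cycle latency for fp arithmetic, as in the diff.
  if (Kind == CostKindSketch::Latency && Ty.ScalarIsFloatingPoint)
    return 3;
  return 1;
}
```

Keeping this in the generic fallback, rather than in one backend, means targets without their own latency numbers keep the value they previously got from getInstructionLatency.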

This revision was landed with ongoing or failed builds.Aug 18 2022, 3:55 AM

Closed by commit rGfdec50182d85: [CostModel] Replace getUserCost with getInstructionCost (authored by RKSimon).

This revision was automatically updated to reflect the committed changes.

RKSimon added a commit: rGfdec50182d85: [CostModel] Replace getUserCost with getInstructionCost.

Harbormaster completed remote builds in B181962: Diff 453613.Aug 18 2022, 4:18 AM

RKSimon mentioned this in D128302: [AArch64][CostModel] Detects that {Extract,Insert}Element at lane 0 have the same cost as other lanes for real instructions that operates on integer types .Aug 24 2022, 5:48 AM

Matt added a subscriber: Matt.Aug 24 2022, 11:32 AM

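Revision Contents

Path	Size
llvm/include/llvm/Analysis/TargetTransformInfo.h	69 lines
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h	49 lines
llvm/include/llvm/CodeGen/BasicTTIImpl.h	7 lines
llvm/lib/Analysis/CodeMetrics.cpp	2 lines
llvm/lib/Analysis/InlineCost.cpp	12 lines
llvm/lib/Analysis/TargetTransformInfo.cpp	13 lines
llvm/lib/CodeGen/CodeGenPrepare.cpp	4 lines
llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp	4 lines
llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h	5 lines
llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp	9 lines
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h	5 lines
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp	12 lines
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp	4 lines
llvm/lib/Target/X86/X86TargetTransformInfo.cpp	5 lines
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp	3 lines
llvm/lib/Transforms/Scalar/…	4 lines
llvm/lib/Transforms/Scalar/…	6 lines
llvm/lib/Transforms/Scalar/…	2 lines
llvm/lib/Transforms/Scalar/…	4 lines
llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp	2 lines
llvm/lib/Transforms/Scalar/SpeculativeExecution.cpp	2 lines
llvm/lib/Transforms/Utils/SimplifyCFG.cpp	9 lines
llvm/test/Analysis/CostModel/AArch64/sve-math.ll	2 lines
llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll	36 lines
llvm/test/Analysis/CostModel/ARM/target-intrinsics.ll	2 lines
llvm/test/Analysis/CostModel/SystemZ/ext-of-icmp-cost.ll	2 lines
llvm/test/Analysis/CostModel/X86/arith-fp-costkinds.ll	237 lines
llvm/test/Analysis/CostModel/X86/costmodel.ll	2 lines
llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll	52 lines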

Diff 453616

llvm/include/llvm/Analysis/TargetTransformInfo.h

Show First 20 Lines • Show All 213 Lines • ▼ Show 20 Lines	public:
/// target. The normalization of each cost model may be target specific.		/// target. The normalization of each cost model may be target specific.
enum TargetCostKind {		enum TargetCostKind {
TCK_RecipThroughput, ///< Reciprocal throughput.		TCK_RecipThroughput, ///< Reciprocal throughput.
TCK_Latency, ///< The latency of instruction.		TCK_Latency, ///< The latency of instruction.
TCK_CodeSize, ///< Instruction code size.		TCK_CodeSize, ///< Instruction code size.
TCK_SizeAndLatency ///< The weighted sum of size and latency.		TCK_SizeAndLatency ///< The weighted sum of size and latency.
};		};

/// Query the cost of a specified instruction.
///
/// Clients should use this interface to query the cost of an existing
/// instruction. The instruction must have a valid parent (basic block).
///
/// Note, this method does not cache the cost calculation and it
/// can be expensive in some cases.
InstructionCost getInstructionCost(const Instruction *I,
enum TargetCostKind kind) const {
InstructionCost Cost;
switch (kind) {
case TCK_Latency:
Cost = getInstructionLatency(I);
break;
case TCK_RecipThroughput:
case TCK_CodeSize:
case TCK_SizeAndLatency:
Cost = getUserCost(I, kind);
break;
}
return Cost;
}

/// Underlying constants for 'cost' values in this interface.		/// Underlying constants for 'cost' values in this interface.
///		///
/// Many APIs in this interface return a cost. This enum defines the		/// Many APIs in this interface return a cost. This enum defines the
/// fundamental values that should be used to interpret (and produce) those		/// fundamental values that should be used to interpret (and produce) those
/// costs. The costs are returned as an int rather than a member of this		/// costs. The costs are returned as an int rather than a member of this
/// enumeration because it is expected that the cost of one IR instruction		/// enumeration because it is expected that the cost of one IR instruction
/// may have a multiplicative factor to it or otherwise won't fit directly		/// may have a multiplicative factor to it or otherwise won't fit directly
/// into the enum. Moreover, it is common to sum or average costs which works		/// into the enum. Moreover, it is common to sum or average costs which works
▲ Show 20 Lines • Show All 62 Lines • ▼ Show 20 Lines	public:
/// \p Operands is a list of operands which can be a result of transformations		/// \p Operands is a list of operands which can be a result of transformations
/// of the current operands. The number of the operands on the list must equal		/// of the current operands. The number of the operands on the list must equal
/// to the number of the current operands the IR user has. Their order on the		/// to the number of the current operands the IR user has. Their order on the
/// list must be the same as the order of the current operands the IR user		/// list must be the same as the order of the current operands the IR user
/// has.		/// has.
///		///
/// The returned cost is defined in terms of \c TargetCostConstants, see its		/// The returned cost is defined in terms of \c TargetCostConstants, see its
/// comments for a detailed explanation of the cost values.		/// comments for a detailed explanation of the cost values.
InstructionCost getUserCost(const User *U, ArrayRef<const Value *> Operands,		InstructionCost getInstructionCost(const User *U,
		ArrayRef<const Value *> Operands,
TargetCostKind CostKind) const;		TargetCostKind CostKind) const;

/// This is a helper function which calls the two-argument getUserCost		/// This is a helper function which calls the three-argument
/// with \p Operands which are the current operands U has.		/// getInstructionCost with \p Operands which are the current operands U has.
InstructionCost getUserCost(const User *U, TargetCostKind CostKind) const {		InstructionCost getInstructionCost(const User *U,
		TargetCostKind CostKind) const {
SmallVector<const Value *, 4> Operands(U->operand_values());		SmallVector<const Value *, 4> Operands(U->operand_values());
return getUserCost(U, Operands, CostKind);		return getInstructionCost(U, Operands, CostKind);
}		}

/// If a branch or a select condition is skewed in one direction by more than		/// If a branch or a select condition is skewed in one direction by more than
/// this factor, it is very likely to be predicted correctly.		/// this factor, it is very likely to be predicted correctly.
BranchProbability getPredictableBranchThreshold() const;		BranchProbability getPredictableBranchThreshold() const;

/// Return true if branch divergence exists.		/// Return true if branch divergence exists.
///		///
▲ Show 20 Lines • Show All 88 Lines • ▼ Show 20 Lines	struct LSRCost {
unsigned ImmCost;		unsigned ImmCost;
unsigned SetupCost;		unsigned SetupCost;
unsigned ScaleCost;		unsigned ScaleCost;
};		};

/// Parameters that control the generic loop unrolling transformation.		/// Parameters that control the generic loop unrolling transformation.
struct UnrollingPreferences {		struct UnrollingPreferences {
/// The cost threshold for the unrolled loop. Should be relative to the		/// The cost threshold for the unrolled loop. Should be relative to the
/// getUserCost values returned by this API, and the expectation is that		/// getInstructionCost values returned by this API, and the expectation is
/// the unrolled loop's instructions when run through that interface should		/// that the unrolled loop's instructions when run through that interface
/// not exceed this cost. However, this is only an estimate. Also, specific		/// should not exceed this cost. However, this is only an estimate. Also,
/// loops may be unrolled even with a cost above this threshold if deemed		/// specific loops may be unrolled even with a cost above this threshold if
/// profitable. Set this to UINT_MAX to disable the loop body cost		/// deemed profitable. Set this to UINT_MAX to disable the loop body cost
/// restriction.		/// restriction.
unsigned Threshold;		unsigned Threshold;
/// If complete unrolling will reduce the cost of the loop, we will boost		/// If complete unrolling will reduce the cost of the loop, we will boost
/// the Threshold by a certain percent to allow more aggressive complete		/// the Threshold by a certain percent to allow more aggressive complete
/// unrolling. This value provides the maximum boost percentage that we		/// unrolling. This value provides the maximum boost percentage that we
/// can apply to Threshold (The value should be no less than 100).		/// can apply to Threshold (The value should be no less than 100).
/// BoostedThreshold = Threshold * min(RolledCost / UnrolledCost,		/// BoostedThreshold = Threshold * min(RolledCost / UnrolledCost,
/// MaxPercentThresholdBoost / 100)		/// MaxPercentThresholdBoost / 100)
▲ Show 20 Lines • Show All 1,066 Lines • ▼ Show 20 Lines
/// \returns How the target needs this vector-predicated operation to be		/// \returns How the target needs this vector-predicated operation to be
/// transformed.		/// transformed.
VPLegalization getVPLegalizationStrategy(const VPIntrinsic &PI) const;		VPLegalization getVPLegalizationStrategy(const VPIntrinsic &PI) const;
/// @}		/// @}

/// @}		/// @}

private:		private:
/// Estimate the latency of specified instruction.
/// Returns 1 as the default value.
InstructionCost getInstructionLatency(const Instruction *I) const;

/// The abstract base class used to type erase specific TTI		/// The abstract base class used to type erase specific TTI
/// implementations.		/// implementations.
class Concept;		class Concept;

/// The template model for the base class which wraps a concrete		/// The template model for the base class which wraps a concrete
/// implementation in a type erased interface.		/// implementation in a type erased interface.
template <typename T> class Model;		template <typename T> class Model;

Show All 10 Lines	public:
virtual unsigned getInliningThresholdMultiplier() = 0;		virtual unsigned getInliningThresholdMultiplier() = 0;
virtual unsigned adjustInliningThreshold(const CallBase *CB) = 0;		virtual unsigned adjustInliningThreshold(const CallBase *CB) = 0;
virtual int getInlinerVectorBonusPercent() = 0;		virtual int getInlinerVectorBonusPercent() = 0;
virtual InstructionCost getMemcpyCost(const Instruction *I) = 0;		virtual InstructionCost getMemcpyCost(const Instruction *I) = 0;
virtual unsigned		virtual unsigned
getEstimatedNumberOfCaseClusters(const SwitchInst &SI, unsigned &JTSize,		getEstimatedNumberOfCaseClusters(const SwitchInst &SI, unsigned &JTSize,
ProfileSummaryInfo *PSI,		ProfileSummaryInfo *PSI,
BlockFrequencyInfo *BFI) = 0;		BlockFrequencyInfo *BFI) = 0;
virtual InstructionCost getUserCost(const User *U,		virtual InstructionCost getInstructionCost(const User *U,
ArrayRef<const Value *> Operands,		ArrayRef<const Value *> Operands,
TargetCostKind CostKind) = 0;		TargetCostKind CostKind) = 0;
virtual BranchProbability getPredictableBranchThreshold() = 0;		virtual BranchProbability getPredictableBranchThreshold() = 0;
virtual bool hasBranchDivergence() = 0;		virtual bool hasBranchDivergence() = 0;
virtual bool useGPUDivergenceAnalysis() = 0;		virtual bool useGPUDivergenceAnalysis() = 0;
virtual bool isSourceOfDivergence(const Value *V) = 0;		virtual bool isSourceOfDivergence(const Value *V) = 0;
virtual bool isAlwaysUniform(const Value *V) = 0;		virtual bool isAlwaysUniform(const Value *V) = 0;
virtual unsigned getFlatAddressSpace() = 0;		virtual unsigned getFlatAddressSpace() = 0;
virtual bool collectFlatAddressOperands(SmallVectorImpl<int> &OpIndexes,		virtual bool collectFlatAddressOperands(SmallVectorImpl<int> &OpIndexes,
Intrinsic::ID IID) const = 0;		Intrinsic::ID IID) const = 0;
▲ Show 20 Lines • Show All 298 Lines • ▼ Show 20 Lines	virtual bool preferPredicatedReductionSelect(unsigned Opcode, Type *Ty,
ReductionFlags) const = 0;		ReductionFlags) const = 0;
virtual bool shouldExpandReduction(const IntrinsicInst *II) const = 0;		virtual bool shouldExpandReduction(const IntrinsicInst *II) const = 0;
virtual unsigned getGISelRematGlobalCost() const = 0;		virtual unsigned getGISelRematGlobalCost() const = 0;
virtual unsigned getMinTripCountTailFoldingThreshold() const = 0;		virtual unsigned getMinTripCountTailFoldingThreshold() const = 0;
virtual bool enableScalableVectorization() const = 0;		virtual bool enableScalableVectorization() const = 0;
virtual bool supportsScalableVectors() const = 0;		virtual bool supportsScalableVectors() const = 0;
virtual bool hasActiveVectorLength(unsigned Opcode, Type *DataType,		virtual bool hasActiveVectorLength(unsigned Opcode, Type *DataType,
Align Alignment) const = 0;		Align Alignment) const = 0;
virtual InstructionCost getInstructionLatency(const Instruction *I) = 0;
virtual VPLegalization		virtual VPLegalization
getVPLegalizationStrategy(const VPIntrinsic &PI) const = 0;		getVPLegalizationStrategy(const VPIntrinsic &PI) const = 0;
};		};

template <typename T>		template <typename T>
class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {		class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
T Impl;		T Impl;

Show All 18 Lines	unsigned adjustInliningThreshold(const CallBase *CB) override {
return Impl.adjustInliningThreshold(CB);		return Impl.adjustInliningThreshold(CB);
}		}
int getInlinerVectorBonusPercent() override {		int getInlinerVectorBonusPercent() override {
return Impl.getInlinerVectorBonusPercent();		return Impl.getInlinerVectorBonusPercent();
}		}
InstructionCost getMemcpyCost(const Instruction *I) override {		InstructionCost getMemcpyCost(const Instruction *I) override {
return Impl.getMemcpyCost(I);		return Impl.getMemcpyCost(I);
}		}
InstructionCost getUserCost(const User *U, ArrayRef<const Value *> Operands,		InstructionCost getInstructionCost(const User *U,
		ArrayRef<const Value *> Operands,
TargetCostKind CostKind) override {		TargetCostKind CostKind) override {
return Impl.getUserCost(U, Operands, CostKind);		return Impl.getInstructionCost(U, Operands, CostKind);
}		}
BranchProbability getPredictableBranchThreshold() override {		BranchProbability getPredictableBranchThreshold() override {
return Impl.getPredictableBranchThreshold();		return Impl.getPredictableBranchThreshold();
}		}
bool hasBranchDivergence() override { return Impl.hasBranchDivergence(); }		bool hasBranchDivergence() override { return Impl.hasBranchDivergence(); }
bool useGPUDivergenceAnalysis() override {		bool useGPUDivergenceAnalysis() override {
return Impl.useGPUDivergenceAnalysis();		return Impl.useGPUDivergenceAnalysis();
}		}
▲ Show 20 Lines • Show All 598 Lines • ▼ Show 20 Lines	bool enableScalableVectorization() const override {
return Impl.enableScalableVectorization();		return Impl.enableScalableVectorization();
}		}

bool hasActiveVectorLength(unsigned Opcode, Type *DataType,		bool hasActiveVectorLength(unsigned Opcode, Type *DataType,
Align Alignment) const override {		Align Alignment) const override {
return Impl.hasActiveVectorLength(Opcode, DataType, Alignment);		return Impl.hasActiveVectorLength(Opcode, DataType, Alignment);
}		}

InstructionCost getInstructionLatency(const Instruction *I) override {
return Impl.getInstructionLatency(I);
}

VPLegalization		VPLegalization
getVPLegalizationStrategy(const VPIntrinsic &PI) const override {		getVPLegalizationStrategy(const VPIntrinsic &PI) const override {
return Impl.getVPLegalizationStrategy(PI);		return Impl.getVPLegalizationStrategy(PI);
}		}
};		};

template <typename T>		template <typename T>
TargetTransformInfo::TargetTransformInfo(T Impl)		TargetTransformInfo::TargetTransformInfo(T Impl)
▲ Show 20 Lines • Show All 98 Lines • Show Last 20 Lines

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

Show First 20 Lines • Show All 494 Lines • ▼ Show 20 Lines	InstructionCost getArithmeticInstrCost(
case Instruction::FRem:		case Instruction::FRem:
case Instruction::SDiv:		case Instruction::SDiv:
case Instruction::SRem:		case Instruction::SRem:
case Instruction::UDiv:		case Instruction::UDiv:
case Instruction::URem:		case Instruction::URem:
// FIXME: Unlikely to be true for CodeSize.		// FIXME: Unlikely to be true for CodeSize.
return TTI::TCC_Expensive;		return TTI::TCC_Expensive;
}		}

		// Assume a 3cy latency for fp arithmetic ops.
		if (CostKind == TTI::TCK_Latency)
		if (Ty->getScalarType()->isFloatingPointTy())
		return 3;

return 1;		return 1;
}		}

InstructionCost getShuffleCost(TTI::ShuffleKind Kind, VectorType *Ty,		InstructionCost getShuffleCost(TTI::ShuffleKind Kind, VectorType *Ty,
ArrayRef<int> Mask, int Index,		ArrayRef<int> Mask, int Index,
VectorType *SubTp,		VectorType *SubTp,
ArrayRef<const Value *> Args = None) const {		ArrayRef<const Value *> Args = None) const {
return 1;		return 1;
▲ Show 20 Lines • Show All 477 Lines • ▼ Show 20 Lines	InstructionCost getGEPCost(Type *PointeeType, const Value *Ptr,
if (static_cast<T *>(this)->isLegalAddressingMode(		if (static_cast<T *>(this)->isLegalAddressingMode(
TargetType, const_cast<GlobalValue *>(BaseGV),		TargetType, const_cast<GlobalValue *>(BaseGV),
BaseOffset.sextOrTrunc(64).getSExtValue(), HasBaseReg, Scale,		BaseOffset.sextOrTrunc(64).getSExtValue(), HasBaseReg, Scale,
Ptr->getType()->getPointerAddressSpace()))		Ptr->getType()->getPointerAddressSpace()))
return TTI::TCC_Free;		return TTI::TCC_Free;
return TTI::TCC_Basic;		return TTI::TCC_Basic;
}		}

InstructionCost getUserCost(const User *U, ArrayRef<const Value *> Operands,		InstructionCost getInstructionCost(const User *U,
		ArrayRef<const Value *> Operands,
TTI::TargetCostKind CostKind) {		TTI::TargetCostKind CostKind) {
using namespace llvm::PatternMatch;		using namespace llvm::PatternMatch;

auto *TargetTTI = static_cast<T *>(this);		auto *TargetTTI = static_cast<T *>(this);
// Handle non-intrinsic calls, invokes, and callbr.		// Handle non-intrinsic calls, invokes, and callbr.
// FIXME: Unlikely to be true for anything but CodeSize.		// FIXME: Unlikely to be true for anything but CodeSize.
auto *CB = dyn_cast<CallBase>(U);		auto *CB = dyn_cast<CallBase>(U);
if (CB && !isa<IntrinsicInst>(U)) {		if (CB && !isa<IntrinsicInst>(U)) {
if (const Function *F = CB->getCalledFunction()) {		if (const Function *F = CB->getCalledFunction()) {
▲ Show 20 Lines • Show All 86 Lines • ▼ Show 20 Lines	InstructionCost getInstructionCost(const User *U,
case Instruction::Store: {		case Instruction::Store: {
auto *SI = cast<StoreInst>(U);		auto *SI = cast<StoreInst>(U);
Type *ValTy = U->getOperand(0)->getType();		Type *ValTy = U->getOperand(0)->getType();
return TargetTTI->getMemoryOpCost(Opcode, ValTy, SI->getAlign(),		return TargetTTI->getMemoryOpCost(Opcode, ValTy, SI->getAlign(),
SI->getPointerAddressSpace(),		SI->getPointerAddressSpace(),
CostKind, I);		CostKind, I);
}		}
case Instruction::Load: {		case Instruction::Load: {
		// FIXME: Arbitary cost which could come from the backend.
		if (CostKind == TTI::TCK_Latency)
		return 4;
auto *LI = cast<LoadInst>(U);		auto *LI = cast<LoadInst>(U);
Type *LoadType = U->getType();		Type *LoadType = U->getType();
// If there is a non-register sized type, the cost estimation may expand		// If there is a non-register sized type, the cost estimation may expand
// it to be several instructions to load into multiple registers on the		// it to be several instructions to load into multiple registers on the
// target. But, if the only use of the load is a trunc instruction to a		// target. But, if the only use of the load is a trunc instruction to a
// register sized type, the instruction selector can combine these		// register sized type, the instruction selector can combine these
// instructions to be a single load. So, in this case, we use the		// instructions to be a single load. So, in this case, we use the
// destination type of the trunc instruction rather than the load to		// destination type of the trunc instruction rather than the load to
▲ Show 20 Lines • Show All 135 Lines • ▼ Show 20 Lines	case Instruction::ExtractElement: {
unsigned Idx = -1;		unsigned Idx = -1;
if (auto *CI = dyn_cast<ConstantInt>(EEI->getOperand(1)))		if (auto *CI = dyn_cast<ConstantInt>(EEI->getOperand(1)))
if (CI->getValue().getActiveBits() <= 32)		if (CI->getValue().getActiveBits() <= 32)
Idx = CI->getZExtValue();		Idx = CI->getZExtValue();
Type *DstTy = U->getOperand(0)->getType();		Type *DstTy = U->getOperand(0)->getType();
return TargetTTI->getVectorInstrCost(*EEI, DstTy, Idx);		return TargetTTI->getVectorInstrCost(*EEI, DstTy, Idx);
}		}
}		}
// By default, just classify everything as 'basic'.
return TTI::TCC_Basic;
}

InstructionCost getInstructionLatency(const Instruction *I) {		// By default, just classify everything as 'basic' or -1 to represent that
SmallVector<const Value *, 4> Operands(I->operand_values());		// don't know the throughput cost.
if (getUserCost(I, Operands, TTI::TCK_Latency) == TTI::TCC_Free)		return CostKind == TTI::TCK_RecipThroughput ? -1 : TTI::TCC_Basic;
return 0;

if (isa<LoadInst>(I))
return 4;

Type *DstTy = I->getType();

// Usually an intrinsic is a simple instruction.
// A real function call is much slower.
if (auto *CI = dyn_cast<CallInst>(I)) {
const Function *F = CI->getCalledFunction();
if (!F || static_cast<T *>(this)->isLoweredToCall(F))
return 40;
// Some intrinsics return a value and a flag, we use the value type
// to decide its latency.
if (StructType *StructTy = dyn_cast<StructType>(DstTy))
DstTy = StructTy->getElementType(0);
// Fall through to simple instructions.
}

if (VectorType *VectorTy = dyn_cast<VectorType>(DstTy))
DstTy = VectorTy->getElementType();
if (DstTy->isFloatingPointTy())
return 3;

return 1;
}		}
};		};
} // namespace llvm		} // namespace llvm

#endif		#endif

llvm/include/llvm/CodeGen/BasicTTIImpl.h

Show First 20 Lines • Show All 632 Lines • ▼ Show 20 Lines	Optional<Value *> simplifyDemandedVectorEltsIntrinsic(
APInt &UndefElts2, APInt &UndefElts3,		APInt &UndefElts2, APInt &UndefElts3,
std::function<void(Instruction *, unsigned, APInt, APInt &)>		std::function<void(Instruction *, unsigned, APInt, APInt &)>
SimplifyAndSetOp) {		SimplifyAndSetOp) {
return BaseT::simplifyDemandedVectorEltsIntrinsic(		return BaseT::simplifyDemandedVectorEltsIntrinsic(
IC, II, DemandedElts, UndefElts, UndefElts2, UndefElts3,		IC, II, DemandedElts, UndefElts, UndefElts2, UndefElts3,
SimplifyAndSetOp);		SimplifyAndSetOp);
}		}

InstructionCost getInstructionLatency(const Instruction *I) {
if (isa<LoadInst>(I))
return getST()->getSchedModel().DefaultLoadLatency;

return BaseT::getInstructionLatency(I);
}

virtual Optional<unsigned>		virtual Optional<unsigned>
getCacheSize(TargetTransformInfo::CacheLevel Level) const {		getCacheSize(TargetTransformInfo::CacheLevel Level) const {
return Optional<unsigned>(		return Optional<unsigned>(
getST()->getCacheSize(static_cast<unsigned>(Level)));		getST()->getCacheSize(static_cast<unsigned>(Level)));
}		}

virtual Optional<unsigned>		virtual Optional<unsigned>
getCacheAssociativity(TargetTransformInfo::CacheLevel Level) const {		getCacheAssociativity(TargetTransformInfo::CacheLevel Level) const {
▲ Show 20 Lines • Show All 1,754 Lines • Show Last 20 Lines

llvm/lib/Analysis/CodeMetrics.cpp

Show First 20 Lines • Show All 171 Lines • ▼ Show 20 Lines	if (const CallInst *CI = dyn_cast<CallInst>(&I)) {
if (CI->isConvergent())		if (CI->isConvergent())
convergent = true;		convergent = true;
}		}

if (const InvokeInst *InvI = dyn_cast<InvokeInst>(&I))		if (const InvokeInst *InvI = dyn_cast<InvokeInst>(&I))
if (InvI->cannotDuplicate())		if (InvI->cannotDuplicate())
notDuplicatable = true;		notDuplicatable = true;

NumInsts += TTI.getUserCost(&I, TargetTransformInfo::TCK_CodeSize);		NumInsts += TTI.getInstructionCost(&I, TargetTransformInfo::TCK_CodeSize);
}		}

if (isa<ReturnInst>(BB->getTerminator()))		if (isa<ReturnInst>(BB->getTerminator()))
++NumRets;		++NumRets;

// We never want to inline functions that contain an indirectbr. This is		// We never want to inline functions that contain an indirectbr. This is
// incorrect because all the blockaddress's (in static global initializers		// incorrect because all the blockaddress's (in static global initializers
// for example) would be referring to the original function, and this indirect		// for example) would be referring to the original function, and this indirect
Show All 14 Lines

llvm/lib/Analysis/InlineCost.cpp

Show First 20 Lines • Show All 1,355 Lines • ▼ Show 20 Lines
bool CallAnalyzer::isGEPFree(GetElementPtrInst &GEP) {		bool CallAnalyzer::isGEPFree(GetElementPtrInst &GEP) {
SmallVector<Value *, 4> Operands;		SmallVector<Value *, 4> Operands;
Operands.push_back(GEP.getOperand(0));		Operands.push_back(GEP.getOperand(0));
for (const Use &Op : GEP.indices())		for (const Use &Op : GEP.indices())
if (Constant *SimpleOp = SimplifiedValues.lookup(Op))		if (Constant *SimpleOp = SimplifiedValues.lookup(Op))
Operands.push_back(SimpleOp);		Operands.push_back(SimpleOp);
else		else
Operands.push_back(Op);		Operands.push_back(Op);
return TTI.getUserCost(&GEP, Operands,		return TTI.getInstructionCost(&GEP, Operands,
TargetTransformInfo::TCK_SizeAndLatency) ==		TargetTransformInfo::TCK_SizeAndLatency) ==
TargetTransformInfo::TCC_Free;		TargetTransformInfo::TCC_Free;
}		}

bool CallAnalyzer::visitAlloca(AllocaInst &I) {		bool CallAnalyzer::visitAlloca(AllocaInst &I) {
disableSROA(I.getOperand(0));		disableSROA(I.getOperand(0));

// Check whether inlining will turn a dynamic alloca into a static		// Check whether inlining will turn a dynamic alloca into a static
// alloca and handle that case.		// alloca and handle that case.
▲ Show 20 Lines • Show All 260 Lines • ▼ Show 20 Lines	bool CallAnalyzer::visitPtrToInt(PtrToIntInst &I) {
// inlining, it will be nuked, and SROA should proceed. All of the uses which		// inlining, it will be nuked, and SROA should proceed. All of the uses which
// would block SROA would also block SROA if applied directly to a pointer,		// would block SROA would also block SROA if applied directly to a pointer,
// and so we can just add the integer in here. The only places where SROA is		// and so we can just add the integer in here. The only places where SROA is
// preserved either cannot fire on an integer, or won't in-and-of themselves		// preserved either cannot fire on an integer, or won't in-and-of themselves
// disable SROA (ext) w/o some later use that we would see and disable.		// disable SROA (ext) w/o some later use that we would see and disable.
if (auto *SROAArg = getSROAArgForValueOrNull(I.getOperand(0)))		if (auto *SROAArg = getSROAArgForValueOrNull(I.getOperand(0)))
SROAArgValues[&I] = SROAArg;		SROAArgValues[&I] = SROAArg;

return TTI.getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==		return TTI.getInstructionCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==
TargetTransformInfo::TCC_Free;		TargetTransformInfo::TCC_Free;
}		}

bool CallAnalyzer::visitIntToPtr(IntToPtrInst &I) {		bool CallAnalyzer::visitIntToPtr(IntToPtrInst &I) {
// Propagate constants through ptrtoint.		// Propagate constants through ptrtoint.
if (simplifyInstruction(I))		if (simplifyInstruction(I))
return true;		return true;

// Track base/offset pairs when round-tripped through a pointer without		// Track base/offset pairs when round-tripped through a pointer without
// modifications provided the integer is not too large.		// modifications provided the integer is not too large.
Value *Op = I.getOperand(0);		Value *Op = I.getOperand(0);
unsigned IntegerSize = Op->getType()->getScalarSizeInBits();		unsigned IntegerSize = Op->getType()->getScalarSizeInBits();
if (IntegerSize <= DL.getPointerTypeSizeInBits(I.getType())) {		if (IntegerSize <= DL.getPointerTypeSizeInBits(I.getType())) {
std::pair<Value *, APInt> BaseAndOffset = ConstantOffsetPtrs.lookup(Op);		std::pair<Value *, APInt> BaseAndOffset = ConstantOffsetPtrs.lookup(Op);
if (BaseAndOffset.first)		if (BaseAndOffset.first)
ConstantOffsetPtrs[&I] = BaseAndOffset;		ConstantOffsetPtrs[&I] = BaseAndOffset;
}		}

// "Propagate" SROA here in the same manner as we do for ptrtoint above.		// "Propagate" SROA here in the same manner as we do for ptrtoint above.
if (auto *SROAArg = getSROAArgForValueOrNull(Op))		if (auto *SROAArg = getSROAArgForValueOrNull(Op))
SROAArgValues[&I] = SROAArg;		SROAArgValues[&I] = SROAArg;

return TTI.getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==		return TTI.getInstructionCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==
TargetTransformInfo::TCC_Free;		TargetTransformInfo::TCC_Free;
}		}

bool CallAnalyzer::visitCastInst(CastInst &I) {		bool CallAnalyzer::visitCastInst(CastInst &I) {
// Propagate constants through casts.		// Propagate constants through casts.
if (simplifyInstruction(I))		if (simplifyInstruction(I))
return true;		return true;

Show All 13 Lines	bool CallAnalyzer::visitCastInst(CastInst &I) {
case Instruction::FPToSI:		case Instruction::FPToSI:
if (TTI.getFPOpCost(I.getType()) == TargetTransformInfo::TCC_Expensive)		if (TTI.getFPOpCost(I.getType()) == TargetTransformInfo::TCC_Expensive)
onCallPenalty();		onCallPenalty();
break;		break;
default:		default:
break;		break;
}		}

return TTI.getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==		return TTI.getInstructionCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==
TargetTransformInfo::TCC_Free;		TargetTransformInfo::TCC_Free;
}		}

bool CallAnalyzer::paramHasAttr(Argument *A, Attribute::AttrKind Attr) {		bool CallAnalyzer::paramHasAttr(Argument *A, Attribute::AttrKind Attr) {
return CandidateCall.paramHasAttr(A->getArgNo(), Attr);		return CandidateCall.paramHasAttr(A->getArgNo(), Attr);
}		}

bool CallAnalyzer::isKnownNonNullInCallee(Value *V) {		bool CallAnalyzer::isKnownNonNullInCallee(Value *V) {
▲ Show 20 Lines • Show All 681 Lines • ▼ Show 20 Lines	bool CallAnalyzer::visitUnreachableInst(UnreachableInst &I) {
// to unreachable as they have the lowest possible impact on both runtime and		// to unreachable as they have the lowest possible impact on both runtime and
// code size.		// code size.
return true; // No actual code is needed for unreachable.		return true; // No actual code is needed for unreachable.
}		}

bool CallAnalyzer::visitInstruction(Instruction &I) {		bool CallAnalyzer::visitInstruction(Instruction &I) {
// Some instructions are free. All of the free intrinsics can also be		// Some instructions are free. All of the free intrinsics can also be
// handled by SROA, etc.		// handled by SROA, etc.
if (TTI.getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==		if (TTI.getInstructionCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==
TargetTransformInfo::TCC_Free)		TargetTransformInfo::TCC_Free)
return true;		return true;

// We found something we don't understand or can't handle. Mark any SROA-able		// We found something we don't understand or can't handle. Mark any SROA-able
// values in the operand list as no longer viable.		// values in the operand list as no longer viable.
for (const Use &Op : I.operands())		for (const Use &Op : I.operands())
disableSROA(Op);		disableSROA(Op);

▲ Show 20 Lines • Show All 754 Lines • Show Last 20 Lines

llvm/lib/Analysis/TargetTransformInfo.cpp

Show First 20 Lines • Show All 215 Lines • ▼ Show 20 Lines

unsigned TargetTransformInfo::getEstimatedNumberOfCaseClusters(		unsigned TargetTransformInfo::getEstimatedNumberOfCaseClusters(
const SwitchInst &SI, unsigned &JTSize, ProfileSummaryInfo *PSI,		const SwitchInst &SI, unsigned &JTSize, ProfileSummaryInfo *PSI,
BlockFrequencyInfo *BFI) const {		BlockFrequencyInfo *BFI) const {
return TTIImpl->getEstimatedNumberOfCaseClusters(SI, JTSize, PSI, BFI);		return TTIImpl->getEstimatedNumberOfCaseClusters(SI, JTSize, PSI, BFI);
}		}

InstructionCost		InstructionCost
TargetTransformInfo::getUserCost(const User *U,		TargetTransformInfo::getInstructionCost(const User *U,
ArrayRef<const Value *> Operands,		ArrayRef<const Value *> Operands,
enum TargetCostKind CostKind) const {		enum TargetCostKind CostKind) const {
InstructionCost Cost = TTIImpl->getUserCost(U, Operands, CostKind);		InstructionCost Cost = TTIImpl->getInstructionCost(U, Operands, CostKind);
assert((CostKind == TTI::TCK_RecipThroughput || Cost >= 0) &&		assert((CostKind == TTI::TCK_RecipThroughput || Cost >= 0) &&
"TTI should not produce negative costs!");		"TTI should not produce negative costs!");
return Cost;		return Cost;
}		}

BranchProbability TargetTransformInfo::getPredictableBranchThreshold() const {		BranchProbability TargetTransformInfo::getPredictableBranchThreshold() const {
return TTIImpl->getPredictableBranchThreshold();		return TTIImpl->getPredictableBranchThreshold();
}		}
▲ Show 20 Lines • Show All 908 Lines • ▼ Show 20 Lines	bool TargetTransformInfo::enableScalableVectorization() const {
return TTIImpl->enableScalableVectorization();		return TTIImpl->enableScalableVectorization();
}		}

bool TargetTransformInfo::hasActiveVectorLength(unsigned Opcode, Type *DataType,		bool TargetTransformInfo::hasActiveVectorLength(unsigned Opcode, Type *DataType,
Align Alignment) const {		Align Alignment) const {
return TTIImpl->hasActiveVectorLength(Opcode, DataType, Alignment);		return TTIImpl->hasActiveVectorLength(Opcode, DataType, Alignment);
}		}

InstructionCost
TargetTransformInfo::getInstructionLatency(const Instruction *I) const {
return TTIImpl->getInstructionLatency(I);
}

TargetTransformInfo::Concept::~Concept() = default;		TargetTransformInfo::Concept::~Concept() = default;

TargetIRAnalysis::TargetIRAnalysis() : TTICallback(&getDefaultTTI) {}		TargetIRAnalysis::TargetIRAnalysis() : TTICallback(&getDefaultTTI) {}

TargetIRAnalysis::TargetIRAnalysis(		TargetIRAnalysis::TargetIRAnalysis(
std::function<Result(const Function &)> TTICallback)		std::function<Result(const Function &)> TTICallback)
: TTICallback(std::move(TTICallback)) {}		: TTICallback(std::move(TTICallback)) {}

▲ Show 20 Lines • Show All 41 Lines • Show Last 20 Lines

llvm/lib/CodeGen/CodeGenPrepare.cpp

	Show First 20 Lines • Show All 6,597 Lines • ▼ Show 20 Lines

	/// Check if V (an operand of a select instruction) is an expensive instruction			/// Check if V (an operand of a select instruction) is an expensive instruction
	/// that is only used once.			/// that is only used once.
	static bool sinkSelectOperand(const TargetTransformInfo *TTI, Value *V) {			static bool sinkSelectOperand(const TargetTransformInfo *TTI, Value *V) {
	auto *I = dyn_cast<Instruction>(V);			auto *I = dyn_cast<Instruction>(V);
	// If it's safe to speculatively execute, then it should not have side			// If it's safe to speculatively execute, then it should not have side
	// effects; therefore, it's safe to sink and possibly not execute.			// effects; therefore, it's safe to sink and possibly not execute.
	return I && I->hasOneUse() && isSafeToSpeculativelyExecute(I) &&			return I && I->hasOneUse() && isSafeToSpeculativelyExecute(I) &&
	TTI->getUserCost(I, TargetTransformInfo::TCK_SizeAndLatency) >=			TTI->getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency) >=
	TargetTransformInfo::TCC_Expensive;			TargetTransformInfo::TCC_Expensive;
	}			}

	/// Returns true if a SelectInst should be turned into an explicit branch.			/// Returns true if a SelectInst should be turned into an explicit branch.
	static bool isFormingBranchFromSelectProfitable(const TargetTransformInfo *TTI,			static bool isFormingBranchFromSelectProfitable(const TargetTransformInfo *TTI,
	const TargetLowering *TLI,			const TargetLowering *TLI,
	SelectInst *SI) {			SelectInst *SI) {
	// If even a predictable select is cheap, then a branch can't be cheaper.			// If even a predictable select is cheap, then a branch can't be cheaper.
	if (!TLI->isPredictableSelectExpensive())			if (!TLI->isPredictableSelectExpensive())
	▲ Show 20 Lines • Show All 1,810 Lines • Show Last 20 Lines

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp

Show First 20 Lines • Show All 2,334 Lines • ▼ Show 20 Lines	for (auto &I : *BB) {
if (const Function *F = cast<CallBase>(I).getCalledFunction()) {		if (const Function *F = cast<CallBase>(I).getCalledFunction()) {
if (!isLoweredToCall(F))		if (!isLoweredToCall(F))
continue;		continue;
}		}
return;		return;
}		}

SmallVector<const Value*, 4> Operands(I.operand_values());		SmallVector<const Value*, 4> Operands(I.operand_values());
Cost +=		Cost += getInstructionCost(&I, Operands,
getUserCost(&I, Operands, TargetTransformInfo::TCK_SizeAndLatency);		TargetTransformInfo::TCK_SizeAndLatency);
}		}
}		}

// On v6m cores, there are very few registers available. We can easily end up		// On v6m cores, there are very few registers available. We can easily end up
// spilling and reloading more registers in an unrolled loop. Look at the		// spilling and reloading more registers in an unrolled loop. Look at the
// number of LCSSA phis as a rough measure of how many registers will need to		// number of LCSSA phis as a rough measure of how many registers will need to
// be live out of the loop, reducing the default unroll count if more than 1		// be live out of the loop, reducing the default unroll count if more than 1
// value is needed. In the long run, all of this should be being learnt by a		// value is needed. In the long run, all of this should be being learnt by a
▲ Show 20 Lines • Show All 79 Lines • Show Last 20 Lines

llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h

Show First 20 Lines • Show All 159 Lines • ▼ Show 20 Lines	InstructionCost getCFInstrCost(unsigned Opcode, TTI::TargetCostKind CostKind,
return 1;		return 1;
}		}

bool isLegalMaskedStore(Type *DataType, Align Alignment);		bool isLegalMaskedStore(Type *DataType, Align Alignment);
bool isLegalMaskedLoad(Type *DataType, Align Alignment);		bool isLegalMaskedLoad(Type *DataType, Align Alignment);

/// @}		/// @}

InstructionCost getUserCost(const User *U, ArrayRef<const Value *> Operands,		InstructionCost getInstructionCost(const User *U,
		ArrayRef<const Value *> Operands,
TTI::TargetCostKind CostKind);		TTI::TargetCostKind CostKind);

// Hexagon specific decision to generate a lookup table.		// Hexagon specific decision to generate a lookup table.
bool shouldBuildLookupTables() const;		bool shouldBuildLookupTables() const;
};		};

} // end namespace llvm		} // end namespace llvm
#endif // LLVM_LIB_TARGET_HEXAGON_HEXAGONTARGETTRANSFORMINFO_H		#endif // LLVM_LIB_TARGET_HEXAGON_HEXAGONTARGETTRANSFORMINFO_H

llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp

	Show First 20 Lines • Show All 334 Lines • ▼ Show 20 Lines
	unsigned HexagonTTIImpl::getPrefetchDistance() const {			unsigned HexagonTTIImpl::getPrefetchDistance() const {
	return ST.getL1PrefetchDistance();			return ST.getL1PrefetchDistance();
	}			}

	unsigned HexagonTTIImpl::getCacheLineSize() const {			unsigned HexagonTTIImpl::getCacheLineSize() const {
	return ST.getL1CacheLineSize();			return ST.getL1CacheLineSize();
	}			}

	InstructionCost HexagonTTIImpl::getUserCost(const User *U,			InstructionCost
				HexagonTTIImpl::getInstructionCost(const User *U,
	ArrayRef<const Value *> Operands,			ArrayRef<const Value *> Operands,
	TTI::TargetCostKind CostKind) {			TTI::TargetCostKind CostKind) {
	auto isCastFoldedIntoLoad = [this](const CastInst *CI) -> bool {			auto isCastFoldedIntoLoad = [this](const CastInst *CI) -> bool {
	if (!CI->isIntegerCast())			if (!CI->isIntegerCast())
	return false;			return false;
	// Only extensions from an integer type shorter than 32-bit to i32			// Only extensions from an integer type shorter than 32-bit to i32
	// can be folded into the load.			// can be folded into the load.
	const DataLayout &DL = getDataLayout();			const DataLayout &DL = getDataLayout();
	unsigned SBW = DL.getTypeSizeInBits(CI->getSrcTy());			unsigned SBW = DL.getTypeSizeInBits(CI->getSrcTy());
	unsigned DBW = DL.getTypeSizeInBits(CI->getDestTy());			unsigned DBW = DL.getTypeSizeInBits(CI->getDestTy());
	if (DBW != 32 || SBW >= DBW)			if (DBW != 32 || SBW >= DBW)
	return false;			return false;

	const LoadInst *LI = dyn_cast<const LoadInst>(CI->getOperand(0));			const LoadInst *LI = dyn_cast<const LoadInst>(CI->getOperand(0));
	// Technically, this code could allow multiple uses of the load, and			// Technically, this code could allow multiple uses of the load, and
	// check if all the uses are the same extension operation, but this			// check if all the uses are the same extension operation, but this
	// should be sufficient for most cases.			// should be sufficient for most cases.
	return LI && LI->hasOneUse();			return LI && LI->hasOneUse();
	};			};

	if (const CastInst *CI = dyn_cast<const CastInst>(U))			if (const CastInst *CI = dyn_cast<const CastInst>(U))
	if (isCastFoldedIntoLoad(CI))			if (isCastFoldedIntoLoad(CI))
	return TargetTransformInfo::TCC_Free;			return TargetTransformInfo::TCC_Free;
	return BaseT::getUserCost(U, Operands, CostKind);			return BaseT::getInstructionCost(U, Operands, CostKind);
	}			}

	bool HexagonTTIImpl::shouldBuildLookupTables() const {			bool HexagonTTIImpl::shouldBuildLookupTables() const {
	return EmitLookupTables;			return EmitLookupTables;
	}			}

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h

Show First 20 Lines • Show All 53 Lines • ▼ Show 20 Lines	public:
InstructionCost getIntImmCostInst(unsigned Opcode, unsigned Idx,		InstructionCost getIntImmCostInst(unsigned Opcode, unsigned Idx,
const APInt &Imm, Type *Ty,		const APInt &Imm, Type *Ty,
TTI::TargetCostKind CostKind,		TTI::TargetCostKind CostKind,
Instruction *Inst = nullptr);		Instruction *Inst = nullptr);
InstructionCost getIntImmCostIntrin(Intrinsic::ID IID, unsigned Idx,		InstructionCost getIntImmCostIntrin(Intrinsic::ID IID, unsigned Idx,
const APInt &Imm, Type *Ty,		const APInt &Imm, Type *Ty,
TTI::TargetCostKind CostKind);		TTI::TargetCostKind CostKind);

InstructionCost getUserCost(const User *U, ArrayRef<const Value *> Operands,		InstructionCost getInstructionCost(const User *U,
		ArrayRef<const Value *> Operands,
TTI::TargetCostKind CostKind);		TTI::TargetCostKind CostKind);

TTI::PopcntSupportKind getPopcntSupport(unsigned TyWidth);		TTI::PopcntSupportKind getPopcntSupport(unsigned TyWidth);
bool isHardwareLoopProfitable(Loop *L, ScalarEvolution &SE,		bool isHardwareLoopProfitable(Loop *L, ScalarEvolution &SE,
AssumptionCache &AC,		AssumptionCache &AC,
TargetLibraryInfo *LibInfo,		TargetLibraryInfo *LibInfo,
HardwareLoopInfo &HWLoopInfo);		HardwareLoopInfo &HWLoopInfo);
bool canSaveCmp(Loop *L, BranchInst **BI, ScalarEvolution *SE, LoopInfo *LI,		bool canSaveCmp(Loop *L, BranchInst **BI, ScalarEvolution *SE, LoopInfo *LI,
DominatorTree *DT, AssumptionCache *AC,		DominatorTree *DT, AssumptionCache *AC,
▲ Show 20 Lines • Show All 86 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp

	Show First 20 Lines • Show All 315 Lines • ▼ Show 20 Lines

	// Check if the current Type is an MMA vector type. Valid MMA types are			// Check if the current Type is an MMA vector type. Valid MMA types are
	// v256i1 and v512i1 respectively.			// v256i1 and v512i1 respectively.
	static bool isMMAType(Type *Ty) {			static bool isMMAType(Type *Ty) {
	return Ty->isVectorTy() && (Ty->getScalarSizeInBits() == 1) &&			return Ty->isVectorTy() && (Ty->getScalarSizeInBits() == 1) &&
	(Ty->getPrimitiveSizeInBits() > 128);			(Ty->getPrimitiveSizeInBits() > 128);
	}			}

	InstructionCost PPCTTIImpl::getUserCost(const User *U,			InstructionCost PPCTTIImpl::getInstructionCost(const User *U,
	ArrayRef<const Value *> Operands,			ArrayRef<const Value *> Operands,
	TTI::TargetCostKind CostKind) {			TTI::TargetCostKind CostKind) {
	// We already implement getCastInstrCost and getMemoryOpCost where we perform			// We already implement getCastInstrCost and getMemoryOpCost where we perform
	// the vector adjustment there.			// the vector adjustment there.
	if (isa<CastInst>(U) || isa<LoadInst>(U) || isa<StoreInst>(U))			if (isa<CastInst>(U) || isa<LoadInst>(U) || isa<StoreInst>(U))
	return BaseT::getUserCost(U, Operands, CostKind);			return BaseT::getInstructionCost(U, Operands, CostKind);

	if (U->getType()->isVectorTy()) {			if (U->getType()->isVectorTy()) {
	// Instructions that need to be split should cost more.			// Instructions that need to be split should cost more.
	std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(U->getType());			std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(U->getType());
	return LT.first * BaseT::getUserCost(U, Operands, CostKind);			return LT.first * BaseT::getInstructionCost(U, Operands, CostKind);
	}			}

	return BaseT::getUserCost(U, Operands, CostKind);			return BaseT::getInstructionCost(U, Operands, CostKind);
	}			}

	// Determining the address of a TLS variable results in a function call in			// Determining the address of a TLS variable results in a function call in
	// certain TLS models.			// certain TLS models.
	static bool memAddrUsesCTR(const Value *MemAddr, const PPCTargetMachine &TM,			static bool memAddrUsesCTR(const Value *MemAddr, const PPCTargetMachine &TM,
	SmallPtrSetImpl<const Value *> &Visited) {			SmallPtrSetImpl<const Value *> &Visited) {
	// No need to traverse again if we already checked this operand.			// No need to traverse again if we already checked this operand.
	if (!Visited.insert(MemAddr).second)			if (!Visited.insert(MemAddr).second)
	▲ Show 20 Lines • Show All 1,115 Lines • Show Last 20 Lines

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

Show First 20 Lines • Show All 486 Lines • ▼ Show 20 Lines	for (auto &I : *BB) {
if (const Function *F = cast<CallBase>(I).getCalledFunction()) {		if (const Function *F = cast<CallBase>(I).getCalledFunction()) {
if (!isLoweredToCall(F))		if (!isLoweredToCall(F))
continue;		continue;
}		}
return;		return;
}		}

SmallVector<const Value *> Operands(I.operand_values());		SmallVector<const Value *> Operands(I.operand_values());
Cost +=		Cost += getInstructionCost(&I, Operands,
getUserCost(&I, Operands, TargetTransformInfo::TCK_SizeAndLatency);		TargetTransformInfo::TCK_SizeAndLatency);
}		}
}		}

LLVM_DEBUG(dbgs() << "Cost of loop: " << Cost << "\n");		LLVM_DEBUG(dbgs() << "Cost of loop: " << Cost << "\n");

UP.Partial = true;		UP.Partial = true;
UP.Runtime = true;		UP.Runtime = true;
UP.UnrollRemainder = true;		UP.UnrollRemainder = true;
Show All 26 Lines

llvm/lib/Target/X86/X86TargetTransformInfo.cpp

Show First 20 Lines • Show All 2,615 Lines • ▼ Show 20 Lines	return AdjustCost(
BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I));		BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I));
}		}

InstructionCost X86TTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,		InstructionCost X86TTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
Type *CondTy,		Type *CondTy,
CmpInst::Predicate VecPred,		CmpInst::Predicate VecPred,
TTI::TargetCostKind CostKind,		TTI::TargetCostKind CostKind,
const Instruction *I) {		const Instruction *I) {
		// Assume a 3cy latency for fp select ops.
		if (CostKind == TTI::TCK_Latency && Opcode == Instruction::Select)
		if (ValTy->getScalarType()->isFloatingPointTy())
		return 3;

// TODO: Handle other cost kinds.		// TODO: Handle other cost kinds.
if (CostKind != TTI::TCK_RecipThroughput)		if (CostKind != TTI::TCK_RecipThroughput)
return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind,		return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind,
I);		I);

// Legalize the type.		// Legalize the type.
std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(ValTy);		std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(ValTy);

▲ Show 20 Lines • Show All 3,387 Lines • Show Last 20 Lines

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp

Show First 20 Lines • Show All 567 Lines • ▼ Show 20 Lines	InstructionCost getUserBonus(User *U, llvm::TargetTransformInfo &TTI,
LoopInfo &LI) {		LoopInfo &LI) {
auto *I = dyn_cast_or_null<Instruction>(U);		auto *I = dyn_cast_or_null<Instruction>(U);
// If not an instruction we do not know how to evaluate.		// If not an instruction we do not know how to evaluate.
// Keep minimum possible cost for now so that it doesnt affect		// Keep minimum possible cost for now so that it doesnt affect
// specialization.		// specialization.
if (!I)		if (!I)
return std::numeric_limits<unsigned>::min();		return std::numeric_limits<unsigned>::min();

auto Cost = TTI.getUserCost(U, TargetTransformInfo::TCK_SizeAndLatency);		InstructionCost Cost =
		TTI.getInstructionCost(U, TargetTransformInfo::TCK_SizeAndLatency);

// Traverse recursively if there are more uses.		// Traverse recursively if there are more uses.
// TODO: Any other instructions to be added here?		// TODO: Any other instructions to be added here?
if (I->mayReadFromMemory() || I->isCast())		if (I->mayReadFromMemory() || I->isCast())
for (auto *User : I->users())		for (auto *User : I->users())
Cost += getUserBonus(User, TTI, LI);		Cost += getUserBonus(User, TTI, LI);

// Increase the cost if it is inside the loop.		// Increase the cost if it is inside the loop.
▲ Show 20 Lines • Show All 365 Lines • Show Last 20 Lines

llvm/lib/Transforms/Scalar/JumpThreading.cpp

Show First 20 Lines • Show All 555 Lines • ▼ Show 20 Lines	if (I->getType()->isTokenTy() && I->isUsedOutsideOfBlock(BB))
return ~0U;		return ~0U;

// Blocks with NoDuplicate are modelled as having infinite cost, so they		// Blocks with NoDuplicate are modelled as having infinite cost, so they
// are never duplicated.		// are never duplicated.
if (const CallInst *CI = dyn_cast<CallInst>(I))		if (const CallInst *CI = dyn_cast<CallInst>(I))
if (CI->cannotDuplicate() || CI->isConvergent())		if (CI->cannotDuplicate() || CI->isConvergent())
return ~0U;		return ~0U;

if (TTI->getUserCost(&*I, TargetTransformInfo::TCK_SizeAndLatency)		if (TTI->getInstructionCost(&*I, TargetTransformInfo::TCK_SizeAndLatency) ==
== TargetTransformInfo::TCC_Free)		TargetTransformInfo::TCC_Free)
continue;		continue;

// All other instructions count for at least one unit.		// All other instructions count for at least one unit.
++Size;		++Size;

// Calls are more expensive. If they are non-intrinsic calls, we model them		// Calls are more expensive. If they are non-intrinsic calls, we model them
// as having cost of 4. If they are a non-vector intrinsic, we model them		// as having cost of 4. If they are a non-vector intrinsic, we model them
// as having cost of 2 total, and if they are a vector intrinsic, we model		// as having cost of 2 total, and if they are a vector intrinsic, we model
▲ Show 20 Lines • Show All 2,485 Lines • Show Last 20 Lines

llvm/lib/Transforms/Scalar/LICM.cpp

Show First 20 Lines • Show All 1,311 Lines • ▼ Show 20 Lines	static bool isTriviallyReplaceablePHI(const PHINode &PN, const Instruction &I) {

return true;		return true;
}		}

/// Return true if the instruction is free in the loop.		/// Return true if the instruction is free in the loop.
static bool isFreeInLoop(const Instruction &I, const Loop *CurLoop,		static bool isFreeInLoop(const Instruction &I, const Loop *CurLoop,
const TargetTransformInfo *TTI) {		const TargetTransformInfo *TTI) {
InstructionCost CostI =		InstructionCost CostI =
TTI->getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency);		TTI->getInstructionCost(&I, TargetTransformInfo::TCK_SizeAndLatency);

if (auto *GEP = dyn_cast<GetElementPtrInst>(&I)) {		if (auto *GEP = dyn_cast<GetElementPtrInst>(&I)) {
if (CostI != TargetTransformInfo::TCC_Free)		if (CostI != TargetTransformInfo::TCC_Free)
return false;		return false;
// For a GEP, we cannot simply use getUserCost because currently it		// For a GEP, we cannot simply use getInstructionCost because currently
		dmgreen (inline comment): getUserCost -> getInstructionCost
// optimistically assumes that a GEP will fold into addressing mode		// it optimistically assumes that a GEP will fold into addressing mode
// regardless of its users.		// regardless of its users.
const BasicBlock *BB = GEP->getParent();		const BasicBlock *BB = GEP->getParent();
for (const User *U : GEP->users()) {		for (const User *U : GEP->users()) {
const Instruction *UI = cast<Instruction>(U);		const Instruction *UI = cast<Instruction>(U);
if (CurLoop->contains(UI) &&		if (CurLoop->contains(UI) &&
(BB != UI->getParent() ||		(BB != UI->getParent() ||
(!isa<StoreInst>(UI) && !isa<LoadInst>(UI))))		(!isa<StoreInst>(UI) && !isa<LoadInst>(UI))))
return false;		return false;
▲ Show 20 Lines • Show All 985 Lines • Show Last 20 Lines

llvm/lib/Transforms/Scalar/LoopFlatten.cpp

Show First 20 Lines • Show All 546 Lines • ▼ Show 20 Lines	for (auto &I : *B) {
Br->getSuccessor(0) == FI.InnerLoop->getHeader())		Br->getSuccessor(0) == FI.InnerLoop->getHeader())
continue;		continue;
// Multiplies of the outer iteration variable and inner iteration		// Multiplies of the outer iteration variable and inner iteration
// count will be optimised out.		// count will be optimised out.
if (match(&I, m_c_Mul(m_Specific(FI.OuterInductionPHI),		if (match(&I, m_c_Mul(m_Specific(FI.OuterInductionPHI),
m_Specific(FI.InnerTripCount))))		m_Specific(FI.InnerTripCount))))
continue;		continue;
InstructionCost Cost =		InstructionCost Cost =
TTI->getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency);		TTI->getInstructionCost(&I, TargetTransformInfo::TCK_SizeAndLatency);
LLVM_DEBUG(dbgs() << "Cost " << Cost << ": "; I.dump());		LLVM_DEBUG(dbgs() << "Cost " << Cost << ": "; I.dump());
RepeatedInstrCost += Cost;		RepeatedInstrCost += Cost;
}		}
}		}

LLVM_DEBUG(dbgs() << "Cost of instructions that will be repeated: "		LLVM_DEBUG(dbgs() << "Cost of instructions that will be repeated: "
<< RepeatedInstrCost << "\n");		<< RepeatedInstrCost << "\n");
// Bail out if flattening the loops would cause instructions in the outer		// Bail out if flattening the loops would cause instructions in the outer

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp

Show First 20 Lines • Show All 437 Lines • ▼ Show 20 Lines	for (;; --Iteration) {
PhiI->getIncomingValueForBlock(L->getLoopLatch())))		PhiI->getIncomingValueForBlock(L->getLoopLatch())))
if (L->contains(OpI))		if (L->contains(OpI))
PHIUsedList.push_back(OpI);		PHIUsedList.push_back(OpI);
continue;		continue;
}		}

// First accumulate the cost of this instruction.		// First accumulate the cost of this instruction.
if (!Cost.IsFree) {		if (!Cost.IsFree) {
UnrolledCost += TTI.getUserCost(I, CostKind);		UnrolledCost += TTI.getInstructionCost(I, CostKind);
LLVM_DEBUG(dbgs() << "Adding cost of instruction (iteration "		LLVM_DEBUG(dbgs() << "Adding cost of instruction (iteration "
<< Iteration << "): ");		<< Iteration << "): ");
LLVM_DEBUG(I->dump());		LLVM_DEBUG(I->dump());
}		}

// We must count the cost of every operand which is not free,		// We must count the cost of every operand which is not free,
// recursively. If we reach a loop PHI node, simply add it to the set		// recursively. If we reach a loop PHI node, simply add it to the set
// to be considered on the next iteration (backwards!).		// to be considered on the next iteration (backwards!).
▲ Show 20 Lines • Show All 77 Lines • ▼ Show 20 Lines	for (unsigned Idx = 0; Idx != BBWorklist.size(); ++Idx) {
for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
// These won't get into the final code - don't even try calculating the		// These won't get into the final code - don't even try calculating the
// cost for them.		// cost for them.
if (isa<DbgInfoIntrinsic>(I) || EphValues.count(&I))		if (isa<DbgInfoIntrinsic>(I) || EphValues.count(&I))
continue;		continue;

// Track this instruction's expected baseline cost when executing the		// Track this instruction's expected baseline cost when executing the
// rolled loop form.		// rolled loop form.
RolledDynamicCost += TTI.getUserCost(&I, CostKind);		RolledDynamicCost += TTI.getInstructionCost(&I, CostKind);

// Visit the instruction to analyze its loop cost after unrolling,		// Visit the instruction to analyze its loop cost after unrolling,
// and if the visitor returns true, mark the instruction as free after		// and if the visitor returns true, mark the instruction as free after
// unrolling and continue.		// unrolling and continue.
bool IsFree = Analyzer.visit(I);		bool IsFree = Analyzer.visit(I);
bool Inserted = InstCostMap.insert({&I, (int)Iteration,		bool Inserted = InstCostMap.insert({&I, (int)Iteration,
(unsigned)IsFree,		(unsigned)IsFree,
/*IsCounted*/ false}).second;		/*IsCounted*/ false}).second;

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp

Show First 20 Lines • Show All 2,872 Lines • ▼ Show 20 Lines	for (auto &I : *BB) {
continue;		continue;

if (I.getType()->isTokenTy() && I.isUsedOutsideOfBlock(BB))		if (I.getType()->isTokenTy() && I.isUsedOutsideOfBlock(BB))
return false;		return false;
if (auto *CB = dyn_cast<CallBase>(&I))		if (auto *CB = dyn_cast<CallBase>(&I))
if (CB->isConvergent() || CB->cannotDuplicate())		if (CB->isConvergent() || CB->cannotDuplicate())
return false;		return false;

Cost += TTI.getUserCost(&I, CostKind);		Cost += TTI.getInstructionCost(&I, CostKind);
}		}
assert(Cost >= 0 && "Must not have negative costs!");		assert(Cost >= 0 && "Must not have negative costs!");
LoopCost += Cost;		LoopCost += Cost;
assert(LoopCost >= 0 && "Must not have negative loop costs!");		assert(LoopCost >= 0 && "Must not have negative loop costs!");
BBCostMap[BB] = Cost;		BBCostMap[BB] = Cost;
}		}
LLVM_DEBUG(dbgs() << " Total loop cost: " << LoopCost << "\n");		LLVM_DEBUG(dbgs() << " Total loop cost: " << LoopCost << "\n");


llvm/lib/Transforms/Scalar/SpeculativeExecution.cpp

Show First 20 Lines • Show All 246 Lines • ▼ Show 20 Lines	switch (Operator::getOpcode(I)) {
case Instruction::FCmp:		case Instruction::FCmp:
case Instruction::Trunc:		case Instruction::Trunc:
case Instruction::Freeze:		case Instruction::Freeze:
case Instruction::ExtractElement:		case Instruction::ExtractElement:
case Instruction::InsertElement:		case Instruction::InsertElement:
case Instruction::ShuffleVector:		case Instruction::ShuffleVector:
case Instruction::ExtractValue:		case Instruction::ExtractValue:
case Instruction::InsertValue:		case Instruction::InsertValue:
return TTI.getUserCost(I, TargetTransformInfo::TCK_SizeAndLatency);		return TTI.getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency);

default:		default:
return InstructionCost::getInvalid(); // Disallow anything not explicitly		return InstructionCost::getInvalid(); // Disallow anything not explicitly
// listed.		// listed.
}		}
}		}

bool SpeculativeExecutionPass::considerHoistingFromTo(		bool SpeculativeExecutionPass::considerHoistingFromTo(

llvm/lib/Transforms/Utils/SimplifyCFG.cpp

/// which is assumed to be safe to speculate. TCC_Free means cheap,		/// which is assumed to be safe to speculate. TCC_Free means cheap,
/// TCC_Basic means less cheap, and TCC_Expensive means prohibitively		/// TCC_Basic means less cheap, and TCC_Expensive means prohibitively
/// expensive.		/// expensive.
static InstructionCost computeSpeculationCost(const User *I,		static InstructionCost computeSpeculationCost(const User *I,
const TargetTransformInfo &TTI) {		const TargetTransformInfo &TTI) {
assert((!isa<Instruction>(I) ||		assert((!isa<Instruction>(I) ||
isSafeToSpeculativelyExecute(cast<Instruction>(I))) &&		isSafeToSpeculativelyExecute(cast<Instruction>(I))) &&
"Instruction is not safe to speculatively execute!");		"Instruction is not safe to speculatively execute!");
return TTI.getUserCost(I, TargetTransformInfo::TCK_SizeAndLatency);		return TTI.getInstructionCost(I, TargetTransformInfo::TCK_SizeAndLatency);
}		}

/// If we have a merge point of an "if condition" as accepted above,		/// If we have a merge point of an "if condition" as accepted above,
/// return true if the specified value dominates the block. We		/// return true if the specified value dominates the block. We
/// don't handle the true generality of domination here, just a special case		/// don't handle the true generality of domination here, just a special case
/// which works well enough for us.		/// which works well enough for us.
///		///
/// If AggressiveInsts is non-null, and if V does not dominate BB, we check to		/// If AggressiveInsts is non-null, and if V does not dominate BB, we check to
▲ Show 20 Lines • Show All 3,228 Lines • ▼ Show 20 Lines	if (isa<DbgInfoIntrinsic>(I) || isa<BranchInst>(I))
continue;		continue;
// I must be safe to execute unconditionally.		// I must be safe to execute unconditionally.
if (!isSafeToSpeculativelyExecute(&I))		if (!isSafeToSpeculativelyExecute(&I))
return false;		return false;
SawVectorOp |= isVectorOp(I);		SawVectorOp |= isVectorOp(I);

// Account for the cost of duplicating this instruction into each		// Account for the cost of duplicating this instruction into each
// predecessor. Ignore free instructions.		// predecessor. Ignore free instructions.
if (!TTI ||		if (!TTI || TTI->getInstructionCost(&I, CostKind) !=
TTI->getUserCost(&I, CostKind) != TargetTransformInfo::TCC_Free) {		TargetTransformInfo::TCC_Free) {
NumBonusInsts += PredCount;		NumBonusInsts += PredCount;

// Early exits once we reach the limit.		// Early exits once we reach the limit.
if (NumBonusInsts >		if (NumBonusInsts >
BonusInstThreshold * BranchFoldToCommonDestVectorMultiplier)		BonusInstThreshold * BranchFoldToCommonDestVectorMultiplier)
return false;		return false;
}		}

▲ Show 20 Lines • Show All 155 Lines • ▼ Show 20 Lines	for (auto &I : BB->instructionsWithoutDebug(false)) {
if (auto *S = dyn_cast<StoreInst>(&I))		if (auto *S = dyn_cast<StoreInst>(&I))
if (llvm::find(FreeStores, S))		if (llvm::find(FreeStores, S))
continue;		continue;
// Else, we have a white-list of instructions that we are ak speculating.		// Else, we have a white-list of instructions that we are ak speculating.
if (!isa<BinaryOperator>(I) && !isa<GetElementPtrInst>(I))		if (!isa<BinaryOperator>(I) && !isa<GetElementPtrInst>(I))
return false; // Not in white-list - not worthwhile folding.		return false; // Not in white-list - not worthwhile folding.
// And finally, if this is a non-free instruction that we are okay		// And finally, if this is a non-free instruction that we are okay
// speculating, ensure that we consider the speculation budget.		// speculating, ensure that we consider the speculation budget.
Cost += TTI.getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency);		Cost +=
		TTI.getInstructionCost(&I, TargetTransformInfo::TCK_SizeAndLatency);
if (Cost > Budget)		if (Cost > Budget)
return false; // Eagerly refuse to fold as soon as we're out of budget.		return false; // Eagerly refuse to fold as soon as we're out of budget.
}		}
assert(Cost <= Budget &&		assert(Cost <= Budget &&
"When we run out of budget we will eagerly return from within the "		"When we run out of budget we will eagerly return from within the "
"per-instruction loop.");		"per-instruction loop.");
return true;		return true;
};		};
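
The budget check in the hunk above follows a common pattern: accumulate a per-instruction cost and refuse as soon as the running total exceeds the budget. The following is a minimal, self-contained sketch of that pattern only. `FakeInst` and `fitsSpeculationBudget` are hypothetical names for illustration and are not LLVM APIs; the sketch also omits the free-store skip that the real lambda performs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an instruction; not an LLVM type.
struct FakeInst {
  bool IsAllowed; // models the BinaryOperator / GetElementPtrInst check
  int Cost;       // models the TCK_SizeAndLatency cost of the instruction
};

// Returns true only if every instruction is allowed and the running cost
// never exceeds Budget. Bails out eagerly, like the lambda in the diff.
bool fitsSpeculationBudget(const std::vector<FakeInst> &Insts, int Budget) {
  int Cost = 0;
  for (const FakeInst &I : Insts) {
    if (!I.IsAllowed)
      return false; // not an instruction kind we are willing to speculate
    Cost += I.Cost;
    if (Cost > Budget)
      return false; // out of budget, stop immediately
  }
  return true;
}
```

Because the function returns inside the loop as soon as the budget is exceeded, the total can never be above the budget once the loop completes, which is what the trailing assert in the diff relies on.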

llvm/test/Analysis/CostModel/AArch64/sve-math.ll

	; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
	; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=throughput < %s | FileCheck %s --check-prefix=THRU			; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=throughput < %s | FileCheck %s --check-prefix=THRU
	; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency < %s | FileCheck %s --check-prefix=LATE			; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=latency < %s | FileCheck %s --check-prefix=LATE
	; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size < %s | FileCheck %s --check-prefix=SIZE			; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=code-size < %s | FileCheck %s --check-prefix=SIZE
	; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency < %s | FileCheck %s --check-prefix=SIZE_LATE			; RUN: opt -mtriple=aarch64-- -mattr=+sve -passes="print<cost-model>" 2>&1 -disable-output -cost-kind=size-latency < %s | FileCheck %s --check-prefix=SIZE_LATE

	declare <vscale x 2 x double> @llvm.sqrt.v2f64(<vscale x 2 x double>)			declare <vscale x 2 x double> @llvm.sqrt.v2f64(<vscale x 2 x double>)

	define <vscale x 2 x double> @fadd_v2f64(<vscale x 2 x double> %a, <vscale x 2 x double> %b) {			define <vscale x 2 x double> @fadd_v2f64(<vscale x 2 x double> %a, <vscale x 2 x double> %b) {
	; THRU-LABEL: 'fadd_v2f64'			; THRU-LABEL: 'fadd_v2f64'
	; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = fadd <vscale x 2 x double> %a, %b			; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = fadd <vscale x 2 x double> %a, %b
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <vscale x 2 x double> %r			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <vscale x 2 x double> %r
	;			;
	; LATE-LABEL: 'fadd_v2f64'			; LATE-LABEL: 'fadd_v2f64'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = fadd <vscale x 2 x double> %a, %b			; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = fadd <vscale x 2 x double> %a, %b
				RKSimon (inline comment): @samparker Just to be clear - you're happy for me to make the latency=3 default x86-only?
				dmgreen (inline comment): I think 3 sounds pretty sensible as a first approximation - it might not be precise, but I've commonly seen fp operations in that ballpark. If it was used for all targets before, then I would keep it for all targets now.
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r
	;			;
	; SIZE-LABEL: 'fadd_v2f64'			; SIZE-LABEL: 'fadd_v2f64'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r = fadd <vscale x 2 x double> %a, %b			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r = fadd <vscale x 2 x double> %a, %b
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r
	;			;
	; SIZE_LATE-LABEL: 'fadd_v2f64'			; SIZE_LATE-LABEL: 'fadd_v2f64'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r = fadd <vscale x 2 x double> %a, %b			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %r = fadd <vscale x 2 x double> %a, %b
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r
	;			;
	%r = fadd <vscale x 2 x double> %a, %b			%r = fadd <vscale x 2 x double> %a, %b
	ret <vscale x 2 x double> %r			ret <vscale x 2 x double> %r
	}			}

	define <vscale x 2 x double> @sqrt_v2f64(<vscale x 2 x double> %a) {			define <vscale x 2 x double> @sqrt_v2f64(<vscale x 2 x double> %a) {
	; THRU-LABEL: 'sqrt_v2f64'			; THRU-LABEL: 'sqrt_v2f64'
	; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)			; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <vscale x 2 x double> %r			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret <vscale x 2 x double> %r
	;			;
	; LATE-LABEL: 'sqrt_v2f64'			; LATE-LABEL: 'sqrt_v2f64'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)			; LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r
	;			;
	; SIZE-LABEL: 'sqrt_v2f64'			; SIZE-LABEL: 'sqrt_v2f64'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)			; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r
	;			;
	; SIZE_LATE-LABEL: 'sqrt_v2f64'			; SIZE_LATE-LABEL: 'sqrt_v2f64'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %r = call <vscale x 2 x double> @llvm.sqrt.nxv2f64(<vscale x 2 x double> %a)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret <vscale x 2 x double> %r
	;			;
	%r = call <vscale x 2 x double> @llvm.sqrt.v2f64(<vscale x 2 x double> %a)			%r = call <vscale x 2 x double> @llvm.sqrt.v2f64(<vscale x 2 x double> %a)
	ret <vscale x 2 x double> %r			ret <vscale x 2 x double> %r
	}			}
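
The four RUN lines in this test query the same instruction under four cost kinds. As a rough illustration of the idea behind the patch (one cost query parameterised by cost kind, instead of a separate latency query), here is a toy model. `CostKind` and `faddCost` are hypothetical names, not the TargetTransformInfo interface; the returned values are simply the ones the THRU, LATE, SIZE and SIZE_LATE check lines above report for the scalable `fadd`.

```cpp
#include <cassert>

// Hypothetical mirror of the four kinds selected by -cost-kind in the RUN
// lines; this is not the real TargetTransformInfo enum.
enum class CostKind { Throughput, Latency, CodeSize, SizeAndLatency };

// Toy single entry point: one query, with the kind as a parameter.
// Values are copied from the check lines for 'fadd_v2f64' in this test.
int faddCost(CostKind Kind) {
  switch (Kind) {
  case CostKind::Throughput:
    return 2;
  case CostKind::Latency:
    return 3; // the default FP latency discussed in the inline comments
  case CostKind::CodeSize:
    return 1;
  case CostKind::SizeAndLatency:
    return 1;
  }
  return -1; // unreachable for valid enumerators
}
```

A caller picks the kind that matches its heuristic; the passes in this diff mostly pass TCK_SizeAndLatency or a CostKind variable.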

llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll


	define void @smax(i32 %a, i32 %b, <16 x i32> %va, <16 x i32> %vb) {			define void @smax(i32 %a, i32 %b, <16 x i32> %va, <16 x i32> %vb) {
	; THRU-LABEL: 'smax'			; THRU-LABEL: 'smax'
	; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'smax'			; LATE-LABEL: 'smax'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'smax'			; SIZE-LABEL: 'smax'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'smax'			; SIZE_LATE-LABEL: 'smax'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			%s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	%v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			%v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	ret void			ret void
	}			}

	define void @fmuladd(float %a, float %b, float %c, <16 x float> %va, <16 x float> %vb, <16 x float> %vc) {			define void @fmuladd(float %a, float %b, float %c, <16 x float> %va, <16 x float> %vb, <16 x float> %vc) {
	; THRU-LABEL: 'fmuladd'			; THRU-LABEL: 'fmuladd'
	; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'fmuladd'			; LATE-LABEL: 'fmuladd'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'fmuladd'			; SIZE-LABEL: 'fmuladd'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'fmuladd'			; SIZE_LATE-LABEL: 'fmuladd'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			%s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	%v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			%v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	ret void			ret void
	}			}

	define void @log2(float %a, <16 x float> %va) {			define void @log2(float %a, <16 x float> %va) {
	; THRU-LABEL: 'log2'			; THRU-LABEL: 'log2'
	; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)			; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)
	; THRU-NEXT: Cost Model: Found an estimated cost of 192 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; THRU-NEXT: Cost Model: Found an estimated cost of 192 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'log2'			; LATE-LABEL: 'log2'
	; LATE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %s = call float @llvm.log2.f32(float %a)			; LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)
	; LATE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; LATE-NEXT: Cost Model: Found an estimated cost of 192 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'log2'			; SIZE-LABEL: 'log2'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.log2.f32(float %a)			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.log2.f32(float %a)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; SIZE-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'log2'			; SIZE_LATE-LABEL: 'log2'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 192 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 192 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.log2.f32(float %a)			%s = call float @llvm.log2.f32(float %a)
	%v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			%v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	ret void			ret void
	}			}

	define void @constrained_fadd(float %a, <16 x float> %va) {			define void @constrained_fadd(float %a, <16 x float> %va) {
	; THRU-LABEL: 'constrained_fadd'			; THRU-LABEL: 'constrained_fadd'
	; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; THRU-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; THRU-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'constrained_fadd'			; LATE-LABEL: 'constrained_fadd'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; LATE-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'constrained_fadd'			; SIZE-LABEL: 'constrained_fadd'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'constrained_fadd'			; SIZE_LATE-LABEL: 'constrained_fadd'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 48 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			%s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	%t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			%t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	ret void			ret void
	}			}

	define void @fmaximum(float %a, float %b, <16 x float> %va, <16 x float> %vb) {			define void @fmaximum(float %a, float %b, <16 x float> %va, <16 x float> %vb) {
	; THRU-LABEL: 'fmaximum'			; THRU-LABEL: 'fmaximum'
	; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)			; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)
	; THRU-NEXT: Cost Model: Found an estimated cost of 208 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			; THRU-NEXT: Cost Model: Found an estimated cost of 208 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'fmaximum'			; LATE-LABEL: 'fmaximum'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)			; LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			; LATE-NEXT: Cost Model: Found an estimated cost of 208 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'fmaximum'			; SIZE-LABEL: 'fmaximum'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			; SIZE-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'fmaximum'			; SIZE_LATE-LABEL: 'fmaximum'
	define void @cttz(i32 %a, <16 x i32> %va) {			define void @cttz(i32 %a, <16 x i32> %va) {
	; THRU-LABEL: 'cttz'			; THRU-LABEL: 'cttz'
	; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'cttz'			; LATE-LABEL: 'cttz'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			; LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'cttz'			; SIZE-LABEL: 'cttz'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'cttz'			; SIZE_LATE-LABEL: 'cttz'
	define void @ctlz(i32 %a, <16 x i32> %va) {			define void @ctlz(i32 %a, <16 x i32> %va) {
	; THRU-LABEL: 'ctlz'			; THRU-LABEL: 'ctlz'
	; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'ctlz'			; LATE-LABEL: 'ctlz'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'ctlz'			; SIZE-LABEL: 'ctlz'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'ctlz'			; SIZE_LATE-LABEL: 'ctlz'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			%s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	%v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			%v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	ret void			ret void
	}			}

	define void @fshl(i32 %a, i32 %b, i32 %c, <16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc) {			define void @fshl(i32 %a, i32 %b, i32 %c, <16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc) {
	; THRU-LABEL: 'fshl'			; THRU-LABEL: 'fshl'
	; THRU-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; THRU-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; THRU-NEXT: Cost Model: Found an estimated cost of 256 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 256 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'fshl'			; LATE-LABEL: 'fshl'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 250 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'fshl'			; SIZE-LABEL: 'fshl'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 229 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 229 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'fshl'			; SIZE_LATE-LABEL: 'fshl'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 250 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 250 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			%s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	%v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			%v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	ret void			ret void
	}			}

	define void @maskedgather(<16 x float*> %va, <16 x i1> %vb, <16 x float> %vc) {			define void @maskedgather(<16 x float*> %va, <16 x i1> %vb, <16 x float> %vc) {
	; THRU-LABEL: 'maskedgather'			; THRU-LABEL: 'maskedgather'
	; THRU-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'maskedgather'			; LATE-LABEL: 'maskedgather'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'maskedgather'			; SIZE-LABEL: 'maskedgather'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'maskedgather'			; SIZE_LATE-LABEL: 'maskedgather'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			%v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	ret void			ret void
	}			}

	define void @maskedscatter(<16 x float> %va, <16 x float*> %vb, <16 x i1> %vc) {			define void @maskedscatter(<16 x float> %va, <16 x float*> %vb, <16 x i1> %vc) {
	; THRU-LABEL: 'maskedscatter'			; THRU-LABEL: 'maskedscatter'
	; THRU-NEXT: Cost Model: Found an estimated cost of 96 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 96 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'maskedscatter'			; LATE-LABEL: 'maskedscatter'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'maskedscatter'			; SIZE-LABEL: 'maskedscatter'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'maskedscatter'			; SIZE_LATE-LABEL: 'maskedscatter'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 96 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	ret void			ret void
	}			}

	define void @reduce_fmax(<16 x float> %va) {			define void @reduce_fmax(<16 x float> %va) {
	; THRU-LABEL: 'reduce_fmax'			; THRU-LABEL: 'reduce_fmax'
	; THRU-NEXT: Cost Model: Found an estimated cost of 133 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; THRU-NEXT: Cost Model: Found an estimated cost of 133 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'reduce_fmax'			; LATE-LABEL: 'reduce_fmax'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; LATE-NEXT: Cost Model: Found an estimated cost of 131 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'reduce_fmax'			; SIZE-LABEL: 'reduce_fmax'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 122 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; SIZE-NEXT: Cost Model: Found an estimated cost of 122 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'reduce_fmax'			; SIZE_LATE-LABEL: 'reduce_fmax'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 131 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 131 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			%v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	ret void			ret void
	}			}

	define void @memcpy(i8* %a, i8* %b, i32 %c) {			define void @memcpy(i8* %a, i8* %b, i32 %c) {
	; THRU-LABEL: 'memcpy'			; THRU-LABEL: 'memcpy'
	; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'memcpy'			; LATE-LABEL: 'memcpy'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'memcpy'			; SIZE-LABEL: 'memcpy'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'memcpy'			; SIZE_LATE-LABEL: 'memcpy'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	ret void			ret void
	}			}

llvm/test/Analysis/CostModel/ARM/target-intrinsics.ll

	; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t1 = call i32 @llvm.arm.ssat(i32 undef, i32 undef)			; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t1 = call i32 @llvm.arm.ssat(i32 undef, i32 undef)
	; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t2 = tail call { <8 x half>, <8 x half> } @llvm.arm.mve.vld2q.v8f16.p0f16(half* undef)			; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t2 = tail call { <8 x half>, <8 x half> } @llvm.arm.mve.vld2q.v8f16.p0f16(half* undef)
	; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t3 = call { i32, i32 } @llvm.arm.mve.sqrshrl(i32 undef, i32 undef, i32 undef, i32 48)			; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t3 = call { i32, i32 } @llvm.arm.mve.sqrshrl(i32 undef, i32 undef, i32 undef, i32 48)
	; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t4 = tail call { i32, i32 } @llvm.arm.mve.vmlldava.v8i16(i32 0, i32 0, i32 0, i32 0, i32 0, <8 x i16> undef, <8 x i16> undef)			; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t4 = tail call { i32, i32 } @llvm.arm.mve.vmlldava.v8i16(i32 0, i32 0, i32 0, i32 0, i32 0, <8 x i16> undef, <8 x i16> undef)
	; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; CHECK-THUMB2-RECIP-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; CHECK-THUMB2-LAT-LABEL: 'intrinsics'			; CHECK-THUMB2-LAT-LABEL: 'intrinsics'
	; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t1 = call i32 @llvm.arm.ssat(i32 undef, i32 undef)			; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t1 = call i32 @llvm.arm.ssat(i32 undef, i32 undef)
	; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %t2 = tail call { <8 x half>, <8 x half> } @llvm.arm.mve.vld2q.v8f16.p0f16(half* undef)			; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t2 = tail call { <8 x half>, <8 x half> } @llvm.arm.mve.vld2q.v8f16.p0f16(half* undef)
	; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t3 = call { i32, i32 } @llvm.arm.mve.sqrshrl(i32 undef, i32 undef, i32 undef, i32 48)			; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t3 = call { i32, i32 } @llvm.arm.mve.sqrshrl(i32 undef, i32 undef, i32 undef, i32 48)
	; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t4 = tail call { i32, i32 } @llvm.arm.mve.vmlldava.v8i16(i32 0, i32 0, i32 0, i32 0, i32 0, <8 x i16> undef, <8 x i16> undef)			; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t4 = tail call { i32, i32 } @llvm.arm.mve.vmlldava.v8i16(i32 0, i32 0, i32 0, i32 0, i32 0, <8 x i16> undef, <8 x i16> undef)
	; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; CHECK-THUMB2-LAT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; CHECK-THUMB2-SIZE-LABEL: 'intrinsics'			; CHECK-THUMB2-SIZE-LABEL: 'intrinsics'
	; CHECK-THUMB2-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t1 = call i32 @llvm.arm.ssat(i32 undef, i32 undef)			; CHECK-THUMB2-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t1 = call i32 @llvm.arm.ssat(i32 undef, i32 undef)
	; CHECK-THUMB2-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t2 = tail call { <8 x half>, <8 x half> } @llvm.arm.mve.vld2q.v8f16.p0f16(half* undef)			; CHECK-THUMB2-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t2 = tail call { <8 x half>, <8 x half> } @llvm.arm.mve.vld2q.v8f16.p0f16(half* undef)
	; CHECK-THUMB2-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t3 = call { i32, i32 } @llvm.arm.mve.sqrshrl(i32 undef, i32 undef, i32 undef, i32 48)			; CHECK-THUMB2-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %t3 = call { i32, i32 } @llvm.arm.mve.sqrshrl(i32 undef, i32 undef, i32 undef, i32 48)

llvm/test/Analysis/CostModel/SystemZ/ext-of-icmp-cost.ll

	; RUN: opt < %s -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output \			; RUN: opt < %s -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output \
	; RUN: -mtriple=s390x-unknown-linux -mcpu=z13 | FileCheck %s			; RUN: -mtriple=s390x-unknown-linux -mcpu=z13 | FileCheck %s
	;			;
	; Check that getUserCost() does not return TCC_Free for extensions of			; Check that getInstructionCost() does not return TCC_Free for extensions of
	; i1 returned from icmp.			; i1 returned from icmp.

	define i64 @fun1(i64 %v) {			define i64 @fun1(i64 %v) {
	; CHECK-LABEL: 'fun1'			; CHECK-LABEL: 'fun1'
	; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %cmp = icmp eq i64 %v, 0			; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %cmp = icmp eq i64 %v, 0
	; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %z = zext i1 %cmp to i64			; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %z = zext i1 %cmp to i64
	; CHECK: Cost Model: Found an estimated cost of 1 for instruction: ret i64 %z			; CHECK: Cost Model: Found an estimated cost of 1 for instruction: ret i64 %z
	%cmp = icmp eq i64 %v, 0			%cmp = icmp eq i64 %v, 0

llvm/test/Analysis/CostModel/X86/arith-fp-costkinds.ll

; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+sse2 -cost-kind=latency < %s | FileCheck %s --check-prefixes=LATE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+sse2 -cost-kind=latency < %s | FileCheck %s --check-prefixes=CHECK,LATE,SSE-LATE
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+sse2 -cost-kind=code-size < %s | FileCheck %s --check-prefixes=SIZE,SSE-SIZE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+sse2 -cost-kind=code-size < %s | FileCheck %s --check-prefixes=CHECK,SIZE,SSE-SIZE
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+sse2 -cost-kind=size-latency < %s | FileCheck %s --check-prefixes=SIZE_LATE,SSE-SIZE_LATE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+sse2 -cost-kind=size-latency < %s | FileCheck %s --check-prefixes=CHECK,SIZE_LATE,SSE-SIZE_LATE
;		;
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx -cost-kind=latency < %s | FileCheck %s --check-prefixes=LATE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx -cost-kind=latency < %s | FileCheck %s --check-prefixes=CHECK,LATE,AVX-LATE,AVX1-LATE
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx -cost-kind=code-size < %s | FileCheck %s --check-prefixes=SIZE,AVX-SIZE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx -cost-kind=code-size < %s | FileCheck %s --check-prefixes=CHECK,SIZE,AVX-SIZE
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx -cost-kind=size-latency < %s | FileCheck %s --check-prefixes=SIZE_LATE,AVX-SIZE_LATE,AVX1-SIZE_LATE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx -cost-kind=size-latency < %s | FileCheck %s --check-prefixes=CHECK,SIZE_LATE,AVX-SIZE_LATE,AVX1-SIZE_LATE
;		;
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx2 -cost-kind=latency < %s | FileCheck %s --check-prefixes=LATE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx2 -cost-kind=latency < %s | FileCheck %s --check-prefixes=CHECK,LATE,AVX-LATE,AVX2-LATE
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx2 -cost-kind=code-size < %s | FileCheck %s --check-prefixes=SIZE,AVX-SIZE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx2 -cost-kind=code-size < %s | FileCheck %s --check-prefixes=CHECK,SIZE,AVX-SIZE
; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx2 -cost-kind=size-latency < %s | FileCheck %s --check-prefixes=SIZE_LATE,AVX-SIZE_LATE,AVX2-SIZE_LATE		; RUN: opt -mtriple=x86_64-- -passes="print<cost-model>" 2>&1 -disable-output -mattr=+avx2 -cost-kind=size-latency < %s | FileCheck %s --check-prefixes=CHECK,SIZE_LATE,AVX-SIZE_LATE,AVX2-SIZE_LATE

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"		target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.8.0"		target triple = "x86_64-apple-macosx10.8.0"

define i32 @fadd(i32 %arg) {		define i32 @fadd(i32 %arg) {
; LATE-LABEL: 'fadd'		; LATE-LABEL: 'fadd'
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = fadd float undef, undef		; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = fadd float undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = fadd <4 x float> undef, undef		; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = fadd <4 x float> undef, undef
[... 224 lines not shown ...]	;
%V2F64 = fmul <2 x double> undef, undef		%V2F64 = fmul <2 x double> undef, undef
%V4F64 = fmul <4 x double> undef, undef		%V4F64 = fmul <4 x double> undef, undef
%V8F64 = fmul <8 x double> undef, undef		%V8F64 = fmul <8 x double> undef, undef

ret i32 undef		ret i32 undef
}		}

define i32 @fdiv(i32 %arg) {		define i32 @fdiv(i32 %arg) {
; LATE-LABEL: 'fdiv'		; CHECK-LABEL: 'fdiv'
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = fdiv float undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F32 = fdiv float undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = fdiv <4 x float> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F32 = fdiv <4 x float> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = fdiv <8 x float> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = fdiv <8 x float> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16F32 = fdiv <16 x float> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = fdiv <16 x float> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F64 = fdiv double undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F64 = fdiv double undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = fdiv <2 x double> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = fdiv <2 x double> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = fdiv <4 x double> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = fdiv <4 x double> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F64 = fdiv <8 x double> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = fdiv <8 x double> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;
; SIZE-LABEL: 'fdiv'
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F32 = fdiv float undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F32 = fdiv <4 x float> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = fdiv <8 x float> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = fdiv <16 x float> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F64 = fdiv double undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = fdiv <2 x double> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = fdiv <4 x double> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = fdiv <8 x double> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;
; SIZE_LATE-LABEL: 'fdiv'
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F32 = fdiv float undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F32 = fdiv <4 x float> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = fdiv <8 x float> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = fdiv <16 x float> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F64 = fdiv double undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = fdiv <2 x double> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = fdiv <4 x double> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = fdiv <8 x double> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
%F32 = fdiv float undef, undef		%F32 = fdiv float undef, undef
%V4F32 = fdiv <4 x float> undef, undef		%V4F32 = fdiv <4 x float> undef, undef
%V8F32 = fdiv <8 x float> undef, undef		%V8F32 = fdiv <8 x float> undef, undef
%V16F32 = fdiv <16 x float> undef, undef		%V16F32 = fdiv <16 x float> undef, undef

%F64 = fdiv double undef, undef		%F64 = fdiv double undef, undef
%V2F64 = fdiv <2 x double> undef, undef		%V2F64 = fdiv <2 x double> undef, undef
%V4F64 = fdiv <4 x double> undef, undef		%V4F64 = fdiv <4 x double> undef, undef
%V8F64 = fdiv <8 x double> undef, undef		%V8F64 = fdiv <8 x double> undef, undef

ret i32 undef		ret i32 undef
}		}

define i32 @frem(i32 %arg) {		define i32 @frem(i32 %arg) {
; LATE-LABEL: 'frem'		; CHECK-LABEL: 'frem'
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = frem float undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F32 = frem float undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = frem <4 x float> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F32 = frem <4 x float> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = frem <8 x float> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = frem <8 x float> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16F32 = frem <16 x float> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = frem <16 x float> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F64 = frem double undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F64 = frem double undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = frem <2 x double> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = frem <2 x double> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = frem <4 x double> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = frem <4 x double> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F64 = frem <8 x double> undef, undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = frem <8 x double> undef, undef
; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;
; SIZE-LABEL: 'frem'
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F32 = frem float undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F32 = frem <4 x float> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = frem <8 x float> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = frem <16 x float> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F64 = frem double undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = frem <2 x double> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = frem <4 x double> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = frem <8 x double> undef, undef
; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;
; SIZE_LATE-LABEL: 'frem'
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F32 = frem float undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F32 = frem <4 x float> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = frem <8 x float> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = frem <16 x float> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %F64 = frem double undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = frem <2 x double> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = frem <4 x double> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = frem <8 x double> undef, undef
; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
%F32 = frem float undef, undef		%F32 = frem float undef, undef
%V4F32 = frem <4 x float> undef, undef		%V4F32 = frem <4 x float> undef, undef
%V8F32 = frem <8 x float> undef, undef		%V8F32 = frem <8 x float> undef, undef
%V16F32 = frem <16 x float> undef, undef		%V16F32 = frem <16 x float> undef, undef

%F64 = frem double undef, undef		%F64 = frem double undef, undef
%V2F64 = frem <2 x double> undef, undef		%V2F64 = frem <2 x double> undef, undef
%V4F64 = frem <4 x double> undef, undef		%V4F64 = frem <4 x double> undef, undef
%V8F64 = frem <8 x double> undef, undef		%V8F64 = frem <8 x double> undef, undef

ret i32 undef		ret i32 undef
}		}

define i32 @fsqrt(i32 %arg) {		define i32 @fsqrt(i32 %arg) {
; LATE-LABEL: 'fsqrt'		; SSE-LATE-LABEL: 'fsqrt'
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 128 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE-LABEL: 'fsqrt'		; SSE-SIZE-LABEL: 'fsqrt'
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 128 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 128 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE_LATE-LABEL: 'fsqrt'		; SSE-SIZE_LATE-LABEL: 'fsqrt'
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 64 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 128 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 128 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
		; AVX1-LATE-LABEL: 'fsqrt'
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 86 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)
		; AVX1-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
		;
; AVX1-SIZE_LATE-LABEL: 'fsqrt'		; AVX1-SIZE_LATE-LABEL: 'fsqrt'
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 86 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 86 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)
; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; AVX1-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
		; AVX2-LATE-LABEL: 'fsqrt'
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 56 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)
		; AVX2-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
		;
; AVX2-SIZE_LATE-LABEL: 'fsqrt'		; AVX2-SIZE_LATE-LABEL: 'fsqrt'
; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)		; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %F32 = call float @llvm.sqrt.f32(float undef)
; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)		; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F32 = call <4 x float> @llvm.sqrt.v4f32(<4 x float> undef)
; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)		; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V8F32 = call <8 x float> @llvm.sqrt.v8f32(<8 x float> undef)
; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)		; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)		; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)		; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)		; AVX2-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
[... 9 lines not shown ...]	;
%V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)		%V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
%V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)		%V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
%V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)		%V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)

ret i32 undef		ret i32 undef
}		}

define i32 @fabs(i32 %arg) {		define i32 @fabs(i32 %arg) {
; LATE-LABEL: 'fabs'		; SSE-LATE-LABEL: 'fabs'
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = call float @llvm.fabs.f32(float undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F64 = call double @llvm.fabs.f64(double undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE-LABEL: 'fabs'		; SSE-SIZE-LABEL: 'fabs'
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE_LATE-LABEL: 'fabs'		; SSE-SIZE_LATE-LABEL: 'fabs'
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
		; AVX-LATE-LABEL: 'fabs'
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
		;
; AVX-SIZE-LABEL: 'fabs'		; AVX-SIZE-LABEL: 'fabs'
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.fabs.f32(float undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.fabs.v4f32(<4 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = call <8 x float> @llvm.fabs.v8f32(<8 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = call <16 x float> @llvm.fabs.v16f32(<16 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.fabs.f64(double undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)
Show All 20 Lines	;
%V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)		%V2F64 = call <2 x double> @llvm.fabs.v2f64(<2 x double> undef)
%V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)		%V4F64 = call <4 x double> @llvm.fabs.v4f64(<4 x double> undef)
%V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)		%V8F64 = call <8 x double> @llvm.fabs.v8f64(<8 x double> undef)

ret i32 undef		ret i32 undef
}		}

define i32 @fcopysign(i32 %arg) {		define i32 @fcopysign(i32 %arg) {
; LATE-LABEL: 'fcopysign'		; SSE-LATE-LABEL: 'fcopysign'
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE-LABEL: 'fcopysign'		; SSE-SIZE-LABEL: 'fcopysign'
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE_LATE-LABEL: 'fcopysign'		; SSE-SIZE_LATE-LABEL: 'fcopysign'
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
		; AVX-LATE-LABEL: 'fcopysign'
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
		;
; AVX-SIZE-LABEL: 'fcopysign'		; AVX-SIZE-LABEL: 'fcopysign'
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F32 = call float @llvm.copysign.f32(float undef, float undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F32 = call <4 x float> @llvm.copysign.v4f32(<4 x float> undef, <4 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8F32 = call <8 x float> @llvm.copysign.v8f32(<8 x float> undef, <8 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16F32 = call <16 x float> @llvm.copysign.v16f32(<16 x float> undef, <16 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.copysign.f64(double undef, double undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)
Show All 20 Lines	;
%V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)		%V2F64 = call <2 x double> @llvm.copysign.v2f64(<2 x double> undef, <2 x double> undef)
%V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)		%V4F64 = call <4 x double> @llvm.copysign.v4f64(<4 x double> undef, <4 x double> undef)
%V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)		%V8F64 = call <8 x double> @llvm.copysign.v8f64(<8 x double> undef, <8 x double> undef)

ret i32 undef		ret i32 undef
}		}

define i32 @fma(i32 %arg) {		define i32 @fma(i32 %arg) {
; LATE-LABEL: 'fma'		; SSE-LATE-LABEL: 'fma'
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 86 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 172 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 42 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8F64 = call <8 x double> @llvm.fma.v8f64(<8 x double> undef, <8 x double> undef, <8 x double> undef)		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 84 for instruction: %V8F64 = call <8 x double> @llvm.fma.v8f64(<8 x double> undef, <8 x double> undef, <8 x double> undef)
; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE-LABEL: 'fma'		; SSE-SIZE-LABEL: 'fma'
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 14 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V8F64 = call <8 x double> @llvm.fma.v8f64(<8 x double> undef, <8 x double> undef, <8 x double> undef)		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V8F64 = call <8 x double> @llvm.fma.v8f64(<8 x double> undef, <8 x double> undef, <8 x double> undef)
; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
; SSE-SIZE_LATE-LABEL: 'fma'		; SSE-SIZE_LATE-LABEL: 'fma'
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 86 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 86 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 172 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 172 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 42 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 42 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 84 for instruction: %V8F64 = call <8 x double> @llvm.fma.v8f64(<8 x double> undef, <8 x double> undef, <8 x double> undef)		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 84 for instruction: %V8F64 = call <8 x double> @llvm.fma.v8f64(<8 x double> undef, <8 x double> undef, <8 x double> undef)
; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef		; SSE-SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
;		;
		; AVX-LATE-LABEL: 'fma'
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 88 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 176 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 21 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 43 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 86 for instruction: %V8F64 = call <8 x double> @llvm.fma.v8f64(<8 x double> undef, <8 x double> undef, <8 x double> undef)
		; AVX-LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 undef
		;
; AVX-SIZE-LABEL: 'fma'		; AVX-SIZE-LABEL: 'fma'
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F32 = call float @llvm.fma.f32(float undef, float undef, float undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F32 = call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> undef, <4 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V8F32 = call <8 x float> @llvm.fma.v8f32(<8 x float> undef, <8 x float> undef, <8 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V16F32 = call <16 x float> @llvm.fma.v16f32(<16 x float> undef, <16 x float> undef, <16 x float> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F64 = call double @llvm.fma.f64(double undef, double undef, double undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2F64 = call <2 x double> @llvm.fma.v2f64(<2 x double> undef, <2 x double> undef, <2 x double> undef)
; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)		; AVX-SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V4F64 = call <4 x double> @llvm.fma.v4f64(<4 x double> undef, <4 x double> undef, <4 x double> undef)

llvm/test/Analysis/CostModel/X86/costmodel.ll

	; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %A2 = alloca i64, i64 undef, align 8			; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %A2 = alloca i64, i64 undef, align 8
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I64 = add i64 undef, undef			; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I64 = add i64 undef, undef
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %1 = load i64, i64* undef, align 4			; LATENCY-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %1 = load i64, i64* undef, align 4
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %BC = bitcast i8* undef to i32*			; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %BC = bitcast i8* undef to i32*
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %I2P = inttoptr i64 undef to i8*			; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %I2P = inttoptr i64 undef to i8*
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %P2I = ptrtoint i8* undef to i64			; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %P2I = ptrtoint i8* undef to i64
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %TC = trunc i64 undef to i32			; LATENCY-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %TC = trunc i64 undef to i32
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %uadd = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 undef, i32 undef)			; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %uadd = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 undef, i32 undef)
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 40 for instruction: call void undef()			; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: call void undef()
	; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i64 undef			; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i64 undef
	;			;
	; CODESIZE-LABEL: 'foo'			; CODESIZE-LABEL: 'foo'
	; CODESIZE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %A1 = alloca i32, align 8			; CODESIZE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %A1 = alloca i32, align 8
	; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %A2 = alloca i64, i64 undef, align 8			; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %A2 = alloca i64, i64 undef, align 8
	; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I64 = add i64 undef, undef			; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I64 = add i64 undef, undef
	; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = load i64, i64* undef, align 4			; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = load i64, i64* undef, align 4
	; CODESIZE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %BC = bitcast i8* undef to i32*			; CODESIZE-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %BC = bitcast i8* undef to i32*

llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll


	define void @umul(i32 %a, i32 %b, <16 x i32> %va, <16 x i32> %vb) {			define void @umul(i32 %a, i32 %b, <16 x i32> %va, <16 x i32> %vb) {
	; THRU-LABEL: 'umul'			; THRU-LABEL: 'umul'
	; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)			; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
	; THRU-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)			; THRU-NEXT: Cost Model: Found an estimated cost of 112 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'umul'			; LATE-LABEL: 'umul'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)			; LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)			; LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'umul'			; SIZE-LABEL: 'umul'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)			; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)			; SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'umul'			; SIZE_LATE-LABEL: 'umul'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call { <16 x i32>, <16 x i1> } @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)			%s = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
	%v = call {<16 x i32>, <16 x i1>} @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)			%v = call {<16 x i32>, <16 x i1>} @llvm.umul.with.overflow.v16i32(<16 x i32> %va, <16 x i32> %vb)
	ret void			ret void
	}			}

	define void @smax(i32 %a, i32 %b, <16 x i32> %va, <16 x i32> %vb) {			define void @smax(i32 %a, i32 %b, <16 x i32> %va, <16 x i32> %vb) {
	; THRU-LABEL: 'smax'			; THRU-LABEL: 'smax'
	; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; THRU-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; THRU-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'smax'			; LATE-LABEL: 'smax'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'smax'			; SIZE-LABEL: 'smax'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'smax'			; SIZE_LATE-LABEL: 'smax'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call i32 @llvm.smax.i32(i32 %a, i32 %b)			%s = call i32 @llvm.smax.i32(i32 %a, i32 %b)
	%v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)			%v = call <16 x i32> @llvm.smax.v16i32(<16 x i32> %va, <16 x i32> %vb)
	ret void			ret void
	}			}

	define void @fcopysign(float %a, float %b, <16 x float> %va, <16 x float> %vb) {			define void @fcopysign(float %a, float %b, <16 x float> %va, <16 x float> %vb) {
	; THRU-LABEL: 'fcopysign'			; THRU-LABEL: 'fcopysign'
	; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)			; THRU-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)
	; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)			; THRU-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'fcopysign'			; LATE-LABEL: 'fcopysign'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)			; LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)			; LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'fcopysign'			; SIZE-LABEL: 'fcopysign'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)			; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)			; SIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'fcopysign'			; SIZE_LATE-LABEL: 'fcopysign'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.copysign.f32(float %a, float %b)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.copysign.f32(float %a, float %b)			%s = call float @llvm.copysign.f32(float %a, float %b)
	%v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)			%v = call <16 x float> @llvm.copysign.v16f32(<16 x float> %va, <16 x float> %vb)
	ret void			ret void
	}			}

	define void @fmuladd(float %a, float %b, float %c, <16 x float> %va, <16 x float> %vb, <16 x float> %vc) {			define void @fmuladd(float %a, float %b, float %c, <16 x float> %va, <16 x float> %vb, <16 x float> %vc) {
	; THRU-LABEL: 'fmuladd'			; THRU-LABEL: 'fmuladd'
	; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; THRU-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'fmuladd'			; LATE-LABEL: 'fmuladd'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; LATE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'fmuladd'			; SIZE-LABEL: 'fmuladd'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'fmuladd'			; SIZE_LATE-LABEL: 'fmuladd'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)			%s = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
	%v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)			%v = call <16 x float> @llvm.fmuladd.v16f32(<16 x float> %va, <16 x float> %vb, <16 x float> %vc)
	ret void			ret void
	}			}

	define void @log2(float %a, <16 x float> %va) {			define void @log2(float %a, <16 x float> %va) {
	; THRU-LABEL: 'log2'			; THRU-LABEL: 'log2'
	; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)			; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)
	; THRU-NEXT: Cost Model: Found an estimated cost of 184 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; THRU-NEXT: Cost Model: Found an estimated cost of 184 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'log2'			; LATE-LABEL: 'log2'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.log2.f32(float %a)			; LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; LATE-NEXT: Cost Model: Found an estimated cost of 184 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'log2'			; SIZE-LABEL: 'log2'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.log2.f32(float %a)			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.log2.f32(float %a)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; SIZE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'log2'			; SIZE_LATE-LABEL: 'log2'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.log2.f32(float %a)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 184 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 184 for instruction: %v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.log2.f32(float %a)			%s = call float @llvm.log2.f32(float %a)
	%v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)			%v = call <16 x float> @llvm.log2.v16f32(<16 x float> %va)
	ret void			ret void
	}			}

	define void @constrained_fadd(float %a, <16 x float> %va) {			define void @constrained_fadd(float %a, <16 x float> %va) {
	; THRU-LABEL: 'constrained_fadd'			; THRU-LABEL: 'constrained_fadd'
	; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; THRU-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; THRU-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; THRU-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'constrained_fadd'			; LATE-LABEL: 'constrained_fadd'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; LATE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'constrained_fadd'			; SIZE-LABEL: 'constrained_fadd'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'constrained_fadd'			; SIZE_LATE-LABEL: 'constrained_fadd'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")			%s = call float @llvm.experimental.constrained.fadd.f32(float %a, float %a, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	%t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")			%t = call <16 x float> @llvm.experimental.constrained.fadd.v16f32(<16 x float> %va, <16 x float> %va, metadata !"round.dynamic", metadata !"fpexcept.ignore")
	ret void			ret void
	}			}

	define void @fmaximum(float %a, float %b, <16 x float> %va, <16 x float> %vb) {			define void @fmaximum(float %a, float %b, <16 x float> %va, <16 x float> %vb) {
	; THRU-LABEL: 'fmaximum'			; THRU-LABEL: 'fmaximum'
	; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)			; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)
	; THRU-NEXT: Cost Model: Found an estimated cost of 196 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			; THRU-NEXT: Cost Model: Found an estimated cost of 196 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'fmaximum'			; LATE-LABEL: 'fmaximum'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)			; LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			; LATE-NEXT: Cost Model: Found an estimated cost of 196 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'fmaximum'			; SIZE-LABEL: 'fmaximum'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 52 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			; SIZE-NEXT: Cost Model: Found an estimated cost of 52 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'fmaximum'			; SIZE_LATE-LABEL: 'fmaximum'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %s = call float @llvm.maximum.f32(float %a, float %b)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 196 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 196 for instruction: %v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call float @llvm.maximum.f32(float %a, float %b)			%s = call float @llvm.maximum.f32(float %a, float %b)
	%v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)			%v = call <16 x float> @llvm.maximum.v16f32(<16 x float> %va, <16 x float> %vb)
	ret void			ret void
	}			}

	define void @cttz(i32 %a, <16 x i32> %va) {			define void @cttz(i32 %a, <16 x i32> %va) {
	; THRU-LABEL: 'cttz'			; THRU-LABEL: 'cttz'
	; THRU-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			; THRU-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	; THRU-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			; THRU-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'cttz'			; LATE-LABEL: 'cttz'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			; LATE-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'cttz'			; SIZE-LABEL: 'cttz'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			; SIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			; SIZE-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'cttz'			; SIZE_LATE-LABEL: 'cttz'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 72 for instruction: %v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call i32 @llvm.cttz.i32(i32 %a, i1 false)			%s = call i32 @llvm.cttz.i32(i32 %a, i1 false)
	%v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)			%v = call <16 x i32> @llvm.cttz.v16i32(<16 x i32> %va, i1 false)
	ret void			ret void
	}			}

	define void @ctlz(i32 %a, <16 x i32> %va) {			define void @ctlz(i32 %a, <16 x i32> %va) {
	; THRU-LABEL: 'ctlz'			; THRU-LABEL: 'ctlz'
	; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; THRU-NEXT: Cost Model: Found an estimated cost of 104 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; THRU-NEXT: Cost Model: Found an estimated cost of 104 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'ctlz'			; LATE-LABEL: 'ctlz'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; LATE-NEXT: Cost Model: Found an estimated cost of 104 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'ctlz'			; SIZE-LABEL: 'ctlz'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 104 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; SIZE-NEXT: Cost Model: Found an estimated cost of 104 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'ctlz'			; SIZE_LATE-LABEL: 'ctlz'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 104 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 104 for instruction: %v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)			%s = call i32 @llvm.ctlz.i32(i32 %a, i1 true)
	%v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)			%v = call <16 x i32> @llvm.ctlz.v16i32(<16 x i32> %va, i1 true)
	ret void			ret void
	}			}

	define void @fshl(i32 %a, i32 %b, i32 %c, <16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc) {			define void @fshl(i32 %a, i32 %b, i32 %c, <16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc) {
	; THRU-LABEL: 'fshl'			; THRU-LABEL: 'fshl'
	; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; THRU-NEXT: Cost Model: Found an estimated cost of 136 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 136 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'fshl'			; LATE-LABEL: 'fshl'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'fshl'			; SIZE-LABEL: 'fshl'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'fshl'			; SIZE_LATE-LABEL: 'fshl'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)			%s = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %c)
	%v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)			%v = call <16 x i32> @llvm.fshl.v16i32(<16 x i32> %va, <16 x i32> %vb, <16 x i32> %vc)
	ret void			ret void
	}			}

	define void @maskedgather(<16 x float*> %va, <16 x i1> %vb, <16 x float> %vc) {			define void @maskedgather(<16 x float*> %va, <16 x i1> %vb, <16 x float> %vc) {
	; THRU-LABEL: 'maskedgather'			; THRU-LABEL: 'maskedgather'
	; THRU-NEXT: Cost Model: Found an estimated cost of 77 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 77 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'maskedgather'			; LATE-LABEL: 'maskedgather'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'maskedgather'			; SIZE-LABEL: 'maskedgather'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'maskedgather'			; SIZE_LATE-LABEL: 'maskedgather'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: %v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)			%v = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %va, i32 1, <16 x i1> %vb, <16 x float> %vc)
	ret void			ret void
	}			}

	define void @maskedscatter(<16 x float> %va, <16 x float*> %vb, <16 x i1> %vc) {			define void @maskedscatter(<16 x float> %va, <16 x float*> %vb, <16 x i1> %vc) {
	; THRU-LABEL: 'maskedscatter'			; THRU-LABEL: 'maskedscatter'
	; THRU-NEXT: Cost Model: Found an estimated cost of 77 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; THRU-NEXT: Cost Model: Found an estimated cost of 77 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'maskedscatter'			; LATE-LABEL: 'maskedscatter'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; LATE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'maskedscatter'			; SIZE-LABEL: 'maskedscatter'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; SIZE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'maskedscatter'			; SIZE_LATE-LABEL: 'maskedscatter'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 76 for instruction: call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)			call void @llvm.masked.scatter.v16f32.v16p0f32(<16 x float> %va, <16 x float*> %vb, i32 1, <16 x i1> %vc)
	ret void			ret void
	}			}

	define void @reduce_fmax(<16 x float> %va) {			define void @reduce_fmax(<16 x float> %va) {
	; THRU-LABEL: 'reduce_fmax'			; THRU-LABEL: 'reduce_fmax'
	; THRU-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; THRU-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'reduce_fmax'			; LATE-LABEL: 'reduce_fmax'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'reduce_fmax'			; SIZE-LABEL: 'reduce_fmax'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'reduce_fmax'			; SIZE_LATE-LABEL: 'reduce_fmax'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)			%v = call float @llvm.vector.reduce.fmax.v16f32(<16 x float> %va)
	ret void			ret void
	}			}

	define void @reduce_fmul(<16 x float> %va) {			define void @reduce_fmul(<16 x float> %va) {
	; THRU-LABEL: 'reduce_fmul'			; THRU-LABEL: 'reduce_fmul'
	; THRU-NEXT: Cost Model: Found an estimated cost of 44 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)			; THRU-NEXT: Cost Model: Found an estimated cost of 44 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'reduce_fmul'			; LATE-LABEL: 'reduce_fmul'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)			; LATE-NEXT: Cost Model: Found an estimated cost of 60 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'reduce_fmul'			; SIZE-LABEL: 'reduce_fmul'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)			; SIZE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'reduce_fmul'			; SIZE_LATE-LABEL: 'reduce_fmul'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 28 for instruction: %v = call float @llvm.vector.reduce.fmul.v16f32(float 4.200000e+01, <16 x float> %va)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%v = call float @llvm.vector.reduce.fmul.v16f32(float 42.0, <16 x float> %va)			%v = call float @llvm.vector.reduce.fmul.v16f32(float 42.0, <16 x float> %va)
	ret void			ret void
	}			}

	define void @reduce_fadd_fast(<16 x float> %va) {			define void @reduce_fadd_fast(<16 x float> %va) {
	; THRU-LABEL: 'reduce_fadd_fast'			; THRU-LABEL: 'reduce_fadd_fast'
	; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)			; THRU-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'reduce_fadd_fast'			; LATE-LABEL: 'reduce_fadd_fast'
	; LATE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)			; LATE-NEXT: Cost Model: Found an estimated cost of 13 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'reduce_fadd_fast'			; SIZE-LABEL: 'reduce_fadd_fast'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)			; SIZE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'reduce_fadd_fast'			; SIZE_LATE-LABEL: 'reduce_fadd_fast'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.000000e+00, <16 x float> %va)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.0, <16 x float> %va)			%v = call fast float @llvm.vector.reduce.fadd.v16f32(float 0.0, <16 x float> %va)
	ret void			ret void
	}			}

	define void @memcpy(i8* %a, i8* %b, i32 %c) {			define void @memcpy(i8* %a, i8* %b, i32 %c) {
	; THRU-LABEL: 'memcpy'			; THRU-LABEL: 'memcpy'
	; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; THRU-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; THRU-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; LATE-LABEL: 'memcpy'			; LATE-LABEL: 'memcpy'
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE-LABEL: 'memcpy'			; SIZE-LABEL: 'memcpy'
	; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; SIZE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	; SIZE_LATE-LABEL: 'memcpy'			; SIZE_LATE-LABEL: 'memcpy'
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; SIZE_LATE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)			call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 1 %a, i8* align 1 %b, i32 32, i1 false)
	ret void			ret void
	}			}

Revision Contents

Diff 453616

llvm/include/llvm/Analysis/TargetTransformInfo.h

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

llvm/include/llvm/CodeGen/BasicTTIImpl.h

llvm/lib/Analysis/CodeMetrics.cpp

llvm/lib/Analysis/InlineCost.cpp

llvm/lib/Analysis/TargetTransformInfo.cpp

llvm/lib/CodeGen/CodeGenPrepare.cpp

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp

llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h

llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

llvm/lib/Target/X86/X86TargetTransformInfo.cpp

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp

llvm/lib/Transforms/Scalar/JumpThreading.cpp

llvm/lib/Transforms/Scalar/LICM.cpp

llvm/lib/Transforms/Scalar/LoopFlatten.cpp

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp

llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp

llvm/lib/Transforms/Scalar/SpeculativeExecution.cpp

llvm/lib/Transforms/Utils/SimplifyCFG.cpp

llvm/test/Analysis/CostModel/AArch64/sve-math.ll

llvm/test/Analysis/CostModel/ARM/intrinsic-cost-kinds.ll

llvm/test/Analysis/CostModel/ARM/target-intrinsics.ll

llvm/test/Analysis/CostModel/SystemZ/ext-of-icmp-cost.ll

llvm/test/Analysis/CostModel/X86/arith-fp-costkinds.ll

llvm/test/Analysis/CostModel/X86/costmodel.ll

llvm/test/Analysis/CostModel/X86/intrinsic-cost-kinds.ll
