This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Add target option to bail scalarization
Needs Revision · Public

Authored by hgreving on Jun 9 2023, 6:04 PM.

Details

Reviewers

majnemer
jmolloy
ThomasRaoux
nikic

Summary

Adds a TTI option to consider scalarization unprofitable for
targets that prefer vector code to stay vectorized.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

hgreving created this revision. Jun 9 2023, 6:04 PM

Herald added a project: Restricted Project. Jun 9 2023, 6:04 PM

Herald added a subscriber: hiraditya.

hgreving requested review of this revision. Jun 9 2023, 6:04 PM

Herald added a project: Restricted Project. Jun 9 2023, 6:04 PM

Herald added a subscriber: llvm-commits.

hgreving added reviewers: majnemer, jmolloy, ThomasRaoux, nikic. Jun 9 2023, 6:06 PM

Herald added a subscriber: StephenFan. Jun 9 2023, 6:06 PM

Harbormaster completed remote builds in B237896: Diff 530125. Jun 9 2023, 7:32 PM

InstCombine is a target-independent canonicalization pass. The use of TTI hooks is forbidden as a matter of policy.

If there is no subset of scalarization that is universally profitable, then the transform should be moved into VectorCombine, which is cost-model driven.

Inline comment on llvm/include/llvm/Transforms/InstCombine/InstCombiner.h, line 49: Read the comment.

This revision now requires changes to proceed. Jun 10 2023, 12:23 AM

In D152600#4410907, @nikic wrote:

InstCombine is a target-independent canonicalization pass. The use of TTI hooks is forbidden as a matter of policy.

If there is no subset of scalarization that is universally profitable, then the transform should be moved into VectorCombine, which is cost-model driven.

I expected this kind of reply, I felt though that TTI is already used in some cases. If it's strictly target independent, I don't understand why we would scalarize in instcombine in the first place, since this is very unlikely to be target agnostic. Would you be ok with a switch (probably not)? The only problem with moving it to vector-combine is that besides scalarize phi, it seems entangled with extract_elt transforms that _are_ target independent, WDYT?

In D152600#4414070, @hgreving wrote:

I expected this kind of reply, I felt though that TTI is already used in some cases.

TTI is only used for handling of target intrinsics.

If it's strictly target independent, I don't understand why we would scalarize in instcombine in the first place, since this is very unlikely to be target agnostic.

Probably just an old transform that used to be universally profitable at the time.

Would you be ok with a switch (probably not)? The only problem with moving it to vector-combine is that besides scalarize phi, it seems entangled with extract_elt transforms that _are_ target independent, WDYT?

The patch description has no information on what problem you're actually trying to solve, so it's hard to give meaningful advice here. Is any kind of scalarization problematic for your target, or is it something more specific, e.g. an expensive variable extract from a binop being converted into two expensive variable extracts?

In D152600#4417270, @nikic wrote:

The patch description has no information on what problem you're actually trying to solve, so it's hard to give meaningful advice here. Is any kind of scalarization problematic for your target, or is it something more specific, e.g. an expensive variable extract from a binop being converted into two expensive variable extracts?

%0 = mul <N x i32> %x, <splat vector>
%1 = extractelement <N x i32> %0, i32 %i
%2 = insertelement <N x i32> poison, i32 %1, i32 0
%3 = shufflevector <N x i32> %2, <N x i32> poison, <N x i32> zeroinitializer

InstCombine scalarizes this: the vector mul becomes an extract plus a scalar mul (strength-reduced to shl), followed by the insert and shuffle. Our target does not like the resulting vector-to-scalar code; it is not "cheap" there. Another way to fix this is to re-vectorize the code later, but my preference was to avoid scalarizing like this in the first place. I suppose adding a switch to InstCombine is not really an option (I don't like it)?
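As an illustration of the cost-model-driven alternative raised in the thread, here is a minimal sketch of how a profitability check for the extract-of-binop fold could look. The `ToyCosts` struct, its fields, and `shouldScalarizeExtractOfBinop` are hypothetical names invented for this example; they are not part of TargetTransformInfo or VectorCombine, and the cost numbers are made up.

```cpp
#include <cassert>

// Hypothetical per-target costs. This is NOT LLVM's TargetTransformInfo API;
// it only stands in for whatever a real cost model would report.
struct ToyCosts {
  unsigned VectorOp; // cost of one vector binop
  unsigned ScalarOp; // cost of one scalar binop
  unsigned Extract;  // cost of one extractelement
};

// Models the fold
//   extelt (binop X, Y), Idx  -->  binop (extelt X, Idx), (extelt Y, Idx)
// NumExtractedOperands is the number of operands that still need a real
// extract afterwards (a constant splat operand folds to a scalar constant,
// so the mul-by-splat example from the thread has one such operand).
inline bool shouldScalarizeExtractOfBinop(const ToyCosts &C,
                                          unsigned NumExtractedOperands) {
  assert(NumExtractedOperands <= 2 && "a binop has at most two operands");
  unsigned OldCost = C.VectorOp + C.Extract;
  unsigned NewCost = C.ScalarOp + NumExtractedOperands * C.Extract;
  // Only transform when the scalar form is strictly cheaper.
  return NewCost < OldCost;
}
```

With a check of this shape, a target that reports expensive scalar operations or expensive extracts would simply never see the fold fire, without a separate yes/no hook.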

Revision Contents

Path                                                       Size
llvm/include/llvm/Analysis/TargetTransformInfo.h           5 lines
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h       2 lines
llvm/include/llvm/Transforms/InstCombine/InstCombiner.h    12 lines
llvm/lib/Analysis/TargetTransformInfo.cpp                  4 lines
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp   17 lines

Diff 530125

llvm/include/llvm/Analysis/TargetTransformInfo.h

Show First 20 Lines • Show All 767 Lines • ▼ Show 20 Lines	public:
   /// addrspacecast to generic AS for volatile loads/stores. Default
   /// implementation returns false, which prevents address space inference for
   /// volatile loads/stores.
   bool hasVolatileVariant(Instruction *I, unsigned AddrSpace) const;

   /// Return true if target doesn't mind addresses in vectors.
   bool prefersVectorizedAddressing() const;

+  /// Returns true if target is ok with scalarizing vector code.
+  bool isOkToScalarize() const;
+
   /// Return the cost of the scaling factor used in the addressing
   /// mode represented by AM for this target, for a load/store
   /// of the specified type.
   /// If the AM is supported, the return value must be >= 0.
   /// If the AM is not supported, it returns a negative value.
   /// TODO: Handle pre/postinc as well.
   InstructionCost getScalingFactorCost(Type *Ty, GlobalValue *BaseGV,
                                        int64_t BaseOffset, bool HasBaseReg,
▲ Show 20 Lines • Show All 969 Lines • ▼ Show 20 Lines	public:
   virtual bool isLegalMaskedExpandLoad(Type *DataType) = 0;
   virtual bool isLegalAltInstr(VectorType *VecTy, unsigned Opcode0,
                                unsigned Opcode1,
                                const SmallBitVector &OpcodeMask) const = 0;
   virtual bool enableOrderedReductions() = 0;
   virtual bool hasDivRemOp(Type *DataType, bool IsSigned) = 0;
   virtual bool hasVolatileVariant(Instruction *I, unsigned AddrSpace) = 0;
   virtual bool prefersVectorizedAddressing() = 0;
+  virtual bool isOkToScalarize() = 0;
   virtual InstructionCost getScalingFactorCost(Type *Ty, GlobalValue *BaseGV,
                                                int64_t BaseOffset,
                                                bool HasBaseReg, int64_t Scale,
                                                unsigned AddrSpace) = 0;
   virtual bool LSRWithInstrQueries() = 0;
   virtual bool isTruncateFree(Type *Ty1, Type *Ty2) = 0;
   virtual bool isProfitableToHoist(Instruction *I) = 0;
   virtual bool useAA() = 0;
▲ Show 20 Lines • Show All 452 Lines • ▼ Show 20 Lines	bool hasDivRemOp(Type *DataType, bool IsSigned) override {
     return Impl.hasDivRemOp(DataType, IsSigned);
   }
   bool hasVolatileVariant(Instruction *I, unsigned AddrSpace) override {
     return Impl.hasVolatileVariant(I, AddrSpace);
   }
   bool prefersVectorizedAddressing() override {
     return Impl.prefersVectorizedAddressing();
   }
+  bool isOkToScalarize() override { return Impl.isOkToScalarize(); }
   InstructionCost getScalingFactorCost(Type *Ty, GlobalValue *BaseGV,
                                        int64_t BaseOffset, bool HasBaseReg,
                                        int64_t Scale,
                                        unsigned AddrSpace) override {
     return Impl.getScalingFactorCost(Ty, BaseGV, BaseOffset, HasBaseReg, Scale,
                                      AddrSpace);
   }
   bool LSRWithInstrQueries() override { return Impl.LSRWithInstrQueries(); }
▲ Show 20 Lines • Show All 569 Lines • Show Last 20 Lines

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

Show First 20 Lines • Show All 291 Lines • ▼ Show 20 Lines	public:
   bool hasDivRemOp(Type *DataType, bool IsSigned) const { return false; }

   bool hasVolatileVariant(Instruction *I, unsigned AddrSpace) const {
     return false;
   }

   bool prefersVectorizedAddressing() const { return true; }

+  bool isOkToScalarize() const { return true; }
+
   InstructionCost getScalingFactorCost(Type *Ty, GlobalValue *BaseGV,
                                        int64_t BaseOffset, bool HasBaseReg,
                                        int64_t Scale,
                                        unsigned AddrSpace) const {
     // Guess that all legal addressing mode are free.
     if (isLegalAddressingMode(Ty, BaseGV, BaseOffset, HasBaseReg, Scale,
                               AddrSpace))
       return 0;
▲ Show 20 Lines • Show All 1,048 Lines • Show Last 20 Lines

llvm/include/llvm/Transforms/InstCombine/InstCombiner.h

	Show All 38 Lines
 class TargetLibraryInfo;
 class TargetTransformInfo;

 /// The core instruction combiner logic.
 ///
 /// This class provides both the logic to recursively visit instructions and
 /// combine them.
 class LLVM_LIBRARY_VISIBILITY InstCombiner {
-  /// Only used to call target specific intrinsic combining.
-  /// It must NOT be used for any other purpose, as InstCombine is a
-  /// target-independent canonicalization transform.
   [Inline comment, nikic: Read the comment.]
-  TargetTransformInfo &TTI;
-
 public:
   /// Maximum size of array considered when transforming.
   uint64_t MaxArraySizeForCombine = 0;

   /// An IRBuilder that automatically inserts new instructions into the
   /// worklist.
   using BuilderTy = IRBuilder<TargetFolder, IRBuilderCallbackInserter>;
   BuilderTy &Builder;

 protected:
+  /// Only used to call target specific intrinsic combining.
+  /// It must NOT be used for any other purpose, as InstCombine is a
+  /// target-independent canonicalization transform.
+  TargetTransformInfo &TTI;
+
   /// A worklist of the instructions that need to be simplified.
   InstructionWorklist &Worklist;

   // Mode in which we are running the combiner.
   const bool MinimizeSize;

   AAResults *AA;
	Show All 15 Lines

 public:
   InstCombiner(InstructionWorklist &Worklist, BuilderTy &Builder,
                bool MinimizeSize, AAResults *AA, AssumptionCache &AC,
                TargetLibraryInfo &TLI, TargetTransformInfo &TTI,
                DominatorTree &DT, OptimizationRemarkEmitter &ORE,
                BlockFrequencyInfo *BFI, ProfileSummaryInfo *PSI,
                const DataLayout &DL, LoopInfo *LI)
-      : TTI(TTI), Builder(Builder), Worklist(Worklist),
+      : Builder(Builder), TTI(TTI), Worklist(Worklist),
         MinimizeSize(MinimizeSize), AA(AA), AC(AC), TLI(TLI), DT(DT), DL(DL),
         SQ(DL, &TLI, &DT, &AC), ORE(ORE), BFI(BFI), PSI(PSI), LI(LI) {}

   virtual ~InstCombiner() = default;

   /// Return the source operand of a potentially bitcasted value while
   /// optionally checking if it has one use. If there is no bitcast or the one
   /// use check is not met, return the input value itself.
	▲ Show 20 Lines • Show All 440 Lines • Show Last 20 Lines

llvm/lib/Analysis/TargetTransformInfo.cpp

Show First 20 Lines • Show All 483 Lines • ▼ Show 20 Lines	bool TargetTransformInfo::hasVolatileVariant(Instruction *I,
                                              unsigned AddrSpace) const {
   return TTIImpl->hasVolatileVariant(I, AddrSpace);
 }

 bool TargetTransformInfo::prefersVectorizedAddressing() const {
   return TTIImpl->prefersVectorizedAddressing();
 }

+bool TargetTransformInfo::isOkToScalarize() const {
+  return TTIImpl->isOkToScalarize();
+}
+
 InstructionCost TargetTransformInfo::getScalingFactorCost(
     Type *Ty, GlobalValue *BaseGV, int64_t BaseOffset, bool HasBaseReg,
     int64_t Scale, unsigned AddrSpace) const {
   InstructionCost Cost = TTIImpl->getScalingFactorCost(
       Ty, BaseGV, BaseOffset, HasBaseReg, Scale, AddrSpace);
   assert(Cost >= 0 && "TTI should not produce negative costs!");
   return Cost;
 }
▲ Show 20 Lines • Show All 776 Lines • Show Last 20 Lines
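The three files above each add one layer of the same hook: a public wrapper, a virtual interface with a forwarding model, and a default implementation that returns true. The following is a minimal, self-contained sketch of that forwarding pattern. All `Toy*` names are hypothetical stand-ins invented for this example, and `ToyVectorTargetImpl` is an imagined target; none of this is the actual LLVM class layout.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Default implementation layer: targets that override nothing get `true`.
struct ToyDefaultImpl {
  bool isOkToScalarize() const { return true; }
};

// Hypothetical target that prefers vector code to stay vectorized.
struct ToyVectorTargetImpl : ToyDefaultImpl {
  bool isOkToScalarize() const { return false; }
};

// Public wrapper: callers only ever see this class.
class ToyTTI {
  // Type-erased interface, one pure virtual per hook.
  struct Concept {
    virtual ~Concept() = default;
    virtual bool isOkToScalarize() = 0;
  };

  // Forwards each hook to the concrete target implementation.
  template <typename T> struct Model final : Concept {
    T Impl;
    explicit Model(T I) : Impl(std::move(I)) {}
    bool isOkToScalarize() override { return Impl.isOkToScalarize(); }
  };

  std::unique_ptr<Concept> TTIImpl;

public:
  template <typename T>
  explicit ToyTTI(T Impl) : TTIImpl(new Model<T>(std::move(Impl))) {}

  bool isOkToScalarize() const {
    assert(TTIImpl && "wrapper must hold an implementation");
    return TTIImpl->isOkToScalarize();
  }
};
```

The point of the layering is that adding a hook touches every layer once, which matches the small per-file line counts in this revision.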

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp

Show All 14 Lines
 #include "llvm/ADT/APInt.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallBitVector.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/Analysis/InstructionSimplify.h"
+#include "llvm/Analysis/TargetTransformInfo.h"
 #include "llvm/Analysis/VectorUtils.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Constant.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Instructions.h"
Show All 19 Lines	STATISTIC(NumAggregateReconstructionsSimplified,
           "Number of aggregate reconstructions turned into reuse of the "
           "original aggregate");

 /// Return true if the value is cheaper to scalarize than it is to leave as a
 /// vector operation. If the extract index \p EI is a constant integer then
 /// some operations may be cheap to scalarize.
 ///
 /// FIXME: It's possible to create more instructions than previously existed.
-static bool cheapToScalarize(Value *V, Value *EI) {
+static bool cheapToScalarize(Value *V, Value *EI, TargetTransformInfo &TTI) {
+  if (!TTI.isOkToScalarize())
+    return false;
   ConstantInt *CEI = dyn_cast<ConstantInt>(EI);

   // If we can pick a scalar constant value out of a vector, that is free.
   if (auto *C = dyn_cast<Constant>(V))
     return CEI || C->getSplatValue();

   if (CEI && match(V, m_Intrinsic<Intrinsic::experimental_stepvector>())) {
     ElementCount EC = cast<VectorType>(V->getType())->getElementCount();
Show All 11 Lines	static bool cheapToScalarize(Value V, Value EI, TargetTransformInfo &TTI) {
   if (match(V, m_OneUse(m_Load(m_Value()))))
     return true;

   if (match(V, m_OneUse(m_UnOp())))
     return true;

   Value *V0, *V1;
   if (match(V, m_OneUse(m_BinOp(m_Value(V0), m_Value(V1)))))
-    if (cheapToScalarize(V0, EI) || cheapToScalarize(V1, EI))
+    if (cheapToScalarize(V0, EI, TTI) || cheapToScalarize(V1, EI, TTI))
       return true;

   CmpInst::Predicate UnusedPred;
   if (match(V, m_OneUse(m_Cmp(UnusedPred, m_Value(V0), m_Value(V1)))))
-    if (cheapToScalarize(V0, EI) || cheapToScalarize(V1, EI))
+    if (cheapToScalarize(V0, EI, TTI) || cheapToScalarize(V1, EI, TTI))
       return true;

   return false;
 }

 // If we have a PHI node with a vector type that is only used to feed
 // itself and be an operand of extractelement at a constant location,
 // try to replace the PHI of the vector type with a PHI of a scalar type.
Show All 21 Lines	Instruction *InstCombinerImpl::scalarizePHI(ExtractElementInst &EI,
   if (!PHIUser)
     return nullptr;

   // Verify that this PHI user has one use, which is the PHI itself,
   // and that it is a binary operation which is cheap to scalarize.
   // otherwise return nullptr.
   if (!PHIUser->hasOneUse() || !(PHIUser->user_back() == PN) ||
       !(isa<BinaryOperator>(PHIUser)) ||
-      !cheapToScalarize(PHIUser, EI.getIndexOperand()))
+      !cheapToScalarize(PHIUser, EI.getIndexOperand(), TTI))
     return nullptr;

   // Create a scalar PHI node that will replace the vector PHI node
   // just before the current PHI node.
   PHINode *scalarPHI = cast<PHINode>(InsertNewInstWith(
       PHINode::Create(EI.getType(), PN->getNumIncomingValues(), ""), *PN));
   // Scalarize each PHI operand.
   for (unsigned i = 0; i < PN->getNumIncomingValues(); i++) {
▲ Show 20 Lines • Show All 318 Lines • ▼ Show 20 Lines	if (IndexC) {
     if (auto *Phi = dyn_cast<PHINode>(SrcVec))
       if (Instruction *ScalarPHI = scalarizePHI(EI, Phi))
         return ScalarPHI;
   }

   // TODO come up with a n-ary matcher that subsumes both unary and
   // binary matchers.
   UnaryOperator *UO;
-  if (match(SrcVec, m_UnOp(UO)) && cheapToScalarize(SrcVec, Index)) {
+  if (match(SrcVec, m_UnOp(UO)) && cheapToScalarize(SrcVec, Index, TTI)) {
     // extelt (unop X), Index --> unop (extelt X, Index)
     Value *X = UO->getOperand(0);
     Value *E = Builder.CreateExtractElement(X, Index);
     return UnaryOperator::CreateWithCopiedFlags(UO->getOpcode(), E, UO);
   }

   BinaryOperator *BO;
-  if (match(SrcVec, m_BinOp(BO)) && cheapToScalarize(SrcVec, Index)) {
+  if (match(SrcVec, m_BinOp(BO)) && cheapToScalarize(SrcVec, Index, TTI)) {
     // extelt (binop X, Y), Index --> binop (extelt X, Index), (extelt Y, Index)
     Value *X = BO->getOperand(0), *Y = BO->getOperand(1);
     Value *E0 = Builder.CreateExtractElement(X, Index);
     Value *E1 = Builder.CreateExtractElement(Y, Index);
     return BinaryOperator::CreateWithCopiedFlags(BO->getOpcode(), E0, E1, BO);
   }

   Value *X, *Y;
   CmpInst::Predicate Pred;
   if (match(SrcVec, m_Cmp(Pred, m_Value(X), m_Value(Y))) &&
-      cheapToScalarize(SrcVec, Index)) {
+      cheapToScalarize(SrcVec, Index, TTI)) {
     // extelt (cmp X, Y), Index --> cmp (extelt X, Index), (extelt Y, Index)
     Value *E0 = Builder.CreateExtractElement(X, Index);
     Value *E1 = Builder.CreateExtractElement(Y, Index);
     return CmpInst::Create(cast<CmpInst>(SrcVec)->getOpcode(), Pred, E0, E1);
   }

   if (auto *I = dyn_cast<Instruction>(SrcVec)) {
     if (auto *IE = dyn_cast<InsertElementInst>(I)) {
▲ Show 20 Lines • Show All 2,641 Lines • Show Last 20 Lines
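To make the effect of the early bail-out in cheapToScalarize easier to see, here is a toy model of the visible parts of that function. It is a sketch under stated assumptions: `ToyKind`, `ToyValue`, and `toyCheapToScalarize` are hypothetical names, the model covers only the constant, load, unary, and binary cases shown in the diff (the collapsed lines and the compare case are omitted), and it does not use any LLVM types.

```cpp
#include <cassert>

// Toy value kinds; a stand-in for LLVM's Value hierarchy, not the real API.
enum class ToyKind { Constant, Load, UnOp, BinOp, Other };

struct ToyValue {
  ToyKind Kind;
  bool OneUse;         // models m_OneUse
  bool IsSplat;        // models C->getSplatValue() for constants
  const ToyValue *Op0; // operands, used for BinOp only
  const ToyValue *Op1;
};

// ConstIndex models "the extract index is a ConstantInt".
// TargetOkToScalarize models the result of the proposed isOkToScalarize hook.
inline bool toyCheapToScalarize(const ToyValue &V, bool ConstIndex,
                                bool TargetOkToScalarize) {
  // The early bail-out added by the patch: it disables every case below.
  if (!TargetOkToScalarize)
    return false;

  switch (V.Kind) {
  case ToyKind::Constant:
    return ConstIndex || V.IsSplat;
  case ToyKind::Load:
  case ToyKind::UnOp:
    return V.OneUse;
  case ToyKind::BinOp:
    assert(V.Op0 && V.Op1 && "a binop needs two operands");
    return V.OneUse &&
           (toyCheapToScalarize(*V.Op0, ConstIndex, TargetOkToScalarize) ||
            toyCheapToScalarize(*V.Op1, ConstIndex, TargetOkToScalarize));
  case ToyKind::Other:
    return false;
  }
  return false;
}
```

In this model the mul-by-splat example from the thread counts as cheap even with a variable index, because the splat operand qualifies, and the hook is the only thing that turns it off.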