Diff 120330

docs/LangRef.rst



	.. code-block:: llvm			.. code-block:: llvm

	%result = call i64 %binop(i64 %x, i64 %y), !callees !0			%result = call i64 %binop(i64 %x, i64 %y), !callees !0

	...			...
	!0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}			!0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}

				'``speculation.marker``' Metadata
				^^^^^^^^^^^^^^^^^^^^^^

				``speculation.marker`` metadata must be attached to a load. It consist of
				RKSimon: "consists"
				set of ``i64`` type offsets indicating that the memory from load pointer address
				RKSimon: Is it a set or a min/max range? The examples only ever have 2 entries. If a set do they have to be sorted?
				is accessable for read operation with provided offsets. The intent of this
				metadata is keep notion of certain memory as dereferanceable after load
				RKSimon: "is to keep track of dereferanceable memory locations after load operations to that memory were deleted, as it might be beneficial for future optimizations."
				operations to that memory were deleted, but it might be beneficially to keep
				this notion for some optimizations.

				.. code-block:: llvm

				%ld1 = load double, double* %arrayidx1, align 8, !speculation.marker !0

				...
				!0 = !{i64 -1, i64 2}

	'``unpredictable``' Metadata			'``unpredictable``' Metadata
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^

	``unpredictable`` metadata may be attached to any branch or switch			``unpredictable`` metadata may be attached to any branch or switch
	instruction. It can be used to express the unpredictability of control			instruction. It can be used to express the unpredictability of control
	flow. Similar to the llvm.expect intrinsic, it may be used to alter			flow. Similar to the llvm.expect intrinsic, it may be used to alter
	optimizations related to compare and branch instructions. The metadata			optimizations related to compare and branch instructions. The metadata
	is treated as a boolean value; if it exists, it signals that the branch			is treated as a boolean value; if it exists, it signals that the branch

include/llvm/IR/LLVMContext.h

...	enum {
MD_invariant_group = 16, // "invariant.group"		MD_invariant_group = 16, // "invariant.group"
MD_align = 17, // "align"		MD_align = 17, // "align"
MD_loop = 18, // "llvm.loop"		MD_loop = 18, // "llvm.loop"
MD_type = 19, // "type"		MD_type = 19, // "type"
MD_section_prefix = 20, // "section_prefix"		MD_section_prefix = 20, // "section_prefix"
MD_absolute_symbol = 21, // "absolute_symbol"		MD_absolute_symbol = 21, // "absolute_symbol"
MD_associated = 22, // "associated"		MD_associated = 22, // "associated"
MD_callees = 23, // "callees"		MD_callees = 23, // "callees"
		MD_speculation_marker = 24, // "speculation.marker"
		RKSimon: // alignment
};		};

/// Known operand bundle tag IDs, which always have the same value. All		/// Known operand bundle tag IDs, which always have the same value. All
/// operand bundle tags that LLVM has special knowledge of are listed here.		/// operand bundle tags that LLVM has special knowledge of are listed here.
/// Additionally, this scheme allows LLVM to efficiently check for specific		/// Additionally, this scheme allows LLVM to efficiently check for specific
/// operand bundle tags without comparing strings.		/// operand bundle tags without comparing strings.
enum {		enum {
OB_deopt = 0, // "deopt"		OB_deopt = 0, // "deopt"

include/llvm/Transforms/Vectorize/SLPVectorizer.h

...	private:

/// \brief Vectorize the store instructions collected in Stores.		/// \brief Vectorize the store instructions collected in Stores.
bool vectorizeStoreChains(slpvectorizer::BoUpSLP &R);		bool vectorizeStoreChains(slpvectorizer::BoUpSLP &R);

/// \brief Vectorize the index computations of the getelementptr instructions		/// \brief Vectorize the index computations of the getelementptr instructions
/// collected in GEPs.		/// collected in GEPs.
bool vectorizeGEPIndices(BasicBlock *BB, slpvectorizer::BoUpSLP &R);		bool vectorizeGEPIndices(BasicBlock *BB, slpvectorizer::BoUpSLP &R);

		/// \brief Restore inserts out of speculation.marker metadata.
		InsertElementInst * restoreInserts(LoadInst *LInstr);

/// Try to find horizontal reduction or otherwise vectorize a chain of binary		/// Try to find horizontal reduction or otherwise vectorize a chain of binary
/// operators.		/// operators.
bool vectorizeRootInstruction(PHINode *P, Value *V, BasicBlock *BB,		bool vectorizeRootInstruction(PHINode *P, Value *V, BasicBlock *BB,
slpvectorizer::BoUpSLP &R,		slpvectorizer::BoUpSLP &R,
TargetTransformInfo *TTI);		TargetTransformInfo *TTI);

/// Try to vectorize trees that start at insertvalue instructions.		/// Try to vectorize trees that start at insertvalue instructions.
bool vectorizeInsertValueInst(InsertValueInst *IVI, BasicBlock *BB,		bool vectorizeInsertValueInst(InsertValueInst *IVI, BasicBlock *BB,
...	private:

bool vectorizeStores(ArrayRef<StoreInst *> Stores, slpvectorizer::BoUpSLP &R);		bool vectorizeStores(ArrayRef<StoreInst *> Stores, slpvectorizer::BoUpSLP &R);

/// The store instructions in a basic block organized by base pointer.		/// The store instructions in a basic block organized by base pointer.
StoreListMap Stores;		StoreListMap Stores;

/// The getelementptr instructions in a basic block organized by base pointer.		/// The getelementptr instructions in a basic block organized by base pointer.
WeakTrackingVHListMap GEPs;		WeakTrackingVHListMap GEPs;

		RKSimon: Unnecessary whitespace change
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_TRANSFORMS_VECTORIZE_SLPVECTORIZER_H		#endif // LLVM_TRANSFORMS_VECTORIZE_SLPVECTORIZER_H

lib/IR/LLVMContext.cpp

...	std::pair<unsigned, StringRef> MDKinds[] = {
{MD_invariant_group, "invariant.group"},		{MD_invariant_group, "invariant.group"},
{MD_align, "align"},		{MD_align, "align"},
{MD_loop, "llvm.loop"},		{MD_loop, "llvm.loop"},
{MD_type, "type"},		{MD_type, "type"},
{MD_section_prefix, "section_prefix"},		{MD_section_prefix, "section_prefix"},
{MD_absolute_symbol, "absolute_symbol"},		{MD_absolute_symbol, "absolute_symbol"},
{MD_associated, "associated"},		{MD_associated, "associated"},
{MD_callees, "callees"},		{MD_callees, "callees"},
		{MD_speculation_marker, "speculation.marker"},
};		};

for (auto &MDKind : MDKinds) {		for (auto &MDKind : MDKinds) {
unsigned ID = getMDKindID(MDKind.second);		unsigned ID = getMDKindID(MDKind.second);
assert(ID == MDKind.first && "metadata kind id drifted");		assert(ID == MDKind.first && "metadata kind id drifted");
(void)ID;		(void)ID;
}		}
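
The "metadata kind id drifted" assertion above is why the new enum value and the new table row have to be appended together. A minimal sketch of the idea, assuming a name seen for the first time receives the next free integer ID: `ToyKindRegistry`, `registerFixedKinds` and `checkFixedKinds` are hypothetical stand-ins written for this note, not LLVM classes or functions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the context's kind table (not an LLVM class).
// Assumption: a name seen for the first time gets the next free ID.
class ToyKindRegistry {
  std::map<std::string, unsigned> Kinds;

public:
  unsigned getKindID(const std::string &Name) {
    auto It = Kinds.find(Name);
    if (It != Kinds.end())
      return It->second;
    unsigned ID = static_cast<unsigned>(Kinds.size());
    Kinds.emplace(Name, ID);
    return ID;
  }
};

// Register the fixed kinds in table order. Returns true only if every name
// received exactly the ID that the table (the enum, in the patch) promised.
bool registerFixedKinds(
    ToyKindRegistry &R,
    const std::vector<std::pair<unsigned, std::string>> &Table) {
  for (const auto &Entry : Table)
    if (R.getKindID(Entry.second) != Entry.first)
      return false; // the patch asserts here instead of returning
  return true;
}

// Appending a row with the next consecutive ID keeps the table consistent.
void checkFixedKinds() {
  ToyKindRegistry R;
  assert(registerFixedKinds(R, {{0, "callees"}, {1, "speculation.marker"}}));
}
```

In this toy model, a table row whose ID skips a value (for example `{2, "speculation.marker"}` directly after `{0, "callees"}`) makes `registerFixedKinds` return false, which corresponds to the drift the assertion guards against.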


lib/Transforms/Vectorize/SLPVectorizer.cpp

...	for (auto BB : post_order(&F.getEntryBlock())) {
// Vectorize trees that end at stores.		// Vectorize trees that end at stores.
if (!Stores.empty()) {		if (!Stores.empty()) {
DEBUG(dbgs() << "SLP: Found stores for " << Stores.size()		DEBUG(dbgs() << "SLP: Found stores for " << Stores.size()
<< " underlying objects.\n");		<< " underlying objects.\n");
Changed |= vectorizeStoreChains(R);		Changed |= vectorizeStoreChains(R);
}		}

// Vectorize trees that end at reductions.		// Vectorize trees that end at reductions.
Changed |= vectorizeChainsInBlock(BB, R);		Changed |= vectorizeChainsInBlock(BB, R);
		RKSimon: clang-format - braces

// Vectorize the index computations of getelementptr instructions. This		// Vectorize the index computations of getelementptr instructions. This
// is primarily intended to catch gather-like idioms ending at		// is primarily intended to catch gather-like idioms ending at
// non-consecutive loads.		// non-consecutive loads.
if (!GEPs.empty()) {		if (!GEPs.empty()) {
DEBUG(dbgs() << "SLP: Found GEPs for " << GEPs.size()		DEBUG(dbgs() << "SLP: Found GEPs for " << GEPs.size()
<< " underlying objects.\n");		<< " underlying objects.\n");
Changed |= vectorizeGEPIndices(BB, R);		Changed |= vectorizeGEPIndices(BB, R);
...	bool SLPVectorizerPass::vectorizeInsertValueInst(InsertValueInst *IVI,
SmallVector<Value *, 16> BuildVectorOpds;		SmallVector<Value *, 16> BuildVectorOpds;
if (!findBuildAggregate(IVI, BuildVector, BuildVectorOpds))		if (!findBuildAggregate(IVI, BuildVector, BuildVectorOpds))
return false;		return false;

DEBUG(dbgs() << "SLP: array mappable to vector: " << *IVI << "\n");		DEBUG(dbgs() << "SLP: array mappable to vector: " << *IVI << "\n");
return tryToVectorizeList(BuildVectorOpds, R, BuildVector, false);		return tryToVectorizeList(BuildVectorOpds, R, BuildVector, false);
}		}

		InsertElementInst *SLPVectorizerPass::restoreInserts(LoadInst *LInstr) {
		ShuffleVectorInst *Use = nullptr;
		Value *Ptr = nullptr;
		SmallDenseSet<uint64_t> Offsets;
		int64_t OffsetOfLoad;
		DenseMap<uint64_t, LoadInst *> Loads;
		DenseMap<uint64_t, InsertElementInst *> Inserts;
		DenseMap<uint64_t, GetElementPtrInst *> GEPs;
		BasicBlock *BB = LInstr->getParent();
		VectorType *VecType = nullptr;
		unsigned MaxOffset = 0;

		if (Instruction *Instr = dyn_cast<Instruction>(LInstr->getOperand(0))) {
		GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Instr);
		if (!GEP)
		return nullptr;
		ConstantInt *C = dyn_cast<ConstantInt>(GEP->getOperand(1));
		if (!C)
		return nullptr;
		OffsetOfLoad = C->getSExtValue();
		Ptr = GEP->getOperand(0);
		GEPs[OffsetOfLoad] = GEP;
		Loads[OffsetOfLoad] = LInstr;
		} else {
		OffsetOfLoad = 0;
		Ptr = LInstr->getOperand(0);
		RKSimon: It'd be much clearer to early out. `if (!C || C->getZExtValue() > MaxOffset) return nullptr; // We found the first InsertElem in this chain. uint64_t Offset = C->getZExtValue(); ...`
		Loads[0] = LInstr;
		}

		MDNode *MD = LInstr->getMetadata(LLVMContext::MD_speculation_marker);
		assert(MD != nullptr && "Load should contain speculation.marker metadata");
		for (int i = 0, e = MD->getNumOperands(); i < e; i++) {
		ConstantInt *C = mdconst::dyn_extract<ConstantInt>(MD->getOperand(i));
		Offsets.insert(OffsetOfLoad + C->getSExtValue());
		}

		for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; it++) {
		if (InsertElementInst *Insert = dyn_cast<InsertElementInst>(it)) {
		ConstantInt *C = dyn_cast<ConstantInt>(Insert->getOperand(2));
		LoadInst *Load = dyn_cast<LoadInst>(Insert->getOperand(1));
		if (!C || !Load)
		continue;
		GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Load->getOperand(0));
		Value *Base = (GEP == nullptr) ? Load->getOperand(0) : GEP->getOperand(0);
		if (Base == Ptr) {
		if (!VecType)
		VecType = Insert->getType();
		else if (Insert->getType() != VecType)
		return nullptr;
		uint64_t Offset = C->getZExtValue();
		Inserts[Offset] = Insert;
		Loads[Offset] = Load;
		// GEP instruction might be missing for 0 offset.
		if (GEP)
		GEPs[Offset] = GEP;
		if (!Insert->use_empty() && !Use)
		if (ShuffleVectorInst *Shuffle =
		dyn_cast<ShuffleVectorInst>(Insert->user_back()))
		Use = Shuffle;
		continue;
		}
		}
		if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(it))
		// Examining the chain of GEP, Load, Insert.
		if (Ptr == GEP->getOperand(0)) {
		ConstantInt *C = dyn_cast<ConstantInt>(GEP->getOperand(1));
		if (!C || GEP->use_empty() || !GEP->hasOneUse())
		return nullptr;
		uint64_t Off = C->getZExtValue();
		if (GEPs[Off] == GEP)
		continue;
		LoadInst *Load = dyn_cast<LoadInst>(GEP->user_back());
		if (!Load)
		return nullptr;
		if (Load->getMetadata(LLVMContext::MD_speculation_marker) == nullptr) {
		InsertElementInst *Insert =
		dyn_cast<InsertElementInst>(Load->user_back());
		if (!Insert)
		return nullptr;
		RKSimon: `for (unsigned i = 0, e = MaxOffset + 1; i < e; i++)`
		if (!VecType)
		VecType = Insert->getType();
		else if (Insert->getType() != VecType)
		return nullptr;
		ConstantInt *C1 = cast<ConstantInt>(Insert->getOperand(2));
		uint64_t Off_Ins = C1->getZExtValue();
		// Offset from GEP must be equal to offset from
		// Insert otherwise we have to ignore.
		if (Off != Off_Ins)
		return nullptr;
		if (!Insert->use_empty() && !Use)
		RKSimon: `if (!Use || Use->getParent() != BB)`
		if (ShuffleVectorInst *Shuffle =
		dyn_cast<ShuffleVectorInst>(Insert->user_back()))
		Use = Shuffle;
		Inserts[Off] = Insert;
		} else
		// Strip off speculation.marker for other loads
		// in the chain.
		Load->setMetadata(LLVMContext::MD_speculation_marker, nullptr);
		Loads[Off] = Load;
		GEPs[Off] = GEP;
		}
		}
		MaxOffset = VecType->getVectorNumElements();
		if (!Use || Use->getParent() != BB)
		return nullptr;

		unsigned PrevOff = 0;
		for (auto Off : Offsets) {
		if (Off > MaxOffset - 1)
		return nullptr;
		if ((Off != 0 && GEPs[Off]) || Loads[Off] || Inserts[Off])
		return nullptr;
		for (unsigned i = PrevOff + 1; i < Off; i++)
		if ((i != 0 && !GEPs[i]) || !Loads[i] || !Inserts[i])
		RKSimon: `for (unsigned i = 0, e = MaxOffset + 1; i < e; i++)`
		return nullptr;
		PrevOff = Off;
		}

		IRBuilder<> Builder(LInstr->getParent(), ++BasicBlock::iterator(LInstr));
		if (!Inserts[0]) {
		Builder.SetInsertPoint(LInstr->getPrevNode());
		LoadInst *NewLoad = Builder.CreateLoad(Ptr);
		NewLoad->setAlignment(LInstr->getAlignment());
		Loads[0] = NewLoad;
		RKSimon: You're accessing GEPs[i - 1] multiple times - pull out?
		Value *NewInsert = Builder.CreateInsertElement(
		UndefValue::get(VecType), NewLoad, Builder.getInt32(0));
		Inserts[0] = cast<InsertElementInst>(NewInsert);
		}
		for (unsigned i = 1, e = MaxOffset; i < e; i++) {
		// Building GEP, Load, Insert for
		// this Offset.
		GetElementPtrInst *PrevGEP = GEPs[i - 1];
		if (!Inserts[i]) {
		assert(Loads[i - 1] != nullptr &&
		"Couldn't find previous load in the chain.");
		if (!PrevGEP)
		Builder.SetInsertPoint(BB, ++Loads[i - 1]->getIterator());
		else
		Builder.SetInsertPoint(BB, ++PrevGEP->getIterator());
		Value *NewGEP =
		Builder.CreateGEP(LInstr->getType(), Ptr, Builder.getInt64(i));
		GEPs[i] = cast<GetElementPtrInst>(NewGEP);
		Builder.SetInsertPoint(BB, ++Loads[i - 1]->getIterator());
		Builder.SetCurrentDebugLocation(LInstr->getDebugLoc());
		LoadInst *NewLoad = Builder.CreateLoad(NewGEP);
		NewLoad->setAlignment(LInstr->getAlignment());
		Loads[i] = NewLoad;
		// InsertElement must be inserted after
		// previous InsertElement.
		InsertElementInst *PrevIns = Inserts[i - 1];
		Builder.SetInsertPoint(BB, ++PrevIns->getIterator());
		Value *NewInsert =
		Builder.CreateInsertElement(PrevIns, NewLoad, Builder.getInt32(i));
		Inserts[i] = cast<InsertElementInst>(NewInsert);
		}
		Inserts[i]->setOperand(0, Inserts[i - 1]);
		}

		// Update ShuffleVector with restored insert elements.
		Use->setOperand(0, Inserts[MaxOffset - 1]);
		return cast<InsertElementInst>(Inserts[MaxOffset - 1]);
		}

bool SLPVectorizerPass::vectorizeInsertElementInst(InsertElementInst *IEI,		bool SLPVectorizerPass::vectorizeInsertElementInst(InsertElementInst *IEI,
BasicBlock *BB, BoUpSLP &R) {		BasicBlock *BB, BoUpSLP &R) {
SmallVector<Value *, 16> BuildVector;		SmallVector<Value *, 16> BuildVector;
SmallVector<Value *, 16> BuildVectorOpds;		SmallVector<Value *, 16> BuildVectorOpds;
if (!findBuildVector(IEI, BuildVector, BuildVectorOpds))		if (!findBuildVector(IEI, BuildVector, BuildVectorOpds))
return false;		return false;

		if (LoadInst *LInstr = dyn_cast<LoadInst>(IEI->getOperand(1))) {
		if (LInstr->getMetadata(LLVMContext::MD_speculation_marker) != nullptr)
		if (InsertElementInst *Insert = restoreInserts(LInstr)) {
		BuildVector.clear();
		BuildVectorOpds.clear();
		if (!findBuildVector(Insert, BuildVector, BuildVectorOpds))
		return false;
		}
		}

// Vectorize starting with the build vector operands ignoring the BuildVector		// Vectorize starting with the build vector operands ignoring the BuildVector
// instructions for the purpose of scheduling and user extraction.		// instructions for the purpose of scheduling and user extraction.
return tryToVectorizeList(BuildVectorOpds, R, BuildVector);		return tryToVectorizeList(BuildVectorOpds, R, BuildVector);
}		}

bool SLPVectorizerPass::vectorizeCmpInst(CmpInst *CI, BasicBlock *BB,		bool SLPVectorizerPass::vectorizeCmpInst(CmpInst *CI, BasicBlock *BB,
BoUpSLP &R) {		BoUpSLP &R) {
if (tryToVectorizePair(CI->getOperand(0), CI->getOperand(1), R))		if (tryToVectorizePair(CI->getOperand(0), CI->getOperand(1), R))

test/ThinLTO/X86/lazyload_metadata.ll

	; Do setup work for all below tests: generate bitcode and combined index			; Do setup work for all below tests: generate bitcode and combined index
	; RUN: opt -module-summary %s -o %t.bc -bitcode-mdindex-threshold=0			; RUN: opt -module-summary %s -o %t.bc -bitcode-mdindex-threshold=0
	; RUN: opt -module-summary %p/Inputs/lazyload_metadata.ll -o %t2.bc -bitcode-mdindex-threshold=0			; RUN: opt -module-summary %p/Inputs/lazyload_metadata.ll -o %t2.bc -bitcode-mdindex-threshold=0
	; RUN: llvm-lto -thinlto-action=thinlink -o %t3.bc %t.bc %t2.bc			; RUN: llvm-lto -thinlto-action=thinlink -o %t3.bc %t.bc %t2.bc
	; REQUIRES: asserts			; REQUIRES: asserts

	; Check that importing @globalfunc1 does not trigger loading all the global			; Check that importing @globalfunc1 does not trigger loading all the global
	; metadata for @globalfunc2 and @globalfunc3			; metadata for @globalfunc2 and @globalfunc3

	; RUN: llvm-lto -thinlto-action=import %t2.bc -thinlto-index=%t3.bc \			; RUN: llvm-lto -thinlto-action=import %t2.bc -thinlto-index=%t3.bc \
	; RUN: -o /dev/null -stats \			; RUN: -o /dev/null -stats \
	; RUN: 2>&1 | FileCheck %s -check-prefix=LAZY			; RUN: 2>&1 | FileCheck %s -check-prefix=LAZY
	; LAZY: 53 bitcode-reader - Number of Metadata records loaded			; LAZY: 55 bitcode-reader - Number of Metadata records loaded
	; LAZY: 2 bitcode-reader - Number of MDStrings loaded			; LAZY: 2 bitcode-reader - Number of MDStrings loaded

	; RUN: llvm-lto -thinlto-action=import %t2.bc -thinlto-index=%t3.bc \			; RUN: llvm-lto -thinlto-action=import %t2.bc -thinlto-index=%t3.bc \
	; RUN: -o /dev/null -disable-ondemand-mds-loading -stats \			; RUN: -o /dev/null -disable-ondemand-mds-loading -stats \
	; RUN: 2>&1 | FileCheck %s -check-prefix=NOTLAZY			; RUN: 2>&1 | FileCheck %s -check-prefix=NOTLAZY
	; NOTLAZY: 62 bitcode-reader - Number of Metadata records loaded			; NOTLAZY: 64 bitcode-reader - Number of Metadata records loaded
	; NOTLAZY: 7 bitcode-reader - Number of MDStrings loaded			; NOTLAZY: 7 bitcode-reader - Number of MDStrings loaded


	target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-apple-macosx10.11.0"			target triple = "x86_64-apple-macosx10.11.0"

	define void @globalfunc1(i32 %arg) {			define void @globalfunc1(i32 %arg) {
	%x = call i1 @llvm.type.test(i8* undef, metadata !"typeid1")			%x = call i1 @llvm.type.test(i8* undef, metadata !"typeid1")

test/Transforms/SLPVectorizer/X86/pr21780.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -slp-vectorizer -mtriple=x86_64-unknown-linux-gnu -mcpu=bdver2 -S | FileCheck %s

				define <4 x double> @foo(double* %ptr) #0 {
				; CHECK-LABEL: @foo(
				; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, double* [[PTR:%.*]], i64 2
				; CHECK-NEXT: [[TMP1:%.*]] = getelementptr double, double* [[PTR]], i64 3
				; CHECK-NEXT: [[TMP2:%.*]] = getelementptr double, double* [[PTR]], i64 1
				; CHECK-NEXT: [[TMP3:%.*]] = bitcast double* [[PTR]] to <4 x double>*
				; CHECK-NEXT: [[TMP4:%.*]] = load <4 x double>, <4 x double>* [[TMP3]], align 8
				; CHECK-NEXT: [[TMP5:%.*]] = extractelement <4 x double> [[TMP4]], i32 0
				; CHECK-NEXT: [[INS0:%.*]] = insertelement <4 x double> undef, double [[TMP5]], i32 0
				; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x double> [[TMP4]], i32 1
				; CHECK-NEXT: [[TMP7:%.*]] = insertelement <4 x double> [[INS0]], double [[TMP6]], i32 1
				; CHECK-NEXT: [[TMP8:%.*]] = extractelement <4 x double> [[TMP4]], i32 2
				; CHECK-NEXT: [[INS2:%.*]] = insertelement <4 x double> [[TMP7]], double [[TMP8]], i32 2
				; CHECK-NEXT: [[TMP9:%.*]] = extractelement <4 x double> [[TMP4]], i32 3
				; CHECK-NEXT: [[TMP10:%.*]] = insertelement <4 x double> [[INS2]], double [[TMP9]], i32 3
				; CHECK-NEXT: [[SHUFFLE:%.*]] = shufflevector <4 x double> [[TMP10]], <4 x double> undef, <4 x i32> <i32 0, i32 0, i32 2, i32 2>
				; CHECK-NEXT: ret <4 x double> [[SHUFFLE]]
				;
				%arrayidx2 = getelementptr inbounds double, double* %ptr, i64 2
				%ld0 = load double, double* %ptr, align 8, !speculation.marker !0
				reames: I think something might be missing here. You're forming a 4x wide load, but you've only proven dereferenceability for offsets 0, 1, 3. (i.e. not 2). How do we know it's safe to dereference between the two elements 1 & 3?
				%ld2 = load double, double* %arrayidx2, align 8, !speculation.marker !1
				%ins0 = insertelement <4 x double> undef, double %ld0, i32 0
				%ins2 = insertelement <4 x double> %ins0, double %ld2, i32 2
				%shuffle = shufflevector <4 x double> %ins2, <4 x double> undef, <4 x i32> <i32 0, i32 0, i32 2, i32 2>
				ret <4 x double> %shuffle
				}

				define <4 x double> @bar(double* %ptr) #0 {
				; CHECK-LABEL: @bar(
				; CHECK-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds double, double* [[PTR:%.*]], i64 1
				; CHECK-NEXT: [[ARRAYIDX2:%.*]] = getelementptr inbounds double, double* [[PTR]], i64 2
				; CHECK-NEXT: [[TMP1:%.*]] = getelementptr double, double* [[PTR]], i64 3
				; CHECK-NEXT: [[TMP2:%.*]] = bitcast double* [[PTR]] to <4 x double>*
				; CHECK-NEXT: [[TMP3:%.*]] = load <4 x double>, <4 x double>* [[TMP2]], align 8
				; CHECK-NEXT: [[TMP4:%.*]] = extractelement <4 x double> [[TMP3]], i32 0
				; CHECK-NEXT: [[TMP5:%.*]] = insertelement <4 x double> undef, double [[TMP4]], i32 0
				; CHECK-NEXT: [[TMP6:%.*]] = extractelement <4 x double> [[TMP3]], i32 1
				; CHECK-NEXT: [[INS1:%.*]] = insertelement <4 x double> [[TMP5]], double [[TMP6]], i32 1
				; CHECK-NEXT: [[TMP7:%.*]] = extractelement <4 x double> [[TMP3]], i32 2
				; CHECK-NEXT: [[INS2:%.*]] = insertelement <4 x double> [[INS1]], double [[TMP7]], i32 2
				; CHECK-NEXT: [[TMP8:%.*]] = extractelement <4 x double> [[TMP3]], i32 3
				; CHECK-NEXT: [[TMP9:%.*]] = insertelement <4 x double> [[INS2]], double [[TMP8]], i32 3
				; CHECK-NEXT: [[SHUFFLE:%.*]] = shufflevector <4 x double> [[TMP9]], <4 x double> undef, <4 x i32> <i32 1, i32 1, i32 2, i32 2>
				; CHECK-NEXT: ret <4 x double> [[SHUFFLE]]
				;
				%arrayidx1 = getelementptr inbounds double, double* %ptr, i64 1
				%arrayidx2 = getelementptr inbounds double, double* %ptr, i64 2
				%ld1 = load double, double* %arrayidx1, align 8, !speculation.marker !2
				%ld2 = load double, double* %arrayidx2, align 8, !speculation.marker !3
				%ins1 = insertelement <4 x double> undef, double %ld1, i32 1
				%ins2 = insertelement <4 x double> %ins1, double %ld2, i32 2
				%shuffle = shufflevector <4 x double> %ins2, <4 x double> undef, <4 x i32> <i32 1, i32 1, i32 2, i32 2>
				ret <4 x double> %shuffle
				}

				attributes #0 = { "target-cpu"="bdver2" }

				!0 = !{i64 1, i64 3}
				!1 = !{i64 -1, i64 1}
				!2 = !{i64 -1, i64 2}
				!3 = !{i64 -2, i64 1}
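
For readers following the coverage question raised on `%ld0`, here is a minimal sketch of the offset arithmetic. It assumes each marker operand is an element offset relative to the marked load, which is how `restoreInserts` treats it when it inserts `OffsetOfLoad + C->getSExtValue()`. The helpers `coveredOffsets`, `coveredByAll` and `checkTestMarkers` are hypothetical and not part of the patch. The sketch only lists which element indices the markers name; it does not show that the metadata itself is sound.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: element indices named by one marked load, i.e. the
// element the load reads plus each marker operand added to the load's offset.
std::set<std::int64_t> coveredOffsets(std::int64_t LoadOffset,
                                      const std::vector<std::int64_t> &Marker) {
  std::set<std::int64_t> Covered{LoadOffset};
  for (std::int64_t Rel : Marker)
    Covered.insert(LoadOffset + Rel);
  return Covered;
}

// Union of the indices named by every (load offset, marker operands) pair.
std::set<std::int64_t> coveredByAll(
    const std::vector<std::pair<std::int64_t, std::vector<std::int64_t>>>
        &Loads) {
  std::set<std::int64_t> All;
  for (const auto &L : Loads) {
    std::set<std::int64_t> C = coveredOffsets(L.first, L.second);
    All.insert(C.begin(), C.end());
  }
  return All;
}

// Worked check against the markers in pr21780.ll.
void checkTestMarkers() {
  const std::set<std::int64_t> Full{0, 1, 2, 3};
  // @foo: %ld0 at element 0 with !0, %ld2 at element 2 with !1.
  assert(coveredByAll({{0, {1, 3}}, {2, {-1, 1}}}) == Full);
  // @bar: %ld1 at element 1 with !2, %ld2 at element 2 with !3.
  assert(coveredByAll({{1, {-1, 2}}, {2, {-2, 1}}}) == Full);
}
```

Under that assumption, `%ld0` on its own names elements 0, 1 and 3, and element 2 is named only because `%ld2` reads it directly. The two marked loads together name elements 0 through 3 in both `@foo` and `@bar`.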

This is an archive of the discontinued LLVM Phabricator instance.

[SLPVectorizer] Fix PR21780 Expansion of 256 bit vector loads fails to fold into shuffles
Needs Review · Public

