
Details

Reviewers

Commits

rGa2f0b594c26b: Redo store splitting in CodeGenPrepare.
rL290365: Redo store splitting in CodeGenPrepare.

Summary

This is a follow-up patch to https://reviews.llvm.org/D22840 that addresses the case where a value to be merged into an int64 pair is defined in a different BB. The issue was originally described here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160912/390582.html (Sorry for the long delay on this patch; I was away on a long vacation.)

The fix is to redo the store splitting in CodeGenPrepare, so we can match the pattern across multiple BBs or move some instructions into the same BB.
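To make the transformation concrete, here is a minimal standalone sketch (not code from the patch) of what splitting a merged store means at the memory level. The helper names `storeMerged`, `storeSplit`, and `sameBytes` are hypothetical, and the byte-for-byte equivalence assumes a little-endian host such as x86-64.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the "merged" form, a single 64-bit store of
// (zext Lo) | ((zext Hi) << 32).
inline void storeMerged(unsigned char *Addr, uint32_t Lo, uint32_t Hi) {
  assert(Addr && "destination must not be null");
  uint64_t Merged =
      static_cast<uint64_t>(Lo) | (static_cast<uint64_t>(Hi) << 32);
  std::memcpy(Addr, &Merged, sizeof(Merged));
}

// Hypothetical helper: the "split" form, two 32-bit stores at Addr and Addr+4.
inline void storeSplit(unsigned char *Addr, uint32_t Lo, uint32_t Hi) {
  assert(Addr && "destination must not be null");
  std::memcpy(Addr, &Lo, sizeof(Lo));
  std::memcpy(Addr + 4, &Hi, sizeof(Hi));
}

// Returns true when both forms leave identical bytes in memory.
// Assumption: little-endian host; on big-endian the halves would swap.
inline bool sameBytes(uint32_t Lo, uint32_t Hi) {
  unsigned char A[8], B[8];
  storeMerged(A, Lo, Hi);
  storeSplit(B, Lo, Hi);
  return std::memcmp(A, B, sizeof(A)) == 0;
}
```

The split form needs no shift or OR, which is why it can be cheaper when the two halves live in different registers or register classes.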

The patch that changes the target query interface was posted separately here: https://reviews.llvm.org/D24707

Diff Detail

Repository: rL LLVM

Event Timeline

wmi updated this revision to Diff 75604. Oct 24 2016, 10:10 AM

wmi retitled this revision to Redo store splitting in CodeGenPrepare.

wmi updated this object.

wmi added a reviewer: chandlerc.

wmi set the repository for this revision to rL LLVM.

wmi added subscribers: llvm-commits, davidxl, majnemer, arsenm.

Ping.

chandlerc added inline comments. Nov 22 2016, 11:56 AM

lib/CodeGen/CodeGenPrepare.cpp
5218–5233 ↗	(On Diff #75604)	Have you looked at doing this with the IR pattern match library? (I should have mentioned this earlier, sorry.) Check out the PatternMatch.h header. Something like: `unsigned HalfValBitSize = ...; Value *Lo, *Hi; if (match(SI.getValueOperand(), m_c_Or(m_OneUse(m_Shl(m_ZExt(m_Value(Hi)), m_SpecificInt(HalfValBitSize))), m_ZExt(m_Value(Lo)))))` I forget the exact syntax for matching zero extends, but I think you can capture most of this in a very small number of lines of code.
5219 ↗	(On Diff #75604)	Why not go ahead and handle cases within a BB?
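The idea behind the suggested commutative match can be illustrated without LLVM. The following is a toy sketch only: `Expr`, `matchOrdered`, and `matchMergedPair` are hypothetical names and are not part of LLVM's PatternMatch API. It shows how trying both operand orders of the `or` captures `Lo` and `Hi` regardless of which side the shift appears on.

```cpp
#include <cassert>
#include <cstdint>

// Toy expression node; NOT LLVM's Value/Instruction classes.
struct Expr {
  enum Kind { Leaf, ZExt, Shl, Or, Const } K;
  const Expr *LHS; // Operand 0 (ZExt, Shl, Or).
  const Expr *RHS; // Operand 1 (Shl, Or).
  uint64_t Imm;    // Constant value (Const only).
};

// Match  A = zext(Lo),  B = shl(zext(Hi), ShiftAmt)  in exactly this order.
static bool matchOrdered(const Expr *A, const Expr *B, uint64_t ShiftAmt,
                         const Expr *&Lo, const Expr *&Hi) {
  if (A->K != Expr::ZExt || B->K != Expr::Shl)
    return false;
  if (B->LHS->K != Expr::ZExt || B->RHS->K != Expr::Const ||
      B->RHS->Imm != ShiftAmt)
    return false;
  // Only write the captures once the whole pattern has matched.
  Lo = A->LHS;
  Hi = B->LHS->LHS;
  return true;
}

// Commutative match: accept either operand order of the 'or'.
bool matchMergedPair(const Expr *Root, uint64_t ShiftAmt, const Expr *&Lo,
                     const Expr *&Hi) {
  assert(Root && "root expression must not be null");
  if (Root->K != Expr::Or)
    return false;
  return matchOrdered(Root->LHS, Root->RHS, ShiftAmt, Lo, Hi) ||
         matchOrdered(Root->RHS, Root->LHS, ShiftAmt, Lo, Hi);
}
```

Unlike the real matchers, this sketch has no one-use checks; it only demonstrates the commutative structure of the pattern.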

wmi added inline comments. Nov 23 2016, 1:52 PM

lib/CodeGen/CodeGenPrepare.cpp
5218–5233 ↗	(On Diff #75604)	I rewrote it using the pattern match library. It is really awesome. The code becomes much simpler.
5219 ↗	(On Diff #75604)	I forgot the exact reason I added it. I removed the restriction and added a test case for it.

Address Chandler's comments.

chandlerc added inline comments. Nov 23 2016, 2:07 PM

lib/CodeGen/CodeGenPrepare.cpp
5311–5312 ↗	(On Diff #79146)	No need to initialize these. Leaving them uninitialized makes it easier for MSan to find bugs if we try to use them despite the match failing. (Or a bug in the match that fails to set them.)
5342–5350 ↗	(On Diff #79146)	What about just walking LValue and HValue back across any bitcast instructions? That should make it easier to get the type for the cost query, and then make this simpler as you can directly store the pre-bitcast value. I also wouldn't worry about how many uses of the bitcast there are either, as bitcasts are expected to be free. As one example, if there are 3 uses of the bitcast, all to do merged stores, we should be willing to unmerge all three of them.
5352 ↗	(On Diff #79146)	Just use the type of the input value rather than re-computing? Also, you can probably just use a lambda to create the store and then call it for each of the inputs?

wmi added inline comments. Nov 23 2016, 3:03 PM

lib/CodeGen/CodeGenPrepare.cpp
5311–5312 ↗	(On Diff #79146)	Ok, fixed.
5342–5350 ↗	(On Diff #79146)	I am not sure I understand correctly what you mean here, but I removed the hasOneUse limitation. Please take a look and see whether it matches what you have in mind.
5352 ↗	(On Diff #79146)	The input can be i16, and it is extended to i64 before the "shl + or" bitwise operations. The type of the split store here is i32, so we cannot necessarily get i32 from the input type. I created a lambda and called it for each input. It looks better.
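The i16 case described above can be sketched in plain C++. This is an illustration under assumptions, not the patch's code: `splitStore` is a hypothetical function, and `uint16_t`/`uint32_t` stand in for the IR types i16 and i32. The narrower input is zero-extended to the split store width, and one lambda emits either half.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Splits what would be a merged 64-bit store into two 32-bit stores.
// Lo is deliberately narrower (16 bits) than the 32-bit split store type,
// so it must be zero-extended before being stored.
inline void splitStore(unsigned char *Addr, uint16_t Lo, uint32_t Hi) {
  assert(Addr && "destination must not be null");
  // One lambda creates either half; Upper selects the second 32-bit slot.
  auto CreateSplitStore = [&](uint32_t V, bool Upper) {
    std::memcpy(Addr + (Upper ? sizeof(uint32_t) : 0), &V, sizeof(V));
  };
  CreateSplitStore(static_cast<uint32_t>(Lo), /*Upper=*/false); // zext
  CreateSplitStore(Hi, /*Upper=*/true);
}
```

Because the split width comes from the merged store type rather than from the inputs, both halves are written as full 32-bit values even when one input is only 16 bits wide.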

Address Chandler's comments (The second round).

LGTM with a nit below.

I can send you a patch that tries to simplify the bitcast handling and see if you like that better. Seems much easier to explain with code.

lib/CodeGen/CodeGenPrepare.cpp
5347 ↗	(On Diff #79157)	CreateSplittedStore -> CreateSplitStore

This revision is now accepted and ready to land. Nov 23 2016, 3:14 PM

majnemer added inline comments. Nov 25 2016, 9:07 PM

lib/CodeGen/CodeGenPrepare.cpp
5300–5301 ↗	(On Diff #79157)	This looks wrong. Shouldn't it be `getTypeStoreSizeInBits` instead of `getTypeSizeInBits`?
5303 ↗	(On Diff #79157)	Start with an uppercase letter.
5320–5322 ↗	(On Diff #79157)	Ditto.
5327–5328 ↗	(On Diff #79157)	`auto *`

wmi added inline comments. Nov 28 2016, 10:15 AM

lib/CodeGen/CodeGenPrepare.cpp
5300–5301 ↗	(On Diff #79157)	They are the same here because the stored value must have a power-of-2 size if it is a merged store. But that reminded me to add some test cases: I added @int31_float_pair, @int15_float_pair, and @int7_float_pair to the test.
5303 ↗	(On Diff #79157)	Fixed.
5327–5328 ↗	(On Diff #79157)	Fixed.
5347 ↗	(On Diff #79157)	Fixed.

Address David's comment.

majnemer added inline comments. Nov 28 2016, 11:36 AM

lib/CodeGen/CodeGenPrepare.cpp
5300–5301 ↗	(On Diff #79157)	But aren't i1, i2 and i4 powers of two? If we stored an i1, `HalfValBitSize` would be zero which sounds problematic.

wmi added inline comments. Nov 28 2016, 2:05 PM

lib/CodeGen/CodeGenPrepare.cpp
5300–5301 ↗	(On Diff #79157)	The stored value will be at least an i2, because it is merged from two smaller values. If the stored value is an i2 combined from an {i1, i1} pair and we use getTypeStoreSizeInBits to compute HalfValBitSize, HalfValBitSize will be 4, meaning each split store would be 4 bits wide. That is not what we expect: the split store type should be i1. I cannot add an i1_i1_pair test because the target query currently returns false for int pairs, but I verified that the i1_i1_pair test works correctly with getTypeSizeInBits by temporarily disabling the target query.

chandlerc added inline comments. Nov 28 2016, 2:13 PM

lib/CodeGen/CodeGenPrepare.cpp
5300–5301 ↗	(On Diff #79157)	I don't think storing i1s makes any sense here. I think you should add a check that the type store size == the type size both before and after splitting and not split unless that is satisfied, regardless of what the target says. And to make this easier to test, I suggest adding a flag that forces us to split everything we can split. Otherwise covering interesting inputs is too target dependent.

majnemer added inline comments. Nov 28 2016, 3:47 PM

lib/CodeGen/CodeGenPrepare.cpp
5300–5301 ↗	(On Diff #79157)	I'm still not sure why we would want to use TypeSizeInBits instead of the store size... What if the type is an i63? We would actually store 8 bytes...

Address Chandler and David's comments.

Add an option to force store splitting.
Add int pair related tests
early exit if getTypeStoreSizeInBits and getTypeSizeInBits return different values for store type before and after split
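The early-exit rule can be modeled with a few lines of arithmetic. This is a sketch under the assumption that the store size of an integer type iN is N rounded up to a whole number of bytes; `storeSizeInBits` and `sizesAllowSplit` are hypothetical names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Bits occupied in memory by an N-bit integer: N rounded up to whole bytes.
// Assumption: this models what the store-size query reports for iN.
constexpr uint64_t storeSizeInBits(uint64_t Bits) {
  return (Bits + 7) / 8 * 8;
}

// Sketch of the early-exit rule: both the merged type and the half-width
// split type must satisfy type size == store size.
constexpr bool sizesAllowSplit(uint64_t MergedBits) {
  return MergedBits != 0 && storeSizeInBits(MergedBits) == MergedBits &&
         storeSizeInBits(MergedBits / 2) == MergedBits / 2;
}

// The i63 case raised in the review: 63 bits still occupy 8 bytes in memory.
static_assert(storeSizeInBits(63) == 64, "i63 is stored as 8 bytes");
// The i2 case: half of the 8-bit store size is 4 bits, not 1.
static_assert(storeSizeInBits(2) / 2 == 4, "half of i2's store size is 4");
```

Under this model an i48 merged store passes (48 and 24 are both byte multiples), while i24 fails on its i12 half and i14 and i2 fail on the merged type itself.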

chandlerc added inline comments. Nov 28 2016, 4:35 PM

lib/CodeGen/CodeGenPrepare.cpp
5300–5301 ↗	(On Diff #79157)	I think it is at least reasonable to start with type size == store size. Among other things, when that isn't true, the shifting and extension logic will be at least a little more complicated. Still, Wei, this might make sense as a follow-up patch to work on to extend this to handle more complicated merged stores. As a potentially useful set of examples you could look at the generated code for: `struct S { unsigned x : 27; }; std::pair<S, float> g1(); std::pair<float, S> g2(); void f(std::pair<float, unsigned> &result) { auto p1 = g1(); auto p2 = g2(); result.first = p1.second + p2.first; result.second = p1.first + p2.second; }` Or something similar... But I'm happy to do this as a follow-up generalization now that the correctness issues are addressed. Does that seem reasonable, David?
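The snippet in the comment above is explicitly pseudo-code ("Or something similar"). One compilable variant, for anyone who wants to inspect the generated IR, might look like the sketch below. The bodies of `g1` and `g2` and their return values are assumptions, since the review only declares them, and adding the `x` members is one possible reading of the sum of two `S` objects.

```cpp
#include <cassert>
#include <utility>

struct S {
  unsigned x : 27; // 27-bit bit-field, a non-byte-sized value.
};

// Stub producers with arbitrary values; the review only declares g1/g2.
std::pair<S, float> g1() { return {S{5u}, 1.5f}; }
std::pair<float, S> g2() { return {2.5f, S{7u}}; }

void f(std::pair<float, unsigned> &result) {
  auto p1 = g1();
  auto p2 = g2();
  result.first = p1.second + p2.first;
  // The pseudo-code adds the S objects directly, which does not compile
  // without an operator+; adding the bit-field members is an assumption.
  result.second = p1.first.x + p2.second.x;
}
```

To see merged stores in practice, `g1` and `g2` would need to be opaque to the optimizer (for example, defined in another translation unit), otherwise the values fold to constants.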
test/CodeGen/X86/split-store.ll
100–119 ↗	(On Diff #79471)	I would also include more negative tests:
non-symmetric merges
non-power-of-two merged store sizes where type size == store size (i24, i48, etc.)
non-power-of-two merged store sizes where type size != store size (i14)

Add tests suggested by Chandler.

non-symmetric merges: tests int31_int17_pair (split), int7_int3_pair (split).
non-power-of-two merged store sizes where type size == store size (i24, i48, etc.): tests int24_int24_pair (split), int12_int12_pair (not split).
non-power-of-two merged store sizes where type size != store size (i14): test int7_int7_pair (not split).

LGTM

Sorry I lost track of this, please feel free to ping more aggressively when not getting reviews! =D

I'm still interested in trying to write this in a way that is a bit more general, but I think it is fine to do that in follow-up patches.

As for how to get test cases, I would take the idea behind pseudo-code like I showed and write the LLVM IR by hand to trigger that case.

Closed by commit rL290365: Redo store splitting in CodeGenPrepare. (authored by wmi). Dec 22 2016, 11:55 AM

This revision was automatically updated to reflect the committed changes.

Diff 82358

llvm/trunk/lib/CodeGen/CodeGenPrepare.cpp

@@ 125 lines not shown @@ static cl::opt<bool> ProfileGuidedSectionPrefix(
     "profile-guided-section-prefix", cl::Hidden, cl::init(true),
     cl::desc("Use profile info to add section prefix for hot/cold functions"));

 static cl::opt<unsigned> FreqRatioToSkipMerge(
     "cgp-freq-ratio-to-skip-merge", cl::Hidden, cl::init(2),
     cl::desc("Skip merging empty blocks if (frequency of empty block) / "
              "(frequency of destination block) is greater than this ratio"));

+static cl::opt<bool> ForceSplitStore(
+    "force-split-store", cl::Hidden, cl::init(false),
+    cl::desc("Force store splitting no matter what the target query says."));
+
 namespace {
 typedef SmallPtrSet<Instruction *, 16> SetOfInstrs;
 typedef PointerIntPair<Type *, 1, bool> TypeIsSExt;
 typedef DenseMap<Instruction *, TypeIsSExt> InstrToOrigTy;
 class TypePromotionTransaction;

 class CodeGenPrepare : public FunctionPass {
   const TargetMachine *TM;
@@ 5,211 lines not shown @@ while (Inst->hasOneUse()) {
     DEBUG(dbgs() << "Promoting is possible... Enqueue for promotion!\n");

     VPH.enqueueForPromotion(ToBePromoted);
     Inst = ToBePromoted;
   }
   return false;
 }

+/// For the instruction sequence of store below, F and I values
+/// are bundled together as an i64 value before being stored into memory.
+/// Sometimes it is more efficient to generate separate stores for F and I,
+/// which can remove the bitwise instructions or sink them to colder places.
+///
+///   (store (or (zext (bitcast F to i32) to i64),
+///              (shl (zext I to i64), 32)), addr)  -->
+///   (store F, addr) and (store I, addr+4)
+///
+/// Similarly, splitting for other merged store can also be beneficial, like:
+/// For pair of {i32, i32}, i64 store --> two i32 stores.
+/// For pair of {i32, i16}, i64 store --> two i32 stores.
+/// For pair of {i16, i16}, i32 store --> two i16 stores.
+/// For pair of {i16, i8},  i32 store --> two i16 stores.
+/// For pair of {i8, i8},   i16 store --> two i8 stores.
+///
+/// We allow each target to determine specifically which kind of splitting is
+/// supported.
+///
+/// The store patterns are commonly seen from the simple code snippet below
+/// if only std::make_pair(...) is sroa transformed before inlined into hoo.
+///   void goo(const std::pair<int, float> &);
+///   hoo() {
+///     ...
+///     goo(std::make_pair(tmp, ftmp));
+///     ...
+///   }
+///
+/// Although we already have similar splitting in DAG Combine, we duplicate
+/// it in CodeGenPrepare to catch the case in which pattern is across
+/// multiple BBs. The logic in DAG Combine is kept to catch case generated
+/// during code expansion.
+static bool splitMergedValStore(StoreInst &SI, const DataLayout &DL,
+                                const TargetLowering &TLI) {
+  // Handle simple but common cases only.
+  Type *StoreType = SI.getValueOperand()->getType();
+  if (DL.getTypeStoreSizeInBits(StoreType) != DL.getTypeSizeInBits(StoreType) ||
+      DL.getTypeSizeInBits(StoreType) == 0)
+    return false;
+
+  unsigned HalfValBitSize = DL.getTypeSizeInBits(StoreType) / 2;
+  Type *SplitStoreType = Type::getIntNTy(SI.getContext(), HalfValBitSize);
+  if (DL.getTypeStoreSizeInBits(SplitStoreType) !=
+      DL.getTypeSizeInBits(SplitStoreType))
+    return false;
+
+  // Match the following patterns:
+  // (store (or (zext LValue to i64),
+  //            (shl (zext HValue to i64), 32)), HalfValBitSize)
+  //  or
+  // (store (or (shl (zext HValue to i64), 32)), HalfValBitSize)
+  //            (zext LValue to i64),
+  // Expect both operands of OR and the first operand of SHL have only
+  // one use.
+  Value *LValue, *HValue;
+  if (!match(SI.getValueOperand(),
+             m_c_Or(m_OneUse(m_ZExt(m_Value(LValue))),
+                    m_OneUse(m_Shl(m_OneUse(m_ZExt(m_Value(HValue))),
+                                   m_SpecificInt(HalfValBitSize))))))
+    return false;
+
+  // Check LValue and HValue are int with size less or equal than 32.
+  if (!LValue->getType()->isIntegerTy() ||
+      DL.getTypeSizeInBits(LValue->getType()) > HalfValBitSize ||
+      !HValue->getType()->isIntegerTy() ||
+      DL.getTypeSizeInBits(HValue->getType()) > HalfValBitSize)
+    return false;
+
+  // If LValue/HValue is a bitcast instruction, use the EVT before bitcast
+  // as the input of target query.
+  auto *LBC = dyn_cast<BitCastInst>(LValue);
+  auto *HBC = dyn_cast<BitCastInst>(HValue);
+  EVT LowTy = LBC ? EVT::getEVT(LBC->getOperand(0)->getType())
+                  : EVT::getEVT(LValue->getType());
+  EVT HighTy = HBC ? EVT::getEVT(HBC->getOperand(0)->getType())
+                   : EVT::getEVT(HValue->getType());
+  if (!ForceSplitStore && !TLI.isMultiStoresCheaperThanBitsMerge(LowTy, HighTy))
+    return false;
+
+  // Start to split store.
+  IRBuilder<> Builder(SI.getContext());
+  Builder.SetInsertPoint(&SI);
+
+  // If LValue/HValue is a bitcast in another BB, create a new one in current
+  // BB so it may be merged with the splitted stores by dag combiner.
+  if (LBC && LBC->getParent() != SI.getParent())
+    LValue = Builder.CreateBitCast(LBC->getOperand(0), LBC->getType());
+  if (HBC && HBC->getParent() != SI.getParent())
+    HValue = Builder.CreateBitCast(HBC->getOperand(0), HBC->getType());
+
+  auto CreateSplitStore = [&](Value *V, bool Upper) {
+    V = Builder.CreateZExtOrBitCast(V, SplitStoreType);
+    Value *Addr = Builder.CreateBitCast(
+        SI.getOperand(1),
+        SplitStoreType->getPointerTo(SI.getPointerAddressSpace()));
+    if (Upper)
+      Addr = Builder.CreateGEP(
+          SplitStoreType, Addr,
+          ConstantInt::get(Type::getInt32Ty(SI.getContext()), 1));
+    Builder.CreateAlignedStore(
+        V, Addr, Upper ? SI.getAlignment() / 2 : SI.getAlignment());
+  };
+
+  CreateSplitStore(LValue, false);
+  CreateSplitStore(HValue, true);
+
+  // Delete the old store.
+  SI.eraseFromParent();
+  return true;
+}
+
 bool CodeGenPrepare::optimizeInst(Instruction *I, bool& ModifiedDT) {
   // Bail out if we inserted the instruction to prevent optimizations from
   // stepping on each other's toes.
   if (InsertedInsts.count(I))
     return false;

   if (PHINode *P = dyn_cast<PHINode>(I)) {
     // It is possible for very late stage optimizations (such as SimplifyCFG)
@@ 48 lines not shown @@ if (TLI) {
       unsigned AS = LI->getPointerAddressSpace();
       Modified |= optimizeMemoryInst(I, I->getOperand(0), LI->getType(), AS);
       return Modified;
     }
     return false;
   }

   if (StoreInst *SI = dyn_cast<StoreInst>(I)) {
+    if (TLI && splitMergedValStore(*SI, *DL, *TLI))
+      return true;
     SI->setMetadata(LLVMContext::MD_invariant_group, nullptr);
     if (TLI) {
       unsigned AS = SI->getPointerAddressSpace();
       return optimizeMemoryInst(I, SI->getOperand(1),
                                 SI->getOperand(0)->getType(), AS);
     }
     return false;
   }
@@ 398 lines not shown @@

llvm/trunk/test/CodeGen/X86/split-store.ll

-; RUN: llc -mtriple=x86_64-unknown-unknown < %s | FileCheck %s
+; RUN: llc -mtriple=x86_64-unknown-unknown -force-split-store < %s | FileCheck %s

 ; CHECK-LABEL: int32_float_pair
 ; CHECK: movl %edi, (%rsi)
 ; CHECK: movss %xmm0, 4(%rsi)
 define void @int32_float_pair(i32 %tmp1, float %tmp2, i64* %ref.tmp) {
 entry:
   %t0 = bitcast float %tmp2 to i32
   %t1 = zext i32 %t0 to i64
@@ 42 lines not shown @@ entry:
   %t0 = bitcast float %tmp2 to i32
   %t1 = zext i32 %t0 to i64
   %t2 = shl nuw i64 %t1, 32
   %t3 = zext i8 %tmp1 to i64
   %t4 = or i64 %t2, %t3
   store i64 %t4, i64* %ref.tmp, align 8
   ret void
 }

+; CHECK-LABEL: int32_int32_pair
+; CHECK: movl %edi, (%rdx)
+; CHECK: movl %esi, 4(%rdx)
+define void @int32_int32_pair(i32 %tmp1, i32 %tmp2, i64* %ref.tmp) {
+entry:
+  %t1 = zext i32 %tmp2 to i64
+  %t2 = shl nuw i64 %t1, 32
+  %t3 = zext i32 %tmp1 to i64
+  %t4 = or i64 %t2, %t3
+  store i64 %t4, i64* %ref.tmp, align 8
+  ret void
+}
+
+; CHECK-LABEL: int16_int16_pair
+; CHECK: movw %di, (%rdx)
+; CHECK: movw %si, 2(%rdx)
+define void @int16_int16_pair(i16 signext %tmp1, i16 signext %tmp2, i32* %ref.tmp) {
+entry:
+  %t1 = zext i16 %tmp2 to i32
+  %t2 = shl nuw i32 %t1, 16
+  %t3 = zext i16 %tmp1 to i32
+  %t4 = or i32 %t2, %t3
+  store i32 %t4, i32* %ref.tmp, align 4
+  ret void
+}
+
+; CHECK-LABEL: int8_int8_pair
+; CHECK: movb %dil, (%rdx)
+; CHECK: movb %sil, 1(%rdx)
+define void @int8_int8_pair(i8 signext %tmp1, i8 signext %tmp2, i16* %ref.tmp) {
+entry:
+  %t1 = zext i8 %tmp2 to i16
+  %t2 = shl nuw i16 %t1, 8
+  %t3 = zext i8 %tmp1 to i16
+  %t4 = or i16 %t2, %t3
+  store i16 %t4, i16* %ref.tmp, align 2
+  ret void
+}
+
+; CHECK-LABEL: int31_int31_pair
+; CHECK: andl $2147483647, %edi
+; CHECK: movl %edi, (%rdx)
+; CHECK: andl $2147483647, %esi
+; CHECK: movl %esi, 4(%rdx)
+define void @int31_int31_pair(i31 %tmp1, i31 %tmp2, i64* %ref.tmp) {
+entry:
+  %t1 = zext i31 %tmp2 to i64
+  %t2 = shl nuw i64 %t1, 32
+  %t3 = zext i31 %tmp1 to i64
+  %t4 = or i64 %t2, %t3
+  store i64 %t4, i64* %ref.tmp, align 8
+  ret void
+}
+
+; CHECK-LABEL: int31_int17_pair
+; CHECK: andl $2147483647, %edi
+; CHECK: movl %edi, (%rdx)
+; CHECK: andl $131071, %esi
+; CHECK: movl %esi, 4(%rdx)
+define void @int31_int17_pair(i31 %tmp1, i17 %tmp2, i64* %ref.tmp) {
+entry:
+  %t1 = zext i17 %tmp2 to i64
+  %t2 = shl nuw i64 %t1, 32
+  %t3 = zext i31 %tmp1 to i64
+  %t4 = or i64 %t2, %t3
+  store i64 %t4, i64* %ref.tmp, align 8
+  ret void
+}
+
+; CHECK-LABEL: int7_int3_pair
+; CHECK: andb $127, %dil
+; CHECK: movb %dil, (%rdx)
+; CHECK: andb $7, %sil
+; CHECK: movb %sil, 1(%rdx)
+define void @int7_int3_pair(i7 signext %tmp1, i3 signext %tmp2, i16* %ref.tmp) {
+entry:
+  %t1 = zext i3 %tmp2 to i16
+  %t2 = shl nuw i16 %t1, 8
+  %t3 = zext i7 %tmp1 to i16
+  %t4 = or i16 %t2, %t3
+  store i16 %t4, i16* %ref.tmp, align 2
+  ret void
+}
+
+; CHECK-LABEL: int24_int24_pair
+; CHECK: movw %di, (%rdx)
+; CHECK: shrl $16, %edi
+; CHECK: movb %dil, 2(%rdx)
+; CHECK: movl %esi, %eax
+; CHECK: shrl $16, %eax
+; CHECK: movb %al, 6(%rdx)
+; CHECK: movw %si, 4(%rdx)
+define void @int24_int24_pair(i24 signext %tmp1, i24 signext %tmp2, i48* %ref.tmp) {
+entry:
+  %t1 = zext i24 %tmp2 to i48
+  %t2 = shl nuw i48 %t1, 24
+  %t3 = zext i24 %tmp1 to i48
+  %t4 = or i48 %t2, %t3
+  store i48 %t4, i48* %ref.tmp, align 2
+  ret void
+}
+
+; getTypeSizeInBits(i12) != getTypeStoreSizeInBits(i12), so store split doesn't kick in.
+; CHECK-LABEL: int12_int12_pair
+; CHECK: movl %esi, %eax
+; CHECK: shll $12, %eax
+; CHECK: andl $4095, %edi
+; CHECK: orl %eax, %edi
+; CHECK: shrl $4, %esi
+; CHECK: movb %sil, 2(%rdx)
+; CHECK: movw %di, (%rdx)
+define void @int12_int12_pair(i12 signext %tmp1, i12 signext %tmp2, i24* %ref.tmp) {
+entry:
+  %t1 = zext i12 %tmp2 to i24
+  %t2 = shl nuw i24 %t1, 12
+  %t3 = zext i12 %tmp1 to i24
+  %t4 = or i24 %t2, %t3
+  store i24 %t4, i24* %ref.tmp, align 2
+  ret void
+}
+
+; getTypeSizeInBits(i14) != getTypeStoreSizeInBits(i14), so store split doesn't kick in.
+; CHECK-LABEL: int7_int7_pair
+; CHECK: movzbl %sil, %eax
+; CHECK: shll $7, %eax
+; CHECK: andb $127, %dil
+; CHECK: movzbl %dil, %ecx
+; CHECK: orl %eax, %ecx
+; CHECK: andl $16383, %ecx
+; CHECK: movw %cx, (%rdx)
+define void @int7_int7_pair(i7 signext %tmp1, i7 signext %tmp2, i14* %ref.tmp) {
+entry:
+  %t1 = zext i7 %tmp2 to i14
+  %t2 = shl nuw i14 %t1, 7
+  %t3 = zext i7 %tmp1 to i14
+  %t4 = or i14 %t2, %t3
+  store i14 %t4, i14* %ref.tmp, align 2
+  ret void
+}
+
+; getTypeSizeInBits(i2) != getTypeStoreSizeInBits(i2), so store split doesn't kick in.
+; CHECK-LABEL: int1_int1_pair
+; CHECK: addb %sil, %sil
+; CHECK: andb $1, %dil
+; CHECK: orb %sil, %dil
+; CHECK: andb $3, %dil
+; CHECK: movb %dil, (%rdx)
+define void @int1_int1_pair(i1 signext %tmp1, i1 signext %tmp2, i2* %ref.tmp) {
+entry:
+  %t1 = zext i1 %tmp2 to i2
+  %t2 = shl nuw i2 %t1, 1
+  %t3 = zext i1 %tmp1 to i2
+  %t4 = or i2 %t2, %t3
+  store i2 %t4, i2* %ref.tmp, align 1
+  ret void
+}
+
+; CHECK-LABEL: mbb_int32_float_pair
+; CHECK: movl %edi, (%rsi)
+; CHECK: movss %xmm0, 4(%rsi)
+define void @mbb_int32_float_pair(i32 %tmp1, float %tmp2, i64* %ref.tmp) {
+entry:
+  %t0 = bitcast float %tmp2 to i32
+  br label %next
+next:
+  %t1 = zext i32 %t0 to i64
+  %t2 = shl nuw i64 %t1, 32
+  %t3 = zext i32 %tmp1 to i64
+  %t4 = or i64 %t2, %t3
+  store i64 %t4, i64* %ref.tmp, align 8
+  ret void
+}
+
+; CHECK-LABEL: mbb_int32_float_multi_stores
+; CHECK: movl %edi, (%rsi)
+; CHECK: movss %xmm0, 4(%rsi)
+; CHECK: # %bb2
+; CHECK: movl %edi, (%rdx)
+; CHECK: movss %xmm0, 4(%rdx)
+define void @mbb_int32_float_multi_stores(i32 %tmp1, float %tmp2, i64* %ref.tmp, i64* %ref.tmp1, i1 %cmp) {
+entry:
+  %t0 = bitcast float %tmp2 to i32
+  br label %bb1
+bb1:
+  %t1 = zext i32 %t0 to i64
+  %t2 = shl nuw i64 %t1, 32
+  %t3 = zext i32 %tmp1 to i64
+  %t4 = or i64 %t2, %t3
+  store i64 %t4, i64* %ref.tmp, align 8
+  br i1 %cmp, label %bb2, label %exitbb
+bb2:
+  store i64 %t4, i64* %ref.tmp1, align 8
+  br label %exitbb
+exitbb:
+  ret void
+}

This is an archive of the discontinued LLVM Phabricator instance.

Redo store splitting in CodeGenPrepare
ClosedPublic
