This is an archive of the discontinued LLVM Phabricator instance.

Differential D53552

[DAGCombine] Improve alias analysis for chain of independent stores.
Closed, Public

Authored by niravd on Oct 22 2018, 8:32 PM.

Details

Reviewers

courbet
spatel
RKSimon
bogner
efriedma
craig.topper
rnk

Commits

rG6ce9f72f76e3: [DAGCombine] Improve alias analysis for chain of independent stores.
rL346432: [DAGCombine] Improve alias analysis for chain of independent stores.

Summary

FindBetterNeighborChains simultaneously improves the chain
dependencies of a chain of related stores, avoiding the generation of
extra token factors. For chains longer than GatherAllAliasDepths,
chain improvement for stores further down in the chain will necessarily
fail, which is a potentially significant waste and prevents otherwise
trivial parallelization.

This patch directly parallelizes the chains of stores before improving
each store. This generally improves DAG-level parallelism.
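
The idea in the summary can be sketched outside of LLVM with a toy model. This is only an illustration under stated assumptions, not the DAGCombiner code: `ToyStore`, `overlaps` and `parallelize` are hypothetical names, all stores are assumed to share one base pointer, and volatility, indexing and chain use counts are ignored. Starting from the last store, the sketch walks up the chain while each visited store is disjoint from those already collected, then hangs every collected store off the same predecessor. In the real DAG the returned group would be joined by a TokenFactor.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified store node: writes [offset, offset + size)
// relative to a shared base and depends on one predecessor in the chain.
struct ToyStore {
  int64_t offset;
  int64_t size;
  int chain; // index of the predecessor store, or -1 for the chain root
};

// True if the two stores touch at least one common byte.
bool overlaps(const ToyStore &a, const ToyStore &b) {
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// Collect the run of mutually disjoint stores ending at `last`, then
// re-attach all of them to the first node above the run. Returns the
// indices of the stores that are now independent of each other.
std::vector<int> parallelize(std::vector<ToyStore> &stores, int last) {
  std::vector<int> group{last};
  int cur = stores[last].chain;
  while (cur != -1) {
    bool disjoint = true;
    for (int g : group)
      if (overlaps(stores[cur], stores[g]))
        disjoint = false;
    if (!disjoint)
      break;
    group.push_back(cur);
    cur = stores[cur].chain;
  }
  // `cur` is the first node that is not part of the group (or the root).
  for (int g : group)
    stores[g].chain = cur;
  return group;
}
```

With four 4-byte stores at offsets 0, 4, 8 and 4 chained in that order, calling `parallelize` on the last one collects only the stores at offsets 4 and 8, because the earlier store at offset 4 overlaps and must stay ordered before them.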

Diff Detail

Repository

rL LLVM

Build Status

Buildable 24048
Build 24047: arc lint + arc unit

Event Timeline

niravd created this revision. Oct 22 2018, 8:32 PM

Harbormaster completed remote builds in B24048: Diff 170565. Oct 22 2018, 8:32 PM

Herald added subscribers: atanasyan, jrtc27, hiraditya and 2 others. Oct 22 2018, 8:32 PM

courbet@, I think this should produce at least as good compile time as D53289. Can you verify on your testcase? If so, I move we drop D53289 (and D31068) in favor of this.

llvm/test/CodeGen/Mips/fastcc.ll
226	Apparently FileCheck balks at dealing with this many deferred lines in a DAG match. This is just a no-op reordering to be closer to the output asm.

niravd mentioned this in D31068: [SDAG] Expand MergedConsecutiveStores to better handle Giving up in Chain Analysis. Oct 23 2018, 9:15 AM

dmgreen added a subscriber: dmgreen. Oct 23 2018, 1:46 PM

In D53552#1272602, @niravd wrote:

courbet@, I think this should produce at least as good compile time as D53289. Can you verify on your testcase? If so, I move we drop D53289 (and D31068) in favor of this.

I have not had the time to look at the diff yet, but I patched it in and re-ran the benchmarks. I can confirm that this fixes the compile-time performance issue as well as or better than D53289:

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19060	this is never used

courbet added inline comments. Oct 26 2018, 2:12 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
18944	I think readability would be much better if this was split a bit:

	bool DAGCombiner::findBetterNeighborChains(StoreSDNode *St) {
	  if (OptLevel == CodeGenOpt::None)
	    return false;

	  // This holds the base pointer, index, and the offset in bytes from the base
	  // pointer.
	  BaseIndexOffset BasePtr = BaseIndexOffset::match(St, DAG);

	  // We must have a base and an offset.
	  if (!BasePtr.getBase().getNode())
	    return false;

	  // Do not handle stores to undef base pointers.
	  if (BasePtr.getBase().isUndef())
	    return false;

	  // First try to merge chained stores.
	  StoreSDNode *STChain = St;
	  SmallVector<StoreSDNode *, 8> ChainedStores = findChainedStores(STChain, BasePtr);
	  if (ChainedStores.size() > 0) {
	    mergeChainedStores(ChainedStores);
	    return true;
	  }

	  // Improve St's Chain..
	  SDValue BetterChain = FindBetterChain(St, St->getChain());
	  if (St->getChain() != BetterChain) {
	    replaceStoreChain(St, BetterChain);
	    return true;
	  }
	  return false;
	}

	(maybe even merge `findChainedStores` into `mergeChainedStores`)
18950	Please make this const to save the reader the trouble of tracking all uses.
18969	this is unused
18982	we're -> we
18984	What about using `llvm::IntervalMap<int64_t, int>` for `IntervalsCovered` ? It would simplify the code here.
18987	why are you checking the iterator (rhs) ? The first check should be enough.
19015	please move this closer to where it's used.
19055	remove
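
The overlap check discussed in the comments on lines 18984 and 18987 can be sketched with standard containers. This is a minimal illustration, not the patch itself: it uses `std::map` as a stand-in for both the `std::set` of pairs in the diff and the suggested `llvm::IntervalMap`, and `CoveredIntervals` and `insertIfDisjoint` are hypothetical names. Ranges are half-open, `[start, end)`, and `start < end` is assumed.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical helper: tracks disjoint half-open byte ranges [start, end).
class CoveredIntervals {
  std::map<int64_t, int64_t> ranges; // start -> end

public:
  // Records [start, end) and returns true only if it overlaps no range
  // recorded so far. Nothing is recorded when an overlap is found.
  bool insertIfDisjoint(int64_t start, int64_t end) {
    // First recorded range that starts at or after `start`.
    auto next = ranges.lower_bound(start);
    if (next != ranges.end() && next->first < end)
      return false; // the right neighbour starts before this range ends
    if (next != ranges.begin() && std::prev(next)->second > start)
      return false; // the left neighbour ends after this range starts
    ranges.emplace(start, end);
    return true;
  }
};
```

Only the two immediate neighbours need to be inspected, because the recorded ranges are kept disjoint and sorted by start offset.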

Simplify. Update chain after both parallelizing chain and individual chain improvements. Also use IntervalMap.

niravd marked 4 inline comments as done. Oct 30 2018, 11:20 AM

courbet added inline comments. Nov 5 2018, 2:41 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
18951	every*
18954	Why ? This is `return true` AFAICT...
18958	ditto
18986	interval*
19080	Please fix the comment.
19081	this returns true if St has no base and offset. This is weird. I know that this cannot happen because we just made the check above, but it's misleading and redundant. Maybe pass `BasePtr` to `findBetterNeighborChains()` and assert ?
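
The suggestion on line 19081 (check once in the caller, assert in the callee) can be sketched as follows. The types and function names here (`ToyBasePtr`, `improveChains`, `visitStore`) are hypothetical stand-ins chosen for illustration, not the DAGCombiner interfaces, and the return value of `improveChains` is only a placeholder for "made a change".

```cpp
#include <cassert>

// Hypothetical stand-in for the result of matching a store's address.
struct ToyBasePtr {
  bool hasBase; // false when no base pointer could be matched
};

// Callee: relies on the caller's check and documents it with an assert
// rather than returning a success value for an input it cannot handle.
bool improveChains(const ToyBasePtr &ptr, int chainLength) {
  assert(ptr.hasBase && "caller must pass a matched base pointer");
  return chainLength > 1; // placeholder for "made a change"
}

// Caller: performs the validity check exactly once.
bool visitStore(const ToyBasePtr &ptr, int chainLength) {
  if (!ptr.hasBase)
    return false;
  return improveChains(ptr, chainLength);
}
```

This keeps the precondition in one place, so the callee no longer has an early return whose meaning the reader has to work out.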

Address comments (typos and comment fixes) and rebase.

courbet accepted this revision. Nov 7 2018, 11:23 PM

This revision is now accepted and ready to land. Nov 7 2018, 11:23 PM

Closed by commit rL346432: [DAGCombine] Improve alias analysis for chain of independent stores. (authored by niravd). Nov 8 2018, 11:16 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp	155 lines
llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll	9 lines
llvm/test/CodeGen/AArch64/ldst-opt.ll	14 lines
llvm/test/CodeGen/AArch64/swifterror.ll	5 lines
llvm/test/CodeGen/ARM/arm-storebytesmerge.ll	175 lines
llvm/test/CodeGen/ARM/misched-fusion-aes.ll	15 lines
llvm/test/CodeGen/Mips/fastcc.ll	36 lines
llvm/test/CodeGen/SystemZ/pr36164.ll	69 lines
llvm/test/CodeGen/X86/i256-add.ll	8 lines
llvm/test/CodeGen/X86/stores-merging.ll	3 lines

Diff 170565

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

	//			//
	// I believe this is mainly important because MergeConsecutiveStores			// I believe this is mainly important because MergeConsecutiveStores
	// is unable to deal with merging stores of different sizes, so unless			// is unable to deal with merging stores of different sizes, so unless
	// we improve the chains of all the potential candidates up-front			// we improve the chains of all the potential candidates up-front
	// before running MergeConsecutiveStores, it might only see some of			// before running MergeConsecutiveStores, it might only see some of
	// the nodes that will eventually be candidates, and then not be able			// the nodes that will eventually be candidates, and then not be able
	// to go from a partially-merged state to the desired final			// to go from a partially-merged state to the desired final
	// fully-merged state.			// fully-merged state.
	bool DAGCombiner::findBetterNeighborChains(StoreSDNode *St) {			bool DAGCombiner::findBetterNeighborChains(StoreSDNode *St) {
				courbet: I think readability would be much better if this was split a bit (full suggestion in the inline comment on line 18944 above).
	if (OptLevel == CodeGenOpt::None)			if (OptLevel == CodeGenOpt::None)
	return false;			return false;

	// This holds the base pointer, index, and the offset in bytes from the base			// This holds the base pointer, index, and the offset in bytes from the base
	// pointer.			// pointer.
	BaseIndexOffset BasePtr = BaseIndexOffset::match(St, DAG);			BaseIndexOffset BasePtr = BaseIndexOffset::match(St, DAG);
				courbet: Please make this const to save the reader the trouble of tracking all uses.

				courbet: every*
	// We must have a base and an offset.			// We must have a base and an offset.
	if (!BasePtr.getBase().getNode())			if (!BasePtr.getBase().getNode())
	return false;			return false;
				courbet: Why ? This is `return true` AFAICT...

	// Do not handle stores to undef base pointers.			// Do not handle stores to undef base pointers.
	if (BasePtr.getBase().isUndef())			if (BasePtr.getBase().isUndef())
	return false;			return false;
				courbet: ditto

				bool MadeChangeToST = false;
				// Records which offsets from BaseIndex have been marked.
				std::set<std::pair<int64_t, int64_t>> IntervalsCovered;
				// Holds all stores chained to St
	SmallVector<StoreSDNode *, 8> ChainedStores;			SmallVector<StoreSDNode *, 8> ChainedStores;
	ChainedStores.push_back(St);			IntervalsCovered.insert(
				std::make_pair(0, St->getMemoryVT().getSizeInBits() / 8));

	// Walk up the chain and look for nodes with offsets from the same			StoreSDNode *STChain = St;
	// base pointer. Stop when reaching an instruction with a different kind			int64_t STChainOffset = 0;
				courbet: this is unused
	// or instruction which has a different base pointer.			while (StoreSDNode *Chain = dyn_cast<StoreSDNode>(STChain->getChain())) {
	StoreSDNode *Index = St;
	while (Index) {
	// If the chain has more than one use, then we can't reorder the mem ops.			// If the chain has more than one use, then we can't reorder the mem ops.
	if (Index != St && !SDValue(Index, 0)->hasOneUse())			if (!SDValue(Chain, 0)->hasOneUse())
	break;			break;
				if (Chain->isVolatile() || Chain->isIndexed())
	if (Index->isVolatile() || Index->isIndexed())
	break;			break;

	// Find the base pointer and offset for this memory node.			// Find the base pointer and offset for this memory node.
	BaseIndexOffset Ptr = BaseIndexOffset::match(Index, DAG);			BaseIndexOffset Ptr = BaseIndexOffset::match(Chain, DAG);

	// Check that the base pointer is the same as the original one.			// Check that the base pointer is the same as the original one.
	if (!BasePtr.equalBaseIndex(Ptr, DAG))			int64_t Offset;
				if (!BasePtr.equalBaseIndex(Ptr, DAG, Offset))
	break;			break;
				// Make sure we're don't overlap.
				courbet: we're -> we
	// Walk up the chain to find the next store node, ignoring any			int64_t Length = Chain->getMemoryVT().getSizeInBits() / 8;
	// intermediate loads. Any other kind of node will halt the loop.			auto Res = IntervalsCovered.insert(std::make_pair(Offset, Offset + Length));
				courbet: What about using `llvm::IntervalMap<int64_t, int>` for `IntervalsCovered` ? It would simplify the code here.
	SDNode *NextInChain = Index->getChain().getNode();			// Must be valid.
	while (true) {			auto I = Res.first;
				courbet: interval*
	if (StoreSDNode *STn = dyn_cast<StoreSDNode>(NextInChain)) {			if (!Res.second || I == IntervalsCovered.end())
				courbet: why are you checking the iterator (rhs) ? The first check should be enough.
	// We found a store node. Use it for the next iteration.
	if (STn->isVolatile() || STn->isIndexed()) {
	Index = nullptr;
	break;			break;
	}			// Check the interval to the left, check that it ends before this one
	ChainedStores.push_back(STn);			// starts.
	Index = STn;			if (I != IntervalsCovered.begin()) {
				--I;
				if ((*I).second > Offset)
	break;			break;
	} else if (LoadSDNode *Ldn = dyn_cast<LoadSDNode>(NextInChain)) {			++I;
	NextInChain = Ldn->getChain().getNode();			}
	continue;			// If there's an interval to the right, check that it starts after this one
	} else {			// ends.
	Index = nullptr;			if (I != IntervalsCovered.end()) {
				++I;
				if ((*I).first < Offset + Length)
	break;			break;
				--I;
	}			}
	}// end while
				ChainedStores.push_back(Chain);
				STChainOffset = Offset;
				STChain = Chain;
	}			}

	// At this point, ChainedStores lists all of the Store nodes			// Do we have multiple stores?
	// reachable by iterating up through chain nodes matching the above			if (ChainedStores.size() > 0) {
	// conditions. For each such store identified, try to find an			SDValue NewChain = STChain->getChain();
	// earlier chain to attach the store to which won't violate the			SmallVector<SDValue, 8> TFOps;
	// required ordering.			SDValue NewST;
				courbet: please move this closer to where it's used.
	bool MadeChangeToSt = false;			bool AllImproved = true;
	SmallVector<std::pair<StoreSDNode *, SDValue>, 8> BetterChains;			for (unsigned I = ChainedStores.size(); I;) {
				StoreSDNode *S = ChainedStores[--I];
	for (StoreSDNode *ChainedStore : ChainedStores) {			S = cast<StoreSDNode>(DAG.UpdateNodeOperands(
	SDValue Chain = ChainedStore->getChain();			S, NewChain, S->getOperand(1), S->getOperand(2), S->getOperand(3)));
	SDValue BetterChain = FindBetterChain(ChainedStore, Chain);			SDValue BetterChain = FindBetterChain(S, S->getChain());
				if (S->getChain() != BetterChain)
				S = cast<StoreSDNode>(
				DAG.UpdateNodeOperands(S, BetterChain, S->getOperand(1),
				S->getOperand(2), S->getOperand(3)));
				else
				AllImproved = false;
				TFOps.push_back(SDValue(S, 0));
				ChainedStores[I] = S;
				}
				// Fixup St.
				if (St->isTruncatingStore())
				NewST = DAG.getTruncStore(NewChain, SDLoc(St), St->getValue(),
				St->getBasePtr(), St->getMemoryVT(),
				St->getMemOperand());
				else
				NewST = DAG.getStore(NewChain, SDLoc(St), St->getValue(),
				St->getBasePtr(), St->getMemOperand());
				// Improve NewST's chain.
				SDValue BetterChain =
				FindBetterChain(NewST.getNode(), NewST->getOperand(0));
				if (NewST->getOperand(0) != BetterChain)
				NewST = SDValue(DAG.UpdateNodeOperands(
				NewST.getNode(), BetterChain, NewST->getOperand(1),
				NewST->getOperand(2), NewST->getOperand(3)),
				0);
				else
				AllImproved = false;
				TFOps.push_back(NewST);

	if (Chain != BetterChain) {			// If each store is improved we no longer have the dependence on
	if (ChainedStore == St)			// STChain->getChain() and need to add it back.
	MadeChangeToSt = true;			if (AllImproved)
	BetterChains.push_back(std::make_pair(ChainedStore, BetterChain));			TFOps.insert(TFOps.begin(), NewChain);
	}			// TFOps.push_back(NewChain);
				courbet: remove

				SDValue TF =
				DAG.getNode(ISD::TokenFactor, SDLoc(STChain), MVT::Other, TFOps);
				CombineTo(St, TF);
				MadeChangeToST = true;
				courbet: this is never used

				// Add Nodes to worklist in order.
				AddToWorklist(STChain);
				for (StoreSDNode *&S : ChainedStores) {
				AddToWorklist(S);
	}			}
				AddToWorklist(NewST.getNode());

	// Do all replacements after finding the replacements to make to avoid making			return true;
	// the chains more complicated by introducing new TokenFactors.			}
	for (auto Replacement : BetterChains)
	replaceStoreChain(Replacement.first, Replacement.second);

	return MadeChangeToSt;			// Improve St's Chain..
				SDValue BetterChain = FindBetterChain(St, St->getChain());
				if (St->getChain() != BetterChain) {
				replaceStoreChain(St, BetterChain);
				return true;
				}
				return false;
	}			}

				courbet: Please fix the comment.
	/// This is the entry point for the file.			/// This is the entry point for the file.
				courbet: this returns true if St has no base and offset. This is weird. I know that this cannot happen because we just made the check above, but it's misleading and redundant. Maybe pass `BasePtr` to `findBetterNeighborChains()` and assert ?
	void SelectionDAG::Combine(CombineLevel Level, AliasAnalysis *AA,			void SelectionDAG::Combine(CombineLevel Level, AliasAnalysis *AA,
	CodeGenOpt::Level OptLevel) {			CodeGenOpt::Level OptLevel) {
	/// This is the main entry point to this class.			/// This is the main entry point to this class.
	DAGCombiner(*this, AA, OptLevel).Run(Level);			DAGCombiner(*this, AA, OptLevel).Run(Level);
	}			}

llvm/test/CodeGen/AArch64/arm64-abi-varargs.ll

	; RUN: llc < %s -mtriple=arm64-apple-ios7.0.0 -mcpu=cyclone -enable-misched=false | FileCheck %s			; RUN: llc < %s -mtriple=arm64-apple-ios7.0.0 -mcpu=cyclone -enable-misched=false | FileCheck %s

	; rdar://13625505			; rdar://13625505
	; Here we have 9 fixed integer arguments the 9th argument in on stack, the			; Here we have 9 fixed integer arguments the 9th argument in on stack, the
	; varargs start right after at 8-byte alignment.			; varargs start right after at 8-byte alignment.
	define void @fn9(i32* %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5, i32 %a6, i32 %a7, i32 %a8, i32 %a9, ...) nounwind noinline ssp {			define void @fn9(i32* %a1, i32 %a2, i32 %a3, i32 %a4, i32 %a5, i32 %a6, i32 %a7, i32 %a8, i32 %a9, ...) nounwind noinline ssp {
	; CHECK-LABEL: fn9:			; CHECK-LABEL: fn9:
	; 9th fixed argument			; 9th fixed argument
	; CHECK: ldr {{w[0-9]+}}, [sp, #64]			; CHECK: ldr {{w[0-9]+}}, [sp, #64]
	; CHECK: add [[ARGS:x[0-9]+]], sp, #72			; CHECK-DAG: add [[ARGS:x[0-9]+]], sp, #72
	; CHECK: add {{x[0-9]+}}, [[ARGS]], #8
	; First vararg			; First vararg
	; CHECK: ldr {{w[0-9]+}}, [sp, #72]			; CHECK-DAG: ldr {{w[0-9]+}}, [sp, #72]
	; Second vararg			; Second vararg
	; CHECK: ldr {{w[0-9]+}}, [{{x[0-9]+}}], #8			; CHECK-DAG: ldr {{w[0-9]+}}, [sp, #80]
	; Third vararg			; Third vararg
	; CHECK: ldr {{w[0-9]+}}, [{{x[0-9]+}}], #8			; CHECK-DAG: ldr {{w[0-9]+}}, [sp, #88]
	%1 = alloca i32, align 4			%1 = alloca i32, align 4
	%2 = alloca i32, align 4			%2 = alloca i32, align 4
	%3 = alloca i32, align 4			%3 = alloca i32, align 4
	%4 = alloca i32, align 4			%4 = alloca i32, align 4
	%5 = alloca i32, align 4			%5 = alloca i32, align 4
	%6 = alloca i32, align 4			%6 = alloca i32, align 4
	%7 = alloca i32, align 4			%7 = alloca i32, align 4
	%8 = alloca i32, align 4			%8 = alloca i32, align 4

llvm/test/CodeGen/AArch64/ldst-opt.ll

store <2 x i32> zeroinitializer, <2 x i32>* %p		store <2 x i32> zeroinitializer, <2 x i32>* %p
ret void		ret void
}		}

; Like merge_zr32, but with 3-vector type.		; Like merge_zr32, but with 3-vector type.
define void @merge_zr32_3vec(<3 x i32>* %p) {		define void @merge_zr32_3vec(<3 x i32>* %p) {
; CHECK-LABEL: merge_zr32_3vec:		; CHECK-LABEL: merge_zr32_3vec:
; CHECK: // %entry		; CHECK: // %entry
; NOSTRICTALIGN-NEXT: str xzr, [x{{[0-9]+}}]
; NOSTRICTALIGN-NEXT: str wzr, [x{{[0-9]+}}, #8]		; NOSTRICTALIGN-NEXT: str wzr, [x{{[0-9]+}}, #8]
; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}]		; NOSTRICTALIGN-NEXT: str xzr, [x{{[0-9]+}}]
; STRICTALIGN-NEXT: str wzr, [x{{[0-9]+}}, #8]		; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}, #4]
		; STRICTALIGN-NEXT: str wzr, [x{{[0-9]+}}]
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
store <3 x i32> zeroinitializer, <3 x i32>* %p		store <3 x i32> zeroinitializer, <3 x i32>* %p
ret void		ret void
}		}

; Like merge_zr32, but with 4-vector type.		; Like merge_zr32, but with 4-vector type.
define void @merge_zr32_4vec(<4 x i32>* %p) {		define void @merge_zr32_4vec(<4 x i32>* %p) {
; CHECK-LABEL: merge_zr32_4vec:		; CHECK-LABEL: merge_zr32_4vec:
; CHECK: // %entry		; CHECK: // %entry
; NOSTRICTALIGN-NEXT: stp xzr, xzr, [x{{[0-9]+}}]		; NOSTRICTALIGN-NEXT: stp xzr, xzr, [x{{[0-9]+}}]
; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}]
; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}, #8]		; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}, #8]
		; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}]
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
store <4 x i32> zeroinitializer, <4 x i32>* %p		store <4 x i32> zeroinitializer, <4 x i32>* %p
ret void		ret void
}		}

; Like merge_zr32, but with 2-vector float type.		; Like merge_zr32, but with 2-vector float type.
define void @merge_zr32_2vecf(<2 x float>* %p) {		define void @merge_zr32_2vecf(<2 x float>* %p) {
; CHECK-LABEL: merge_zr32_2vecf:		; CHECK-LABEL: merge_zr32_2vecf:
; CHECK: // %entry		; CHECK: // %entry
; NOSTRICTALIGN-NEXT: str xzr, [x{{[0-9]+}}]		; NOSTRICTALIGN-NEXT: str xzr, [x{{[0-9]+}}]
; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}]		; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}]
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
store <2 x float> zeroinitializer, <2 x float>* %p		store <2 x float> zeroinitializer, <2 x float>* %p
ret void		ret void
}		}

; Like merge_zr32, but with 4-vector float type.		; Like merge_zr32, but with 4-vector float type.
define void @merge_zr32_4vecf(<4 x float>* %p) {		define void @merge_zr32_4vecf(<4 x float>* %p) {
; CHECK-LABEL: merge_zr32_4vecf:		; CHECK-LABEL: merge_zr32_4vecf:
; CHECK: // %entry		; CHECK: // %entry
; NOSTRICTALIGN-NEXT: stp xzr, xzr, [x{{[0-9]+}}]		; NOSTRICTALIGN-NEXT: stp xzr, xzr, [x{{[0-9]+}}]
; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}]
; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}, #8]		; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}, #8]
		; STRICTALIGN-NEXT: stp wzr, wzr, [x{{[0-9]+}}]
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
store <4 x float> zeroinitializer, <4 x float>* %p		store <4 x float> zeroinitializer, <4 x float>* %p
ret void		ret void
}		}

; Similar to merge_zr32, but for 64-bit values.		; Similar to merge_zr32, but for 64-bit values.
define void @merge_zr64(i64* %p) {		define void @merge_zr64(i64* %p) {
[...]
store <2 x double> zeroinitializer, <2 x double>* %p		store <2 x double> zeroinitializer, <2 x double>* %p
ret void		ret void
}		}

; Like merge_zr64, but with 3-vector i64 type.		; Like merge_zr64, but with 3-vector i64 type.
define void @merge_zr64_3vec(<3 x i64>* %p) {		define void @merge_zr64_3vec(<3 x i64>* %p) {
; CHECK-LABEL: merge_zr64_3vec:		; CHECK-LABEL: merge_zr64_3vec:
; CHECK: // %entry		; CHECK: // %entry
; CHECK-NEXT: stp xzr, xzr, [x{{[0-9]+}}]		; CHECK-NEXT: stp xzr, xzr, [x{{[0-9]+}}, #8]
; CHECK-NEXT: str xzr, [x{{[0-9]+}}, #16]		; CHECK-NEXT: str xzr, [x{{[0-9]+}}]
; CHECK-NEXT: ret		; CHECK-NEXT: ret
entry:		entry:
store <3 x i64> zeroinitializer, <3 x i64>* %p		store <3 x i64> zeroinitializer, <3 x i64>* %p
ret void		ret void
}		}

; Like merge_zr64_2, but with 4-vector double type.		; Like merge_zr64_2, but with 4-vector double type.
define void @merge_zr64_4vecd(<4 x double>* %p) {		define void @merge_zr64_4vecd(<4 x double>* %p) {

llvm/test/CodeGen/AArch64/swifterror.ll

	; CHECK-APPLE-LABEL: foo_vararg:			; CHECK-APPLE-LABEL: foo_vararg:
	; CHECK-APPLE: orr w0, wzr, #0x10			; CHECK-APPLE: orr w0, wzr, #0x10
	; CHECK-APPLE: malloc			; CHECK-APPLE: malloc
	; CHECK-APPLE-DAG: orr [[ID:w[0-9]+]], wzr, #0x1			; CHECK-APPLE-DAG: orr [[ID:w[0-9]+]], wzr, #0x1
	; CHECK-APPLE-DAG: add [[ARGS:x[0-9]+]], [[TMP:x[0-9]+]], #16			; CHECK-APPLE-DAG: add [[ARGS:x[0-9]+]], [[TMP:x[0-9]+]], #16
	; CHECK-APPLE-DAG: strb [[ID]], [x0, #8]			; CHECK-APPLE-DAG: strb [[ID]], [x0, #8]

	; First vararg			; First vararg
	; CHECK-APPLE-DAG: orr {{x[0-9]+}}, [[ARGS]], #0x8
	; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #16]			; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #16]
	; Second vararg			; Second vararg
	; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{x[0-9]+}}], #8			; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #24]
	; CHECK-APPLE-DAG: add {{x[0-9]+}}, {{x[0-9]+}}, #16			; CHECK-APPLE-DAG: add {{x[0-9]+}}, {{x[0-9]+}}, #16
	; Third vararg			; Third vararg
	; CHECK-APPLE: ldr {{w[0-9]+}}, [{{x[0-9]+}}], #8			; CHECK-APPLE-DAG: ldr {{w[0-9]+}}, [{{.*}}[[TMP]], #32]

	; CHECK-APPLE: mov x21, x0			; CHECK-APPLE: mov x21, x0
	; CHECK-APPLE-NOT: x21			; CHECK-APPLE-NOT: x21
	entry:			entry:
	%call = call i8* @malloc(i64 16)			%call = call i8* @malloc(i64 16)
	%call.0 = bitcast i8* %call to %swift_error*			%call.0 = bitcast i8* %call to %swift_error*
	store %swift_error* %call.0, %swift_error** %error_ptr_ref			store %swift_error* %call.0, %swift_error** %error_ptr_ref
	%tmp = getelementptr inbounds i8, i8* %call, i64 8			%tmp = getelementptr inbounds i8, i8* %call, i64 8

llvm/test/CodeGen/ARM/arm-storebytesmerge.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=arm-eabi -mattr=+neon %s -o - | FileCheck %s			; RUN: llc -mtriple=arm-eabi -mattr=+neon %s -o - | FileCheck %s

	target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"			target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
	target triple = "thumbv7em-arm-none-eabi"			target triple = "thumbv7em-arm-none-eabi"

	; Function Attrs: nounwind			; Function Attrs: nounwind
	define arm_aapcs_vfpcc void @test(i8* %v50) #0 {			define arm_aapcs_vfpcc void @test(i8* %v50) #0 {
	; CHECK-LABEL: test:			; CHECK-LABEL: test:
	; CHECK: @ %bb.0:			; CHECK: @ %bb.0:
	; CHECK-NEXT: movw r1, #35722			; CHECK-NEXT: movw r1, #65534
	; CHECK-NEXT: movt r1, #36236			; CHECK-NEXT: strh.w r1, [r0, #510]
	; CHECK-NEXT: str.w r1, [r0, #394]			; CHECK-NEXT: movw r1, #64506
	; CHECK-NEXT: movw r1, #36750			; CHECK-NEXT: movt r1, #65020
	; CHECK-NEXT: movt r1, #37264			; CHECK-NEXT: str.w r1, [r0, #506]
	; CHECK-NEXT: str.w r1, [r0, #398]			; CHECK-NEXT: movw r1, #63478
	; CHECK-NEXT: movw r1, #37778			; CHECK-NEXT: movt r1, #63992
	; CHECK-NEXT: movt r1, #38292			; CHECK-NEXT: str.w r1, [r0, #502]
	; CHECK-NEXT: str.w r1, [r0, #402]			; CHECK-NEXT: movw r1, #62450
				; CHECK-NEXT: movt r1, #62964
				; CHECK-NEXT: str.w r1, [r0, #498]
				; CHECK-NEXT: movw r1, #61422
				; CHECK-NEXT: movt r1, #61936
				; CHECK-NEXT: str.w r1, [r0, #494]
				; CHECK-NEXT: movw r1, #60394
				; CHECK-NEXT: movt r1, #60908
				; CHECK-NEXT: str.w r1, [r0, #490]
				; CHECK-NEXT: movw r1, #59366
				; CHECK-NEXT: movt r1, #59880
				; CHECK-NEXT: str.w r1, [r0, #486]
				; CHECK-NEXT: movw r1, #58338
				; CHECK-NEXT: movt r1, #58852
				; CHECK-NEXT: str.w r1, [r0, #482]
				; CHECK-NEXT: movw r1, #57310
				; CHECK-NEXT: movt r1, #57824
				; CHECK-NEXT: str.w r1, [r0, #478]
				; CHECK-NEXT: movw r1, #56282
				; CHECK-NEXT: movt r1, #56796
				; CHECK-NEXT: str.w r1, [r0, #474]
				; CHECK-NEXT: movw r1, #55254
				; CHECK-NEXT: movt r1, #55768
				; CHECK-NEXT: str.w r1, [r0, #470]
				; CHECK-NEXT: movw r1, #54226
				; CHECK-NEXT: movt r1, #54740
				; CHECK-NEXT: str.w r1, [r0, #466]
				; CHECK-NEXT: movw r1, #53198
				; CHECK-NEXT: movt r1, #53712
				; CHECK-NEXT: str.w r1, [r0, #462]
				; CHECK-NEXT: movw r1, #52170
				; CHECK-NEXT: movt r1, #52684
				; CHECK-NEXT: str.w r1, [r0, #458]
				; CHECK-NEXT: movw r1, #51142
				; CHECK-NEXT: movt r1, #51656
				; CHECK-NEXT: str.w r1, [r0, #454]
				; CHECK-NEXT: movw r1, #50114
				; CHECK-NEXT: movt r1, #50628
				; CHECK-NEXT: str.w r1, [r0, #450]
				; CHECK-NEXT: movw r1, #49086
				; CHECK-NEXT: movt r1, #49600
				; CHECK-NEXT: str.w r1, [r0, #446]
				; CHECK-NEXT: movw r1, #48058
				; CHECK-NEXT: movt r1, #48572
				; CHECK-NEXT: str.w r1, [r0, #442]
				; CHECK-NEXT: movw r1, #47030
				; CHECK-NEXT: movt r1, #47544
				; CHECK-NEXT: str.w r1, [r0, #438]
				; CHECK-NEXT: movw r1, #46002
				; CHECK-NEXT: movt r1, #46516
				; CHECK-NEXT: str.w r1, [r0, #434]
				; CHECK-NEXT: movw r1, #44974
				; CHECK-NEXT: movt r1, #45488
				; CHECK-NEXT: str.w r1, [r0, #430]
				; CHECK-NEXT: movw r1, #43946
				; CHECK-NEXT: movt r1, #44460
				; CHECK-NEXT: str.w r1, [r0, #426]
				; CHECK-NEXT: movw r1, #42918
				; CHECK-NEXT: movt r1, #43432
				; CHECK-NEXT: str.w r1, [r0, #422]
				; CHECK-NEXT: movw r1, #41890
				; CHECK-NEXT: movt r1, #42404
				; CHECK-NEXT: str.w r1, [r0, #418]
				; CHECK-NEXT: movw r1, #40862
				; CHECK-NEXT: movt r1, #41376
				; CHECK-NEXT: str.w r1, [r0, #414]
				; CHECK-NEXT: movw r1, #39834
				; CHECK-NEXT: movt r1, #40348
				; CHECK-NEXT: str.w r1, [r0, #410]
	; CHECK-NEXT: movw r1, #38806			; CHECK-NEXT: movw r1, #38806
	; CHECK-NEXT: movt r1, #39320			; CHECK-NEXT: movt r1, #39320
	; CHECK-NEXT: str.w r1, [r0, #406]			; CHECK-NEXT: str.w r1, [r0, #406]
	; CHECK-NEXT: movw r1, #39834			; CHECK-NEXT: movw r1, #37778
	; CHECK-NEXT: strh.w r1, [r0, #410]			; CHECK-NEXT: movt r1, #38292
	; CHECK-NEXT: movs r1, #156			; CHECK-NEXT: str.w r1, [r0, #402]
	; CHECK-NEXT: strb.w r1, [r0, #412]			; CHECK-NEXT: movw r1, #36750
	; CHECK-NEXT: movw r1, #40605			; CHECK-NEXT: movt r1, #37264
	; CHECK-NEXT: movt r1, #41119			; CHECK-NEXT: str.w r1, [r0, #398]
	; CHECK-NEXT: str.w r1, [r0, #413]			; CHECK-NEXT: movw r1, #35722
	; CHECK-NEXT: movw r1, #41633			; CHECK-NEXT: movt r1, #36236
	; CHECK-NEXT: movt r1, #42147			; CHECK-NEXT: str.w r1, [r0, #394]
	; CHECK-NEXT: str.w r1, [r0, #417]
	; CHECK-NEXT: movw r1, #42661
	; CHECK-NEXT: movt r1, #43175
	; CHECK-NEXT: str.w r1, [r0, #421]
	; CHECK-NEXT: movw r1, #43689
	; CHECK-NEXT: movt r1, #44203
	; CHECK-NEXT: str.w r1, [r0, #425]
	; CHECK-NEXT: movw r1, #44717
	; CHECK-NEXT: movt r1, #45231
	; CHECK-NEXT: str.w r1, [r0, #429]
	; CHECK-NEXT: movw r1, #45745
	; CHECK-NEXT: movt r1, #46259
	; CHECK-NEXT: str.w r1, [r0, #433]
	; CHECK-NEXT: movw r1, #46773
	; CHECK-NEXT: movt r1, #47287
	; CHECK-NEXT: str.w r1, [r0, #437]
	; CHECK-NEXT: movw r1, #47801
	; CHECK-NEXT: movt r1, #48315
	; CHECK-NEXT: str.w r1, [r0, #441]
	; CHECK-NEXT: movw r1, #48829
	; CHECK-NEXT: movt r1, #49343
	; CHECK-NEXT: str.w r1, [r0, #445]
	; CHECK-NEXT: movw r1, #49857
	; CHECK-NEXT: movt r1, #50371
	; CHECK-NEXT: str.w r1, [r0, #449]
	; CHECK-NEXT: movw r1, #50885
	; CHECK-NEXT: movt r1, #51399
	; CHECK-NEXT: str.w r1, [r0, #453]
	; CHECK-NEXT: movw r1, #52941
	; CHECK-NEXT: movt r1, #53455
	; CHECK-NEXT: str.w r1, [r0, #461]
	; CHECK-NEXT: movw r1, #51913
	; CHECK-NEXT: movt r1, #52427
	; CHECK-NEXT: str.w r1, [r0, #457]
	; CHECK-NEXT: movw r1, #53969
	; CHECK-NEXT: movt r1, #54483
	; CHECK-NEXT: str.w r1, [r0, #465]
	; CHECK-NEXT: movw r1, #54997
	; CHECK-NEXT: movt r1, #55511
	; CHECK-NEXT: str.w r1, [r0, #469]
	; CHECK-NEXT: movw r1, #56025
	; CHECK-NEXT: movt r1, #56539
	; CHECK-NEXT: str.w r1, [r0, #473]
	; CHECK-NEXT: movw r1, #57053
	; CHECK-NEXT: movt r1, #57567
	; CHECK-NEXT: str.w r1, [r0, #477]
	; CHECK-NEXT: movw r1, #58081
	; CHECK-NEXT: movt r1, #58595
	; CHECK-NEXT: str.w r1, [r0, #481]
	; CHECK-NEXT: movw r1, #59109
	; CHECK-NEXT: movt r1, #59623
	; CHECK-NEXT: str.w r1, [r0, #485]
	; CHECK-NEXT: movw r1, #60137
	; CHECK-NEXT: movt r1, #60651
	; CHECK-NEXT: str.w r1, [r0, #489]
	; CHECK-NEXT: movw r1, #61165
	; CHECK-NEXT: movt r1, #61679
	; CHECK-NEXT: str.w r1, [r0, #493]
	; CHECK-NEXT: movw r1, #62193
	; CHECK-NEXT: movt r1, #62707
	; CHECK-NEXT: str.w r1, [r0, #497]
	; CHECK-NEXT: movw r1, #63221
	; CHECK-NEXT: movt r1, #63735
	; CHECK-NEXT: str.w r1, [r0, #501]
	; CHECK-NEXT: movw r1, #64249
	; CHECK-NEXT: movt r1, #64763
	; CHECK-NEXT: str.w r1, [r0, #505]
	; CHECK-NEXT: movw r1, #65277
	; CHECK-NEXT: strh.w r1, [r0, #509]
	; CHECK-NEXT: movs r1, #255
	; CHECK-NEXT: strb.w r1, [r0, #511]
	; CHECK-NEXT: bx lr			; CHECK-NEXT: bx lr
	%v190 = getelementptr inbounds i8, i8* %v50, i32 394			%v190 = getelementptr inbounds i8, i8* %v50, i32 394
	store i8 -118, i8* %v190, align 1			store i8 -118, i8* %v190, align 1
	%v191 = getelementptr inbounds i8, i8* %v50, i32 395			%v191 = getelementptr inbounds i8, i8* %v50, i32 395
	store i8 -117, i8* %v191, align 1			store i8 -117, i8* %v191, align 1
	%v192 = getelementptr inbounds i8, i8* %v50, i32 396			%v192 = getelementptr inbounds i8, i8* %v50, i32 396
	store i8 -116, i8* %v192, align 1			store i8 -116, i8* %v192, align 1
	%v193 = getelementptr inbounds i8, i8* %v50, i32 397			%v193 = getelementptr inbounds i8, i8* %v50, i32 397
	▲ Show 20 Lines • Show All 234 Lines • Show Last 20 Lines

llvm/test/CodeGen/ARM/misched-fusion-aes.ll

Show First 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	define void @aesea(<16 x i8>* %a0, <16 x i8>* %b0, <16 x i8>* %c0, <16 x i8> %d, <16 x i8> %e) {
store <16 x i8> %h2, <16 x i8>* %c2		store <16 x i8> %h2, <16 x i8>* %c2
%c3 = getelementptr inbounds <16 x i8>, <16 x i8>* %c0, i64 3		%c3 = getelementptr inbounds <16 x i8>, <16 x i8>* %c0, i64 3
store <16 x i8> %h3, <16 x i8>* %c3		store <16 x i8> %h3, <16 x i8>* %c3
ret void		ret void

; CHECK-LABEL: aesea:		; CHECK-LABEL: aesea:
; CHECK: aese.8 [[QA:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QA:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QA]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QA]]

; CHECK: aese.8 [[QB:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QB:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QB]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QB]]
; CHECK: aese.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aese.8 [[QC:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QC:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QC]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QC]]

		; CHECK: aese.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aese.8 [[QD:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QD:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QD]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QD]]

		; CHECK: aese.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aese.8 [[QE:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QE:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QE]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QE]]
; CHECK: aese.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aese.8 [[QF:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QF:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QF]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QF]]

; CHECK: aese.8 [[QG:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QG:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QG]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QG]]

; CHECK: aese.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}		; CHECK: aese.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aese.8 [[QH:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aese.8 [[QH:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QH]]		; CHECK-NEXT: aesmc.8 {{q[0-9][0-9]?}}, [[QH]]
}		}

define void @aesda(<16 x i8>* %a0, <16 x i8>* %b0, <16 x i8>* %c0, <16 x i8> %d, <16 x i8> %e) {		define void @aesda(<16 x i8>* %a0, <16 x i8>* %b0, <16 x i8>* %c0, <16 x i8> %d, <16 x i8> %e) {
%d0 = load <16 x i8>, <16 x i8>* %a0		%d0 = load <16 x i8>, <16 x i8>* %a0
%a1 = getelementptr inbounds <16 x i8>, <16 x i8>* %a0, i64 1		%a1 = getelementptr inbounds <16 x i8>, <16 x i8>* %a0, i64 1
▲ Show 20 Lines • Show All 58 Lines • ▼ Show 20 Lines	define void @aesda(<16 x i8>* %a0, <16 x i8>* %b0, <16 x i8>* %c0, <16 x i8> %d, <16 x i8> %e) {
store <16 x i8> %h3, <16 x i8>* %c3		store <16 x i8> %h3, <16 x i8>* %c3
ret void		ret void

; CHECK-LABEL: aesda:		; CHECK-LABEL: aesda:
; CHECK: aesd.8 [[QA:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QA:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QA]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QA]]
; CHECK: aesd.8 [[QB:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QB:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QB]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QB]]
; CHECK: aesd.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aesd.8 [[QC:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QC:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QC]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QC]]
		; CHECK: aesd.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aesd.8 [[QD:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QD:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QD]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QD]]
		; CHECK: aesd.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aesd.8 [[QE:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QE:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QE]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QE]]
; CHECK: aesd.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aesd.8 [[QF:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QF:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QF]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QF]]
; CHECK: aesd.8 [[QG:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QG:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QG]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QG]]
; CHECK: aesd.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}		; CHECK: aesd.8 {{q[0-9][0-9]?}}, {{q[0-9][0-9]?}}
; CHECK: aesd.8 [[QH:q[0-9][0-9]?]], {{q[0-9][0-9]?}}		; CHECK: aesd.8 [[QH:q[0-9][0-9]?]], {{q[0-9][0-9]?}}
; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QH]]		; CHECK-NEXT: aesimc.8 {{q[0-9][0-9]?}}, [[QH]]
}		}
Show All 27 Lines

llvm/test/CodeGen/Mips/fastcc.ll

Show First 20 Lines • Show All 217 Lines • ▼ Show 20 Lines	; CHECK: lwc1 $f0
%20 = load float, float* @gfa20, align 4		%20 = load float, float* @gfa20, align 4
tail call fastcc void @callee1(float %0, float %1, float %2, float %3, float %4, float %5, float %6, float %7, float %8, float %9, float %10, float %11, float %12, float %13, float %14, float %15, float %16, float %17, float %18, float %19, float %20)		tail call fastcc void @callee1(float %0, float %1, float %2, float %3, float %4, float %5, float %6, float %7, float %8, float %9, float %10, float %11, float %12, float %13, float %14, float %15, float %16, float %17, float %18, float %19, float %20)
ret void		ret void
}		}

define internal fastcc void @callee1(float %a0, float %a1, float %a2, float %a3, float %a4, float %a5, float %a6, float %a7, float %a8, float %a9, float %a10, float %a11, float %a12, float %a13, float %a14, float %a15, float %a16, float %a17, float %a18, float %a19, float %a20) nounwind noinline {		define internal fastcc void @callee1(float %a0, float %a1, float %a2, float %a3, float %a4, float %a5, float %a6, float %a7, float %a8, float %a9, float %a10, float %a11, float %a12, float %a13, float %a14, float %a15, float %a16, float %a17, float %a18, float %a19, float %a20) nounwind noinline {
entry:		entry:
; CHECK-LABEL: callee1:		; CHECK-LABEL: callee1:
; CHECK-DAG: swc1 $f0
niravd (Author): Apparently FileCheck balks at dealing with this many deferred lines in a DAG match. This is just a no-op reordering to be closer to the output asm.
; CHECK-DAG: swc1 $f1
; CHECK-DAG: swc1 $f2
; CHECK-DAG: swc1 $f3
; CHECK-DAG: swc1 $f4
; CHECK-DAG: swc1 $f5
; CHECK-DAG: swc1 $f6
; CHECK-DAG: swc1 $f7
; CHECK-DAG: swc1 $f8
; CHECK-DAG: swc1 $f9
; CHECK-DAG: swc1 $f10
; CHECK-DAG: swc1 $f11
; CHECK-DAG: swc1 $f12
; CHECK-DAG: swc1 $f13
; CHECK-DAG: swc1 $f14
; CHECK-DAG: swc1 $f15
; CHECK-DAG: swc1 $f16
; CHECK-DAG: swc1 $f17		; CHECK-DAG: swc1 $f17
		; CHECK-DAG: swc1 $f16
		; CHECK-DAG: swc1 $f15
		; CHECK-DAG: swc1 $f14
		; CHECK-DAG: swc1 $f13
		; CHECK-DAG: swc1 $f12
		; CHECK-DAG: swc1 $f11
		; CHECK-DAG: swc1 $f10
		; CHECK-DAG: swc1 $f9
		; CHECK-DAG: swc1 $f8
		; CHECK-DAG: swc1 $f7
		; CHECK-DAG: swc1 $f6
		; CHECK-DAG: swc1 $f5
		; CHECK-DAG: swc1 $f4
		; CHECK-DAG: swc1 $f3
		; CHECK-DAG: swc1 $f2
		; CHECK-DAG: swc1 $f1
		; CHECK-DAG: swc1 $f0
; CHECK-DAG: swc1 $f18		; CHECK-DAG: swc1 $f18
; CHECK-DAG: swc1 $f19		; CHECK-DAG: swc1 $f19

store float %a0, float* @gf0, align 4		store float %a0, float* @gf0, align 4
store float %a1, float* @gf1, align 4		store float %a1, float* @gf1, align 4
store float %a2, float* @gf2, align 4		store float %a2, float* @gf2, align 4
store float %a3, float* @gf3, align 4		store float %a3, float* @gf3, align 4
store float %a4, float* @gf4, align 4		store float %a4, float* @gf4, align 4
▲ Show 20 Lines • Show All 73 Lines • ▼ Show 20 Lines
; NOODDSPREG-DAG: swc1 $f6, 12($[[R0]])		; NOODDSPREG-DAG: swc1 $f6, 12($[[R0]])
; NOODDSPREG-DAG: swc1 $f8, 16($[[R0]])		; NOODDSPREG-DAG: swc1 $f8, 16($[[R0]])
; NOODDSPREG-DAG: swc1 $f10, 20($[[R0]])		; NOODDSPREG-DAG: swc1 $f10, 20($[[R0]])
; NOODDSPREG-DAG: swc1 $f12, 24($[[R0]])		; NOODDSPREG-DAG: swc1 $f12, 24($[[R0]])
; NOODDSPREG-DAG: swc1 $f14, 28($[[R0]])		; NOODDSPREG-DAG: swc1 $f14, 28($[[R0]])
; NOODDSPREG-DAG: swc1 $f16, 32($[[R0]])		; NOODDSPREG-DAG: swc1 $f16, 32($[[R0]])
; NOODDSPREG-DAG: swc1 $f18, 36($[[R0]])		; NOODDSPREG-DAG: swc1 $f18, 36($[[R0]])

; NOODDSPREG-DAG: lwc1 $[[F0:f[0-9]*[02468]]], 0($sp)		; NOODDSPREG-DAG: lwc1 $[[F0:f[0-9]*[02468]]], {{[0-9]+}}($sp)
; NOODDSPREG-DAG: swc1 $[[F0]], 40($[[R0]])		; NOODDSPREG-DAG: swc1 $[[F0]], 40($[[R0]])

store float %a0, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 0), align 4		store float %a0, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 0), align 4
store float %a1, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 1), align 4		store float %a1, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 1), align 4
store float %a2, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 2), align 4		store float %a2, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 2), align 4
store float %a3, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 3), align 4		store float %a3, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 3), align 4
store float %a4, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 4), align 4		store float %a4, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 4), align 4
store float %a5, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 5), align 4		store float %a5, float* getelementptr ([11 x float], [11 x float]* @fa, i32 0, i32 5), align 4
▲ Show 20 Lines • Show All 87 Lines • Show Last 20 Lines

llvm/test/CodeGen/SystemZ/pr36164.ll

	Show All 9 Lines
	@g_73 = external dso_local unnamed_addr global i32, align 4			@g_73 = external dso_local unnamed_addr global i32, align 4
	@g_832 = external dso_local constant %0, align 4			@g_832 = external dso_local constant %0, align 4
	@g_938 = external dso_local unnamed_addr global i64, align 8			@g_938 = external dso_local unnamed_addr global i64, align 8

	; Function Attrs: nounwind			; Function Attrs: nounwind
	define void @main() local_unnamed_addr #0 {			define void @main() local_unnamed_addr #0 {
	; CHECK-LABEL: main:			; CHECK-LABEL: main:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: stmg %r12, %r15, 96(%r15)
	; CHECK-NEXT: .cfi_offset %r12, -64
	; CHECK-NEXT: .cfi_offset %r13, -56
	; CHECK-NEXT: .cfi_offset %r14, -48
	; CHECK-NEXT: .cfi_offset %r15, -40
	; CHECK-NEXT: lhi %r0, 1			; CHECK-NEXT: lhi %r0, 1
	; CHECK-NEXT: larl %r1, g_938			; CHECK-NEXT: larl %r1, g_938
	; CHECK-NEXT: lhi %r2, 2			; CHECK-NEXT: lhi %r2, 0
	; CHECK-NEXT: lhi %r3, 3			; CHECK-NEXT: lhi %r3, 4
	; CHECK-NEXT: lhi %r4, 0			; CHECK-NEXT: larl %r4, g_11
	; CHECK-NEXT: lhi %r5, 4
	; CHECK-NEXT: larl %r14, g_11
	; CHECK-NEXT: .LBB0_1: # =>This Inner Loop Header: Depth=1			; CHECK-NEXT: .LBB0_1: # =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: strl %r0, g_73			; CHECK-NEXT: strl %r0, g_73
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: strl %r0, g_69
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-DAG: lghi %r13, 24
	; CHECK-DAG: strl %r2, g_69
	; CHECK-DAG: ag %r13, 0(%r1)
	; CHECK-NEXT: lrl %r12, g_832
	; CHECK-NEXT: strl %r3, g_69
	; CHECK-NEXT: lrl %r12, g_832
	; CHECK-NEXT: strl %r4, g_69
	; CHECK-NEXT: lrl %r12, g_832
	; CHECK-NEXT: strl %r0, g_69
	; CHECK-NEXT: lrl %r12, g_832
	; CHECK-NEXT: strl %r2, g_69			; CHECK-NEXT: strl %r2, g_69
	; CHECK-NEXT: lrl %r12, g_832			; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: lrl %r5, g_832
				; CHECK-NEXT: agsi 0(%r1), 24
				; CHECK-NEXT: lrl %r5, g_832
	; CHECK-NEXT: strl %r3, g_69			; CHECK-NEXT: strl %r3, g_69
	; CHECK-NEXT: stgrl %r13, g_938			; CHECK-NEXT: mvi 0(%r4), 1
	; CHECK-NEXT: lrl %r13, g_832
	; CHECK-NEXT: strl %r5, g_69
	; CHECK-NEXT: mvi 0(%r14), 1
	; CHECK-NEXT: j .LBB0_1			; CHECK-NEXT: j .LBB0_1
	br label %1			br label %1

	; <label>:1: ; preds = %1, %0			; <label>:1: ; preds = %1, %0
	store i32 1, i32* @g_73, align 4			store i32 1, i32* @g_73, align 4
	%2 = load i64, i64* @g_938, align 8			%2 = load i64, i64* @g_938, align 8
	store i32 0, i32* @g_69, align 4			store i32 0, i32* @g_69, align 4
	%3 = load volatile i32, i32* getelementptr inbounds (%0, %0* @g_832, i64 0, i32 0), align 4			%3 = load volatile i32, i32* getelementptr inbounds (%0, %0* @g_832, i64 0, i32 0), align 4
	Show All 40 Lines

llvm/test/CodeGen/X86/i256-add.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i386-unknown \| FileCheck %s --check-prefix=X32			; RUN: llc < %s -mtriple=i386-unknown \| FileCheck %s --check-prefix=X32
	; RUN: llc < %s -mtriple=x86_64-unknown \| FileCheck %s --check-prefix=X64			; RUN: llc < %s -mtriple=x86_64-unknown \| FileCheck %s --check-prefix=X64

	define void @add(i256* %p, i256* %q) nounwind {			define void @add(i256* %p, i256* %q) nounwind {
	; X32-LABEL: add:			; X32-LABEL: add:
	; X32: # %bb.0:			; X32: # %bb.0:
	; X32-NEXT: pushl %ebp			; X32-NEXT: pushl %ebp
	; X32-NEXT: pushl %ebx			; X32-NEXT: pushl %ebx
	; X32-NEXT: pushl %edi			; X32-NEXT: pushl %edi
	; X32-NEXT: pushl %esi			; X32-NEXT: pushl %esi
	; X32-NEXT: subl $8, %esp			; X32-NEXT: subl $8, %esp
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: movl 28(%eax), %ecx			; X32-NEXT: movl 28(%eax), %ecx
	; X32-NEXT: movl %ecx, {{[0-9]+}}(%esp) # 4-byte Spill			; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X32-NEXT: movl 24(%eax), %ecx			; X32-NEXT: movl 24(%eax), %ecx
	; X32-NEXT: movl %ecx, (%esp) # 4-byte Spill			; X32-NEXT: movl %ecx, (%esp) # 4-byte Spill
	; X32-NEXT: movl 20(%eax), %esi			; X32-NEXT: movl 20(%eax), %esi
	; X32-NEXT: movl 16(%eax), %edi			; X32-NEXT: movl 16(%eax), %edi
	; X32-NEXT: movl 12(%eax), %ebx			; X32-NEXT: movl 12(%eax), %ebx
	; X32-NEXT: movl 8(%eax), %ebp			; X32-NEXT: movl 8(%eax), %ebp
	; X32-NEXT: movl (%eax), %ecx			; X32-NEXT: movl (%eax), %ecx
	; X32-NEXT: movl 4(%eax), %edx			; X32-NEXT: movl 4(%eax), %edx
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: addl %ecx, (%eax)			; X32-NEXT: addl %ecx, (%eax)
	; X32-NEXT: adcl %edx, 4(%eax)			; X32-NEXT: adcl %edx, 4(%eax)
	; X32-NEXT: adcl %ebp, 8(%eax)			; X32-NEXT: adcl %ebp, 8(%eax)
	; X32-NEXT: adcl %ebx, 12(%eax)			; X32-NEXT: adcl %ebx, 12(%eax)
	; X32-NEXT: adcl %edi, 16(%eax)			; X32-NEXT: adcl %edi, 16(%eax)
	; X32-NEXT: adcl %esi, 20(%eax)			; X32-NEXT: adcl %esi, 20(%eax)
	; X32-NEXT: movl (%esp), %ecx # 4-byte Reload			; X32-NEXT: movl (%esp), %ecx # 4-byte Reload
	; X32-NEXT: adcl %ecx, 24(%eax)			; X32-NEXT: adcl %ecx, 24(%eax)
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx # 4-byte Reload			; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
	; X32-NEXT: adcl %ecx, 28(%eax)			; X32-NEXT: adcl %ecx, 28(%eax)
	; X32-NEXT: addl $8, %esp			; X32-NEXT: addl $8, %esp
	; X32-NEXT: popl %esi			; X32-NEXT: popl %esi
	; X32-NEXT: popl %edi			; X32-NEXT: popl %edi
	; X32-NEXT: popl %ebx			; X32-NEXT: popl %ebx
	; X32-NEXT: popl %ebp			; X32-NEXT: popl %ebp
	; X32-NEXT: retl			; X32-NEXT: retl
	;			;
	Show All 19 Lines
	; X32: # %bb.0:			; X32: # %bb.0:
	; X32-NEXT: pushl %ebp			; X32-NEXT: pushl %ebp
	; X32-NEXT: pushl %ebx			; X32-NEXT: pushl %ebx
	; X32-NEXT: pushl %edi			; X32-NEXT: pushl %edi
	; X32-NEXT: pushl %esi			; X32-NEXT: pushl %esi
	; X32-NEXT: subl $8, %esp			; X32-NEXT: subl $8, %esp
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: movl 28(%eax), %ecx			; X32-NEXT: movl 28(%eax), %ecx
	; X32-NEXT: movl %ecx, {{[0-9]+}}(%esp) # 4-byte Spill			; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X32-NEXT: movl 24(%eax), %ecx			; X32-NEXT: movl 24(%eax), %ecx
	; X32-NEXT: movl %ecx, (%esp) # 4-byte Spill			; X32-NEXT: movl %ecx, (%esp) # 4-byte Spill
	; X32-NEXT: movl 20(%eax), %esi			; X32-NEXT: movl 20(%eax), %esi
	; X32-NEXT: movl 16(%eax), %edi			; X32-NEXT: movl 16(%eax), %edi
	; X32-NEXT: movl 12(%eax), %ebx			; X32-NEXT: movl 12(%eax), %ebx
	; X32-NEXT: movl 8(%eax), %ebp			; X32-NEXT: movl 8(%eax), %ebp
	; X32-NEXT: movl (%eax), %ecx			; X32-NEXT: movl (%eax), %ecx
	; X32-NEXT: movl 4(%eax), %edx			; X32-NEXT: movl 4(%eax), %edx
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: subl %ecx, (%eax)			; X32-NEXT: subl %ecx, (%eax)
	; X32-NEXT: sbbl %edx, 4(%eax)			; X32-NEXT: sbbl %edx, 4(%eax)
	; X32-NEXT: sbbl %ebp, 8(%eax)			; X32-NEXT: sbbl %ebp, 8(%eax)
	; X32-NEXT: sbbl %ebx, 12(%eax)			; X32-NEXT: sbbl %ebx, 12(%eax)
	; X32-NEXT: sbbl %edi, 16(%eax)			; X32-NEXT: sbbl %edi, 16(%eax)
	; X32-NEXT: sbbl %esi, 20(%eax)			; X32-NEXT: sbbl %esi, 20(%eax)
	; X32-NEXT: movl (%esp), %ecx # 4-byte Reload			; X32-NEXT: movl (%esp), %ecx # 4-byte Reload
	; X32-NEXT: sbbl %ecx, 24(%eax)			; X32-NEXT: sbbl %ecx, 24(%eax)
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx # 4-byte Reload			; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Reload
	; X32-NEXT: sbbl %ecx, 28(%eax)			; X32-NEXT: sbbl %ecx, 28(%eax)
	; X32-NEXT: addl $8, %esp			; X32-NEXT: addl $8, %esp
	; X32-NEXT: popl %esi			; X32-NEXT: popl %esi
	; X32-NEXT: popl %edi			; X32-NEXT: popl %edi
	; X32-NEXT: popl %ebx			; X32-NEXT: popl %ebx
	; X32-NEXT: popl %ebp			; X32-NEXT: popl %ebp
	; X32-NEXT: retl			; X32-NEXT: retl
	;			;
	Show All 17 Lines

llvm/test/CodeGen/X86/stores-merging.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx \| FileCheck %s			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx \| FileCheck %s

	%structTy = type { i8, i32, i32 }			%structTy = type { i8, i32, i32 }

	@e = common global %structTy zeroinitializer, align 4			@e = common global %structTy zeroinitializer, align 4

	;; Ensure that MergeConsecutiveStores doesn't incorrectly reorder			;; Ensure that MergeConsecutiveStores doesn't incorrectly reorder
	;; store operations. The first test stores in increasing address			;; store operations. The first test stores in increasing address
	;; order, the second in decreasing -- but in both cases should have			;; order, the second in decreasing -- but in both cases should have
	;; the same result in memory in the end.			;; the same result in memory in the end.

	define void @redundant_stores_merging() {			define void @redundant_stores_merging() {
	; CHECK-LABEL: redundant_stores_merging:			; CHECK-LABEL: redundant_stores_merging:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: movabsq $528280977409, %rax # imm = 0x7B00000001			; CHECK-NEXT: movabsq $1958505086977, %rax # imm = 0x1C800000001
	; CHECK-NEXT: movq %rax, e+{{.*}}(%rip)			; CHECK-NEXT: movq %rax, e+{{.*}}(%rip)
	; CHECK-NEXT: movl $456, e+{{.*}}(%rip) # imm = 0x1C8
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	store i32 1, i32* getelementptr inbounds (%structTy, %structTy* @e, i64 0, i32 1), align 4			store i32 1, i32* getelementptr inbounds (%structTy, %structTy* @e, i64 0, i32 1), align 4
	store i32 123, i32* getelementptr inbounds (%structTy, %structTy* @e, i64 0, i32 2), align 4			store i32 123, i32* getelementptr inbounds (%structTy, %structTy* @e, i64 0, i32 2), align 4
	store i32 456, i32* getelementptr inbounds (%structTy, %structTy* @e, i64 0, i32 2), align 4			store i32 456, i32* getelementptr inbounds (%structTy, %structTy* @e, i64 0, i32 2), align 4
	ret void			ret void
	}			}

	;; This variant tests PR25154.			;; This variant tests PR25154.
	▲ Show 20 Lines • Show All 197 Lines • Show Last 20 Lines
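
A note on the `redundant_stores_merging` hunk in stores-merging.ll: the old output materializes `0x7B00000001` and then emits a separate `movl $456`, while the new output materializes `0x1C800000001` in a single 64-bit store. The sketch below only checks the arithmetic behind those two immediates, assuming the two adjacent `i32` fields are packed little-endian into one 64-bit value (low half is field 1, high half is field 2). The helper name `pack_u32_pair` is hypothetical and is not part of LLVM.

```python
def pack_u32_pair(low: int, high: int) -> int:
    """Pack two 32-bit values into one 64-bit immediate (low half first,
    as a little-endian 64-bit store would lay them out in memory)."""
    assert 0 <= low < 2**32 and 0 <= high < 2**32
    return (high << 32) | low


# Old CHECK line: merged store of (1, 123), followed by a separate store of 456.
old_imm = pack_u32_pair(1, 123)

# New CHECK line: merged store of (1, 456); the store of 123 no longer appears.
new_imm = pack_u32_pair(1, 456)

assert old_imm == 528280977409 == 0x7B00000001
assert new_imm == 1958505086977 == 0x1C800000001
```

Both sequences leave the same bytes in memory (field 1 is 1 and field 2 is 456), which is the property the test comment asks for. The new sequence gets there with one store.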