This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

llvm/lib/Target/X86/
X86ISelLowering.h (8/8)
X86ISelLowering.cpp (1/1)

llvm/test/CodeGen/X86/
add-sub-bool.ll
bswap-wide-int.ll
fshl.ll
fshr.ll
i128-add.ll
icmp-shift-opt.ll
legalize-shl-vec.ll
merge-consecutive-stores-nt.ll
setcc-wide-types.ll (1/1)
smin.ll
smul-with-overflow.ll
smulo-128-legalisation-lowering.ll
umin.ll
umul-with-overflow.ll
wide-integer-cmp.ll
xaluo128.ll

Differential D141776

[X86] `X86TargetLowering`: override `allowsMemoryAccess()`
ClosedPublic

Authored by lebedev.ri on Jan 14 2023, 1:18 PM.


Details

Reviewers

RKSimon
spatel
pengfei
craig.topper
efriedma

Commits

rG005173cbb609: [X86] `X86TargetLowering`: override `allowsMemoryAccess()`

Summary

The baseline `allowsMemoryAccess()` is wrong for X86.
It assumes that aligned memory operations are always allowed,
but that is not true.

For example, without AVX2 we cannot perform a 32-byte-aligned non-temporal load
of a 32-byte vector, yet `allowsMemoryAccess()` says it is allowed.
As a result, we may merge non-temporal loads, only to split them up again
to legalize them, and then repeat the cycle.

NOTE: the test changes here are superfluous. The main effect is that without this change, in D141777, we'd get stuck endlessly merging and splitting non-temporal stores.
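
The merge/split cycle described in the summary can be illustrated with a toy model. This is only a sketch under stated assumptions: `Features`, `baselineAllows`, `targetAllows` and `countRoundTrips` are hypothetical names invented for illustration and are not part of LLVM. The model reduces the DAG combiner and the legalizer to two steps acting on a single pair of adjacent 128-bit non-temporal loads whose base is 32-byte aligned.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget feature query; not LLVM's API.
struct Features {
  bool hasAVX2;
};

// Predicate type: "may a 256-bit, 32-byte-aligned non-temporal load be formed?"
using AllowsWide = bool (*)(const Features &);

// Baseline-style answer: the access is aligned, so it is reported as allowed.
inline bool baselineAllows(const Features &) { return true; }

// Target-aware answer: a 256-bit non-temporal load additionally needs AVX2.
inline bool targetAllows(const Features &F) { return F.hasAVX2; }

// Counts merge/split round trips, giving up after MaxRounds.
inline unsigned countRoundTrips(AllowsWide Allows, const Features &F,
                                unsigned MaxRounds) {
  unsigned WidthBits = 128; // start with two adjacent 128-bit NT loads
  unsigned Rounds = 0;
  while (Rounds < MaxRounds) {
    // Combine step: merge the pair if the predicate says the wide op is fine.
    bool Merged = WidthBits == 128 && Allows(F);
    if (Merged)
      WidthBits = 256;
    // Legalize step: without AVX2 the 256-bit NT load must be split again.
    bool Split = WidthBits == 256 && !F.hasAVX2;
    if (Split)
      WidthBits = 128;
    if (!(Merged && Split))
      break; // fixed point reached, nothing left to undo
    ++Rounds;
  }
  return Rounds;
}
```

In this model, `countRoundTrips(baselineAllows, Features{false}, 8)` exhausts all 8 rounds, while `countRoundTrips(targetAllows, Features{false}, 8)` stops immediately because the merge is never attempted. The real combiner and legalizer are far more involved; the point is only that the predicate and the legalizer must agree.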

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

lebedev.ri created this revision. Jan 14 2023, 1:18 PM

Herald added a project: Restricted Project. Jan 14 2023, 1:18 PM

Herald added subscribers: StephenFan, hiraditya.

lebedev.ri requested review of this revision. Jan 14 2023, 1:18 PM

lebedev.ri mentioned this in D141777: [X86] Reenable store merging post-legalization. Jan 14 2023, 1:25 PM

lebedev.ri added a child revision: D141777: [X86] Reenable store merging post-legalization.

can you improve the summary please? you talk about nt memops, but none of the test changes have anything to do with them

In D141776#4054185, @RKSimon wrote:

can you improve the summary please? you talk about nt memops, but none of the test changes have anything to do with them

I'm open to suggestions. As you can check yourself, with the follow-up patch (and without this patch),
in llvm-project/llvm/test/CodeGen/X86/merge-consecutive-stores-nt.ll
we endlessly try to combine non-temporal stores, only to split them back again:

Combining: t4164: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Creating constant: t4166: i64 = Constant<16>
Creating new node: t4167: i64 = add t2, Constant:i64<16>
Creating new node: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Creating new node: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Creating new node: t4170: ch = TokenFactor t4168:1, t4169:1
Creating new node: t4171: v8f32 = concat_vectors t4168, t4169

Replacing.1 t4164: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64

With: t4171: v8f32 = concat_vectors t4168, t4169
 and 1 other values

Legalizing: t4170: ch = TokenFactor t4168:1, t4169:1
Legal node: nothing to do

Combining: t4170: ch = TokenFactor t4168:1, t4169:1

Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation

Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64

Legalizing: t4167: i64 = add t2, Constant:i64<16>
Legal node: nothing to do

Combining: t4167: i64 = add t2, Constant:i64<16>

Legalizing: t4166: i64 = Constant<16>
Legal node: nothing to do

Combining: t4166: i64 = Constant<16>

Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing non-extending load operation

Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64

Legalizing: t4171: v8f32 = concat_vectors t4168, t4169
Trying custom legalization
Creating new node: t4172: v8f32 = undef
Creating constant: t4173: i64 = Constant<0>
Creating new node: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
Creating constant: t4175: i64 = Constant<4>
Creating new node: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Successfully custom legalized node
 ... replacing: t4171: v8f32 = concat_vectors t4168, t4169
     with:      t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>

Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing non-extending load operation

Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64

Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation

Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64

Legalizing: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Legal node: nothing to do

Combining: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>

Legalizing: t4175: i64 = Constant<4>
Legal node: nothing to do

Combining: t4175: i64 = Constant<4>

Legalizing: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
Legal node: nothing to do

Combining: t4174: v8f32 = insert_subvector undef:v8f32, t4168, Constant:i64<0>
Creating new node: t4177: v4f32 = undef

Legalizing: t4173: i64 = Constant<0>
Legal node: nothing to do

Combining: t4173: i64 = Constant<0>

Legalizing: t4172: v8f32 = undef
Legal node: nothing to do

Combining: t4172: v8f32 = undef

Legalizing: t4165: ch = store<(non-temporal store (s256) into %ir.a1)> t4163, t4176, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Trying custom lowering
Creating new node: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
Creating new node: t4179: i64 = add t4, Constant:i64<16>
Creating new node: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
Creating new node: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
Creating new node: t4182: ch = TokenFactor t4180, t4181
 ... replacing: t4165: ch = store<(non-temporal store (s256) into %ir.a1)> t4163, t4176, t4, undef:i64
     with:      t4182: ch = TokenFactor t4180, t4181

Legalizing: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>
Legal node: nothing to do

Combining: t4176: v8f32 = insert_subvector t4174, t4169, Constant:i64<4>

Legalizing: t4182: ch = TokenFactor t4180, t4181
Legal node: nothing to do

Combining: t4182: ch = TokenFactor t4180, t4181

Legalizing: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store

Combining: t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64
Creating new node: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Creating new node: t4184: ch = TokenFactor t4163, t4183

Replacing.1 t4181: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4163, t4169, t4179, undef:i64

With: t4184: ch = TokenFactor t4163, t4183
 and 0 other values

Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation

Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64

Legalizing: t4184: ch = TokenFactor t4163, t4183
Legal node: nothing to do

Combining: t4184: ch = TokenFactor t4163, t4183

Legalizing: t4182: ch = TokenFactor t4180, t4184
Legal node: nothing to do

Combining: t4182: ch = TokenFactor t4180, t4184
Creating new node: t4185: ch = TokenFactor t4180, t4183
 ... into: t4185: ch = TokenFactor t4180, t4183

Legalizing: t4185: ch = TokenFactor t4180, t4183
Legal node: nothing to do

Combining: t4185: ch = TokenFactor t4180, t4183

Legalizing: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store

Combining: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64

Legalizing: t4179: i64 = add t4, Constant:i64<16>
Legal node: nothing to do

Combining: t4179: i64 = add t4, Constant:i64<16>

Legalizing: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store

Combining: t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64
Creating new node: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
Creating new node: t4187: ch = TokenFactor t4163, t4186

Replacing.1 t4180: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4163, t4178, t4, undef:i64

With: t4187: ch = TokenFactor t4163, t4186
 and 0 other values

Legalizing: t4187: ch = TokenFactor t4163, t4186
Legal node: nothing to do

Combining: t4187: ch = TokenFactor t4163, t4186
Creating new node: t4188: ch = TokenFactor t4186, t4170
 ... into: t4188: ch = TokenFactor t4186, t4170

Legalizing: t4170: ch = TokenFactor t4168:1, t4169:1
Legal node: nothing to do

Combining: t4170: ch = TokenFactor t4168:1, t4169:1

Legalizing: t4188: ch = TokenFactor t4186, t4170
Legal node: nothing to do

Combining: t4188: ch = TokenFactor t4186, t4170
Creating new node: t4189: ch = TokenFactor t4186, t4169:1
 ... into: t4189: ch = TokenFactor t4186, t4169:1

Legalizing: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Legalizing non-extending load operation

Combining: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64

Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation

Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64

Legalizing: t4189: ch = TokenFactor t4186, t4169:1
Legal node: nothing to do

Combining: t4189: ch = TokenFactor t4186, t4169:1

Legalizing: t4185: ch = TokenFactor t4189, t4183
Legal node: nothing to do

Combining: t4185: ch = TokenFactor t4189, t4183
Creating new node: t4190: ch = TokenFactor t4183, t4186
 ... into: t4190: ch = TokenFactor t4183, t4186

Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation

Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64

Legalizing: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store

Combining: t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4169:1, t4169, t4179, undef:i64

Legalizing: t4190: ch = TokenFactor t4183, t4186
Legal node: nothing to do

Combining: t4190: ch = TokenFactor t4183, t4186

Legalizing: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store

Combining: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4178, t4, undef:i64

Legalizing: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
Legal node: nothing to do

Combining: t4178: v4f32 = extract_subvector t4176, Constant:i64<0>
 ... into: t4168: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64

Legalizing: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64
Legalizing non-extending load operation

Combining: t4169: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4167, undef:i64

Legalizing: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4168, t4, undef:i64
Legalizing store operation
Optimizing float store operations
Legal store

Combining: t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4168:1, t4168, t4, undef:i64
Creating new node: t4191: ch = TokenFactor t4168:1, t4169:1
Creating new node: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Creating new node: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64

Replacing.1 t4186: ch = store<(non-temporal store (s128) into %ir.a1, align 32)> t4192:1, t4168, t4, undef:i64

With: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
 and 0 other values

Replacing.1 t4183: ch = store<(non-temporal store (s128) into %ir.a1 + 16, basealign 32)> t4192:1, t4169, t4179, undef:i64

With: t4193: ch = store<(non-temporal store (s256) into %ir.a1)> t4191, t4192, t4, undef:i64
 and 0 other values

Legalizing: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Legalizing non-extending load operation

Combining: t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64
Creating constant: t4194: i64 = Constant<16>
Creating new node: t4195: i64 = add t2, Constant:i64<16>
Creating new node: t4196: v4f32,ch = load<(non-temporal load (s128) from %ir.a0, align 32)> t0, t2, undef:i64
Creating new node: t4197: v4f32,ch = load<(non-temporal load (s128) from %ir.a0 + 16, basealign 32)> t0, t4195, undef:i64
Creating new node: t4198: ch = TokenFactor t4196:1, t4197:1
Creating new node: t4199: v8f32 = concat_vectors t4196, t4197

Replacing.1 t4192: v8f32,ch = load<(non-temporal load (s256) from %ir.a0)> t0, t2, undef:i64

Harbormaster completed remote builds in B207874: Diff 489312. Jan 14 2023, 2:34 PM

@RKSimon updated description:

The baseline `allowsMemoryAccess()` is wrong for X86.
It assumes that aligned memory operations are always allowed,
but that is not true.

For example, without AVX2 we cannot perform a 32-byte-aligned non-temporal load
of a 32-byte vector, yet `allowsMemoryAccess()` says it is allowed.
As a result, we may merge non-temporal loads, only to split them up again
to legalize them, and then repeat the cycle.

NOTE: the test changes here are superfluous. The main effect is that without this change,
in D141777, we'd get stuck endlessly merging and splitting non-temporal stores.

I believe this accurately describes what is going on.
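
As a rough illustration of the rule the description refers to, the size and feature checks in the new override (see the `switch` in the X86ISelLowering.cpp diff) can be modelled in isolation. This sketch is an assumption-laden simplification: `Features`, `MemOp` and `alignedNTVectorOpLegal` are hypothetical names, and it covers only non-temporal vector accesses that are already known to be fully aligned, not the misaligned fallback or the `Fast` out-parameter.

```cpp
#include <cassert>

// Hypothetical feature flags standing in for the subtarget queries.
struct Features {
  bool SSE2, SSE41, AVX, AVX2, AVX512;
};

enum class MemOp { Load, Store };

// Models only the switch over the vector size for an aligned NT vector access.
inline bool alignedNTVectorOpLegal(unsigned SizeInBits, MemOp Op,
                                   const Features &F) {
  switch (SizeInBits) {
  case 128: // NT load needs SSE4.1, NT store needs SSE2
    return Op == MemOp::Load ? F.SSE41 : F.SSE2;
  case 256: // NT load needs AVX2, NT store needs AVX
    return Op == MemOp::Load ? F.AVX2 : F.AVX;
  case 512: // either direction needs AVX-512
    return F.AVX512;
  default: // no NT vector memory ops of this size
    return false;
  }
}
```

With an AVX-only feature set (no AVX2), the model rejects a 256-bit non-temporal load but accepts a 256-bit non-temporal store, which is the asymmetry the description's example relies on.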

pengfei added inline comments. Jan 15 2023, 3:34 AM

llvm/lib/Target/X86/X86ISelLowering.h
1024–1029	Why do we need this if it does the same thing as `TargetLoweringBase`?

lebedev.ri marked an inline comment as done. Jan 15 2023, 4:56 AM

lebedev.ri added inline comments.

llvm/lib/Target/X86/X86ISelLowering.h
1024–1029	Because otherwise we get a compilation failure.

RKSimon added inline comments. Jan 16 2023, 6:32 AM

llvm/lib/Target/X86/X86ISelLowering.h
1024–1029	it looks like not all the TargetLoweringBase::allowsMemoryAccess variants are marked as virtual - that probably needs cleaning up?

lebedev.ri marked 2 inline comments as done. Jan 16 2023, 6:56 AM

lebedev.ri added inline comments.

llvm/lib/Target/X86/X86ISelLowering.h
1024–1029	> that probably needs cleaning up?
ABSOLUTELY NOT! There is only a single true `allowsMemoryAccess()`, all others are just helpers to call it with different types of arguments. If they are all virtual, and not just one of them, then you will essentially need to override them all. It would not be an improvement.

ping

pengfei added inline comments. Jan 18 2023, 8:13 PM

llvm/lib/Target/X86/X86ISelLowering.h
1024–1029	I don't get the point. Why can't we use the helper directly instead of defining a new one here? What are these helpers for if we need to define the same one each time?

lebedev.ri marked an inline comment as done. Jan 19 2023, 5:19 AM

lebedev.ri added inline comments.

llvm/lib/Target/X86/X86ISelLowering.h
1024–1029	Does anyone have comments on the rest of the change, ignoring this particular function? :) If I drop this function, we get a compilation error. We can't make it virtual because then every target would need to override every single `allowsMemoryAccess()`, which is error-prone compared to overriding a single `allowsMemoryAccess()`. I do not understand the feedback here.
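
The compilation failure mentioned in this thread is consistent with ordinary C++ name hiding: once a derived class declares any function named `allowsMemoryAccess`, the base-class overloads of that name are no longer found through the derived type unless they are redeclared. The sketch below reproduces the situation with hypothetical stand-in types (`Operand`, `BaseLoweringModel`, `X86LoweringModel`, `allows`); none of these are LLVM's actual declarations.

```cpp
#include <cassert>

// Hypothetical bundle of access properties, playing the role of a memory operand.
struct Operand {
  int AlignBytes;
};

struct BaseLoweringModel {
  virtual ~BaseLoweringModel() = default;
  // The single overridable query.
  virtual bool allows(int SizeBytes, int AlignBytes) const {
    return AlignBytes >= SizeBytes;
  }
  // Non-virtual convenience overload that forwards to the virtual one.
  bool allows(int SizeBytes, const Operand &Op) const {
    return allows(SizeBytes, Op.AlignBytes);
  }
};

struct X86LoweringModel : BaseLoweringModel {
  // Overriding one overload hides every base overload named `allows`.
  bool allows(int SizeBytes, int AlignBytes) const override {
    return SizeBytes <= 16 && AlignBytes >= SizeBytes;
  }
  // Without this redefinition, X86LoweringModel().allows(16, Operand{32})
  // fails to compile, because the Operand overload is hidden.
  bool allows(int SizeBytes, const Operand &Op) const {
    return allows(SizeBytes, Op.AlignBytes);
  }
};
```

Standard C++ also offers a using-declaration (`using BaseLoweringModel::allows;`) to re-expose hidden overloads. The review does not discuss that option, so whether it would suit the LLVM code is left open here.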

lebedev.ri added a reviewer: craig.topper. Jan 19 2023, 4:02 PM

lebedev.ri added a reviewer: efriedma. Jan 20 2023, 1:24 PM

LGTM with a few minors

llvm/lib/Target/X86/X86ISelLowering.cpp
2734	maybe add a isBitAligned(Align, uint64_t SizeInBits) helper?
llvm/lib/Target/X86/X86ISelLowering.h
1006	(style) isMemoryAccessFast sounds a little better
llvm/test/CodeGen/X86/setcc-wide-types.ll
1241	SSE2/SSE41 might be mergeable here if we add a SSE check-prefix case?
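
The `isBitAligned` helper suggested above ended up in the committed diff as `(8 * Alignment.value()) % SizeInBits == 0`. A small standalone version makes the arithmetic easy to check. This is a sketch: `isBitAlignedBytes` is a hypothetical name, and it takes a plain byte count instead of LLVM's `Align` type.

```cpp
#include <cassert>
#include <cstdint>

// True when the alignment, expressed in bits, is a multiple of the access size.
// AlignBytes is a plain byte count here; the real helper takes llvm::Align.
constexpr bool isBitAlignedBytes(uint64_t AlignBytes, uint64_t SizeInBits) {
  return (8 * AlignBytes) % SizeInBits == 0;
}

// A 32-byte alignment covers a 256-bit access (256 % 256 == 0) ...
static_assert(isBitAlignedBytes(32, 256), "32-byte aligned 256-bit access");
// ... and a 128-bit access (256 % 128 == 0),
static_assert(isBitAlignedBytes(32, 128), "over-aligned 128-bit access");
// but a 16-byte alignment does not cover a 256-bit access (128 % 256 != 0).
static_assert(!isBitAlignedBytes(16, 256), "under-aligned 256-bit access");
```

Because the alignment is a power of two, the modulo test amounts to asking whether the alignment is at least the access size, for power-of-two access sizes.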

This revision is now accepted and ready to land. Jan 21 2023, 10:13 AM

@RKSimon thank you for the review!

llvm/lib/Target/X86/X86ISelLowering.h
1006	Right. This is a typo.

This revision was landed with ongoing or failed builds. Jan 21 2023, 1:18 PM

Closed by commit rG005173cbb609: [X86] `X86TargetLowering`: override `allowsMemoryAccess()` (authored by lebedev.ri). · Explain Why

This revision was automatically updated to reflect the committed changes.

lebedev.ri added a commit: rG005173cbb609: [X86] `X86TargetLowering`: override `allowsMemoryAccess()`.

Harbormaster completed remote builds in B209168: Diff 491097. Jan 21 2023, 2:27 PM

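Revision Contents

Path	Size
llvm/lib/Target/X86/X86ISelLowering.h	19 lines
llvm/lib/Target/X86/X86ISelLowering.cpp	74 lines
llvm/test/CodeGen/X86/add-sub-bool.ll	8 lines
llvm/test/CodeGen/X86/bswap-wide-int.ll	4 lines
llvm/test/CodeGen/X86/fshl.ll	68 lines
llvm/test/CodeGen/X86/fshr.ll	51 lines
llvm/test/CodeGen/X86/i128-add.ll	16 lines
llvm/test/CodeGen/X86/icmp-shift-opt.ll	28 lines
llvm/test/CodeGen/X86/legalize-shl-vec.ll	128 lines
llvm/test/CodeGen/X86/merge-consecutive-stores-nt.ll	8 lines
llvm/test/CodeGen/X86/setcc-wide-types.ll	262 lines
llvm/test/CodeGen/X86/smin.ll	26 lines
llvm/test/CodeGen/X86/smul-with-overflow.ll	4 lines
llvm/test/CodeGen/X86/smulo-128-legalisation-lowering.ll	20 lines
llvm/test/CodeGen/X86/umin.ll	26 lines
llvm/test/CodeGen/X86/umul-with-overflow.ll	4 lines
llvm/test/CodeGen/X86/wide-integer-cmp.ll	2 lines
llvm/test/CodeGen/X86/xaluo128.ll	24 lines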

Diff 491098

llvm/lib/Target/X86/X86ISelLowering.h

public:
/// Returns true if it's safe to use load / store of the		/// Returns true if it's safe to use load / store of the
/// specified type to expand memcpy / memset inline. This is mostly true		/// specified type to expand memcpy / memset inline. This is mostly true
/// for all types except for some special cases. For example, on X86		/// for all types except for some special cases. For example, on X86
/// targets without SSE2 f64 load / store are done with fldl / fstpl which		/// targets without SSE2 f64 load / store are done with fldl / fstpl which
/// also does type conversion. Note the specified type doesn't have to be		/// also does type conversion. Note the specified type doesn't have to be
/// legal as the hook is used before type legalization.		/// legal as the hook is used before type legalization.
bool isSafeMemOpType(MVT VT) const override;		bool isSafeMemOpType(MVT VT) const override;

		bool isMemoryAccessFast(EVT VT, Align Alignment) const;

/// Returns true if the target allows unaligned memory accesses of the		/// Returns true if the target allows unaligned memory accesses of the
/// specified type. Returns whether it is "fast" in the last argument.		/// specified type. Returns whether it is "fast" in the last argument.
bool allowsMisalignedMemoryAccesses(EVT VT, unsigned AS, Align Alignment,		bool allowsMisalignedMemoryAccesses(EVT VT, unsigned AS, Align Alignment,
MachineMemOperand::Flags Flags,		MachineMemOperand::Flags Flags,
unsigned *Fast) const override;		unsigned *Fast) const override;

		/// This function returns true if the memory access is aligned or if the
		/// target allows this specific unaligned memory access. If the access is
		/// allowed, the optional final parameter returns a relative speed of the
		/// access (as defined by the target).
		bool allowsMemoryAccess(
		LLVMContext &Context, const DataLayout &DL, EVT VT, unsigned AddrSpace,
		Align Alignment,
		MachineMemOperand::Flags Flags = MachineMemOperand::MONone,
		unsigned *Fast = nullptr) const override;

		bool allowsMemoryAccess(LLVMContext &Context, const DataLayout &DL, EVT VT,
		const MachineMemOperand &MMO,
		unsigned *Fast) const {
		return allowsMemoryAccess(Context, DL, VT, MMO.getAddrSpace(),
		MMO.getAlign(), MMO.getFlags(), Fast);
		}

/// Provide custom lowering hooks for some operations.		/// Provide custom lowering hooks for some operations.
///		///
SDValue LowerOperation(SDValue Op, SelectionDAG &DAG) const override;		SDValue LowerOperation(SDValue Op, SelectionDAG &DAG) const override;

/// Replace the results of node with an illegal result		/// Replace the results of node with an illegal result
/// type with new values built out of custom code.		/// type with new values built out of custom code.
///		///
void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue>&Results,		void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue>&Results,

llvm/lib/Target/X86/X86ISelLowering.cpp


	bool X86TargetLowering::isSafeMemOpType(MVT VT) const {			bool X86TargetLowering::isSafeMemOpType(MVT VT) const {
	if (VT == MVT::f32)			if (VT == MVT::f32)
	return Subtarget.hasSSE1();			return Subtarget.hasSSE1();
	if (VT == MVT::f64)			if (VT == MVT::f64)
	return Subtarget.hasSSE2();			return Subtarget.hasSSE2();
	return true;			return true;
	}			}

	bool X86TargetLowering::allowsMisalignedMemoryAccesses(			static bool isBitAligned(Align Alignment, uint64_t SizeInBits) {
	EVT VT, unsigned, Align Alignment, MachineMemOperand::Flags Flags,			return (8 * Alignment.value()) % SizeInBits == 0;
	unsigned *Fast) const {			}
	if (Fast) {
				bool X86TargetLowering::isMemoryAccessFast(EVT VT, Align Alignment) const {
				if (isBitAligned(Alignment, VT.getSizeInBits()))
				return true;
	switch (VT.getSizeInBits()) {			switch (VT.getSizeInBits()) {
	default:			default:
	// 8-byte and under are always assumed to be fast.			// 8-byte and under are always assumed to be fast.
	*Fast = 1;			return true;
	break;
	case 128:			case 128:
	*Fast = !Subtarget.isUnalignedMem16Slow();			return !Subtarget.isUnalignedMem16Slow();
	break;
	case 256:			case 256:
	*Fast = !Subtarget.isUnalignedMem32Slow();			return !Subtarget.isUnalignedMem32Slow();
	break;
	// TODO: What about AVX-512 (512-bit) accesses?			// TODO: What about AVX-512 (512-bit) accesses?
	}			}
	}			}

				bool X86TargetLowering::allowsMisalignedMemoryAccesses(
				EVT VT, unsigned, Align Alignment, MachineMemOperand::Flags Flags,
				unsigned *Fast) const {
				if (Fast)
				*Fast = isMemoryAccessFast(VT, Alignment);
	// NonTemporal vector memory ops must be aligned.			// NonTemporal vector memory ops must be aligned.
	if (!!(Flags & MachineMemOperand::MONonTemporal) && VT.isVector()) {			if (!!(Flags & MachineMemOperand::MONonTemporal) && VT.isVector()) {
	// NT loads can only be vector aligned, so if its less aligned than the			// NT loads can only be vector aligned, so if its less aligned than the
	// minimum vector size (which we can split the vector down to), we might as			// minimum vector size (which we can split the vector down to), we might as
	// well use a regular unaligned vector load.			// well use a regular unaligned vector load.
	// We don't have any NT loads pre-SSE41.			// We don't have any NT loads pre-SSE41.
	if (!!(Flags & MachineMemOperand::MOLoad))			if (!!(Flags & MachineMemOperand::MOLoad))
	return (Alignment < 16 || !Subtarget.hasSSE41());			return (Alignment < 16 || !Subtarget.hasSSE41());
	return false;			return false;
	}			}
	// Misaligned accesses of any size are always allowed.			// Misaligned accesses of any size are always allowed.
	return true;			return true;
	}			}

				bool X86TargetLowering::allowsMemoryAccess(LLVMContext &Context,
				const DataLayout &DL, EVT VT,
				unsigned AddrSpace, Align Alignment,
				MachineMemOperand::Flags Flags,
				unsigned *Fast) const {
				if (Fast)
				*Fast = isMemoryAccessFast(VT, Alignment);
				if (!!(Flags & MachineMemOperand::MONonTemporal) && VT.isVector()) {
				if (allowsMisalignedMemoryAccesses(VT, AddrSpace, Alignment, Flags,
				/*Fast=*/nullptr))
				return true;
				// NonTemporal vector memory ops are special, and must be aligned.
				if (!isBitAligned(Alignment, VT.getSizeInBits()))
				return false;
				switch (VT.getSizeInBits()) {
				case 128:
				if (!!(Flags & MachineMemOperand::MOLoad) && Subtarget.hasSSE41())
				return true;
				if (!!(Flags & MachineMemOperand::MOStore) && Subtarget.hasSSE2())
				return true;
				return false;
				case 256:
				if (!!(Flags & MachineMemOperand::MOLoad) && Subtarget.hasAVX2())
				return true;
				if (!!(Flags & MachineMemOperand::MOStore) && Subtarget.hasAVX())
				return true;
				return false;
				case 512:
				if (Subtarget.hasAVX512())
				return true;
				return false;
				default:
				return false; // Don't have NonTemporal vector memory ops of this size.
				}
				}
				return true;
				}

	/// Return the entry encoding for a jump table in the			/// Return the entry encoding for a jump table in the
	/// current function. The returned value is a member of the			/// current function. The returned value is a member of the
	/// MachineJumpTableInfo::JTEntryKind enum.			/// MachineJumpTableInfo::JTEntryKind enum.
	unsigned X86TargetLowering::getJumpTableEncoding() const {			unsigned X86TargetLowering::getJumpTableEncoding() const {
	// In GOT pic mode, each entry in the jump table is emitted as a @GOTOFF			// In GOT pic mode, each entry in the jump table is emitted as a @GOTOFF
	// symbol.			// symbol.
	if (isPositionIndependent() && Subtarget.isPICStyleGOT())			if (isPositionIndependent() && Subtarget.isPICStyleGOT())
	return MachineJumpTableInfo::EK_Custom32;			return MachineJumpTableInfo::EK_Custom32;

llvm/test/CodeGen/X86/add-sub-bool.ll

	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: addl {{[0-9]+}}(%esp), %esi			; X86-NEXT: addl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edi			; X86-NEXT: adcl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: adcl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: btl $5, {{[0-9]+}}(%esp)			; X86-NEXT: btl $5, {{[0-9]+}}(%esp)
	; X86-NEXT: adcl $0, %esi			; X86-NEXT: adcl $0, %esi
	; X86-NEXT: adcl $0, %edi			; X86-NEXT: adcl $0, %edi
	; X86-NEXT: adcl $0, %edx
	; X86-NEXT: adcl $0, %ecx			; X86-NEXT: adcl $0, %ecx
				; X86-NEXT: adcl $0, %edx
	; X86-NEXT: movl %edi, 4(%eax)			; X86-NEXT: movl %edi, 4(%eax)
	; X86-NEXT: movl %esi, (%eax)			; X86-NEXT: movl %esi, (%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %ecx, 8(%eax)
	; X86-NEXT: movl %ecx, 12(%eax)			; X86-NEXT: movl %edx, 12(%eax)
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	;			;
	; X64-LABEL: test_i128_add_add_idx:			; X64-LABEL: test_i128_add_add_idx:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: addq %rdx, %rax			; X64-NEXT: addq %rdx, %rax

llvm/test/CodeGen/X86/bswap-wide-int.ll

	; X86-MOVBE-NEXT: pushl %esi			; X86-MOVBE-NEXT: pushl %esi
	; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-MOVBE-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-MOVBE-NEXT: movbel %esi, 12(%eax)			; X86-MOVBE-NEXT: movbel %esi, 12(%eax)
	; X86-MOVBE-NEXT: movbel %edi, 8(%eax)			; X86-MOVBE-NEXT: movbel %edi, 8(%eax)
	; X86-MOVBE-NEXT: movbel %edx, 4(%eax)			; X86-MOVBE-NEXT: movbel %ecx, 4(%eax)
	; X86-MOVBE-NEXT: movbel %ecx, (%eax)			; X86-MOVBE-NEXT: movbel %edx, (%eax)
	; X86-MOVBE-NEXT: popl %esi			; X86-MOVBE-NEXT: popl %esi
	; X86-MOVBE-NEXT: popl %edi			; X86-MOVBE-NEXT: popl %edi
	; X86-MOVBE-NEXT: retl $4			; X86-MOVBE-NEXT: retl $4
	;			;
	; X64-LABEL: bswap_i128:			; X64-LABEL: bswap_i128:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rsi, %rax			; X64-NEXT: movq %rsi, %rax
	; X64-NEXT: bswapq %rax			; X64-NEXT: bswapq %rax

llvm/test/CodeGen/X86/fshl.ll

	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-FAST-NEXT: testb $64, %cl			; X86-FAST-NEXT: testb $64, %cl
	; X86-FAST-NEXT: jne .LBB6_1			; X86-FAST-NEXT: jne .LBB6_1
	; X86-FAST-NEXT: # %bb.2:			; X86-FAST-NEXT: # %bb.2:
	; X86-FAST-NEXT: movl %edi, %eax
	; X86-FAST-NEXT: movl %esi, %edi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-FAST-NEXT: movl %ebx, %ebp			; X86-FAST-NEXT: movl %ebx, %ebp
	; X86-FAST-NEXT: movl %edx, %ebx			; X86-FAST-NEXT: movl %edx, %ebx
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edx
				; X86-FAST-NEXT: movl %edi, %eax
				; X86-FAST-NEXT: movl %esi, %edi
				; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-FAST-NEXT: testb $32, %cl			; X86-FAST-NEXT: testb $32, %cl
	; X86-FAST-NEXT: je .LBB6_5			; X86-FAST-NEXT: je .LBB6_5
	; X86-FAST-NEXT: .LBB6_4:			; X86-FAST-NEXT: .LBB6_4:
	; X86-FAST-NEXT: movl %edx, %esi			; X86-FAST-NEXT: movl %edx, %esi
	; X86-FAST-NEXT: movl %edi, %edx			; X86-FAST-NEXT: movl %edi, %edx
	; X86-FAST-NEXT: movl %ebx, %edi			; X86-FAST-NEXT: movl %ebx, %edi
	; X86-FAST-NEXT: movl %eax, %ebx			; X86-FAST-NEXT: movl %eax, %ebx
	; X86-FAST-NEXT: jmp .LBB6_6			; X86-FAST-NEXT: jmp .LBB6_6
	(27 lines not shown)
	;			;
	; X86-SLOW-LABEL: var_shift_i128:			; X86-SLOW-LABEL: var_shift_i128:
	; X86-SLOW: # %bb.0:			; X86-SLOW: # %bb.0:
	; X86-SLOW-NEXT: pushl %ebp			; X86-SLOW-NEXT: pushl %ebp
	; X86-SLOW-NEXT: pushl %ebx			; X86-SLOW-NEXT: pushl %ebx
	; X86-SLOW-NEXT: pushl %edi			; X86-SLOW-NEXT: pushl %edi
	; X86-SLOW-NEXT: pushl %esi			; X86-SLOW-NEXT: pushl %esi
	; X86-SLOW-NEXT: pushl %eax			; X86-SLOW-NEXT: pushl %eax
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-SLOW-NEXT: testb $64, %al			; X86-SLOW-NEXT: testb $64, %al
	; X86-SLOW-NEXT: jne .LBB6_1			; X86-SLOW-NEXT: jne .LBB6_1
	; X86-SLOW-NEXT: # %bb.2:			; X86-SLOW-NEXT: # %bb.2:
	; X86-SLOW-NEXT: movl %ebp, %ecx			; X86-SLOW-NEXT: movl %edx, %ebp
	; X86-SLOW-NEXT: movl %edi, %ebp			; X86-SLOW-NEXT: movl %edi, %edx
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-SLOW-NEXT: movl %edx, %ebx			; X86-SLOW-NEXT: movl %ebx, %ecx
	; X86-SLOW-NEXT: movl %esi, %edx			; X86-SLOW-NEXT: movl %esi, %ebx
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-SLOW-NEXT: testb $32, %al			; X86-SLOW-NEXT: testb $32, %al
	; X86-SLOW-NEXT: je .LBB6_5			; X86-SLOW-NEXT: je .LBB6_5
	; X86-SLOW-NEXT: .LBB6_4:			; X86-SLOW-NEXT: .LBB6_4:
	; X86-SLOW-NEXT: movl %esi, (%esp) # 4-byte Spill			; X86-SLOW-NEXT: movl %edi, (%esp) # 4-byte Spill
	; X86-SLOW-NEXT: movl %ebp, %esi			; X86-SLOW-NEXT: movl %ebx, %edi
	; X86-SLOW-NEXT: movl %edx, %ebp			; X86-SLOW-NEXT: movl %edx, %ebx
	; X86-SLOW-NEXT: movl %ecx, %edx			; X86-SLOW-NEXT: movl %ecx, %edx
	; X86-SLOW-NEXT: jmp .LBB6_6			; X86-SLOW-NEXT: jmp .LBB6_6
	; X86-SLOW-NEXT: .LBB6_1:			; X86-SLOW-NEXT: .LBB6_1:
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-SLOW-NEXT: testb $32, %al			; X86-SLOW-NEXT: testb $32, %al
	; X86-SLOW-NEXT: jne .LBB6_4			; X86-SLOW-NEXT: jne .LBB6_4
	; X86-SLOW-NEXT: .LBB6_5:			; X86-SLOW-NEXT: .LBB6_5:
	; X86-SLOW-NEXT: movl %ecx, %ebx			; X86-SLOW-NEXT: movl %ecx, %ebp
	; X86-SLOW-NEXT: movl %edi, (%esp) # 4-byte Spill			; X86-SLOW-NEXT: movl %esi, (%esp) # 4-byte Spill
	; X86-SLOW-NEXT: .LBB6_6:			; X86-SLOW-NEXT: .LBB6_6:
	; X86-SLOW-NEXT: movl %edx, %edi			; X86-SLOW-NEXT: movl %edx, %esi
	; X86-SLOW-NEXT: movl %eax, %ecx			; X86-SLOW-NEXT: movl %eax, %ecx
	; X86-SLOW-NEXT: shll %cl, %edi			; X86-SLOW-NEXT: shll %cl, %esi
	; X86-SLOW-NEXT: shrl %ebx			; X86-SLOW-NEXT: shrl %ebp
	; X86-SLOW-NEXT: movb %al, %ch			; X86-SLOW-NEXT: movb %al, %ch
	; X86-SLOW-NEXT: notb %ch			; X86-SLOW-NEXT: notb %ch
	; X86-SLOW-NEXT: movb %ch, %cl			; X86-SLOW-NEXT: movb %ch, %cl
	; X86-SLOW-NEXT: shrl %cl, %ebx			; X86-SLOW-NEXT: shrl %cl, %ebp
	; X86-SLOW-NEXT: orl %edi, %ebx			; X86-SLOW-NEXT: orl %esi, %ebp
	; X86-SLOW-NEXT: movl %ebp, %edi			; X86-SLOW-NEXT: movl %ebx, %esi
	; X86-SLOW-NEXT: movb %al, %cl			; X86-SLOW-NEXT: movb %al, %cl
	; X86-SLOW-NEXT: shll %cl, %edi			; X86-SLOW-NEXT: shll %cl, %esi
	; X86-SLOW-NEXT: shrl %edx			; X86-SLOW-NEXT: shrl %edx
	; X86-SLOW-NEXT: movb %ch, %cl			; X86-SLOW-NEXT: movb %ch, %cl
	; X86-SLOW-NEXT: shrl %cl, %edx			; X86-SLOW-NEXT: shrl %cl, %edx
	; X86-SLOW-NEXT: orl %edi, %edx			; X86-SLOW-NEXT: orl %esi, %edx
	; X86-SLOW-NEXT: movl %esi, %edi			; X86-SLOW-NEXT: movl %edi, %esi
	; X86-SLOW-NEXT: movb %al, %cl			; X86-SLOW-NEXT: movb %al, %cl
	; X86-SLOW-NEXT: shll %cl, %edi			; X86-SLOW-NEXT: shll %cl, %esi
	; X86-SLOW-NEXT: shrl %ebp			; X86-SLOW-NEXT: shrl %ebx
	; X86-SLOW-NEXT: movb %ch, %cl			; X86-SLOW-NEXT: movb %ch, %cl
	; X86-SLOW-NEXT: shrl %cl, %ebp			; X86-SLOW-NEXT: shrl %cl, %ebx
	; X86-SLOW-NEXT: orl %edi, %ebp			; X86-SLOW-NEXT: orl %esi, %ebx
	; X86-SLOW-NEXT: movb %al, %cl			; X86-SLOW-NEXT: movb %al, %cl
	; X86-SLOW-NEXT: movl (%esp), %eax # 4-byte Reload			; X86-SLOW-NEXT: movl (%esp), %eax # 4-byte Reload
	; X86-SLOW-NEXT: shll %cl, %eax			; X86-SLOW-NEXT: shll %cl, %eax
	; X86-SLOW-NEXT: shrl %esi			; X86-SLOW-NEXT: shrl %edi
	; X86-SLOW-NEXT: movb %ch, %cl			; X86-SLOW-NEXT: movb %ch, %cl
	; X86-SLOW-NEXT: shrl %cl, %esi			; X86-SLOW-NEXT: shrl %cl, %edi
	; X86-SLOW-NEXT: orl %eax, %esi			; X86-SLOW-NEXT: orl %eax, %edi
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-SLOW-NEXT: movl %esi, 12(%eax)			; X86-SLOW-NEXT: movl %edi, 12(%eax)
	; X86-SLOW-NEXT: movl %ebp, 8(%eax)			; X86-SLOW-NEXT: movl %ebx, 8(%eax)
	; X86-SLOW-NEXT: movl %edx, 4(%eax)			; X86-SLOW-NEXT: movl %edx, 4(%eax)
	; X86-SLOW-NEXT: movl %ebx, (%eax)			; X86-SLOW-NEXT: movl %ebp, (%eax)
	; X86-SLOW-NEXT: addl $4, %esp			; X86-SLOW-NEXT: addl $4, %esp
	; X86-SLOW-NEXT: popl %esi			; X86-SLOW-NEXT: popl %esi
	; X86-SLOW-NEXT: popl %edi			; X86-SLOW-NEXT: popl %edi
	; X86-SLOW-NEXT: popl %ebx			; X86-SLOW-NEXT: popl %ebx
	; X86-SLOW-NEXT: popl %ebp			; X86-SLOW-NEXT: popl %ebp
	; X86-SLOW-NEXT: retl $4			; X86-SLOW-NEXT: retl $4
	;			;
	; X64-FAST-LABEL: var_shift_i128:			; X64-FAST-LABEL: var_shift_i128:

llvm/test/CodeGen/X86/fshr.ll

	; X86-FAST-LABEL: var_shift_i128:			; X86-FAST-LABEL: var_shift_i128:
	; X86-FAST: # %bb.0:			; X86-FAST: # %bb.0:
	; X86-FAST-NEXT: pushl %ebp			; X86-FAST-NEXT: pushl %ebp
	; X86-FAST-NEXT: pushl %ebx			; X86-FAST-NEXT: pushl %ebx
	; X86-FAST-NEXT: pushl %edi			; X86-FAST-NEXT: pushl %edi
	; X86-FAST-NEXT: pushl %esi			; X86-FAST-NEXT: pushl %esi
	; X86-FAST-NEXT: pushl %eax			; X86-FAST-NEXT: pushl %eax
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-FAST-NEXT: testb $64, %cl			; X86-FAST-NEXT: testb $64, %cl
	; X86-FAST-NEXT: je .LBB6_1			; X86-FAST-NEXT: je .LBB6_1
	; X86-FAST-NEXT: # %bb.2:			; X86-FAST-NEXT: # %bb.2:
				; X86-FAST-NEXT: movl %edx, (%esp) # 4-byte Spill
				; X86-FAST-NEXT: movl %esi, %edx
				; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-FAST-NEXT: movl %edi, %ebp			; X86-FAST-NEXT: movl %edi, %ebp
	; X86-FAST-NEXT: movl %ebx, %edi			; X86-FAST-NEXT: movl %ebx, %edi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-FAST-NEXT: movl %esi, (%esp) # 4-byte Spill
	; X86-FAST-NEXT: movl %edx, %esi
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-FAST-NEXT: testb $32, %cl			; X86-FAST-NEXT: testb $32, %cl
	; X86-FAST-NEXT: je .LBB6_4			; X86-FAST-NEXT: je .LBB6_4
	; X86-FAST-NEXT: jmp .LBB6_5			; X86-FAST-NEXT: jmp .LBB6_5
	; X86-FAST-NEXT: .LBB6_1:			; X86-FAST-NEXT: .LBB6_1:
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; X86-FAST-NEXT: movl %ebp, (%esp) # 4-byte Spill			; X86-FAST-NEXT: movl %ebp, (%esp) # 4-byte Spill
	; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X86-FAST-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; X86-FAST-NEXT: testb $32, %cl			; X86-FAST-NEXT: testb $32, %cl
	; X86-FAST-NEXT: jne .LBB6_5			; X86-FAST-NEXT: jne .LBB6_5
	; X86-FAST-NEXT: .LBB6_4:			; X86-FAST-NEXT: .LBB6_4:
	; X86-FAST-NEXT: movl %edx, %ebx			; X86-FAST-NEXT: movl %esi, %ebx
	; X86-FAST-NEXT: movl %edi, %edx			; X86-FAST-NEXT: movl %edi, %esi
	; X86-FAST-NEXT: movl %esi, %edi			; X86-FAST-NEXT: movl %edx, %edi
	; X86-FAST-NEXT: movl %ebp, %esi			; X86-FAST-NEXT: movl %ebp, %edx
	; X86-FAST-NEXT: movl (%esp), %ebp # 4-byte Reload			; X86-FAST-NEXT: movl (%esp), %ebp # 4-byte Reload
	; X86-FAST-NEXT: .LBB6_5:			; X86-FAST-NEXT: .LBB6_5:
	; X86-FAST-NEXT: shrdl %cl, %esi, %ebp			; X86-FAST-NEXT: shrdl %cl, %edx, %ebp
	; X86-FAST-NEXT: shrdl %cl, %edi, %esi			; X86-FAST-NEXT: shrdl %cl, %edi, %edx
	; X86-FAST-NEXT: shrdl %cl, %edx, %edi			; X86-FAST-NEXT: shrdl %cl, %esi, %edi
	; X86-FAST-NEXT: # kill: def $cl killed $cl killed $ecx			; X86-FAST-NEXT: # kill: def $cl killed $cl killed $ecx
	; X86-FAST-NEXT: shrdl %cl, %ebx, %edx			; X86-FAST-NEXT: shrdl %cl, %ebx, %esi
	; X86-FAST-NEXT: movl %edx, 12(%eax)			; X86-FAST-NEXT: movl %esi, 12(%eax)
	; X86-FAST-NEXT: movl %edi, 8(%eax)			; X86-FAST-NEXT: movl %edi, 8(%eax)
	; X86-FAST-NEXT: movl %esi, 4(%eax)			; X86-FAST-NEXT: movl %edx, 4(%eax)
	; X86-FAST-NEXT: movl %ebp, (%eax)			; X86-FAST-NEXT: movl %ebp, (%eax)
	; X86-FAST-NEXT: addl $4, %esp			; X86-FAST-NEXT: addl $4, %esp
	; X86-FAST-NEXT: popl %esi			; X86-FAST-NEXT: popl %esi
	; X86-FAST-NEXT: popl %edi			; X86-FAST-NEXT: popl %edi
	; X86-FAST-NEXT: popl %ebx			; X86-FAST-NEXT: popl %ebx
	; X86-FAST-NEXT: popl %ebp			; X86-FAST-NEXT: popl %ebp
	; X86-FAST-NEXT: retl $4			; X86-FAST-NEXT: retl $4
	;			;
	; X86-SLOW-LABEL: var_shift_i128:			; X86-SLOW-LABEL: var_shift_i128:
	; X86-SLOW: # %bb.0:			; X86-SLOW: # %bb.0:
	; X86-SLOW-NEXT: pushl %ebp			; X86-SLOW-NEXT: pushl %ebp
	; X86-SLOW-NEXT: pushl %ebx			; X86-SLOW-NEXT: pushl %ebx
	; X86-SLOW-NEXT: pushl %edi			; X86-SLOW-NEXT: pushl %edi
	; X86-SLOW-NEXT: pushl %esi			; X86-SLOW-NEXT: pushl %esi
	; X86-SLOW-NEXT: subl $8, %esp			; X86-SLOW-NEXT: subl $8, %esp
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebp
				; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-SLOW-NEXT: testb $64, %cl			; X86-SLOW-NEXT: testb $64, %cl
	; X86-SLOW-NEXT: je .LBB6_1			; X86-SLOW-NEXT: je .LBB6_1
	; X86-SLOW-NEXT: # %bb.2:			; X86-SLOW-NEXT: # %bb.2:
	; X86-SLOW-NEXT: movl %ebx, %edx
	; X86-SLOW-NEXT: movl %edi, %ebx
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-SLOW-NEXT: movl %ebp, %eax			; X86-SLOW-NEXT: movl %ebp, %eax
	; X86-SLOW-NEXT: movl %esi, %ebp			; X86-SLOW-NEXT: movl %ebx, %ebp
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-SLOW-NEXT: movl %esi, %edx
				; X86-SLOW-NEXT: movl %edi, %esi
				; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-SLOW-NEXT: testb $32, %cl			; X86-SLOW-NEXT: testb $32, %cl
	; X86-SLOW-NEXT: jne .LBB6_5			; X86-SLOW-NEXT: jne .LBB6_5
	; X86-SLOW-NEXT: .LBB6_4:			; X86-SLOW-NEXT: .LBB6_4:
	; X86-SLOW-NEXT: movl %esi, %edi			; X86-SLOW-NEXT: movl %ebx, %edi
	; X86-SLOW-NEXT: movl %ebx, (%esp) # 4-byte Spill			; X86-SLOW-NEXT: movl %esi, (%esp) # 4-byte Spill
	; X86-SLOW-NEXT: movl %ebp, %esi			; X86-SLOW-NEXT: movl %ebp, %esi
	; X86-SLOW-NEXT: movl %edx, %ebp			; X86-SLOW-NEXT: movl %edx, %ebp
	; X86-SLOW-NEXT: movl %eax, %edx			; X86-SLOW-NEXT: movl %eax, %edx
	; X86-SLOW-NEXT: jmp .LBB6_6			; X86-SLOW-NEXT: jmp .LBB6_6
	; X86-SLOW-NEXT: .LBB6_1:			; X86-SLOW-NEXT: .LBB6_1:
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-SLOW-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-SLOW-NEXT: testb $32, %cl			; X86-SLOW-NEXT: testb $32, %cl
	; X86-SLOW-NEXT: je .LBB6_4			; X86-SLOW-NEXT: je .LBB6_4
	; X86-SLOW-NEXT: .LBB6_5:			; X86-SLOW-NEXT: .LBB6_5:
	; X86-SLOW-NEXT: movl %esi, (%esp) # 4-byte Spill			; X86-SLOW-NEXT: movl %ebx, (%esp) # 4-byte Spill
	; X86-SLOW-NEXT: movl %ebx, %esi
	; X86-SLOW-NEXT: .LBB6_6:			; X86-SLOW-NEXT: .LBB6_6:
	; X86-SLOW-NEXT: shrl %cl, %edx			; X86-SLOW-NEXT: shrl %cl, %edx
	; X86-SLOW-NEXT: movl %ecx, %ebx			; X86-SLOW-NEXT: movl %ecx, %ebx
	; X86-SLOW-NEXT: notb %bl			; X86-SLOW-NEXT: notb %bl
	; X86-SLOW-NEXT: leal (%ebp,%ebp), %eax			; X86-SLOW-NEXT: leal (%ebp,%ebp), %eax
	; X86-SLOW-NEXT: movl %ebx, %ecx			; X86-SLOW-NEXT: movl %ebx, %ecx
	; X86-SLOW-NEXT: shll %cl, %eax			; X86-SLOW-NEXT: shll %cl, %eax
	; X86-SLOW-NEXT: orl %edx, %eax			; X86-SLOW-NEXT: orl %edx, %eax

llvm/test/CodeGen/X86/i128-add.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefixes=X86			; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefixes=X86
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefixes=X64			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefixes=X64

	define i128 @add_i128(i128 %x, i128 %y) nounwind {			define i128 @add_i128(i128 %x, i128 %y) nounwind {
	; X86-LABEL: add_i128:			; X86-LABEL: add_i128:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: addl {{[0-9]+}}(%esp), %esi			; X86-NEXT: addl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edi			; X86-NEXT: adcl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: adcl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: addl $1, %esi			; X86-NEXT: addl $1, %esi
	; X86-NEXT: adcl $0, %edi			; X86-NEXT: adcl $0, %edi
	; X86-NEXT: adcl $0, %edx
	; X86-NEXT: adcl $0, %ecx			; X86-NEXT: adcl $0, %ecx
				; X86-NEXT: adcl $0, %edx
	; X86-NEXT: movl %edi, 4(%eax)			; X86-NEXT: movl %edi, 4(%eax)
	; X86-NEXT: movl %esi, (%eax)			; X86-NEXT: movl %esi, (%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %ecx, 8(%eax)
	; X86-NEXT: movl %ecx, 12(%eax)			; X86-NEXT: movl %edx, 12(%eax)
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	;			;
	; X64-LABEL: add_i128:			; X64-LABEL: add_i128:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: addq %rdx, %rax			; X64-NEXT: addq %rdx, %rax
	(15 lines not shown)
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: addl {{[0-9]+}}(%esp), %esi			; X86-NEXT: addl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edi			; X86-NEXT: adcl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: adcl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: addl $1, %esi			; X86-NEXT: addl $1, %esi
	; X86-NEXT: adcl $0, %edi			; X86-NEXT: adcl $0, %edi
	; X86-NEXT: adcl $0, %edx
	; X86-NEXT: adcl $0, %ecx			; X86-NEXT: adcl $0, %ecx
				; X86-NEXT: adcl $0, %edx
	; X86-NEXT: movl %edi, 4(%eax)			; X86-NEXT: movl %edi, 4(%eax)
	; X86-NEXT: movl %esi, (%eax)			; X86-NEXT: movl %esi, (%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %ecx, 8(%eax)
	; X86-NEXT: movl %ecx, 12(%eax)			; X86-NEXT: movl %edx, 12(%eax)
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	;			;
	; X64-LABEL: add_v1i128:			; X64-LABEL: add_v1i128:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: addq %rdx, %rax			; X64-NEXT: addq %rdx, %rax

llvm/test/CodeGen/X86/icmp-shift-opt.ll

exit:		exit:
ret i128 %inc		ret i128 %inc
}		}

define i1 @opt_setcc_srl_eq_zero(i128 %a) nounwind {		define i1 @opt_setcc_srl_eq_zero(i128 %a) nounwind {
; X86-LABEL: opt_setcc_srl_eq_zero:		; X86-LABEL: opt_setcc_srl_eq_zero:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: orl {{[0-9]+}}(%esp), %eax		; X86-NEXT: orl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: orl %ecx, %edx
; X86-NEXT: orl %eax, %edx		; X86-NEXT: orl %eax, %edx
; X86-NEXT: orl %ecx, %eax		; X86-NEXT: orl %ecx, %edx
; X86-NEXT: shldl $15, %edx, %eax		; X86-NEXT: orl %eax, %ecx
		; X86-NEXT: shldl $15, %edx, %ecx
; X86-NEXT: sete %al		; X86-NEXT: sete %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: opt_setcc_srl_eq_zero:		; X64-LABEL: opt_setcc_srl_eq_zero:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: shrq $17, %rdi		; X64-NEXT: shrq $17, %rdi
; X64-NEXT: orq %rsi, %rdi		; X64-NEXT: orq %rsi, %rdi
; X64-NEXT: sete %al		; X64-NEXT: sete %al
; X64-NEXT: retq		; X64-NEXT: retq
%srl = lshr i128 %a, 17		%srl = lshr i128 %a, 17
%cmp = icmp eq i128 %srl, 0		%cmp = icmp eq i128 %srl, 0
ret i1 %cmp		ret i1 %cmp
}		}

define i1 @opt_setcc_srl_ne_zero(i128 %a) nounwind {		define i1 @opt_setcc_srl_ne_zero(i128 %a) nounwind {
; X86-LABEL: opt_setcc_srl_ne_zero:		; X86-LABEL: opt_setcc_srl_ne_zero:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: orl {{[0-9]+}}(%esp), %eax		; X86-NEXT: orl {{[0-9]+}}(%esp), %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: orl %ecx, %edx
; X86-NEXT: orl %eax, %edx		; X86-NEXT: orl %eax, %edx
; X86-NEXT: orl %ecx, %eax		; X86-NEXT: orl %ecx, %edx
; X86-NEXT: shldl $15, %edx, %eax		; X86-NEXT: orl %eax, %ecx
		; X86-NEXT: shldl $15, %edx, %ecx
; X86-NEXT: setne %al		; X86-NEXT: setne %al
; X86-NEXT: retl		; X86-NEXT: retl
;		;
; X64-LABEL: opt_setcc_srl_ne_zero:		; X64-LABEL: opt_setcc_srl_ne_zero:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: shrq $17, %rdi		; X64-NEXT: shrq $17, %rdi
; X64-NEXT: orq %rsi, %rdi		; X64-NEXT: orq %rsi, %rdi
; X64-NEXT: setne %al		; X64-NEXT: setne %al
(51 lines not shown)

; Negative test: optimization should not be applied if shift has multiple users.		; Negative test: optimization should not be applied if shift has multiple users.
define i1 @opt_setcc_shl_eq_zero_multiple_shl_users(i128 %a) nounwind {		define i1 @opt_setcc_shl_eq_zero_multiple_shl_users(i128 %a) nounwind {
; X86-LABEL: opt_setcc_shl_eq_zero_multiple_shl_users:		; X86-LABEL: opt_setcc_shl_eq_zero_multiple_shl_users:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: pushl %ebx		; X86-NEXT: pushl %ebx
; X86-NEXT: pushl %edi		; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi		; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
		; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
; X86-NEXT: shldl $17, %esi, %edx		; X86-NEXT: shldl $17, %edx, %esi
; X86-NEXT: shldl $17, %ecx, %esi		; X86-NEXT: shldl $17, %ecx, %edx
; X86-NEXT: shldl $17, %eax, %ecx		; X86-NEXT: shldl $17, %eax, %ecx
; X86-NEXT: shll $17, %eax		; X86-NEXT: shll $17, %eax
; X86-NEXT: movl %ecx, %edi		; X86-NEXT: movl %ecx, %edi
; X86-NEXT: orl %edx, %edi		; X86-NEXT: orl %esi, %edi
; X86-NEXT: movl %eax, %ebx		; X86-NEXT: movl %eax, %ebx
; X86-NEXT: orl %esi, %ebx		; X86-NEXT: orl %edx, %ebx
; X86-NEXT: orl %edi, %ebx		; X86-NEXT: orl %edi, %ebx
; X86-NEXT: sete %bl		; X86-NEXT: sete %bl
; X86-NEXT: pushl %edx
; X86-NEXT: pushl %esi		; X86-NEXT: pushl %esi
		; X86-NEXT: pushl %edx
; X86-NEXT: pushl %ecx		; X86-NEXT: pushl %ecx
; X86-NEXT: pushl %eax		; X86-NEXT: pushl %eax
; X86-NEXT: calll use@PLT		; X86-NEXT: calll use@PLT
; X86-NEXT: addl $16, %esp		; X86-NEXT: addl $16, %esp
; X86-NEXT: movl %ebx, %eax		; X86-NEXT: movl %ebx, %eax
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx

llvm/test/CodeGen/X86/legalize-shl-vec.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X32			; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=X32
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64			; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=X64

	define <2 x i256> @test_shl(<2 x i256> %In) {			define <2 x i256> @test_shl(<2 x i256> %In) {
	; X32-LABEL: test_shl:			; X32-LABEL: test_shl:
	; X32: # %bb.0:			; X32: # %bb.0:
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X32-NEXT: shldl $2, %ecx, %edx			; X32-NEXT: shldl $2, %ecx, %edx
	; X32-NEXT: movl %edx, 60(%eax)			; X32-NEXT: movl %edx, 60(%eax)
	; X32-NEXT: movl {{[0-9]+}}(%esp), %edx			; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X32-NEXT: shldl $2, %edx, %ecx			; X32-NEXT: shldl $2, %edx, %ecx
	; X32-NEXT: movl %ecx, 56(%eax)			; X32-NEXT: movl %ecx, 56(%eax)
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-NEXT: shldl $2, %ecx, %edx			; X32-NEXT: shldl $2, %ecx, %edx
	; X32-NEXT: movl %edx, 52(%eax)			; X32-NEXT: movl %edx, 52(%eax)
	(24 lines not shown)
	; X32-NEXT: retl $4			; X32-NEXT: retl $4
	;			;
	; X64-LABEL: test_shl:			; X64-LABEL: test_shl:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdi			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdi
	; X64-NEXT: shldq $2, %rcx, %rdx			; X64-NEXT: shldq $2, %rdx, %rcx
	; X64-NEXT: shldq $2, %rdi, %rcx			; X64-NEXT: shldq $2, %rdi, %rdx
	; X64-NEXT: shldq $2, %r9, %rdi			; X64-NEXT: shldq $2, %r9, %rdi
	; X64-NEXT: shlq $63, %rsi			; X64-NEXT: shlq $63, %rsi
	; X64-NEXT: shlq $2, %r9			; X64-NEXT: shlq $2, %r9
	; X64-NEXT: movq %rdx, 56(%rax)			; X64-NEXT: movq %rcx, 56(%rax)
	; X64-NEXT: movq %rcx, 48(%rax)			; X64-NEXT: movq %rdx, 48(%rax)
	; X64-NEXT: movq %rdi, 40(%rax)			; X64-NEXT: movq %rdi, 40(%rax)
	; X64-NEXT: movq %r9, 32(%rax)			; X64-NEXT: movq %r9, 32(%rax)
	; X64-NEXT: movq %rsi, 24(%rax)			; X64-NEXT: movq %rsi, 24(%rax)
	; X64-NEXT: xorps %xmm0, %xmm0			; X64-NEXT: xorps %xmm0, %xmm0
	; X64-NEXT: movaps %xmm0, (%rax)			; X64-NEXT: movaps %xmm0, (%rax)
	; X64-NEXT: movq $0, 16(%rax)			; X64-NEXT: movq $0, 16(%rax)
	; X64-NEXT: retq			; X64-NEXT: retq
	%Amt = insertelement <2 x i256> <i256 1, i256 2>, i256 255, i32 0			%Amt = insertelement <2 x i256> <i256 1, i256 2>, i256 255, i32 0
	(13 lines not shown)
	; X32-NEXT: pushl %esi			; X32-NEXT: pushl %esi
	; X32-NEXT: .cfi_def_cfa_offset 20			; X32-NEXT: .cfi_def_cfa_offset 20
	; X32-NEXT: subl $8, %esp			; X32-NEXT: subl $8, %esp
	; X32-NEXT: .cfi_def_cfa_offset 28			; X32-NEXT: .cfi_def_cfa_offset 28
	; X32-NEXT: .cfi_offset %esi, -20			; X32-NEXT: .cfi_offset %esi, -20
	; X32-NEXT: .cfi_offset %edi, -16			; X32-NEXT: .cfi_offset %edi, -16
	; X32-NEXT: .cfi_offset %ebx, -12			; X32-NEXT: .cfi_offset %ebx, -12
	; X32-NEXT: .cfi_offset %ebp, -8			; X32-NEXT: .cfi_offset %ebp, -8
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X32-NEXT: movl {{[0-9]+}}(%esp), %edx			; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; X32-NEXT: movl %ebx, %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X32-NEXT: movl %ebp, %ebx
				; X32-NEXT: shldl $28, %edx, %ebx
				; X32-NEXT: shldl $28, %esi, %edx
				; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X32-NEXT: shldl $28, %ecx, %esi
				; X32-NEXT: movl %esi, (%esp) # 4-byte Spill
	; X32-NEXT: shldl $28, %edi, %ecx			; X32-NEXT: shldl $28, %edi, %ecx
	; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X32-NEXT: shldl $28, %eax, %edi
	; X32-NEXT: shldl $28, %esi, %edi			; X32-NEXT: movl %eax, %esi
	; X32-NEXT: shldl $28, %edx, %esi
	; X32-NEXT: shldl $28, %eax, %edx
	; X32-NEXT: shldl $28, %ebp, %eax
	; X32-NEXT: movl %eax, (%esp) # 4-byte Spill
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: shldl $28, %eax, %ebp			; X32-NEXT: shldl $28, %eax, %esi
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X32-NEXT: shrdl $4, %eax, %ecx			; X32-NEXT: shrdl $4, %eax, %edx
	; X32-NEXT: shrl $4, %ebx			; X32-NEXT: shrl $4, %ebp
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: movl %ebx, 60(%eax)			; X32-NEXT: movl %ebp, 60(%eax)
	; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
	; X32-NEXT: movl %ebx, 56(%eax)			; X32-NEXT: movl %ebx, 56(%eax)
	; X32-NEXT: movl %edi, 52(%eax)			; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
	; X32-NEXT: movl %esi, 48(%eax)			; X32-NEXT: movl %ebx, 52(%eax)
	; X32-NEXT: movl %edx, 44(%eax)			; X32-NEXT: movl (%esp), %ebx # 4-byte Reload
	; X32-NEXT: movl (%esp), %edx # 4-byte Reload			; X32-NEXT: movl %ebx, 48(%eax)
	; X32-NEXT: movl %edx, 40(%eax)			; X32-NEXT: movl %ecx, 44(%eax)
	; X32-NEXT: movl %ebp, 36(%eax)			; X32-NEXT: movl %edi, 40(%eax)
	; X32-NEXT: movl %ecx, 32(%eax)			; X32-NEXT: movl %esi, 36(%eax)
				; X32-NEXT: movl %edx, 32(%eax)
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-NEXT: shrl $31, %ecx			; X32-NEXT: shrl $31, %ecx
	; X32-NEXT: movl %ecx, (%eax)			; X32-NEXT: movl %ecx, (%eax)
	; X32-NEXT: movl $0, 28(%eax)			; X32-NEXT: movl $0, 28(%eax)
	; X32-NEXT: movl $0, 24(%eax)			; X32-NEXT: movl $0, 24(%eax)
	; X32-NEXT: movl $0, 20(%eax)			; X32-NEXT: movl $0, 20(%eax)
	; X32-NEXT: movl $0, 16(%eax)			; X32-NEXT: movl $0, 16(%eax)
	; X32-NEXT: movl $0, 12(%eax)			; X32-NEXT: movl $0, 12(%eax)
	(13 lines not shown)
	;			;
	; X64-LABEL: test_srl:			; X64-LABEL: test_srl:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi
	; X64-NEXT: shrdq $4, %rsi, %r9			; X64-NEXT: shrdq $4, %rsi, %r9
	; X64-NEXT: shrdq $4, %rcx, %rsi			; X64-NEXT: shrdq $4, %rdx, %rsi
				; X64-NEXT: shrdq $4, %rcx, %rdx
	; X64-NEXT: shrq $63, %r8			; X64-NEXT: shrq $63, %r8
	; X64-NEXT: shrdq $4, %rdx, %rcx			; X64-NEXT: shrq $4, %rcx
	; X64-NEXT: shrq $4, %rdx			; X64-NEXT: movq %rcx, 56(%rdi)
	; X64-NEXT: movq %rdx, 56(%rdi)			; X64-NEXT: movq %rdx, 48(%rdi)
	; X64-NEXT: movq %rcx, 48(%rdi)
	; X64-NEXT: movq %rsi, 40(%rdi)			; X64-NEXT: movq %rsi, 40(%rdi)
	; X64-NEXT: movq %r9, 32(%rdi)			; X64-NEXT: movq %r9, 32(%rdi)
	; X64-NEXT: movq %r8, (%rdi)			; X64-NEXT: movq %r8, (%rdi)
	; X64-NEXT: xorps %xmm0, %xmm0			; X64-NEXT: xorps %xmm0, %xmm0
	; X64-NEXT: movaps %xmm0, 16(%rdi)			; X64-NEXT: movaps %xmm0, 16(%rdi)
	; X64-NEXT: movq $0, 8(%rdi)			; X64-NEXT: movq $0, 8(%rdi)
	; X64-NEXT: retq			; X64-NEXT: retq
	%Amt = insertelement <2 x i256> <i256 3, i256 4>, i256 255, i32 0			%Amt = insertelement <2 x i256> <i256 3, i256 4>, i256 255, i32 0
	(13 lines not shown)
	; X32-NEXT: pushl %esi			; X32-NEXT: pushl %esi
	; X32-NEXT: .cfi_def_cfa_offset 20			; X32-NEXT: .cfi_def_cfa_offset 20
	; X32-NEXT: subl $8, %esp			; X32-NEXT: subl $8, %esp
	; X32-NEXT: .cfi_def_cfa_offset 28			; X32-NEXT: .cfi_def_cfa_offset 28
	; X32-NEXT: .cfi_offset %esi, -20			; X32-NEXT: .cfi_offset %esi, -20
	; X32-NEXT: .cfi_offset %edi, -16			; X32-NEXT: .cfi_offset %edi, -16
	; X32-NEXT: .cfi_offset %ebx, -12			; X32-NEXT: .cfi_offset %ebx, -12
	; X32-NEXT: .cfi_offset %ebp, -8			; X32-NEXT: .cfi_offset %ebp, -8
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X32-NEXT: movl {{[0-9]+}}(%esp), %edx			; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; X32-NEXT: movl %ebx, %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X32-NEXT: movl {{[0-9]+}}(%esp), %esi
				; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
				; X32-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X32-NEXT: movl %ebp, %ebx
				; X32-NEXT: shldl $26, %edx, %ebx
				; X32-NEXT: shldl $26, %esi, %edx
				; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
				; X32-NEXT: shldl $26, %ecx, %esi
				; X32-NEXT: movl %esi, (%esp) # 4-byte Spill
	; X32-NEXT: shldl $26, %edi, %ecx			; X32-NEXT: shldl $26, %edi, %ecx
	; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X32-NEXT: shldl $26, %eax, %edi
	; X32-NEXT: shldl $26, %esi, %edi			; X32-NEXT: movl %eax, %esi
	; X32-NEXT: shldl $26, %edx, %esi
	; X32-NEXT: shldl $26, %eax, %edx
	; X32-NEXT: shldl $26, %ebp, %eax
	; X32-NEXT: movl %eax, (%esp) # 4-byte Spill
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: shldl $26, %eax, %ebp			; X32-NEXT: shldl $26, %eax, %esi
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X32-NEXT: shrdl $6, %eax, %ecx			; X32-NEXT: shrdl $6, %eax, %edx
	; X32-NEXT: sarl $6, %ebx			; X32-NEXT: sarl $6, %ebp
	; X32-NEXT: movl {{[0-9]+}}(%esp), %eax			; X32-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X32-NEXT: movl %ebx, 60(%eax)			; X32-NEXT: movl %ebp, 60(%eax)
	; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
	; X32-NEXT: movl %ebx, 56(%eax)			; X32-NEXT: movl %ebx, 56(%eax)
	; X32-NEXT: movl %edi, 52(%eax)			; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
	; X32-NEXT: movl %esi, 48(%eax)			; X32-NEXT: movl %ebx, 52(%eax)
	; X32-NEXT: movl %edx, 44(%eax)			; X32-NEXT: movl (%esp), %ebx # 4-byte Reload
	; X32-NEXT: movl (%esp), %edx # 4-byte Reload			; X32-NEXT: movl %ebx, 48(%eax)
	; X32-NEXT: movl %edx, 40(%eax)			; X32-NEXT: movl %ecx, 44(%eax)
	; X32-NEXT: movl %ebp, 36(%eax)			; X32-NEXT: movl %edi, 40(%eax)
	; X32-NEXT: movl %ecx, 32(%eax)			; X32-NEXT: movl %esi, 36(%eax)
				; X32-NEXT: movl %edx, 32(%eax)
	; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X32-NEXT: sarl $31, %ecx			; X32-NEXT: sarl $31, %ecx
	; X32-NEXT: movl %ecx, 28(%eax)			; X32-NEXT: movl %ecx, 28(%eax)
	; X32-NEXT: movl %ecx, 24(%eax)			; X32-NEXT: movl %ecx, 24(%eax)
	; X32-NEXT: movl %ecx, 20(%eax)			; X32-NEXT: movl %ecx, 20(%eax)
	; X32-NEXT: movl %ecx, 16(%eax)			; X32-NEXT: movl %ecx, 16(%eax)
	; X32-NEXT: movl %ecx, 12(%eax)			; X32-NEXT: movl %ecx, 12(%eax)
	; X32-NEXT: movl %ecx, 8(%eax)			; X32-NEXT: movl %ecx, 8(%eax)
	;			;
	; X64-LABEL: test_sra:			; X64-LABEL: test_sra:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi			; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi
	; X64-NEXT: shrdq $6, %rsi, %r9			; X64-NEXT: shrdq $6, %rsi, %r9
	; X64-NEXT: shrdq $6, %rcx, %rsi			; X64-NEXT: shrdq $6, %rdx, %rsi
				; X64-NEXT: shrdq $6, %rcx, %rdx
	; X64-NEXT: sarq $63, %r8			; X64-NEXT: sarq $63, %r8
	; X64-NEXT: shrdq $6, %rdx, %rcx			; X64-NEXT: sarq $6, %rcx
	; X64-NEXT: sarq $6, %rdx			; X64-NEXT: movq %rcx, 56(%rdi)
	; X64-NEXT: movq %rdx, 56(%rdi)			; X64-NEXT: movq %rdx, 48(%rdi)
	; X64-NEXT: movq %rcx, 48(%rdi)
	; X64-NEXT: movq %rsi, 40(%rdi)			; X64-NEXT: movq %rsi, 40(%rdi)
	; X64-NEXT: movq %r9, 32(%rdi)			; X64-NEXT: movq %r9, 32(%rdi)
	; X64-NEXT: movq %r8, 24(%rdi)			; X64-NEXT: movq %r8, 24(%rdi)
	; X64-NEXT: movq %r8, 16(%rdi)			; X64-NEXT: movq %r8, 16(%rdi)
	; X64-NEXT: movq %r8, 8(%rdi)			; X64-NEXT: movq %r8, 8(%rdi)
	; X64-NEXT: movq %r8, (%rdi)			; X64-NEXT: movq %r8, (%rdi)
	; X64-NEXT: retq			; X64-NEXT: retq
	%Amt = insertelement <2 x i256> <i256 5, i256 6>, i256 255, i32 0			%Amt = insertelement <2 x i256> <i256 5, i256 6>, i256 255, i32 0
	%Out = ashr <2 x i256> %In, %Amt			%Out = ashr <2 x i256> %In, %Amt
	ret <2 x i256> %Out			ret <2 x i256> %Out
	}			}

llvm/test/CodeGen/X86/merge-consecutive-stores-nt.ll

	; X64-SSE41-NEXT: movntdqa (%rdi), %xmm0			; X64-SSE41-NEXT: movntdqa (%rdi), %xmm0
	; X64-SSE41-NEXT: movntdqa 16(%rdi), %xmm1			; X64-SSE41-NEXT: movntdqa 16(%rdi), %xmm1
	; X64-SSE41-NEXT: movntdq %xmm0, (%rsi)			; X64-SSE41-NEXT: movntdq %xmm0, (%rsi)
	; X64-SSE41-NEXT: movntdq %xmm1, 16(%rsi)			; X64-SSE41-NEXT: movntdq %xmm1, 16(%rsi)
	; X64-SSE41-NEXT: retq			; X64-SSE41-NEXT: retq
	;			;
	; X64-AVX1-LABEL: merge_2_v4f32_align32:			; X64-AVX1-LABEL: merge_2_v4f32_align32:
	; X64-AVX1: # %bb.0:			; X64-AVX1: # %bb.0:
	; X64-AVX1-NEXT: vmovntdqa 16(%rdi), %xmm0			; X64-AVX1-NEXT: vmovntdqa (%rdi), %xmm0
	; X64-AVX1-NEXT: vmovntdqa (%rdi), %xmm1			; X64-AVX1-NEXT: vmovntdqa 16(%rdi), %xmm1
	; X64-AVX1-NEXT: vmovntdq %xmm1, (%rsi)			; X64-AVX1-NEXT: vmovntdq %xmm0, (%rsi)
	; X64-AVX1-NEXT: vmovntdq %xmm0, 16(%rsi)			; X64-AVX1-NEXT: vmovntdq %xmm1, 16(%rsi)
	; X64-AVX1-NEXT: retq			; X64-AVX1-NEXT: retq
	;			;
	; X64-AVX2-LABEL: merge_2_v4f32_align32:			; X64-AVX2-LABEL: merge_2_v4f32_align32:
	; X64-AVX2: # %bb.0:			; X64-AVX2: # %bb.0:
	; X64-AVX2-NEXT: vmovntdqa (%rdi), %ymm0			; X64-AVX2-NEXT: vmovntdqa (%rdi), %ymm0
	; X64-AVX2-NEXT: vmovntdq %ymm0, (%rsi)			; X64-AVX2-NEXT: vmovntdq %ymm0, (%rsi)
	; X64-AVX2-NEXT: vzeroupper			; X64-AVX2-NEXT: vzeroupper
	; X64-AVX2-NEXT: retq			; X64-AVX2-NEXT: retq

llvm/test/CodeGen/X86/setcc-wide-types.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=sse2 \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=SSE2		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=sse2 \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=SSE --check-prefix=SSE2
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=sse4.1 \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=SSE41		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=sse4.1 \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=SSE --check-prefix=SSE41
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=AVXANY --check-prefix=AVX1		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=AVXANY --check-prefix=AVX1
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx2 \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=AVXANY --check-prefix=AVX2		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx2 \| FileCheck %s --check-prefix=ANY --check-prefix=NO512 --check-prefix=AVXANY --check-prefix=AVX2
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx512f \| FileCheck %s --check-prefix=ANY --check-prefix=AVXANY --check-prefix=AVX512 --check-prefix=AVX512F		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx512f \| FileCheck %s --check-prefix=ANY --check-prefix=AVXANY --check-prefix=AVX512 --check-prefix=AVX512F
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx512bw \| FileCheck %s --check-prefix=ANY --check-prefix=AVXANY --check-prefix=AVX512 --check-prefix=AVX512BW		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=avx512bw \| FileCheck %s --check-prefix=ANY --check-prefix=AVXANY --check-prefix=AVX512 --check-prefix=AVX512BW

; Equality checks of 128/256-bit values can use PMOVMSK or PTEST to avoid scalarization.		; Equality checks of 128/256-bit values can use PMOVMSK or PTEST to avoid scalarization.

define i32 @ne_i128(<2 x i64> %x, <2 x i64> %y) {		define i32 @ne_i128(<2 x i64> %x, <2 x i64> %y) {
▲ Show 20 Lines • Show All 717 Lines • ▼ Show 20 Lines	; AVXANY-NEXT: retq
%z = zext i1 %cmp to i32		%z = zext i1 %cmp to i32
ret i32 %z		ret i32 %z
}		}

; This test models the expansion of 'memcmp(a, b, 64) != 0'		; This test models the expansion of 'memcmp(a, b, 64) != 0'
; if we allowed 2 pairs of 32-byte loads per block.		; if we allowed 2 pairs of 32-byte loads per block.

define i32 @ne_i256_pair(ptr %a, ptr %b) {		define i32 @ne_i256_pair(ptr %a, ptr %b) {
; SSE2-LABEL: ne_i256_pair:		; SSE-LABEL: ne_i256_pair:
; SSE2: # %bb.0:		; SSE: # %bb.0:
; SSE2-NEXT: movq 16(%rdi), %rax		; SSE-NEXT: movq 16(%rdi), %rax
; SSE2-NEXT: movq 24(%rdi), %rcx		; SSE-NEXT: movq 24(%rdi), %rcx
; SSE2-NEXT: movq (%rdi), %rdx		; SSE-NEXT: movq (%rdi), %rdx
; SSE2-NEXT: movq 8(%rdi), %r8		; SSE-NEXT: movq 8(%rdi), %r8
; SSE2-NEXT: xorq 8(%rsi), %r8		; SSE-NEXT: xorq 8(%rsi), %r8
; SSE2-NEXT: xorq 24(%rsi), %rcx		; SSE-NEXT: xorq 24(%rsi), %rcx
; SSE2-NEXT: xorq (%rsi), %rdx		; SSE-NEXT: xorq (%rsi), %rdx
; SSE2-NEXT: xorq 16(%rsi), %rax		; SSE-NEXT: xorq 16(%rsi), %rax
; SSE2-NEXT: movq 48(%rdi), %r9		; SSE-NEXT: movq 48(%rdi), %r9
; SSE2-NEXT: movq 32(%rdi), %r10		; SSE-NEXT: movq 32(%rdi), %r10
; SSE2-NEXT: movq 56(%rdi), %r11		; SSE-NEXT: movq 56(%rdi), %r11
; SSE2-NEXT: movq 40(%rdi), %rdi		; SSE-NEXT: movq 40(%rdi), %rdi
; SSE2-NEXT: xorq 40(%rsi), %rdi		; SSE-NEXT: xorq 40(%rsi), %rdi
; SSE2-NEXT: orq %r8, %rdi		; SSE-NEXT: orq %r8, %rdi
; SSE2-NEXT: xorq 56(%rsi), %r11		; SSE-NEXT: xorq 56(%rsi), %r11
; SSE2-NEXT: orq %rcx, %r11		; SSE-NEXT: orq %rcx, %r11
; SSE2-NEXT: orq %rdi, %r11		; SSE-NEXT: orq %rdi, %r11
; SSE2-NEXT: xorq 32(%rsi), %r10		; SSE-NEXT: xorq 32(%rsi), %r10
; SSE2-NEXT: orq %rdx, %r10		; SSE-NEXT: orq %rdx, %r10
; SSE2-NEXT: xorq 48(%rsi), %r9		; SSE-NEXT: xorq 48(%rsi), %r9
; SSE2-NEXT: orq %rax, %r9		; SSE-NEXT: orq %rax, %r9
; SSE2-NEXT: orq %r10, %r9		; SSE-NEXT: orq %r10, %r9
; SSE2-NEXT: xorl %eax, %eax		; SSE-NEXT: xorl %eax, %eax
; SSE2-NEXT: orq %r11, %r9		; SSE-NEXT: orq %r11, %r9
; SSE2-NEXT: setne %al		; SSE-NEXT: setne %al
; SSE2-NEXT: retq		; SSE-NEXT: retq
;
; SSE41-LABEL: ne_i256_pair:
; SSE41: # %bb.0:
; SSE41-NEXT: movq 16(%rdi), %rax
; SSE41-NEXT: movq 24(%rdi), %rcx
; SSE41-NEXT: movq (%rdi), %rdx
; SSE41-NEXT: movq 8(%rdi), %r8
; SSE41-NEXT: xorq 8(%rsi), %r8
; SSE41-NEXT: xorq 24(%rsi), %rcx
; SSE41-NEXT: xorq (%rsi), %rdx
; SSE41-NEXT: xorq 16(%rsi), %rax
; SSE41-NEXT: movq 48(%rdi), %r9
; SSE41-NEXT: movq 32(%rdi), %r10
; SSE41-NEXT: movq 56(%rdi), %r11
; SSE41-NEXT: movq 40(%rdi), %rdi
; SSE41-NEXT: xorq 40(%rsi), %rdi
; SSE41-NEXT: orq %r8, %rdi
; SSE41-NEXT: xorq 56(%rsi), %r11
; SSE41-NEXT: orq %rcx, %r11
; SSE41-NEXT: orq %rdi, %r11
; SSE41-NEXT: xorq 32(%rsi), %r10
; SSE41-NEXT: orq %rdx, %r10
; SSE41-NEXT: xorq 48(%rsi), %r9
; SSE41-NEXT: orq %rax, %r9
; SSE41-NEXT: orq %r10, %r9
; SSE41-NEXT: xorl %eax, %eax
; SSE41-NEXT: orq %r11, %r9
; SSE41-NEXT: setne %al
; SSE41-NEXT: retq
;		;
; AVX1-LABEL: ne_i256_pair:		; AVX1-LABEL: ne_i256_pair:
; AVX1: # %bb.0:		; AVX1: # %bb.0:
; AVX1-NEXT: vmovups (%rdi), %ymm0		; AVX1-NEXT: vmovups (%rdi), %ymm0
; AVX1-NEXT: vmovups 32(%rdi), %ymm1		; AVX1-NEXT: vmovups 32(%rdi), %ymm1
; AVX1-NEXT: vxorps 32(%rsi), %ymm1, %ymm1		; AVX1-NEXT: vxorps 32(%rsi), %ymm1, %ymm1
; AVX1-NEXT: vxorps (%rsi), %ymm0, %ymm0		; AVX1-NEXT: vxorps (%rsi), %ymm0, %ymm0
; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0		; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines	; AVX512-NEXT: retq
%z = zext i1 %cmp to i32		%z = zext i1 %cmp to i32
ret i32 %z		ret i32 %z
}		}

; This test models the expansion of 'memcmp(a, b, 64) == 0'		; This test models the expansion of 'memcmp(a, b, 64) == 0'
; if we allowed 2 pairs of 32-byte loads per block.		; if we allowed 2 pairs of 32-byte loads per block.

define i32 @eq_i256_pair(ptr %a, ptr %b) {		define i32 @eq_i256_pair(ptr %a, ptr %b) {
; SSE2-LABEL: eq_i256_pair:		; SSE-LABEL: eq_i256_pair:
; SSE2: # %bb.0:		; SSE: # %bb.0:
; SSE2-NEXT: movq 16(%rdi), %rax		; SSE-NEXT: movq 16(%rdi), %rax
; SSE2-NEXT: movq 24(%rdi), %rcx		; SSE-NEXT: movq 24(%rdi), %rcx
; SSE2-NEXT: movq (%rdi), %rdx		; SSE-NEXT: movq (%rdi), %rdx
; SSE2-NEXT: movq 8(%rdi), %r8		; SSE-NEXT: movq 8(%rdi), %r8
; SSE2-NEXT: xorq 8(%rsi), %r8		; SSE-NEXT: xorq 8(%rsi), %r8
; SSE2-NEXT: xorq 24(%rsi), %rcx		; SSE-NEXT: xorq 24(%rsi), %rcx
; SSE2-NEXT: xorq (%rsi), %rdx		; SSE-NEXT: xorq (%rsi), %rdx
; SSE2-NEXT: xorq 16(%rsi), %rax		; SSE-NEXT: xorq 16(%rsi), %rax
; SSE2-NEXT: movq 48(%rdi), %r9		; SSE-NEXT: movq 48(%rdi), %r9
; SSE2-NEXT: movq 32(%rdi), %r10		; SSE-NEXT: movq 32(%rdi), %r10
; SSE2-NEXT: movq 56(%rdi), %r11		; SSE-NEXT: movq 56(%rdi), %r11
; SSE2-NEXT: movq 40(%rdi), %rdi		; SSE-NEXT: movq 40(%rdi), %rdi
; SSE2-NEXT: xorq 40(%rsi), %rdi		; SSE-NEXT: xorq 40(%rsi), %rdi
; SSE2-NEXT: orq %r8, %rdi		; SSE-NEXT: orq %r8, %rdi
; SSE2-NEXT: xorq 56(%rsi), %r11		; SSE-NEXT: xorq 56(%rsi), %r11
; SSE2-NEXT: orq %rcx, %r11		; SSE-NEXT: orq %rcx, %r11
; SSE2-NEXT: orq %rdi, %r11		; SSE-NEXT: orq %rdi, %r11
; SSE2-NEXT: xorq 32(%rsi), %r10		; SSE-NEXT: xorq 32(%rsi), %r10
; SSE2-NEXT: orq %rdx, %r10		; SSE-NEXT: orq %rdx, %r10
; SSE2-NEXT: xorq 48(%rsi), %r9		; SSE-NEXT: xorq 48(%rsi), %r9
; SSE2-NEXT: orq %rax, %r9		; SSE-NEXT: orq %rax, %r9
; SSE2-NEXT: orq %r10, %r9		; SSE-NEXT: orq %r10, %r9
; SSE2-NEXT: xorl %eax, %eax		; SSE-NEXT: xorl %eax, %eax
; SSE2-NEXT: orq %r11, %r9		; SSE-NEXT: orq %r11, %r9
; SSE2-NEXT: sete %al		; SSE-NEXT: sete %al
; SSE2-NEXT: retq		; SSE-NEXT: retq
;
; SSE41-LABEL: eq_i256_pair:
; SSE41: # %bb.0:
; SSE41-NEXT: movq 16(%rdi), %rax
; SSE41-NEXT: movq 24(%rdi), %rcx
; SSE41-NEXT: movq (%rdi), %rdx
; SSE41-NEXT: movq 8(%rdi), %r8
; SSE41-NEXT: xorq 8(%rsi), %r8
; SSE41-NEXT: xorq 24(%rsi), %rcx
; SSE41-NEXT: xorq (%rsi), %rdx
; SSE41-NEXT: xorq 16(%rsi), %rax
; SSE41-NEXT: movq 48(%rdi), %r9
; SSE41-NEXT: movq 32(%rdi), %r10
; SSE41-NEXT: movq 56(%rdi), %r11
; SSE41-NEXT: movq 40(%rdi), %rdi
; SSE41-NEXT: xorq 40(%rsi), %rdi
; SSE41-NEXT: orq %r8, %rdi
; SSE41-NEXT: xorq 56(%rsi), %r11
; SSE41-NEXT: orq %rcx, %r11
; SSE41-NEXT: orq %rdi, %r11
; SSE41-NEXT: xorq 32(%rsi), %r10
; SSE41-NEXT: orq %rdx, %r10
; SSE41-NEXT: xorq 48(%rsi), %r9
; SSE41-NEXT: orq %rax, %r9
; SSE41-NEXT: orq %r10, %r9
; SSE41-NEXT: xorl %eax, %eax
; SSE41-NEXT: orq %r11, %r9
; SSE41-NEXT: sete %al
; SSE41-NEXT: retq
;		;
; AVX1-LABEL: eq_i256_pair:		; AVX1-LABEL: eq_i256_pair:
; AVX1: # %bb.0:		; AVX1: # %bb.0:
; AVX1-NEXT: vmovups (%rdi), %ymm0		; AVX1-NEXT: vmovups (%rdi), %ymm0
; AVX1-NEXT: vmovups 32(%rdi), %ymm1		; AVX1-NEXT: vmovups 32(%rdi), %ymm1
; AVX1-NEXT: vxorps 32(%rsi), %ymm1, %ymm1		; AVX1-NEXT: vxorps 32(%rsi), %ymm1, %ymm1
; AVX1-NEXT: vxorps (%rsi), %ymm0, %ymm0		; AVX1-NEXT: vxorps (%rsi), %ymm0, %ymm0
; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0		; AVX1-NEXT: vorps %ymm1, %ymm0, %ymm0
; ANY-NEXT: sete %al		; ANY-NEXT: sete %al
; ANY-NEXT: retq		; ANY-NEXT: retq
%a2 = add i256 %a, 1		%a2 = add i256 %a, 1
%r = icmp eq i256 %a2, %b		%r = icmp eq i256 %a2, %b
ret i1 %r		ret i1 %r
}		}

define i1 @eq_i512_op(i512 %a, i512 %b) {		define i1 @eq_i512_op(i512 %a, i512 %b) {
; ANY-LABEL: eq_i512_op:		; SSE-LABEL: eq_i512_op:
; ANY: # %bb.0:		; SSE: # %bb.0:
; ANY-NEXT: movq {{[0-9]+}}(%rsp), %r10		; SSE-NEXT: movq {{[0-9]+}}(%rsp), %rax
; ANY-NEXT: movq {{[0-9]+}}(%rsp), %rax		; SSE-NEXT: movq {{[0-9]+}}(%rsp), %r10
; ANY-NEXT: addq $1, %rdi		; SSE-NEXT: addq $1, %rdi
; ANY-NEXT: adcq $0, %rsi		; SSE-NEXT: adcq $0, %rsi
; ANY-NEXT: adcq $0, %rdx		; SSE-NEXT: adcq $0, %rdx
; ANY-NEXT: adcq $0, %rcx		; SSE-NEXT: adcq $0, %rcx
; ANY-NEXT: adcq $0, %r8		; SSE-NEXT: adcq $0, %r8
; ANY-NEXT: adcq $0, %r9		; SSE-NEXT: adcq $0, %r9
; ANY-NEXT: adcq $0, %r10		; SSE-NEXT: adcq $0, %r10
; ANY-NEXT: adcq $0, %rax		; SSE-NEXT: adcq $0, %rax
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %rsi		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %rsi
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %r9		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %r9
; ANY-NEXT: orq %rsi, %r9		; SSE-NEXT: orq %rsi, %r9
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %rcx		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %rcx
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %rax		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %rax
; ANY-NEXT: orq %rcx, %rax		; SSE-NEXT: orq %rcx, %rax
; ANY-NEXT: orq %r9, %rax		; SSE-NEXT: orq %r9, %rax
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %rdx		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %rdx
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %r10		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %r10
; ANY-NEXT: orq %rdx, %r10		; SSE-NEXT: orq %rdx, %r10
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %r8		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %r8
; ANY-NEXT: xorq {{[0-9]+}}(%rsp), %rdi		; SSE-NEXT: xorq {{[0-9]+}}(%rsp), %rdi
; ANY-NEXT: orq %r8, %rdi		; SSE-NEXT: orq %r8, %rdi
; ANY-NEXT: orq %r10, %rdi		; SSE-NEXT: orq %r10, %rdi
; ANY-NEXT: orq %rax, %rdi		; SSE-NEXT: orq %rax, %rdi
; ANY-NEXT: sete %al		; SSE-NEXT: sete %al
; ANY-NEXT: retq		; SSE-NEXT: retq
		;
		; AVXANY-LABEL: eq_i512_op:
		; AVXANY: # %bb.0:
		; AVXANY-NEXT: movq {{[0-9]+}}(%rsp), %r10
		; AVXANY-NEXT: movq {{[0-9]+}}(%rsp), %rax
		; AVXANY-NEXT: addq $1, %rdi
		; AVXANY-NEXT: adcq $0, %rsi
		; AVXANY-NEXT: adcq $0, %rdx
		; AVXANY-NEXT: adcq $0, %rcx
		; AVXANY-NEXT: adcq $0, %r8
		; AVXANY-NEXT: adcq $0, %r9
		; AVXANY-NEXT: adcq $0, %r10
		; AVXANY-NEXT: adcq $0, %rax
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %rsi
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %r9
		; AVXANY-NEXT: orq %rsi, %r9
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %rcx
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %rax
		; AVXANY-NEXT: orq %rcx, %rax
		; AVXANY-NEXT: orq %r9, %rax
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %rdx
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %r10
		; AVXANY-NEXT: orq %rdx, %r10
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %r8
		; AVXANY-NEXT: xorq {{[0-9]+}}(%rsp), %rdi
		; AVXANY-NEXT: orq %r8, %rdi
		; AVXANY-NEXT: orq %r10, %rdi
		; AVXANY-NEXT: orq %rax, %rdi
		; AVXANY-NEXT: sete %al
		; AVXANY-NEXT: retq
		RKSimon: SSE2/SSE41 might be mergeable here if we add an SSE check-prefix case?
%a2 = add i512 %a, 1		%a2 = add i512 %a, 1
%r = icmp eq i512 %a2, %b		%r = icmp eq i512 %a2, %b
ret i1 %r		ret i1 %r
}		}

define i1 @eq_i128_load_arg(ptr%p, i128 %b) {		define i1 @eq_i128_load_arg(ptr%p, i128 %b) {
; ANY-LABEL: eq_i128_load_arg:		; ANY-LABEL: eq_i128_load_arg:
; ANY: # %bb.0:		; ANY: # %bb.0:

llvm/test/CodeGen/X86/smin.ll

	;			;
	; X86-LABEL: test_i128:			; X86-LABEL: test_i128:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %ebp			; X86-NEXT: pushl %ebp
	; X86-NEXT: pushl %ebx			; X86-NEXT: pushl %ebx
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: subl $8, %esp			; X86-NEXT: subl $8, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; X86-NEXT: cmpl %ecx, %edi			; X86-NEXT: cmpl %edx, %edi
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %edx, %eax
	; X86-NEXT: cmovbl %edi, %eax			; X86-NEXT: cmovbl %edi, %eax
	; X86-NEXT: cmpl %esi, %ebp			; X86-NEXT: cmpl %esi, %ebp
	; X86-NEXT: movl %ecx, %ebx			; X86-NEXT: movl %edx, %ebx
	; X86-NEXT: cmovbl %edi, %ebx			; X86-NEXT: cmovbl %edi, %ebx
	; X86-NEXT: cmovel %eax, %ebx			; X86-NEXT: cmovel %eax, %ebx
	; X86-NEXT: movl %esi, %eax			; X86-NEXT: movl %esi, %eax
	; X86-NEXT: cmovbl %ebp, %eax			; X86-NEXT: cmovbl %ebp, %eax
	; X86-NEXT: movl %eax, (%esp) # 4-byte Spill			; X86-NEXT: movl %eax, (%esp) # 4-byte Spill
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: cmpl %edx, %edi			; X86-NEXT: cmpl %ecx, %edi
	; X86-NEXT: movl %edx, %eax			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: cmovbl %edi, %eax			; X86-NEXT: cmovbl %edi, %eax
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl %eax, %ebp			; X86-NEXT: movl %eax, %ebp
	; X86-NEXT: sbbl %edi, %ebp			; X86-NEXT: sbbl %edi, %ebp
	; X86-NEXT: cmovll {{[0-9]+}}(%esp), %esi			; X86-NEXT: cmovll {{[0-9]+}}(%esp), %esi
	; X86-NEXT: cmovll {{[0-9]+}}(%esp), %ecx			; X86-NEXT: cmovll {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl %eax, %ebp			; X86-NEXT: movl %eax, %ebp
	; X86-NEXT: xorl %edi, %ebp			; X86-NEXT: xorl %edi, %ebp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: xorl %edx, %eax			; X86-NEXT: xorl %ecx, %eax
	; X86-NEXT: orl %ebp, %eax			; X86-NEXT: orl %ebp, %eax
	; X86-NEXT: cmovel %ebx, %ecx			; X86-NEXT: cmovel %ebx, %edx
	; X86-NEXT: cmovel (%esp), %esi # 4-byte Folded Reload			; X86-NEXT: cmovel (%esp), %esi # 4-byte Folded Reload
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: cmpl %edi, %eax			; X86-NEXT: cmpl %edi, %eax
	; X86-NEXT: cmovll {{[0-9]+}}(%esp), %edx			; X86-NEXT: cmovll {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload			; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
	; X86-NEXT: cmovll %eax, %edi			; X86-NEXT: cmovll %eax, %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl %edi, 12(%eax)			; X86-NEXT: movl %edi, 12(%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %ecx, 8(%eax)
	; X86-NEXT: movl %esi, 4(%eax)			; X86-NEXT: movl %esi, 4(%eax)
	; X86-NEXT: movl %ecx, (%eax)			; X86-NEXT: movl %edx, (%eax)
	; X86-NEXT: addl $8, %esp			; X86-NEXT: addl $8, %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	%r = call i128 @llvm.smin.i128(i128 %a, i128 %b)			%r = call i128 @llvm.smin.i128(i128 %a, i128 %b)
	ret i128 %r			ret i128 %r

llvm/test/CodeGen/X86/smul-with-overflow.ll

	; X86-NEXT: movzbl %al, %eax			; X86-NEXT: movzbl %al, %eax
	; X86-NEXT: adcl %edx, %eax			; X86-NEXT: adcl %edx, %eax
	; X86-NEXT: movl %eax, (%esp) # 4-byte Spill			; X86-NEXT: movl %eax, (%esp) # 4-byte Spill
	; X86-NEXT: movl %ebx, %eax			; X86-NEXT: movl %ebx, %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, %ecx			; X86-NEXT: movl %edx, %ecx
	; X86-NEXT: movl %eax, %ebp			; X86-NEXT: movl %eax, %ebp
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: movl %ebx, %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull %ebx
	; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: movl %eax, %ebx			; X86-NEXT: movl %eax, %ebx
	; X86-NEXT: movl %eax, %edi			; X86-NEXT: movl %eax, %edi
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: addl %ecx, %edi			; X86-NEXT: addl %ecx, %edi
	; X86-NEXT: movl %edx, %ecx			; X86-NEXT: movl %edx, %ecx

llvm/test/CodeGen/X86/smulo-128-legalisation-lowering.ll

	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: .cfi_def_cfa_offset 20			; X86-NEXT: .cfi_def_cfa_offset 20
	; X86-NEXT: subl $60, %esp			; X86-NEXT: subl $60, %esp
	; X86-NEXT: .cfi_def_cfa_offset 80			; X86-NEXT: .cfi_def_cfa_offset 80
	; X86-NEXT: .cfi_offset %esi, -20			; X86-NEXT: .cfi_offset %esi, -20
	; X86-NEXT: .cfi_offset %edi, -16			; X86-NEXT: .cfi_offset %edi, -16
	; X86-NEXT: .cfi_offset %ebx, -12			; X86-NEXT: .cfi_offset %ebx, -12
	; X86-NEXT: .cfi_offset %ebp, -8			; X86-NEXT: .cfi_offset %ebp, -8
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
				; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: movl %edi, %eax			; X86-NEXT: movl %edi, %eax
	; X86-NEXT: mull %ebx			; X86-NEXT: mull %ebx
	; X86-NEXT: movl %edx, %ecx			; X86-NEXT: movl %edx, %ecx
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl %esi, %eax			; X86-NEXT: movl %esi, %eax
	; X86-NEXT: mull %ebx			; X86-NEXT: mull %ebx
	; X86-NEXT: movl %edx, %esi			; X86-NEXT: movl %edx, %esi
	; X86-NEXT: addl %ebx, %eax			; X86-NEXT: addl %ebx, %eax
	; X86-NEXT: adcl %ebp, %edx			; X86-NEXT: adcl %ebp, %edx
	; X86-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %esi ## 4-byte Folded Reload			; X86-NEXT: addl {{[-0-9]+}}(%e{{[sb]}}p), %esi ## 4-byte Folded Reload
	; X86-NEXT: movzbl {{[-0-9]+}}(%e{{[sb]}}p), %ecx ## 1-byte Folded Reload			; X86-NEXT: movzbl {{[-0-9]+}}(%e{{[sb]}}p), %ecx ## 1-byte Folded Reload
	; X86-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ecx ## 4-byte Folded Reload			; X86-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %ecx ## 4-byte Folded Reload
	; X86-NEXT: addl %eax, %esi			; X86-NEXT: addl %eax, %esi
	; X86-NEXT: adcl %edx, %ecx			; X86-NEXT: adcl %edx, %ecx
	; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: sarl $31, %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: sarl $31, %eax			; X86-NEXT: mull %ecx
	; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: mull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, %edi			; X86-NEXT: movl %edx, %edi
	; X86-NEXT: movl %eax, %ebx			; X86-NEXT: movl %eax, %ebx
	; X86-NEXT: movl %eax, %ebp			; X86-NEXT: movl %eax, %ebp
	; X86-NEXT: addl %edx, %ebp			; X86-NEXT: addl %edx, %ebp
	; X86-NEXT: adcl $0, %edi			; X86-NEXT: adcl $0, %edi
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: .cfi_def_cfa_offset 20			; X86-NEXT: .cfi_def_cfa_offset 20
	; X86-NEXT: subl $156, %esp			; X86-NEXT: subl $156, %esp
	; X86-NEXT: .cfi_def_cfa_offset 176			; X86-NEXT: .cfi_def_cfa_offset 176
	; X86-NEXT: .cfi_offset %esi, -20			; X86-NEXT: .cfi_offset %esi, -20
	; X86-NEXT: .cfi_offset %edi, -16			; X86-NEXT: .cfi_offset %edi, -16
	; X86-NEXT: .cfi_offset %ebx, -12			; X86-NEXT: .cfi_offset %ebx, -12
	; X86-NEXT: .cfi_offset %ebp, -8			; X86-NEXT: .cfi_offset %ebp, -8
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: movl %ebp, %eax			; X86-NEXT: movl %ebp, %eax
	; X86-NEXT: mull %ebx			; X86-NEXT: mull %ebx
	; X86-NEXT: movl %edx, %ecx			; X86-NEXT: movl %edx, %ecx
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl %edi, %eax			; X86-NEXT: movl %edi, %eax
	; X86-NEXT: mull %ebx			; X86-NEXT: mull %ebx
	; X86-NEXT: movl %edx, %esi			; X86-NEXT: movl %edx, %esi
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: movl %eax, %ebp			; X86-NEXT: movl %eax, %ebp
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl %edx, %esi			; X86-NEXT: movl %edx, %esi
	; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: addl %edx, %ecx			; X86-NEXT: addl %edx, %ecx
	; X86-NEXT: adcl $0, %esi			; X86-NEXT: adcl $0, %esi
	; X86-NEXT: movl %edi, %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull %edi
	; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: addl %eax, %ecx			; X86-NEXT: addl %eax, %ecx
	; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: adcl %edx, %esi			; X86-NEXT: adcl %edx, %esi
	; X86-NEXT: setb %bl			; X86-NEXT: setb %bl
	; X86-NEXT: addl %eax, %esi			; X86-NEXT: addl %eax, %esi
	; X86-NEXT: movzbl %bl, %ebx			; X86-NEXT: movzbl %bl, %ebx
	; X86-NEXT: adcl %edx, %ebx			; X86-NEXT: adcl %edx, %ebx
	; X86-NEXT: movl %ebp, %eax			; X86-NEXT: movl %ebp, %eax
	; X86-NEXT: addl %esi, %eax			; X86-NEXT: addl %esi, %eax
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: adcl %ebx, %eax			; X86-NEXT: adcl %ebx, %eax
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: adcl $0, %esi			; X86-NEXT: adcl $0, %esi
	; X86-NEXT: adcl $0, %ebx			; X86-NEXT: adcl $0, %ebx
	; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl %edi, %eax
	; X86-NEXT: movl %edi, %ecx			; X86-NEXT: movl %edi, %ecx
	; X86-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %edi, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
				; X86-NEXT: movl %edi, %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, %ebp			; X86-NEXT: movl %edx, %ebp
	; X86-NEXT: movl %eax, %ebx			; X86-NEXT: movl %eax, %ebx
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: addl %edx, %ebx			; X86-NEXT: addl %edx, %ebx
	; X86-NEXT: movl %edx, %edi			; X86-NEXT: movl %edx, %edi
	; X86-NEXT: adcl $0, %edi			; X86-NEXT: adcl $0, %edi
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
	[45 lines not shown]
	; X86-NEXT: adcl %edx, %ebp			; X86-NEXT: adcl %edx, %ebp
	; X86-NEXT: addl %esi, %ebx			; X86-NEXT: addl %esi, %ebx
	; X86-NEXT: movzbl %al, %eax			; X86-NEXT: movzbl %al, %eax
	; X86-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax ## 4-byte Folded Reload			; X86-NEXT: adcl {{[-0-9]+}}(%e{{[sb]}}p), %eax ## 4-byte Folded Reload
	; X86-NEXT: addl %edi, %ebx			; X86-NEXT: addl %edi, %ebx
	; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %ebx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: adcl %ebp, %eax			; X86-NEXT: adcl %ebp, %eax
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp ## 4-byte Reload			; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebp ## 4-byte Reload
	; X86-NEXT: mull %ebp			; X86-NEXT: movl %ebp, %eax
				; X86-NEXT: mull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, %esi			; X86-NEXT: movl %edx, %esi
	; X86-NEXT: movl %eax, %ecx			; X86-NEXT: movl %eax, %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: mull %ebp			; X86-NEXT: mull %ebp
	; X86-NEXT: movl %edx, %edi			; X86-NEXT: movl %edx, %edi
	; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; X86-NEXT: movl %eax, %ebx			; X86-NEXT: movl %eax, %ebx
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	[105 lines not shown]

llvm/test/CodeGen/X86/umin.ll

	[148 lines not shown]
	;			;
	; X86-LABEL: test_i128:			; X86-LABEL: test_i128:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %ebp			; X86-NEXT: pushl %ebp
	; X86-NEXT: pushl %ebx			; X86-NEXT: pushl %ebx
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: subl $8, %esp			; X86-NEXT: subl $8, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebp
	; X86-NEXT: cmpl %ecx, %edi			; X86-NEXT: cmpl %edx, %edi
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %edx, %eax
	; X86-NEXT: cmovbl %edi, %eax			; X86-NEXT: cmovbl %edi, %eax
	; X86-NEXT: cmpl %esi, %ebp			; X86-NEXT: cmpl %esi, %ebp
	; X86-NEXT: movl %ecx, %ebx			; X86-NEXT: movl %edx, %ebx
	; X86-NEXT: cmovbl %edi, %ebx			; X86-NEXT: cmovbl %edi, %ebx
	; X86-NEXT: cmovel %eax, %ebx			; X86-NEXT: cmovel %eax, %ebx
	; X86-NEXT: movl %esi, %eax			; X86-NEXT: movl %esi, %eax
	; X86-NEXT: cmovbl %ebp, %eax			; X86-NEXT: cmovbl %ebp, %eax
	; X86-NEXT: movl %eax, (%esp) # 4-byte Spill			; X86-NEXT: movl %eax, (%esp) # 4-byte Spill
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: cmpl %edx, %edi			; X86-NEXT: cmpl %ecx, %edi
	; X86-NEXT: movl %edx, %eax			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: cmovbl %edi, %eax			; X86-NEXT: cmovbl %edi, %eax
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl %eax, %ebp			; X86-NEXT: movl %eax, %ebp
	; X86-NEXT: sbbl %edi, %ebp			; X86-NEXT: sbbl %edi, %ebp
	; X86-NEXT: cmovbl {{[0-9]+}}(%esp), %esi			; X86-NEXT: cmovbl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: cmovbl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: cmovbl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl %eax, %ebp			; X86-NEXT: movl %eax, %ebp
	; X86-NEXT: xorl %edi, %ebp			; X86-NEXT: xorl %edi, %ebp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: xorl %edx, %eax			; X86-NEXT: xorl %ecx, %eax
	; X86-NEXT: orl %ebp, %eax			; X86-NEXT: orl %ebp, %eax
	; X86-NEXT: cmovel %ebx, %ecx			; X86-NEXT: cmovel %ebx, %edx
	; X86-NEXT: cmovel (%esp), %esi # 4-byte Folded Reload			; X86-NEXT: cmovel (%esp), %esi # 4-byte Folded Reload
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: cmpl %edi, %eax			; X86-NEXT: cmpl %edi, %eax
	; X86-NEXT: cmovbl {{[0-9]+}}(%esp), %edx			; X86-NEXT: cmovbl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Folded Reload			; X86-NEXT: cmovel {{[-0-9]+}}(%e{{[sb]}}p), %ecx # 4-byte Folded Reload
	; X86-NEXT: cmovbl %eax, %edi			; X86-NEXT: cmovbl %eax, %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: movl %edi, 12(%eax)			; X86-NEXT: movl %edi, 12(%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %ecx, 8(%eax)
	; X86-NEXT: movl %esi, 4(%eax)			; X86-NEXT: movl %esi, 4(%eax)
	; X86-NEXT: movl %ecx, (%eax)			; X86-NEXT: movl %edx, (%eax)
	; X86-NEXT: addl $8, %esp			; X86-NEXT: addl $8, %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	%r = call i128 @llvm.umin.i128(i128 %a, i128 %b)			%r = call i128 @llvm.umin.i128(i128 %a, i128 %b)
	ret i128 %r			ret i128 %r
	[578 lines not shown]

llvm/test/CodeGen/X86/umul-with-overflow.ll

	[81 lines not shown]
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: subl $76, %esp			; X86-NEXT: subl $76, %esp
	; X86-NEXT: movl $4095, %ecx # imm = 0xFFF			; X86-NEXT: movl $4095, %ecx # imm = 0xFFF
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
	; X86-NEXT: andl %ecx, %eax			; X86-NEXT: andl %ecx, %eax
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: andl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: andl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
				; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl %ebx, %eax			; X86-NEXT: movl %ebx, %eax
	; X86-NEXT: mull %edi			; X86-NEXT: mull %edi
	; X86-NEXT: movl %edx, %esi			; X86-NEXT: movl %edx, %esi
	; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill			; X86-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
	; X86-NEXT: mull %edi			; X86-NEXT: mull %edi
	; X86-NEXT: movl %edx, %edi			; X86-NEXT: movl %edx, %edi
	[427 lines not shown]
	; X64-NEXT: pushq %r14			; X64-NEXT: pushq %r14
	; X64-NEXT: pushq %r13			; X64-NEXT: pushq %r13
	; X64-NEXT: pushq %r12			; X64-NEXT: pushq %r12
	; X64-NEXT: pushq %rbx			; X64-NEXT: pushq %rbx
	; X64-NEXT: movq %r9, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill			; X64-NEXT: movq %r9, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
	; X64-NEXT: movq %r8, %r11			; X64-NEXT: movq %r8, %r11
	; X64-NEXT: movq %rcx, %r8			; X64-NEXT: movq %rcx, %r8
	; X64-NEXT: movq %rdx, %rcx			; X64-NEXT: movq %rdx, %rcx
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %r10
	; X64-NEXT: movq {{[0-9]+}}(%rsp), %r9			; X64-NEXT: movq {{[0-9]+}}(%rsp), %r9
				; X64-NEXT: movq {{[0-9]+}}(%rsp), %r10
	; X64-NEXT: movq %rsi, %rax			; X64-NEXT: movq %rsi, %rax
	; X64-NEXT: mulq %r10			; X64-NEXT: mulq %r10
	; X64-NEXT: movq %rdx, %rbx			; X64-NEXT: movq %rdx, %rbx
	; X64-NEXT: movq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill			; X64-NEXT: movq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
	; X64-NEXT: movq %rcx, %rax			; X64-NEXT: movq %rcx, %rax
	; X64-NEXT: mulq %r10			; X64-NEXT: mulq %r10
	; X64-NEXT: movq %r10, %rbp			; X64-NEXT: movq %r10, %rbp
	; X64-NEXT: movq %rdx, %r14			; X64-NEXT: movq %rdx, %r14
	[85 lines not shown]

llvm/test/CodeGen/X86/wide-integer-cmp.ll

	[93 lines not shown]
	; CHECK-NEXT: .cfi_def_cfa_offset 8			; CHECK-NEXT: .cfi_def_cfa_offset 8
	; CHECK-NEXT: .cfi_offset %esi, -8			; CHECK-NEXT: .cfi_offset %esi, -8
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax			; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ecx			; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %edx			; CHECK-NEXT: movl {{[0-9]+}}(%esp), %edx
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %esi			; CHECK-NEXT: movl {{[0-9]+}}(%esp), %esi
	; CHECK-NEXT: cmpl {{[0-9]+}}(%esp), %edx			; CHECK-NEXT: cmpl {{[0-9]+}}(%esp), %edx
	; CHECK-NEXT: sbbl {{[0-9]+}}(%esp), %esi			; CHECK-NEXT: sbbl {{[0-9]+}}(%esp), %esi
	; CHECK-NEXT: sbbl {{[0-9]+}}(%esp), %ecx
	; CHECK-NEXT: sbbl {{[0-9]+}}(%esp), %eax			; CHECK-NEXT: sbbl {{[0-9]+}}(%esp), %eax
				; CHECK-NEXT: sbbl {{[0-9]+}}(%esp), %ecx
	; CHECK-NEXT: jge .LBB4_2			; CHECK-NEXT: jge .LBB4_2
	; CHECK-NEXT: # %bb.1: # %bb1			; CHECK-NEXT: # %bb.1: # %bb1
	; CHECK-NEXT: movl $1, %eax			; CHECK-NEXT: movl $1, %eax
	; CHECK-NEXT: popl %esi			; CHECK-NEXT: popl %esi
	; CHECK-NEXT: .cfi_def_cfa_offset 4			; CHECK-NEXT: .cfi_def_cfa_offset 4
	; CHECK-NEXT: retl			; CHECK-NEXT: retl
	; CHECK-NEXT: .LBB4_2: # %bb2			; CHECK-NEXT: .LBB4_2: # %bb2
	; CHECK-NEXT: .cfi_def_cfa_offset 8			; CHECK-NEXT: .cfi_def_cfa_offset 8
	[37 lines not shown]

llvm/test/CodeGen/X86/xaluo128.ll

	[18 lines not shown]
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: addl {{[0-9]+}}(%esp), %edi			; X86-NEXT: addl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: adcl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx			; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: adcl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: seto %al			; X86-NEXT: seto %al
	; X86-NEXT: movl %edi, (%ecx)			; X86-NEXT: movl %edi, (%ecx)
	; X86-NEXT: movl %ebx, 4(%ecx)			; X86-NEXT: movl %ebx, 4(%ecx)
	; X86-NEXT: movl %esi, 8(%ecx)			; X86-NEXT: movl %edx, 8(%ecx)
	; X86-NEXT: movl %edx, 12(%ecx)			; X86-NEXT: movl %esi, 12(%ecx)
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: retl			; X86-NEXT: retl
	%t = call {i128, i1} @llvm.sadd.with.overflow.i128(i128 %v1, i128 %v2)			%t = call {i128, i1} @llvm.sadd.with.overflow.i128(i128 %v1, i128 %v2)
	%val = extractvalue {i128, i1} %t, 0			%val = extractvalue {i128, i1} %t, 0
	%obit = extractvalue {i128, i1} %t, 1			%obit = extractvalue {i128, i1} %t, 1
	store i128 %val, ptr %res			store i128 %val, ptr %res
	[17 lines not shown]
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: addl {{[0-9]+}}(%esp), %edi			; X86-NEXT: addl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: adcl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx			; X86-NEXT: adcl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: adcl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: setb %al			; X86-NEXT: setb %al
	; X86-NEXT: movl %edi, (%ecx)			; X86-NEXT: movl %edi, (%ecx)
	; X86-NEXT: movl %ebx, 4(%ecx)			; X86-NEXT: movl %ebx, 4(%ecx)
	; X86-NEXT: movl %esi, 8(%ecx)			; X86-NEXT: movl %edx, 8(%ecx)
	; X86-NEXT: movl %edx, 12(%ecx)			; X86-NEXT: movl %esi, 12(%ecx)
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: retl			; X86-NEXT: retl
	%t = call {i128, i1} @llvm.uadd.with.overflow.i128(i128 %v1, i128 %v2)			%t = call {i128, i1} @llvm.uadd.with.overflow.i128(i128 %v1, i128 %v2)
	%val = extractvalue {i128, i1} %t, 0			%val = extractvalue {i128, i1} %t, 0
	%obit = extractvalue {i128, i1} %t, 1			%obit = extractvalue {i128, i1} %t, 1
	store i128 %val, ptr %res			store i128 %val, ptr %res
	[18 lines not shown]
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: subl {{[0-9]+}}(%esp), %edi			; X86-NEXT: subl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: sbbl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: sbbl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: sbbl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: sbbl {{[0-9]+}}(%esp), %edx			; X86-NEXT: sbbl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: sbbl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: seto %al			; X86-NEXT: seto %al
	; X86-NEXT: movl %edi, (%ecx)			; X86-NEXT: movl %edi, (%ecx)
	; X86-NEXT: movl %ebx, 4(%ecx)			; X86-NEXT: movl %ebx, 4(%ecx)
	; X86-NEXT: movl %esi, 8(%ecx)			; X86-NEXT: movl %edx, 8(%ecx)
	; X86-NEXT: movl %edx, 12(%ecx)			; X86-NEXT: movl %esi, 12(%ecx)
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: retl			; X86-NEXT: retl
	%t = call {i128, i1} @llvm.ssub.with.overflow.i128(i128 %v1, i128 %v2)			%t = call {i128, i1} @llvm.ssub.with.overflow.i128(i128 %v1, i128 %v2)
	%val = extractvalue {i128, i1} %t, 0			%val = extractvalue {i128, i1} %t, 0
	%obit = extractvalue {i128, i1} %t, 1			%obit = extractvalue {i128, i1} %t, 1
	store i128 %val, ptr %res			store i128 %val, ptr %res
	[17 lines not shown]
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl {{[0-9]+}}(%esp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: subl {{[0-9]+}}(%esp), %edi			; X86-NEXT: subl {{[0-9]+}}(%esp), %edi
	; X86-NEXT: sbbl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: sbbl {{[0-9]+}}(%esp), %ebx
	; X86-NEXT: sbbl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: sbbl {{[0-9]+}}(%esp), %edx			; X86-NEXT: sbbl {{[0-9]+}}(%esp), %edx
				; X86-NEXT: sbbl {{[0-9]+}}(%esp), %esi
	; X86-NEXT: setb %al			; X86-NEXT: setb %al
	; X86-NEXT: movl %edi, (%ecx)			; X86-NEXT: movl %edi, (%ecx)
	; X86-NEXT: movl %ebx, 4(%ecx)			; X86-NEXT: movl %ebx, 4(%ecx)
	; X86-NEXT: movl %esi, 8(%ecx)			; X86-NEXT: movl %edx, 8(%ecx)
	; X86-NEXT: movl %edx, 12(%ecx)			; X86-NEXT: movl %esi, 12(%ecx)
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: retl			; X86-NEXT: retl
	%t = call {i128, i1} @llvm.usub.with.overflow.i128(i128 %v1, i128 %v2)			%t = call {i128, i1} @llvm.usub.with.overflow.i128(i128 %v1, i128 %v2)
	%val = extractvalue {i128, i1} %t, 0			%val = extractvalue {i128, i1} %t, 0
	%obit = extractvalue {i128, i1} %t, 1			%obit = extractvalue {i128, i1} %t, 1
	store i128 %val, ptr %res			store i128 %val, ptr %res
	ret i1 %obit			ret i1 %obit
	}			}

	declare {i128, i1} @llvm.sadd.with.overflow.i128(i128, i128) nounwind readnone			declare {i128, i1} @llvm.sadd.with.overflow.i128(i128, i128) nounwind readnone
	declare {i128, i1} @llvm.uadd.with.overflow.i128(i128, i128) nounwind readnone			declare {i128, i1} @llvm.uadd.with.overflow.i128(i128, i128) nounwind readnone
	declare {i128, i1} @llvm.ssub.with.overflow.i128(i128, i128) nounwind readnone			declare {i128, i1} @llvm.ssub.with.overflow.i128(i128, i128) nounwind readnone
	declare {i128, i1} @llvm.usub.with.overflow.i128(i128, i128) nounwind readnone			declare {i128, i1} @llvm.usub.with.overflow.i128(i128, i128) nounwind readnone

Revision Contents

Diff 491098

llvm/lib/Target/X86/X86ISelLowering.h

llvm/lib/Target/X86/X86ISelLowering.cpp

llvm/test/CodeGen/X86/add-sub-bool.ll

llvm/test/CodeGen/X86/bswap-wide-int.ll

llvm/test/CodeGen/X86/fshl.ll

llvm/test/CodeGen/X86/fshr.ll

llvm/test/CodeGen/X86/i128-add.ll

llvm/test/CodeGen/X86/icmp-shift-opt.ll

llvm/test/CodeGen/X86/legalize-shl-vec.ll

llvm/test/CodeGen/X86/merge-consecutive-stores-nt.ll

llvm/test/CodeGen/X86/setcc-wide-types.ll

llvm/test/CodeGen/X86/smin.ll

llvm/test/CodeGen/X86/smul-with-overflow.ll

llvm/test/CodeGen/X86/smulo-128-legalisation-lowering.ll

llvm/test/CodeGen/X86/umin.ll

llvm/test/CodeGen/X86/umul-with-overflow.ll

llvm/test/CodeGen/X86/wide-integer-cmp.ll

llvm/test/CodeGen/X86/xaluo128.ll