
[X86] Transform setcc + movzbl into xorl + setcc
ClosedPublic

Authored by mkuper on Jun 27 2016, 4:50 PM.

Details

Summary

xorl + setcc is generally the preferred sequence, because setcc + movzbl suffers from a partial register stall. The xorl + setcc encoding is also one byte smaller. llvm.org/PR28146 and the other associated PRs have more details on this.
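For illustration (not taken from the patch itself), the two sequences for materializing a flag into a 32-bit register look like this:

```
# Old sequence: setcc writes only the low byte, so the movzbl reads a
# partially-written register and can suffer a partial-register stall:
	sete	%al
	movzbl	%al, %eax

# Preferred sequence: zero the full register first (xorl is a
# dependency-breaking zeroing idiom), then write the low byte. Note the
# xorl must be placed before the flags-setting instruction, since it
# clobbers EFLAGS:
	xorl	%eax, %eax
	sete	%al
```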

Unfortunately, this cannot be handled in DAG ISel, because of how X86 SETCC is modeled. And changing the way SETCC is modeled does not seem too attractive. So, we need to clean this up post-ISel. Dave Kreitzer suggested using pseudos to represent a zexted setcc, which is cleaner in some sense, but dirtier in others; on balance, I think I prefer this patch. Dave, if you (or anyone else) feel strongly that we should be using pseudos - or some other solution - instead, we can continue the discussion here.

Note that this is not a win in 100% of cases - for example, some pcmpstri test cases below get pessimized. This happens because the register allocator is over-constrained by the extended value being forced into %eax. We should not, however, see this sort of thing inside hot loops. Suggestions on how to resolve this are welcome, although I don't believe it should block this patch.

Diff Detail

Repository
rL LLVM

Event Timeline

mkuper updated this revision to Diff 62039.Jun 27 2016, 4:50 PM
mkuper retitled this revision from to [X86] Transform setcc + movzbl into xorl + setcc.
mkuper updated this object.
mkuper added a subscriber: llvm-commits.
delena added inline comments.Jun 27 2016, 11:02 PM
lib/Target/X86/X86FixupSetCC.cpp
119 ↗(On Diff #62039)

When you go backward, you should ensure that MI.getOperand(0).getReg() is not used between FlagsDefMI and the setcc, including by FlagsDefMI itself.

for example:

test %eax, %eax
setcc %al

the inserted xor could kill %eax before the "test".

ab added a subscriber: ab.Jun 28 2016, 8:37 AM

Have you considered doing this in DAGToDAG? I'm not sure you'll be able to model the imp-use of EFLAGS (IIRC that's also a longstanding deficiency of pattern matching: we can't turn explicit into implicit), but seems worth a try?

lib/Target/X86/CMakeLists.txt
20 ↗(On Diff #62039)

Go ahead and reorder separately?

hans added a subscriber: hans.Jun 28 2016, 9:21 AM

Thanks for working on this!

lib/Target/X86/X86FixupSetCC.cpp
84 ↗(On Diff #62039)

nit: I'd use braces here to help the reader

115 ↗(On Diff #62039)

First I thought this was a "break" from the loop, but it's for the switch of course.

Maybe this would be easier to read if the switch part was broken out to an "isSetCC" helper function, then the loop could be

for (auto &MI : MBB) {
  if (!isSetCC(MI))
    continue;

  // do stuff, with no confusion about 'break'
}

Thanks Elena, Ahmed!

In D21774#468880, @ab wrote:

Have you considered doing this in DAGToDAG? I'm not sure you'll be able to model the imp-use of EFLAGS (IIRC that's also a longstanding deficiency of pattern matching: we can't turn explicit into implicit), but seems worth a try?

Yes, that's what I started out with. That really would have been simpler, but I couldn't find a way to make it work.

lib/Target/X86/CMakeLists.txt
20 ↗(On Diff #62039)

Will do.

lib/Target/X86/X86FixupSetCC.cpp
119 ↗(On Diff #62039)

I'm not sure I understood the scenario. This is pre-RA, the instruction I'm adding writes into a new vreg, so it can't clobber any other register.
But it made me notice a different edge case. I'm not sure this happens in practice, but in theory, FlagsDefMI may also imp-use eflags, in which case the transformation is invalid. I'll update the patch.

Thanks, Hans.

lib/Target/X86/X86FixupSetCC.cpp
84 ↗(On Diff #62039)

Ack.

115 ↗(On Diff #62039)

You're right, it'll look nicer.

mkuper updated this revision to Diff 62107.Jun 28 2016, 10:21 AM
RKSimon added inline comments.Jun 28 2016, 10:36 AM
test/CodeGen/X86/sse42-intrinsics-x86.ll
1 ↗(On Diff #62107)

Regenerate this first so that we can see the diffs?

mkuper added inline comments.Jun 28 2016, 10:43 AM
test/CodeGen/X86/sse42-intrinsics-x86.ll
1 ↗(On Diff #62107)

Will do.

But it's basically the same as in avx-intrinsics-x86.ll - some of the pcmpestr tests get pessimized, because we can't setcc directly into %al (since we can't xor %eax, which holds a live input). So we use an additional register, and get the additional mov and pop (since it's the register's only use).
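For the curious, a hypothetical sketch of this kind of pessimization (register choices and the immediate are illustrative, not taken from the actual test output): pcmpestri implicitly reads %eax and %edx, so the zeroing xor has to target some other register, and an extra mov is needed to place the result in %eax:

```
	xorl	%ebx, %ebx		# can't zero %eax: it's a live input below;
					# xor goes before the flags setter
	pcmpestri	$7, %xmm1, %xmm0	# implicitly reads %eax/%edx, sets EFLAGS
	seta	%bl
	movl	%ebx, %eax		# extra mov to produce the result in %eax
```

Since %ebx is callee-saved, using it also drags in the push/pop pair mentioned above when it has no other use.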

mkuper updated this revision to Diff 62111.Jun 28 2016, 10:53 AM

Updated SSE4.2 test.

mkuper updated this revision to Diff 62124.Jun 28 2016, 12:04 PM

I managed to confuse myself with the order of iteration - erasing the zext immediately actually isn't safe, even though I don't have a case where it fails.

DavidKreitzer edited edge metadata.Jun 28 2016, 3:09 PM

Hi Michael,

Thanks for working on this! Ultimately I am okay with this approach, but I have a few high level comments.

(1) The advantage of the pseudo-SETcc instruction approach over this approach is that the dependence of the SETcc on the XOR is explicit. In your transformed code, I think there is nothing to prevent this:

v1 = MOV32r0
<CC setter>
v2 = SETcc
v3 = INSERT_SUBREG(v1, v2)

from being transformed into this

<CC setter>
v2 = SETcc
v1 = MOV32r0
v3 = INSERT_SUBREG(v1, v2)

I'd expect this to be a reasonably common occurrence, since the RA might choose to regenerate the MOV32r0 at its use to save on register pressure. (Regenerating the MOV32r0 eliminates the register conflicts with the operands of the CC setter.) And this would, ironically, have the effect of using an extra register & an extra movb instruction since v1 and v2 could no longer be assigned the same register.

(2) Do you have any performance data on the patch?

(3) There is some advantage to generating the XOR/SETcc idiom even for SETcc instructions that don't get zero extended just for the purpose of eliminating the false dependence. I know that carries a code size cost and has some performance risk, but it is at least worth experimenting with. That's something for later, though. We should go after the "easy" cases first like you've done in this patch. We might also choose to handle these harder cases as part of the ExecutionDepsFix pass and not in this new pass.
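For illustration of point (3), a minimal sketch of the false dependence in question: setcc writes only the low byte, so the full-width register keeps a dependence on its previous writer even when no zero extension follows:

```
	cmpl	%esi, %edi
	sete	%al			# writes only %al; the upper bits of %eax
					# are preserved, so this carries a false
					# dependence on the last writer of %eax

# With the idiom applied (the xorl must precede the compare, since it
# clobbers EFLAGS), the dependence is broken at a small code-size cost:
	xorl	%eax, %eax
	cmpl	%esi, %edi
	sete	%al
```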

Thanks,
Dave

lib/Target/X86/X86FixupSetCC.cpp
1 ↗(On Diff #62124)

Please fix the cut & paste error here.

test/CodeGen/X86/fp128-compare.ll
11 ↗(On Diff #62124)

This looks like a problem ...

Thanks for the feedback, Dave!

Hi Michael,

Thanks for working on this! Ultimately I am okay with this approach, but I have a few high level comments.

(1) The advantage of the pseudo-SETcc instruction approach over this approach is that the dependence of the SETcc on the XOR is explicit.

Right, I agree it would be better to have an explicit dependence. It's just that I'm not a fan of either of the two ways we currently have to achieve this (modifying SETCC or introducing a pseudo).

In your transformed code, I think there is nothing to prevent this:

v1 = MOV32r0
<CC setter>
v2 = SETcc
v3 = INSERT_SUBREG(v1, v2)

from being transformed into this

<CC setter>
v2 = SETcc
v1 = MOV32r0
v3 = INSERT_SUBREG(v1, v2)

I'd expect this to be a reasonably common occurrence, since the RA might choose to regenerate the MOV32r0 at its use to save on register pressure. (Regenerating the MOV32r0 eliminates the register conflicts with the operands of the CC setter.) And this would, ironically, have the effect of using an extra register & an extra movb instruction since v1 and v2 could no longer be assigned the same register.

I actually don't think this happens often, but you're right, conceptually it's not something we should be relying on, and it may be fatal.

(2) Do you have any performance data on the patch?

No, I've verified it fixes the simple cases, but didn't really run comprehensive performance checks, as I assumed generating these xors is pure goodness. :-)
Will do, either for this patch or for a different one, depending on whether we go with this or with pseudos.

lib/Target/X86/X86FixupSetCC.cpp
1 ↗(On Diff #62124)

Right, thanks.

test/CodeGen/X86/fp128-compare.ll
11 ↗(On Diff #62124)

It's the same issue as pcmpestr - we are constrained because both the input of the instruction defining eflags and the eventual output of the setcc must be %eax.

The full code is:

# BB#0:                                 # %entry
	pushq	%rax
.Ltmp1:
	.cfi_def_cfa_offset 16
	callq	__getf2
	xorl	%ecx, %ecx
	testl	%eax, %eax
	setns	%cl
	movl	%ecx, %eax
	popq	%rcx
	retq

I should regenerate the test with the update script.

ab added a subscriber: MatzeB.Jun 28 2016, 4:25 PM

+1 to David's comments about the weirdness around the dependencies. I'm not sure how much of a problem that is in practice though, since that sounds like a pessimization that goes against register coalescing regardless of the specifics of this patch.

Speaking of which; I looked a little closer at the ISel approach, and, with Matthias's and Quentin's help, I have a patch that looks like it works? Can you give it a try: D21822

In D21774#469389, @ab wrote:

+1 to David's comments about the weirdness around the dependencies. I'm not sure how much of a problem that is in practice though, since that sounds like a pessimization that goes against register coalescing regardless of the specifics of this patch.
Speaking of which; I looked a little closer at the ISel approach, and, with Matthias's and Quentin's help, I have a patch that looks like it works? Can you give it a try: D21822

Ooh, this looks nice!
I actually tried to do it as a target-specific DAGCombine for zext and not in DAGToDAG, with something like:

if (N0.getOpcode() == X86ISD::SETCC) {
  return DAG.getTargetInsertSubreg(
      X86::sub_8bit, dl, VT,
      SDValue(DAG.getMachineNode(X86::MOV32r0, dl, VT), 0), N0);
}

But got stuck with the same kind of sub-optimal code, and couldn't see a way out.

I'll take a closer look at D21822, thanks!

D21822 looks really nice. It's missing some small things (anyext and zext to i64 need to be handled explicitly - on the MI level, they have already been lowered to a zext to i32), and the call to Select worries me (is it strictly necessary? We're supposed to always select bottom-up, right?), but other than that, I believe it works.

It does however expose some of the weirdness Dave was talking about, simply by virtue of happening early and thus having less predictable code at RA time.
For CodeGen/X86/legalize-shift-64.ll we will get:

...
	movb	$32, %cl
	testb	%cl, %cl
	sete	%bl
	movl	$0, %ecx
	movb	%bl, %cl
...

The block the sete lives in happens to have two different MOV32r0 instructions, one for the sete, and another for an unrelated reason. One of them gets MachineCSE'd, and then regalloc has to remat a 0, and happens to remat it like this.

I played around with it a bit more, and it seems like doing this in DAGToDAG is too early.
This just doesn't play well with higher-level optimizations. In addition to the legalize-shift-64.ll case above, the DAGToDAG patch doesn't even fix the original case PR28146 was reduced from. There, the MOV32r0 gets hoisted up by MachineLICM, with the result being a similar setcc + mov 0 + movb pattern.

Ahmed, are you OK with keeping this a late (after SSA optimizations) pre-RA pass? Or do you prefer to try again with pseudos, like Dave proposed? Any other suggestions?

Dave, regarding performance: this patch (the machine IR pass) looks performance-neutral on SPEC2006.
And, of course, it improves the performance of the workload PR28146 was reduced from by the expected amount. :-)

igorb added a subscriber: igorb.Jun 30 2016, 6:08 AM
ab added a comment.Jun 30 2016, 7:16 AM

Eh, if Dave has no objections, I'm also fine with the pass; I was hoping we'd find a better solution, sorry for the noise!

The pseudo does seem like the least brittle approach; but the pre-RA pass seems good enough?
So... have you considered doing it post-RA? Seems like that lets you avoid the PCMPESTR problem, and I can't think of obvious drawbacks. It's a tad trickier but should be very close to this patch.

the call to Select worries me (is it strictly necessary? We're supposed to always select bottom-up, right?)

You're right, I don't think it's necessary.

It does however expose some of the weirdness Dave was talking about, simply by virtue of happening early and thus having less predictable code at RA time.
For CodeGen/X86/legalize-shift-64.ll we will get:

...
	movb	$32, %cl
	testb	%cl, %cl
	sete	%bl
	movl	$0, %ecx
	movb	%bl, %cl
...

The block the sete lives in happens to have two different MOV32r0 instructions, one for the sete, and another for an unrelated reason. One of them gets MachineCSE'd, and then regalloc has to remat a 0, and happens to remat it like this.

For the curious: I investigated this some more. We do have the ability to remat using MOV32r0, but we don't because EFLAGS is live (I swear that liveness up-and-down loop is duplicated, like, a dozen times). Immediately after that code, there's a jne, so EFLAGS really is live across the rematerialization point, which doesn't sound common.

But that's a separate issue from the INSERT_SUBREG becoming a real copy. We looked into it with Matthias: on 32-bit, only the ABCD registers are available as sub_8bit, so the INSERT_SUBREG ends up being constrained:

  MOV32mi <fi#0>, 1, %noreg, 0, %noreg, 1; mem:ST4[%x]
  MOV32mi <fi#1>, 1, %noreg, 4, %noreg, 0; mem:ST4[%t+4]
  MOV32mi <fi#1>, 1, %noreg, 0, %noreg, 1; mem:ST4[%t](align=8)
  %vreg0<def> = MOV32ri 1; GR32:%vreg0
  %vreg1<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg1
  %vreg2<def,tied1> = SHLD32rri8 %vreg1<tied0>, %vreg0, 32, %EFLAGS<imp-def,dead>; GR32:%vreg2,%vreg1,%vreg0
  %vreg3<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg3
  %vreg4<def> = MOV8ri 32; GR8:%vreg4
  TEST8rr %vreg4, %vreg4, %EFLAGS<imp-def>; GR8:%vreg4
  %vreg5<def> = SETEr %EFLAGS<imp-use>; GR8:%vreg5
## GR32_ABCD:
  %vreg6<def,tied1> = INSERT_SUBREG %vreg3<tied0>, %vreg5<kill>, sub_8bit; GR32_ABCD:%vreg6 GR32:%vreg3 GR8:%vreg5
  %vreg7<def> = CMOV_GR32 %vreg2<kill>, %vreg0, 9, %EFLAGS<imp-use>; GR32:%vreg7,%vreg2,%vreg0
  %vreg8<def,tied1> = XOR32ri8 %vreg7<tied0>, 1, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg7
  %vreg9<def,tied1> = OR32rr %vreg6<tied0>, %vreg8<kill>, %EFLAGS<imp-def>; GR32:%vreg9,%vreg8 GR32_ABCD:%vreg6
  JE_1 <BB#2>, %EFLAGS<imp-use>
  JMP_1 <BB#1>

Once the MOV32r0 is CSE'd, we end up not reusing the (unconstrained GR32) vreg, because of that constraint; we end up with a copy after 2-addr:

  MOV32mi <fi#0>, 1, %noreg, 0, %noreg, 1; mem:ST4[%x]
  MOV32mi <fi#1>, 1, %noreg, 4, %noreg, 0; mem:ST4[%t+4]
  MOV32mi <fi#1>, 1, %noreg, 0, %noreg, 1; mem:ST4[%t](align=8)
  %vreg0<def> = MOV32ri 1; GR32:%vreg0
  %vreg1<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg1
  %vreg4<def> = MOV8ri 32; GR8:%vreg4
  TEST8rr %vreg4<kill>, %vreg4, %EFLAGS<imp-def>; GR8:%vreg4
  %vreg5<def> = SETEr %EFLAGS<imp-use>; GR8:%vreg5
## COPY to GR32_ABCD:
  %vreg6<def> = COPY %vreg1; GR32_ABCD:%vreg6 GR32:%vreg1
  %vreg6:sub_8bit<def> = COPY %vreg5<kill>; GR32_ABCD:%vreg6 GR8:%vreg5
  JE_1 <BB#1>, %EFLAGS<imp-use,kill>

And that's how we get the worst-case sequence.
So, we thought it was fine if it was more likely only on 32-bit, but now we have one 64-bit example in PR28146. I'd be interested to investigate that too, but that's only for curiosity; let's forget about the ISel approach.

spatel added a subscriber: spatel.Jun 30 2016, 8:56 AM
spatel added inline comments.
test/CodeGen/X86/avx-intrinsics-x86.ll
1946–1953 ↗(On Diff #62124)

Use the 'nounwind' attribute on this and other tests to eliminate some of the diff noise?

In D21774#471231, @ab wrote:

Eh, if Dave has no objections, I'm also fine with the pass; I was hoping we'd find a better solution, sorry for the noise!

It's not noise at all, I was - and still am - hoping we'd find a better solution as well.

The pseudo does seem like the least brittle approach; but the pre-RA pass seems good enough?

I have two main concerns about pseudos:

  1. Opaqueness: the pseudos are at least somewhat opaque to machine IR passes. I'm not sure how much of a concern this really is in practice, but it bothers me.
  2. Cleanliness: this will mean introducing ~15 new pseudos (or a pseudo with a parametric CC code that only gets resolved at expand time, which I think is even worse, since it'll then look very different from the normal x86 setcc).

But if "public opinion" is strongly towards pseudos, I can go with that. Dave has a prototype patch, I'll see if I run into any gotchas with it.

So... have you considered doing it post-RA? Seems like that lets you avoid the PCMPESTR problem, and I can't think of obvious drawbacks. It's a tad trickier but should be very close to this patch.

What bothers me about post-RA is cases where the flags-setting instruction and the setcc use the same register, e.g.

testb %al, %al
seta %al
movzbl %al, %eax

It may be the case that this never has a read stall on the seta, because if there were one, we would always stall on the instruction that defs the flags (or on some instruction between the test and the setcc, in more complicated cases). I haven't been able to convince myself of this, though.

And that's how we get the worst-case sequence.
So, we thought it was fine if it was more likely only on 32-bit, but now we have one 64-bit example in PR28146. I'd be interested to investigate that too, but that's only for curiosity; let's forget about the ISel approach.

Yes, I'm curious about it too. I'll try to take a look at it separately and see if I can make sense of it. But, I'm, by a long shot, not as familiar with regalloc as I'd like...

test/CodeGen/X86/avx-intrinsics-x86.ll
1946–1953 ↗(On Diff #62124)

Sure, will do.

I have no objection to this solution. I think it is robust in a functional sense. And we can always change it later if we discover that the worst case MOV32r0 sinking scenario is more common than we think. It would give me the warm fuzzies if we had some experimental evidence to confirm the suspicion that this is a rare case. Do you already have that? If not, maybe it would be a good idea to write a late pass that looks for this kind of pattern and count the number of occurrences on, say, cpu2006?

SETcc r8
xor r32, r32 (or mov $0, r32)
movb r32b, r8

And thanks for the performance data! Once this gets committed, I'll have someone run testing on a broader set of workloads.

test/CodeGen/X86/fp128-compare.ll
11 ↗(On Diff #62124)

Ah yes, of course. You can ignore my comment.

mkuper added a comment.Jul 6 2016, 2:08 PM

After looking at the pseudos a bit more, I really prefer to commit this.

Of course, if we have empirical evidence that this misses cases, I'm open to moving to anything that works better - be it a post-RA pass or pseudos.
Dave, feel free to provide such empirical evidence in the form of PRs assigned to me. :-)

This revision was automatically updated to reflect the committed changes.