This will have to sit unmerged until we're sure that the unary FNEG has not regressed from the binary pseudo-FNEG. But it's ready to go when we're satisfied. There were no surprises in this patch.
Details

- Reviewers: spatel, kpn, arsenm, craig.topper, andrew.w.kaylor
- Commits:
  - rG20b8ed2c2b10: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator
  - rL374782: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator
  - rC374782: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator
  - rG47363a148f1d: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator
  - rL374240: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator
  - rC374240: [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator

Diff Detail

- Repository: rG LLVM Github Monorepo

Event Timeline
Ok, I think we're pretty close to having unary FNeg optimizations in line with binary FNeg optimizations. Is anyone aware of any obvious passes that I've missed?
Also I'm looking for some test-suite guidance. I've benchmarked the change in this patch and the results appear to be in the noise range (I think?):
<scrubbed> llvm-project/test-suite-build> ../test-suite/utils/compare.py --filter-short fneg1.json fneg2.json fneg3.json vs stock1.json stock2.json stock3.json
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Tests: 546
Short Running: 187 (filtered out)
Remaining: 359
Metric: exec_time

Program                                        lhs     rhs    diff
test-suite...marks/CoyoteBench/lpbench.test    1.19    1.11   -6.5%
test-suite...marks/Misc/matmul_f64_4x4.test    0.68    0.64   -5.8%
test-suite...e/Benchmarks/McGill/chomp.test    0.67    0.63   -5.7%
test-suite...BENCHMARK_ORDERED_DITHER/128/2   60.65   63.84    5.3%
test-suite...HMARK_BICUBIC_INTERPOLATION/16   48.53   50.78    4.6%
test-suite...mbolics-flt/Symbolics-flt.test    0.76    0.73   -4.0%
test-suite...ks/Shootout/Shootout-hash.test    4.60    4.79    4.0%
test-suite...ce/Benchmarks/Olden/bh/bh.test    0.97    0.93   -3.5%
test-suite...oxyApps-C/miniGMG/miniGMG.test    0.65    0.63   -3.3%
test-suite...-flt/LinearDependence-flt.test    1.35    1.40    3.3%
test-suite...rolangs-C++/primes/primes.test    0.75    0.78    3.3%
test-suite...ncils/fdtd-apml/fdtd-apml.test    0.86    0.83   -3.2%
test-suite...s/Fhourstones/fhourstones.test    0.68    0.70    3.2%
test-suite.../Benchmarks/nbench/nbench.test    1.09    1.12    3.1%
test-suite...s/Misc/richards_benchmark.test    1.11    1.08   -3.0%
Geomean difference                                             nan%

                 lhs             rhs            diff
count     356.000000      359.000000      356.000000
mean     2207.107388     2187.731742       -0.000983
std     17335.321906    17252.648405        0.011480
min         0.608100        0.602200       -0.065347
25%         1.948518        1.886300       -0.002897
50%         6.446969        5.985600       -0.000070
75%       111.044515      106.027453        0.001283
max    214183.043000   214138.878333        0.052571
I've compared assembly for the notable differences, and unless I've made a horrible mistake, there are no generated asm differences. I suspect that the >5% swings are from I/O noise on my shared machine. Any thoughts/insights on using test-suite?
Another thought...
This patch is a change to the main FNeg IRBuilder entry point only! There may be value in a patch which switches *all* the FNeg entry points at once. However, this single patch alone is quite large and I'm not sure if more diff noise is a good thing.
Clang does not use the IRBuilder fneg entry point. When do we plan to switch clang over?
Sorry for the delay, Craig. I missed this comment...
So Clang does use the IRBuilder fneg entry point (at least in some places). That's why all these Clang tests have differences:
M clang/test/CodeGen/aarch64-v8.2a-fp16-intrinsics.c (2 lines)
M clang/test/CodeGen/avx512f-builtins.c (56 lines)
M clang/test/CodeGen/avx512vl-builtins.c (24 lines)
M clang/test/CodeGen/complex-math.c (6 lines)
M clang/test/CodeGen/fma-builtins.c (8 lines)
M clang/test/CodeGen/fma4-builtins.c (10 lines)
There may be more places that need to be updated though, I'm not sure.
Sorry, I wasn't very clear there. There are certainly places that use it, but the code for the unary minus operator does not use CreateFNeg. It asks for the constant value to use for negation and then makes an fsub or sub, depending on whether the type is floating-point or integer.
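For illustration, here is a rough sketch of the two emission strategies described above. This is a hypothetical helper, not Clang's actual VisitUnaryMinus code, and it assumes the ConstantFP::getZeroValueForNegation helper available at the time:

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/IRBuilder.h"
  using namespace llvm;

  // Hypothetical helper, not Clang's code: negate V either the historical way
  // (ask for the zero constant, then emit fsub/sub) or via the fneg entry point.
  Value *emitUnaryMinus(IRBuilder<> &B, Value *V, bool UseUnaryFNeg) {
    if (V->getType()->isFPOrFPVectorTy()) {
      if (UseUnaryFNeg)
        return B.CreateFNeg(V);                        // with this patch: 'fneg <ty> %v'
      Constant *NegZero = ConstantFP::getZeroValueForNegation(V->getType());
      return B.CreateFSub(NegZero, V);                 // historical pattern: 'fsub <ty> -0.0, %v'
    }
    return B.CreateNeg(V);                             // integer case: 'sub <ty> 0, %v'
  }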
Ah, OK. I probably should've dug deeper into the Clang code. That explains why this change didn't trigger many differences in test-suite, and also why only a select few Clang tests showed IR differences.
I'll check it out...
Thanks for pointing this out, Craig. I do see VisitUnaryMinus(...) now. [I'm glad that Clang distinguished between the unary and binary fneg -- that could've been trouble. ;)]
Kind of a larger question: Do we want to include the VisitUnaryMinus change into one mega-Diff? Or should we have several separate Diffs for each of the individual CreateFNeg(...) changes?
One mega-diff would probably be better for shaking out any performance problems. But it will also make the changeset much larger with test differences.
Never mind, we don't have a choice. Calling CreateFNeg(...) as-is would just create a binary FNeg.
Updated patch to generate unary FNeg in Clang.
New perf numbers:
llvm-project/test-suite-build> ../test-suite/utils/compare.py --filter-short stock1.json stock2.json stock3.json stock4.json stock5.json vs fneg1.json fneg2.json fneg3.json fneg4.json fneg5.json
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/FDRMode/fdrmode-bench.test' has No metrics!
Warning: 'test-suite :: MicroBenchmarks/XRay/ReturnReference/retref-bench.test' has No metrics!
Tests: 1151
Short Running: 799 (filtered out)
Remaining: 352
Metric: exec_time

Program                                        lhs     rhs    diff
test-suite...ProxyApps-C++/CLAMR/CLAMR.test    1.01    0.97   -4.3%
test-suite...late.test:BENCHMARK_DILATE/128  104.11  107.70    3.4%
test-suite.../Benchmarks/nbench/nbench.test    1.04    1.01   -3.2%
test-suite...s/MallocBench/cfrac/cfrac.test    0.63    0.61   -3.1%
test-suite.../Trimaran/enc-rc4/enc-rc4.test    0.65    0.66    3.0%
test-suite...e/Benchmarks/Misc/salsa20.test    3.95    3.85   -2.7%
test-suite...ut-C++/Shootout-C++-sieve.test    1.23    1.26    2.5%
test-suite...hootout/Shootout-heapsort.test    1.50    1.46   -2.4%
test-suite.../Applications/spiff/spiff.test    1.01    1.03    2.3%
test-suite...:BM_FIND_FIRST_MIN_LAMBDA/5001    2.88    2.82   -2.2%
test-suite...lications/sqlite3/sqlite3.test    1.50    1.54    2.1%
test-suite...-flt/LinearDependence-flt.test    1.38    1.35   -2.1%
test-suite...s/gramschmidt/gramschmidt.test    1.73    1.76    2.0%
test-suite...test:BM_PRESSURE_CALC_RAW/5001    9.40    9.59    2.0%
test-suite...ications/JM/lencod/lencod.test    2.89    2.94    1.8%
Geomean difference                                             nan%

                 lhs             rhs            diff
count     350.000000      351.000000      349.000000
mean     1998.973248     1997.490274       -0.000602
std     16776.472077    16804.318859        0.007581
min         0.604400        0.602700       -0.043298
25%         1.761900        1.753700       -0.001977
50%         6.234300        5.989400       -0.000066
75%       104.102761      104.041111        0.000735
max    215062.448333   215082.720667        0.034431
I believe these results are in the noise range for my machine. I examined the assembly for the listed tests, and some others, but no differences were found***.
- There were some IR differences from Value name uniquing, but I believe that's to be expected; i.e. there are fewer fsubs and new fnegs, and the removed subs shift the auto-generated names of later values (allocas and such). E.g.:
5239c5239
<   %i780 = alloca i32, align 4
---
>   %i779 = alloca i32, align 4
Not sure if clang / C source has any impact on their targets, but subscribing @mcberg2017 @escha in case this could make a difference for out-of-tree (GPU) hardware.
Just putting a placeholder here: NegateValue in Reassociate.cpp needs to be updated to handle both the binary and unary forms when examining the uses of the Value V. In fact, the full file should be looked over for other cases as well. I tried out the current patch and we hit the first case pretty quickly internally.
@mcberg2017 Do you have a test you can share?
I'm aware of at least one place in Reassociate that *may* be making an invalid transform (correctness of the sign bit on a NaN). But I assumed that reassociation would only take place if the reassoc FMF was present, so I put it on the back-burner for now.
Cameron, here is an example with the issue:
// bin/opt -reassociate repro.ll -S
define float @test1(float %arg, float %arg1, float %arg2, float %arg3) {
  %tmp1 = fsub fast float %arg3, %arg2
  %tmp2 = fadd fast float %arg, -5.000000e+02
  %tmp3 = fsub fast float %tmp1, %tmp2
  %tmp4 = fmul fast float %tmp3, %tmp3
  %tmp5 = fneg fast float %arg
  %tmp6 = fdiv fast float %tmp5, %arg1
  %tmp7 = fadd fast float %tmp6, %tmp4
  ret float %tmp7
}
Given that we may still want the binary fneg generated in Reassociate.cpp, this may be enough then, plus what was done in D66612.
Very difficult to tell. Reassociate is tricky (for me at least). :/
I'm throwing some obvious unary FNeg tests at it and they are working fine, compared against the equivalent binary FNeg tests. So I'm feeling fairly confident in general.
I could check in the tests if desired, but they're more like fuzz tests than targeting specific features of Reassoc. It would be hard to prove these new tests are not doubling coverage of some existing test.
Ping.
I've been digging through this pass and it seems to be ok AFAICT. OptimizeInst(...) canonicalizes both unary and binary FNegs to -1.0*X, if they are fast and part of a special multiply tree. Other FNegs end up as leaf nodes, so no problem there.
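As an illustration of that canonicalization step, here is a hypothetical sketch (not the actual Reassociate code); it assumes the m_FNeg pattern matcher, which recognizes both the unary fneg and the 'fsub -0.0, X' idiom:

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/IRBuilder.h"
  #include "llvm/IR/PatternMatch.h"
  using namespace llvm;
  using namespace llvm::PatternMatch;

  // Illustrative only: if I is a fast negation (unary or binary form), rewrite
  // it as -1.0 * X so it can participate in the surrounding multiply tree.
  Value *canonicalizeFNegToMul(Instruction *I, IRBuilder<> &B) {
    Value *X;
    if (match(I, m_FNeg(m_Value(X))) && I->isFast()) {
      B.SetInsertPoint(I);
      return B.CreateFMulFMF(ConstantFP::get(X->getType(), -1.0), X, I);
    }
    return nullptr; // other FNegs are left alone and become leaf nodes
  }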
Anyone aware of other situations I should look at?
Do we need to enhance EarlyCSE to see this equivalence:
define float @cse_fneg(float %x, i1 %cond) {
  %fneg_unary = fneg float %x
  %fneg_binary = fsub float -0.0, %x
  %r = select i1 %cond, float %fneg_unary, float %fneg_binary
  ret float %r
}
The binary fneg has the looser requirement for NaN propagation (IEEE-754 6.3: "this standard does not specify the sign bit of a NaN result [for math ops]"), but that's less important for optimization than knowing the 2 values are otherwise equivalent.
On 2nd thought, that doesn't really make sense for CSE. The real question is whether we should canonicalize the binary fneg to unary fneg in instcombine. And should that happen before/after/concurrent with this change to clang?
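For concreteness, a minimal sketch of what such an InstCombine-style canonicalization could look like; this is hypothetical code, not the in-tree implementation, and it assumes the m_FSub/m_NegZeroFP matchers:

  #include "llvm/IR/InstrTypes.h"
  #include "llvm/IR/PatternMatch.h"
  using namespace llvm;
  using namespace llvm::PatternMatch;

  // Hypothetical fold: rewrite the binary idiom 'fsub -0.0, X' as the unary
  // 'fneg X', carrying over any fast-math flags from the original fsub.
  // The caller is expected to insert the returned instruction and erase I.
  Instruction *foldBinaryFNegToUnary(BinaryOperator &I) {
    Value *X;
    if (match(&I, m_FSub(m_NegZeroFP(), m_Value(X)))) {
      Instruction *FNeg = UnaryOperator::CreateFNeg(X);
      FNeg->copyFastMathFlags(&I);
      return FNeg;
    }
    return nullptr;
  }

Semantically this direction is a refinement: the binary form may produce a NaN of either sign, while the unary form must flip the sign bit, so replacing the former with the latter only narrows the set of possible results.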
Tough question...
- The biggest problem I see with canonicalizing binary FNeg to unary FNeg is when DAZ/FTZ are set. With those set, a binary FNeg can be used to zero out insignificant (denormal) data, whereas a unary FNeg just flips the sign bit, regardless of whether the value is a denormal. We have a weather code that uses an explicit binary FNeg to sanitize its noisy input, so any canonicalization there would disturb the results. (A small demonstration of this difference appears after this list.)
Currently, if I'm not mistaken, Clang only enables DAZ/FTZ with -Ofast and -ffast-math, so no problem there. But, there's nothing stopping a user from compiling with -O0 and flipping the DAZ/FTZ bits themselves.
Conversely, it would be easy to argue that setting DAZ/FTZ breaks IEEE-754 compliance, so it doesn't matter what we do from that point on. Although, this seems like a weird grey area to me.
- Another concern is whether LLVM intends to respect the hardware-specified sign of a NaN or not. That is, IEEE-754 does not specify the sign of a NaN result, but the hardware may. E.g. x86 always returns a -NaN, called the QNaN floating-point indefinite.
It would be easy to argue that IEEE-754 doesn't specify the sign, so the compiler can do whatever it wants. IMO we can do better and should allow the hardware to choose the NaN it wants.
- @arsenm also had a concern about source modifiers on AMD GPUs. I don't know much about those, so will leave that discussion to him...
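Here is the small demonstration of the DAZ/FTZ difference referenced above. This is illustrative user code (nothing from the patch), it assumes an x86 host, and it should be built at -O0 so the compiler does not itself rewrite the subtraction into a plain sign flip -- which is, of course, exactly the canonicalization question under discussion:

  #include <cstdint>
  #include <cstdio>
  #include <cstring>
  #include <pmmintrin.h>   // _MM_SET_DENORMALS_ZERO_MODE
  #include <xmmintrin.h>   // _MM_SET_FLUSH_ZERO_MODE

  // Flip only the sign bit -- this is what the unary fneg lowers to.
  static float signFlip(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    bits ^= 0x80000000u;
    std::memcpy(&x, &bits, sizeof(bits));
    return x;
  }

  static uint32_t asBits(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    return bits;
  }

  int main() {
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);

    volatile float denormal = 1e-40f;   // subnormal in single precision
    volatile float negZero  = -0.0f;

    float viaFSub = negZero - denormal; // binary "fneg": DAZ treats the input as 0 -> -0.0f
    float viaFNeg = signFlip(denormal); // unary fneg: only the sign bit changes

    std::printf("fsub bits: 0x%08x   fneg bits: 0x%08x\n",
                asBits(viaFSub), asBits(viaFNeg));
    return 0;
  }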
Yeah, that sounds like we may get pushback from users depending on hardware target.
> Another concern is whether LLVM intends to respect the hardware-specified sign of a NaN or not. That is, IEEE-754 does not specify the sign of a NaN result, but the hardware may. E.g. x86 always returns a -NaN, called the QNaN floating-point indefinite.
> It would be easy to argue that IEEE-754 doesn't specify the sign, so the compiler can do whatever it wants. IMO we can do better and should allow the hardware to choose the NaN it wants.
I think we would sacrifice target-specific / implementation-defined behavior if it meant code could be optimized better, but we'd want some evidence of that improvement before making a change.
> @arsenm also had a concern about source modifiers on AMD GPUs. I don't know much about those, so will leave that discussion to him...
Ok. I have no objections/comments to this patch then. As long as clang is consistent in creating the unary fneg, the CSE concern is probably moot. LGTM, but give other reviewers another chance (1-2 days?) to comment before committing.
IIUC, the problem is for LLVM targets that are FTZ/DAZ all the time by hardware design, and they may not be in trunk. I don't know if it's explicitly stated anywhere, but we try to support those targets even though they are not IEEE-754 compliant.
For x86 - clang has this -ffast-math hack for Linux:
rL165240
See also:
https://bugs.llvm.org/show_bug.cgi?id=14024
With fast-math, anything goes, so we don't have to worry about that scenario.
And I think the case where a user changes MXCSR bits is UB for C/C++ ( https://bugs.llvm.org/show_bug.cgi?id=8100#c15 ). So x86 never has a problem in theory, but it might in practice because users believe that twiddling MXCSR bits is allowed?
clang/lib/CodeGen/CGExprScalar.cpp, lines 2587–2588:
This if will always evaluate the false branch now, right? So you should be able to get rid of the if statement totally.
We will eventually have to support #pragma STDC FENV_ACCESS, so Richard's reasoning won't hold forever.
Someone could argue, though, that you should just use a constrained, strict FSub if you care about DAZ/FTZ. That seems like a valid solution, but it really treats DAZ/FTZ like a rounding mode rather than underflow behavior. It would be a heavy hammer to enforce all of the side-effect concerns when the user really only cares about DAZ/FTZ. In other words, we'd be losing significant performance in an attempt to gain significant performance.
Now that I've said that, I suppose someone could argue that DAZ/FTZ *is* a side-effect. :D
Some data points...
Intel is correct at: '-O3 -fp-model fast=2'
MSVC is correct at: '-O3'
GCC is NOT correct at: '-O3'
GCC is correct at: '-O3 -ftrapping-math -fsignaling-nans'
For posterity's sake, @andrew.w.kaylor just suggested adding an 'nftz' fast-math flag:
http://lists.llvm.org/pipermail/llvm-dev/2019-September/135183.html
That would be great, since it would make it clear when the binary->unary FNeg transform is safe to do.
Sorry for the slow response. I've been busy with other projects.
Are there any reservations about merging the Clang unary FNeg change?
Sorry, but this commit broke OCaml tests: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19014
I reverted it in r374354. Please test before re-landing.
> I reverted it in r374354. Please test before re-landing.
Hmm, how do I run those tests? I did not see that failure with check-all.
That's a pretty straightforward failure. It's just a one-line IR change:
fsub float {{.*}}0{{.*}}, %F1 -> fneg float %F1
@gribozavr I see that you also reverted @RKSimon's commit for the OCaml/core.ml failure:
Author: gribozavr
Date: Thu Oct 10 07:16:58 2019
New Revision: 374357
URL: http://llvm.org/viewvc/llvm-project?rev=374357&view=rev

Log:
Revert "Fix OCaml/core.ml fneg check"

This reverts commit r374346. It attempted to fix OCaml tests, but is does not actually fix them.

Modified:
    llvm/trunk/test/Bindings/OCaml/core.ml
That appears to be the proper fix. Do you see something wrong with it that I'm missing?
Recommitted this patch with the OCaml test fix. I was not able to find quick-start documentation for the OCaml bindings, so the OCaml failure has not been tested locally. That said, the failure mode seems extremely low risk. Will monitor the buildbots for problems...
Hi, it looks like this patch has caused a test failure under asan:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35922/steps/check-llvm%20asan/logs/stdio
Can you please take a look?
Both the ASan build:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35926
and the OCaml build:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19255
have completed successfully with the latest patches. Will continue to monitor the bots for further problems.
I apologize again for the noise...