This is an archive of the discontinued LLVM Phabricator instance.

Differential D104247

[DAGCombine] reassoc flag shouldn't enable contract
ClosedPublic

Authored by jsji on Jun 14 2021, 11:18 AM.

Details

Reviewers

spatel
mcberg2017
qiucf
dfukalov
arsenm
steven.zhang
dmgreen
RKSimon

Group Reviewers

Restricted Project

Commits

rG3996311ee1b0: [DAGCombine] reassoc flag shouldn't enable contract

Summary

According to the IR LangRef, the FMF flags are defined as:

contract
Allow floating-point contraction (e.g. fusing a multiply followed by an
addition into a fused multiply-and-add).

reassoc
Allow reassociation transformations for floating-point instructions.
This may dramatically change results in floating-point.

My understanding is that these two flags shouldn't imply each other,
as we might have an SDNode that can be reassociated with others but is
not contractable.

e.g. we may want the following fmul/fadd/fsub to reassociate freely, but we
don't want an fma to be generated here.

%F = fmul reassoc double %A, %B         ; <double> [#uses=1]
%G = fmul reassoc double %C, %D         ; <double> [#uses=1]
%H = fadd reassoc double %F, %G         ; <double> [#uses=1]
%I = fsub reassoc double %H, %E         ; <double> [#uses=1]

Before https://reviews.llvm.org/D45710, the reassoc flag
did not imply isContractable either.

static bool isContractable(SDNode *N) {
  SDNodeFlags F = N->getFlags();
  return F.hasAllowContract() || F.hasUnsafeAlgebra();
}

The current implementation also only checks the flag on the fadd node,
ignoring the fmul node; this patch updates that as well.
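
As a rough illustration of the behavior change described in this summary, here is a minimal C++ sketch. It is not LLVM code: `ToyFlags`, `mayFuseBefore` and `mayFuseAfter` are hypothetical names used only for this example, and the sketch assumes that all global overrides (`UnsafeFPMath`, `AllowFPOpFusion == Fast`, FMAD support) are off, so only the per-node flags matter.

```cpp
#include <cassert>

// Hypothetical stand-in for the two fast-math flags under discussion.
// This is NOT llvm::SDNodeFlags, just a toy model for illustration.
struct ToyFlags {
  bool contract;
  bool reassoc;
};

// Before the patch (global options assumed off): either flag on the fadd
// was enough to allow fusion, and the fmul's own flags were never consulted.
bool mayFuseBefore(ToyFlags fadd, ToyFlags /*fmul*/) {
  return fadd.contract || fadd.reassoc;
}

// After the patch (global options assumed off): both the fadd and the fmul
// must carry contract; reassoc alone no longer permits forming an FMA.
bool mayFuseAfter(ToyFlags fadd, ToyFlags fmul) {
  return fadd.contract && fmul.contract;
}
```

Under this model, a reassoc-only fmul/fadd pair such as the one in the IR example above fuses before the patch but not after it, while a pair where both nodes carry contract fuses in both cases.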

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jsji created this revision. Jun 14 2021, 11:18 AM

Herald added subscribers: kerbowa, pengfei, dmgreen and 4 others. Jun 14 2021, 11:18 AM

jsji requested review of this revision. Jun 14 2021, 11:18 AM

Herald added a project: Restricted Project. Jun 14 2021, 11:18 AM

Herald added subscribers: llvm-commits, wdng.

jsji added a reviewer: Restricted Project. Jun 14 2021, 11:24 AM

jsji edited the summary of this revision. Jun 14 2021, 11:41 AM

dmgreen added inline comments. Jun 14 2021, 11:43 AM

llvm/test/CodeGen/Thumb2/mve-vldshuffle.ll
3	(On Diff #351939)	I can fix this.

jsji added inline comments. Jun 14 2021, 11:47 AM

llvm/test/CodeGen/Thumb2/mve-vldshuffle.ll
3	(On Diff #351939)	Thanks!

jsji edited reviewers, added: dmgreen; removed: greened. Jun 14 2021, 11:49 AM

dmgreen mentioned this in D104255: [InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass. Jun 14 2021, 12:04 PM

Harbormaster completed remote builds in B109148: Diff 351939. Jun 14 2021, 12:21 PM

Changes in AMDGPU tests LGTM, thanks!

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
12874–12875	Nit: it seems this function is too trivial now and we can remove it?
13104–13105	Nit: move the `CanFuse` definition two hundred lines down to its uses?

RKSimon resigned from this revision. Jun 15 2021, 6:43 AM

Address review comments.

Harbormaster completed remote builds in B109290: Diff 352132. Jun 15 2021, 7:54 AM

Thanks for the patch. It looks reasonable since we've split the effect of contract from reassoc (like D89527), and this one does the missing reverse.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13125	`SDValue` has an overloaded `->` operator.
llvm/test/CodeGen/PowerPC/fma-aggr-FMF.ll
26	extra space

Address review comments.

Harbormaster completed remote builds in B109445: Diff 352329. Jun 16 2021, 3:20 AM

dmgreen mentioned this in rGfda8b4714e05: [InterleaveAccess] Copy fast math flags when adjusting binary operators in…. Jun 17 2021, 1:53 AM

Rebase to pick up Dave's D104255.

LGTM if no further comments from others @spatel

This revision is now accepted and ready to land. Jun 17 2021, 7:59 PM

Harbormaster completed remote builds in B109801: Diff 352838. Jun 18 2021, 4:17 AM

In code that I've looked at (mostly C compiled with -ffast-math), we always have contract when we have reassoc, so I don't see much practical difference.
Can you explain more how we could benefit from this change - in the example in the description, we would have 4 instructions rather than 3 if we use FMA - is that better?
We should duplicate (instead of edit) at least a few of the tests, so we can verify that the existing tests do not form FMA now. That could also be used to show an advantage from not forming FMA.

llvm/test/CodeGen/PowerPC/fmf-propagation.ll
129	This comment was not valid even before this patch.

In D104247#2827089, @spatel wrote:

In code that I've looked at (mostly C compiled with -ffast-math), we always have contract when we have reassoc, so I don't see much practical difference.
Can you explain more how we could benefit from this change - in the example in the description, we would have 4 instructions rather than 3 if we use FMA - is that better?

Yes, you are right, performance with FMA should be better. However, we have quite a few scenarios where users care about precision more than performance and want precise control over when FMA can be generated. So the major motivation of this patch is to ensure that we respect the IR semantics. Users who care about performance can still get FMA through the default global option or by emitting the respective flag in the IR.

We should duplicate (instead of edit) at least a few of the tests, so we can verify that the existing tests do not form FMA now. That could also be used to show an advantage from not forming FMA.

Sure, I can do that. Thanks!

In D104247#2827116, @jsji wrote:

Yes, you are right, performance with FMA should be better. However, we have quite a few scenarios where users care about precision more than performance and want precise control over when FMA can be generated. So the major motivation of this patch is to ensure that we respect the IR semantics. Users who care about performance can still get FMA through the default global option or by emitting the respective flag in the IR.

AFAIU, using FMA doesn't reduce precision (it may improve it a bit, which can also be unexpected in some circumstances). But the reassoc flag affects precision much more. We had some discussion on D99675: contract is acceptable even when we want to disable reassoc for the sake of precision. That is, contract has much less influence on precision than reassoc, so I don't see any need to use reassoc without contract.

In D104247#2828581, @pengfei wrote:

AFAIU, using FMA doesn't reduce precision (it may improve it a bit, which can also be unexpected in some circumstances).

Yes, you are right. FMA on PowerPC improves precision, which is sometimes unexpected to math library writers.

We had some discussion on D99675: contract is acceptable even when we want to disable reassoc for the sake of precision.
That is, contract has much less influence on precision than reassoc, so I don't see any need to use reassoc without contract.

I agree in general.
However, there are some specific cases where an IR producer might generate the reassoc flag for general reassociation, not intending contraction,
and the current implementation is causing confusion and failures.

As reassoc and contract are defined as two independent flags,
I think we should distinguish them in the implementation, regardless of whether it is really meaningful to use reassoc without contract.

Another choice is to update the definition of these two flags, saying that reassoc always implies contract,
but this doesn't look like a clear definition to me.
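
To make the precision points in this exchange concrete, the following C++ sketch contrasts what contraction and reassociation each do to a result. It is only an illustration in standard C++ and is not taken from the patch; the function names and the constants `kA` and `kC` are hypothetical and were picked so that the rounding differences are visible in IEEE double precision.

```cpp
#include <cassert>
#include <cmath>

// Multiply then add with two roundings. The volatile temporary keeps the
// compiler from contracting the expression into an FMA on its own.
double mulAddUnfused(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Single rounding, which is how a contracted fmul + fadd behaves.
double mulAddFused(double a, double b, double c) { return std::fma(a, b, c); }

// The same three-term sum in source order and in reassociated order.
double sumLeftToRight(double a, double b, double c) { return (a + b) + c; }
double sumReassociated(double a, double b, double c) { return a + (b + c); }

// Inputs chosen so that kA * kA needs more bits than a double can hold:
// (1 + 2^-27)^2 = 1 + 2^-26 + 2^-54, and the 2^-54 term is lost when the
// product is rounded on its own.
const double kA = 1.0 + std::ldexp(1.0, -27);
const double kC = -(1.0 + std::ldexp(1.0, -26));
```

With these inputs, `mulAddUnfused(kA, kA, kC)` yields 0 while `mulAddFused(kA, kA, kC)` yields 2^-54, so contraction keeps a low-order term that separate rounding discards. By contrast, `sumLeftToRight(1e16, -1e16, 0.5)` yields 0.5 while `sumReassociated(1e16, -1e16, 0.5)` yields 0, which is the kind of larger change reassociation can introduce.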

Address comments -- updated testcases.

Harbormaster completed remote builds in B110047: Diff 353165. Jun 19 2021, 4:57 PM

spatel added inline comments. Jun 21 2021, 9:04 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13257	If we still allow the global/deprecated UnsafeFPMath option to enable FMA formation, we are keeping a side-channel way to override the FMF. That is, the global setting is effectively a replacement for "reassoc" in (many?) regression tests. I think this is not changing the logic before this patch, so that's fine, but do you plan to fix that too?
llvm/test/CodeGen/PowerPC/fma-assoc.ll
323	Ideally, we would remove the RUN lines that have that setting and rely on IR-level FMF instead. This may require duplicating some tests and updating the FMF. We are trying to rely on the IR and node-level alone if possible - see D99080 for discussion.
llvm/test/CodeGen/PowerPC/fma-precision.ll
2	Similar to the earlier inline comment (see D99080 for details). Can we avoid using the global flag by updating FMF instead?

Address comments.

jsji added inline comments. Jun 21 2021, 9:26 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13257	Yes, I planned to fix this, by allowing it for the global flag as well, not only `UnsafeFPMath`.
13257	Oh, just to double-check: we will still allow global options (UnsafeFPMath, -ffast-math, etc.) to override FMF, right? FMF is great, but sometimes global options are handier for front ends or IR producers. It may take time for IR producers to transition to FMF only.
llvm/test/CodeGen/PowerPC/fma-assoc.ll
323	Can we do this in a follow-up patch? As you mentioned in D99080, I believe quite a few tests need updates.

LGTM.
Since we are opting for a definition of reassoc that is not a superset of the other flags, it might be worth seeing if transforms related to arcp and afn are similarly affected.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13257	The goal of IR- and node-level FMF is to obsolete/deprecate the global options. But this has been the case for many years now, so I have no guess for how much longer it will take. :)
llvm/test/CodeGen/PowerPC/fma-assoc.ll
323	Sure - this is going to require manual inspection, adding tests, etc. I just wanted to point out that fixing the global option might not be worth the effort (if the code is behaving as expected using FMF).

Harbormaster completed remote builds in B110218: Diff 353393. Jun 21 2021, 11:54 AM

In D104247#2831159, @spatel wrote:

LGTM.
Since we are opting for a definition of reassoc that is not a superset of the other flags, it might be worth seeing if transforms related to arcp and afn are similarly affected.

Thanks, sure, will check.

Closed by commit rG3996311ee1b0: [DAGCombine] reassoc flag shouldn't enable contract (authored by jsji). Jun 21 2021, 2:16 PM

This revision was automatically updated to reflect the committed changes.

jsji added a commit: rG3996311ee1b0: [DAGCombine] reassoc flag shouldn't enable contract.

mcberg2017 added inline comments. Jun 21 2021, 3:48 PM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13258	Some of the patterns below have operands/operations that change order; they should be audited and should check the reassoc FMF before being enabled. Perhaps a follow-on change?

jsji added inline comments. Jun 21 2021, 4:34 PM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13258	Good point, sure, will follow up.

spatel added inline comments. Jun 22 2021, 6:43 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

13258

Sorry, I missed this difference. The patch as-is is allowing miscompiles IIUC. For example, this test of the pattern:

; x*y + u*v - z
define float @f(float %x, float %y, float %z, float %u, float %v) {
  %xy = fmul contract float %x, %y
  %uv = fmul contract float %u, %v
  %xyuv = fadd contract float %xy, %uv
  %xyuvz = fsub contract float %xyuv, %z
  ret float %xyuvz
}

We are not allowed to fuse the trailing fsub into fma based on what we decided in D89527, but that's what happens now with:

$ llc -o - fma.ll -mtriple=powerpc64le -ppc-asm-full-reg-names
	xsmsubasp f3, f4, f5 ; u*v - z
	xsmaddasp f3, f1, f2 ; x*y + (u*v - z)
	fmr f1, f3
	blr
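
The difference between the two orderings in this comment can be reproduced numerically. The sketch below is an illustration in plain C++, not compiler output: `contractOnly` and `aggressiveFold` are hypothetical helper names, and the parameter order follows the `@f` signature above. The first function models what contract alone permits (one fmul fused into the fadd, the fsub rounded separately); the second models the `x*y + (u*v - z)` order shown in the PowerPC assembly.

```cpp
#include <cassert>
#include <cmath>

// x*y + u*v - z with only one fmul contracted into the fadd. The trailing
// fsub is still rounded separately, so the source operation order is kept.
float contractOnly(float x, float y, float z, float u, float v) {
  const float uv = u * v;               // separately rounded product
  const float sum = std::fma(x, y, uv); // x*y + uv with a single rounding
  return sum - z;
}

// The order produced by the aggressive fold: x*y + (u*v - z).
float aggressiveFold(float x, float y, float z, float u, float v) {
  return std::fma(x, y, std::fma(u, v, -z));
}
```

For x = y = 1, u = v = 32768 (2^15) and z = 1073741824 (2^30), `contractOnly` returns 0 because 2^30 + 1 rounds back to 2^30 in single precision before z is subtracted, whereas `aggressiveFold` returns 1 because u*v - z cancels exactly first. The two orders therefore differ in the same way a reassociation would, which is why contract alone should not be enough to allow the fold.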

jsji added inline comments. Jun 22 2021, 6:47 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13258	Yes, working on it.

jsji added inline comments. Jun 22 2021, 7:06 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13258	For the specific case `; x*y + u*v - z`, the behavior is not changed by D104247. It has been like this for a long time. But yeah, I am looking into these patterns now.

jsji added inline comments. Jun 22 2021, 9:56 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13258	https://reviews.llvm.org/D104723 posted for this.

jsji mentioned this in D104723: [DAGCombine] Check reassoc flags in aggressive fsub fusion. Jun 22 2021, 10:06 AM

jsji mentioned this in rGc125af82a5ff: [DAGCombine] Check reassoc flags in aggressive fsub fusion. Jun 23 2021, 7:00 AM

matejam mentioned this in D93305: [AMDGPU][GlobalISel] Transform (fadd (fmul x, y), z) -> (fma x, y, z). Sep 29 2021, 7:52 AM

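Revision Contents

Path	Size
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp	20 lines
llvm/test/CodeGen/AArch64/fadd-combines.ll	6 lines
llvm/test/CodeGen/AMDGPU/fmuladd.f16.ll	2 lines
llvm/test/CodeGen/AMDGPU/fmuladd.f32.ll	2 lines
llvm/test/CodeGen/AMDGPU/fmuladd.f64.ll	2 lines
llvm/test/CodeGen/AMDGPU/fmuladd.v2f16.ll	2 lines
llvm/test/CodeGen/PowerPC/combine-fneg.ll	4 lines
llvm/test/CodeGen/PowerPC/fdiv.ll	4 lines
llvm/test/CodeGen/PowerPC/fma-aggr-FMF.ll	8 lines
llvm/test/CodeGen/PowerPC/fma-assoc.ll	45 lines
llvm/test/CodeGen/PowerPC/fma-combine.ll	16 lines
llvm/test/CodeGen/PowerPC/fma-mutate.ll	2 lines
llvm/test/CodeGen/PowerPC/fma-negate.ll	20 lines
llvm/test/CodeGen/PowerPC/fma-precision.ll	68 lines
llvm/test/CodeGen/PowerPC/fmf-propagation.ll	248 lines
llvm/test/CodeGen/PowerPC/machine-combiner.ll	78 lines
llvm/test/CodeGen/PowerPC/recipest.ll	44 lines
llvm/test/CodeGen/PowerPC/register-pressure-reduction.ll	26 lines
llvm/test/CodeGen/PowerPC/repeated-fp-divisors.ll	4 lines
llvm/test/CodeGen/X86/machine-combiner.ll	24 lines
llvm/test/CodeGen/X86/sqrt-fastmath.ll	8 lines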

Diff 353487

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[lines elided] for (const SDValue &Op : BV->op_values()) {
     // For big endian targets, swap the order of the pieces of each element.
     if (DAG.getDataLayout().isBigEndian())
       std::reverse(Ops.end()-NumOutputsPerInput, Ops.end());
   }
   return DAG.getBuildVector(VT, DL, Ops);
 }
-static bool isContractable(SDNode *N) {
-  SDNodeFlags F = N->getFlags();
-  return F.hasAllowContract() || F.hasAllowReassociation();
-}
 /// Try to perform FMA combining on a given FADD node.
[Inline comment, dfukalov, Not Done] Nit: it seems this function is too trivial now and we can remove it?
 SDValue DAGCombiner::visitFADDForFMACombine(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
   EVT VT = N->getValueType(0);
   SDLoc SL(N);
   const TargetOptions &Options = DAG.getTarget().Options;
   // Floating-point multiply-add with intermediate rounding.
   bool HasFMAD = (LegalOperations && TLI.isFMADLegal(DAG, N));
   // Floating-point multiply-add without intermediate rounding.
   bool HasFMA =
       TLI.isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT) &&
       (!LegalOperations || TLI.isOperationLegalOrCustom(ISD::FMA, VT));
   // No valid opcode, do not combine.
   if (!HasFMAD && !HasFMA)
     return SDValue();
-  bool CanFuse = Options.UnsafeFPMath || isContractable(N);
   bool CanReassociate =
       Options.UnsafeFPMath || N->getFlags().hasAllowReassociation();
   bool AllowFusionGlobally = (Options.AllowFPOpFusion == FPOpFusion::Fast ||
-                              CanFuse || HasFMAD);
+                              Options.UnsafeFPMath || HasFMAD);
   // If the addition is not contractable, do not combine.
-  if (!AllowFusionGlobally && !isContractable(N))
+  if (!AllowFusionGlobally && !N->getFlags().hasAllowContract())
     return SDValue();
   if (TLI.generateFMAsInMachineCombiner(VT, OptLevel))
     return SDValue();
   // Always prefer FMAD to FMA for precision.
   unsigned PreferredFusedOpcode = HasFMAD ? ISD::FMAD : ISD::FMA;
   bool Aggressive = TLI.enableAggressiveFMAFusion(VT);
   // Is the node an FMUL and contractable either due to global flags or
   // SDNodeFlags.
   auto isContractableFMUL = [AllowFusionGlobally](SDValue N) {
     if (N.getOpcode() != ISD::FMUL)
       return false;
-    return AllowFusionGlobally || isContractable(N.getNode());
+    return AllowFusionGlobally || N->getFlags().hasAllowContract();
   };
   // If we have two choices trying to fold (fadd (fmul u, v), (fmul x, y)),
   // prefer to fold the multiply with fewer uses.
   if (Aggressive && isContractableFMUL(N0) && isContractableFMUL(N1)) {
     if (N0.getNode()->use_size() > N1.getNode()->use_size())
       std::swap(N0, N1);
   }

[lines elided] SDValue DAGCombiner::visitFSUBForFMACombine(SDNode *N) {
   bool HasFMA =
       TLI.isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT) &&
       (!LegalOperations || TLI.isOperationLegalOrCustom(ISD::FMA, VT));
   // No valid opcode, do not combine.
   if (!HasFMAD && !HasFMA)
     return SDValue();
   const SDNodeFlags Flags = N->getFlags();
-  bool CanFuse = Options.UnsafeFPMath || isContractable(N);
   bool AllowFusionGlobally = (Options.AllowFPOpFusion == FPOpFusion::Fast ||
[Inline comment, dfukalov, Not Done] Nit: move `CanFuse` definition two hundreds lines down to its uses?
-                              CanFuse || HasFMAD);
+                              Options.UnsafeFPMath || HasFMAD);
   // If the subtraction is not contractable, do not combine.
-  if (!AllowFusionGlobally && !isContractable(N))
+  if (!AllowFusionGlobally && !N->getFlags().hasAllowContract())
     return SDValue();
   if (TLI.generateFMAsInMachineCombiner(VT, OptLevel))
     return SDValue();
   // Always prefer FMAD to FMA for precision.
   unsigned PreferredFusedOpcode = HasFMAD ? ISD::FMAD : ISD::FMA;
   bool Aggressive = TLI.enableAggressiveFMAFusion(VT);
   bool NoSignedZero = Options.NoSignedZerosFPMath || Flags.hasNoSignedZeros();
   // Is the node an FMUL and contractable either due to global flags or
   // SDNodeFlags.
   auto isContractableFMUL = [AllowFusionGlobally](SDValue N) {
     if (N.getOpcode() != ISD::FMUL)
       return false;
-    return AllowFusionGlobally || isContractable(N.getNode());
+    return AllowFusionGlobally || N->getFlags().hasAllowContract();
[Inline comment, qiucf, Not Done] `SDValue` overloaded '->' operator. Suggested edit:
    - return AllowFusionGlobally || N.getNode()->getFlags().hasAllowContract();
    + return AllowFusionGlobally || N->getFlags().hasAllowContract();
   };
   // fold (fsub (fmul x, y), z) -> (fma x, y, (fneg z))
   auto tryToFoldXYSubZ = [&](SDValue XY, SDValue Z) {
     if (isContractableFMUL(XY) && (Aggressive || XY->hasOneUse())) {
       return DAG.getNode(PreferredFusedOpcode, SL, VT, XY.getOperand(0),
                          XY.getOperand(1), DAG.getNode(ISD::FNEG, SL, VT, Z));
     }

[lines elided] if (N00.getOpcode() == ISD::FP_EXTEND) {
                          DAG.getNode(ISD::FP_EXTEND, SL, VT, N000.getOperand(1)),
                          N1));
   }
   // More folding opportunities when target permits.
   if (Aggressive) {
+    bool CanFuse = Options.UnsafeFPMath || N->getFlags().hasAllowContract();
[Inline comment, spatel, Not Done] If we still allow the global/deprecated UnsafeFPMath option to enable FMA formation, we are…
[Inline comment, jsji (author), Done] Yes, I planned to fix this -- by allowing it for global flag as well , not only `UnsafeFPMath`.
[Inline comment, jsji (author), Done] Oh, just to double confirm, we will still allow global options to override FMF (UnSafeFPMath…
[Inline comment, spatel, Not Done] The goal of IR- and node-level FMF is to obsolete/deprecate the global options. But this has…
     // fold (fsub (fma x, y, (fmul u, v)), z)
[Inline comment, mcberg2017, Not Done] Some of these patterns below have operands/operations which change order, they should be…
[Inline comment, jsji (author), Done] Good point, sure, will follow up.
[Inline comment, spatel, Not Done] Sorry, I missed this difference. The patch as-is is allowing miscompiles IIUC. For example…
[Inline comment, jsji (author), Done] Yes, working on it.
[Inline comment, jsji (author), Done] The specific case `; x*y + u*v - z` , the behavior is not changed by D104247. It has been like…
[Inline comment, jsji (author), Done] https://reviews.llvm.org/D104723 posted for this.
     //   -> (fma x, y (fma u, v, (fneg z)))
     if (CanFuse && N0.getOpcode() == PreferredFusedOpcode &&
         isContractableFMUL(N0.getOperand(2)) && N0->hasOneUse() &&
         N0.getOperand(2)->hasOneUse()) {
       return DAG.getNode(PreferredFusedOpcode, SL, VT, N0.getOperand(0),
                          N0.getOperand(1),
                          DAG.getNode(PreferredFusedOpcode, SL, VT,
                                      N0.getOperand(2).getOperand(0),
[lines elided]

llvm/test/CodeGen/AArch64/fadd-combines.ll

[lines elided]
 ; requires reassociation to fuse with c*d.

 define float @fadd_fma_fmul_fmf(float %a, float %b, float %c, float %d, float %n0) nounwind {
 ; CHECK-LABEL: fadd_fma_fmul_fmf:
 ; CHECK: // %bb.0:
 ; CHECK-NEXT: fmadd s2, s2, s3, s4
 ; CHECK-NEXT: fmadd s0, s0, s1, s2
 ; CHECK-NEXT: ret
-  %m1 = fmul float %a, %b
-  %m2 = fmul float %c, %d
+  %m1 = fmul contract float %a, %b
+  %m2 = fmul contract float %c, %d
   %a1 = fadd contract float %m1, %m2
-  %a2 = fadd reassoc float %n0, %a1
+  %a2 = fadd contract reassoc float %n0, %a1
   ret float %a2
 }

 ; Not minimum FMF.

 define float @fadd_fma_fmul_2(float %a, float %b, float %c, float %d, float %n0) nounwind {
 ; CHECK-LABEL: fadd_fma_fmul_2:
 ; CHECK: // %bb.0:
[lines elided]

llvm/test/CodeGen/AMDGPU/fmuladd.f16.ll

[lines elided]
 ; GFX10-FLUSH: v_add_f16_e32
 ; GFX10-DENORM: v_fmac_f16_e32 {{v[0-9]+, v[0-9]+, v[0-9]+}}

 define amdgpu_kernel void @fmul_fadd_contract_f16(half addrspace(1)* %out, half addrspace(1)* %in1,
 half addrspace(1)* %in2, half addrspace(1)* %in3) #0 {
   %r0 = load half, half addrspace(1)* %in1
   %r1 = load half, half addrspace(1)* %in2
   %r2 = load half, half addrspace(1)* %in3
-  %mul = fmul half %r0, %r1
+  %mul = fmul contract half %r0, %r1
   %add = fadd contract half %mul, %r2
   store half %add, half addrspace(1)* %out
   ret void
 }

 ; GCN-LABEL: {{^}}fmuladd_2.0_a_b_f16
 ; GCN: {{buffer\|flat\|global}}_load_ushort [[R1:v[0-9]+]],
 ; GCN: {{buffer\|flat\|global}}_load_ushort [[R2:v[0-9]+]],
[lines elided]

llvm/test/CodeGen/AMDGPU/fmuladd.f32.ll

[lines elided]
 ; GCN-DENORM-SLOWFMA-CONTRACT: v_add_f32_e32

 ; GCN-DENORM-FASTFMA: v_fma_f32
 define amdgpu_kernel void @fmul_fadd_contract_f32(float addrspace(1)* %out, float addrspace(1)* %in1,
 float addrspace(1)* %in2, float addrspace(1)* %in3) #0 {
   %r0 = load volatile float, float addrspace(1)* %in1
   %r1 = load volatile float, float addrspace(1)* %in2
   %r2 = load volatile float, float addrspace(1)* %in3
-  %mul = fmul float %r0, %r1
+  %mul = fmul contract float %r0, %r1
   %add = fadd contract float %mul, %r2
   store float %add, float addrspace(1)* %out
   ret void
 }

 ; GCN-LABEL: {{^}}fmuladd_2.0_a_b_f32
 ; GCN: {{buffer\|flat\|global}}_load_dword [[R1:v[0-9]+]],
 ; GCN: {{buffer\|flat\|global}}_load_dword [[R2:v[0-9]+]],
[lines elided]

llvm/test/CodeGen/AMDGPU/fmuladd.f64.ll

	Show All 35 Lines
	; GCN-LABEL: {{^}}fmul_fadd_contract_f64:			; GCN-LABEL: {{^}}fmul_fadd_contract_f64:
	; GCN: v_fma_f64 {{v\[[0-9]+:[0-9]+\], v\[[0-9]+:[0-9]+\], v\[[0-9]+:[0-9]+\], v\[[0-9]+:[0-9]+\]}}			; GCN: v_fma_f64 {{v\[[0-9]+:[0-9]+\], v\[[0-9]+:[0-9]+\], v\[[0-9]+:[0-9]+\], v\[[0-9]+:[0-9]+\]}}

	define amdgpu_kernel void @fmul_fadd_contract_f64(double addrspace(1)* %out, double addrspace(1)* %in1,			define amdgpu_kernel void @fmul_fadd_contract_f64(double addrspace(1)* %out, double addrspace(1)* %in1,
	double addrspace(1)* %in2, double addrspace(1)* %in3) #0 {			double addrspace(1)* %in2, double addrspace(1)* %in3) #0 {
	%r0 = load double, double addrspace(1)* %in1			%r0 = load double, double addrspace(1)* %in1
	%r1 = load double, double addrspace(1)* %in2			%r1 = load double, double addrspace(1)* %in2
	%r2 = load double, double addrspace(1)* %in3			%r2 = load double, double addrspace(1)* %in3
	%tmp = fmul double %r0, %r1			%tmp = fmul contract double %r0, %r1
	%r3 = fadd contract double %tmp, %r2			%r3 = fadd contract double %tmp, %r2
	store double %r3, double addrspace(1)* %out			store double %r3, double addrspace(1)* %out
	ret void			ret void
	}			}

	; GCN-LABEL: {{^}}fadd_a_a_b_f64:			; GCN-LABEL: {{^}}fadd_a_a_b_f64:
	; GCN: {{buffer\|flat}}_load_dwordx2 [[R1:v\[[0-9]+:[0-9]+\]]],			; GCN: {{buffer\|flat}}_load_dwordx2 [[R1:v\[[0-9]+:[0-9]+\]]],
	; GCN: {{buffer\|flat}}_load_dwordx2 [[R2:v\[[0-9]+:[0-9]+\]]],			; GCN: {{buffer\|flat}}_load_dwordx2 [[R2:v\[[0-9]+:[0-9]+\]]],
	[... 144 lines not shown ...]

llvm/test/CodeGen/AMDGPU/fmuladd.v2f16.ll

	[... 47 lines not shown ...]
	; GFX9-FLUSH: v_pk_add_f16 {{v[0-9]+, v[0-9]+, v[0-9]+}}			; GFX9-FLUSH: v_pk_add_f16 {{v[0-9]+, v[0-9]+, v[0-9]+}}

	; GFX9-DENORM: v_pk_fma_f16 {{v[0-9]+, v[0-9]+, v[0-9]+}}			; GFX9-DENORM: v_pk_fma_f16 {{v[0-9]+, v[0-9]+, v[0-9]+}}
	define amdgpu_kernel void @fmul_fadd_contract_v2f16(<2 x half> addrspace(1)* %out, <2 x half> addrspace(1)* %in1,			define amdgpu_kernel void @fmul_fadd_contract_v2f16(<2 x half> addrspace(1)* %out, <2 x half> addrspace(1)* %in1,
	<2 x half> addrspace(1)* %in2, <2 x half> addrspace(1)* %in3) #0 {			<2 x half> addrspace(1)* %in2, <2 x half> addrspace(1)* %in3) #0 {
	%r0 = load <2 x half>, <2 x half> addrspace(1)* %in1			%r0 = load <2 x half>, <2 x half> addrspace(1)* %in1
	%r1 = load <2 x half>, <2 x half> addrspace(1)* %in2			%r1 = load <2 x half>, <2 x half> addrspace(1)* %in2
	%r2 = load <2 x half>, <2 x half> addrspace(1)* %in3			%r2 = load <2 x half>, <2 x half> addrspace(1)* %in3
	%r3 = fmul <2 x half> %r0, %r1			%r3 = fmul contract <2 x half> %r0, %r1
	%r4 = fadd contract <2 x half> %r3, %r2			%r4 = fadd contract <2 x half> %r3, %r2
	store <2 x half> %r4, <2 x half> addrspace(1)* %out			store <2 x half> %r4, <2 x half> addrspace(1)* %out
	ret void			ret void
	}			}


	; GCN-LABEL: {{^}}fmuladd_2.0_a_b_v2f16:			; GCN-LABEL: {{^}}fmuladd_2.0_a_b_v2f16:
	; GCN: {{buffer\|flat\|global}}_load_dword [[R1:v[0-9]+]],			; GCN: {{buffer\|flat\|global}}_load_dword [[R1:v[0-9]+]],
	[... 77 lines not shown ...]

llvm/test/CodeGen/PowerPC/combine-fneg.ll

	[... 17 lines not shown ...]
	; CHECK-NEXT: xvnmsubadp 1, 0, 2			; CHECK-NEXT: xvnmsubadp 1, 0, 2
	; CHECK-NEXT: xvnmaddadp 2, 2, 1			; CHECK-NEXT: xvnmaddadp 2, 2, 1
	; CHECK-NEXT: xvmuldp 34, 34, 2			; CHECK-NEXT: xvmuldp 34, 34, 2
	; CHECK-NEXT: xvmuldp 35, 35, 2			; CHECK-NEXT: xvmuldp 35, 35, 2
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%splat.splatinsert = insertelement <4 x double> undef, double %a0, i32 0			%splat.splatinsert = insertelement <4 x double> undef, double %a0, i32 0
	%splat.splat = shufflevector <4 x double> %splat.splatinsert, <4 x double> undef, <4 x i32> zeroinitializer			%splat.splat = shufflevector <4 x double> %splat.splatinsert, <4 x double> undef, <4 x i32> zeroinitializer
	%div = fdiv reassoc nsz arcp ninf <4 x double> %a1, %splat.splat			%div = fdiv contract reassoc nsz arcp ninf <4 x double> %a1, %splat.splat
	%sub = fsub reassoc nsz <4 x double> <double 0.000000e+00, double 0.000000e+00, double 0.000000e+00, double 0.000000e+00>, %div			%sub = fsub contract reassoc nsz <4 x double> <double 0.000000e+00, double 0.000000e+00, double 0.000000e+00, double 0.000000e+00>, %div
	ret <4 x double> %sub			ret <4 x double> %sub
	}			}

llvm/test/CodeGen/PowerPC/fdiv.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -verify-machineinstrs < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 \| FileCheck %s			; RUN: llc -verify-machineinstrs < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 \| FileCheck %s
	; RUN: llc -verify-machineinstrs < %s -mtriple=powerpc64-ibm-aix-xcoff -mcpu=pwr8 -vec-extabi \| FileCheck %s			; RUN: llc -verify-machineinstrs < %s -mtriple=powerpc64-ibm-aix-xcoff -mcpu=pwr8 -vec-extabi \| FileCheck %s

	define dso_local float @foo_nosw(float %0, float %1) local_unnamed_addr {			define dso_local float @foo_nosw(float %0, float %1) local_unnamed_addr {
	; CHECK-LABEL: foo_nosw:			; CHECK-LABEL: foo_nosw:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xsdivsp 1, 1, 2			; CHECK-NEXT: xsdivsp 1, 1, 2
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%3 = fdiv reassoc arcp nsz float %0, %1			%3 = fdiv contract reassoc arcp nsz float %0, %1
	ret float %3			ret float %3
	}			}

	define dso_local float @foo(float %0, float %1) local_unnamed_addr {			define dso_local float @foo(float %0, float %1) local_unnamed_addr {
	; CHECK-LABEL: foo:			; CHECK-LABEL: foo:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xsresp 3, 2			; CHECK-NEXT: xsresp 3, 2
	; CHECK-NEXT: xsmulsp 0, 1, 3			; CHECK-NEXT: xsmulsp 0, 1, 3
	; CHECK-NEXT: xsnmsubasp 1, 2, 0			; CHECK-NEXT: xsnmsubasp 1, 2, 0
	; CHECK-NEXT: xsmaddasp 0, 3, 1			; CHECK-NEXT: xsmaddasp 0, 3, 1
	; CHECK-NEXT: fmr 1, 0			; CHECK-NEXT: fmr 1, 0
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%3 = fdiv reassoc arcp nsz ninf float %0, %1			%3 = fdiv contract reassoc arcp nsz ninf float %0, %1
	ret float %3			ret float %3
	}			}

llvm/test/CodeGen/PowerPC/fma-aggr-FMF.ll

	[... 16 lines not shown ...]
	}			}

	; There is no contract on the mul with no extra use so we can't fuse that.			; There is no contract on the mul with no extra use so we can't fuse that.
	; Since we are fusing with the mul with an extra use, the fmul needs to stick			; Since we are fusing with the mul with an extra use, the fmul needs to stick
	; around beside the fma.			; around beside the fma.
	define float @no_fma_with_fewer_uses(float %f1, float %f2, float %f3, float %f4) {			define float @no_fma_with_fewer_uses(float %f1, float %f2, float %f3, float %f4) {
	; CHECK-LABEL: no_fma_with_fewer_uses:			; CHECK-LABEL: no_fma_with_fewer_uses:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xsmulsp 0, 1, 2			; CHECK-NEXT: xsmulsp 0, 3, 4
	; CHECK-NEXT: fmr 1, 0			; CHECK-NEXT: xsmulsp 3, 1, 2
				qiucf: extra space
	; CHECK-NEXT: xsmaddasp 1, 3, 4			; CHECK-NEXT: xsmaddasp 0, 1, 2
	; CHECK-NEXT: xsdivsp 1, 0, 1			; CHECK-NEXT: xsdivsp 1, 3, 0
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%mul1 = fmul contract float %f1, %f2			%mul1 = fmul contract float %f1, %f2
	%mul2 = fmul float %f3, %f4			%mul2 = fmul float %f3, %f4
	%add = fadd contract float %mul1, %mul2			%add = fadd contract float %mul1, %mul2
	%second_use_of_mul1 = fdiv float %mul1, %add			%second_use_of_mul1 = fdiv float %mul1, %add
	ret float %second_use_of_mul1			ret float %second_use_of_mul1
	}			}
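
The rule this test exercises can be modeled in a few lines of standalone C++. This is only an illustrative sketch: `Flags` and `canFuseMulAdd` are hypothetical names, not DAGCombiner APIs, and the model ignores global options such as -ffp-contract=fast as well as target profitability checks. It assumes that fusion needs `contract` on both the fadd and the fmul being folded, and that `reassoc` contributes nothing.

```cpp
// Toy model of the per-node fast-math-flag check; not actual LLVM code.
struct Flags {
  bool contract;
  bool reassoc;
};

// Hypothetical helper: an fmul feeding an fadd may be fused into an fma
// only when both nodes carry `contract`. `reassoc` is deliberately ignored.
constexpr bool canFuseMulAdd(Flags mul, Flags add) {
  return mul.contract && add.contract;
}

// Mirrors no_fma_with_fewer_uses: %mul1 has contract, %mul2 does not.
constexpr Flags Mul1{true, false};
constexpr Flags Mul2{false, false};
constexpr Flags Add{true, false};
static_assert(canFuseMulAdd(Mul1, Add), "contract on both nodes allows fusion");
static_assert(!canFuseMulAdd(Mul2, Add), "fmul without contract is not fused");

// reassoc on its own is not enough under this model.
constexpr Flags ReassocOnly{false, true};
static_assert(!canFuseMulAdd(ReassocOnly, ReassocOnly),
              "reassoc alone does not allow fusion");
```

Under this model, only %mul1 is a fusion candidate, which is consistent with the fmul for %mul1 staying alongside the fma because of its extra use in the fdiv.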

llvm/test/CodeGen/PowerPC/fma-assoc.ll

[... 314 lines not shown ...]
; CHECK-VSX-NEXT: blr		; CHECK-VSX-NEXT: blr
double %D, double %E) {		double %D, double %E) {
%F = fmul reassoc double %A, %B ; <double> [#uses=1]		%F = fmul reassoc double %A, %B ; <double> [#uses=1]
%G = fmul reassoc double %C, %D ; <double> [#uses=1]		%G = fmul reassoc double %C, %D ; <double> [#uses=1]
%H = fadd reassoc double %F, %G ; <double> [#uses=1]		%H = fadd reassoc double %F, %G ; <double> [#uses=1]
%I = fadd reassoc double %E, %H ; <double> [#uses=1]		%I = fadd reassoc double %E, %H ; <double> [#uses=1]
ret double %I		ret double %I
}		}

		; FIXME: -ffp-contract=fast does NOT work here?
		spatel: Ideally, we would remove the RUN lines that have that setting and rely on IR-level FMF instead. This may require duplicating some tests and updating the FMF. We are trying to rely on the IR and node-level flags alone if possible - see D99080 for discussion.
		jsji (author): Can we do this in a follow-up patch? As you mentioned in D99080, I believe quite a few tests need updates.
		spatel: Sure - this is going to require manual inspection, adding tests, etc. I just wanted to point out that fixing the global option might not be worth the effort (if the code is behaving as expected using FMF).
define double @test_reassoc_FMSUB_ASSOC1(double %A, double %B, double %C,		define double @test_reassoc_FMSUB_ASSOC1(double %A, double %B, double %C,
; CHECK-LABEL: test_reassoc_FMSUB_ASSOC1:		; CHECK-LABEL: test_reassoc_FMSUB_ASSOC1:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: fmsub 0, 3, 4, 5		; CHECK-NEXT: fmul 0, 3, 4
; CHECK-NEXT: fmadd 1, 1, 2, 0		; CHECK-NEXT: fmadd 0, 1, 2, 0
		; CHECK-NEXT: fsub 1, 0, 5
; CHECK-NEXT: blr		; CHECK-NEXT: blr
;		;
; CHECK-VSX-LABEL: test_reassoc_FMSUB_ASSOC1:		; CHECK-VSX-LABEL: test_reassoc_FMSUB_ASSOC1:
; CHECK-VSX: # %bb.0:		; CHECK-VSX: # %bb.0:
; CHECK-VSX-NEXT: xsmsubmdp 3, 4, 5		; CHECK-VSX-NEXT: xsmuldp 0, 3, 4
; CHECK-VSX-NEXT: xsmaddadp 3, 1, 2		; CHECK-VSX-NEXT: xsmaddadp 0, 1, 2
; CHECK-VSX-NEXT: fmr 1, 3		; CHECK-VSX-NEXT: xssubdp 1, 0, 5
; CHECK-VSX-NEXT: blr		; CHECK-VSX-NEXT: blr
double %D, double %E) {		double %D, double %E) {
%F = fmul reassoc double %A, %B ; <double> [#uses=1]		%F = fmul reassoc double %A, %B ; <double> [#uses=1]
%G = fmul reassoc double %C, %D ; <double> [#uses=1]		%G = fmul reassoc double %C, %D ; <double> [#uses=1]
%H = fadd reassoc double %F, %G ; <double> [#uses=1]		%H = fadd reassoc double %F, %G ; <double> [#uses=1]
%I = fsub reassoc double %H, %E ; <double> [#uses=1]		%I = fsub reassoc double %H, %E ; <double> [#uses=1]
ret double %I		ret double %I
}		}

		define double @test_reassoc_FMSUB_ASSOC11(double %A, double %B, double %C,
		; CHECK-LABEL: test_reassoc_FMSUB_ASSOC11:
		; CHECK: # %bb.0:
		; CHECK-NEXT: fmsub 0, 3, 4, 5
		; CHECK-NEXT: fmadd 1, 1, 2, 0
		; CHECK-NEXT: blr
		;
		; CHECK-VSX-LABEL: test_reassoc_FMSUB_ASSOC11:
		; CHECK-VSX: # %bb.0:
		; CHECK-VSX-NEXT: xsmsubmdp 3, 4, 5
		; CHECK-VSX-NEXT: xsmaddadp 3, 1, 2
		; CHECK-VSX-NEXT: fmr 1, 3
		; CHECK-VSX-NEXT: blr
		double %D, double %E) {
		%F = fmul contract reassoc double %A, %B ; <double> [#uses=1]
		%G = fmul contract reassoc double %C, %D ; <double> [#uses=1]
		%H = fadd contract reassoc double %F, %G ; <double> [#uses=1]
		%I = fsub contract reassoc double %H, %E ; <double> [#uses=1]
		ret double %I
		}
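
The two ASSOC1 variants above differ only in whether `contract` is present, and that matters numerically: an fma rounds once, while a separate multiply and add round twice. A minimal standalone C++ sketch (not part of the patch; the function names `unfused` and `fused` are chosen only for illustration) shows the effect with `std::fma`. The `volatile` temporary is there to keep the host compiler from contracting the unfused expression on its own.

```cpp
#include <cmath>

// Multiply, round to double, then add: two roundings in total.
inline double unfused(double a, double b, double c) {
  volatile double product = a * b; // force the product to be rounded
  return product + c;
}

// Fused multiply-add: a * b + c is computed with a single rounding.
inline double fused(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

Assuming IEEE-754 binary64 with round-to-nearest-even, calling both with a = 1 + 2^-27, b = 1 - 2^-27 and c = -1 gives 0 for `unfused` (the exact product 1 - 2^-54 rounds to 1) and -2^-54 for `fused`.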


define double @test_reassoc_FMSUB_ASSOC2(double %A, double %B, double %C,		define double @test_reassoc_FMSUB_ASSOC2(double %A, double %B, double %C,
; CHECK-LABEL: test_reassoc_FMSUB_ASSOC2:		; CHECK-LABEL: test_reassoc_FMSUB_ASSOC2:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: fmul 0, 3, 4		; CHECK-NEXT: fmul 0, 3, 4
; CHECK-NEXT: fmadd 0, 1, 2, 0		; CHECK-NEXT: fmadd 0, 1, 2, 0
; CHECK-NEXT: fsub 1, 5, 0		; CHECK-NEXT: fsub 1, 5, 0
; CHECK-NEXT: blr		; CHECK-NEXT: blr
;		;
[... 9 lines not shown ...]
; CHECK-VSX-NEXT: blr		; CHECK-VSX-NEXT: blr
%H = fadd reassoc double %F, %G ; <double> [#uses=1]		%H = fadd reassoc double %F, %G ; <double> [#uses=1]
%I = fsub reassoc double %E, %H ; <double> [#uses=1]		%I = fsub reassoc double %E, %H ; <double> [#uses=1]
ret double %I		ret double %I
}		}

define double @test_fast_FMSUB_ASSOC2(double %A, double %B, double %C,		define double @test_fast_FMSUB_ASSOC2(double %A, double %B, double %C,
; CHECK-LABEL: test_fast_FMSUB_ASSOC2:		; CHECK-LABEL: test_fast_FMSUB_ASSOC2:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: fnmsub 0, 3, 4, 5		; CHECK-NEXT: fmul 0, 3, 4
; CHECK-NEXT: fnmsub 1, 1, 2, 0		; CHECK-NEXT: fmadd 0, 1, 2, 0
		; CHECK-NEXT: fsub 1, 5, 0
; CHECK-NEXT: blr		; CHECK-NEXT: blr
;		;
; CHECK-VSX-LABEL: test_fast_FMSUB_ASSOC2:		; CHECK-VSX-LABEL: test_fast_FMSUB_ASSOC2:
; CHECK-VSX: # %bb.0:		; CHECK-VSX: # %bb.0:
; CHECK-VSX-NEXT: xsnmsubmdp 3, 4, 5		; CHECK-VSX-NEXT: xsmuldp 0, 3, 4
; CHECK-VSX-NEXT: xsnmsubadp 3, 1, 2		; CHECK-VSX-NEXT: xsmaddadp 0, 1, 2
; CHECK-VSX-NEXT: fmr 1, 3		; CHECK-VSX-NEXT: xssubdp 1, 5, 0
; CHECK-VSX-NEXT: blr		; CHECK-VSX-NEXT: blr
double %D, double %E) {		double %D, double %E) {
%F = fmul reassoc double %A, %B ; <double> [#uses=1]		%F = fmul reassoc double %A, %B ; <double> [#uses=1]
%G = fmul reassoc double %C, %D ; <double> [#uses=1]		%G = fmul reassoc double %C, %D ; <double> [#uses=1]
%H = fadd reassoc double %F, %G ; <double> [#uses=1]		%H = fadd reassoc double %F, %G ; <double> [#uses=1]
%I = fsub reassoc nsz double %E, %H ; <double> [#uses=1]		%I = fsub reassoc nsz double %E, %H ; <double> [#uses=1]
ret double %I		ret double %I
}		}
[... 225 lines not shown ...]

llvm/test/CodeGen/PowerPC/fma-combine.ll

	[... 178 lines not shown ...]
	; CHECK-NEXT: fmr 4, 3			; CHECK-NEXT: fmr 4, 3
	; CHECK-NEXT: xsmaddasp 3, 2, 0			; CHECK-NEXT: xsmaddasp 3, 2, 0
	; CHECK-NEXT: xsnmaddasp 4, 2, 0			; CHECK-NEXT: xsnmaddasp 4, 2, 0
	; CHECK-NEXT: xsmaddasp 1, 2, 3			; CHECK-NEXT: xsmaddasp 1, 2, 3
	; CHECK-NEXT: xsmaddasp 1, 4, 2			; CHECK-NEXT: xsmaddasp 1, 4, 2
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%tmp = load float, float* undef, align 4			%tmp = load float, float* undef, align 4
	%tmp2 = load float, float* undef, align 4			%tmp2 = load float, float* undef, align 4
	%tmp3 = fmul reassoc float %tmp, 0x3FE372D780000000			%tmp3 = fmul contract reassoc float %tmp, 0x3FE372D780000000
	%tmp4 = fadd reassoc float %tmp3, 1.000000e+00			%tmp4 = fadd contract reassoc float %tmp3, 1.000000e+00
	%tmp5 = fmul reassoc float %tmp2, %tmp4			%tmp5 = fmul contract reassoc float %tmp2, %tmp4
	%tmp6 = load float, float* undef, align 4			%tmp6 = load float, float* undef, align 4
	%tmp7 = load float, float* undef, align 4			%tmp7 = load float, float* undef, align 4
	%tmp8 = fmul reassoc float %tmp7, 0x3FE372D780000000			%tmp8 = fmul contract reassoc float %tmp7, 0x3FE372D780000000
	%tmp9 = fsub reassoc nsz float -1.000000e+00, %tmp8			%tmp9 = fsub contract reassoc nsz float -1.000000e+00, %tmp8
	%tmp10 = fmul reassoc float %tmp9, %tmp6			%tmp10 = fmul contract reassoc float %tmp9, %tmp6
	%tmp11 = fadd reassoc float %tmp5, 5.000000e-01			%tmp11 = fadd contract reassoc float %tmp5, 5.000000e-01
	%tmp12 = fadd reassoc float %tmp11, %tmp10			%tmp12 = fadd contract reassoc float %tmp11, %tmp10
	ret float %tmp12			ret float %tmp12
	}			}

	; This would crash while trying getNegatedExpression().			; This would crash while trying getNegatedExpression().
	define dso_local double @getNegatedExpression_crash(double %x, double %y) {			define dso_local double @getNegatedExpression_crash(double %x, double %y) {
	; CHECK-FAST-LABEL: getNegatedExpression_crash:			; CHECK-FAST-LABEL: getNegatedExpression_crash:
	; CHECK-FAST: # %bb.0:			; CHECK-FAST: # %bb.0:
	; CHECK-FAST-NEXT: addis 3, 2, .LCPI5_1@toc@ha			; CHECK-FAST-NEXT: addis 3, 2, .LCPI5_1@toc@ha
	[... 145 lines not shown ...]

llvm/test/CodeGen/PowerPC/fma-mutate.ll

	[... 23 lines not shown ...]
	; CHECK-NEXT: xsmuldp 1, 1, 0			; CHECK-NEXT: xsmuldp 1, 1, 0
	; CHECK-NEXT: xsmaddadp 3, 1, 0			; CHECK-NEXT: xsmaddadp 3, 1, 0
	; CHECK-NEXT: xsmuldp 0, 1, 4			; CHECK-NEXT: xsmuldp 0, 1, 4
	; CHECK-NEXT: xsmuldp 1, 0, 3			; CHECK-NEXT: xsmuldp 1, 0, 3
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	; CHECK-NEXT: .LBB0_2:			; CHECK-NEXT: .LBB0_2:
	; CHECK-NEXT: xssqrtdp 1, 1			; CHECK-NEXT: xssqrtdp 1, 1
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%r = call reassoc afn ninf double @llvm.sqrt.f64(double %a)			%r = call contract reassoc afn ninf double @llvm.sqrt.f64(double %a)
	ret double %r			ret double %r
	}			}

	define double @foo3_safe(double %a) nounwind {			define double @foo3_safe(double %a) nounwind {
	; CHECK-LABEL: foo3_safe:			; CHECK-LABEL: foo3_safe:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xssqrtdp 1, 1			; CHECK-NEXT: xssqrtdp 1, 1
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%r = call double @llvm.sqrt.f64(double %a)			%r = call double @llvm.sqrt.f64(double %a)
	ret double %r			ret double %r
	}			}

llvm/test/CodeGen/PowerPC/fma-negate.ll

	[... 173 lines not shown ...]
	; VSX-NEXT: xsnmsubadp 1, 2, 3			; VSX-NEXT: xsnmsubadp 1, 2, 3
	; VSX-NEXT: blr			; VSX-NEXT: blr
	;			;
	; NO-VSX-LABEL: test_fast_mul_sub_f64:			; NO-VSX-LABEL: test_fast_mul_sub_f64:
	; NO-VSX: # %bb.0: # %entry			; NO-VSX: # %bb.0: # %entry
	; NO-VSX-NEXT: fnmsub 1, 2, 3, 1			; NO-VSX-NEXT: fnmsub 1, 2, 3, 1
	; NO-VSX-NEXT: blr			; NO-VSX-NEXT: blr
	entry:			entry:
	%0 = fmul reassoc nsz double %b, %c			%0 = fmul contract reassoc nsz double %b, %c
	%1 = fsub reassoc nsz double %a, %0			%1 = fsub contract reassoc nsz double %a, %0
	ret double %1			ret double %1
	}			}

	define double @test_fast_2mul_sub_f64(double %a, double %b, double %c,			define double @test_fast_2mul_sub_f64(double %a, double %b, double %c,
	; VSX-LABEL: test_fast_2mul_sub_f64:			; VSX-LABEL: test_fast_2mul_sub_f64:
	; VSX: # %bb.0: # %entry			; VSX: # %bb.0: # %entry
	; VSX-NEXT: xsmuldp 0, 3, 4			; VSX-NEXT: xsmuldp 0, 3, 4
	; VSX-NEXT: xsmsubadp 0, 1, 2			; VSX-NEXT: xsmsubadp 0, 1, 2
	; VSX-NEXT: fmr 1, 0			; VSX-NEXT: fmr 1, 0
	; VSX-NEXT: blr			; VSX-NEXT: blr
	;			;
	; NO-VSX-LABEL: test_fast_2mul_sub_f64:			; NO-VSX-LABEL: test_fast_2mul_sub_f64:
	; NO-VSX: # %bb.0: # %entry			; NO-VSX: # %bb.0: # %entry
	; NO-VSX-NEXT: fmul 0, 3, 4			; NO-VSX-NEXT: fmul 0, 3, 4
	; NO-VSX-NEXT: fmsub 1, 1, 2, 0			; NO-VSX-NEXT: fmsub 1, 1, 2, 0
	; NO-VSX-NEXT: blr			; NO-VSX-NEXT: blr
	double %d) {			double %d) {
	entry:			entry:
	%0 = fmul reassoc double %a, %b			%0 = fmul contract reassoc double %a, %b
	%1 = fmul reassoc double %c, %d			%1 = fmul contract reassoc double %c, %d
	%2 = fsub reassoc double %0, %1			%2 = fsub contract reassoc double %0, %1
	ret double %2			ret double %2
	}			}

	define double @test_fast_neg_fma_f64(double %a, double %b, double %c) {			define double @test_fast_neg_fma_f64(double %a, double %b, double %c) {
	; VSX-LABEL: test_fast_neg_fma_f64:			; VSX-LABEL: test_fast_neg_fma_f64:
	; VSX: # %bb.0: # %entry			; VSX: # %bb.0: # %entry
	; VSX-NEXT: xsnmsubadp 3, 1, 2			; VSX-NEXT: xsnmsubadp 3, 1, 2
	; VSX-NEXT: fmr 1, 3			; VSX-NEXT: fmr 1, 3
	[... 15 lines not shown ...]
	; VSX-NEXT: xsnmsubasp 1, 2, 3			; VSX-NEXT: xsnmsubasp 1, 2, 3
	; VSX-NEXT: blr			; VSX-NEXT: blr
	;			;
	; NO-VSX-LABEL: test_fast_mul_sub_f32:			; NO-VSX-LABEL: test_fast_mul_sub_f32:
	; NO-VSX: # %bb.0: # %entry			; NO-VSX: # %bb.0: # %entry
	; NO-VSX-NEXT: fnmsubs 1, 2, 3, 1			; NO-VSX-NEXT: fnmsubs 1, 2, 3, 1
	; NO-VSX-NEXT: blr			; NO-VSX-NEXT: blr
	entry:			entry:
	%0 = fmul reassoc float %b, %c			%0 = fmul contract reassoc float %b, %c
	%1 = fsub reassoc nsz float %a, %0			%1 = fsub contract reassoc nsz float %a, %0
	ret float %1			ret float %1
	}			}

	define float @test_fast_2mul_sub_f32(float %a, float %b, float %c, float %d) {			define float @test_fast_2mul_sub_f32(float %a, float %b, float %c, float %d) {
	; VSX-LABEL: test_fast_2mul_sub_f32:			; VSX-LABEL: test_fast_2mul_sub_f32:
	; VSX: # %bb.0: # %entry			; VSX: # %bb.0: # %entry
	; VSX-NEXT: xsmulsp 0, 3, 4			; VSX-NEXT: xsmulsp 0, 3, 4
	; VSX-NEXT: xsmsubasp 0, 1, 2			; VSX-NEXT: xsmsubasp 0, 1, 2
	; VSX-NEXT: fmr 1, 0			; VSX-NEXT: fmr 1, 0
	; VSX-NEXT: blr			; VSX-NEXT: blr
	;			;
	; NO-VSX-LABEL: test_fast_2mul_sub_f32:			; NO-VSX-LABEL: test_fast_2mul_sub_f32:
	; NO-VSX: # %bb.0: # %entry			; NO-VSX: # %bb.0: # %entry
	; NO-VSX-NEXT: fmuls 0, 3, 4			; NO-VSX-NEXT: fmuls 0, 3, 4
	; NO-VSX-NEXT: fmsubs 1, 1, 2, 0			; NO-VSX-NEXT: fmsubs 1, 1, 2, 0
	; NO-VSX-NEXT: blr			; NO-VSX-NEXT: blr
	entry:			entry:
	%0 = fmul reassoc float %a, %b			%0 = fmul contract reassoc float %a, %b
	%1 = fmul reassoc float %c, %d			%1 = fmul contract reassoc float %c, %d
	%2 = fsub reassoc nsz float %0, %1			%2 = fsub contract reassoc nsz float %0, %1
	ret float %2			ret float %2
	}			}

	define float @test_fast_neg_fma_f32(float %a, float %b, float %c) {			define float @test_fast_neg_fma_f32(float %a, float %b, float %c) {
	; VSX-LABEL: test_fast_neg_fma_f32:			; VSX-LABEL: test_fast_neg_fma_f32:
	; VSX: # %bb.0: # %entry			; VSX: # %bb.0: # %entry
	; VSX-NEXT: xsnmsubasp 3, 1, 2			; VSX-NEXT: xsnmsubasp 3, 1, 2
	; VSX-NEXT: fmr 1, 3			; VSX-NEXT: fmr 1, 3
	[... 58 lines not shown ...]

llvm/test/CodeGen/PowerPC/fma-precision.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -verify-machineinstrs -mcpu=pwr9 -mtriple=powerpc64le-linux-gnu \| FileCheck %s			; RUN: llc < %s -verify-machineinstrs -mcpu=pwr9 -mtriple=powerpc64le-linux-gnu \| FileCheck %s
				spatel: Similar to the earlier inline comment (see D99080 for details). Can we avoid using the global flag by updating FMF instead?

	; Verify that the fold of ab-cd respect the uses of a*b			; Verify that the fold of ab-cd respect the uses of a*b
	define double @fsub1(double %a, double %b, double %c, double %d) {			define double @fsub1(double %a, double %b, double %c, double %d) {
	; CHECK-LABEL: fsub1:			; CHECK-LABEL: fsub1:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: xsmuldp 0, 2, 1			; CHECK-NEXT: xsmuldp 0, 2, 1
	; CHECK-NEXT: fmr 1, 0			; CHECK-NEXT: fmr 1, 0
	; CHECK-NEXT: xsnmsubadp 1, 4, 3			; CHECK-NEXT: xsnmsubadp 1, 4, 3
	; CHECK-NEXT: xsmuldp 1, 0, 1			; CHECK-NEXT: xsmuldp 1, 0, 1
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%mul = fmul reassoc double %b, %a			%mul = fmul contract reassoc double %b, %a
	%mul1 = fmul reassoc double %d, %c			%mul1 = fmul contract reassoc double %d, %c
	%sub = fsub reassoc nsz double %mul, %mul1			%sub = fsub contract reassoc nsz double %mul, %mul1
	%mul3 = fmul reassoc double %mul, %sub			%mul3 = fmul contract reassoc double %mul, %sub
	ret double %mul3			ret double %mul3
	}			}

	; Verify that the fold of ab-cd respect the uses of c*d			; Verify that the fold of ab-cd respect the uses of c*d
	define double @fsub2(double %a, double %b, double %c, double %d) {			define double @fsub2(double %a, double %b, double %c, double %d) {
	; CHECK-LABEL: fsub2:			; CHECK-LABEL: fsub2:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: xsmuldp 0, 4, 3			; CHECK-NEXT: xsmuldp 0, 4, 3
	; CHECK-NEXT: fmr 3, 0			; CHECK-NEXT: fmr 3, 0
	; CHECK-NEXT: xsmsubadp 3, 2, 1			; CHECK-NEXT: xsmsubadp 3, 2, 1
	; CHECK-NEXT: xsmuldp 1, 0, 3			; CHECK-NEXT: xsmuldp 1, 0, 3
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%mul = fmul reassoc double %b, %a			%mul = fmul contract reassoc double %b, %a
	%mul1 = fmul reassoc double %d, %c			%mul1 = fmul contract reassoc double %d, %c
	%sub = fsub reassoc double %mul, %mul1			%sub = fsub contract reassoc double %mul, %mul1
	%mul3 = fmul reassoc double %mul1, %sub			%mul3 = fmul contract reassoc double %mul1, %sub
	ret double %mul3			ret double %mul3
	}			}

	; Verify that the fold of ab-cd if there is no uses of ab and cd			; Verify that the fold of ab-cd if there is no uses of ab and cd
	define double @fsub3(double %a, double %b, double %c, double %d) {			define double @fsub3(double %a, double %b, double %c, double %d) {
	; CHECK-LABEL: fsub3:			; CHECK-LABEL: fsub3:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: xsmuldp 0, 4, 3			; CHECK-NEXT: xsmuldp 0, 4, 3
	; CHECK-NEXT: xsmsubadp 0, 2, 1			; CHECK-NEXT: xsmsubadp 0, 2, 1
	; CHECK-NEXT: fmr 1, 0			; CHECK-NEXT: fmr 1, 0
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%mul = fmul reassoc double %b, %a			%mul = fmul contract reassoc double %b, %a
	%mul1 = fmul reassoc double %d, %c			%mul1 = fmul contract reassoc double %d, %c
	%sub = fsub reassoc double %mul, %mul1			%sub = fsub contract reassoc double %mul, %mul1
	ret double %sub			ret double %sub
	}			}

	; Verify that the fold of ab+cd respect the uses of a*b			; Verify that the fold of ab+cd respect the uses of a*b
	define double @fadd1(double %a, double %b, double %c, double %d) {			define double @fadd1(double %a, double %b, double %c, double %d) {
	; CHECK-LABEL: fadd1:			; CHECK-LABEL: fadd1:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: xsmuldp 0, 2, 1			; CHECK-NEXT: xsmuldp 0, 2, 1
	; CHECK-NEXT: fmr 1, 0			; CHECK-NEXT: fmr 1, 0
	; CHECK-NEXT: xsmaddadp 1, 4, 3			; CHECK-NEXT: xsmaddadp 1, 4, 3
	; CHECK-NEXT: xsmuldp 1, 0, 1			; CHECK-NEXT: xsmuldp 1, 0, 1
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%mul = fmul reassoc double %b, %a			%mul = fmul contract reassoc double %b, %a
	%mul1 = fmul reassoc double %d, %c			%mul1 = fmul contract reassoc double %d, %c
	%add = fadd reassoc double %mul1, %mul			%add = fadd contract reassoc double %mul1, %mul
	%mul3 = fmul reassoc double %mul, %add			%mul3 = fmul contract reassoc double %mul, %add
	ret double %mul3			ret double %mul3
	}			}

	; Verify that the fold of ab+cd respect the uses of c*d			; Verify that the fold of ab+cd respect the uses of c*d
	define double @fadd2(double %a, double %b, double %c, double %d) {			define double @fadd2(double %a, double %b, double %c, double %d) {
	; CHECK-LABEL: fadd2:			; CHECK-LABEL: fadd2:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: xsmuldp 0, 4, 3			; CHECK-NEXT: xsmuldp 0, 4, 3
	; CHECK-NEXT: fmr 3, 0			; CHECK-NEXT: fmr 3, 0
	; CHECK-NEXT: xsmaddadp 3, 2, 1			; CHECK-NEXT: xsmaddadp 3, 2, 1
	; CHECK-NEXT: xsmuldp 1, 0, 3			; CHECK-NEXT: xsmuldp 1, 0, 3
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%mul = fmul reassoc double %b, %a			%mul = fmul contract reassoc double %b, %a
	%mul1 = fmul reassoc double %d, %c			%mul1 = fmul contract reassoc double %d, %c
	%add = fadd reassoc double %mul1, %mul			%add = fadd contract reassoc double %mul1, %mul
	%mul3 = fmul reassoc double %mul1, %add			%mul3 = fmul contract reassoc double %mul1, %add
	ret double %mul3			ret double %mul3
	}			}

	; Verify that the fold of ab+cd if there is no uses of ab and cd			; Verify that the fold of ab+cd if there is no uses of ab and cd
	define double @fadd3(double %a, double %b, double %c, double %d) {			define double @fadd3(double %a, double %b, double %c, double %d) {
	; CHECK-LABEL: fadd3:			; CHECK-LABEL: fadd3:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: xsmuldp 1, 2, 1			; CHECK-NEXT: xsmuldp 1, 2, 1
	; CHECK-NEXT: xsmaddadp 1, 4, 3			; CHECK-NEXT: xsmaddadp 1, 4, 3
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	entry:			entry:
	%mul = fmul reassoc double %b, %a			%mul = fmul contract reassoc double %b, %a
	%mul1 = fmul reassoc double %d, %c			%mul1 = fmul contract reassoc double %d, %c
	%add = fadd reassoc double %mul1, %mul			%add = fadd contract reassoc double %mul1, %mul
	ret double %add			ret double %add
	}			}

	define double @fma_multi_uses1(double %a, double %b, double %c, double %d, double* %p1, double* %p2, double* %p3) {			define double @fma_multi_uses1(double %a, double %b, double %c, double %d, double* %p1, double* %p2, double* %p3) {
	; CHECK-LABEL: fma_multi_uses1:			; CHECK-LABEL: fma_multi_uses1:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xsmuldp 1, 1, 2			; CHECK-NEXT: xsmuldp 1, 1, 2
	; CHECK-NEXT: xsmuldp 0, 3, 4			; CHECK-NEXT: xsmuldp 0, 3, 4
	; CHECK-NEXT: stfd 1, 0(7)			; CHECK-NEXT: stfd 1, 0(7)
	; CHECK-NEXT: stfd 1, 0(8)			; CHECK-NEXT: stfd 1, 0(8)
	; CHECK-NEXT: xsnmsubadp 1, 3, 4			; CHECK-NEXT: xsnmsubadp 1, 3, 4
	; CHECK-NEXT: stfd 0, 0(9)			; CHECK-NEXT: stfd 0, 0(9)
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%ab = fmul reassoc double %a, %b			%ab = fmul contract reassoc double %a, %b
	%cd = fmul reassoc double %c, %d			%cd = fmul contract reassoc double %c, %d
	store double %ab, double* %p1 ; extra use of %ab			store double %ab, double* %p1 ; extra use of %ab
	store double %ab, double* %p2 ; another extra use of %ab			store double %ab, double* %p2 ; another extra use of %ab
	store double %cd, double* %p3 ; extra use of %cd			store double %cd, double* %p3 ; extra use of %cd
	%r = fsub reassoc nsz double %ab, %cd			%r = fsub contract reassoc nsz double %ab, %cd
	ret double %r			ret double %r
	}			}

	define double @fma_multi_uses2(double %a, double %b, double %c, double %d, double* %p1, double* %p2, double* %p3) {			define double @fma_multi_uses2(double %a, double %b, double %c, double %d, double* %p1, double* %p2, double* %p3) {
	; CHECK-LABEL: fma_multi_uses2:			; CHECK-LABEL: fma_multi_uses2:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xsmuldp 5, 1, 2			; CHECK-NEXT: xsmuldp 5, 1, 2
	; CHECK-NEXT: xsmuldp 0, 3, 4			; CHECK-NEXT: xsmuldp 0, 3, 4
	; CHECK-NEXT: stfd 5, 0(7)			; CHECK-NEXT: stfd 5, 0(7)
	; CHECK-NEXT: stfd 0, 0(8)			; CHECK-NEXT: stfd 0, 0(8)
	; CHECK-NEXT: stfd 0, 0(9)			; CHECK-NEXT: stfd 0, 0(9)
	; CHECK-NEXT: xsmsubadp 0, 1, 2			; CHECK-NEXT: xsmsubadp 0, 1, 2
	; CHECK-NEXT: fmr 1, 0			; CHECK-NEXT: fmr 1, 0
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%ab = fmul reassoc double %a, %b			%ab = fmul contract reassoc double %a, %b
	%cd = fmul reassoc double %c, %d			%cd = fmul contract reassoc double %c, %d
	store double %ab, double* %p1 ; extra use of %ab			store double %ab, double* %p1 ; extra use of %ab
	store double %cd, double* %p2 ; extra use of %cd			store double %cd, double* %p2 ; extra use of %cd
	store double %cd, double* %p3 ; another extra use of %cd			store double %cd, double* %p3 ; another extra use of %cd
	%r = fsub reassoc double %ab, %cd			%r = fsub contract reassoc double %ab, %cd
	ret double %r			ret double %r
	}			}

	define double @fma_multi_uses3(double %a, double %b, double %c, double %d, double %f, double %g, double* %p1, double* %p2, double* %p3) {			define double @fma_multi_uses3(double %a, double %b, double %c, double %d, double %f, double %g, double* %p1, double* %p2, double* %p3) {
	; CHECK-LABEL: fma_multi_uses3:			; CHECK-LABEL: fma_multi_uses3:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xsmuldp 0, 1, 2			; CHECK-NEXT: xsmuldp 0, 1, 2
	; CHECK-NEXT: xsmuldp 1, 5, 6			; CHECK-NEXT: xsmuldp 1, 5, 6
	; CHECK-NEXT: ld 3, 96(1)			; CHECK-NEXT: ld 3, 96(1)
	; CHECK-NEXT: stfd 0, 0(9)			; CHECK-NEXT: stfd 0, 0(9)
	; CHECK-NEXT: stfd 0, 0(10)			; CHECK-NEXT: stfd 0, 0(10)
	; CHECK-NEXT: stfd 1, 0(3)			; CHECK-NEXT: stfd 1, 0(3)
	; CHECK-NEXT: xsnmsubadp 1, 3, 4			; CHECK-NEXT: xsnmsubadp 1, 3, 4
	; CHECK-NEXT: xsnmsubadp 0, 3, 4			; CHECK-NEXT: xsnmsubadp 0, 3, 4
	; CHECK-NEXT: xsadddp 1, 0, 1			; CHECK-NEXT: xsadddp 1, 0, 1
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%ab = fmul reassoc double %a, %b			%ab = fmul contract reassoc double %a, %b
	%cd = fmul reassoc double %c, %d			%cd = fmul contract reassoc double %c, %d
	%fg = fmul reassoc double %f, %g			%fg = fmul contract reassoc double %f, %g
	store double %ab, double* %p1 ; extra use of %ab			store double %ab, double* %p1 ; extra use of %ab
	store double %ab, double* %p2 ; another extra use of %ab			store double %ab, double* %p2 ; another extra use of %ab
	store double %fg, double* %p3 ; extra use of %fg			store double %fg, double* %p3 ; extra use of %fg
	%q = fsub reassoc nsz double %fg, %cd ; The uses of %cd reduce to 1 after %r is folded. 2 uses of %fg, fold %cd, remove def of %cd			%q = fsub contract reassoc nsz double %fg, %cd ; The uses of %cd reduce to 1 after %r is folded. 2 uses of %fg, fold %cd, remove def of %cd
	%r = fsub reassoc nsz double %ab, %cd ; Fold %r before %q. 3 uses of %ab, 2 uses of %cd, fold %cd			%r = fsub contract reassoc nsz double %ab, %cd ; Fold %r before %q. 3 uses of %ab, 2 uses of %cd, fold %cd
	%add = fadd reassoc double %r, %q			%add = fadd contract reassoc double %r, %q
	ret double %add			ret double %add
	}			}
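
Reviewer's aid, not part of the patch: the updated expectations in fmf-propagation.ll below can be read as a simple rule. The following is a minimal C++ sketch of that rule as inferred from the test results only; it is not the DAGCombiner code. The names `Flags`, `mayFuse`, and `globalFusion` are hypothetical, and `globalFusion` is an assumed stand-in for options such as -enable-unsafe-fp-math used by the GLOBAL run lines.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-node fast-math flags relevant here.
struct Flags {
  bool contract = false;
  bool reassoc = false;
};

// 'fast' sets every fast-math flag, so it includes both contract and reassoc.
inline Flags fastFlags() { return Flags{true, true}; }

// Hypothetical model of whether fadd(fmul(x, y), z) may become fma(x, y, z).
// Assumption drawn from the tests: without a global option, both the fmul and
// the fadd must carry 'contract'; 'reassoc' alone never enables fusion.
inline bool mayFuse(const Flags &mul, const Flags &add, bool globalFusion) {
  if (globalFusion)
    return true;
  return mul.contract && add.contract;
}
```

Under this model, `mayFuse(Flags{}, fastFlags(), false)` is false, matching fmul_fadd_fast1, while `mayFuse(fastFlags(), fastFlags(), false)` is true, matching fmul_fadd_fast2.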

llvm/test/CodeGen/PowerPC/fmf-propagation.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; REQUIRES: asserts			; REQUIRES: asserts
	; RUN: llc < %s -mtriple=powerpc64le -debug-only=isel -o /dev/null 2>&1 | FileCheck %s --check-prefix=FMFDEBUG			; RUN: llc < %s -mtriple=powerpc64le -debug-only=isel -o /dev/null 2>&1 | FileCheck %s --check-prefix=FMFDEBUG
	; RUN: llc < %s -mtriple=powerpc64le | FileCheck %s --check-prefix=FMF			; RUN: llc < %s -mtriple=powerpc64le | FileCheck %s --check-prefix=FMF
	; RUN: llc < %s -mtriple=powerpc64le -debug-only=isel -o /dev/null 2>&1 -enable-unsafe-fp-math -enable-no-nans-fp-math | FileCheck %s --check-prefix=GLOBALDEBUG			; RUN: llc < %s -mtriple=powerpc64le -debug-only=isel -o /dev/null 2>&1 -enable-unsafe-fp-math -enable-no-nans-fp-math | FileCheck %s --check-prefix=GLOBALDEBUG
	; RUN: llc < %s -mtriple=powerpc64le -enable-unsafe-fp-math -enable-no-nans-fp-math -enable-no-signed-zeros-fp-math | FileCheck %s --check-prefix=GLOBAL			; RUN: llc < %s -mtriple=powerpc64le -enable-unsafe-fp-math -enable-no-nans-fp-math -enable-no-signed-zeros-fp-math | FileCheck %s --check-prefix=GLOBAL

	; Test FP transforms using instruction/node-level fast-math-flags.			; Test FP transforms using instruction/node-level fast-math-flags.
	; We're also checking debug output to verify that FMF is propagated to the newly created nodes.			; We're also checking debug output to verify that FMF is propagated to the newly created nodes.
	; The run with the global unsafe param tests the pre-FMF behavior using regular instructions/nodes.			; The run with the global unsafe param tests the pre-FMF behavior using regular instructions/nodes.

	declare float @llvm.fma.f32(float, float, float)			declare float @llvm.fma.f32(float, float, float)
	declare float @llvm.sqrt.f32(float)			declare float @llvm.sqrt.f32(float)

	; X * Y + Z --> fma(X, Y, Z)			; X * Y + Z --> fma(X, Y, Z)

				; contract bits in fmul is checked.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_contract1:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_contract1:'
	; FMFDEBUG: fma contract {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}			; FMFDEBUG-NOT: fma contract {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_contract1:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_contract1:'

	define float @fmul_fadd_contract1(float %x, float %y, float %z) {			define float @fmul_fadd_contract1(float %x, float %y, float %z) {
	; FMF-LABEL: fmul_fadd_contract1:			; FMF-LABEL: fmul_fadd_contract1:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xsmaddasp 3, 1, 2			; FMF-NEXT: xsmulsp 0, 1, 2
	; FMF-NEXT: fmr 1, 3			; FMF-NEXT: xsaddsp 1, 0, 3
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fadd_contract1:			; GLOBAL-LABEL: fmul_fadd_contract1:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsmaddasp 3, 1, 2			; GLOBAL-NEXT: xsmaddasp 3, 1, 2
	; GLOBAL-NEXT: fmr 1, 3			; GLOBAL-NEXT: fmr 1, 3
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul float %x, %y			%mul = fmul float %x, %y
	%add = fadd contract float %mul, %z			%add = fadd contract float %mul, %z
	ret float %add			ret float %add
	}			}

	; This shouldn't change anything - the intermediate fmul result is now also flagged.			; contract bits in fadd is also checked.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_contract2:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_contract2:'
	; FMFDEBUG: fma contract {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}			; FMFDEBUG-NOT: fma contract {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_contract2:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_contract2:'

	define float @fmul_fadd_contract2(float %x, float %y, float %z) {			define float @fmul_fadd_contract2(float %x, float %y, float %z) {
	; FMF-LABEL: fmul_fadd_contract2:			; FMF-LABEL: fmul_fadd_contract2:
	; FMF: # %bb.0:			; FMF: # %bb.0:
				; FMF-NEXT: xsmulsp 0, 1, 2
				; FMF-NEXT: xsaddsp 1, 0, 3
				; FMF-NEXT: blr
				;
				; GLOBAL-LABEL: fmul_fadd_contract2:
				; GLOBAL: # %bb.0:
				; GLOBAL-NEXT: xsmaddasp 3, 1, 2
				; GLOBAL-NEXT: fmr 1, 3
				; GLOBAL-NEXT: blr
				%mul = fmul contract float %x, %y
				%add = fadd float %mul, %z
				ret float %add
				}

				; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_contract3:'
				; FMFDEBUG: fma contract {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}
				; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_contract3:'

				define float @fmul_fadd_contract3(float %x, float %y, float %z) {
				; FMF-LABEL: fmul_fadd_contract3:
				; FMF: # %bb.0:
	; FMF-NEXT: xsmaddasp 3, 1, 2			; FMF-NEXT: xsmaddasp 3, 1, 2
	; FMF-NEXT: fmr 1, 3			; FMF-NEXT: fmr 1, 3
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fadd_contract2:			; GLOBAL-LABEL: fmul_fadd_contract3:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsmaddasp 3, 1, 2			; GLOBAL-NEXT: xsmaddasp 3, 1, 2
	; GLOBAL-NEXT: fmr 1, 3			; GLOBAL-NEXT: fmr 1, 3
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul contract float %x, %y			%mul = fmul contract float %x, %y
	%add = fadd contract float %mul, %z			%add = fadd contract float %mul, %z
	ret float %add			ret float %add
	}			}

	; Reassociation implies that FMA contraction is allowed.			; Reassociation does NOT imply that FMA contraction is allowed.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_reassoc1:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_reassoc1:'
	; FMFDEBUG: fma reassoc {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}			; FMFDEBUG-NOT: fma reassoc {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_reassoc1:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_reassoc1:'

	define float @fmul_fadd_reassoc1(float %x, float %y, float %z) {			define float @fmul_fadd_reassoc1(float %x, float %y, float %z) {
	; FMF-LABEL: fmul_fadd_reassoc1:			; FMF-LABEL: fmul_fadd_reassoc1:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xsmaddasp 3, 1, 2			; FMF-NEXT: xsmulsp 0, 1, 2
	; FMF-NEXT: fmr 1, 3			; FMF-NEXT: xsaddsp 1, 0, 3
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fadd_reassoc1:			; GLOBAL-LABEL: fmul_fadd_reassoc1:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsmaddasp 3, 1, 2			; GLOBAL-NEXT: xsmaddasp 3, 1, 2
	; GLOBAL-NEXT: fmr 1, 3			; GLOBAL-NEXT: fmr 1, 3
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul float %x, %y			%mul = fmul float %x, %y
	%add = fadd reassoc float %mul, %z			%add = fadd reassoc float %mul, %z
	ret float %add			ret float %add
	}			}

	; This shouldn't change anything - the intermediate fmul result is now also flagged.			; This shouldn't change anything - the intermediate fmul result is now also flagged.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_reassoc2:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_reassoc2:'
	; FMFDEBUG: fma reassoc {{t[0-9]+}}, {{t[0-9]+}}			; FMFDEBUG-NOT: fma reassoc {{t[0-9]+}}, {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_reassoc2:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_reassoc2:'

	define float @fmul_fadd_reassoc2(float %x, float %y, float %z) {			define float @fmul_fadd_reassoc2(float %x, float %y, float %z) {
	; FMF-LABEL: fmul_fadd_reassoc2:			; FMF-LABEL: fmul_fadd_reassoc2:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xsmaddasp 3, 1, 2			; FMF-NEXT: xsmulsp 0, 1, 2
	; FMF-NEXT: fmr 1, 3			; FMF-NEXT: xsaddsp 1, 0, 3
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fadd_reassoc2:			; GLOBAL-LABEL: fmul_fadd_reassoc2:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsmaddasp 3, 1, 2			; GLOBAL-NEXT: xsmaddasp 3, 1, 2
	; GLOBAL-NEXT: fmr 1, 3			; GLOBAL-NEXT: fmr 1, 3
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul reassoc float %x, %y			%mul = fmul reassoc float %x, %y
	%add = fadd reassoc float %mul, %z			%add = fadd reassoc float %mul, %z
	ret float %add			ret float %add
	}			}

	; The fadd is now fully 'fast'. This implies that contraction is allowed.			; The fadd is now fully 'fast', but fmul is not yet.
				spatel: This comment was not valid even before this patch.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_fast1:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_fast1:'
	; FMFDEBUG: fma reassoc {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}			; FMFDEBUG-NOT: fma nnan ninf nsz arcp contract afn reassoc {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_fast1:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_fast1:'

	define float @fmul_fadd_fast1(float %x, float %y, float %z) {			define float @fmul_fadd_fast1(float %x, float %y, float %z) {
	; FMF-LABEL: fmul_fadd_fast1:			; FMF-LABEL: fmul_fadd_fast1:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xsmaddasp 3, 1, 2			; FMF-NEXT: xsmulsp 0, 1, 2
	; FMF-NEXT: fmr 1, 3			; FMF-NEXT: xsaddsp 1, 0, 3
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fadd_fast1:			; GLOBAL-LABEL: fmul_fadd_fast1:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsmaddasp 3, 1, 2			; GLOBAL-NEXT: xsmaddasp 3, 1, 2
	; GLOBAL-NEXT: fmr 1, 3			; GLOBAL-NEXT: fmr 1, 3
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul reassoc float %x, %y			%mul = fmul float %x, %y
	%add = fadd reassoc float %mul, %z			%add = fadd fast float %mul, %z
	ret float %add			ret float %add
	}			}

	; This shouldn't change anything - the intermediate fmul result is now also flagged.			; This implies that contraction is allowed - the intermediate fmul result is now also flagged.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_fast2:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fadd_fast2:'
	; FMFDEBUG: fma reassoc {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}			; FMFDEBUG: fma nnan ninf nsz arcp contract afn reassoc {{t[0-9]+}}, {{t[0-9]+}}, {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_fast2:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fadd_fast2:'

	define float @fmul_fadd_fast2(float %x, float %y, float %z) {			define float @fmul_fadd_fast2(float %x, float %y, float %z) {
	; FMF-LABEL: fmul_fadd_fast2:			; FMF-LABEL: fmul_fadd_fast2:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xsmaddasp 3, 1, 2			; FMF-NEXT: xsmaddasp 3, 1, 2
	; FMF-NEXT: fmr 1, 3			; FMF-NEXT: fmr 1, 3
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fadd_fast2:			; GLOBAL-LABEL: fmul_fadd_fast2:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsmaddasp 3, 1, 2			; GLOBAL-NEXT: xsmaddasp 3, 1, 2
	; GLOBAL-NEXT: fmr 1, 3			; GLOBAL-NEXT: fmr 1, 3
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul reassoc float %x, %y			%mul = fmul fast float %x, %y
	%add = fadd reassoc float %mul, %z			%add = fadd fast float %mul, %z
	ret float %add			ret float %add
	}			}

	; fma(X, 7.0, X * 42.0) --> X * 49.0			; fma(X, 7.0, X * 42.0) --> X * 49.0
	; This is the minimum FMF needed for this transform - the FMA allows reassociation.			; This is the minimum FMF needed for this transform - the FMA allows reassociation.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc1:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc1:'
	; FMFDEBUG: fmul reassoc {{t[0-9]+}},			; FMFDEBUG: fmul reassoc {{t[0-9]+}},
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc1:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc1:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc1:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc1:'
	; GLOBALDEBUG: fmul reassoc {{t[0-9]+}}			; GLOBALDEBUG: fmul reassoc {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc1:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc1:'

	define float @fmul_fma_reassoc1(float %x) {			define float @fmul_fma_reassoc1(float %x) {
	; FMF-LABEL: fmul_fma_reassoc1:			; FMF-LABEL: fmul_fma_reassoc1:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: addis 3, 2, .LCPI6_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI7_0@toc@ha
	; FMF-NEXT: lfs 0, .LCPI6_0@toc@l(3)			; FMF-NEXT: lfs 0, .LCPI7_0@toc@l(3)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fma_reassoc1:			; GLOBAL-LABEL: fmul_fma_reassoc1:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: addis 3, 2, .LCPI6_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI7_0@toc@ha
	; GLOBAL-NEXT: lfs 0, .LCPI6_0@toc@l(3)			; GLOBAL-NEXT: lfs 0, .LCPI7_0@toc@l(3)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul float %x, 42.0			%mul = fmul float %x, 42.0
	%fma = call reassoc float @llvm.fma.f32(float %x, float 7.0, float %mul)			%fma = call reassoc float @llvm.fma.f32(float %x, float 7.0, float %mul)
	ret float %fma			ret float %fma
	}			}

	; This shouldn't change anything - the intermediate fmul result is now also flagged.			; This shouldn't change anything - the intermediate fmul result is now also flagged.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc2:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc2:'
	; FMFDEBUG: fmul reassoc {{t[0-9]+}}			; FMFDEBUG: fmul reassoc {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc2:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc2:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc2:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_reassoc2:'
	; GLOBALDEBUG: fmul reassoc {{t[0-9]+}}			; GLOBALDEBUG: fmul reassoc {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc2:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_reassoc2:'

	define float @fmul_fma_reassoc2(float %x) {			define float @fmul_fma_reassoc2(float %x) {
	; FMF-LABEL: fmul_fma_reassoc2:			; FMF-LABEL: fmul_fma_reassoc2:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: addis 3, 2, .LCPI7_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI8_0@toc@ha
	; FMF-NEXT: lfs 0, .LCPI7_0@toc@l(3)			; FMF-NEXT: lfs 0, .LCPI8_0@toc@l(3)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fma_reassoc2:			; GLOBAL-LABEL: fmul_fma_reassoc2:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: addis 3, 2, .LCPI7_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI8_0@toc@ha
	; GLOBAL-NEXT: lfs 0, .LCPI7_0@toc@l(3)			; GLOBAL-NEXT: lfs 0, .LCPI8_0@toc@l(3)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul reassoc float %x, 42.0			%mul = fmul reassoc float %x, 42.0
	%fma = call reassoc float @llvm.fma.f32(float %x, float 7.0, float %mul)			%fma = call reassoc float @llvm.fma.f32(float %x, float 7.0, float %mul)
	ret float %fma			ret float %fma
	}			}

	; The FMA is now fully 'fast'. This implies that reassociation is allowed.			; The FMA is now fully 'fast'. This implies that reassociation is allowed.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast1:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast1:'
	; FMFDEBUG: fmul reassoc {{t[0-9]+}}			; FMFDEBUG: fmul nnan ninf nsz arcp contract afn reassoc {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast1:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast1:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast1:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast1:'
	; GLOBALDEBUG: fmul reassoc {{t[0-9]+}}			; GLOBALDEBUG: fmul nnan ninf nsz arcp contract afn reassoc {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast1:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast1:'

	define float @fmul_fma_fast1(float %x) {			define float @fmul_fma_fast1(float %x) {
	; FMF-LABEL: fmul_fma_fast1:			; FMF-LABEL: fmul_fma_fast1:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: addis 3, 2, .LCPI8_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI9_0@toc@ha
	; FMF-NEXT: lfs 0, .LCPI8_0@toc@l(3)			; FMF-NEXT: lfs 0, .LCPI9_0@toc@l(3)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fma_fast1:			; GLOBAL-LABEL: fmul_fma_fast1:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: addis 3, 2, .LCPI8_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI9_0@toc@ha
	; GLOBAL-NEXT: lfs 0, .LCPI8_0@toc@l(3)			; GLOBAL-NEXT: lfs 0, .LCPI9_0@toc@l(3)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul float %x, 42.0			%mul = fmul float %x, 42.0
	%fma = call reassoc float @llvm.fma.f32(float %x, float 7.0, float %mul)			%fma = call fast float @llvm.fma.f32(float %x, float 7.0, float %mul)
	ret float %fma			ret float %fma
	}			}

	; This shouldn't change anything - the intermediate fmul result is now also flagged.			; This shouldn't change anything - the intermediate fmul result is now also flagged.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast2:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast2:'
	; FMFDEBUG: fmul reassoc {{t[0-9]+}}			; FMFDEBUG: fmul nnan ninf nsz arcp contract afn reassoc {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast2:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast2:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast2:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fmul_fma_fast2:'
	; GLOBALDEBUG: fmul reassoc {{t[0-9]+}}			; GLOBALDEBUG: fmul nnan ninf nsz arcp contract afn reassoc {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast2:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fmul_fma_fast2:'

	define float @fmul_fma_fast2(float %x) {			define float @fmul_fma_fast2(float %x) {
	; FMF-LABEL: fmul_fma_fast2:			; FMF-LABEL: fmul_fma_fast2:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: addis 3, 2, .LCPI9_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI10_0@toc@ha
	; FMF-NEXT: lfs 0, .LCPI9_0@toc@l(3)			; FMF-NEXT: lfs 0, .LCPI10_0@toc@l(3)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fmul_fma_fast2:			; GLOBAL-LABEL: fmul_fma_fast2:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: addis 3, 2, .LCPI9_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI10_0@toc@ha
	; GLOBAL-NEXT: lfs 0, .LCPI9_0@toc@l(3)			; GLOBAL-NEXT: lfs 0, .LCPI10_0@toc@l(3)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%mul = fmul reassoc float %x, 42.0			%mul = fmul fast float %x, 42.0
	%fma = call reassoc float @llvm.fma.f32(float %x, float 7.0, float %mul)			%fma = call fast float @llvm.fma.f32(float %x, float 7.0, float %mul)
	ret float %fma			ret float %fma
	}			}

	; Reduced precision for sqrt is allowed - should use estimate and NR iterations.			; Reduced precision for sqrt is allowed - should use estimate and NR iterations.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_afn_ieee:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_afn_ieee:'
	; FMFDEBUG: fmul ninf afn {{t[0-9]+}}			; FMFDEBUG: fmul ninf afn {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_afn_ieee:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_afn_ieee:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_afn_ieee:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_afn_ieee:'
	; GLOBALDEBUG: fmul ninf afn {{t[0-9]+}}			; GLOBALDEBUG: fmul ninf afn {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_afn_ieee:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_afn_ieee:'

	define float @sqrt_afn_ieee(float %x) #0 {			define float @sqrt_afn_ieee(float %x) #0 {
	; FMF-LABEL: sqrt_afn_ieee:			; FMF-LABEL: sqrt_afn_ieee:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xsabsdp 0, 1			; FMF-NEXT: xsabsdp 0, 1
	; FMF-NEXT: addis 3, 2, .LCPI10_2@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI11_2@toc@ha
	; FMF-NEXT: lfs 2, .LCPI10_2@toc@l(3)			; FMF-NEXT: lfs 2, .LCPI11_2@toc@l(3)
	; FMF-NEXT: fcmpu 0, 0, 2			; FMF-NEXT: fcmpu 0, 0, 2
	; FMF-NEXT: xxlxor 0, 0, 0			; FMF-NEXT: xxlxor 0, 0, 0
	; FMF-NEXT: blt 0, .LBB10_2			; FMF-NEXT: blt 0, .LBB11_2
	; FMF-NEXT: # %bb.1:			; FMF-NEXT: # %bb.1:
	; FMF-NEXT: xsrsqrtesp 0, 1			; FMF-NEXT: xsrsqrtesp 0, 1
	; FMF-NEXT: addis 3, 2, .LCPI10_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI11_0@toc@ha
	; FMF-NEXT: addis 4, 2, .LCPI10_1@toc@ha			; FMF-NEXT: addis 4, 2, .LCPI11_1@toc@ha
	; FMF-NEXT: lfs 2, .LCPI10_0@toc@l(3)			; FMF-NEXT: lfs 2, .LCPI11_0@toc@l(3)
	; FMF-NEXT: lfs 3, .LCPI10_1@toc@l(4)			; FMF-NEXT: lfs 3, .LCPI11_1@toc@l(4)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: xsmulsp 0, 1, 0			; FMF-NEXT: xsmulsp 0, 1, 0
	; FMF-NEXT: xsmulsp 1, 1, 2			; FMF-NEXT: xsmulsp 1, 1, 2
	; FMF-NEXT: xsaddsp 0, 0, 3			; FMF-NEXT: xsaddsp 0, 0, 3
	; FMF-NEXT: xsmulsp 0, 1, 0			; FMF-NEXT: xsmulsp 0, 1, 0
	; FMF-NEXT: .LBB10_2:			; FMF-NEXT: .LBB11_2:
	; FMF-NEXT: fmr 1, 0			; FMF-NEXT: fmr 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: sqrt_afn_ieee:			; GLOBAL-LABEL: sqrt_afn_ieee:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsabsdp 0, 1			; GLOBAL-NEXT: xsabsdp 0, 1
	; GLOBAL-NEXT: addis 3, 2, .LCPI10_2@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI11_2@toc@ha
	; GLOBAL-NEXT: lfs 2, .LCPI10_2@toc@l(3)			; GLOBAL-NEXT: lfs 2, .LCPI11_2@toc@l(3)
	; GLOBAL-NEXT: fcmpu 0, 0, 2			; GLOBAL-NEXT: fcmpu 0, 0, 2
	; GLOBAL-NEXT: xxlxor 0, 0, 0			; GLOBAL-NEXT: xxlxor 0, 0, 0
	; GLOBAL-NEXT: blt 0, .LBB10_2			; GLOBAL-NEXT: blt 0, .LBB11_2
	; GLOBAL-NEXT: # %bb.1:			; GLOBAL-NEXT: # %bb.1:
	; GLOBAL-NEXT: xsrsqrtesp 0, 1			; GLOBAL-NEXT: xsrsqrtesp 0, 1
	; GLOBAL-NEXT: addis 3, 2, .LCPI10_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI11_0@toc@ha
	; GLOBAL-NEXT: addis 4, 2, .LCPI10_1@toc@ha			; GLOBAL-NEXT: addis 4, 2, .LCPI11_1@toc@ha
	; GLOBAL-NEXT: lfs 2, .LCPI10_0@toc@l(3)			; GLOBAL-NEXT: lfs 2, .LCPI11_0@toc@l(3)
	; GLOBAL-NEXT: lfs 3, .LCPI10_1@toc@l(4)			; GLOBAL-NEXT: lfs 3, .LCPI11_1@toc@l(4)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: xsmaddasp 2, 1, 0			; GLOBAL-NEXT: xsmaddasp 2, 1, 0
	; GLOBAL-NEXT: xsmulsp 0, 1, 3			; GLOBAL-NEXT: xsmulsp 0, 1, 3
	; GLOBAL-NEXT: xsmulsp 0, 0, 2			; GLOBAL-NEXT: xsmulsp 0, 0, 2
	; GLOBAL-NEXT: .LBB10_2:			; GLOBAL-NEXT: .LBB11_2:
	; GLOBAL-NEXT: fmr 1, 0			; GLOBAL-NEXT: fmr 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%rt = call afn ninf float @llvm.sqrt.f32(float %x)			%rt = call afn ninf float @llvm.sqrt.f32(float %x)
	ret float %rt			ret float %rt
	}			}

	define float @sqrt_afn_ieee_inf(float %x) #0 {			define float @sqrt_afn_ieee_inf(float %x) #0 {
	; FMF-LABEL: sqrt_afn_ieee_inf:			; FMF-LABEL: sqrt_afn_ieee_inf:
	Show All 17 Lines
	; GLOBALDEBUG: fmul ninf afn {{t[0-9]+}}			; GLOBALDEBUG: fmul ninf afn {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_afn_preserve_sign:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_afn_preserve_sign:'

	define float @sqrt_afn_preserve_sign(float %x) #1 {			define float @sqrt_afn_preserve_sign(float %x) #1 {
	; FMF-LABEL: sqrt_afn_preserve_sign:			; FMF-LABEL: sqrt_afn_preserve_sign:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xxlxor 0, 0, 0			; FMF-NEXT: xxlxor 0, 0, 0
	; FMF-NEXT: fcmpu 0, 1, 0			; FMF-NEXT: fcmpu 0, 1, 0
	; FMF-NEXT: beq 0, .LBB12_2			; FMF-NEXT: beq 0, .LBB13_2
	; FMF-NEXT: # %bb.1:			; FMF-NEXT: # %bb.1:
	; FMF-NEXT: xsrsqrtesp 0, 1			; FMF-NEXT: xsrsqrtesp 0, 1
	; FMF-NEXT: addis 3, 2, .LCPI12_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI13_0@toc@ha
	; FMF-NEXT: addis 4, 2, .LCPI12_1@toc@ha			; FMF-NEXT: addis 4, 2, .LCPI13_1@toc@ha
	; FMF-NEXT: lfs 2, .LCPI12_0@toc@l(3)			; FMF-NEXT: lfs 2, .LCPI13_0@toc@l(3)
	; FMF-NEXT: lfs 3, .LCPI12_1@toc@l(4)			; FMF-NEXT: lfs 3, .LCPI13_1@toc@l(4)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: xsmulsp 0, 1, 0			; FMF-NEXT: xsmulsp 0, 1, 0
	; FMF-NEXT: xsmulsp 1, 1, 2			; FMF-NEXT: xsmulsp 1, 1, 2
	; FMF-NEXT: xsaddsp 0, 0, 3			; FMF-NEXT: xsaddsp 0, 0, 3
	; FMF-NEXT: xsmulsp 0, 1, 0			; FMF-NEXT: xsmulsp 0, 1, 0
	; FMF-NEXT: .LBB12_2:			; FMF-NEXT: .LBB13_2:
	; FMF-NEXT: fmr 1, 0			; FMF-NEXT: fmr 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: sqrt_afn_preserve_sign:			; GLOBAL-LABEL: sqrt_afn_preserve_sign:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xxlxor 0, 0, 0			; GLOBAL-NEXT: xxlxor 0, 0, 0
	; GLOBAL-NEXT: fcmpu 0, 1, 0			; GLOBAL-NEXT: fcmpu 0, 1, 0
	; GLOBAL-NEXT: beq 0, .LBB12_2			; GLOBAL-NEXT: beq 0, .LBB13_2
	; GLOBAL-NEXT: # %bb.1:			; GLOBAL-NEXT: # %bb.1:
	; GLOBAL-NEXT: xsrsqrtesp 0, 1			; GLOBAL-NEXT: xsrsqrtesp 0, 1
	; GLOBAL-NEXT: addis 3, 2, .LCPI12_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI13_0@toc@ha
	; GLOBAL-NEXT: addis 4, 2, .LCPI12_1@toc@ha			; GLOBAL-NEXT: addis 4, 2, .LCPI13_1@toc@ha
	; GLOBAL-NEXT: lfs 2, .LCPI12_0@toc@l(3)			; GLOBAL-NEXT: lfs 2, .LCPI13_0@toc@l(3)
	; GLOBAL-NEXT: lfs 3, .LCPI12_1@toc@l(4)			; GLOBAL-NEXT: lfs 3, .LCPI13_1@toc@l(4)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: xsmaddasp 2, 1, 0			; GLOBAL-NEXT: xsmaddasp 2, 1, 0
	; GLOBAL-NEXT: xsmulsp 0, 1, 3			; GLOBAL-NEXT: xsmulsp 0, 1, 3
	; GLOBAL-NEXT: xsmulsp 0, 0, 2			; GLOBAL-NEXT: xsmulsp 0, 0, 2
	; GLOBAL-NEXT: .LBB12_2:			; GLOBAL-NEXT: .LBB13_2:
	; GLOBAL-NEXT: fmr 1, 0			; GLOBAL-NEXT: fmr 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%rt = call afn ninf float @llvm.sqrt.f32(float %x)			%rt = call afn ninf float @llvm.sqrt.f32(float %x)
	ret float %rt			ret float %rt
	}			}

	define float @sqrt_afn_preserve_sign_inf(float %x) #1 {			define float @sqrt_afn_preserve_sign_inf(float %x) #1 {
	; FMF-LABEL: sqrt_afn_preserve_sign_inf:			; FMF-LABEL: sqrt_afn_preserve_sign_inf:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xssqrtsp 1, 1			; FMF-NEXT: xssqrtsp 1, 1
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: sqrt_afn_preserve_sign_inf:			; GLOBAL-LABEL: sqrt_afn_preserve_sign_inf:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xssqrtsp 1, 1			; GLOBAL-NEXT: xssqrtsp 1, 1
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%rt = call afn float @llvm.sqrt.f32(float %x)			%rt = call afn float @llvm.sqrt.f32(float %x)
	ret float %rt			ret float %rt
	}			}

	; The call is now fully 'fast'. This implies that approximation is allowed.			; The call is now fully 'fast'. This implies that approximation is allowed.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_ieee:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_ieee:'
	; FMFDEBUG: fmul ninf afn reassoc {{t[0-9]+}}			; FMFDEBUG: fmul ninf contract afn reassoc {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_ieee:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_ieee:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_ieee:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_ieee:'
	; GLOBALDEBUG: fmul ninf afn reassoc {{t[0-9]+}}			; GLOBALDEBUG: fmul ninf contract afn reassoc {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_ieee:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_ieee:'

	define float @sqrt_fast_ieee(float %x) #0 {			define float @sqrt_fast_ieee(float %x) #0 {
	; FMF-LABEL: sqrt_fast_ieee:			; FMF-LABEL: sqrt_fast_ieee:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xsabsdp 0, 1			; FMF-NEXT: xsabsdp 0, 1
	; FMF-NEXT: addis 3, 2, .LCPI14_2@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI15_2@toc@ha
	; FMF-NEXT: lfs 2, .LCPI14_2@toc@l(3)			; FMF-NEXT: lfs 2, .LCPI15_2@toc@l(3)
	; FMF-NEXT: fcmpu 0, 0, 2			; FMF-NEXT: fcmpu 0, 0, 2
	; FMF-NEXT: xxlxor 0, 0, 0			; FMF-NEXT: xxlxor 0, 0, 0
	; FMF-NEXT: blt 0, .LBB14_2			; FMF-NEXT: blt 0, .LBB15_2
	; FMF-NEXT: # %bb.1:			; FMF-NEXT: # %bb.1:
	; FMF-NEXT: xsrsqrtesp 0, 1			; FMF-NEXT: xsrsqrtesp 0, 1
	; FMF-NEXT: addis 3, 2, .LCPI14_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI15_0@toc@ha
	; FMF-NEXT: addis 4, 2, .LCPI14_1@toc@ha			; FMF-NEXT: addis 4, 2, .LCPI15_1@toc@ha
	; FMF-NEXT: lfs 2, .LCPI14_0@toc@l(3)			; FMF-NEXT: lfs 2, .LCPI15_0@toc@l(3)
	; FMF-NEXT: lfs 3, .LCPI14_1@toc@l(4)			; FMF-NEXT: lfs 3, .LCPI15_1@toc@l(4)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: xsmaddasp 2, 1, 0			; FMF-NEXT: xsmaddasp 2, 1, 0
	; FMF-NEXT: xsmulsp 0, 1, 3			; FMF-NEXT: xsmulsp 0, 1, 3
	; FMF-NEXT: xsmulsp 0, 0, 2			; FMF-NEXT: xsmulsp 0, 0, 2
	; FMF-NEXT: .LBB14_2:			; FMF-NEXT: .LBB15_2:
	; FMF-NEXT: fmr 1, 0			; FMF-NEXT: fmr 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: sqrt_fast_ieee:			; GLOBAL-LABEL: sqrt_fast_ieee:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xsabsdp 0, 1			; GLOBAL-NEXT: xsabsdp 0, 1
	; GLOBAL-NEXT: addis 3, 2, .LCPI14_2@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI15_2@toc@ha
	; GLOBAL-NEXT: lfs 2, .LCPI14_2@toc@l(3)			; GLOBAL-NEXT: lfs 2, .LCPI15_2@toc@l(3)
	; GLOBAL-NEXT: fcmpu 0, 0, 2			; GLOBAL-NEXT: fcmpu 0, 0, 2
	; GLOBAL-NEXT: xxlxor 0, 0, 0			; GLOBAL-NEXT: xxlxor 0, 0, 0
	; GLOBAL-NEXT: blt 0, .LBB14_2			; GLOBAL-NEXT: blt 0, .LBB15_2
	; GLOBAL-NEXT: # %bb.1:			; GLOBAL-NEXT: # %bb.1:
	; GLOBAL-NEXT: xsrsqrtesp 0, 1			; GLOBAL-NEXT: xsrsqrtesp 0, 1
	; GLOBAL-NEXT: addis 3, 2, .LCPI14_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI15_0@toc@ha
	; GLOBAL-NEXT: addis 4, 2, .LCPI14_1@toc@ha			; GLOBAL-NEXT: addis 4, 2, .LCPI15_1@toc@ha
	; GLOBAL-NEXT: lfs 2, .LCPI14_0@toc@l(3)			; GLOBAL-NEXT: lfs 2, .LCPI15_0@toc@l(3)
	; GLOBAL-NEXT: lfs 3, .LCPI14_1@toc@l(4)			; GLOBAL-NEXT: lfs 3, .LCPI15_1@toc@l(4)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: xsmaddasp 2, 1, 0			; GLOBAL-NEXT: xsmaddasp 2, 1, 0
	; GLOBAL-NEXT: xsmulsp 0, 1, 3			; GLOBAL-NEXT: xsmulsp 0, 1, 3
	; GLOBAL-NEXT: xsmulsp 0, 0, 2			; GLOBAL-NEXT: xsmulsp 0, 0, 2
	; GLOBAL-NEXT: .LBB14_2:			; GLOBAL-NEXT: .LBB15_2:
	; GLOBAL-NEXT: fmr 1, 0			; GLOBAL-NEXT: fmr 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%rt = call reassoc afn ninf float @llvm.sqrt.f32(float %x)			%rt = call contract reassoc afn ninf float @llvm.sqrt.f32(float %x)
	ret float %rt			ret float %rt
	}			}

	; The call is now fully 'fast'. This implies that approximation is allowed.			; The call is now fully 'fast'. This implies that approximation is allowed.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_preserve_sign:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_preserve_sign:'
	; FMFDEBUG: fmul ninf afn reassoc {{t[0-9]+}}			; FMFDEBUG: fmul ninf contract afn reassoc {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_preserve_sign:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_preserve_sign:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_preserve_sign:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'sqrt_fast_preserve_sign:'
	; GLOBALDEBUG: fmul ninf afn reassoc {{t[0-9]+}}			; GLOBALDEBUG: fmul ninf contract afn reassoc {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_preserve_sign:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'sqrt_fast_preserve_sign:'

	define float @sqrt_fast_preserve_sign(float %x) #1 {			define float @sqrt_fast_preserve_sign(float %x) #1 {
	; FMF-LABEL: sqrt_fast_preserve_sign:			; FMF-LABEL: sqrt_fast_preserve_sign:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xxlxor 0, 0, 0			; FMF-NEXT: xxlxor 0, 0, 0
	; FMF-NEXT: fcmpu 0, 1, 0			; FMF-NEXT: fcmpu 0, 1, 0
	; FMF-NEXT: beq 0, .LBB15_2			; FMF-NEXT: beq 0, .LBB16_2
	; FMF-NEXT: # %bb.1:			; FMF-NEXT: # %bb.1:
	; FMF-NEXT: xsrsqrtesp 0, 1			; FMF-NEXT: xsrsqrtesp 0, 1
	; FMF-NEXT: addis 3, 2, .LCPI15_0@toc@ha			; FMF-NEXT: addis 3, 2, .LCPI16_0@toc@ha
	; FMF-NEXT: addis 4, 2, .LCPI15_1@toc@ha			; FMF-NEXT: addis 4, 2, .LCPI16_1@toc@ha
	; FMF-NEXT: lfs 2, .LCPI15_0@toc@l(3)			; FMF-NEXT: lfs 2, .LCPI16_0@toc@l(3)
	; FMF-NEXT: lfs 3, .LCPI15_1@toc@l(4)			; FMF-NEXT: lfs 3, .LCPI16_1@toc@l(4)
	; FMF-NEXT: xsmulsp 1, 1, 0			; FMF-NEXT: xsmulsp 1, 1, 0
	; FMF-NEXT: xsmaddasp 2, 1, 0			; FMF-NEXT: xsmaddasp 2, 1, 0
	; FMF-NEXT: xsmulsp 0, 1, 3			; FMF-NEXT: xsmulsp 0, 1, 3
	; FMF-NEXT: xsmulsp 0, 0, 2			; FMF-NEXT: xsmulsp 0, 0, 2
	; FMF-NEXT: .LBB15_2:			; FMF-NEXT: .LBB16_2:
	; FMF-NEXT: fmr 1, 0			; FMF-NEXT: fmr 1, 0
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: sqrt_fast_preserve_sign:			; GLOBAL-LABEL: sqrt_fast_preserve_sign:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xxlxor 0, 0, 0			; GLOBAL-NEXT: xxlxor 0, 0, 0
	; GLOBAL-NEXT: fcmpu 0, 1, 0			; GLOBAL-NEXT: fcmpu 0, 1, 0
	; GLOBAL-NEXT: beq 0, .LBB15_2			; GLOBAL-NEXT: beq 0, .LBB16_2
	; GLOBAL-NEXT: # %bb.1:			; GLOBAL-NEXT: # %bb.1:
	; GLOBAL-NEXT: xsrsqrtesp 0, 1			; GLOBAL-NEXT: xsrsqrtesp 0, 1
	; GLOBAL-NEXT: addis 3, 2, .LCPI15_0@toc@ha			; GLOBAL-NEXT: addis 3, 2, .LCPI16_0@toc@ha
	; GLOBAL-NEXT: addis 4, 2, .LCPI15_1@toc@ha			; GLOBAL-NEXT: addis 4, 2, .LCPI16_1@toc@ha
	; GLOBAL-NEXT: lfs 2, .LCPI15_0@toc@l(3)			; GLOBAL-NEXT: lfs 2, .LCPI16_0@toc@l(3)
	; GLOBAL-NEXT: lfs 3, .LCPI15_1@toc@l(4)			; GLOBAL-NEXT: lfs 3, .LCPI16_1@toc@l(4)
	; GLOBAL-NEXT: xsmulsp 1, 1, 0			; GLOBAL-NEXT: xsmulsp 1, 1, 0
	; GLOBAL-NEXT: xsmaddasp 2, 1, 0			; GLOBAL-NEXT: xsmaddasp 2, 1, 0
	; GLOBAL-NEXT: xsmulsp 0, 1, 3			; GLOBAL-NEXT: xsmulsp 0, 1, 3
	; GLOBAL-NEXT: xsmulsp 0, 0, 2			; GLOBAL-NEXT: xsmulsp 0, 0, 2
	; GLOBAL-NEXT: .LBB15_2:			; GLOBAL-NEXT: .LBB16_2:
	; GLOBAL-NEXT: fmr 1, 0			; GLOBAL-NEXT: fmr 1, 0
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%rt = call reassoc ninf afn float @llvm.sqrt.f32(float %x)			%rt = call contract reassoc ninf afn float @llvm.sqrt.f32(float %x)
	ret float %rt			ret float %rt
	}			}

	; fcmp can have fast-math-flags.			; fcmp can have fast-math-flags.

	; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fcmp_nnan:'			; FMFDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fcmp_nnan:'
	; FMFDEBUG: select_cc nnan {{t[0-9]+}}			; FMFDEBUG: select_cc nnan {{t[0-9]+}}
	; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fcmp_nnan:'			; FMFDEBUG: Type-legalized selection DAG: %bb.0 'fcmp_nnan:'

	; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fcmp_nnan:'			; GLOBALDEBUG-LABEL: Optimized lowered selection DAG: %bb.0 'fcmp_nnan:'
	; GLOBALDEBUG: select_cc nnan {{t[0-9]+}}			; GLOBALDEBUG: select_cc nnan {{t[0-9]+}}
	; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fcmp_nnan:'			; GLOBALDEBUG: Type-legalized selection DAG: %bb.0 'fcmp_nnan:'

	define double @fcmp_nnan(double %a, double %y, double %z) {			define double @fcmp_nnan(double %a, double %y, double %z) {
	; FMF-LABEL: fcmp_nnan:			; FMF-LABEL: fcmp_nnan:
	; FMF: # %bb.0:			; FMF: # %bb.0:
	; FMF-NEXT: xxlxor 0, 0, 0			; FMF-NEXT: xxlxor 0, 0, 0
	; FMF-NEXT: xscmpudp 0, 1, 0			; FMF-NEXT: xscmpudp 0, 1, 0
	; FMF-NEXT: blt 0, .LBB16_2			; FMF-NEXT: blt 0, .LBB17_2
	; FMF-NEXT: # %bb.1:			; FMF-NEXT: # %bb.1:
	; FMF-NEXT: fmr 3, 2			; FMF-NEXT: fmr 3, 2
	; FMF-NEXT: .LBB16_2:			; FMF-NEXT: .LBB17_2:
	; FMF-NEXT: fmr 1, 3			; FMF-NEXT: fmr 1, 3
	; FMF-NEXT: blr			; FMF-NEXT: blr
	;			;
	; GLOBAL-LABEL: fcmp_nnan:			; GLOBAL-LABEL: fcmp_nnan:
	; GLOBAL: # %bb.0:			; GLOBAL: # %bb.0:
	; GLOBAL-NEXT: xxlxor 0, 0, 0			; GLOBAL-NEXT: xxlxor 0, 0, 0
	; GLOBAL-NEXT: xscmpudp 0, 1, 0			; GLOBAL-NEXT: xscmpudp 0, 1, 0
	; GLOBAL-NEXT: blt 0, .LBB16_2			; GLOBAL-NEXT: blt 0, .LBB17_2
	; GLOBAL-NEXT: # %bb.1:			; GLOBAL-NEXT: # %bb.1:
	; GLOBAL-NEXT: fmr 3, 2			; GLOBAL-NEXT: fmr 3, 2
	; GLOBAL-NEXT: .LBB16_2:			; GLOBAL-NEXT: .LBB17_2:
	; GLOBAL-NEXT: fmr 1, 3			; GLOBAL-NEXT: fmr 1, 3
	; GLOBAL-NEXT: blr			; GLOBAL-NEXT: blr
	%cmp = fcmp nnan ult double %a, 0.0			%cmp = fcmp nnan ult double %a, 0.0
	%z.y = select i1 %cmp, double %z, double %y			%z.y = select i1 %cmp, double %z, double %y
	ret double %z.y			ret double %z.y
	}			}

	; FP library calls can have fast-math-flags.			; FP library calls can have fast-math-flags.
	[69 lines not shown]

llvm/test/CodeGen/PowerPC/machine-combiner.ll

	[202 lines not shown]

	define double @reassociate_mamaa_double(double %0, double %1, double %2, double %3, double %4, double %5) {			define double @reassociate_mamaa_double(double %0, double %1, double %2, double %3, double %4, double %5) {
	; CHECK-LABEL: reassociate_mamaa_double:			; CHECK-LABEL: reassociate_mamaa_double:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-PWR-DAG: xsmaddadp 1, 6, 5			; CHECK-PWR-DAG: xsmaddadp 1, 6, 5
	; CHECK-PWR-DAG: xsmaddadp 2, 4, 3			; CHECK-PWR-DAG: xsmaddadp 2, 4, 3
	; CHECK-PWR: xsadddp 1, 2, 1			; CHECK-PWR: xsadddp 1, 2, 1
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%7 = fmul reassoc nsz double %3, %2			%7 = fmul contract reassoc nsz double %3, %2
	%8 = fmul reassoc nsz double %5, %4			%8 = fmul contract reassoc nsz double %5, %4
	%9 = fadd reassoc nsz double %1, %0			%9 = fadd contract reassoc nsz double %1, %0
	%10 = fadd reassoc nsz double %9, %7			%10 = fadd contract reassoc nsz double %9, %7
	%11 = fadd reassoc nsz double %10, %8			%11 = fadd contract reassoc nsz double %10, %8
	ret double %11			ret double %11
	}			}

	define float @reassociate_mamaa_float(float %0, float %1, float %2, float %3, float %4, float %5) {			define float @reassociate_mamaa_float(float %0, float %1, float %2, float %3, float %4, float %5) {
	; CHECK-LABEL: reassociate_mamaa_float:			; CHECK-LABEL: reassociate_mamaa_float:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-DAG: fmadds [[REG0:[0-9]+]], 4, 3, 2			; CHECK-DAG: fmadds [[REG0:[0-9]+]], 4, 3, 2
	; CHECK-DAG: fmadds [[REG1:[0-9]+]], 6, 5, 1			; CHECK-DAG: fmadds [[REG1:[0-9]+]], 6, 5, 1
	; CHECK: fadds 1, [[REG0]], [[REG1]]			; CHECK: fadds 1, [[REG0]], [[REG1]]
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%7 = fmul reassoc nsz float %3, %2			%7 = fmul contract reassoc nsz float %3, %2
	%8 = fmul reassoc nsz float %5, %4			%8 = fmul contract reassoc nsz float %5, %4
	%9 = fadd reassoc nsz float %1, %0			%9 = fadd contract reassoc nsz float %1, %0
	%10 = fadd reassoc nsz float %9, %7			%10 = fadd contract reassoc nsz float %9, %7
	%11 = fadd reassoc nsz float %10, %8			%11 = fadd contract reassoc nsz float %10, %8
	ret float %11			ret float %11
	}			}

	define <4 x float> @reassociate_mamaa_vec(<4 x float> %0, <4 x float> %1, <4 x float> %2, <4 x float> %3, <4 x float> %4, <4 x float> %5) {			define <4 x float> @reassociate_mamaa_vec(<4 x float> %0, <4 x float> %1, <4 x float> %2, <4 x float> %3, <4 x float> %4, <4 x float> %5) {
	; CHECK-LABEL: reassociate_mamaa_vec:			; CHECK-LABEL: reassociate_mamaa_vec:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-PWR-DAG: xvmaddasp [[REG0:[0-9]+]], 39, 38			; CHECK-PWR-DAG: xvmaddasp [[REG0:[0-9]+]], 39, 38
	; CHECK-PWR-DAG: xvmaddasp [[REG1:[0-9]+]], 37, 36			; CHECK-PWR-DAG: xvmaddasp [[REG1:[0-9]+]], 37, 36
	; CHECK-PWR: xvaddsp 34, [[REG1]], [[REG0]]			; CHECK-PWR: xvaddsp 34, [[REG1]], [[REG0]]
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%7 = fmul reassoc nsz <4 x float> %3, %2			%7 = fmul contract reassoc nsz <4 x float> %3, %2
	%8 = fmul reassoc nsz <4 x float> %5, %4			%8 = fmul contract reassoc nsz <4 x float> %5, %4
	%9 = fadd reassoc nsz <4 x float> %1, %0			%9 = fadd contract reassoc nsz <4 x float> %1, %0
	%10 = fadd reassoc nsz <4 x float> %9, %7			%10 = fadd contract reassoc nsz <4 x float> %9, %7
	%11 = fadd reassoc nsz <4 x float> %10, %8			%11 = fadd contract reassoc nsz <4 x float> %10, %8
	ret <4 x float> %11			ret <4 x float> %11
	}			}

	define double @reassociate_mamama_double(double %0, double %1, double %2, double %3, double %4, double %5, double %6, double %7, double %8) {			define double @reassociate_mamama_double(double %0, double %1, double %2, double %3, double %4, double %5, double %6, double %7, double %8) {
	; CHECK-LABEL: reassociate_mamama_double:			; CHECK-LABEL: reassociate_mamama_double:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-PWR: xsmaddadp 7, 2, 1			; CHECK-PWR: xsmaddadp 7, 2, 1
	; CHECK-PWR-DAG: xsmuldp [[REG0:[0-9]+]], 4, 3			; CHECK-PWR-DAG: xsmuldp [[REG0:[0-9]+]], 4, 3
	; CHECK-PWR-DAG: xsmaddadp 7, 6, 5			; CHECK-PWR-DAG: xsmaddadp 7, 6, 5
	; CHECK-PWR-DAG: xsmaddadp [[REG0]], 9, 8			; CHECK-PWR-DAG: xsmaddadp [[REG0]], 9, 8
	; CHECK-PWR: xsadddp 1, 7, [[REG0]]			; CHECK-PWR: xsadddp 1, 7, [[REG0]]
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%10 = fmul reassoc nsz double %1, %0			%10 = fmul contract reassoc nsz double %1, %0
	%11 = fmul reassoc nsz double %3, %2			%11 = fmul contract reassoc nsz double %3, %2
	%12 = fmul reassoc nsz double %5, %4			%12 = fmul contract reassoc nsz double %5, %4
	%13 = fmul reassoc nsz double %8, %7			%13 = fmul contract reassoc nsz double %8, %7
	%14 = fadd reassoc nsz double %11, %10			%14 = fadd contract reassoc nsz double %11, %10
	%15 = fadd reassoc nsz double %14, %6			%15 = fadd contract reassoc nsz double %14, %6
	%16 = fadd reassoc nsz double %15, %12			%16 = fadd contract reassoc nsz double %15, %12
	%17 = fadd reassoc nsz double %16, %13			%17 = fadd contract reassoc nsz double %16, %13
	ret double %17			ret double %17
	}			}

	define dso_local float @reassociate_mamama_8(float %0, float %1, float %2, float %3, float %4, float %5, float %6, float %7, float %8,			define dso_local float @reassociate_mamama_8(float %0, float %1, float %2, float %3, float %4, float %5, float %6, float %7, float %8,
	float %9, float %10, float %11, float %12, float %13, float %14, float %15, float %16) {			float %9, float %10, float %11, float %12, float %13, float %14, float %15, float %16) {
	; CHECK-LABEL: reassociate_mamama_8:			; CHECK-LABEL: reassociate_mamama_8:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-DAG: fmadds [[REG0:[0-9]+]], 3, 2, 1			; CHECK-DAG: fmadds [[REG0:[0-9]+]], 3, 2, 1
	; CHECK-DAG: fmuls [[REG1:[0-9]+]], 5, 4			; CHECK-DAG: fmuls [[REG1:[0-9]+]], 5, 4
	; CHECK-DAG: fmadds [[REG2:[0-9]+]], 7, 6, [[REG0]]			; CHECK-DAG: fmadds [[REG2:[0-9]+]], 7, 6, [[REG0]]
	; CHECK-DAG: fmadds [[REG3:[0-9]+]], 9, 8, [[REG1]]			; CHECK-DAG: fmadds [[REG3:[0-9]+]], 9, 8, [[REG1]]
	;			;
	; CHECK-DAG: fmadds [[REG4:[0-9]+]], 13, 12, [[REG3]]			; CHECK-DAG: fmadds [[REG4:[0-9]+]], 13, 12, [[REG3]]
	; CHECK-DAG: fmadds [[REG5:[0-9]+]], 11, 10, [[REG2]]			; CHECK-DAG: fmadds [[REG5:[0-9]+]], 11, 10, [[REG2]]
	;			;
	; CHECK-DAG: fmadds [[REG6:[0-9]+]], 3, 2, [[REG4]]			; CHECK-DAG: fmadds [[REG6:[0-9]+]], 3, 2, [[REG4]]
	; CHECK-DAG: fmadds [[REG7:[0-9]+]], 5, 4, [[REG5]]			; CHECK-DAG: fmadds [[REG7:[0-9]+]], 5, 4, [[REG5]]
	; CHECK: fadds 1, [[REG7]], [[REG6]]			; CHECK: fadds 1, [[REG7]], [[REG6]]
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%18 = fmul reassoc nsz float %2, %1			%18 = fmul contract reassoc nsz float %2, %1
	%19 = fadd reassoc nsz float %18, %0			%19 = fadd contract reassoc nsz float %18, %0
	%20 = fmul reassoc nsz float %4, %3			%20 = fmul contract reassoc nsz float %4, %3
	%21 = fadd reassoc nsz float %19, %20			%21 = fadd contract reassoc nsz float %19, %20
	%22 = fmul reassoc nsz float %6, %5			%22 = fmul contract reassoc nsz float %6, %5
	%23 = fadd reassoc nsz float %21, %22			%23 = fadd contract reassoc nsz float %21, %22
	%24 = fmul reassoc nsz float %8, %7			%24 = fmul contract reassoc nsz float %8, %7
	%25 = fadd reassoc nsz float %23, %24			%25 = fadd contract reassoc nsz float %23, %24
	%26 = fmul reassoc nsz float %10, %9			%26 = fmul contract reassoc nsz float %10, %9
	%27 = fadd reassoc nsz float %25, %26			%27 = fadd contract reassoc nsz float %25, %26
	%28 = fmul reassoc nsz float %12, %11			%28 = fmul contract reassoc nsz float %12, %11
	%29 = fadd reassoc nsz float %27, %28			%29 = fadd contract reassoc nsz float %27, %28
	%30 = fmul reassoc nsz float %14, %13			%30 = fmul contract reassoc nsz float %14, %13
	%31 = fadd reassoc nsz float %29, %30			%31 = fadd contract reassoc nsz float %29, %30
	%32 = fmul reassoc nsz float %16, %15			%32 = fmul contract reassoc nsz float %16, %15
	%33 = fadd reassoc nsz float %31, %32			%33 = fadd contract reassoc nsz float %31, %32
	ret float %33			ret float %33
	}			}

llvm/test/CodeGen/PowerPC/recipest.ll

	[61 lines not shown]
	; CHECK-P9-NEXT: xsmuldp 0, 0, 3			; CHECK-P9-NEXT: xsmuldp 0, 0, 3
	; CHECK-P9-NEXT: xsmuldp 0, 0, 5			; CHECK-P9-NEXT: xsmuldp 0, 0, 5
	; CHECK-P9-NEXT: xsmuldp 2, 2, 0			; CHECK-P9-NEXT: xsmuldp 2, 2, 0
	; CHECK-P9-NEXT: xsmaddadp 4, 2, 0			; CHECK-P9-NEXT: xsmaddadp 4, 2, 0
	; CHECK-P9-NEXT: xsmuldp 0, 0, 3			; CHECK-P9-NEXT: xsmuldp 0, 0, 3
	; CHECK-P9-NEXT: xsmuldp 0, 0, 4			; CHECK-P9-NEXT: xsmuldp 0, 0, 4
	; CHECK-P9-NEXT: xsmuldp 1, 1, 0			; CHECK-P9-NEXT: xsmuldp 1, 1, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%x = call arcp reassoc double @llvm.sqrt.f64(double %b)			%x = call arcp contract reassoc double @llvm.sqrt.f64(double %b)
	%r = fdiv arcp reassoc double %a, %x			%r = fdiv arcp contract reassoc double %a, %x
	ret double %r			ret double %r
	}			}

	define double @foo_safe(double %a, double %b) nounwind {			define double @foo_safe(double %a, double %b) nounwind {
	; CHECK-P7-LABEL: foo_safe:			; CHECK-P7-LABEL: foo_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrt 0, 2			; CHECK-P7-NEXT: fsqrt 0, 2
	; CHECK-P7-NEXT: fdiv 1, 1, 0			; CHECK-P7-NEXT: fdiv 1, 1, 0
	[75 lines not shown]
	; CHECK-P9-NEXT: addis 3, 2, .LCPI3_1@toc@ha			; CHECK-P9-NEXT: addis 3, 2, .LCPI3_1@toc@ha
	; CHECK-P9-NEXT: xsmulsp 2, 2, 0			; CHECK-P9-NEXT: xsmulsp 2, 2, 0
	; CHECK-P9-NEXT: xsmaddasp 3, 2, 0			; CHECK-P9-NEXT: xsmaddasp 3, 2, 0
	; CHECK-P9-NEXT: lfs 2, .LCPI3_1@toc@l(3)			; CHECK-P9-NEXT: lfs 2, .LCPI3_1@toc@l(3)
	; CHECK-P9-NEXT: xsmulsp 0, 0, 2			; CHECK-P9-NEXT: xsmulsp 0, 0, 2
	; CHECK-P9-NEXT: xsmulsp 0, 0, 3			; CHECK-P9-NEXT: xsmulsp 0, 0, 3
	; CHECK-P9-NEXT: xsmuldp 1, 1, 0			; CHECK-P9-NEXT: xsmuldp 1, 1, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%x = call reassoc arcp float @llvm.sqrt.f32(float %b)			%x = call contract reassoc arcp float @llvm.sqrt.f32(float %b)
	%y = fpext float %x to double			%y = fpext float %x to double
	%r = fdiv reassoc arcp double %a, %y			%r = fdiv contract reassoc arcp double %a, %y
	ret double %r			ret double %r
	}			}

	define double @foof_safe(double %a, float %b) nounwind {			define double @foof_safe(double %a, float %b) nounwind {
	; CHECK-P7-LABEL: foof_safe:			; CHECK-P7-LABEL: foof_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrts 0, 2			; CHECK-P7-NEXT: fsqrts 0, 2
	; CHECK-P7-NEXT: fdiv 1, 1, 0			; CHECK-P7-NEXT: fdiv 1, 1, 0
	[70 lines not shown]
	; CHECK-P9-NEXT: xsmuldp 0, 0, 5			; CHECK-P9-NEXT: xsmuldp 0, 0, 5
	; CHECK-P9-NEXT: xsmuldp 2, 2, 0			; CHECK-P9-NEXT: xsmuldp 2, 2, 0
	; CHECK-P9-NEXT: xsmaddadp 4, 2, 0			; CHECK-P9-NEXT: xsmaddadp 4, 2, 0
	; CHECK-P9-NEXT: xsmuldp 0, 0, 3			; CHECK-P9-NEXT: xsmuldp 0, 0, 3
	; CHECK-P9-NEXT: xsmuldp 0, 0, 4			; CHECK-P9-NEXT: xsmuldp 0, 0, 4
	; CHECK-P9-NEXT: xsrsp 0, 0			; CHECK-P9-NEXT: xsrsp 0, 0
	; CHECK-P9-NEXT: xsmulsp 1, 1, 0			; CHECK-P9-NEXT: xsmulsp 1, 1, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%x = call reassoc arcp double @llvm.sqrt.f64(double %b)			%x = call contract reassoc arcp double @llvm.sqrt.f64(double %b)
	%y = fptrunc double %x to float			%y = fptrunc double %x to float
	%r = fdiv reassoc arcp float %a, %y			%r = fdiv contract reassoc arcp float %a, %y
	ret float %r			ret float %r
	}			}

	define float @food_safe(float %a, double %b) nounwind {			define float @food_safe(float %a, double %b) nounwind {
	; CHECK-P7-LABEL: food_safe:			; CHECK-P7-LABEL: food_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrt 0, 2			; CHECK-P7-NEXT: fsqrt 0, 2
	; CHECK-P7-NEXT: frsp 0, 0			; CHECK-P7-NEXT: frsp 0, 0
	[56 lines not shown]
	; CHECK-P9-NEXT: addis 3, 2, .LCPI7_1@toc@ha			; CHECK-P9-NEXT: addis 3, 2, .LCPI7_1@toc@ha
	; CHECK-P9-NEXT: xsmulsp 2, 2, 0			; CHECK-P9-NEXT: xsmulsp 2, 2, 0
	; CHECK-P9-NEXT: xsmaddasp 3, 2, 0			; CHECK-P9-NEXT: xsmaddasp 3, 2, 0
	; CHECK-P9-NEXT: lfs 2, .LCPI7_1@toc@l(3)			; CHECK-P9-NEXT: lfs 2, .LCPI7_1@toc@l(3)
	; CHECK-P9-NEXT: xsmulsp 0, 0, 2			; CHECK-P9-NEXT: xsmulsp 0, 0, 2
	; CHECK-P9-NEXT: xsmulsp 0, 0, 3			; CHECK-P9-NEXT: xsmulsp 0, 0, 3
	; CHECK-P9-NEXT: xsmulsp 1, 1, 0			; CHECK-P9-NEXT: xsmulsp 1, 1, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%x = call reassoc arcp float @llvm.sqrt.f32(float %b)			%x = call contract reassoc arcp float @llvm.sqrt.f32(float %b)
	%r = fdiv reassoc arcp float %a, %x			%r = fdiv contract reassoc arcp float %a, %x
	ret float %r			ret float %r
	}			}

	define float @goo_safe(float %a, float %b) nounwind {			define float @goo_safe(float %a, float %b) nounwind {
	; CHECK-P7-LABEL: goo_safe:			; CHECK-P7-LABEL: goo_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrts 0, 2			; CHECK-P7-NEXT: fsqrts 0, 2
	; CHECK-P7-NEXT: fdivs 1, 1, 0			; CHECK-P7-NEXT: fdivs 1, 1, 0
	[87 lines not shown]
	; CHECK-P9-NEXT: xsmulsp 0, 0, 1			; CHECK-P9-NEXT: xsmulsp 0, 0, 1
	; CHECK-P9-NEXT: xsresp 1, 2			; CHECK-P9-NEXT: xsresp 1, 2
	; CHECK-P9-NEXT: xsmulsp 0, 0, 4			; CHECK-P9-NEXT: xsmulsp 0, 0, 4
	; CHECK-P9-NEXT: xsmulsp 4, 0, 1			; CHECK-P9-NEXT: xsmulsp 4, 0, 1
	; CHECK-P9-NEXT: xsnmsubasp 0, 2, 4			; CHECK-P9-NEXT: xsnmsubasp 0, 2, 4
	; CHECK-P9-NEXT: xsmaddasp 4, 1, 0			; CHECK-P9-NEXT: xsmaddasp 4, 1, 0
	; CHECK-P9-NEXT: xsmulsp 1, 3, 4			; CHECK-P9-NEXT: xsmulsp 1, 3, 4
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%x = call reassoc arcp nsz float @llvm.sqrt.f32(float %a)			%x = call contract reassoc arcp nsz float @llvm.sqrt.f32(float %a)
	%y = fmul reassoc nsz float %x, %b			%y = fmul contract reassoc nsz float %x, %b
	%z = fdiv reassoc arcp nsz ninf float %c, %y			%z = fdiv contract reassoc arcp nsz ninf float %c, %y
	ret float %z			ret float %z
	}			}

	define float @rsqrt_fmul_safe(float %a, float %b, float %c) {			define float @rsqrt_fmul_safe(float %a, float %b, float %c) {
	; CHECK-P7-LABEL: rsqrt_fmul_safe:			; CHECK-P7-LABEL: rsqrt_fmul_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrts 0, 1			; CHECK-P7-NEXT: fsqrts 0, 1
	; CHECK-P7-NEXT: fmuls 0, 0, 2			; CHECK-P7-NEXT: fmuls 0, 0, 2
	[64 lines not shown]
	; CHECK-P9-NEXT: addi 3, 3, .LCPI12_1@toc@l			; CHECK-P9-NEXT: addi 3, 3, .LCPI12_1@toc@l
	; CHECK-P9-NEXT: xvmulsp 1, 35, 0			; CHECK-P9-NEXT: xvmulsp 1, 35, 0
	; CHECK-P9-NEXT: xvmaddasp 2, 1, 0			; CHECK-P9-NEXT: xvmaddasp 2, 1, 0
	; CHECK-P9-NEXT: lxvx 1, 0, 3			; CHECK-P9-NEXT: lxvx 1, 0, 3
	; CHECK-P9-NEXT: xvmulsp 0, 0, 1			; CHECK-P9-NEXT: xvmulsp 0, 0, 1
	; CHECK-P9-NEXT: xvmulsp 0, 0, 2			; CHECK-P9-NEXT: xvmulsp 0, 0, 2
	; CHECK-P9-NEXT: xvmulsp 34, 34, 0			; CHECK-P9-NEXT: xvmulsp 34, 34, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%x = call reassoc arcp <4 x float> @llvm.sqrt.v4f32(<4 x float> %b)			%x = call contract reassoc arcp <4 x float> @llvm.sqrt.v4f32(<4 x float> %b)
	%r = fdiv reassoc arcp <4 x float> %a, %x			%r = fdiv contract reassoc arcp <4 x float> %a, %x
	ret <4 x float> %r			ret <4 x float> %r
	}			}

	define <4 x float> @hoo_safe(<4 x float> %a, <4 x float> %b) nounwind {			define <4 x float> @hoo_safe(<4 x float> %a, <4 x float> %b) nounwind {
	; CHECK-P7-LABEL: hoo_safe:			; CHECK-P7-LABEL: hoo_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: addi 3, 1, -32			; CHECK-P7-NEXT: addi 3, 1, -32
	; CHECK-P7-NEXT: stvx 3, 0, 3			; CHECK-P7-NEXT: stvx 3, 0, 3
	[72 lines not shown]
	; CHECK-P9-NEXT: lfs 0, .LCPI14_0@toc@l(3)			; CHECK-P9-NEXT: lfs 0, .LCPI14_0@toc@l(3)
	; CHECK-P9-NEXT: xsmaddadp 0, 2, 3			; CHECK-P9-NEXT: xsmaddadp 0, 2, 3
	; CHECK-P9-NEXT: xsnmsubadp 3, 3, 0			; CHECK-P9-NEXT: xsnmsubadp 3, 3, 0
	; CHECK-P9-NEXT: xsmuldp 0, 1, 3			; CHECK-P9-NEXT: xsmuldp 0, 1, 3
	; CHECK-P9-NEXT: xsnmsubadp 1, 2, 0			; CHECK-P9-NEXT: xsnmsubadp 1, 2, 0
	; CHECK-P9-NEXT: xsmaddadp 0, 3, 1			; CHECK-P9-NEXT: xsmaddadp 0, 3, 1
	; CHECK-P9-NEXT: fmr 1, 0			; CHECK-P9-NEXT: fmr 1, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = fdiv reassoc arcp nsz ninf double %a, %b			%r = fdiv contract reassoc arcp nsz ninf double %a, %b
	ret double %r			ret double %r
	}			}

	define double @foo2_safe(double %a, double %b) nounwind {			define double @foo2_safe(double %a, double %b) nounwind {
	; CHECK-P7-LABEL: foo2_safe:			; CHECK-P7-LABEL: foo2_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fdiv 1, 1, 2			; CHECK-P7-NEXT: fdiv 1, 1, 2
	; CHECK-P7-NEXT: blr			; CHECK-P7-NEXT: blr
	[32 lines not shown]
	; CHECK-P9-LABEL: goo2_fmf:			; CHECK-P9-LABEL: goo2_fmf:
	; CHECK-P9: # %bb.0:			; CHECK-P9: # %bb.0:
	; CHECK-P9-NEXT: xsresp 3, 2			; CHECK-P9-NEXT: xsresp 3, 2
	; CHECK-P9-NEXT: xsmulsp 0, 1, 3			; CHECK-P9-NEXT: xsmulsp 0, 1, 3
	; CHECK-P9-NEXT: xsnmsubasp 1, 2, 0			; CHECK-P9-NEXT: xsnmsubasp 1, 2, 0
	; CHECK-P9-NEXT: xsmaddasp 0, 3, 1			; CHECK-P9-NEXT: xsmaddasp 0, 3, 1
	; CHECK-P9-NEXT: fmr 1, 0			; CHECK-P9-NEXT: fmr 1, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = fdiv reassoc arcp nsz ninf float %a, %b			%r = fdiv contract reassoc arcp nsz ninf float %a, %b
	ret float %r			ret float %r
	}			}

	define float @goo2_safe(float %a, float %b) nounwind {			define float @goo2_safe(float %a, float %b) nounwind {
	; CHECK-P7-LABEL: goo2_safe:			; CHECK-P7-LABEL: goo2_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fdivs 1, 1, 2			; CHECK-P7-NEXT: fdivs 1, 1, 2
	; CHECK-P7-NEXT: blr			; CHECK-P7-NEXT: blr
	[34 lines not shown]
	; CHECK-P9-LABEL: hoo2_fmf:			; CHECK-P9-LABEL: hoo2_fmf:
	; CHECK-P9: # %bb.0:			; CHECK-P9: # %bb.0:
	; CHECK-P9-NEXT: xvresp 1, 35			; CHECK-P9-NEXT: xvresp 1, 35
	; CHECK-P9-NEXT: xvmulsp 0, 34, 1			; CHECK-P9-NEXT: xvmulsp 0, 34, 1
	; CHECK-P9-NEXT: xvnmsubasp 34, 35, 0			; CHECK-P9-NEXT: xvnmsubasp 34, 35, 0
	; CHECK-P9-NEXT: xvmaddasp 0, 1, 34			; CHECK-P9-NEXT: xvmaddasp 0, 1, 34
	; CHECK-P9-NEXT: xxlor 34, 0, 0			; CHECK-P9-NEXT: xxlor 34, 0, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = fdiv reassoc arcp nsz ninf <4 x float> %a, %b			%r = fdiv contract reassoc arcp nsz ninf <4 x float> %a, %b
	ret <4 x float> %r			ret <4 x float> %r
	}			}

	define <4 x float> @hoo2_safe(<4 x float> %a, <4 x float> %b) nounwind {			define <4 x float> @hoo2_safe(<4 x float> %a, <4 x float> %b) nounwind {
	; CHECK-P7-LABEL: hoo2_safe:			; CHECK-P7-LABEL: hoo2_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: addi 3, 1, -32			; CHECK-P7-NEXT: addi 3, 1, -32
	; CHECK-P7-NEXT: addi 4, 1, -48			; CHECK-P7-NEXT: addi 4, 1, -48
	[98 lines not shown]
	; CHECK-P9-NEXT: xsmuldp 1, 1, 0			; CHECK-P9-NEXT: xsmuldp 1, 1, 0
	; CHECK-P9-NEXT: xsmaddadp 3, 1, 0			; CHECK-P9-NEXT: xsmaddadp 3, 1, 0
	; CHECK-P9-NEXT: xsmuldp 0, 1, 2			; CHECK-P9-NEXT: xsmuldp 0, 1, 2
	; CHECK-P9-NEXT: xsmuldp 1, 0, 3			; CHECK-P9-NEXT: xsmuldp 1, 0, 3
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	; CHECK-P9-NEXT: .LBB20_2:			; CHECK-P9-NEXT: .LBB20_2:
	; CHECK-P9-NEXT: xssqrtdp 1, 1			; CHECK-P9-NEXT: xssqrtdp 1, 1
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = call reassoc ninf afn double @llvm.sqrt.f64(double %a)			%r = call contract reassoc ninf afn double @llvm.sqrt.f64(double %a)
	ret double %r			ret double %r
	}			}

	define double @foo3_fmf_crbits_off(double %a) #2 {			define double @foo3_fmf_crbits_off(double %a) #2 {
	; CHECK-P7-LABEL: foo3_fmf_crbits_off:			; CHECK-P7-LABEL: foo3_fmf_crbits_off:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fabs 0, 1			; CHECK-P7-NEXT: fabs 0, 1
	; CHECK-P7-NEXT: addis 3, 2, .LCPI21_2@toc@ha			; CHECK-P7-NEXT: addis 3, 2, .LCPI21_2@toc@ha
	[67 lines not shown]
	; CHECK-P9-NEXT: xsmuldp 1, 1, 0			; CHECK-P9-NEXT: xsmuldp 1, 1, 0
	; CHECK-P9-NEXT: xsmaddadp 3, 1, 0			; CHECK-P9-NEXT: xsmaddadp 3, 1, 0
	; CHECK-P9-NEXT: xsmuldp 0, 1, 2			; CHECK-P9-NEXT: xsmuldp 0, 1, 2
	; CHECK-P9-NEXT: xsmuldp 1, 0, 3			; CHECK-P9-NEXT: xsmuldp 1, 0, 3
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	; CHECK-P9-NEXT: .LBB21_2:			; CHECK-P9-NEXT: .LBB21_2:
	; CHECK-P9-NEXT: xssqrtdp 1, 1			; CHECK-P9-NEXT: xssqrtdp 1, 1
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = call reassoc ninf afn double @llvm.sqrt.f64(double %a)			%r = call contract reassoc ninf afn double @llvm.sqrt.f64(double %a)
	ret double %r			ret double %r
	}			}

	define double @foo3_safe(double %a) nounwind {			define double @foo3_safe(double %a) nounwind {
	; CHECK-P7-LABEL: foo3_safe:			; CHECK-P7-LABEL: foo3_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrt 1, 1			; CHECK-P7-NEXT: fsqrt 1, 1
	; CHECK-P7-NEXT: blr			; CHECK-P7-NEXT: blr
	[73 lines not shown]
	; CHECK-P9-NEXT: xsmulsp 1, 1, 0			; CHECK-P9-NEXT: xsmulsp 1, 1, 0
	; CHECK-P9-NEXT: xsmaddasp 2, 1, 0			; CHECK-P9-NEXT: xsmaddasp 2, 1, 0
	; CHECK-P9-NEXT: lfs 0, .LCPI23_1@toc@l(3)			; CHECK-P9-NEXT: lfs 0, .LCPI23_1@toc@l(3)
	; CHECK-P9-NEXT: xsmulsp 0, 1, 0			; CHECK-P9-NEXT: xsmulsp 0, 1, 0
	; CHECK-P9-NEXT: xsmulsp 0, 0, 2			; CHECK-P9-NEXT: xsmulsp 0, 0, 2
	; CHECK-P9-NEXT: .LBB23_2:			; CHECK-P9-NEXT: .LBB23_2:
	; CHECK-P9-NEXT: fmr 1, 0			; CHECK-P9-NEXT: fmr 1, 0
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = call reassoc ninf afn float @llvm.sqrt.f32(float %a)			%r = call contract reassoc ninf afn float @llvm.sqrt.f32(float %a)
	ret float %r			ret float %r
	}			}

	define float @goo3_safe(float %a) nounwind {			define float @goo3_safe(float %a) nounwind {
	; CHECK-P7-LABEL: goo3_safe:			; CHECK-P7-LABEL: goo3_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrts 1, 1			; CHECK-P7-NEXT: fsqrts 1, 1
	; CHECK-P7-NEXT: blr			; CHECK-P7-NEXT: blr
	[68 lines not shown]
	; CHECK-P9-NEXT: xvmaddasp 2, 1, 0			; CHECK-P9-NEXT: xvmaddasp 2, 1, 0
	; CHECK-P9-NEXT: lxvx 0, 0, 3			; CHECK-P9-NEXT: lxvx 0, 0, 3
	; CHECK-P9-NEXT: xvmulsp 0, 1, 0			; CHECK-P9-NEXT: xvmulsp 0, 1, 0
	; CHECK-P9-NEXT: xvmulsp 34, 0, 2			; CHECK-P9-NEXT: xvmulsp 34, 0, 2
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	; CHECK-P9-NEXT: .LBB25_2:			; CHECK-P9-NEXT: .LBB25_2:
	; CHECK-P9-NEXT: xvsqrtsp 34, 34			; CHECK-P9-NEXT: xvsqrtsp 34, 34
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = call reassoc ninf afn <4 x float> @llvm.sqrt.v4f32(<4 x float> %a)			%r = call contract reassoc ninf afn <4 x float> @llvm.sqrt.v4f32(<4 x float> %a)
	ret <4 x float> %r			ret <4 x float> %r
	}			}

	define <4 x float> @hoo3_safe(<4 x float> %a) nounwind {			define <4 x float> @hoo3_safe(<4 x float> %a) nounwind {
	; CHECK-P7-LABEL: hoo3_safe:			; CHECK-P7-LABEL: hoo3_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: addi 3, 1, -32			; CHECK-P7-NEXT: addi 3, 1, -32
	; CHECK-P7-NEXT: stvx 2, 0, 3			; CHECK-P7-NEXT: stvx 2, 0, 3
	[114 lines not shown]
	; CHECK-P9-NEXT: xvmuldp 3, 34, 0			; CHECK-P9-NEXT: xvmuldp 3, 34, 0
	; CHECK-P9-NEXT: xvmaddadp 2, 3, 0			; CHECK-P9-NEXT: xvmaddadp 2, 3, 0
	; CHECK-P9-NEXT: xvmuldp 0, 3, 1			; CHECK-P9-NEXT: xvmuldp 0, 3, 1
	; CHECK-P9-NEXT: xvmuldp 34, 0, 2			; CHECK-P9-NEXT: xvmuldp 34, 0, 2
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	; CHECK-P9-NEXT: .LBB27_2:			; CHECK-P9-NEXT: .LBB27_2:
	; CHECK-P9-NEXT: xvsqrtdp 34, 34			; CHECK-P9-NEXT: xvsqrtdp 34, 34
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = call reassoc ninf afn <2 x double> @llvm.sqrt.v2f64(<2 x double> %a)			%r = call contract reassoc ninf afn <2 x double> @llvm.sqrt.v2f64(<2 x double> %a)
	ret <2 x double> %r			ret <2 x double> %r
	}			}

	define <2 x double> @hoo4_safe(<2 x double> %a) #1 {			define <2 x double> @hoo4_safe(<2 x double> %a) #1 {
	; CHECK-P7-LABEL: hoo4_safe:			; CHECK-P7-LABEL: hoo4_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: fsqrt 1, 1			; CHECK-P7-NEXT: fsqrt 1, 1
	; CHECK-P7-NEXT: fsqrt 2, 2			; CHECK-P7-NEXT: fsqrt 2, 2
	[36 lines not shown]
	; CHECK-P8-NEXT: ld 0, 16(1)			; CHECK-P8-NEXT: ld 0, 16(1)
	; CHECK-P8-NEXT: mtlr 0			; CHECK-P8-NEXT: mtlr 0
	; CHECK-P8-NEXT: blr			; CHECK-P8-NEXT: blr
	;			;
	; CHECK-P9-LABEL: hoo5_fmf:			; CHECK-P9-LABEL: hoo5_fmf:
	; CHECK-P9: # %bb.0:			; CHECK-P9: # %bb.0:
	; CHECK-P9-NEXT: xssqrtqp 2, 2			; CHECK-P9-NEXT: xssqrtqp 2, 2
	; CHECK-P9-NEXT: blr			; CHECK-P9-NEXT: blr
	%r = call reassoc ninf afn fp128 @llvm.sqrt.f128(fp128 %a)			%r = call contract reassoc ninf afn fp128 @llvm.sqrt.f128(fp128 %a)
	ret fp128 %r			ret fp128 %r
	}			}

	define fp128 @hoo5_safe(fp128 %a) #1 {			define fp128 @hoo5_safe(fp128 %a) #1 {
	; CHECK-P7-LABEL: hoo5_safe:			; CHECK-P7-LABEL: hoo5_safe:
	; CHECK-P7: # %bb.0:			; CHECK-P7: # %bb.0:
	; CHECK-P7-NEXT: mflr 0			; CHECK-P7-NEXT: mflr 0
	; CHECK-P7-NEXT: std 0, 16(1)			; CHECK-P7-NEXT: std 0, 16(1)
	[31 lines not shown]

llvm/test/CodeGen/PowerPC/register-pressure-reduction.ll

	[31 lines not shown]
	; CHECK-FMA-NEXT: addis r3, r2, .LCPI0_0@toc@ha			; CHECK-FMA-NEXT: addis r3, r2, .LCPI0_0@toc@ha
	; CHECK-FMA-NEXT: xsmulsp f1, f2, f1			; CHECK-FMA-NEXT: xsmulsp f1, f2, f1
	; CHECK-FMA-NEXT: lfs f0, .LCPI0_0@toc@l(r3)			; CHECK-FMA-NEXT: lfs f0, .LCPI0_0@toc@l(r3)
	; CHECK-FMA-NEXT: addis r3, r2, .LCPI0_1@toc@ha			; CHECK-FMA-NEXT: addis r3, r2, .LCPI0_1@toc@ha
	; CHECK-FMA-NEXT: lfs f2, .LCPI0_1@toc@l(r3)			; CHECK-FMA-NEXT: lfs f2, .LCPI0_1@toc@l(r3)
	; CHECK-FMA-NEXT: xsmaddasp f1, f4, f2			; CHECK-FMA-NEXT: xsmaddasp f1, f4, f2
	; CHECK-FMA-NEXT: xsmaddasp f1, f3, f0			; CHECK-FMA-NEXT: xsmaddasp f1, f3, f0
	; CHECK-FMA-NEXT: blr			; CHECK-FMA-NEXT: blr
	%5 = fmul reassoc nsz float %1, %0			%5 = fmul contract reassoc nsz float %1, %0
	%6 = fsub reassoc nsz float %2, %3			%6 = fsub contract reassoc nsz float %2, %3
	%7 = fmul reassoc nsz float %6, 0x3DB2533FE0000000			%7 = fmul contract reassoc nsz float %6, 0x3DB2533FE0000000
	%8 = fadd reassoc nsz float %7, %5			%8 = fadd contract reassoc nsz float %7, %5
	ret float %8			ret float %8
	}			}

	define double @foo_double(double %0, double %1, double %2, double %3) {			define double @foo_double(double %0, double %1, double %2, double %3) {
	; CHECK-LABEL: foo_double:			; CHECK-LABEL: foo_double:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: xsmuldp f1, f2, f1			; CHECK-NEXT: xsmuldp f1, f2, f1
	; CHECK-NEXT: xssubdp f0, f3, f4			; CHECK-NEXT: xssubdp f0, f3, f4
	[16 lines not shown]
	; CHECK-FMA-NEXT: addis r3, r2, .LCPI1_0@toc@ha			; CHECK-FMA-NEXT: addis r3, r2, .LCPI1_0@toc@ha
	; CHECK-FMA-NEXT: xsmuldp f1, f2, f1			; CHECK-FMA-NEXT: xsmuldp f1, f2, f1
	; CHECK-FMA-NEXT: lfd f0, .LCPI1_0@toc@l(r3)			; CHECK-FMA-NEXT: lfd f0, .LCPI1_0@toc@l(r3)
	; CHECK-FMA-NEXT: addis r3, r2, .LCPI1_1@toc@ha			; CHECK-FMA-NEXT: addis r3, r2, .LCPI1_1@toc@ha
	; CHECK-FMA-NEXT: lfd f2, .LCPI1_1@toc@l(r3)			; CHECK-FMA-NEXT: lfd f2, .LCPI1_1@toc@l(r3)
	; CHECK-FMA-NEXT: xsmaddadp f1, f4, f2			; CHECK-FMA-NEXT: xsmaddadp f1, f4, f2
	; CHECK-FMA-NEXT: xsmaddadp f1, f3, f0			; CHECK-FMA-NEXT: xsmaddadp f1, f3, f0
	; CHECK-FMA-NEXT: blr			; CHECK-FMA-NEXT: blr
	%5 = fmul reassoc nsz double %1, %0			%5 = fmul contract reassoc nsz double %1, %0
	%6 = fsub reassoc nsz double %2, %3			%6 = fsub contract reassoc nsz double %2, %3
	%7 = fmul reassoc nsz double %6, 0x3DB2533FE68CADDE			%7 = fmul contract reassoc nsz double %6, 0x3DB2533FE68CADDE
	%8 = fadd reassoc nsz double %7, %5			%8 = fadd contract reassoc nsz double %7, %5
	ret double %8			ret double %8
	}			}

	define float @foo_float_reuse_const(float %0, float %1, float %2, float %3) {			define float @foo_float_reuse_const(float %0, float %1, float %2, float %3) {
	; CHECK-LABEL: foo_float_reuse_const:			; CHECK-LABEL: foo_float_reuse_const:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: addis r3, r2, .LCPI2_0@toc@ha			; CHECK-NEXT: addis r3, r2, .LCPI2_0@toc@ha
	; CHECK-NEXT: xsmulsp f1, f2, f1			; CHECK-NEXT: xsmulsp f1, f2, f1
	(32 lines not shown)
	; CHECK-FMA-NEXT: lfs f5, .LCPI2_1@toc@l(r3)			; CHECK-FMA-NEXT: lfs f5, .LCPI2_1@toc@l(r3)
	; CHECK-FMA-NEXT: addis r3, r2, .LC0@toc@ha			; CHECK-FMA-NEXT: addis r3, r2, .LC0@toc@ha
	; CHECK-FMA-NEXT: ld r3, .LC0@toc@l(r3)			; CHECK-FMA-NEXT: ld r3, .LC0@toc@l(r3)
	; CHECK-FMA-NEXT: xsmaddasp f1, f4, f5			; CHECK-FMA-NEXT: xsmaddasp f1, f4, f5
	; CHECK-FMA-NEXT: xsmaddasp f1, f3, f0			; CHECK-FMA-NEXT: xsmaddasp f1, f3, f0
	; CHECK-FMA-NEXT: xsmulsp f0, f2, f5			; CHECK-FMA-NEXT: xsmulsp f0, f2, f5
	; CHECK-FMA-NEXT: stfs f0, 0(r3)			; CHECK-FMA-NEXT: stfs f0, 0(r3)
	; CHECK-FMA-NEXT: blr			; CHECK-FMA-NEXT: blr
	%5 = fmul reassoc nsz float %1, %0			%5 = fmul contract reassoc nsz float %1, %0
	%6 = fsub reassoc nsz float %2, %3			%6 = fsub contract reassoc nsz float %2, %3
	%7 = fmul reassoc nsz float %6, 0x3DB2533FE0000000			%7 = fmul contract reassoc nsz float %6, 0x3DB2533FE0000000
	%8 = fadd reassoc nsz float %7, %5			%8 = fadd contract reassoc nsz float %7, %5
	%9 = fmul reassoc nsz float %1, 0xBDB2533FE0000000			%9 = fmul contract reassoc nsz float %1, 0xBDB2533FE0000000
	store float %9, float* @global_val, align 4			store float %9, float* @global_val, align 4
	ret float %8			ret float %8
	}			}

llvm/test/CodeGen/PowerPC/repeated-fp-divisors.ll

	(37 lines not shown)
	; CHECK-NEXT: xvresp 1, 0			; CHECK-NEXT: xvresp 1, 0
	; CHECK-NEXT: xvnmsubasp 35, 0, 1			; CHECK-NEXT: xvnmsubasp 35, 0, 1
	; CHECK-NEXT: xvmulsp 0, 34, 36			; CHECK-NEXT: xvmulsp 0, 34, 36
	; CHECK-NEXT: xvmaddasp 1, 1, 35			; CHECK-NEXT: xvmaddasp 1, 1, 35
	; CHECK-NEXT: xvmulsp 34, 0, 1			; CHECK-NEXT: xvmulsp 34, 0, 1
	; CHECK-NEXT: blr			; CHECK-NEXT: blr
	%ins = insertelement <4 x float> undef, float %a, i32 0			%ins = insertelement <4 x float> undef, float %a, i32 0
	%splat = shufflevector <4 x float> %ins, <4 x float> undef, <4 x i32> zeroinitializer			%splat = shufflevector <4 x float> %ins, <4 x float> undef, <4 x i32> zeroinitializer
	%t1 = fmul reassoc <4 x float> %b, <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 0x3FF028F5C0000000>			%t1 = fmul contract reassoc <4 x float> %b, <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 0x3FF028F5C0000000>
	%mul = fdiv reassoc arcp nsz ninf <4 x float> %t1, %splat			%mul = fdiv contract reassoc arcp nsz ninf <4 x float> %t1, %splat
	ret <4 x float> %mul			ret <4 x float> %mul
	}			}

llvm/test/CodeGen/X86/machine-combiner.ll

	(234 lines not shown)
	; AVX1-NEXT: vaddps %xmm1, %xmm0, %xmm0			; AVX1-NEXT: vaddps %xmm1, %xmm0, %xmm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX512-LABEL: reassociate_adds_v4f32:			; AVX512-LABEL: reassociate_adds_v4f32:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vfmadd213ps {{.*#+}} xmm0 = (xmm1 * xmm0) + xmm2			; AVX512-NEXT: vfmadd213ps {{.*#+}} xmm0 = (xmm1 * xmm0) + xmm2
	; AVX512-NEXT: vaddps %xmm0, %xmm3, %xmm0			; AVX512-NEXT: vaddps %xmm0, %xmm3, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%t0 = fmul reassoc nsz <4 x float> %x0, %x1			%t0 = fmul contract reassoc nsz <4 x float> %x0, %x1
	%t1 = fadd reassoc nsz <4 x float> %x2, %t0			%t1 = fadd contract reassoc nsz <4 x float> %x2, %t0
	%t2 = fadd reassoc nsz <4 x float> %x3, %t1			%t2 = fadd reassoc nsz <4 x float> %x3, %t1
	ret <4 x float> %t2			ret <4 x float> %t2
	}			}

	; Verify that SSE and AVX 128-bit vector double-precision adds are reassociated.			; Verify that SSE and AVX 128-bit vector double-precision adds are reassociated.

	define <2 x double> @reassociate_adds_v2f64(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, <2 x double> %x3) {			define <2 x double> @reassociate_adds_v2f64(<2 x double> %x0, <2 x double> %x1, <2 x double> %x2, <2 x double> %x3) {
	; SSE-LABEL: reassociate_adds_v2f64:			; SSE-LABEL: reassociate_adds_v2f64:
	(10 lines not shown)
	; AVX1-NEXT: vaddpd %xmm1, %xmm0, %xmm0			; AVX1-NEXT: vaddpd %xmm1, %xmm0, %xmm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX512-LABEL: reassociate_adds_v2f64:			; AVX512-LABEL: reassociate_adds_v2f64:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vfmadd213pd {{.*#+}} xmm0 = (xmm1 * xmm0) + xmm2			; AVX512-NEXT: vfmadd213pd {{.*#+}} xmm0 = (xmm1 * xmm0) + xmm2
	; AVX512-NEXT: vaddpd %xmm0, %xmm3, %xmm0			; AVX512-NEXT: vaddpd %xmm0, %xmm3, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%t0 = fmul reassoc nsz <2 x double> %x0, %x1			%t0 = fmul contract reassoc nsz <2 x double> %x0, %x1
	%t1 = fadd reassoc nsz <2 x double> %x2, %t0			%t1 = fadd contract reassoc nsz <2 x double> %x2, %t0
	%t2 = fadd reassoc nsz <2 x double> %x3, %t1			%t2 = fadd reassoc nsz <2 x double> %x3, %t1
	ret <2 x double> %t2			ret <2 x double> %t2
	}			}

	; Verify that SSE and AVX 128-bit vector single-precision multiplies are reassociated.			; Verify that SSE and AVX 128-bit vector single-precision multiplies are reassociated.

	define <4 x float> @reassociate_muls_v4f32(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, <4 x float> %x3) {			define <4 x float> @reassociate_muls_v4f32(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, <4 x float> %x3) {
	; SSE-LABEL: reassociate_muls_v4f32:			; SSE-LABEL: reassociate_muls_v4f32:
	(57 lines not shown)
	; AVX1-NEXT: vaddps %ymm1, %ymm0, %ymm0			; AVX1-NEXT: vaddps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX512-LABEL: reassociate_adds_v8f32:			; AVX512-LABEL: reassociate_adds_v8f32:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vfmadd213ps {{.*#+}} ymm0 = (ymm1 * ymm0) + ymm2			; AVX512-NEXT: vfmadd213ps {{.*#+}} ymm0 = (ymm1 * ymm0) + ymm2
	; AVX512-NEXT: vaddps %ymm0, %ymm3, %ymm0			; AVX512-NEXT: vaddps %ymm0, %ymm3, %ymm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%t0 = fmul reassoc nsz <8 x float> %x0, %x1			%t0 = fmul contract reassoc nsz <8 x float> %x0, %x1
	%t1 = fadd reassoc nsz <8 x float> %x2, %t0			%t1 = fadd contract reassoc nsz <8 x float> %x2, %t0
	%t2 = fadd reassoc nsz <8 x float> %x3, %t1			%t2 = fadd reassoc nsz <8 x float> %x3, %t1
	ret <8 x float> %t2			ret <8 x float> %t2
	}			}

	; Verify that AVX 256-bit vector double-precision adds are reassociated.			; Verify that AVX 256-bit vector double-precision adds are reassociated.

	define <4 x double> @reassociate_adds_v4f64(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, <4 x double> %x3) {			define <4 x double> @reassociate_adds_v4f64(<4 x double> %x0, <4 x double> %x1, <4 x double> %x2, <4 x double> %x3) {
	; SSE-LABEL: reassociate_adds_v4f64:			; SSE-LABEL: reassociate_adds_v4f64:
	(13 lines not shown)
	; AVX1-NEXT: vaddpd %ymm1, %ymm0, %ymm0			; AVX1-NEXT: vaddpd %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX512-LABEL: reassociate_adds_v4f64:			; AVX512-LABEL: reassociate_adds_v4f64:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vfmadd213pd {{.*#+}} ymm0 = (ymm1 * ymm0) + ymm2			; AVX512-NEXT: vfmadd213pd {{.*#+}} ymm0 = (ymm1 * ymm0) + ymm2
	; AVX512-NEXT: vaddpd %ymm0, %ymm3, %ymm0			; AVX512-NEXT: vaddpd %ymm0, %ymm3, %ymm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%t0 = fmul reassoc nsz <4 x double> %x0, %x1			%t0 = fmul contract reassoc nsz <4 x double> %x0, %x1
	%t1 = fadd reassoc nsz <4 x double> %x2, %t0			%t1 = fadd contract reassoc nsz <4 x double> %x2, %t0
	%t2 = fadd reassoc nsz <4 x double> %x3, %t1			%t2 = fadd reassoc nsz <4 x double> %x3, %t1
	ret <4 x double> %t2			ret <4 x double> %t2
	}			}

	; Verify that AVX 256-bit vector single-precision multiplies are reassociated.			; Verify that AVX 256-bit vector single-precision multiplies are reassociated.

	define <8 x float> @reassociate_muls_v8f32(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, <8 x float> %x3) {			define <8 x float> @reassociate_muls_v8f32(<8 x float> %x0, <8 x float> %x1, <8 x float> %x2, <8 x float> %x3) {
	; SSE-LABEL: reassociate_muls_v8f32:			; SSE-LABEL: reassociate_muls_v8f32:
	(72 lines not shown)
	; AVX1-NEXT: vaddps %ymm2, %ymm1, %ymm1			; AVX1-NEXT: vaddps %ymm2, %ymm1, %ymm1
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX512-LABEL: reassociate_adds_v16f32:			; AVX512-LABEL: reassociate_adds_v16f32:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vfmadd213ps {{.*#+}} zmm0 = (zmm1 * zmm0) + zmm2			; AVX512-NEXT: vfmadd213ps {{.*#+}} zmm0 = (zmm1 * zmm0) + zmm2
	; AVX512-NEXT: vaddps %zmm0, %zmm3, %zmm0			; AVX512-NEXT: vaddps %zmm0, %zmm3, %zmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%t0 = fmul reassoc nsz <16 x float> %x0, %x1			%t0 = fmul contract reassoc nsz <16 x float> %x0, %x1
	%t1 = fadd reassoc nsz <16 x float> %x2, %t0			%t1 = fadd contract reassoc nsz <16 x float> %x2, %t0
	%t2 = fadd reassoc nsz <16 x float> %x3, %t1			%t2 = fadd reassoc nsz <16 x float> %x3, %t1
	ret <16 x float> %t2			ret <16 x float> %t2
	}			}

	; Verify that AVX512 512-bit vector double-precision adds are reassociated.			; Verify that AVX512 512-bit vector double-precision adds are reassociated.

	define <8 x double> @reassociate_adds_v8f64(<8 x double> %x0, <8 x double> %x1, <8 x double> %x2, <8 x double> %x3) {			define <8 x double> @reassociate_adds_v8f64(<8 x double> %x0, <8 x double> %x1, <8 x double> %x2, <8 x double> %x3) {
	; SSE-LABEL: reassociate_adds_v8f64:			; SSE-LABEL: reassociate_adds_v8f64:
	(22 lines not shown)
	; AVX1-NEXT: vaddpd %ymm2, %ymm1, %ymm1			; AVX1-NEXT: vaddpd %ymm2, %ymm1, %ymm1
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX512-LABEL: reassociate_adds_v8f64:			; AVX512-LABEL: reassociate_adds_v8f64:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vfmadd213pd {{.*#+}} zmm0 = (zmm1 * zmm0) + zmm2			; AVX512-NEXT: vfmadd213pd {{.*#+}} zmm0 = (zmm1 * zmm0) + zmm2
	; AVX512-NEXT: vaddpd %zmm0, %zmm3, %zmm0			; AVX512-NEXT: vaddpd %zmm0, %zmm3, %zmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%t0 = fmul reassoc nsz <8 x double> %x0, %x1			%t0 = fmul contract reassoc nsz <8 x double> %x0, %x1
	%t1 = fadd reassoc nsz <8 x double> %x2, %t0			%t1 = fadd contract reassoc nsz <8 x double> %x2, %t0
	%t2 = fadd reassoc nsz <8 x double> %x3, %t1			%t2 = fadd reassoc nsz <8 x double> %x3, %t1
	ret <8 x double> %t2			ret <8 x double> %t2
	}			}

	; Verify that AVX512 512-bit vector single-precision multiplies are reassociated.			; Verify that AVX512 512-bit vector single-precision multiplies are reassociated.

	define <16 x float> @reassociate_muls_v16f32(<16 x float> %x0, <16 x float> %x1, <16 x float> %x2, <16 x float> %x3) {			define <16 x float> @reassociate_muls_v16f32(<16 x float> %x0, <16 x float> %x1, <16 x float> %x2, <16 x float> %x3) {
	; SSE-LABEL: reassociate_muls_v16f32:			; SSE-LABEL: reassociate_muls_v16f32:
	(657 lines not shown)

llvm/test/CodeGen/X86/sqrt-fastmath.ll

	(706 lines not shown)
	; AVX512-NEXT: vfmadd231ps {{.*#+}} xmm3 = (xmm2 * xmm1) + xmm3			; AVX512-NEXT: vfmadd231ps {{.*#+}} xmm3 = (xmm2 * xmm1) + xmm3
	; AVX512-NEXT: vbroadcastss {{.*#+}} xmm1 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]			; AVX512-NEXT: vbroadcastss {{.*#+}} xmm1 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]
	; AVX512-NEXT: vmulps %xmm1, %xmm2, %xmm1			; AVX512-NEXT: vmulps %xmm1, %xmm2, %xmm1
	; AVX512-NEXT: vmulps %xmm3, %xmm1, %xmm1			; AVX512-NEXT: vmulps %xmm3, %xmm1, %xmm1
	; AVX512-NEXT: vmulps %xmm1, %xmm0, %xmm0			; AVX512-NEXT: vmulps %xmm1, %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%s = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %z)			%s = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %z)
	%a = call <4 x float> @llvm.fabs.v4f32(<4 x float> %y)			%a = call <4 x float> @llvm.fabs.v4f32(<4 x float> %y)
	%m = fmul reassoc <4 x float> %a, %s			%m = fmul contract reassoc <4 x float> %a, %s
	%d = fdiv reassoc arcp <4 x float> %x, %m			%d = fdiv contract reassoc arcp <4 x float> %x, %m
	ret <4 x float> %d			ret <4 x float> %d
	}			}

	; This has 'arcp' but does not have 'reassoc' FMF.			; This has 'arcp' but does not have 'reassoc' FMF.
	; We allow converting the sqrt to an estimate, but			; We allow converting the sqrt to an estimate, but
	; do not pull the divisor into the estimate.			; do not pull the divisor into the estimate.
	; x / (fabs(y) * sqrt(z)) --> x * rsqrt(z) / fabs(y)			; x / (fabs(y) * sqrt(z)) --> x * rsqrt(z) / fabs(y)

	(165 lines not shown)
	; AVX512-NEXT: vbroadcastss {{.*#+}} xmm3 = [-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0]			; AVX512-NEXT: vbroadcastss {{.*#+}} xmm3 = [-3.0E+0,-3.0E+0,-3.0E+0,-3.0E+0]
	; AVX512-NEXT: vfmadd231ps {{.*#+}} xmm3 = (xmm2 * xmm1) + xmm3			; AVX512-NEXT: vfmadd231ps {{.*#+}} xmm3 = (xmm2 * xmm1) + xmm3
	; AVX512-NEXT: vbroadcastss {{.*#+}} xmm1 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]			; AVX512-NEXT: vbroadcastss {{.*#+}} xmm1 = [-5.0E-1,-5.0E-1,-5.0E-1,-5.0E-1]
	; AVX512-NEXT: vmulps %xmm1, %xmm2, %xmm1			; AVX512-NEXT: vmulps %xmm1, %xmm2, %xmm1
	; AVX512-NEXT: vmulps %xmm3, %xmm1, %xmm1			; AVX512-NEXT: vmulps %xmm3, %xmm1, %xmm1
	; AVX512-NEXT: vmulps %xmm1, %xmm0, %xmm0			; AVX512-NEXT: vmulps %xmm1, %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%s = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %y)			%s = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %y)
	%m = fmul reassoc <4 x float> %y, %s			%m = fmul contract reassoc <4 x float> %y, %s
	%d = fdiv reassoc arcp <4 x float> %x, %m			%d = fdiv contract reassoc arcp <4 x float> %x, %m
	ret <4 x float> %d			ret <4 x float> %d
	}			}

	define double @sqrt_fdiv_common_operand(double %x) nounwind {			define double @sqrt_fdiv_common_operand(double %x) nounwind {
	; SSE-LABEL: sqrt_fdiv_common_operand:			; SSE-LABEL: sqrt_fdiv_common_operand:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: sqrtsd %xmm0, %xmm0			; SSE-NEXT: sqrtsd %xmm0, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	(117 lines not shown)

Diff 353487