This is an archive of the discontinued LLVM Phabricator instance.

Differential D45453

[InstCombine] Enable Add/Sub simplifications with only 'reassoc' FMF
Closed, Public

Authored by wristow on Apr 9 2018, 1:26 PM.

Details

Reviewers

spatel
majnemer
efriedma

Summary

These simplifications were previously enabled only with isFast(), but that
is more restrictive than required. Since r317488, FMF has 'reassoc' to
control these cases at a finer level.

Diff Detail

Event Timeline

wristow created this revision. Apr 9 2018, 1:26 PM

The tests here demonstrate the overlap between instcombine and reassociate, and based on feedback I've gotten, I'd say that rL170471 was a mistake for instcombine. It created a mini-reassociation pass within instcombine.

I've been cataloging, testing, and updating FP binop instcombines recently (eg, rL328502, rL329501), and the remaining big blob is FAddCombine() - which is poorly named because as shown in this patch, it's also called from visitFSub()...

I don't actually know everything that can happen in this code, but if others are confident that it's just reassociation (without need for nsz/nnan/ninf), then I suppose this is ok to loosen up. I'd still like to have dedicated tests in instcombine for whatever is happening in FAddCombine() though, so add to the tests from rL170471 to demonstrate at least some of the diffs without needing -reassociate?

A couple of short comments/questions, and one longer one:

In D45453#1063225, @spatel wrote:

... the remaining big blob is FAddCombine() - which is poorly named because as shown in this patch, it's also called from visitFSub()...

Should we rename it to FAddSubCombine() in this work (assuming this patch moves forward, given the other questions here)?

... I'd still like to have dedicated tests in instcombine for whatever is happening in FAddCombine() though, so add to the tests from rL170471 to demonstrate at least some of the diffs without needing -reassociate?

To be sure I'm understanding, you mean demonstrate some diffs in those tests using the flag 'reassoc' rather than 'fast' (without needing -reassociate), right? I have a concern that meaningful transformations based on the 'reassoc' flag alone may require both -instcombine and -reassociate.

My longer question relates to:

... but if others are confident that it's just reassociation (without need for nsz/nnan/ninf), then I suppose this is ok to loosen up.

In a sense, I do feel we should check 'nsz' in addition to 'reassoc' (and maybe others). And similarly, the tests in the other files that I modified in this patch possibly should only do the transformation if 'reassoc' and 'nsz' (and others?) are set. But when starting with C/C++ (when -ffast-math isn't specified), we only will set 'reassoc' if some other switches in addition to -fassociative-math are passed. Specifically:
-fassociative-math -fno-signed-zeros -fno-trapping-math

(details at D39812).

So in a sense, an instruction that has only 'reassoc' set on it isn't complete, in that it cannot be generated by Clang. I feel like this is an IEEE thing, rather than a C/C++ thing (so other language front-ends ought to do something similar). But from this perspective, is checking only 'reassoc' OK? Or should we change tests that have only 'reassoc' on an instruction to also have 'nsz'? Which then leads me to the problem/question that -fno-trapping-math doesn't have a bit in FMF, so it creates a new dimension to the problem/question.

Anyway, I took the interpretation that when the 'reassoc' bit is set, it in a sense means the equivalent of at least -fassociative-math -fno-signed-zeros -fno-trapping-math from a C/C++ command-line perspective. I'm not sure how we should interpret this. As I write this comment, I feel like I've opened a can of worms.

In D45453#1063950, @wristow wrote:

A couple of short comments/questions, and one longer one:

In D45453#1063225, @spatel wrote:

... the remaining big blob is FAddCombine() - which is poorly named because as shown in this patch, it's also called from visitFSub()...

Should we rename it to FAddSubCombine() in this work (assuming this patch moves forward, given the other questions here)?

I'd prefer more specific like reassociateFAddFSub(), but I wouldn't bother unless we want to keep that code in its current form.

... I'd still like to have dedicated tests in instcombine for whatever is happening in FAddCombine() though, so add to the tests from rL170471 to demonstrate at least some of the diffs without needing -reassociate?

To be sure I'm understanding, you mean demonstrate some diffs in those tests using the flag 'reassoc' rather than 'fast' (without needing -reassociate), right?

Yes. Example from fast-math.ll:

; (X + C1) + C2 => X + (C1 + C2)
define float @fold5_reassoc(float %f1, float %f2) {
; CHECK-LABEL: @fold5_reassoc(
; CHECK-NEXT:    [[ADD1:%.*]] = fadd reassoc float [[F1:%.*]], 9.000000e+00
; CHECK-NEXT:    ret float [[ADD1]]
;
  %add = fadd float %f1, 4.000000e+00
  %add1 = fadd reassoc float %add, 5.000000e+00
  ret float %add1
}

So put something like that under the existing test to show the minimal FMF requirement. Or just replace the 'fast' in the existing test with 'reassoc'.

I have a concern that meaningful transformations based on the 'reassoc' flag alone may require both -instcombine and -reassociate.

There could be some kind of symbiotic behavior here, but this is really saying that we don't have a well-defined division of labor between these passes, right?

My longer question relates to:

... but if others are confident that it's just reassociation (without need for nsz/nnan/ninf), then I suppose this is ok to loosen up.

In a sense, I do feel we should check 'nsz' in addition to 'reassoc' (and maybe others). And similarly, the tests in the other files that I modified in this patch possibly should only do the transformation if 'reassoc' and 'nsz' (and others?) are set. But when starting with C/C++ (when -ffast-math isn't specified), we only will set 'reassoc' if some other switches in addition to -fassociative-math are passed. Specifically:
-fassociative-math -fno-signed-zeros -fno-trapping-math

(details at D39812).

So in a sense, an instruction that has only 'reassoc' set on it isn't complete, in that it cannot be generated by Clang. I feel like this is an IEEE thing, rather than a C/C++ thing (so other language front-ends ought to do something similar). But from this perspective, is checking only 'reassoc' OK? Or should we change tests that have only 'reassoc' on an instruction to also have 'nsz'? Which then leads me to the problem/question that -fno-trapping-math doesn't have a bit in FMF, so it creates a new dimension to the problem/question.

Anyway, I took the interpretation that when the 'reassoc' bit is set, it in a sense means the equivalent of at least -fassociative-math -fno-signed-zeros -fno-trapping-math from a C/C++ command-line perspective. I'm not sure how we should interpret this. As I write this comment, I feel like I've opened a can of worms.

Hopefully, not. :)

We need to be careful here - just because clang is behaving some way doesn't mean other front-ends are doing the same. We defined the LLVM FMF (maybe not precisely enough in the LangRef) so 'reassoc' and 'nsz' are independent bits.

I've made several recent changes to the FMF requirements in instsimplify/instcombine assuming that 'reassoc' does not imply anything beyond the ability to rearrange math ops. If the sign-of-zero doesn't match the IEEE requirements, then the transform needs 'nsz' too. Trapping isn't a concern because we assume a default FP environment (see recent edits to the LangRef).
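As an editorial illustration of why 'reassoc' and 'nsz' are separate concerns, the following minimal C++ sketch compares a strict evaluation of (x + y) - y with the value a reassociating fold would produce. The function names are hypothetical and are not taken from the patch; `volatile` is used only to keep the compiler from simplifying the strict version. It shows the general point and is not a statement about which specific FAddCombine transforms need 'nsz'.

```cpp
#include <cassert>
#include <cmath>

// Strict IEEE-754 evaluation of (x + y) - y. The volatile temporary forces
// the intermediate sum to be materialized as a float.
float add_then_sub_strict(float x, float y) {
  volatile float sum = x + y;
  return sum - y;
}

// What a reassociating fold would produce: (x + y) - y ==> x.
float add_then_sub_folded(float x, float /*y*/) { return x; }

// Hypothetical usage: for x = -0.0 and y = +0.0 the two forms compare equal
// numerically but differ in the sign of zero.
void demo_signed_zero() {
  float strict = add_then_sub_strict(-0.0f, 0.0f);
  float folded = add_then_sub_folded(-0.0f, 0.0f);
  assert(strict == folded);                               // both are zero
  assert(std::signbit(strict) != std::signbit(folded));   // +0.0 vs -0.0
}
```

Here -0.0 + 0.0 is +0.0 under the default rounding mode, so the strict result is +0.0 while the folded result is -0.0. A fold like this one is only value-preserving if the sign of zero is allowed to change, which is what 'nsz' grants.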

So a few ways forward:

1. Easiest: change the FMF requirements in this patch to reassoc + nsz. We're confident that this code only deals with fadd/fsub, so { nnan , ninf , arcp , afn , contract } shouldn't be in play.
2. More involved: audit (and add tests for) all of the potential transforms enabled by FAddCombine and determine when only 'reassoc' is needed.
3. Ideal: audit instcombine and the reassociate pass and split up the transforms, so we don't have redundancies - or at least the redundancies are justified and documented. Note that this goes beyond just the FAddCombine chunk of code. We also do reassociation in instcombine under the name SimplifyAssociativeOrCommutative(), and I think FAddCombine() has redundant code for things that are handled there too.

Thanks for the clarifications Sanjay!

In D45453#1064271, @spatel wrote:
...
Yes. Example from fast-math.ll:
; (X + C1) + C2 => X + (C1 + C2)
define float @fold5_reassoc(float %f1, float %f2) {
; CHECK-LABEL: @fold5_reassoc(
; CHECK-NEXT:    [[ADD1:%.*]] = fadd reassoc float [[F1:%.*]], 9.000000e+00
; CHECK-NEXT:    ret float [[ADD1]]
;
  %add = fadd float %f1, 4.000000e+00
  %add1 = fadd reassoc float %add, 5.000000e+00
  ret float %add1
}
So put something like that under the existing test to show the minimal FMF requirement. Or just replace the 'fast' in the existing test with 'reassoc'.

I have a concern that meaningful transformations based on the 'reassoc' flag alone may require both -instcombine and -reassociate.

There could be some kind of symbiotic behavior here, but this is really saying that we don't have a well-defined division of labor between these passes, right?

Yes, a lack of a well-defined division of labor between these passes is exactly my concern. I'll look into adding or modifying tests. Depending on possible interaction, that may not be fruitful. Ultimately, that should be addressed.

To be explicit, my immediate goal is to make reassociation "work" when unrelated FMF are disabled. For example:

float foo(float arg, float x) { return (arg + x) - x; }

should simply return arg when compiled at -O2 -ffast-math -fno-reciprocal-math. But as it currently stands, since -fno-reciprocal-math makes isFast() false, disabling the reciprocal transformation incorrectly prevents the reassociation.
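For readers wondering why this fold needs any fast-math permission at all, here is a small C++ sketch added as an editorial illustration. The name `foo_strict` is hypothetical; it evaluates the same expression as foo() but forces the intermediate sum to be rounded to single precision, which is what strict IEEE semantics require.

```cpp
#include <cassert>

// Strict left-to-right evaluation of (arg + x) - x in single precision.
float foo_strict(float arg, float x) {
  volatile float sum = arg + x;  // rounded to float at this point
  return sum - x;
}

// Hypothetical usage: with a large x, the strict result is not arg.
void demo_rounding() {
  // Floats near 1.0e8 are spaced 8 apart, so 1.0e8f + 1.0f rounds to 1.0e8f.
  assert(foo_strict(1.0f, 1.0e8f) == 0.0f);
  // With a small x nothing is lost and the result equals arg.
  assert(foo_strict(1.0f, 2.0f) == 1.0f);
}
```

Returning arg directly therefore changes the result for some inputs, so the optimizer may only do it when the instruction's flags permit reassociation.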

So although ultimately the -instcombine/-reassociate division of labor should be addressed, I'd prefer to make this small step in improving the FMF status without doing that more substantial piece of work.

So in a sense, an instruction that has only 'reassoc' set on it isn't complete, in that it cannot be generated by Clang. I feel like this is an IEEE thing, rather than a C/C++ thing (so other language front-ends ought to do something similar). But from this perspective, is checking only 'reassoc' OK? Or should we change tests that have only 'reassoc' on an instruction to also have 'nsz'? Which then leads me to the problem/question that -fno-trapping-math doesn't have a bit in FMF, so it creates a new dimension to the problem/question.

Anyway, I took the interpretation that when the 'reassoc' bit is set, it in a sense means the equivalent of at least -fassociative-math -fno-signed-zeros -fno-trapping-math from a C/C++ command-line perspective. I'm not sure how we should interpret this. As I write this comment, I feel like I've opened a can of worms.

Hopefully, not. :)

We need to be careful here - just because clang is behaving some way doesn't mean other front-ends are doing the same. We defined the LLVM FMF (maybe not precisely enough in the LangRef) so 'reassoc' and 'nsz' are independent bits.

OK, I can certainly be on board with that. My point that it feels like an IEEE thing rather than a C/C++ thing was meant to imply that other front-ends really should be doing the same. But in any case, I do like the clarity of (for example) 'reassoc' and 'nsz' explicitly being understood to be independent bits (or more directly, that 'reassoc' alone isn't generally enough to enable reassociation). Further, that clarification helps to keep the can of worms closed.

...
So a few ways forward:

1. Easiest: change the FMF requirements in this patch to reassoc + nsz. We're confident that this code only deals with fadd/fsub, so { nnan , ninf , arcp , afn , contract } shouldn't be in play.

2. More involved: audit (and add tests for) all of the potential transforms enabled by FAddCombine and determine when only 'reassoc' is needed.

3. Ideal: audit instcombine and the reassociate pass and split up the transforms, so we don't have redundancies - or at least the redundancies are justified and documented. Note that this goes beyond just the FAddCombine chunk of code. We also do reassociation in instcombine under the name SimplifyAssociativeOrCommutative(), and I think FAddCombine() has redundant code for things that are handled there too.

Short term, I intend to do (1). This approach will let me meet my goal of making reassociation work when unrelated FMF are disabled (at least in some cases). In doing that, I'll experiment with updating the tests in fast-math.ll, as you suggest.
Also in doing that, I'll look into (2) at least somewhat, to see if there are additional changes that clearly would go well as part of this sort of change.

Longer term, I'll look into (3). Not sure when I'll get to that, though.

I've updated the code-change in "InstCombineAddSub.cpp" to require both 'reassoc' and 'nsz' to do the transformations. (I haven't changed the name from FAddCombine to FAddSubCombine or ReassociateFAddFSub. I could do that if we move forward with this.)

There could be some kind of symbiotic behavior here, but this is really saying that we don't have a well-defined division of labor between these passes, right?

Yes, a lack of a well-defined division of labor between these passes is exactly my concern. I'll look into adding or modifying tests. Depending on possible interaction, that may not be fruitful. Ultimately, that should be addressed.

In looking this over, I was pleasantly surprised that I didn't run into the troublesome interaction I was concerned about. So I've added new tests to "InstCombine/fast-math.ll" to check for the transformations happening when 'reassoc' + 'nsz' are on (with only '-instcombine' used). I've also added tests that verify that 'reassoc' alone doesn't enable the transformation in the appropriate places. Lastly, I updated the tests I had previously modified to use both 'reassoc' and 'nsz' where appropriate when testing for the minimal set of needed flags to transform things.
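The effect of the updated gate can be sketched with a toy model of the fast-math flag set. This is an editorial illustration only: the `Flags` struct below is a hypothetical stand-in, not LLVM's FastMathFlags class, and the bit layout is an assumption. The one property it models is that isFast() means "all flags set", while the updated code asks only for 'reassoc' and 'nsz'.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a fast-math flag set; names and bit positions
// are illustrative and do not mirror LLVM's implementation.
struct Flags {
  enum : std::uint8_t {
    Reassoc = 1 << 0, NoNaNs = 1 << 1, NoInfs = 1 << 2, NoSignedZeros = 1 << 3,
    AllowReciprocal = 1 << 4, AllowContract = 1 << 5, ApproxFunc = 1 << 6,
    All = 0x7F
  };
  std::uint8_t bits = 0;

  bool isFast() const { return bits == All; }
  bool has(std::uint8_t f) const { return (bits & f) == f; }
};

// Gate before this revision: every flag must be present.
bool oldGate(Flags f) { return f.isFast(); }

// Gate after this revision: only 'reassoc' and 'nsz' are required.
bool newGate(Flags f) {
  return f.has(Flags::Reassoc) && f.has(Flags::NoSignedZeros);
}

// Hypothetical usage: model -ffast-math -fno-reciprocal-math by clearing
// only the reciprocal bit.
void demo_gates() {
  Flags noArcp;
  noArcp.bits = Flags::All & ~Flags::AllowReciprocal;
  assert(!oldGate(noArcp));  // the old check rejects the instruction
  assert(newGate(noArcp));   // the new check still allows the fold
}
```

In this model, clearing an unrelated flag such as the reciprocal bit no longer blocks the add/sub simplification, while an instruction carrying 'reassoc' alone is still rejected.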

In D45453#1066811, @wristow wrote:

I've updated the code-change in "InstCombineAddSub.cpp" to require both 'reassoc' and 'nsz' to do the transformations. (I haven't changed the name from FAddCombine to FAddSubCombine or ReassociateFAddFSub. I could do that if we move forward with this.)

There could be some kind of symbiotic behavior here, but this is really saying that we don't have a well-defined division of labor between these passes, right?

Yes, a lack of a well-defined division of labor between these passes is exactly my concern. I'll look into adding or modifying tests. Depending on possible interaction, that may not be fruitful. Ultimately, that should be addressed.

In looking this over, I was pleasantly surprised that I didn't run into the troublesome interaction I was concerned about. So I've added new tests to "InstCombine/fast-math.ll" to check for the transformations happening when 'reassoc' + 'nsz' are on (with only '-instcombine' used). I've also added tests that verify that 'reassoc' alone doesn't enable the transformation in the appropriate places. Lastly, I updated the tests I had previously modified to use both 'reassoc' and 'nsz' where appropriate when testing for the minimal set of needed flags to transform things.

Thanks - more instcombine tests in this area are needed IMO.

I think the code change is fine in that it removes the unrelated FMF from these transforms. And it may be that after this change, we're 'good enough'. Ie, it may be that nobody cares about splitting the FMF hairs further than this.

But I have pointed out some test comments that I don't think are correct. I didn't look at all tests, but I suggest softening the language in general and adding 'TODO', so if someone does decide to improve more, the tests acknowledge that what we have currently isn't complete.

test/Transforms/InstCombine/fast-math.ll
64–65	x * 2.0 + x --> x * 3.0 This doesn't require 'nsz'? If x = -0.0, this preserves that the output is -0.0. In fact, this exact case doesn't need any fast-ness? (cc @scanon) https://reviews.llvm.org/D31164?id=92521#inline-270764
102–104	(4.0 - x) + (5.0 - y) --> 9.0 - (x + y) Again, I don't think this actually requires 'nsz'. Maybe it's better to mark all similar tests with "TODO: We may be able to remove the 'nsz' requirement."
269–272	See the earlier link - another potential special-case where we don't need any FMF.
421–428	-x + y --> y - x This doesn't require any FMF.
433–440	x + (-y) --> x - y Again, no FMF needed at all.
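The observation on lines 64–65, that x * 2.0 + x --> x * 3.0 needs no fast-math flags, can be spot-checked with a short C++ sketch. This is an editorial illustration, not part of the review; the helper names and the sample values are hypothetical. The reasoning is that doubling is exact in binary floating point (short of overflow), so both forms round the exact value 3x once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Bit pattern of a float, so that +0.0 and -0.0 are distinguished.
std::uint32_t bits_of(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// x * 2.0 + x, with each step explicitly rounded to float.
float two_x_plus_x(float x) {
  volatile float twice = x * 2.0f;
  volatile float sum = twice + x;
  return sum;
}

// x * 3.0, rounded to float.
float three_x(float x) {
  volatile float r = x * 3.0f;
  return r;
}

// Counts how many values in a small hand-picked sample give bitwise
// different results for the two forms. NaN inputs are left out on purpose.
int count_mismatches() {
  const float samples[] = {0.0f,  -0.0f,   1.0f,    -1.0f,  0.1f,
                           -0.3f, 1.0e8f,  1.0e-40f, 1.2e38f, 3.0e38f};
  int mismatches = 0;
  for (std::size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
    if (bits_of(two_x_plus_x(samples[i])) != bits_of(three_x(samples[i])))
      ++mismatches;
  }
  return mismatches;
}

// Hypothetical usage: the sample shows no mismatches, and -0.0 stays -0.0.
void demo_triple() {
  assert(count_mismatches() == 0);
  assert(bits_of(two_x_plus_x(-0.0f)) == bits_of(-0.0f));
}
```

A sample like this is evidence rather than proof, and it assumes the default floating-point environment (round-to-nearest, no flush-to-zero).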

Thanks Sanjay. Updated the tests to be more careful about when 'nsz' is and isn't needed, along with comments to clarify. Also simplified some tests where no FMF is needed (irrespective of this patch).

wristow marked 5 inline comments as done. Apr 13 2018, 6:22 PM

wristow added inline comments.

test/Transforms/InstCombine/fast-math.ll
64–65	I've updated the comment about 'nsz' not being needed, and I've changed the test so as not to be exercising the special-case of 3*x (and added a comment about the special-case as a TODO, in case anyone ever wants to take advantage of that point).
102–104	Thanks. In looking, I think this one definitely doesn't require 'nsz', so I've added a TODO comment that says 'nsz' is unneeded (rather than "may be able to remove..."). Same for many of the others, so I've added similar TODO comments, where appropriate.
269–272	I've updated the test to have the opportunity for more folding, so it's not the special-case of x+x+x. Also, 'nsz' isn't technically required, so I've added a TODO.
421–428	Good point! In fact, it does fold without any FMF, even without my patch. I've updated the test to demonstrate the folding without 'fast' (or anything else).
433–440	As with 'fold14()' above, this does fold without any FMF, even without my patch. Test updated.

LGTM - see inline for a couple of potential follow-ups.

test/Transforms/InstCombine/fast-math.ll
392	Maybe independent of this patch, but the fmul shouldn't need any FMF for this fold, so it's worth checking if that actually happens. Another consideration is whether the fmul hasOneUse(). We definitely don't have enough multi-use tests to show that folds are not unintentionally creating extra instructions.
421–428	Nit: it's a bit crazy that there are no existing regression tests for these always-safe fneg folds; must be from the early cowboy compiler days. :) There really should be an 'fadd.ll' test file for non-FMF folds.

This revision is now accepted and ready to land. Apr 14 2018, 10:03 AM

Thanks for the review Sanjay. Now committed, at r330089

https://reviews.llvm.org/rL330089

wristow added inline comments. Apr 14 2018, 12:50 PM

test/Transforms/InstCombine/fast-math.ll
392	Hmmm... You're saying that the transformation: `c1 * x - x => (c1 - 1.0) * x` shouldn't need any FMF? That doesn't seem right to me. I think both 'reassoc' and 'nsz' should be needed.

spatel added inline comments. Apr 14 2018, 3:08 PM

test/Transforms/InstCombine/fast-math.ll
392	The fsub needs FMF, but not the fmul. That's the general rule we're using in FMF-enabled folds. The non-strict final value (the result of the fsub) is what we're reassociating/factoring here, so as long as that instruction has reassoc+nsz, it doesn't matter if the fmul intermediate value is strict.

The fsub needs FMF, but not the fmul. That's the general rule we're using in FMF-enabled folds. The non-strict final value (the result of the fsub) is what we're reassociating/factoring here, so as long as that instruction has reassoc+nsz, it doesn't matter if the fmul intermediate value is strict

Got it. Thanks for explaining.

spatel mentioned this in rL330126: [InstCombine] simplify fneg+fadd folds; NFC. Apr 16 2018, 7:17 AM

Revision Contents

Path	Size
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp	7 lines
llvm/test/Transforms/InstCombine/fast-math.ll	445 lines
llvm/test/Transforms/Reassociate/fast-MissedTree.ll	13 lines
llvm/test/Transforms/Reassociate/fast-SubReassociate.ll	13 lines
llvm/test/Transforms/Reassociate/fast-basictest.ll	97 lines

Diff 142500

lib/Transforms/InstCombine/InstCombineAddSub.cpp

...	Value *FAddCombine::performFactorization(Instruction *I) {

Value *RI = createFDiv(NewAddSub, Factor);		Value *RI = createFDiv(NewAddSub, Factor);
if (Instruction *II = dyn_cast<Instruction>(RI))		if (Instruction *II = dyn_cast<Instruction>(RI))
II->setFastMathFlags(Flags);		II->setFastMathFlags(Flags);
return RI;		return RI;
}		}

Value *FAddCombine::simplify(Instruction *I) {		Value *FAddCombine::simplify(Instruction *I) {
assert(I->isFast() && "Expected 'fast' instruction");		assert(I->hasAllowReassoc() && I->hasNoSignedZeros() &&
		"Expected 'reassoc'+'nsz' instruction");

// Currently we are not able to handle vector type.		// Currently we are not able to handle vector type.
if (I->getType()->isVectorTy())		if (I->getType()->isVectorTy())
return nullptr;		return nullptr;

assert((I->getOpcode() == Instruction::FAdd ||		assert((I->getOpcode() == Instruction::FAdd ||
I->getOpcode() == Instruction::FSub) && "Expect add/sub");		I->getOpcode() == Instruction::FSub) && "Expect add/sub");

...	if (SIToFPInst *RHSConv = dyn_cast<SIToFPInst>(RHS)) {
}		}
}		}
}		}

// Handle specials cases for FAdd with selects feeding the operation		// Handle specials cases for FAdd with selects feeding the operation
if (Value *V = SimplifySelectsFeedingBinaryOp(I, LHS, RHS))		if (Value *V = SimplifySelectsFeedingBinaryOp(I, LHS, RHS))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (I.isFast()) {		if (I.hasAllowReassoc() && I.hasNoSignedZeros()) {
if (Value *V = FAddCombine(Builder).simplify(&I))		if (Value *V = FAddCombine(Builder).simplify(&I))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);
}		}

return Changed ? &I : nullptr;		return Changed ? &I : nullptr;
}		}

/// Optimize pointer differences into the same array into a size. Consider:		/// Optimize pointer differences into the same array into a size. Consider:
...	if (match(Op1, m_OneUse(m_FPExt(m_FNeg(m_Value(Y)))))) {
Value *ExtY = Builder.CreateFPExt(Y, I.getType());		Value *ExtY = Builder.CreateFPExt(Y, I.getType());
return BinaryOperator::CreateFAddFMF(Op0, ExtY, &I);		return BinaryOperator::CreateFAddFMF(Op0, ExtY, &I);
}		}

// Handle specials cases for FSub with selects feeding the operation		// Handle specials cases for FSub with selects feeding the operation
if (Value *V = SimplifySelectsFeedingBinaryOp(I, Op0, Op1))		if (Value *V = SimplifySelectsFeedingBinaryOp(I, Op0, Op1))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);

if (I.isFast()) {		if (I.hasAllowReassoc() && I.hasNoSignedZeros()) {
if (Value *V = FAddCombine(Builder).simplify(&I))		if (Value *V = FAddCombine(Builder).simplify(&I))
return replaceInstUsesWith(I, V);		return replaceInstUsesWith(I, V);
}		}

return nullptr;		return nullptr;
}		}

test/Transforms/InstCombine/fast-math.ll

	...
	; CHECK-NEXT: ret float [[MUL1]]			; CHECK-NEXT: ret float [[MUL1]]
	;			;
	%mul = fmul float %a, 0x3FF3333340000000			%mul = fmul float %a, 0x3FF3333340000000
	%mul1 = fmul fast float %mul, 0x4002666660000000			%mul1 = fmul fast float %mul, 0x4002666660000000
	ret float %mul1			ret float %mul1
	}			}

	; C * f1 + f1 = (C+1) * f1			; C * f1 + f1 = (C+1) * f1
				; TODO: The particular case where C is 2 (so the folded result is 3.0*f1) is
				; always safe, and so doesn't need any FMF.
				; That is, (x + x + x) and (3*x) each have only a single rounding.
	define double @fold3(double %f1) {			define double @fold3(double %f1) {
	; CHECK-LABEL: @fold3(			; CHECK-LABEL: @fold3(
	; CHECK-NEXT: [[TMP1:%.*]] = fmul fast double [[F1:%.*]], 3.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = fmul fast double [[F1:%.*]], 6.000000e+00
	; CHECK-NEXT: ret double [[TMP1]]			; CHECK-NEXT: ret double [[TMP1]]
	;			;
	%t1 = fmul fast double 2.000000e+00, %f1			%t1 = fmul fast double 5.000000e+00, %f1
	%t2 = fadd fast double %f1, %t1			%t2 = fadd fast double %f1, %t1
	ret double %t2			ret double %t2
	}			}

				; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
				define double @fold3_reassoc_nsz(double %f1) {
				; CHECK-LABEL: @fold3_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc nsz double [[F1:%.*]], 6.000000e+00
				; CHECK-NEXT: ret double [[TMP1]]
				;
				%t1 = fmul reassoc nsz double 5.000000e+00, %f1
				%t2 = fadd reassoc nsz double %f1, %t1
				ret double %t2
				}

				; TODO: This doesn't require 'nsz'. It should fold to f1 * 6.0.
				define double @fold3_reassoc(double %f1) {
				; CHECK-LABEL: @fold3_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc double [[F1:%.*]], 5.000000e+00
				; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc double [[TMP1]], [[F1]]
				; CHECK-NEXT: ret double [[TMP2]]
				;
				%t1 = fmul reassoc double 5.000000e+00, %f1
				%t2 = fadd reassoc double %f1, %t1
				ret double %t2
				}

	; (C1 - X) + (C2 - Y) => (C1+C2) - (X + Y)			; (C1 - X) + (C2 - Y) => (C1+C2) - (X + Y)
	define float @fold4(float %f1, float %f2) {			define float @fold4(float %f1, float %f2) {
	; CHECK-LABEL: @fold4(			; CHECK-LABEL: @fold4(
	; CHECK-NEXT: [[TMP1:%.*]] = fadd fast float [[F1:%.*]], [[F2:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fadd fast float [[F1:%.*]], [[F2:%.*]]
	; CHECK-NEXT: [[TMP2:%.*]] = fsub fast float 9.000000e+00, [[TMP1]]			; CHECK-NEXT: [[TMP2:%.*]] = fsub fast float 9.000000e+00, [[TMP1]]
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	%sub = fsub float 4.000000e+00, %f1			%sub = fsub float 4.000000e+00, %f1
	%sub1 = fsub float 5.000000e+00, %f2			%sub1 = fsub float 5.000000e+00, %f2
	%add = fadd fast float %sub, %sub1			%add = fadd fast float %sub, %sub1
	ret float %add			ret float %add
	}			}

				; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
				define float @fold4_reassoc_nsz(float %f1, float %f2) {
				; CHECK-LABEL: @fold4_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[F1:%.*]], [[F2:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fsub reassoc nsz float 9.000000e+00, [[TMP1]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%sub = fsub float 4.000000e+00, %f1
				%sub1 = fsub float 5.000000e+00, %f2
				%add = fadd reassoc nsz float %sub, %sub1
				ret float %add
				}

				; TODO: This doesn't require 'nsz'. It should fold to (9.0 - (f1 + f2)).
				define float @fold4_reassoc(float %f1, float %f2) {
				; CHECK-LABEL: @fold4_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fsub float 4.000000e+00, [[F1:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fsub float 5.000000e+00, [[F2:%.*]]
				; CHECK-NEXT: [[TMP3:%.*]] = fadd reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%sub = fsub float 4.000000e+00, %f1
				%sub1 = fsub float 5.000000e+00, %f2
				%add = fadd reassoc float %sub, %sub1
				ret float %add
				}

	; (X + C1) + C2 => X + (C1 + C2)			; (X + C1) + C2 => X + (C1 + C2)
	define float @fold5(float %f1, float %f2) {			define float @fold5(float %f1) {
	; CHECK-LABEL: @fold5(			; CHECK-LABEL: @fold5(
	; CHECK-NEXT: [[ADD1:%.*]] = fadd fast float [[F1:%.*]], 9.000000e+00			; CHECK-NEXT: [[ADD1:%.*]] = fadd fast float [[F1:%.*]], 9.000000e+00
	; CHECK-NEXT: ret float [[ADD1]]			; CHECK-NEXT: ret float [[ADD1]]
	;			;
	%add = fadd float %f1, 4.000000e+00			%add = fadd float %f1, 4.000000e+00
	%add1 = fadd fast float %add, 5.000000e+00			%add1 = fadd fast float %add, 5.000000e+00
	ret float %add1			ret float %add1
	}			}

	; (X + X) + X => 3.0 * X			; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
				define float @fold5_reassoc_nsz(float %f1) {
				; CHECK-LABEL: @fold5_reassoc_nsz(
				; CHECK-NEXT: [[ADD1:%.*]] = fadd reassoc nsz float [[F1:%.*]], 9.000000e+00
				; CHECK-NEXT: ret float [[ADD1]]
				;
				%add = fadd float %f1, 4.000000e+00
				%add1 = fadd reassoc nsz float %add, 5.000000e+00
				ret float %add1
				}

				; TODO: This doesn't require 'nsz'. It should fold to f1 + 9.0
				define float @fold5_reassoc(float %f1) {
				; CHECK-LABEL: @fold5_reassoc(
				; CHECK-NEXT: [[ADD:%.*]] = fadd float [[F1:%.*]], 4.000000e+00
				; CHECK-NEXT: [[ADD1:%.*]] = fadd reassoc float [[ADD]], 5.000000e+00
				; CHECK-NEXT: ret float [[ADD1]]
				;
				%add = fadd float %f1, 4.000000e+00
				%add1 = fadd reassoc float %add, 5.000000e+00
				ret float %add1
				}

				; (X + X) + X + X => 4.0 * X
	define float @fold6(float %f1) {			define float @fold6(float %f1) {
	; CHECK-LABEL: @fold6(			; CHECK-LABEL: @fold6(
	; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 3.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 4.000000e+00
	; CHECK-NEXT: ret float [[TMP1]]			; CHECK-NEXT: ret float [[TMP1]]
	;			;
	%t1 = fadd fast float %f1, %f1			%t1 = fadd fast float %f1, %f1
	%t2 = fadd fast float %f1, %t1			%t2 = fadd fast float %f1, %t1
	ret float %t2			%t3 = fadd fast float %t2, %f1
				ret float %t3
				}

				; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
				define float @fold6_reassoc_nsz(float %f1) {
				; CHECK-LABEL: @fold6_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc nsz float [[F1:%.*]], 4.000000e+00
				; CHECK-NEXT: ret float [[TMP1]]
				;
				%t1 = fadd reassoc nsz float %f1, %f1
				%t2 = fadd reassoc nsz float %f1, %t1
				%t3 = fadd reassoc nsz float %t2, %f1
				ret float %t3
				}

				; TODO: This doesn't require 'nsz'. It should fold to f1 * 4.0.
				define float @fold6_reassoc(float %f1) {
				; CHECK-LABEL: @fold6_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc float [[F1:%.*]], [[F1]]
				; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc float [[TMP1]], [[F1]]
				; CHECK-NEXT: [[TMP3:%.*]] = fadd reassoc float [[TMP2]], [[F1]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fadd reassoc float %f1, %f1
				%t2 = fadd reassoc float %f1, %t1
				%t3 = fadd reassoc float %t2, %f1
				ret float %t3
	}			}

	; C1 * X + (X + X) = (C1 + 2) * X			; C1 * X + (X + X) = (C1 + 2) * X
	define float @fold7(float %f1) {			define float @fold7(float %f1) {
	; CHECK-LABEL: @fold7(			; CHECK-LABEL: @fold7(
	; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 7.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 7.000000e+00
	; CHECK-NEXT: ret float [[TMP1]]			; CHECK-NEXT: ret float [[TMP1]]
	;			;
	%t1 = fmul fast float %f1, 5.000000e+00			%t1 = fmul fast float %f1, 5.000000e+00
	%t2 = fadd fast float %f1, %f1			%t2 = fadd fast float %f1, %f1
	%t3 = fadd fast float %t1, %t2			%t3 = fadd fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

	; (X + X) + (X + X) => 4.0 * X			; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
				define float @fold7_reassoc_nsz(float %f1) {
				; CHECK-LABEL: @fold7_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc nsz float [[F1:%.*]], 7.000000e+00
				; CHECK-NEXT: ret float [[TMP1]]
				;
				%t1 = fmul reassoc nsz float %f1, 5.000000e+00
				%t2 = fadd reassoc nsz float %f1, %f1
				%t3 = fadd reassoc nsz float %t1, %t2
				ret float %t3
				}

				; TODO: This doesn't require 'nsz'. It should fold to f1 * 7.0.
				define float @fold7_reassoc(float %f1) {
				; CHECK-LABEL: @fold7_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc float [[F1:%.*]], 5.000000e+00
				; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc float [[F1]], [[F1]]
				; CHECK-NEXT: [[TMP3:%.*]] = fadd reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fmul reassoc float %f1, 5.000000e+00
				%t2 = fadd reassoc float %f1, %f1
				%t3 = fadd reassoc float %t1, %t2
				ret float %t3
				}

				; (X + X) + (X + X) + X => 5.0 * X
	define float @fold8(float %f1) {			define float @fold8(float %f1) {
	; CHECK-LABEL: @fold8(			; CHECK-LABEL: @fold8(
	; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 4.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 5.000000e+00
	; CHECK-NEXT: ret float [[TMP1]]			; CHECK-NEXT: ret float [[TMP1]]
	;			;
	%t1 = fadd fast float %f1, %f1			%t1 = fadd fast float %f1, %f1
	%t2 = fadd fast float %f1, %f1			%t2 = fadd fast float %f1, %f1
	%t3 = fadd fast float %t1, %t2			%t3 = fadd fast float %t1, %t2
	ret float %t3			%t4 = fadd fast float %t3, %f1
				ret float %t4
				}

				; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
				define float @fold8_reassoc_nsz(float %f1) {
				; CHECK-LABEL: @fold8_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc nsz float [[F1:%.*]], 5.000000e+00
				; CHECK-NEXT: ret float [[TMP1]]
				;
				%t1 = fadd reassoc nsz float %f1, %f1
				%t2 = fadd reassoc nsz float %f1, %f1
				%t3 = fadd reassoc nsz float %t1, %t2
				%t4 = fadd reassoc nsz float %t3, %f1
				ret float %t4
				}

				; TODO: This doesn't require 'nsz'. It should fold to f1 * 5.0.
				define float @fold8_reassoc(float %f1) {
				; CHECK-LABEL: @fold8_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc float [[F1:%.*]], [[F1]]
				; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc float [[F1]], [[F1]]
				; CHECK-NEXT: [[TMP3:%.*]] = fadd reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: [[TMP4:%.*]] = fadd reassoc float [[TMP3]], [[F1]]
				; CHECK-NEXT: ret float [[TMP4]]
				;
				%t1 = fadd reassoc float %f1, %f1
				%t2 = fadd reassoc float %f1, %f1
				%t3 = fadd reassoc float %t1, %t2
				%t4 = fadd reassoc float %t3, %f1
				ret float %t4
	}			}

	; X - (X + Y) => 0 - Y			; X - (X + Y) => 0 - Y
	define float @fold9(float %f1, float %f2) {			define float @fold9(float %f1, float %f2) {
	; CHECK-LABEL: @fold9(			; CHECK-LABEL: @fold9(
	; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float -0.000000e+00, [[F2:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float -0.000000e+00, [[F2:%.*]]
	; CHECK-NEXT: ret float [[TMP1]]			; CHECK-NEXT: ret float [[TMP1]]
				spatel: See the earlier link - another potential special-case where we don't need any FMF.
				wristow (author): I've updated the test to have the opportunity for more folding, so it's not the special-case of x+x+x. Also, 'nsz' isn't technically required, so I've added a TODO.
	;			;
	%t1 = fadd float %f1, %f2			%t1 = fadd float %f1, %f2
	%t3 = fsub fast float %f1, %t1			%t3 = fsub fast float %f1, %t1
	ret float %t3			ret float %t3
	}			}

				; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
				define float @fold9_reassoc_nsz(float %f1, float %f2) {
				; CHECK-LABEL: @fold9_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float -0.000000e+00, [[F2:%.*]]
				; CHECK-NEXT: ret float [[TMP1]]
				;
				%t1 = fadd float %f1, %f2
				%t3 = fsub reassoc nsz float %f1, %t1
				ret float %t3
				}

				; TODO: This doesn't require 'nsz'. It should fold to 0 - f2
				define float @fold9_reassoc(float %f1, float %f2) {
				; CHECK-LABEL: @fold9_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fadd float [[F1:%.*]], [[F2:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fsub reassoc float [[F1]], [[TMP1]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%t1 = fadd float %f1, %f2
				%t3 = fsub reassoc float %f1, %t1
				ret float %t3
				}
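
As a rough illustration of why the `X - (X + Y) => 0 - Y` fold needs at least one fast-math flag, the Python sketch below evaluates both forms with strict rounding. It is not part of the patch: Python floats are IEEE-754 doubles rather than the `float` type used in the test, and the helper names `strict_form` and `folded_form` are made up for this example.

```python
def strict_form(x: float, y: float) -> float:
    """Evaluate X - (X + Y) as written, rounding after each operation."""
    return x - (x + y)


def folded_form(y: float) -> float:
    """Evaluate the folded expression, mirroring 'fsub -0.0, Y' in the CHECK line."""
    return -0.0 - y


# 2**60 is large enough that adding 1.0 is lost to rounding in double precision,
# so the strict form yields 0.0 while the folded form yields -1.0.
x, y = 2.0 ** 60, 1.0
print(strict_form(x, y), folded_form(y))
```

For small, exactly representable inputs the two forms agree, which is why the difference only shows up when the intermediate sum has to be rounded.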

	; Let C3 = C1 + C2. (f1 + C1) + (f2 + C2) => (f1 + f2) + C3 instead of			; Let C3 = C1 + C2. (f1 + C1) + (f2 + C2) => (f1 + f2) + C3 instead of
	; "(f1 + C3) + f2" or "(f2 + C3) + f1". Placing constant-addend at the			; "(f1 + C3) + f2" or "(f2 + C3) + f1". Placing constant-addend at the
	; top of resulting simplified expression tree may potentially reveal some			; top of resulting simplified expression tree may potentially reveal some
	; optimization opportunities in the super-expression trees.			; optimization opportunities in the super-expression trees.
	;			;
	define float @fold10(float %f1, float %f2) {			define float @fold10(float %f1, float %f2) {
	; CHECK-LABEL: @fold10(			; CHECK-LABEL: @fold10(
	; CHECK-NEXT: [[T2:%.*]] = fadd fast float [[F1:%.*]], [[F2:%.*]]			; CHECK-NEXT: [[T2:%.*]] = fadd fast float [[F1:%.*]], [[F2:%.*]]
	; CHECK-NEXT: [[T3:%.*]] = fadd fast float [[T2]], -1.000000e+00			; CHECK-NEXT: [[T3:%.*]] = fadd fast float [[T2]], -1.000000e+00
	; CHECK-NEXT: ret float [[T3]]			; CHECK-NEXT: ret float [[T3]]
	;			;
	%t1 = fadd fast float 2.000000e+00, %f1			%t1 = fadd fast float 2.000000e+00, %f1
	%t2 = fsub fast float %f2, 3.000000e+00			%t2 = fsub fast float %f2, 3.000000e+00
	%t3 = fadd fast float %t1, %t2			%t3 = fadd fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

				; Check again with 'reassoc' and 'nsz'.
				; TODO: We may be able to remove the 'nsz' requirement.
				define float @fold10_reassoc_nsz(float %f1, float %f2) {
				; CHECK-LABEL: @fold10_reassoc_nsz(
				; CHECK-NEXT: [[T2:%.*]] = fadd reassoc nsz float [[F1:%.*]], [[F2:%.*]]
				; CHECK-NEXT: [[T3:%.*]] = fadd reassoc nsz float [[T2]], -1.000000e+00
				; CHECK-NEXT: ret float [[T3]]
				;
				%t1 = fadd reassoc nsz float 2.000000e+00, %f1
				%t2 = fsub reassoc nsz float %f2, 3.000000e+00
				%t3 = fadd reassoc nsz float %t1, %t2
				ret float %t3
				}

				; Observe that the fold is not done with only reassoc (the instructions are
				; canonicalized, but not folded).
				; TODO: As noted above, 'nsz' may not be required for this to be fully folded.
				define float @fold10_reassoc(float %f1, float %f2) {
				; CHECK-LABEL: @fold10_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc float [[F1:%.*]], 2.000000e+00
				; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc float [[F2:%.*]], -3.000000e+00
				; CHECK-NEXT: [[TMP3:%.*]] = fadd reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fadd reassoc float 2.000000e+00, %f1
				%t2 = fsub reassoc float %f2, 3.000000e+00
				%t3 = fadd reassoc float %t1, %t2
				ret float %t3
				}

	; This used to crash/miscompile.			; This used to crash/miscompile.

	define float @fail1(float %f1, float %f2) {			define float @fail1(float %f1, float %f2) {
	; CHECK-LABEL: @fail1(			; CHECK-LABEL: @fail1(
	; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 3.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[F1:%.*]], 3.000000e+00
	; CHECK-NEXT: [[TMP2:%.*]] = fadd fast float [[TMP1]], -3.000000e+00			; CHECK-NEXT: [[TMP2:%.*]] = fadd fast float [[TMP1]], -3.000000e+00
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	[21 lines not shown]
	; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[X:%.*]], 6.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = fmul fast float [[X:%.*]], 6.000000e+00
	; CHECK-NEXT: ret float [[TMP1]]			; CHECK-NEXT: ret float [[TMP1]]
	;			;
	%mul = fmul fast float %x, 7.000000e+00			%mul = fmul fast float %x, 7.000000e+00
	%sub = fsub fast float %mul, %x			%sub = fsub fast float %mul, %x
	ret float %sub			ret float %sub
	}			}

				; Check again using the minimal subset of FMF.
				define float @fold13_reassoc_nsz(float %x) {
				; CHECK-LABEL: @fold13_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc nsz float [[X:%.*]], 6.000000e+00
				; CHECK-NEXT: ret float [[TMP1]]
				;
				%mul = fmul reassoc nsz float %x, 7.000000e+00
				spatel: Maybe independent of this patch, but the fmul shouldn't need any FMF for this fold, so it's worth checking if that actually happens. Another consideration is whether the fmul hasOneUse(). We definitely don't have enough multi-use tests to show that folds are not unintentionally creating extra instructions.
				wristow (author): Hmmm... You're saying that the transformation: `c1 * x - x => (c1 - 1.0) * x` shouldn't need any FMF? That doesn't seem right to me. I think both 'reassoc' and 'nsz' should be needed.
				spatel: The fsub needs FMF, but not the fmul. That's the general rule we're using in FMF-enabled folds. The non-strict final value (the result of the fsub) is what we're reassociating/factoring here, so as long as that instruction has reassoc+nsz, it doesn't matter if the fmul intermediate value is strict.
				%sub = fsub reassoc nsz float %mul, %x
				ret float %sub
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
				define float @fold13_reassoc(float %x) {
				; CHECK-LABEL: @fold13_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc float [[X:%.*]], 7.000000e+00
				; CHECK-NEXT: [[TMP2:%.*]] = fsub reassoc float [[TMP1]], [[X]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%mul = fmul reassoc float %x, 7.000000e+00
				%sub = fsub reassoc float %mul, %x
				ret float %sub
				}
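
To make the 'nsz' requirement for `c1 * x - x => (c1 - 1.0) * x` concrete, here is a small Python sketch. It is only an illustration under stated assumptions: Python doubles stand in for the IR `float` type, round-to-nearest is assumed, and `mul_then_sub` and `factored` are hypothetical helper names, not anything from InstCombine.

```python
import math


def mul_then_sub(x: float, c1: float = 7.0) -> float:
    """The original expression: c1 * x - x."""
    return c1 * x - x


def factored(x: float, c1: float = 7.0) -> float:
    """The folded expression: (c1 - 1.0) * x."""
    return (c1 - 1.0) * x


# For x = -0.0 the original yields +0.0 (since -0.0 - -0.0 is +0.0), while the
# factored form yields -0.0. copysign exposes the sign of each zero.
x = -0.0
print(math.copysign(1.0, mul_then_sub(x)), math.copysign(1.0, factored(x)))
```

The two results compare equal with `==`, so the difference is only visible to code that inspects the sign of zero, which is exactly the behavior 'nsz' allows the optimizer to ignore.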

	; -x + y => y - x			; -x + y => y - x
				; This is always safe. No FMF required.
	define float @fold14(float %x, float %y) {			define float @fold14(float %x, float %y) {
	; CHECK-LABEL: @fold14(			; CHECK-LABEL: @fold14(
	; CHECK-NEXT: [[ADD:%.*]] = fsub fast float [[Y:%.*]], [[X:%.*]]			; CHECK-NEXT: [[ADD:%.*]] = fsub float [[Y:%.*]], [[X:%.*]]
	; CHECK-NEXT: ret float [[ADD]]			; CHECK-NEXT: ret float [[ADD]]
	;			;
	%neg = fsub fast float -0.0, %x			%neg = fsub float -0.0, %x
	%add = fadd fast float %neg, %y			%add = fadd float %neg, %y
	ret float %add			ret float %add
	}			}

	; x + -y => x - y			; x + -y => x - y
				; This is always safe. No FMF required.
	define float @fold15(float %x, float %y) {			define float @fold15(float %x, float %y) {
	; CHECK-LABEL: @fold15(			; CHECK-LABEL: @fold15(
	; CHECK-NEXT: [[ADD:%.*]] = fsub fast float [[X:%.*]], [[Y:%.*]]			; CHECK-NEXT: [[ADD:%.*]] = fsub float [[X:%.*]], [[Y:%.*]]
	; CHECK-NEXT: ret float [[ADD]]			; CHECK-NEXT: ret float [[ADD]]
	;			;
	%neg = fsub fast float -0.0, %y			%neg = fsub float -0.0, %y
				spatel: -x + y --> y - x This doesn't require any FMF.
				wristow (author): Good point! In fact, it does fold without any FMF, even without my patch. I've updated the test to demonstrate the folding without 'fast' (or anything else).
				spatel: Nit: it's a bit crazy that there are no existing regression tests for these always-safe fneg folds; must be from the early cowboy compiler days. :) There really should be an 'fadd.ll' test file for non-FMF folds.
	%add = fadd fast float %x, %neg			%add = fadd float %x, %neg
	ret float %add			ret float %add
	}			}
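
The claim that `-x + y => y - x` and `x + -y => x - y` are always safe can be spot-checked with a short Python sketch. This is a sanity check over a handful of special values, not a proof; Python doubles stand in for the IR `float` type, and `same_value`, `SAMPLES`, and `fneg_folds_hold` are names invented for this example.

```python
import itertools
import math


def same_value(a: float, b: float) -> bool:
    """True if both are NaN, or if they are equal with the same sign (+0.0 != -0.0)."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


# Zeros of both signs, ordinary values, extremes, infinities, and NaN.
SAMPLES = [0.0, -0.0, 1.0, -1.0, 0.1, 1e308, -1e308, 5e-324,
           math.inf, -math.inf, math.nan]


def fneg_folds_hold(samples=SAMPLES) -> bool:
    """Check (-x) + y == y - x and x + (-y) == x - y for every sample pair."""
    return all(
        same_value((-x) + y, y - x) and same_value(x + (-y), x - y)
        for x, y in itertools.product(samples, repeat=2)
    )


print(fneg_folds_hold())
```

Unlike the reassociation folds, these rewrites do not change which values get rounded, so signed zeros, infinities, and NaN all behave the same on both sides.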

	; (select X+Y, X-Y) => X + (select Y, -Y)			; (select X+Y, X-Y) => X + (select Y, -Y)
				; This is always safe. No FMF required.
	define float @fold16(float %x, float %y) {			define float @fold16(float %x, float %y) {
	; CHECK-LABEL: @fold16(			; CHECK-LABEL: @fold16(
	; CHECK-NEXT: [[CMP:%.*]] = fcmp ogt float [[X:%.*]], [[Y:%.*]]			; CHECK-NEXT: [[CMP:%.*]] = fcmp ogt float [[X:%.*]], [[Y:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float -0.000000e+00, [[Y]]			; CHECK-NEXT: [[TMP1:%.*]] = fsub float -0.000000e+00, [[Y]]
	; CHECK-NEXT: [[R_P:%.*]] = select i1 [[CMP]], float [[Y]], float [[TMP1]]			; CHECK-NEXT: [[R_P:%.*]] = select i1 [[CMP]], float [[Y]], float [[TMP1]]
	; CHECK-NEXT: [[R:%.*]] = fadd fast float [[R_P]], [[X]]			; CHECK-NEXT: [[R:%.*]] = fadd float [[R_P]], [[X]]
				spatel: x + (-y) --> x - y Again, no FMF needed at all.
				wristow (author): As with 'fold14()' above, this does fold without any FMF, even without my patch. Test updated.
	; CHECK-NEXT: ret float [[R]]			; CHECK-NEXT: ret float [[R]]
	;			;
	%cmp = fcmp ogt float %x, %y			%cmp = fcmp ogt float %x, %y
	%plus = fadd fast float %x, %y			%plus = fadd float %x, %y
	%minus = fsub fast float %x, %y			%minus = fsub float %x, %y
	%r = select i1 %cmp, float %plus, float %minus			%r = select i1 %cmp, float %plus, float %minus
	ret float %r			ret float %r
	}			}

	; =========================================================================			; =========================================================================
	;			;
	; Testing-cases about negation			; Testing-cases about negation
	;			;
	[110 lines not shown]
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	%t1 = fmul fast float %x, %z			%t1 = fmul fast float %x, %z
	%t2 = fmul fast float %y, %z			%t2 = fmul fast float %y, %z
	%t3 = fadd fast float %t1, %t2			%t3 = fadd fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

				; Check again using the minimal subset of FMF.
				define float @fact_mul1_reassoc_nsz(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul1_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[X:%.*]], [[Y:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc nsz float [[TMP1]], [[Z:%.*]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%t1 = fmul reassoc nsz float %x, %z
				%t2 = fmul reassoc nsz float %y, %z
				%t3 = fadd reassoc nsz float %t1, %t2
				ret float %t3
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
				define float @fact_mul1_reassoc(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul1_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc float [[X:%.*]], [[Z:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc float [[Y:%.*]], [[Z]]
				; CHECK-NEXT: [[TMP3:%.*]] = fadd reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fmul reassoc float %x, %z
				%t2 = fmul reassoc float %y, %z
				%t3 = fadd reassoc float %t1, %t2
				ret float %t3
				}

	; zx - yz => (x-y) * z			; zx - yz => (x-y) * z
	define float @fact_mul2(float %x, float %y, float %z) {			define float @fact_mul2(float %x, float %y, float %z) {
	; CHECK-LABEL: @fact_mul2(			; CHECK-LABEL: @fact_mul2(
	; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[X:%.*]], [[Y:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[X:%.*]], [[Y:%.*]]
	; CHECK-NEXT: [[TMP2:%.*]] = fmul fast float [[TMP1]], [[Z:%.*]]			; CHECK-NEXT: [[TMP2:%.*]] = fmul fast float [[TMP1]], [[Z:%.*]]
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	%t1 = fmul fast float %z, %x			%t1 = fmul fast float %z, %x
	%t2 = fmul fast float %y, %z			%t2 = fmul fast float %y, %z
	%t3 = fsub fast float %t1, %t2			%t3 = fsub fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

				; Check again using the minimal subset of FMF.
				define float @fact_mul2_reassoc_nsz(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul2_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float [[X:%.*]], [[Y:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc nsz float [[TMP1]], [[Z:%.*]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%t1 = fmul reassoc nsz float %z, %x
				%t2 = fmul reassoc nsz float %y, %z
				%t3 = fsub reassoc nsz float %t1, %t2
				ret float %t3
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
				define float @fact_mul2_reassoc(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul2_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc float [[Z:%.*]], [[X:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc float [[Y:%.*]], [[Z]]
				; CHECK-NEXT: [[TMP3:%.*]] = fsub reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fmul reassoc float %z, %x
				%t2 = fmul reassoc float %y, %z
				%t3 = fsub reassoc float %t1, %t2
				ret float %t3
				}

	; zx - zy => (x-y) * z			; zx - zy => (x-y) * z
	define float @fact_mul3(float %x, float %y, float %z) {			define float @fact_mul3(float %x, float %y, float %z) {
	; CHECK-LABEL: @fact_mul3(			; CHECK-LABEL: @fact_mul3(
	; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[X:%.*]], [[Y:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[X:%.*]], [[Y:%.*]]
	; CHECK-NEXT: [[TMP2:%.*]] = fmul fast float [[TMP1]], [[Z:%.*]]			; CHECK-NEXT: [[TMP2:%.*]] = fmul fast float [[TMP1]], [[Z:%.*]]
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	%t2 = fmul fast float %z, %y			%t2 = fmul fast float %z, %y
	%t1 = fmul fast float %z, %x			%t1 = fmul fast float %z, %x
	%t3 = fsub fast float %t1, %t2			%t3 = fsub fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

				; Check again using the minimal subset of FMF.
				define float @fact_mul3_reassoc_nsz(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul3_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float [[X:%.*]], [[Y:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc nsz float [[TMP1]], [[Z:%.*]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%t2 = fmul reassoc nsz float %z, %y
				%t1 = fmul reassoc nsz float %z, %x
				%t3 = fsub reassoc nsz float %t1, %t2
				ret float %t3
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
				define float @fact_mul3_reassoc(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul3_reassoc(
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc float [[Z:%.*]], [[Y:%.*]]
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc float [[Z]], [[X:%.*]]
				; CHECK-NEXT: [[TMP3:%.*]] = fsub reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t2 = fmul reassoc float %z, %y
				%t1 = fmul reassoc float %z, %x
				%t3 = fsub reassoc float %t1, %t2
				ret float %t3
				}

	; xz - zy => (x-y) * z			; xz - zy => (x-y) * z
	define float @fact_mul4(float %x, float %y, float %z) {			define float @fact_mul4(float %x, float %y, float %z) {
	; CHECK-LABEL: @fact_mul4(			; CHECK-LABEL: @fact_mul4(
	; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[X:%.*]], [[Y:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[X:%.*]], [[Y:%.*]]
	; CHECK-NEXT: [[TMP2:%.*]] = fmul fast float [[TMP1]], [[Z:%.*]]			; CHECK-NEXT: [[TMP2:%.*]] = fmul fast float [[TMP1]], [[Z:%.*]]
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	%t1 = fmul fast float %x, %z			%t1 = fmul fast float %x, %z
	%t2 = fmul fast float %z, %y			%t2 = fmul fast float %z, %y
	%t3 = fsub fast float %t1, %t2			%t3 = fsub fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

				; Check again using the minimal subset of FMF.
				define float @fact_mul4_reassoc_nsz(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul4_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float [[X:%.*]], [[Y:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc nsz float [[TMP1]], [[Z:%.*]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%t1 = fmul reassoc nsz float %x, %z
				%t2 = fmul reassoc nsz float %z, %y
				%t3 = fsub reassoc nsz float %t1, %t2
				ret float %t3
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
				define float @fact_mul4_reassoc(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_mul4_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fmul reassoc float [[X:%.*]], [[Z:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc float [[Z]], [[Y:%.*]]
				; CHECK-NEXT: [[TMP3:%.*]] = fsub reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fmul reassoc float %x, %z
				%t2 = fmul reassoc float %z, %y
				%t3 = fsub reassoc float %t1, %t2
				ret float %t3
				}

	; x/y + x/z, no xform			; x/y + x/z, no xform
	define float @fact_div1(float %x, float %y, float %z) {			define float @fact_div1(float %x, float %y, float %z) {
	; CHECK-LABEL: @fact_div1(			; CHECK-LABEL: @fact_div1(
	; CHECK-NEXT: [[T1:%.*]] = fdiv fast float [[X:%.*]], [[Y:%.*]]			; CHECK-NEXT: [[T1:%.*]] = fdiv fast float [[X:%.*]], [[Y:%.*]]
	; CHECK-NEXT: [[T2:%.*]] = fdiv fast float [[X]], [[Z:%.*]]			; CHECK-NEXT: [[T2:%.*]] = fdiv fast float [[X]], [[Z:%.*]]
	; CHECK-NEXT: [[T3:%.*]] = fadd fast float [[T1]], [[T2]]			; CHECK-NEXT: [[T3:%.*]] = fadd fast float [[T1]], [[T2]]
	; CHECK-NEXT: ret float [[T3]]			; CHECK-NEXT: ret float [[T3]]
	;			;
	[25 lines not shown]
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	%t1 = fdiv fast float %y, %x			%t1 = fdiv fast float %y, %x
	%t2 = fdiv fast float %z, %x			%t2 = fdiv fast float %z, %x
	%t3 = fadd fast float %t1, %t2			%t3 = fadd fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

				; Check again using the minimal subset of FMF.
				define float @fact_div3_reassoc_nsz(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_div3_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[Y:%.*]], [[Z:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fdiv reassoc nsz float [[TMP1]], [[X:%.*]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%t1 = fdiv reassoc nsz float %y, %x
				%t2 = fdiv reassoc nsz float %z, %x
				%t3 = fadd reassoc nsz float %t1, %t2
				ret float %t3
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
				define float @fact_div3_reassoc(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_div3_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fdiv reassoc float [[Y:%.*]], [[X:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fdiv reassoc float [[Z:%.*]], [[X]]
				; CHECK-NEXT: [[TMP3:%.*]] = fadd reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fdiv reassoc float %y, %x
				%t2 = fdiv reassoc float %z, %x
				%t3 = fadd reassoc float %t1, %t2
				ret float %t3
				}

	; y/x - z/x => (y-z)/x			; y/x - z/x => (y-z)/x
	define float @fact_div4(float %x, float %y, float %z) {			define float @fact_div4(float %x, float %y, float %z) {
	; CHECK-LABEL: @fact_div4(			; CHECK-LABEL: @fact_div4(
	; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[Y:%.*]], [[Z:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[Y:%.*]], [[Z:%.*]]
	; CHECK-NEXT: [[TMP2:%.*]] = fdiv fast float [[TMP1]], [[X:%.*]]			; CHECK-NEXT: [[TMP2:%.*]] = fdiv fast float [[TMP1]], [[X:%.*]]
	; CHECK-NEXT: ret float [[TMP2]]			; CHECK-NEXT: ret float [[TMP2]]
	;			;
	%t1 = fdiv fast float %y, %x			%t1 = fdiv fast float %y, %x
	%t2 = fdiv fast float %z, %x			%t2 = fdiv fast float %z, %x
	%t3 = fsub fast float %t1, %t2			%t3 = fsub fast float %t1, %t2
	ret float %t3			ret float %t3
	}			}

				; Check again using the minimal subset of FMF.
				define float @fact_div4_reassoc_nsz(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_div4_reassoc_nsz(
				; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float [[Y:%.*]], [[Z:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fdiv reassoc nsz float [[TMP1]], [[X:%.*]]
				; CHECK-NEXT: ret float [[TMP2]]
				;
				%t1 = fdiv reassoc nsz float %y, %x
				%t2 = fdiv reassoc nsz float %z, %x
				%t3 = fsub reassoc nsz float %t1, %t2
				ret float %t3
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
				define float @fact_div4_reassoc(float %x, float %y, float %z) {
				; CHECK-LABEL: @fact_div4_reassoc(
				; CHECK-NEXT: [[TMP1:%.*]] = fdiv reassoc float [[Y:%.*]], [[X:%.*]]
				; CHECK-NEXT: [[TMP2:%.*]] = fdiv reassoc float [[Z:%.*]], [[X]]
				; CHECK-NEXT: [[TMP3:%.*]] = fsub reassoc float [[TMP1]], [[TMP2]]
				; CHECK-NEXT: ret float [[TMP3]]
				;
				%t1 = fdiv reassoc float %y, %x
				%t2 = fdiv reassoc float %z, %x
				%t3 = fsub reassoc float %t1, %t2
				ret float %t3
				}

	; y/x - z/x => (y-z)/x is disabled if y-z is denormal.			; y/x - z/x => (y-z)/x is disabled if y-z is denormal.
	define float @fact_div5(float %x) {			define float @fact_div5(float %x) {
	; CHECK-LABEL: @fact_div5(			; CHECK-LABEL: @fact_div5(
	; CHECK-NEXT: [[TMP1:%.*]] = fdiv fast float 0x3818000000000000, [[X:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fdiv fast float 0x3818000000000000, [[X:%.*]]
	; CHECK-NEXT: ret float [[TMP1]]			; CHECK-NEXT: ret float [[TMP1]]
	;			;
	%t1 = fdiv fast float 0x3810000000000000, %x			%t1 = fdiv fast float 0x3810000000000000, %x
	%t2 = fdiv fast float 0x3800000000000000, %x			%t2 = fdiv fast float 0x3800000000000000, %x
	[295 lines not shown]
	; CHECK-NEXT: [[TMP1:%.*]] = fcmp fast olt fp128 [[A:%.*]], [[B:%.*]]			; CHECK-NEXT: [[TMP1:%.*]] = fcmp fast olt fp128 [[A:%.*]], [[B:%.*]]
	; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], fp128 [[A]], fp128 [[B]]			; CHECK-NEXT: [[TMP2:%.*]] = select i1 [[TMP1]], fp128 [[A]], fp128 [[B]]
	; CHECK-NEXT: ret fp128 [[TMP2]]			; CHECK-NEXT: ret fp128 [[TMP2]]
	;			;
	%c = call fast fp128 @fminl(fp128 %a, fp128 %b)			%c = call fast fp128 @fminl(fp128 %a, fp128 %b)
	ret fp128 %c			ret fp128 %c
	}			}

				; ((which ? 2.0 : a) + 1.0) => (which ? 3.0 : (a + 1.0))
				; This is always safe. No FMF required.
	define float @test55(i1 %which, float %a) {			define float @test55(i1 %which, float %a) {
	; CHECK-LABEL: @test55(			; CHECK-LABEL: @test55(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br i1 [[WHICH:%.*]], label [[FINAL:%.*]], label [[DELAY:%.*]]			; CHECK-NEXT: br i1 [[WHICH:%.*]], label [[FINAL:%.*]], label [[DELAY:%.*]]
	; CHECK: delay:			; CHECK: delay:
	; CHECK-NEXT: [[PHITMP:%.*]] = fadd fast float [[A:%.*]], 1.000000e+00			; CHECK-NEXT: [[PHITMP:%.*]] = fadd float [[A:%.*]], 1.000000e+00
	; CHECK-NEXT: br label [[FINAL]]			; CHECK-NEXT: br label [[FINAL]]
	; CHECK: final:			; CHECK: final:
	; CHECK-NEXT: [[A:%.*]] = phi float [ 3.000000e+00, [[ENTRY:%.*]] ], [ [[PHITMP]], [[DELAY]] ]			; CHECK-NEXT: [[A:%.*]] = phi float [ 3.000000e+00, [[ENTRY:%.*]] ], [ [[PHITMP]], [[DELAY]] ]
	; CHECK-NEXT: ret float [[A]]			; CHECK-NEXT: ret float [[A]]
	;			;
	entry:			entry:
	br i1 %which, label %final, label %delay			br i1 %which, label %final, label %delay

	delay:			delay:
	br label %final			br label %final

	final:			final:
	%A = phi float [ 2.0, %entry ], [ %a, %delay ]			%A = phi float [ 2.0, %entry ], [ %a, %delay ]
	%value = fadd fast float %A, 1.0			%value = fadd float %A, 1.0
	ret float %value			ret float %value
	}			}

test/Transforms/Reassociate/fast-MissedTree.ll

	; RUN: opt < %s -reassociate -instcombine -S | FileCheck %s			; RUN: opt < %s -reassociate -instcombine -S | FileCheck %s

	define float @test1(float %A, float %B) {			define float @test1(float %A, float %B) {
	; CHECK-LABEL: @test1(			; CHECK-LABEL: @test1(
	; CHECK-NEXT: [[Z:%.*]] = fadd fast float %A, %B			; CHECK-NEXT: [[Z:%.*]] = fadd fast float %A, %B
	; CHECK-NEXT: ret float [[Z]]			; CHECK-NEXT: ret float [[Z]]
	;			;
	%W = fadd fast float %B, -5.0			%W = fadd fast float %B, -5.0
	%Y = fadd fast float %A, 5.0			%Y = fadd fast float %A, 5.0
	%Z = fadd fast float %W, %Y			%Z = fadd fast float %W, %Y
	ret float %Z			ret float %Z
	}			}

	; Check again using minimal subset of FMF.			; Check again using minimal subset of FMF.
				; Both 'reassoc' and 'nsz' are required.
				define float @test1_reassoc_nsz(float %A, float %B) {
				; CHECK-LABEL: @test1_reassoc_nsz(
				; CHECK-NEXT: [[Z:%.*]] = fadd reassoc nsz float %A, %B
				; CHECK-NEXT: ret float [[Z]]
				;
				%W = fadd reassoc nsz float %B, -5.0
				%Y = fadd reassoc nsz float %A, 5.0
				%Z = fadd reassoc nsz float %W, %Y
				ret float %Z
				}

				; Verify the fold is not done with only 'reassoc' ('nsz' is required).
	define float @test1_reassoc(float %A, float %B) {			define float @test1_reassoc(float %A, float %B) {
	; CHECK-LABEL: @test1_reassoc(			; CHECK-LABEL: @test1_reassoc(
	; CHECK-NEXT: [[W:%.*]] = fadd reassoc float %B, -5.000000e+00			; CHECK-NEXT: [[W:%.*]] = fadd reassoc float %B, -5.000000e+00
	; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float %A, 5.000000e+00			; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float %A, 5.000000e+00
	; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], [[W]]			; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], [[W]]
	; CHECK-NEXT: ret float [[Z]]			; CHECK-NEXT: ret float [[Z]]
	;			;
	%W = fadd reassoc float %B, -5.0			%W = fadd reassoc float %B, -5.0
	%Y = fadd reassoc float %A, 5.0			%Y = fadd reassoc float %A, 5.0
	%Z = fadd reassoc float %W, %Y			%Z = fadd reassoc float %W, %Y
	ret float %Z			ret float %Z
	}			}
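
The comments above say that both 'reassoc' and 'nsz' are required to cancel the constants in `(B + -5.0) + (A + 5.0)`. The Python sketch below shows one input for each flag where strict evaluation and the folded `A + B` disagree. It is an illustration only: Python doubles stand in for the IR `float` type, and `with_constants` and `constants_cancelled` are hypothetical names introduced here.

```python
import math


def with_constants(a: float, b: float) -> float:
    """Strict evaluation of the test input: (b + -5.0) + (a + 5.0)."""
    w = b + -5.0
    y = a + 5.0
    return w + y


def constants_cancelled(a: float, b: float) -> float:
    """The folded expression: a + b."""
    return a + b


# Rounding: 1e-30 is absorbed when added to 5.0, so the strict form loses it.
print(with_constants(1e-30, 0.0), constants_cancelled(1e-30, 0.0))

# Signed zero: the strict form yields +0.0, the folded form yields -0.0.
print(math.copysign(1.0, with_constants(-0.0, -0.0)),
      math.copysign(1.0, constants_cancelled(-0.0, -0.0)))
```

The first case is a value change that only reassociation can justify; the second is a sign-of-zero change that reassociation alone does not cover.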

test/Transforms/Reassociate/fast-SubReassociate.ll

[23 lines not shown]	;
%W = fadd fast float %B, 5.000000e+00		%W = fadd fast float %B, 5.000000e+00
%X = fadd fast float %A, -7.000000e+00		%X = fadd fast float %A, -7.000000e+00
%Y = fsub fast float %X, %W		%Y = fsub fast float %X, %W
%Z = fadd fast float %Y, 1.200000e+01		%Z = fadd fast float %Y, 1.200000e+01
ret float %Z		ret float %Z
}		}

; Check again using minimal subset of FMF.		; Check again using minimal subset of FMF.
		; Both 'reassoc' and 'nsz' are required.
		define float @test2_minimal(float %A, float %B) {
		; CHECK-LABEL: @test2_minimal(
		; CHECK-NEXT: [[Z:%.*]] = fsub reassoc nsz float %A, %B
		; CHECK-NEXT: ret float [[Z]]
		;
		%W = fadd reassoc nsz float %B, 5.000000e+00
		%X = fadd reassoc nsz float %A, -7.000000e+00
		%Y = fsub reassoc nsz float %X, %W
		%Z = fadd reassoc nsz float %Y, 1.200000e+01
		ret float %Z
		}

		; Verify the fold is not done with only 'reassoc' ('nsz' is required).
define float @test2_reassoc(float %A, float %B) {		define float @test2_reassoc(float %A, float %B) {
; CHECK-LABEL: @test2_reassoc(		; CHECK-LABEL: @test2_reassoc(
; CHECK-NEXT: [[W:%.*]] = fadd reassoc float %B, 5.000000e+00		; CHECK-NEXT: [[W:%.*]] = fadd reassoc float %B, 5.000000e+00
; CHECK-NEXT: [[X:%.*]] = fadd reassoc float %A, -7.000000e+00		; CHECK-NEXT: [[X:%.*]] = fadd reassoc float %A, -7.000000e+00
; CHECK-NEXT: [[Y:%.*]] = fsub reassoc float [[X]], [[W]]		; CHECK-NEXT: [[Y:%.*]] = fsub reassoc float [[X]], [[W]]
; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], 1.200000e+01		; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], 1.200000e+01
; CHECK-NEXT: ret float [[Z]]		; CHECK-NEXT: ret float [[Z]]
;		;
▲ Show 20 Lines • Show All 66 Lines • Show Last 20 Lines

test/Transforms/Reassociate/fast-basictest.ll

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt < %s -reassociate -gvn -instcombine -S | FileCheck %s		; RUN: opt < %s -reassociate -gvn -instcombine -S | FileCheck %s

; With reassociation, constant folding can eliminate the 12 and -12 constants.		; With reassociation, constant folding can eliminate the 12 and -12 constants.
define float @test1(float %arg) {		define float @test1(float %arg) {
; CHECK-LABEL: @test1(		; CHECK-LABEL: @test1(
; CHECK-NEXT: [[ARG_NEG:%.*]] = fsub fast float -0.000000e+00, [[ARG:%.*]]		; CHECK-NEXT: [[ARG_NEG:%.*]] = fsub fast float -0.000000e+00, [[ARG:%.*]]
; CHECK-NEXT: ret float [[ARG_NEG]]		; CHECK-NEXT: ret float [[ARG_NEG]]
;		;
%t1 = fsub fast float -1.200000e+01, %arg		%t1 = fsub fast float -1.200000e+01, %arg
%t2 = fadd fast float %t1, 1.200000e+01		%t2 = fadd fast float %t1, 1.200000e+01
ret float %t2		ret float %t2
}		}

		; Check again using the minimal subset of FMF.
		; Both 'reassoc' and 'nsz' are required.
		define float @test1_minimal(float %arg) {
		; CHECK-LABEL: @test1_minimal(
		; CHECK-NEXT: [[ARG_NEG:%.*]] = fsub reassoc nsz float -0.000000e+00, [[ARG:%.*]]
		; CHECK-NEXT: ret float [[ARG_NEG]]
		;
		%t1 = fsub reassoc nsz float -1.200000e+01, %arg
		%t2 = fadd reassoc nsz float %t1, 1.200000e+01
		ret float %t2
		}

		; Verify the fold is not done with only 'reassoc' ('nsz' is required).
define float @test1_reassoc(float %arg) {		define float @test1_reassoc(float %arg) {
; CHECK-LABEL: @test1_reassoc(		; CHECK-LABEL: @test1_reassoc(
; CHECK-NEXT: [[T1:%.*]] = fsub reassoc float -1.200000e+01, [[ARG:%.*]]		; CHECK-NEXT: [[T1:%.*]] = fsub reassoc float -1.200000e+01, [[ARG:%.*]]
; CHECK-NEXT: [[T2:%.*]] = fadd reassoc float [[T1]], 1.200000e+01		; CHECK-NEXT: [[T2:%.*]] = fadd reassoc float [[T1]], 1.200000e+01
; CHECK-NEXT: ret float [[T2]]		; CHECK-NEXT: ret float [[T2]]
;		;
%t1 = fsub reassoc float -1.200000e+01, %arg		%t1 = fsub reassoc float -1.200000e+01, %arg
%t2 = fadd reassoc float %t1, 1.200000e+01		%t2 = fadd reassoc float %t1, 1.200000e+01
▲ Show 20 Lines • Show All 182 Lines • ▼ Show 20 Lines
; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[X:%.*]], 9.400000e+01		; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[X:%.*]], 9.400000e+01
; CHECK-NEXT: ret float [[FACTOR]]		; CHECK-NEXT: ret float [[FACTOR]]
;		;
%Y = fmul fast float %X, 4.700000e+01		%Y = fmul fast float %X, 4.700000e+01
%Z = fadd fast float %Y, %Y		%Z = fadd fast float %Y, %Y
ret float %Z		ret float %Z
}		}

		; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
		define float @test9_reassoc_nsz(float %X) {
		; CHECK-LABEL: @test9_reassoc_nsz(
		; CHECK-NEXT: [[FACTOR:%.*]] = fmul reassoc nsz float [[X:%.*]], 9.400000e+01
		; CHECK-NEXT: ret float [[FACTOR]]
		;
		%Y = fmul reassoc nsz float %X, 4.700000e+01
		%Z = fadd reassoc nsz float %Y, %Y
		ret float %Z
		}

		; TODO: This doesn't require 'nsz'. It should fold to X * 94.0
define float @test9_reassoc(float %X) {		define float @test9_reassoc(float %X) {
; CHECK-LABEL: @test9_reassoc(		; CHECK-LABEL: @test9_reassoc(
; CHECK-NEXT: [[Y:%.*]] = fmul reassoc float [[X:%.*]], 4.700000e+01		; CHECK-NEXT: [[Y:%.*]] = fmul reassoc float [[X:%.*]], 4.700000e+01
; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], [[Y]]		; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], [[Y]]
; CHECK-NEXT: ret float [[Z]]		; CHECK-NEXT: ret float [[Z]]
;		;
%Y = fmul reassoc float %X, 4.700000e+01		%Y = fmul reassoc float %X, 4.700000e+01
%Z = fadd reassoc float %Y, %Y		%Z = fadd reassoc float %Y, %Y
ret float %Z		ret float %Z
}		}

		; Side note: (x + x + x) and (3*x) each have only a single rounding. So
		; transforming x+x+x to 3*x is always safe, even without any FMF.
		; To avoid that special-case, we have the addition of 'x' four times, here.
define float @test10(float %X) {		define float @test10(float %X) {
; CHECK-LABEL: @test10(		; CHECK-LABEL: @test10(
; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[X:%.*]], 3.000000e+00		; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[X:%.*]], 4.000000e+00
; CHECK-NEXT: ret float [[FACTOR]]		; CHECK-NEXT: ret float [[FACTOR]]
;		;
%Y = fadd fast float %X ,%X		%Y = fadd fast float %X ,%X
%Z = fadd fast float %Y, %X		%Z = fadd fast float %Y, %X
ret float %Z		%W = fadd fast float %Z, %X
		ret float %W
		}

		; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
		define float @test10_reassoc_nsz(float %X) {
		; CHECK-LABEL: @test10_reassoc_nsz(
		; CHECK-NEXT: [[FACTOR:%.*]] = fmul reassoc nsz float [[X:%.*]], 4.000000e+00
		; CHECK-NEXT: ret float [[FACTOR]]
		;
		%Y = fadd reassoc nsz float %X ,%X
		%Z = fadd reassoc nsz float %Y, %X
		%W = fadd reassoc nsz float %Z, %X
		ret float %W
}		}

		; TODO: This doesn't require 'nsz'. It should fold to 4 * x
define float @test10_reassoc(float %X) {		define float @test10_reassoc(float %X) {
; CHECK-LABEL: @test10_reassoc(		; CHECK-LABEL: @test10_reassoc(
; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float [[X:%.*]], [[X]]		; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float [[X:%.*]], [[X]]
; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], [[X]]		; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[Y]], [[X]]
; CHECK-NEXT: ret float [[Z]]		; CHECK-NEXT: [[W:%.*]] = fadd reassoc float [[Z]], [[X]]
		; CHECK-NEXT: ret float [[W]]
;		;
%Y = fadd reassoc float %X ,%X		%Y = fadd reassoc float %X ,%X
%Z = fadd reassoc float %Y, %X		%Z = fadd reassoc float %Y, %X
ret float %Z		%W = fadd reassoc float %Z, %X
		ret float %W
}		}

define float @test11(float %W) {		define float @test11(float %W) {
; CHECK-LABEL: @test11(		; CHECK-LABEL: @test11(
; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[W:%.*]], 3.810000e+02		; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[W:%.*]], 3.810000e+02
; CHECK-NEXT: ret float [[FACTOR]]		; CHECK-NEXT: ret float [[FACTOR]]
;		;
%X = fmul fast float %W, 127.0		%X = fmul fast float %W, 127.0
%Y = fadd fast float %X ,%X		%Y = fadd fast float %X ,%X
%Z = fadd fast float %Y, %X		%Z = fadd fast float %Y, %X
ret float %Z		ret float %Z
}		}

		; Check again using the minimal subset of FMF.
		; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
		define float @test11_reassoc_nsz(float %W) {
		; CHECK-LABEL: @test11_reassoc_nsz(
		; CHECK-NEXT: [[FACTOR:%.*]] = fmul reassoc nsz float [[W:%.*]], 3.810000e+02
		; CHECK-NEXT: ret float [[FACTOR]]
		;
		%X = fmul reassoc nsz float %W, 127.0
		%Y = fadd reassoc nsz float %X ,%X
		%Z = fadd reassoc nsz float %Y, %X
		ret float %Z
		}

		; TODO: This doesn't require 'nsz'. It should fold to W*381.0.
define float @test11_reassoc(float %W) {		define float @test11_reassoc(float %W) {
; CHECK-LABEL: @test11_reassoc(		; CHECK-LABEL: @test11_reassoc(
; CHECK-NEXT: [[X:%.*]] = fmul reassoc float [[W:%.*]], 1.270000e+02		; CHECK-NEXT: [[X:%.*]] = fmul reassoc float [[W:%.*]], 1.270000e+02
; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float [[X]], [[X]]		; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float [[X]], [[X]]
; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[X]], [[Y]]		; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[X]], [[Y]]
; CHECK-NEXT: ret float [[Z]]		; CHECK-NEXT: ret float [[Z]]
;		;
%X = fmul reassoc float %W, 127.0		%X = fmul reassoc float %W, 127.0
Show All 11 Lines	;
%A = fsub fast float 1.000000e+00, %X		%A = fsub fast float 1.000000e+00, %X
%B = fsub fast float 2.000000e+00, %X		%B = fsub fast float 2.000000e+00, %X
%C = fsub fast float 3.000000e+00, %X		%C = fsub fast float 3.000000e+00, %X
%Y = fadd fast float %A ,%B		%Y = fadd fast float %A ,%B
%Z = fadd fast float %Y, %C		%Z = fadd fast float %Y, %C
ret float %Z		ret float %Z
}		}

		; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
		define float @test12_reassoc_nsz(float %X) {
		; CHECK-LABEL: @test12_reassoc_nsz(
		; CHECK-NEXT: [[FACTOR:%.*]] = fmul reassoc nsz float [[X:%.*]], 3.000000e+00
		; CHECK-NEXT: [[Z:%.*]] = fsub reassoc nsz float 6.000000e+00, [[FACTOR]]
		; CHECK-NEXT: ret float [[Z]]
		;
		%A = fsub reassoc nsz float 1.000000e+00, %X
		%B = fsub reassoc nsz float 2.000000e+00, %X
		%C = fsub reassoc nsz float 3.000000e+00, %X
		%Y = fadd reassoc nsz float %A ,%B
		%Z = fadd reassoc nsz float %Y, %C
		ret float %Z
		}

		; TODO: This doesn't require 'nsz'. It should fold to (6.0 - 3.0*x)
define float @test12_reassoc(float %X) {		define float @test12_reassoc(float %X) {
; CHECK-LABEL: @test12_reassoc(		; CHECK-LABEL: @test12_reassoc(
; CHECK-NEXT: [[A:%.*]] = fsub reassoc float 1.000000e+00, [[X:%.*]]		; CHECK-NEXT: [[A:%.*]] = fsub reassoc float 1.000000e+00, [[X:%.*]]
; CHECK-NEXT: [[B:%.*]] = fsub reassoc float 2.000000e+00, [[X]]		; CHECK-NEXT: [[B:%.*]] = fsub reassoc float 2.000000e+00, [[X]]
; CHECK-NEXT: [[C:%.*]] = fsub reassoc float 3.000000e+00, [[X]]		; CHECK-NEXT: [[C:%.*]] = fsub reassoc float 3.000000e+00, [[X]]
; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float [[A]], [[B]]		; CHECK-NEXT: [[Y:%.*]] = fadd reassoc float [[A]], [[B]]
; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[C]], [[Y]]		; CHECK-NEXT: [[Z:%.*]] = fadd reassoc float [[C]], [[Y]]
; CHECK-NEXT: ret float [[Z]]		; CHECK-NEXT: ret float [[Z]]
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines
; CHECK-NEXT: ret float [[TMP2]]		; CHECK-NEXT: ret float [[TMP2]]
;		;
%B = fmul fast float %X1, 47. ; X1*47		%B = fmul fast float %X1, 47. ; X1*47
%C = fmul fast float %X2, -47. ; X2*-47		%C = fmul fast float %X2, -47. ; X2*-47
%D = fadd fast float %B, %C ; X1*47 + X2*-47 -> 47*(X1-X2)		%D = fadd fast float %B, %C ; X1*47 + X2*-47 -> 47*(X1-X2)
ret float %D		ret float %D
}		}

		; (x1 * 47) + (x2 * -47) => (x1 - x2) * 47
		; Check again with 'reassoc' and 'nsz' ('nsz' not technically required).
		define float @test14_reassoc_nsz(float %X1, float %X2) {
		; CHECK-LABEL: @test14_reassoc_nsz(
		; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float [[X1:%.*]], [[X2:%.*]]
		; CHECK-NEXT: [[TMP2:%.*]] = fmul reassoc nsz float [[TMP1]], 4.700000e+01
		; CHECK-NEXT: ret float [[TMP2]]
		;
		%B = fmul reassoc nsz float %X1, 47. ; X1*47
		%C = fmul reassoc nsz float %X2, -47. ; X2*-47
		%D = fadd reassoc nsz float %B, %C ; X1*47 + X2*-47 -> 47*(X1-X2)
		ret float %D
		}

		; TODO: This doesn't require 'nsz'. It should fold to ((x1 - x2) * 47.0)
define float @test14_reassoc(float %X1, float %X2) {		define float @test14_reassoc(float %X1, float %X2) {
; CHECK-LABEL: @test14_reassoc(		; CHECK-LABEL: @test14_reassoc(
; CHECK-NEXT: [[B:%.*]] = fmul reassoc float [[X1:%.*]], 4.700000e+01		; CHECK-NEXT: [[B:%.*]] = fmul reassoc float [[X1:%.*]], 4.700000e+01
; CHECK-NEXT: [[C:%.*]] = fmul reassoc float [[X2:%.*]], 4.700000e+01		; CHECK-NEXT: [[C:%.*]] = fmul reassoc float [[X2:%.*]], 4.700000e+01
; CHECK-NEXT: [[D1:%.*]] = fsub reassoc float [[B]], [[C]]		; CHECK-NEXT: [[D1:%.*]] = fsub reassoc float [[B]], [[C]]
; CHECK-NEXT: ret float [[D1]]		; CHECK-NEXT: ret float [[D1]]
;		;
%B = fmul reassoc float %X1, 47. ; X1*47		%B = fmul reassoc float %X1, 47. ; X1*47
▲ Show 20 Lines • Show All 170 Lines • Show Last 20 Lines