This is an archive of the discontinued LLVM Phabricator instance.

Differential D129523

[Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'
ClosedPublic

Authored by wristow on Jul 11 2022, 4:14 PM.

Details

Reviewers

spatel
RKSimon
lebedev.ri

Commits

rGc6507930493b: [Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'

Summary

[Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'

Compiling with '-ffast-math' turns on all the FastMathFlags (FMF), as
expected, and that enables FP reassociation. Only the two FMF flags
'reassoc' and 'nsz' are technically required to perform reassociation,
but disabling other unrelated FMF bits is needlessly suppressing the
optimization.

This patch fixes that needless suppression, and makes appropriate
adjustments to test-cases, fixing some outstanding TODOs in the process.

Fixes: #56483
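
As a rough illustration of why these two flags are the relevant ones (this is not code from the patch, and the function names are made up for the example): regrouping a test15-style expression changes the rounded result, which is what 'reassoc' permits, and treating `0.0 - x` as a negation changes the sign of a zero result, which is what 'nsz' permits. The sketch assumes IEEE-754 single precision and a build without fast-math options.

```cpp
#include <cassert>
#include <cmath>

// Evaluates (b + (a + 1234)) + (0 - a) in source order, rounding each
// intermediate to float (the volatile temporaries force that).
float sourceOrder(float a, float b) {
  assert(std::isfinite(a) && std::isfinite(b));
  volatile float t1 = a + 1234.0f;
  volatile float t2 = b + t1;
  volatile float t3 = 0.0f - a;
  return t2 + t3;
}

// The regrouped form: a and -a cancel first, leaving b + 1234.
// Only valid when reassociation is allowed.
float reassociated(float /*a*/, float b) { return b + 1234.0f; }

// 0.0 - x and -x agree except for the sign of zero when x is +0.0.
float subFromZero(float x) { return 0.0f - x; }
float negate(float x) { return -x; }
```

With `a = 1.0e30f` and `b = 1.0f`, the source-order form loses both small terms to rounding, while the regrouped form keeps them, so the two disagree. Likewise `subFromZero(0.0f)` is `+0.0` while `negate(0.0f)` is `-0.0`.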

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wristow created this revision. · Jul 11 2022, 4:14 PM

Herald added a project: Restricted Project. · Jul 11 2022, 4:14 PM

Herald added a subscriber: hiraditya.

wristow requested review of this revision. · Jul 11 2022, 4:14 PM

Assuming this approach is sensible, I'm raising the question of whether this patch should be split into two.

Specifically, in implementing the fix, I found cases where opt produced less efficient code with my fix than without it. I ultimately noticed that a similar inefficiency already existed even when the full set of fast flags was enabled, and there were some TODOs suggesting that should be investigated. See, for example, test7() in "llvm/test/Transforms/Reassociate/fast-basictest.ll". So in fixing this, I made some changes to deal with those inefficiencies.

Splitting this patch into two (one to deal with those inefficiencies, and one to deal with the needless suppression of reassociation) seems reasonable. But regardless, I wanted to show the entire patch here, to give the full context of the fix.

Harbormaster completed remote builds in B174765: Diff 443783. · Jul 11 2022, 4:52 PM

nikic added a subscriber: nikic. · Jul 12 2022, 1:04 AM

nikic added inline comments.

llvm/lib/Transforms/Scalar/Reassociate.cpp
151	Just `I->isAssociative()` is sufficient, `isFast()` implies `isAssociative()`.

I support the direction. IIRC, fixing FMF requirements for -reassociate came up during similar patches that were made to fix -instcombine like D47335. This just got lost in my bug queue.

Using I->isAssociative() directly seems like the right change rather than adding a wrapper (agree with inline comment from @nikic). Any uses of isFast() in IR are bogus at this point (that API should be deprecated) - there is no transform that requires all of the individual FMF.

If we can fix the regressions as an independent patch, that would be ideal.

Thanks for the quick feedback!

Using I->isAssociative() directly seems like the right change rather than adding a wrapper (agree with inline comment from @nikic). Any uses of isFast() in IR are bogus at this point (that API should be deprecated) - there is no transform that requires all of the individual FMF.

I'll update the check to look for hasAllowReassoc() && hasNoSignedZeros() rather than using isFast() at all, and rather than invoking isAssociative() (described in more detail in my inline comment).

If we can fix the regressions as an independent patch, that would be ideal.

I'll split this into two independent patches.

llvm/lib/Transforms/Scalar/Reassociate.cpp
151	Actually that is what I tried in my first attempt, but it didn't work. :) The problem is that there are operators that are not associative (and so `isAssociative()` returns `false`), but some of those operators can be transformed into something associative, and the Reassociation Pass then nicely handles them. For example, subtraction isn't associative, but if we have:

    (0.0-X)*Y + Z

then this can (arithmetically) be transformed into:

    Z - X*Y

But that transformation was suppressed when `isAssociative()` alone was used, because it returned `false`. So for example, code like the following:

    %A = fsub fast float 0.0, %X
    %B = fmul fast float %A, %Y
    %C = fadd fast float %B, %Z

was no longer optimized when I only used `isAssociative()`.

But as I write this, I realize that including `isFast()` isn't the right solution. What I should have done is check for `reassoc` and `nsz`: `hasAllowReassoc() && hasNoSignedZeros()` rather than `isFast()`. I'll change that. That will optimize more cases.
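
The three predicates discussed in this exchange can be modeled outside LLVM. The sketch below is a simplified stand-in, not the real `Instruction` API: the `Inst` struct, the `Opcode` enum, and the `AllOtherFMF` field are invented for illustration, and only a handful of opcodes are modeled. `isAssociative()` follows the behaviour described in this thread (opcode only for integer operations; opcode plus `reassoc` and `nsz` for FP operations), and `hasFPAssociativeFlags()` mirrors the flag check the patch ends up using.

```cpp
#include <cassert>

// Hypothetical, heavily reduced model of an instruction.
enum class Opcode { Add, FAdd, FSub, FMul };

struct Inst {
  Opcode Op;
  bool Reassoc;     // 'reassoc' flag
  bool Nsz;         // 'nsz' flag
  bool AllOtherFMF; // stands in for every remaining fast-math flag
};

bool isFPOpcode(Opcode Op) { return Op != Opcode::Add; }

// Flag-only test: true for any FP op carrying reassoc and nsz,
// including fsub.
bool hasFPAssociativeFlags(const Inst &I) {
  assert(isFPOpcode(I.Op) && "Should only check FP ops");
  return I.Reassoc && I.Nsz;
}

// Opcode-sensitive test: fsub is never associative, whatever its flags.
bool isAssociative(const Inst &I) {
  switch (I.Op) {
  case Opcode::Add:
    return true;
  case Opcode::FAdd:
  case Opcode::FMul:
    return I.Reassoc && I.Nsz;
  case Opcode::FSub:
    return false;
  }
  return false;
}

// Models isFast(): every fast-math flag must be set.
bool isFast(const Inst &I) { return I.Reassoc && I.Nsz && I.AllOtherFMF; }
```

In this model an `fsub reassoc nsz` is rejected by both `isAssociative()` and `isFast()` but accepted by `hasFPAssociativeFlags()`, which is the case the pass needs in order to rewrite `(0.0-X)*Y + Z`.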

I'll split this into two independent patches.

I've split off the miscellaneous cleanup into a separate patch: D129612. Once that wraps up, I'll update the patch here to just be the improved optimization with reassoc and nsz.

craig.topper added a subscriber: craig.topper. · Jul 12 2022, 9:31 PM

craig.topper added inline comments.

llvm/lib/Transforms/Scalar/Reassociate.cpp
151	I think there was some confusion about what `isAssociative` is. It's based on opcode not the fast math flags. It's a property like isCommutative.

wristow added inline comments. · Jul 12 2022, 10:27 PM

llvm/lib/Transforms/Scalar/Reassociate.cpp
151	To elaborate, for the `Instruction` method `I->isAssociative()`, it's based purely on the opcode for integer-typed operations; but for floating-point operations, it's based on both the opcode and whether or not `reassoc` and `nsz` are set.

craig.topper added inline comments. · Jul 12 2022, 10:36 PM

llvm/lib/Transforms/Scalar/Reassociate.cpp
151	Oops. I missed that. Thanks for clarifying!

Update to:

Remove the cleanup that has been done separately (and committed as 230c8c56f21cfe4e23a24793f3137add1af1d1f0).
Check only for reassoc and nsz (rather than isFast()).

wristow marked 3 inline comments as done. · Jul 14 2022, 10:11 PM

Harbormaster completed remote builds in B175564: Diff 444878. · Jul 14 2022, 11:03 PM

This is a straightforward translation of the existing FMF checks, so LGTM.
I added some inline comments for potential cleanups that could be done as NFC commits either before or after.

llvm/lib/Transforms/Scalar/Reassociate.cpp
145–147	It would be good to add a line to the code comment to state the non-obvious bit that came up in this review. Something like this:

    /// This is not the same as testing Instruction::isAssociative() because it includes
    /// operations like fsub.

157	Existing issue: this is awkward - we're dyn_casting to Instruction, but then we nakedly cast to BinaryOperator on the return. Could this cast to BO directly? There might be some subtlety with binop constant expressions that needs to be addressed.
2230	Existing issue: checking the type isn't consistent with the other diffs that check if the value is an FPMathOp. I'm not sure what, if any, functional difference it would make to the pass, but it seems wrong that these aren't using the same condition.

This revision is now accepted and ready to land. · Jul 15 2022, 8:35 AM

In D129523#3655223, @spatel wrote:

This is a straightforward translation of the existing FMF checks, so LGTM.
I added some inline comments for potential cleanups that could be done as NFC commits either before or after.

Thanks for the review @spatel. I'll update the comment and commit. And I'll revisit the suggested cleanups a bit later.

llvm/lib/Transforms/Scalar/Reassociate.cpp
145–147	Good idea Sanjay. I'll do that before committing.
157	Good point. I'll be out of the office for a week, and I don't want to rush into this change (but I do want to get the main fix done now). So I'll revisit this point when I return.
2230	Another good point. Again, I'll revisit this point when I return.

This revision was landed with ongoing or failed builds. · Jul 15 2022, 11:45 AM

Closed by commit rGc6507930493b: [Reassociate] Enable FP reassociation via 'reassoc' and 'nsz' (authored by wristow).

This revision was automatically updated to reflect the committed changes.

wristow added a commit: rGc6507930493b: [Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'.

wristow mentioned this in D130408: [Reassociate][NFC] Consistent checking if FastMathFlags are suitable. · Jul 22 2022, 7:39 PM

wristow mentioned this in rG3089b411a465: [Reassociate][NFC] Consistent checking for FastMathFlags suitability. · Jul 24 2022, 5:46 PM

wristow mentioned this in D130448: [Reassociate][NFC] Use an appropriate `dyn_cast` for `BinaryOperator`. · Jul 24 2022, 6:42 PM

wristow mentioned this in rG3bbd380a5b51: [Reassociate][NFC] Use an appropriate dyn_cast for BinaryOperator. · Jul 25 2022, 10:26 AM

Revision Contents

Path	Size
llvm/lib/Transforms/Scalar/Reassociate.cpp	20 lines
llvm/test/Transforms/PhaseOrdering/fast-basictest.ll	29 lines
llvm/test/Transforms/Reassociate/fast-basictest.ll	26 lines

Diff 445083

llvm/lib/Transforms/Scalar/Reassociate.cpp

@@ ... @@ XorOpnd::XorOpnd(Value *V) {
   }

   // view the operand as "V | 0"
   SymbolicPart = V;
   ConstPart = APInt::getZero(V->getType()->getScalarSizeInBits());
   isOr = true;
 }

+/// Return true if I is an instruction with the FastMathFlags that are needed
+/// for general reassociation set. This is not the same as testing
+/// Instruction::isAssociative() because it includes operations like fsub.
+/// (This routine is only intended to be called for floating-point operations.)
+static bool hasFPAssociativeFlags(Instruction *I) {
+  assert(I && I->getType()->isFPOrFPVectorTy() && "Should only check FP ops");
+  return I->hasAllowReassoc() && I->hasNoSignedZeros();
+}
+
 /// Return true if V is an instruction of the specified opcode and if it
 /// only has one use.
 static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode) {
   auto *I = dyn_cast<Instruction>(V);
   if (I && I->hasOneUse() && I->getOpcode() == Opcode)
-    if (!isa<FPMathOperator>(I) || I->isFast())
+    if (!isa<FPMathOperator>(I) || hasFPAssociativeFlags(I))
       return cast<BinaryOperator>(I);
   return nullptr;
 }

 static BinaryOperator *isReassociableOp(Value *V, unsigned Opcode1,
                                         unsigned Opcode2) {
   auto *I = dyn_cast<Instruction>(V);
   if (I && I->hasOneUse() &&
       (I->getOpcode() == Opcode1 || I->getOpcode() == Opcode2))
-    if (!isa<FPMathOperator>(I) || I->isFast())
+    if (!isa<FPMathOperator>(I) || hasFPAssociativeFlags(I))
       return cast<BinaryOperator>(I);
   return nullptr;
 }

 void ReassociatePass::BuildRankMap(Function &F,
                                    ReversePostOrderTraversal<Function*> &RPOT) {
   unsigned Rank = 2;

@@ ... @@ #endif

     // At this point we have a value which, first of all, is not a binary
     // expression of the right kind, and secondly, is only used inside the
     // expression. This means that it can safely be modified. See if we
     // can usefully morph it into an expression of the right kind.
     assert((!isa<Instruction>(Op) ||
             cast<Instruction>(Op)->getOpcode() != Opcode
             || (isa<FPMathOperator>(Op) &&
-                !cast<Instruction>(Op)->isFast())) &&
+                !hasFPAssociativeFlags(cast<Instruction>(Op)))) &&
            "Should have been handled above!");
     assert(Op->hasOneUse() && "Has uses outside the expression tree!");

     // If this is a multiply expression, turn any internal negations into
     // multiplies by -1 so they can be reassociated. Add any users of the
     // newly created multiplication by -1 to the redo list, so any
     // reassociation opportunities that are exposed will be reassociated
     // further.
@@ ... @@ void ReassociatePass::OptimizeInst(Instruction *I) {
   // transformations simpler.
   if (I->isCommutative())
     canonicalizeOperands(I);

   // Canonicalize negative constants out of expressions.
   if (Instruction *Res = canonicalizeNegFPConstants(I))
     I = Res;

-  // Don't optimize floating-point instructions unless they are 'fast'.
-  if (I->getType()->isFPOrFPVectorTy() && !I->isFast())
+  // Don't optimize floating-point instructions unless they have the
+  // appropriate FastMathFlags for reassociation enabled.
+  if (I->getType()->isFPOrFPVectorTy() && !hasFPAssociativeFlags(I))
     return;

   // Do not reassociate boolean (i1) expressions. We want to preserve the
   // original order of evaluation for short-circuited comparisons that
   // SimplifyCFG has folded to AND/OR expressions. If the expression
   // is not further optimized, it is likely to be transformed back to a
   // short-circuited form for code gen, and the source order may have been
   // optimized for the most likely conditions.

llvm/test/Transforms/PhaseOrdering/fast-basictest.ll

@@ ... @@
 ;
   %1 = fadd fast float %a, 1234.0
   %2 = fadd fast float %b, %1
   %3 = fneg fast float %a
   %4 = fadd fast float %2, %3
   ret float %4
 }

-; TODO: check if it is possible to perform the optimization without 'fast'
-; with 'reassoc' and 'nsz' only.
 define float @test15_reassoc_nsz(float %b, float %a) {
 ; CHECK-LABEL: @test15_reassoc_nsz(
-; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[A:%.*]], 1.234000e+03
-; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc nsz float [[TMP1]], [[B:%.*]]
-; CHECK-NEXT: [[TMP3:%.*]] = fsub reassoc nsz float [[TMP2]], [[A]]
-; CHECK-NEXT: ret float [[TMP3]]
+; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[B:%.*]], 1.234000e+03
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %1 = fadd reassoc nsz float %a, 1234.0
   %2 = fadd reassoc nsz float %b, %1
   %3 = fsub reassoc nsz float 0.0, %a
   %4 = fadd reassoc nsz float %2, %3
   ret float %4
 }

@@ ... @@
 ;
   %c = fneg fast float %z
   %d = fmul fast float %a, %b
   %e = fmul fast float %c, %d
   %f = fmul fast float %e, 1.234500e+04
   %g = fneg fast float %f
   ret float %g
 }

-; TODO: check if it is possible to perform the optimization without 'fast'
-; with 'reassoc' and 'nsz' only.
 define float @test16_reassoc_nsz(float %a, float %b, float %z) {
-; CHECK-LABEL: @test16_reassoc_nsz(
-; CHECK-NEXT: [[C:%.*]] = fneg reassoc nsz float [[Z:%.*]]
-; CHECK-NEXT: [[D:%.*]] = fmul reassoc nsz float [[A:%.*]], [[B:%.*]]
-; CHECK-NEXT: [[E:%.*]] = fmul reassoc nsz float [[D]], [[C]]
-; CHECK-NEXT: [[G:%.*]] = fmul reassoc nsz float [[E]], -1.234500e+04
-; CHECK-NEXT: ret float [[G]]
+; REASSOC_AND_IC-LABEL: @test16_reassoc_nsz(
+; REASSOC_AND_IC-NEXT: [[C:%.*]] = fmul reassoc nsz float [[A:%.*]], 1.234500e+04
+; REASSOC_AND_IC-NEXT: [[E:%.*]] = fmul reassoc nsz float [[C]], [[B:%.*]]
+; REASSOC_AND_IC-NEXT: [[F:%.*]] = fmul reassoc nsz float [[E]], [[Z:%.*]]
+; REASSOC_AND_IC-NEXT: ret float [[F]]
+;
+; O2-LABEL: @test16_reassoc_nsz(
+; O2-NEXT: [[D:%.*]] = fmul reassoc nsz float [[A:%.*]], 1.234500e+04
+; O2-NEXT: [[E:%.*]] = fmul reassoc nsz float [[D]], [[B:%.*]]
+; O2-NEXT: [[G:%.*]] = fmul reassoc nsz float [[E]], [[Z:%.*]]
+; O2-NEXT: ret float [[G]]
 ;
   %c = fsub reassoc nsz float 0.000000e+00, %z
   %d = fmul reassoc nsz float %a, %b
   %e = fmul reassoc nsz float %c, %d
   %f = fmul reassoc nsz float %e, 1.234500e+04
   %g = fsub reassoc nsz float 0.000000e+00, %f
   ret float %g
 }
@@ ... @@
 ;
   %t3 = fsub fast float %a, %b
   %t5 = fsub fast float %t3, %c
   %t7 = fsub fast float %t5, %a
   ret float %t7
 }

 define float @test19_reassoc_nsz(float %a, float %b, float %c) nounwind {
 ; CHECK-LABEL: @test19_reassoc_nsz(
-; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[B:%.*]], [[C:%.*]]
+; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[C:%.*]], [[B:%.*]]
 ; CHECK-NEXT: [[T7:%.*]] = fneg reassoc nsz float [[TMP1]]
 ; CHECK-NEXT: ret float [[T7]]
 ;
   %t3 = fsub reassoc nsz float %a, %b
   %t5 = fsub reassoc nsz float %t3, %c
   %t7 = fsub reassoc nsz float %t5, %a
   ret float %t7
 }

llvm/test/Transforms/Reassociate/fast-basictest.ll

@@ ... @@
 ;
   %aab = fmul reassoc float %aa, %B
   %ac = fmul reassoc float %A, %C
   %aac = fmul reassoc float %ac, %A
   %r = fadd reassoc float %aab, %aac
   ret float %r
 }

 ; (-X)*Y + Z -> Z-X*Y

 define float @test7(float %X, float %Y, float %Z) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT: [[B:%.*]] = fmul fast float [[Y:%.*]], [[X:%.*]]
 ; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[Z:%.*]], [[B]]
 ; CHECK-NEXT: ret float [[TMP1]]
 ;
   %A = fsub fast float 0.0, %X
   %B = fmul fast float %A, %Y
@@ ... @@
 ;
   %A = fneg fast float %X
   %B = fmul fast float %A, %Y
   %C = fadd fast float %B, %Z
   ret float %C
 }

 define float @test7_reassoc_nsz(float %X, float %Y, float %Z) {
 ; CHECK-LABEL: @test7_reassoc_nsz(
-; CHECK-NEXT: [[A:%.*]] = fsub reassoc nsz float 0.000000e+00, [[X:%.*]]
-; CHECK-NEXT: [[B:%.*]] = fmul reassoc nsz float [[A]], [[Y:%.*]]
-; CHECK-NEXT: [[C:%.*]] = fadd reassoc nsz float [[B]], [[Z:%.*]]
-; CHECK-NEXT: ret float [[C]]
+; CHECK-NEXT: [[B:%.*]] = fmul reassoc nsz float [[Y:%.*]], [[X:%.*]]
+; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float [[Z:%.*]], [[B]]
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %A = fsub reassoc nsz float 0.0, %X
   %B = fmul reassoc nsz float %A, %Y
   %C = fadd reassoc nsz float %B, %Z
   ret float %C
 }

 ; Verify that fold is not done only with 'reassoc' ('nsz' is required)
@@ ... @@
 ;
   %B = fmul fast float %A, %X2 ; -X1*X2
   %C = fmul fast float %X1, %X3 ; X1*X3
   %D = fadd fast float %B, %C ; -X1*X2 + X1*X3 -> X1*(X3-X2)
   ret float %D
 }

 define float @test12_reassoc_nsz(float %X1, float %X2, float %X3) {
 ; CHECK-LABEL: @test12_reassoc_nsz(
-; CHECK-NEXT: [[A:%.*]] = fsub reassoc nsz float 0.000000e+00, [[X1:%.*]]
-; CHECK-NEXT: [[B:%.*]] = fmul reassoc nsz float [[A]], [[X2:%.*]]
-; CHECK-NEXT: [[C:%.*]] = fmul reassoc nsz float [[X1]], [[X3:%.*]]
-; CHECK-NEXT: [[D:%.*]] = fadd reassoc nsz float [[B]], [[C]]
-; CHECK-NEXT: ret float [[D]]
+; CHECK-NEXT: [[B:%.*]] = fmul reassoc nsz float [[X2:%.*]], [[X1:%.*]]
+; CHECK-NEXT: [[C:%.*]] = fmul reassoc nsz float [[X3:%.*]], [[X1]]
+; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float [[C]], [[B]]
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %A = fsub reassoc nsz float 0.000000e+00, %X1
   %B = fmul reassoc nsz float %A, %X2 ; -X1*X2
   %C = fmul reassoc nsz float %X1, %X3 ; X1*X3
   %D = fadd reassoc nsz float %B, %C ; -X1*X2 + X1*X3 -> X1*(X3-X2)
   ret float %D
 }

@@ ... @@
 ;
   %1 = fadd fast float %a, 1234.0
   %2 = fadd fast float %b, %1
   %3 = fneg fast float %a
   %4 = fadd fast float %2, %3
   ret float %4
 }

+; TODO: check if we can remove dead fsub.
 define float @test15_reassoc_nsz(float %b, float %a) {
 ; CHECK-LABEL: @test15_reassoc_nsz(
-; CHECK-NEXT: [[TMP1:%.*]] = fadd reassoc nsz float [[A:%.*]], 1.234000e+03
-; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc nsz float [[B:%.*]], [[TMP1]]
-; CHECK-NEXT: [[TMP3:%.*]] = fsub reassoc nsz float 0.000000e+00, [[A]]
-; CHECK-NEXT: [[TMP4:%.*]] = fadd reassoc nsz float [[TMP3]], [[TMP2]]
-; CHECK-NEXT: ret float [[TMP4]]
+; CHECK-NEXT: [[TMP1:%.*]] = fsub reassoc nsz float 0.000000e+00, [[A:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = fadd reassoc nsz float [[B:%.*]], 1.234000e+03
+; CHECK-NEXT: ret float [[TMP2]]
 ;
   %1 = fadd reassoc nsz float %a, 1234.0
   %2 = fadd reassoc nsz float %b, %1
   %3 = fsub reassoc nsz float 0.0, %a
   %4 = fadd reassoc nsz float %2, %3
   ret float %4
 }