This is an archive of the discontinued LLVM Phabricator instance.

Differential D129612

[Reassociate] Cleanup minor missed optimizations
Closed · Public

Authored by wristow on Jul 12 2022, 6:17 PM.

Details

Reviewers

spatel
RKSimon
lebedev.ri

Summary

In analyzing issue #56483, it was noticed that running opt with
-reassociate was missing some minor optimizations. For example,
running opt on IR with floating-point instructions that have the
fast flags applied sometimes resulted in less efficient code than
the input IR (dead instructions left behind, and missed
reassociations). Some of these cases were noted in the test files
with TODOs to investigate further. This commit fixes some of these
problems, removing some TODOs in the process.

FTR, I refer to these as "minor" missed optimizations because they
do not show up in a full clang/llvm compilation, where other passes
clean up the residue. Regardless, having opt produce cleaner IR makes
it easier to assess the quality of fixes done in opt.

Diff Detail

Unit Tests: Failed

Time	Test
60,030 ms	x64 debian > MLIR.Examples/standalone::test.toy

Event Timeline

wristow created this revision. Jul 12 2022, 6:17 PM

Herald added a project: Restricted Project. Jul 12 2022, 6:17 PM

Herald added a subscriber: hiraditya.

wristow requested review of this revision. Jul 12 2022, 6:17 PM

Herald added a project: Restricted Project. Jul 12 2022, 6:17 PM

Harbormaster completed remote builds in B175018: Diff 444128. Jul 12 2022, 8:58 PM

To add some context to make this easier to review: the approach taken here, adding a ReassociatePass::OrderedSet parameter (named ToRedo) to the function LinearizeExprTree, is the same technique already used by two other functions in "Reassociate.cpp": NegateValue and BreakUpSubtract.

Is it possible to create an integer (mul and sub) test that also has this problem?

llvm/lib/Transforms/Scalar/Reassociate.cpp
580–581	I know this is existing code, but for readability, I'd rename this "Neg" instead of "Tmp" and remove the partially-braced nested-ifs:
	Instruction *Neg;
	if ((Opcode == Instruction::Mul && match(Op, m_Neg(m_Value()))) ||
	    (Opcode == Instruction::FMul && match(Op, m_FNeg(m_Value()))) &&
	    match(Op, m_Instruction(Neg)) {
580–581	NI -> Mul ?
580–581	Add a line to this comment to explain why we add users of the negate to the redo list?
584	UTmp -> UserBO

In D129612#3648882, @spatel wrote:

Is it possible to create an integer (mul and sub) test that also has this problem?

I expect so. I'll look into creating one.

llvm/lib/Transforms/Scalar/Reassociate.cpp
580–581	Will do.
580–581	Sounds good.
580–581	Will do.
584	Will do.

Addressed review comments from @spatel, except for adding an integer test.

If I take a similar integer test:

; (-X)*Y + Z -> Z-X*Y
define i32 @int_test7(i32 %X, i32 %Y, i32 %Z) {
  %A = sub i32 0, %X
  %B = mul i32 %A, %Y
  %C = add i32 %B, %Z
  ret i32 %C
}

the change does clean the results up somewhat, but not as completely as the floating-point version. Specifically, without the change, the above test is reassociated to:

define i32 @int_test7(i32 %X, i32 %Y, i32 %Z) {
  %1 = sub i32 0, 0
  %A = mul i32 %Y, %X
  %B = mul i32 %A, -1
  %C = add i32 %B, %Z
  ret i32 %C
}

So it has the unused instruction %1 = sub i32 0, 0 remaining.

And with the change, that instruction is deleted:

define i32 @int_test7(i32 %X, i32 %Y, i32 %Z) {
  %A = mul i32 %Y, %X
  %B = mul i32 %A, -1
  %C = add i32 %B, %Z
  ret i32 %C
}

The equivalent FP version transforms the computation of %B into a multiply of positive 1 (which is then optimized away), and %C is then computed with a subtraction, rather than an addition. So the FP version has only two arithmetic instructions (multiply and subtract), whereas the integer version ends up as above (multiply, multiply (by negative 1), addition).

Looking into this difference, I see it's because the routine ReassociatePass::OptimizeInst has:

// Canonicalize negative constants out of expressions.
if (Instruction *Res = canonicalizeNegFPConstants(I))
  I = Res;

which nicely handles this multiply by -1.0 for the floating-point case, but there isn't an equivalent canonicalizeNegIntConstants concept.

wristow marked 3 inline comments as done. Jul 13 2022, 5:21 PM

Harbormaster completed remote builds in B175261: Diff 444461. Jul 13 2022, 7:11 PM

In D129612#3650193, @wristow wrote:

Addressed review comments from @spatel, except for adding an integer test.

...

the change does clean the results up somewhat, but not as completely as the floating-point version. Specifically, without the change, the above test is reassociated to:

I'd prefer to have that test, so we have coverage for the integer path added with this patch. It's still an improvement even if it's not as big as the FP side.
Other than that, LGTM.

This revision is now accepted and ready to land. Jul 14 2022, 7:38 AM

In D129612#3652041, @spatel wrote:

I'd prefer to have that test, so we have coverage for the integer path added with this patch. It's still an improvement even if it's not as big as the FP side.
Other than that, LGTM.

I was just thinking the same thing and adding that test (with a TODO)!

Added an integer version of a reassociation test, including a TODO of a possible further improvement.

Thanks for the review, @spatel!

Committed at 230c8c56f21cfe4e23a24793f3137add1af1d1f0.

Manually closing this (forgot to add the "Differential Revision" note to the commit message).

Harbormaster completed remote builds in B175406: Diff 444659. Jul 14 2022, 11:02 AM

Revision Contents

Path	Size
llvm/lib/Transforms/Scalar/Reassociate.cpp	38 lines
llvm/test/Transforms/Reassociate/fast-ReassociateVector.ll	6 lines
llvm/test/Transforms/Reassociate/fast-basictest.ll	68 lines
llvm/test/Transforms/Reassociate/pr42349.ll	1 line

Diff 444659

llvm/lib/Transforms/Scalar/Reassociate.cpp

[...]
 /// then you will never see such an undef operand unless you get back to 'I',
 /// which requires passing through a phi node.
 ///
 /// Note that this routine may also mutate binary operators of the wrong type
 /// that have all uses inside the expression (i.e. only used by non-leaf nodes
 /// of the expression) if it can turn them into binary operators of the right
 /// type and thus make the expression bigger.
 static bool LinearizeExprTree(Instruction *I,
-                              SmallVectorImpl<RepeatedValue> &Ops) {
+                              SmallVectorImpl<RepeatedValue> &Ops,
+                              ReassociatePass::OrderedSet &ToRedo) {
   assert((isa<UnaryOperator>(I) || isa<BinaryOperator>(I)) &&
          "Expected a UnaryOperator or BinaryOperator!");
   LLVM_DEBUG(dbgs() << "LINEARIZE: " << *I << '\n');
   unsigned Bitwidth = I->getType()->getScalarType()->getPrimitiveSizeInBits();
   unsigned Opcode = I->getOpcode();
   assert(I->isAssociative() && I->isCommutative() &&
          "Expected an associative and commutative operation!");

[...] #endif
       // can usefully morph it into an expression of the right kind.
       assert((!isa<Instruction>(Op) ||
               cast<Instruction>(Op)->getOpcode() != Opcode
               || (isa<FPMathOperator>(Op) &&
                   !cast<Instruction>(Op)->isFast())) &&
              "Should have been handled above!");
       assert(Op->hasOneUse() && "Has uses outside the expression tree!");

       // If this is a multiply expression, turn any internal negations into
-      // multiplies by -1 so they can be reassociated.
-      if (Instruction *Tmp = dyn_cast<Instruction>(Op))
-        if ((Opcode == Instruction::Mul && match(Tmp, m_Neg(m_Value()))) ||
-            (Opcode == Instruction::FMul && match(Tmp, m_FNeg(m_Value())))) {
+      // multiplies by -1 so they can be reassociated. Add any users of the
+      // newly created multiplication by -1 to the redo list, so any
+      // reassociation opportunities that are exposed will be reassociated
+      // further.
+      Instruction *Neg;
+      if (((Opcode == Instruction::Mul && match(Op, m_Neg(m_Value()))) ||
+           (Opcode == Instruction::FMul && match(Op, m_FNeg(m_Value())))) &&
+          match(Op, m_Instruction(Neg))) {
         LLVM_DEBUG(dbgs()
                    << "MORPH LEAF: " << *Op << " (" << Weight << ") TO ");
-        Tmp = LowerNegateToMultiply(Tmp);
-        LLVM_DEBUG(dbgs() << *Tmp << '\n');
-        Worklist.push_back(std::make_pair(Tmp, Weight));
+        Instruction *Mul = LowerNegateToMultiply(Neg);
+        LLVM_DEBUG(dbgs() << *Mul << '\n');
+        Worklist.push_back(std::make_pair(Mul, Weight));
+        for (User *U : Mul->users()) {
+          if (BinaryOperator *UserBO = dyn_cast<BinaryOperator>(U))
+            ToRedo.insert(UserBO);
+        }
+        ToRedo.insert(Neg);
         Changed = true;
         continue;
       }

       // Failed to morph into an expression of the right type. This really is
       // a leaf.
       LLVM_DEBUG(dbgs() << "ADD LEAF: " << *Op << " (" << Weight << ")\n");
       assert(!isReassociableOp(Op, Opcode) && "Value was morphed?");
       LeafOrder.push_back(Op);
       Leaves[Op] = Weight;
     }
[...]
 /// and if this sequence contains a multiply by Factor,
 /// remove Factor from the tree and return the new tree.
 Value *ReassociatePass::RemoveFactorFromExpression(Value *V, Value *Factor) {
   BinaryOperator *BO = isReassociableOp(V, Instruction::Mul, Instruction::FMul);
   if (!BO)
     return nullptr;

   SmallVector<RepeatedValue, 8> Tree;
-  MadeChange |= LinearizeExprTree(BO, Tree);
+  MadeChange |= LinearizeExprTree(BO, Tree, RedoInsts);
   SmallVector<ValueEntry, 8> Factors;
   Factors.reserve(Tree.size());
   for (unsigned i = 0, e = Tree.size(); i != e; ++i) {
     RepeatedValue E = Tree[i];
     Factors.append(E.second.getZExtValue(),
                    ValueEntry(getRank(E.first), E.first));
   }

[...] void ReassociatePass::OptimizeInst(Instruction *I) {

   ReassociateExpression(BO);
 }

 void ReassociatePass::ReassociateExpression(BinaryOperator *I) {
   // First, walk the expression tree, linearizing the tree, collecting the
   // operand information.
   SmallVector<RepeatedValue, 8> Tree;
-  MadeChange |= LinearizeExprTree(I, Tree);
+  MadeChange |= LinearizeExprTree(I, Tree, RedoInsts);
   SmallVector<ValueEntry, 8> Ops;
   Ops.reserve(Tree.size());
   for (const RepeatedValue &E : Tree)
     Ops.append(E.second.getZExtValue(), ValueEntry(getRank(E.first), E.first));

   LLVM_DEBUG(dbgs() << "RAIn:\t"; PrintOps(I, Ops); dbgs() << '\n');

   // Now that we have linearized the tree to a list and have gathered all of
[...]

llvm/test/Transforms/Reassociate/fast-ReassociateVector.ll

[...] ;
   %4 = fadd reassoc <2 x double> %2, %3
   ret <2 x double> %4
 }

 ; Check -(-(z*40)*a) -> a*40*z.

 define <2 x float> @test10(<2 x float> %a, <2 x float> %b, <2 x float> %z) {
 ; CHECK-LABEL: @test10(
-; CHECK-NEXT: [[TMP1:%.*]] = fsub fast <2 x float> zeroinitializer, zeroinitializer
 ; CHECK-NEXT: [[C:%.*]] = fmul fast <2 x float> [[A:%.*]], <float 4.000000e+01, float 4.000000e+01>
 ; CHECK-NEXT: [[E:%.*]] = fmul fast <2 x float> [[C]], [[Z:%.*]]
-; CHECK-NEXT: [[TMP2:%.*]] = fadd fast <2 x float> [[E]], zeroinitializer
-; CHECK-NEXT: ret <2 x float> [[TMP2]]
+; CHECK-NEXT: [[TMP1:%.*]] = fadd fast <2 x float> [[E]], zeroinitializer
+; CHECK-NEXT: ret <2 x float> [[TMP1]]
 ;
   %d = fmul fast <2 x float> %z, <float 4.000000e+01, float 4.000000e+01>
   %c = fsub fast <2 x float> <float 0.000000e+00, float 0.000000e+00>, %d
   %e = fmul fast <2 x float> %a, %c
   %f = fsub fast <2 x float> <float 0.000000e+00, float 0.000000e+00>, %e
   ret <2 x float> %f
 }

 define <2 x float> @test10_unary_fneg(<2 x float> %a, <2 x float> %b, <2 x float> %z) {
 ; CHECK-LABEL: @test10_unary_fneg(
-; CHECK-NEXT: [[TMP1:%.*]] = fneg fast <2 x float> zeroinitializer
 ; CHECK-NEXT: [[E:%.*]] = fmul fast <2 x float> [[A:%.*]], <float 4.000000e+01, float 4.000000e+01>
 ; CHECK-NEXT: [[F:%.*]] = fmul fast <2 x float> [[E]], [[Z:%.*]]
 ; CHECK-NEXT: ret <2 x float> [[F]]
 ;
   %d = fmul fast <2 x float> %z, <float 4.000000e+01, float 4.000000e+01>
   %c = fneg fast <2 x float> %d
   %e = fmul fast <2 x float> %a, %c
   %f = fneg fast <2 x float> %e
[...]

llvm/test/Transforms/Reassociate/fast-basictest.ll

[...] ;
   %aab = fmul reassoc float %aa, %B
   %ac = fmul reassoc float %A, %C
   %aac = fmul reassoc float %ac, %A
   %r = fadd reassoc float %aab, %aac
   ret float %r
 }

 ; (-X)*Y + Z -> Z-X*Y
-; TODO: check why IR transformation of test7 with 'fast' math flag
-; is worse than without it (and even without transformation)

 define float @test7(float %X, float %Y, float %Z) {
 ; CHECK-LABEL: @test7(
-; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float 0.000000e+00, 0.000000e+00
-; CHECK-NEXT: [[A:%.*]] = fmul fast float [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[B:%.*]] = fmul fast float [[A]], 1.000000e+00
-; CHECK-NEXT: [[TMP2:%.*]] = fsub fast float [[Z:%.*]], [[B]]
-; CHECK-NEXT: ret float [[TMP2]]
+; CHECK-NEXT: [[B:%.*]] = fmul fast float [[Y:%.*]], [[X:%.*]]
+; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[Z:%.*]], [[B]]
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %A = fsub fast float 0.0, %X
   %B = fmul fast float %A, %Y
   %C = fadd fast float %B, %Z
   ret float %C
 }

 define float @test7_unary_fneg(float %X, float %Y, float %Z) {
 ; CHECK-LABEL: @test7_unary_fneg(
-; CHECK-NEXT: [[TMP1:%.*]] = fneg fast float 0.000000e+00
-; CHECK-NEXT: [[A:%.*]] = fmul fast float [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[B:%.*]] = fmul fast float [[A]], 1.000000e+00
-; CHECK-NEXT: [[TMP2:%.*]] = fsub fast float [[Z:%.*]], [[B]]
-; CHECK-NEXT: ret float [[TMP2]]
+; CHECK-NEXT: [[B:%.*]] = fmul fast float [[Y:%.*]], [[X:%.*]]
+; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[Z:%.*]], [[B]]
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %A = fneg fast float %X
   %B = fmul fast float %A, %Y
   %C = fadd fast float %B, %Z
   ret float %C
 }

 define float @test7_reassoc_nsz(float %X, float %Y, float %Z) {
[...]
 ; CHECK-NEXT: ret float [[C]]
 ;
   %A = fsub reassoc float 0.0, %X
   %B = fmul reassoc float %A, %Y
   %C = fadd reassoc float %B, %Z
   ret float %C
 }

+; Integer version of:
+; (-X)*Y + Z -> Z-X*Y
+; TODO: check if we can change the mul of -1 and the add to a sub.
+define i32 @test7_int(i32 %X, i32 %Y, i32 %Z) {
+; CHECK-LABEL: @test7_int(
+; CHECK-NEXT: [[A:%.*]] = mul i32 [[Y:%.*]], [[X:%.*]]
+; CHECK-NEXT: [[B:%.*]] = mul i32 [[A]], -1
+; CHECK-NEXT: [[C:%.*]] = add i32 [[B]], [[Z:%.*]]
+; CHECK-NEXT: ret i32 [[C]]
+;
+  %A = sub i32 0, %X
+  %B = mul i32 %A, %Y
+  %C = add i32 %B, %Z
+  ret i32 %C
+}

 define float @test8(float %X) {
 ; CHECK-LABEL: @test8(
 ; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[X:%.*]], 9.400000e+01
 ; CHECK-NEXT: ret float [[FACTOR]]
 ;
   %Y = fmul fast float %X, 4.700000e+01
   %Z = fadd fast float %Y, %Y
   ret float %Z
[...] ;
   %X = fmul fast float %W, 127.0
   %Y = fadd fast float %X ,%X
   %Z = fadd fast float %Y, %X
   ret float %Z
 }

 define float @test11(float %X) {
 ; CHECK-LABEL: @test11(
-; CHECK-NEXT: [[TMP1:%.*]] = fneg fast float 0.000000e+00
 ; CHECK-NEXT: [[FACTOR:%.*]] = fmul fast float [[X:%.*]], -3.000000e+00
 ; CHECK-NEXT: [[Z:%.*]] = fadd fast float [[FACTOR]], 6.000000e+00
 ; CHECK-NEXT: ret float [[Z]]
 ;
   %A = fsub fast float 1.000000e+00, %X
   %B = fsub fast float 2.000000e+00, %X
   %C = fsub fast float 3.000000e+00, %X
   %Y = fadd fast float %A ,%B
   %Z = fadd fast float %Y, %C
   ret float %Z
 }

-; TODO: check why IR transformation of test12 with 'fast' math flag
-; is worse than without it (and even without transformation)

 define float @test12(float %X1, float %X2, float %X3) {
 ; CHECK-LABEL: @test12(
-; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float 0.000000e+00, 0.000000e+00
-; CHECK-NEXT: [[A:%.*]] = fmul fast float [[X2:%.*]], [[X1:%.*]]
-; CHECK-NEXT: [[B:%.*]] = fmul fast float [[A]], 1.000000e+00
+; CHECK-NEXT: [[B:%.*]] = fmul fast float [[X2:%.*]], [[X1:%.*]]
 ; CHECK-NEXT: [[C:%.*]] = fmul fast float [[X3:%.*]], [[X1]]
-; CHECK-NEXT: [[TMP2:%.*]] = fsub fast float [[C]], [[B]]
-; CHECK-NEXT: ret float [[TMP2]]
+; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[C]], [[B]]
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %A = fsub fast float 0.000000e+00, %X1
   %B = fmul fast float %A, %X2 ; -X1*X2
   %C = fmul fast float %X1, %X3 ; X1*X3
   %D = fadd fast float %B, %C ; -X1*X2 + X1*X3 -> X1*(X3-X2)
   ret float %D
 }

 define float @test12_unary_fneg(float %X1, float %X2, float %X3) {
 ; CHECK-LABEL: @test12_unary_fneg(
-; CHECK-NEXT: [[TMP1:%.*]] = fneg fast float 0.000000e+00
-; CHECK-NEXT: [[A:%.*]] = fmul fast float [[X2:%.*]], [[X1:%.*]]
-; CHECK-NEXT: [[B:%.*]] = fmul fast float [[A]], 1.000000e+00
+; CHECK-NEXT: [[B:%.*]] = fmul fast float [[X2:%.*]], [[X1:%.*]]
 ; CHECK-NEXT: [[C:%.*]] = fmul fast float [[X3:%.*]], [[X1]]
-; CHECK-NEXT: [[TMP2:%.*]] = fsub fast float [[C]], [[B]]
-; CHECK-NEXT: ret float [[TMP2]]
+; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float [[C]], [[B]]
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %A = fneg fast float %X1
   %B = fmul fast float %A, %X2 ; -X1*X2
   %C = fmul fast float %X1, %X3 ; X1*X3
   %D = fadd fast float %B, %C ; -X1*X2 + X1*X3 -> X1*(X3-X2)
   ret float %D
 }

[...]
 }

 ; Test that we can turn things like X*-(Y*Z) -> X*-1*Y*Z.
 ; That only works with both instcombine and reassociate passes enabled.
 ; Check that reassociate is not enough.

 define float @test16(float %a, float %b, float %z) {
 ; CHECK-LABEL: @test16(
-; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float 0.000000e+00, 0.000000e+00
 ; CHECK-NEXT: [[C:%.*]] = fmul fast float [[A:%.*]], 1.234500e+04
 ; CHECK-NEXT: [[E:%.*]] = fmul fast float [[C]], [[B:%.*]]
 ; CHECK-NEXT: [[F:%.*]] = fmul fast float [[E]], [[Z:%.*]]
-; CHECK-NEXT: [[TMP2:%.*]] = fadd fast float [[F]], 0.000000e+00
-; CHECK-NEXT: ret float [[TMP2]]
+; CHECK-NEXT: [[TMP1:%.*]] = fadd fast float [[F]], 0.000000e+00
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %c = fsub fast float 0.000000e+00, %z
   %d = fmul fast float %a, %b
   %e = fmul fast float %c, %d
   %f = fmul fast float %e, 1.234500e+04
   %g = fsub fast float 0.000000e+00, %f
   ret float %g
 }

 define float @test16_unary_fneg(float %a, float %b, float %z) {
 ; CHECK-LABEL: @test16_unary_fneg(
-; CHECK-NEXT: [[TMP1:%.*]] = fneg fast float 0.000000e+00
 ; CHECK-NEXT: [[E:%.*]] = fmul fast float [[A:%.*]], 1.234500e+04
 ; CHECK-NEXT: [[F:%.*]] = fmul fast float [[E]], [[B:%.*]]
 ; CHECK-NEXT: [[G:%.*]] = fmul fast float [[F]], [[Z:%.*]]
 ; CHECK-NEXT: ret float [[G]]
 ;
   %c = fneg fast float %z
   %d = fmul fast float %a, %b
   %e = fmul fast float %c, %d
[...] ;
   %d = fmul reassoc float %a, %b
   %e = fmul reassoc float %c, %d
   %f = fmul reassoc float %e, 1.234500e+04
   %g = fsub reassoc float 0.000000e+00, %f
   ret float %g
 }

 ; TODO: check if we can remove:
-; - fsub fast 0, 0
 ; - fadd fast x, 0
 ; ... as 'fast' implies 'nsz'
 define float @test17(float %a, float %b, float %z) {
 ; CHECK-LABEL: @test17(
-; CHECK-NEXT: [[TMP1:%.*]] = fsub fast float 0.000000e+00, 0.000000e+00
 ; CHECK-NEXT: [[C:%.*]] = fmul fast float [[A:%.*]], 4.000000e+01
 ; CHECK-NEXT: [[E:%.*]] = fmul fast float [[C]], [[Z:%.*]]
-; CHECK-NEXT: [[TMP2:%.*]] = fadd fast float [[E]], 0.000000e+00
-; CHECK-NEXT: ret float [[TMP2]]
+; CHECK-NEXT: [[TMP1:%.*]] = fadd fast float [[E]], 0.000000e+00
+; CHECK-NEXT: ret float [[TMP1]]
 ;
   %d = fmul fast float %z, 4.000000e+01
   %c = fsub fast float 0.000000e+00, %d
   %e = fmul fast float %a, %c
   %f = fsub fast float 0.000000e+00, %e
   ret float %f
 }

-; TODO: check if we can remove fneg fast 0 as 'fast' implies 'nsz'
 define float @test17_unary_fneg(float %a, float %b, float %z) {
 ; CHECK-LABEL: @test17_unary_fneg(
-; CHECK-NEXT: [[TMP1:%.*]] = fneg fast float 0.000000e+00
 ; CHECK-NEXT: [[E:%.*]] = fmul fast float [[A:%.*]], 4.000000e+01
 ; CHECK-NEXT: [[F:%.*]] = fmul fast float [[E]], [[Z:%.*]]
 ; CHECK-NEXT: ret float [[F]]
 ;
   %d = fmul fast float %z, 4.000000e+01
   %c = fneg fast float %d
   %e = fmul fast float %a, %c
   %f = fneg fast float %e
[...]

llvm/test/Transforms/Reassociate/pr42349.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -reassociate -S | FileCheck %s

 define float @wibble(float %tmp6) #0 {
 ; CHECK-LABEL: @wibble(
 ; CHECK-NEXT: bb:
 ; CHECK-NEXT: [[TMP7:%.*]] = fmul float [[TMP6:%.*]], -1.000000e+00
-; CHECK-NEXT: [[TMP0:%.*]] = fsub float -0.000000e+00, 0.000000e+00
 ; CHECK-NEXT: [[TMP9:%.*]] = fmul fast float [[TMP6]], 0xFFF0000000000000
 ; CHECK-NEXT: ret float [[TMP9]]
 ;
 bb:
   %tmp7 = fsub float -0.000000e+00, %tmp6
   %tmp9 = fmul fast float %tmp7, 0x7FF0000000000000
   ret float %tmp9
 }

 attributes #0 = { "use-soft-float"="false" }