This is an archive of the discontinued LLVM Phabricator instance.

Optimize square root squared (PR21126)
ClosedPublic

Authored by spatel on Oct 2 2014, 11:17 AM.

Download Raw Diff

Details

Reviewers

majnemer
hfinkel
dexonsmith

Commits

rG12d1ce540883: Optimize square root squared (PR21126).
rL218906: Optimize square root squared (PR21126).

Summary

When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X.

This can happen in the real world when calculating x ** 3/2. This occurs in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c.

Diff Detail

Repository: rL LLVM

Event Timeline

spatel updated this revision to Diff 14338.Oct 2 2014, 11:17 AM

spatel retitled this revision from to Optimize square root squared (PR21126).

spatel updated this object.

spatel edited the test plan for this revision. (Show Details)

spatel added reviewers: hfinkel, majnemer, dexonsmith.

spatel added a subscriber: Unknown Object (MLST).

LGTM.

test/Transforms/InstCombine/fmul.ll
130 ↗	(On Diff #14338)	We don't need the "unsafe-fp-math"="true" because you're just checking the 'fast' on the fmul, right? If you don't need it, please remove it.

This revision is now accepted and ready to land.Oct 2 2014, 12:29 PM

spatel added inline comments.Oct 2 2014, 1:05 PM

test/Transforms/InstCombine/fmul.ll
130 ↗	(On Diff #14338)	Thanks, Hal! Yes - the 'fast' alone is enough to do this optimization. I'm not a fan of the instruction-level fast-math flags...because it leads to another question. What should happen in the 2nd test case if it looks like this: define double @sqrt_squared2(double %f) #0 { %sqrt = call double @llvm.sqrt.f64(double %f) %mul1 = fmul fast double %sqrt, %sqrt %mul2 = fmul double %mul1, %sqrt ret double %mul2 } When the %mul1 operand is replaced in the 2nd fmul instruction, does that fmul instruction now become fast too? Is fast infectious?

hfinkel added inline comments.Oct 2 2014, 1:17 PM

test/Transforms/InstCombine/fmul.ll
130 ↗	(On Diff #14338)	I think that you should change your mind ;) The motivation behind the fast-math flags, in addition to providing client more-precise control over what instructions might be folded, was to create a "fast math" system that could survive inlining. As a result, the answer to your last question is no, it is not infectious, it is designed so that inlined "fast math" functions can have their instructions retain that property without affecting operations from the caller (which might not be "fast math" -- a particular problem when you consider LTO).

spatel added inline comments.Oct 2 2014, 1:44 PM

test/Transforms/InstCombine/fmul.ll
130 ↗	(On Diff #14338)	I'd love to love this. :) ..but, you see what happens in this example: There's no function-level attribute for unsafe-math. When we kill the first fmul, we eliminate the only trace of 'fast'-ness in this function (ignore for the moment that the llvm.sqrt intrinsic isn't IEEE-754 compliant). After the optimization, we have a seemingly safe sqrt and a safe fmul. But the entirety of the code that will be produced for this function is decidedly unsafe. We just deleted the knowledge that the original function was not safe. So in fact, fast is infectious, but silently so! LTO users should be aware that linking any unsafe code produces an overall unsafe program?

hfinkel added inline comments.Oct 2 2014, 1:56 PM

test/Transforms/InstCombine/fmul.ll
130 ↗	(On Diff #14338)	Sure, but that's normally what the user wants! They want fast semantics for the routines in one module, and precise semantics for routines in another, then they want LTO to inline one into the other. The fast flags are not meant to preserve a record of non-strict semantics -- that's not the optimizers job ;) -- but rather are intended to allow optimization control.

spatel added inline comments.Oct 2 2014, 2:08 PM

test/Transforms/InstCombine/fmul.ll
130 ↗	(On Diff #14338)	Ok, sold. :) I was just pretending to be an FP wonk. If the real FP wonks complain, please protect this optimization! I'll fix up the testcase and get this checked in.

Closed by commit rL218906 (authored by @spatel).

spatel mentioned this in D41381: [InstSimplify] Missed optimization in math expression: squashing exp(log), log(exp).Jan 2 2018, 2:28 PM

Revision Contents

Path

Size

llvm/

trunk/

lib/

Transforms/

InstCombine/

InstCombineMulDivRem.cpp

5 lines

test/

Transforms/

InstCombine/

fmul.ll

29 lines

Diff 14346

llvm/trunk/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

Show First 20 Lines • Show All 525 Lines • ▼ Show 20 Lines	if (AllowReassociate && isFiniteNonZeroFp(C)) {
RI->copyFastMathFlags(&I);		RI->copyFastMathFlags(&I);
return RI;		return RI;
}		}
}		}
}		}
}		}
}		}

		// sqrt(X) * sqrt(X) -> X
		if (AllowReassociate && (Op0 == Op1))
		if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Op0))
		if (II->getIntrinsicID() == Intrinsic::sqrt)
		return ReplaceInstUsesWith(I, II->getOperand(0));

// Under unsafe algebra do:		// Under unsafe algebra do:
// X * log2(0.5Y) = Xlog2(Y) - X		// X * log2(0.5Y) = Xlog2(Y) - X
if (AllowReassociate) {		if (AllowReassociate) {
Value *OpX = nullptr;		Value *OpX = nullptr;
Value *OpY = nullptr;		Value *OpY = nullptr;
IntrinsicInst *Log2;		IntrinsicInst *Log2;
detectLog2OfHalf(Op0, OpY, Log2);		detectLog2OfHalf(Op0, OpY, Log2);
▲ Show 20 Lines • Show All 858 Lines • Show Last 20 Lines

llvm/trunk/test/Transforms/InstCombine/fmul.ll

Show First 20 Lines • Show All 117 Lines • ▼ Show 20 Lines	define float @test11(float %x, float %y) {
%a = fadd fast float %x, 1.0		%a = fadd fast float %x, 1.0
%b = fadd fast float %y, 2.0		%b = fadd fast float %y, 2.0
%c = fadd fast float %a, %b		%c = fadd fast float %a, %b
ret float %c		ret float %c
; CHECK-LABEL: @test11(		; CHECK-LABEL: @test11(
; CHECK-NOT: fadd float		; CHECK-NOT: fadd float
; CHECK: fadd fast float		; CHECK: fadd fast float
}		}

		; PR21126: http://llvm.org/bugs/show_bug.cgi?id=21126
		; With unsafe/fast math, sqrt(X) * sqrt(X) is just X.
		declare double @llvm.sqrt.f64(double)

		define double @sqrt_squared1(double %f) #0 {
		%sqrt = call double @llvm.sqrt.f64(double %f)
		%mul = fmul fast double %sqrt, %sqrt
		ret double %mul
		; CHECK-LABEL: @sqrt_squared1(
		; CHECK-NEXT: ret double %f
		}

		; With unsafe/fast math, sqrt(X) * sqrt(X) is just X,
		; but make sure another use of the sqrt is intact.
		; Note that the remaining fmul is altered but is not 'fast'
		; itself because it was not marked 'fast' originally.
		; Thus, we have an overall fast result, but no more indication of
		; 'fast'ness in the code.
		define double @sqrt_squared2(double %f) #0 {
		%sqrt = call double @llvm.sqrt.f64(double %f)
		%mul1 = fmul fast double %sqrt, %sqrt
		%mul2 = fmul double %mul1, %sqrt
		ret double %mul2
		; CHECK-LABEL: @sqrt_squared2(
		; CHECK-NEXT: %sqrt = call double @llvm.sqrt.f64(double %f)
		; CHECK-NEXT: %mul2 = fmul double %sqrt, %f
		; CHECK-NEXT: ret double %mul2
		}