This is an archive of the discontinued LLVM Phabricator instance.

Differential D16696

InstCombine: fabs(x) * fabs(x) -> x * x
ClosedPublic

Authored by arsenm on Jan 28 2016, 3:22 PM.


Details

Reviewers

scanon

Diff Detail

Event Timeline

arsenm updated this revision to Diff 46319. Jan 28 2016, 3:22 PM

arsenm retitled this revision to InstCombine: fabs(x) * fabs(x) -> x * x.

arsenm updated this object.

arsenm added a subscriber: llvm-commits.

Make sure name is kept

mgrang added a subscriber: mgrang. Jan 28 2016, 3:41 PM

mgrang added inline comments.

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
621	Should this transformation also be guarded under AllowReassociate?
622	Would it be better to move the checks for the IntrinsicID under a switch-case? switch (II->getIntrinsicID())

arsenm added inline comments. Jan 28 2016, 3:44 PM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
621	I'm pretty sure this is always safe
622	Maybe, but since there are only two and sqrt has an additional condition, the switch would probably be longer

DavidKreitzer added a subscriber: DavidKreitzer. Jan 29 2016, 12:25 PM

DavidKreitzer added inline comments. Jan 29 2016, 1:36 PM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
621	If X is -NaN, this transform will change the result of fabs(X) * fabs(X) from +NaN to -NaN. (In all other cases, it will produce the same result.) So my recommendation is to guard this transformation by I.hasNoNaNs().

deadalnix added a subscriber: deadalnix. Jan 29 2016, 1:46 PM

deadalnix added inline comments.

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
621	My understanding of the standard is that NaN has a sign in order to avoid special-casing it for operations like fabs. There is no expectation that it has a well-defined value.
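
The NaN question raised in these inline comments can be checked empirically. The following is a minimal standalone C++ sketch, not part of the patch; the helper names (bitsOf, squaresAgree, checkSamples) are hypothetical. It assumes IEEE-754 binary32 floats and a default (non fast-math) build. For non-NaN results it compares fabs(X) * fabs(X) with X * X bit for bit. For NaN results it only checks that both sides are NaN, because the sign of a NaN result is exactly the point under discussion and may vary by target.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Returns the raw bit pattern of a float (assumes a 32-bit float).
inline std::uint32_t bitsOf(float V) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t Bits;
  std::memcpy(&Bits, &V, sizeof(Bits));
  return Bits;
}

// True if fabs(X) * fabs(X) and X * X agree: bit for bit when the result is
// not a NaN, and "both are NaN" otherwise. The NaN sign bit is ignored on
// purpose.
inline bool squaresAgree(float X) {
  // Reading through a volatile discourages the compiler from folding the
  // two expressions into one before they are compared.
  volatile float VX = X;
  float Y = VX;
  float A = std::fabs(Y) * std::fabs(Y);
  float B = Y * Y;
  if (std::isnan(A) || std::isnan(B))
    return std::isnan(A) && std::isnan(B);
  return bitsOf(A) == bitsOf(B);
}

// Checks a handful of ordinary and special values; aborts on a mismatch.
inline void checkSamples() {
  const float Inf = std::numeric_limits<float>::infinity();
  const float NaN = std::numeric_limits<float>::quiet_NaN();
  const float Samples[] = {0.0f, -0.0f, 1.5f, -1.5f, Inf, -Inf,
                           std::numeric_limits<float>::denorm_min(),
                           -std::numeric_limits<float>::max(), NaN, -NaN};
  for (float X : Samples)
    assert(squaresAgree(X) && "fabs(X) * fabs(X) should match X * X");
}
```

Calling checkSamples() is expected to pass silently. Printing the sign bit of both products for a negative NaN input is one way to see whether a particular target shows the difference described above.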

spatel added a subscriber: spatel. Jan 29 2016, 1:59 PM

I forgot that we had intrinsic/lib folds outside of SimplifyLibCalls. Sorry if I've missed it, but is there a reason it's better to do this (and the sqrt and log opts) here?

cc'ing Steve about the NaN behavior.

In D16696#339699, @spatel wrote:

I forgot that we had intrinsic/lib folds outside of SimplifyLibCalls. Sorry if I've missed it, but is there a reason it's better to do this (and the sqrt and log opts) here?

I'm not a huge fan of SimplifyLibCalls and would prefer optimizations stick to the intrinsics. SimplifyLibCalls assumes that you can only optimize when TargetLibraryInfo reports having a sqrt lib call (even for the intrinsic). This is not helpful for targets that do not have libcalls and may have hardware instructions or custom expansions for the operations. Especially since we treat the math intrinsics as "fast" versions with looser rules than the library function, I don't think we should conflate the intrinsics and library calls so much.

I would prefer if optimizations stuck to the intrinsic whenever possible. See my RFC from a couple days ago: http://lists.llvm.org/pipermail/llvm-dev/2016-January/094593.html

The signbit of NaN explicitly has no meaning, so there's no concern there. LGTM.

This revision is now accepted and ready to land. Jan 29 2016, 2:31 PM

r259295

Revision Contents

Path	Size
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp	20 lines
test/Transforms/InstCombine/fmul.ll	29 lines

Diff 46321

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

     if (AllowReassociate && isFiniteNonZeroFp(C)) {
 ...
           RI->copyFastMathFlags(&I);
           return RI;
         }
       }
     }
   }
 }

-  // sqrt(X) * sqrt(X) -> X
-  if (AllowReassociate && (Op0 == Op1))
-    if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Op0))
-      if (II->getIntrinsicID() == Intrinsic::sqrt)
-        return ReplaceInstUsesWith(I, II->getOperand(0));
+  if (Op0 == Op1) {
+    if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Op0)) {
+      // sqrt(X) * sqrt(X) -> X
+      if (AllowReassociate && II->getIntrinsicID() == Intrinsic::sqrt)
+        return ReplaceInstUsesWith(I, II->getOperand(0));
+
+      // fabs(X) * fabs(X) -> X * X
+      if (II->getIntrinsicID() == Intrinsic::fabs) {
+        Instruction *FMulVal = BinaryOperator::CreateFMul(II->getOperand(0),
+                                                          II->getOperand(0),
+                                                          I.getName());
+        FMulVal->copyFastMathFlags(&I);
+
+        return FMulVal;
+      }
+    }
+  }

   // Under unsafe algebra do:
   // X * log2(0.5*Y) = X*log2(Y) - X
   if (AllowReassociate) {
     Value *OpX = nullptr;
     Value *OpY = nullptr;
     IntrinsicInst *Log2;
     detectLog2OfHalf(Op0, OpY, Log2);
     if (OpY) {
 ...
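
To make the control flow of the new fold easier to follow outside of LLVM, here is a toy model in standalone C++. It is only a sketch under stated assumptions: Node, Op, Fold, FoldResult and foldSelfMul are hypothetical names invented for illustration and are not LLVM APIs. As in the patch, both operands must be the same node (pointer identity, mirroring Op0 == Op1), the sqrt fold requires the reassociation flag, and the fabs fold does not.

```cpp
#include <cassert>

// Toy expression kinds (hypothetical; not LLVM types).
enum class Op { Arg, FAbs, Sqrt, FMul };

struct Node {
  Op Kind;
  const Node *LHS; // argument of a call, or left operand of an FMul
  const Node *RHS; // right operand of an FMul, otherwise nullptr
};

// What the fold decided to do with the multiply.
enum class Fold { None, UseOperand, SquareOperand };

struct FoldResult {
  Fold Kind;
  const Node *X; // the operand X the rewrite refers to, if any
};

// Models the two rewrites on a multiply whose operands are the same node:
//   sqrt(X) * sqrt(X) -> X      (only when reassociation is allowed)
//   fabs(X) * fabs(X) -> X * X  (unconditionally)
inline FoldResult foldSelfMul(const Node &Mul, bool AllowReassociate) {
  const FoldResult NoFold = {Fold::None, nullptr};
  if (Mul.Kind != Op::FMul || Mul.LHS == nullptr || Mul.LHS != Mul.RHS)
    return NoFold;

  const Node *Call = Mul.LHS;
  if (AllowReassociate && Call->Kind == Op::Sqrt) {
    assert(Call->LHS && "a call node carries its argument in LHS");
    return FoldResult{Fold::UseOperand, Call->LHS};
  }
  if (Call->Kind == Op::FAbs) {
    assert(Call->LHS && "a call node carries its argument in LHS");
    return FoldResult{Fold::SquareOperand, Call->LHS};
  }
  return NoFold;
}
```

With only two intrinsics and an extra condition on one of them, the chain of if statements stays short, which matches the reasoning given in the review for not using a switch.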

test/Transforms/InstCombine/fmul.ll

 define double @sqrt_squared2(double %f) {
 ...
   %mul1 = fmul fast double %sqrt, %sqrt
   %mul2 = fmul double %mul1, %sqrt
   ret double %mul2
 ; CHECK-LABEL: @sqrt_squared2(
 ; CHECK-NEXT: %sqrt = call double @llvm.sqrt.f64(double %f)
 ; CHECK-NEXT: %mul2 = fmul double %sqrt, %f
 ; CHECK-NEXT: ret double %mul2
 }
+
+declare float @llvm.fabs.f32(float) nounwind readnone
+
+; CHECK-LABEL @fabs_squared(
+; CHECK: %mul = fmul float %x, %x
+define float @fabs_squared(float %x) {
+  %x.fabs = call float @llvm.fabs.f32(float %x)
+  %mul = fmul float %x.fabs, %x.fabs
+  ret float %mul
+}
+
+; CHECK-LABEL @fabs_squared_fast(
+; CHECK: %mul = fmul fast float %x, %x
+define float @fabs_squared_fast(float %x) {
+  %x.fabs = call float @llvm.fabs.f32(float %x)
+  %mul = fmul fast float %x.fabs, %x.fabs
+  ret float %mul
+}
+
+; CHECK-LABEL @fabs_x_fabs(
+; CHECK: call float @llvm.fabs.f32(float %x)
+; CHECK: call float @llvm.fabs.f32(float %y)
+; CHECK: %mul = fmul float %x.fabs, %y.fabs
+define float @fabs_x_fabs(float %x, float %y) {
+  %x.fabs = call float @llvm.fabs.f32(float %x)
+  %y.fabs = call float @llvm.fabs.f32(float %y)
+  %mul = fmul float %x.fabs, %y.fabs
+  ret float %mul
+}
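
The last test, fabs_x_fabs, is a negative test: the multiply of two different fabs results must be left alone. A small standalone C++ sketch shows why, using hypothetical helper names (absProduct, plainProduct, checkMixedSigns) that are not part of the patch. When the two inputs have opposite signs, dropping the fabs calls flips the sign of the product, so the fold is only valid when both operands are the same value.

```cpp
#include <cassert>
#include <cmath>

// fabs(X) * fabs(Y): what the unfolded IR in fabs_x_fabs computes.
inline float absProduct(float X, float Y) { return std::fabs(X) * std::fabs(Y); }

// X * Y: what an incorrect fold of fabs_x_fabs would compute.
inline float plainProduct(float X, float Y) { return X * Y; }

// Aborts if the arithmetic does not behave as described.
inline void checkMixedSigns() {
  // Opposite signs: the two products differ in sign.
  assert(absProduct(-2.0f, 3.0f) == 6.0f);
  assert(plainProduct(-2.0f, 3.0f) == -6.0f);
  // Same operand on both sides: the products match, which is the folded case.
  assert(absProduct(-2.0f, -2.0f) == plainProduct(-2.0f, -2.0f));
}
```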