Download Raw Diff

Details

Reviewers

hfinkel
spatel

Commits

rGe5fbf591a7fd: [InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x)
rL322255: [InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x)

Summary

This patch enables folding sin(x) / cos(x) -> tan(x), cos(x) / sin(x) -> 1 / tan(x) under -ffast-math flag

Diff Detail

Event Timeline

Quolyk created this revision.Dec 15 2017, 5:55 AM

Quolyk mentioned this in D41322: [InstCombine] Missed optimization in math expression: squashing sqrt functions.Dec 16 2017, 7:41 AM

Quolyk mentioned this in D41342: [InstCombine] Missed optimization in math expression: simplify calls exp functions.Dec 18 2017, 1:21 AM

Quolyk retitled this revision from [InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x) to [WIP][InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x).

Quolyk added a reviewer: davide.

Quolyk mentioned this in D41381: [InstSimplify] Missed optimization in math expression: squashing exp(log), log(exp).Dec 19 2017, 2:21 AM

hfinkel added inline comments.Dec 19 2017, 8:10 PM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1456	Use `match` here?
1463	For the name here... Predicate the transform on TLI->has(LibFunc_tan) (or tanf, tanl, depending on the type). Use TLI->getName(LibFunc_tan) (or tanf, tanl, depending on the type).
1464	Needs BuilderTy::FastMathFlagGuard Guard(Builder); above this.

Quolyk updated this revision to Diff 127683.Dec 20 2017, 4:29 AM

Quolyk retitled this revision from [WIP][InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x) to [InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x).

@hfinkel I'd like to thank you for your help and patience reviewing my patches, as I'm new to the community. I hope it's ok when I include you as a reviewer.

Quolyk marked 3 inline comments as done.Dec 20 2017, 4:32 AM

In D41286#960694, @Quolyk wrote:

@hfinkel I'd like to thank you for your help and patience reviewing my patches, as I'm new to the community. I hope it's ok when I include you as a reviewer.

It's certainly fine to include me as a review. Welcome to the community!

Quolyk updated this revision to Diff 128599.Jan 4 2018, 12:52 AM

Quolyk edited the summary of this revision. (Show Details)

spatel added inline comments.Jan 5 2018, 10:43 AM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
560–563	Can you move and reuse the very similar function that's currently in SimplifyLibCalls.cpp? static bool hasUnaryFloatFn(const TargetLibraryInfo TLI, Type Ty, LibFunc DoubleFn, LibFunc FloatFn, LibFunc LongDoubleFn) { See also: /// Emit a call to the unary function named 'Name' (e.g. 'floor'). This /// function is known to take a single of type matching 'Op' and returns one /// value with the same type. If 'Op' is a long double, 'l' is added as the /// suffix of name, if 'Op' is a float, we add a 'f' suffix. Value emitUnaryFloatFnCall(Value Op, StringRef Name, IRBuilder<> &B, const AttributeList &Attrs); ...in BuildLibCalls.h
1457–1460	Use m_Specific to simplify this: // sin(a) / cos(a) -> tan(a) Value *A; if (match(Op0, m_Intrinsic<Intrinsic::sin>(m_Value(A))) && match(Op1, m_Intrinsic<Intrinsic::cos>(m_Specific(A)))) {
test/Transforms/InstCombine/fdiv.ll
89–91	Please vary the fast-ness in these tests or add test(s) that show some variation. I'd prefer to see at least one test where we show the minimum case as we showed in one of the related patches - only the fdiv has relaxed math while the trig calls are strict.

Quolyk updated this revision to Diff 128867.Jan 7 2018, 12:49 AM

Quolyk marked 3 inline comments as done.

spatel added inline comments.Jan 8 2018, 8:05 AM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1454	Should we also handle: cos(a) / sin(a) -> 1 / tan(a) ?
test/Transforms/InstCombine/fdiv.ll
160–162	Do we need to declare tan functions for these tests?

@scanon should sign off this.

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1454	Please wait for @scanon opinion before implementing every possible 10th grade trigonometrical identity.

This revision now requires changes to proceed.Jan 8 2018, 8:07 AM

For reference. this is what GCC generates (although it's unclear whether it's a good idea to follow them)
https://godbolt.org/g/YUUKGE

spatel added inline comments.Jan 8 2018, 10:57 AM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1454	Do you have some specific numerical concern here? As you've noted, this is a well-known math transform. We can make cos(a) / sin(a) a 'TODO' if you think we should use a different transform. https://stackoverflow.com/questions/3738384/stable-cotangent

davide added inline comments.Jan 8 2018, 11:02 AM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1454	The concern is that those transformation can overflow quite dramatically, even for `-ffast-math`. The other concern is that I'm not a numerical expert, so I'd love to have this signed off from somebody who knows better than me. The last concern is, again, we shouldn't pattern match every possible thing just because, it slows down the compiler without real benefit, so, do you know how this pattern is frequent?

spatel added inline comments.Jan 8 2018, 11:55 AM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1454	The compile-time concern is misguided. This pattern, like every other "10th grade trigonometrical identity", should be optimized by an optimizing compiler because that's the job of an optimizing compiler. These patterns can occur by way of templated code, inlining, or because the programmer may not be a computer performance expert. Think: scientists who are trying to model/simulate some math problem, but don't know much about perf...because again: that's the optimizing compiler's job. If this patch or pass is causing a compile-time problem for you, please point to or file a bug. Obstructing patches like this is doubly bad when you're undermining the efforts of new contributors.

davide added inline comments.Jan 8 2018, 12:28 PM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1454	I'm afraid this is not entirely correct. The job of an optimizing compiler is that of making tradeoff on what to optimize, based on cost. I'm not obstructing this patch, I'm asking for a second opinion. If you have a numerical explanation of why this patch can go in, I'll be happy to accept, otherwise I'll defer the review to @scanon.

spatel added inline comments.Jan 8 2018, 1:34 PM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1457–1458	We probably don't want to do this transform if both of the existing values have more than one use; we'd be trading a division for a libcall. If only one value has >1 use, it's probably still ok. Add more tests. :)

If you have a numerical explanation of why this patch can go in, I'll be happy to accept,

I'm not sure what you're looking for. Brute force sin/cos is equal to tan for every possible value?

#include <math.h>
#include <stdio.h>
#include <string.h>

int main() {
  unsigned int i;
  for (i = 0x00000000; i<0x7f800002; i++) {
    float f;
    memcpy(&f, &i, 4);
    float slow = sin(f) / cos(f);
    float fast = tan(f);
    if (slow != fast) {
      printf("\nsin(%f)/cos(%f) = %f, tan(%f) = %f\n", f, f, slow, f, fast);
    }
    if (i % (1024*1024*256) == 0) printf("0x%x...\n", i); 
  }

  return 0;
}

$  clang -O2 tan_checker.c ; ./a.out 
0x0...
0x10000000...
0x20000000...
0x30000000...
0x40000000...
0x50000000...
0x60000000...
0x70000000...

sin(inf)/cos(inf) = nan, tan(inf) = nan

sin(nan)/cos(nan) = nan, tan(nan) = nan

Note that I'm testing on macOS x86 with strict math (the sin and cos calls are replaced by __sincos_stret).
Double-check to make sure I haven't screwed anything up there, but this suggests we could do this transform without -ffast-math?

In D41286#970342, @spatel wrote:

If you have a numerical explanation of why this patch can go in, I'll be happy to accept,

I'm not sure what you're looking for. Brute force sin/cos is equal to tan for every possible value?

I don't think this is feasible (for 64-bit values). -ffast-math is a little fun and complicated, e.g.

log(pow(x, y)) -> y*log(x)

which seems perfectly fine on paper, for x = -1, y = 4

log(pow(-1, 4)) -> 0
4*log(-1) -> NaN.

(courtesy of Steven)

My point is that reasoning about seemingly innocuous algebraic simplification turns out to be harder than expected, and therefore we should make a conscious choice on whether implement these after careful numerical analysis.

In D41286#970343, @davide wrote:

log(pow(x, y)) -> y*log(x)

How is this relevant? Log is only defined for positive numbers, while sin/cos/tan are valid across all numerical inputs.

I don't think there's anything I more I can say here; sorry @Quolyk , I tried. If @hfinkel , @efriedma , @andrew.w.kaylor or anyone else would like to comment that would be great.

I don't feel qualified enough to say whether this can go in, somebody with fast-math experience should comment.

In D41286#970372, @spatel wrote:

In D41286#970343, @davide wrote:

log(pow(x, y)) -> y*log(x)

How is this relevant? Log is only defined for positive numbers, while sin/cos/tan are valid across all numerical inputs.

My point is that fast-math implications can be non-trivial.

davide removed a reviewer: davide.Jan 8 2018, 3:50 PM

Quolyk updated this revision to Diff 129042.Jan 9 2018, 1:04 AM

Quolyk edited the summary of this revision. (Show Details)

I changed the brute force float checker to test 1/tan(x), and it matches cos(x)/sin(x) in all cases on macOS 10.13. This is also true on Ubuntu 17.10 x86-64.

LGTM.

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1474–1475	Could hoist this check to the first 'if' to reduce the duplication.

This revision is now accepted and ready to land.Jan 10 2018, 12:45 PM

Quolyk closed this revision.Jan 10 2018, 10:34 PM

@spatel @davide thanks for review and comments

For reference, @bkramer improved (thanks!) the attribute propagation:
rL322284
rL322285

spatel mentioned this in D41283: [InstCombine] Missed optimization in math expression: tan(a) * cos(a) == sin(a).Jan 12 2018, 10:09 AM

spatel mentioned this in rL325247: [InstCombine] allow sin/cos transforms with 'reassoc'.Feb 15 2018, 7:10 AM

Diff 127115

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

Show First 20 Lines • Show All 551 Lines • ▼ Show 20 Lines	static bool isFMulOrFDivWithConstant(Value *V) {
if (C0 && C1)		if (C0 && C1)
return false;		return false;

return (C0 && isFiniteNonZeroFp(C0)) \|\| (C1 && isFiniteNonZeroFp(C1));		return (C0 && isFiniteNonZeroFp(C0)) \|\| (C1 && isFiniteNonZeroFp(C1));
}		}

/// foldFMulConst() is a helper routine of InstCombiner::visitFMul().		/// foldFMulConst() is a helper routine of InstCombiner::visitFMul().
/// The input \p FMulOrDiv is a FMul/FDiv with one and only one operand		/// The input \p FMulOrDiv is a FMul/FDiv with one and only one operand
/// being a constant (i.e. isFMulOrFDivWithConstant(FMulOrDiv) == true).		/// being a constant (i.e. isFMulOrFDivWithConstant(FMulOrDiv) == true).
/// This function is to simplify "FMulOrDiv * C" and returns the		/// This function is to simplify "FMulOrDiv * C" and returns the
/// resulting expression. Note that this function could return NULL in		/// resulting expression. Note that this function could return NULL in
/// case the constants cannot be folded into a normal floating-point.		/// case the constants cannot be folded into a normal floating-point.
		spatelUnsubmitted Done Reply Inline Actions Can you move and reuse the very similar function that's currently in SimplifyLibCalls.cpp? static bool hasUnaryFloatFn(const TargetLibraryInfo TLI, Type Ty, LibFunc DoubleFn, LibFunc FloatFn, LibFunc LongDoubleFn) { See also: /// Emit a call to the unary function named 'Name' (e.g. 'floor'). This /// function is known to take a single of type matching 'Op' and returns one /// value with the same type. If 'Op' is a long double, 'l' is added as the /// suffix of name, if 'Op' is a float, we add a 'f' suffix. Value emitUnaryFloatFnCall(Value Op, StringRef Name, IRBuilder<> &B, const AttributeList &Attrs); ...in BuildLibCalls.h spatel: Can you move and reuse the very similar function that's currently in SimplifyLibCalls.cpp?
Value InstCombiner::foldFMulConst(Instruction FMulOrDiv, Constant *C,		Value InstCombiner::foldFMulConst(Instruction FMulOrDiv, Constant *C,
Instruction *InsertBefore) {		Instruction *InsertBefore) {
assert(isFMulOrFDivWithConstant(FMulOrDiv) && "V is invalid");		assert(isFMulOrFDivWithConstant(FMulOrDiv) && "V is invalid");

Value *Opnd0 = FMulOrDiv->getOperand(0);		Value *Opnd0 = FMulOrDiv->getOperand(0);
Value *Opnd1 = FMulOrDiv->getOperand(1);		Value *Opnd1 = FMulOrDiv->getOperand(1);

Constant *C0 = dyn_cast<Constant>(Opnd0);		Constant *C0 = dyn_cast<Constant>(Opnd0);
▲ Show 20 Lines • Show All 874 Lines • ▼ Show 20 Lines	if (AllowReassociate) {
if (NewInst) {		if (NewInst) {
if (Instruction *T = dyn_cast<Instruction>(NewInst))		if (Instruction *T = dyn_cast<Instruction>(NewInst))
T->setDebugLoc(I.getDebugLoc());		T->setDebugLoc(I.getDebugLoc());
SimpR->setFastMathFlags(I.getFastMathFlags());		SimpR->setFastMathFlags(I.getFastMathFlags());
return SimpR;		return SimpR;
}		}
}		}

		// sin(a) / cos(a) -> tan(a)
		spatelUnsubmitted Not Done Reply Inline Actions Should we also handle: cos(a) / sin(a) -> 1 / tan(a) ? spatel: Should we also handle: cos(a) / sin(a) -> 1 / tan(a) ?
		davideUnsubmitted Not Done Reply Inline Actions Please wait for @scanon opinion before implementing every possible 10th grade trigonometrical identity. davide: Please wait for @scanon opinion before implementing every possible 10th grade trigonometrical…
		spatelUnsubmitted Not Done Reply Inline Actions Do you have some specific numerical concern here? As you've noted, this is a well-known math transform. We can make cos(a) / sin(a) a 'TODO' if you think we should use a different transform. https://stackoverflow.com/questions/3738384/stable-cotangent spatel: Do you have some specific numerical concern here? As you've noted, this is a well-known math…
		davideUnsubmitted Not Done Reply Inline Actions The concern is that those transformation can overflow quite dramatically, even for `-ffast-math`. The other concern is that I'm not a numerical expert, so I'd love to have this signed off from somebody who knows better than me. The last concern is, again, we shouldn't pattern match every possible thing just because, it slows down the compiler without real benefit, so, do you know how this pattern is frequent? davide: The concern is that those transformation can overflow quite dramatically, even for `-ffast…
		spatelUnsubmitted Not Done Reply Inline Actions The compile-time concern is misguided. This pattern, like every other "10th grade trigonometrical identity", should be optimized by an optimizing compiler because that's the job of an optimizing compiler. These patterns can occur by way of templated code, inlining, or because the programmer may not be a computer performance expert. Think: scientists who are trying to model/simulate some math problem, but don't know much about perf...because again: that's the optimizing compiler's job. If this patch or pass is causing a compile-time problem for you, please point to or file a bug. Obstructing patches like this is doubly bad when you're undermining the efforts of new contributors. spatel: The compile-time concern is misguided. This pattern, like every other "10th grade…
		davideUnsubmitted Not Done Reply Inline Actions I'm afraid this is not entirely correct. The job of an optimizing compiler is that of making tradeoff on what to optimize, based on cost. I'm not obstructing this patch, I'm asking for a second opinion. If you have a numerical explanation of why this patch can go in, I'll be happy to accept, otherwise I'll defer the review to @scanon. davide: I'm afraid this is not entirely correct. The job of an optimizing compiler is that of making…
		if (AllowReassociate) {
		if (IntrinsicInst *Sin = dyn_cast<IntrinsicInst>(Op0)) {
		hfinkelUnsubmitted Done Reply Inline Actions Use `match` here? hfinkel: Use `match` here?
		if (IntrinsicInst *Cos = dyn_cast<IntrinsicInst>(Op1)) {
		if (Sin->getIntrinsicID() == Intrinsic::sin &&
		spatelUnsubmitted Not Done Reply Inline Actions We probably don't want to do this transform if both of the existing values have more than one use; we'd be trading a division for a libcall. If only one value has >1 use, it's probably still ok. Add more tests. :) spatel: We probably don't want to do this transform if both of the existing values have more than one…
		Cos->getIntrinsicID() == Intrinsic::cos &&
		Sin->getArgOperand(0) == Cos->getArgOperand(0)) {
		spatelUnsubmitted Done Reply Inline Actions Use m_Specific to simplify this: // sin(a) / cos(a) -> tan(a) Value A; if (match(Op0, m_Intrinsic<Intrinsic::sin>(m_Value(A))) && match(Op1, m_Intrinsic<Intrinsic::cos>(m_Specific(A)))) { spatel:* Use m_Specific to simplify this: // sin(a) / cos(a) -> tan(a) Value *A; if (match(Op0…
		Module *M = I.getModule();
		Type *Ty = I.getType();
		Value *Tan = M->getOrInsertFunction("tan", Ty, Ty);
		hfinkelUnsubmitted Done Reply Inline Actions For the name here... Predicate the transform on TLI->has(LibFunc_tan) (or tanf, tanl, depending on the type). Use TLI->getName(LibFunc_tan) (or tanf, tanl, depending on the type). hfinkel: For the name here... 1. Predicate the transform on TLI->has(LibFunc_tan) (or tanf, tanl…
		Builder.setFastMathFlags(I.getFastMathFlags());
		hfinkelUnsubmitted Done Reply Inline Actions Needs BuilderTy::FastMathFlagGuard Guard(Builder); above this. hfinkel: Needs BuilderTy::FastMathFlagGuard Guard(Builder); above this.
		Value *TanCall = Builder.CreateCall(Tan, Cos->getArgOperand(0), "tan");
		return replaceInstUsesWith(I, TanCall);
		}
		}
		}
		}

Value *LHS;		Value *LHS;
Value *RHS;		Value *RHS;

// -x / -y -> x / y		// -x / -y -> x / y
		spatelUnsubmitted Not Done Reply Inline Actions Could hoist this check to the first 'if' to reduce the duplication. spatel: Could hoist this check to the first 'if' to reduce the duplication.
if (match(Op0, m_FNeg(m_Value(LHS))) && match(Op1, m_FNeg(m_Value(RHS)))) {		if (match(Op0, m_FNeg(m_Value(LHS))) && match(Op1, m_FNeg(m_Value(RHS)))) {
I.setOperand(0, LHS);		I.setOperand(0, LHS);
I.setOperand(1, RHS);		I.setOperand(1, RHS);
return &I;		return &I;
}		}

return nullptr;		return nullptr;
}		}
▲ Show 20 Lines • Show All 174 Lines • Show Last 20 Lines

test/Transforms/InstCombine/fdiv.ll

	Show First 20 Lines • Show All 61 Lines • ▼ Show 20 Lines
	; CHECK-LABEL @fdiv_fneg_fneg_fast(			; CHECK-LABEL @fdiv_fneg_fneg_fast(
	; CHECK: %div = fdiv fast float %x, %y			; CHECK: %div = fdiv fast float %x, %y
	define float @fdiv_fneg_fneg_fast(float %x, float %y) {			define float @fdiv_fneg_fneg_fast(float %x, float %y) {
	%x.fneg = fsub float -0.0, %x			%x.fneg = fsub float -0.0, %x
	%y.fneg = fsub float -0.0, %y			%y.fneg = fsub float -0.0, %y
	%div = fdiv fast float %x.fneg, %y.fneg			%div = fdiv fast float %x.fneg, %y.fneg
	ret float %div			ret float %div
	}			}

				; CHECK-LABEL @fdiv_sin_cos(
				; CHECK: call fast double @tan(double %a)
				define double @fdiv_sin_cos(double %a) {
				entry:
				%0 = call fast double @llvm.sin.f64(double %a)
				%1 = call fast double @llvm.cos.f64(double %a)
				%div = fdiv fast double %0, %1
				ret double %div
				}

				declare double @llvm.sin.f64(double) nounwind readnone speculatable
				declare double @llvm.cos.f64(double) nounwind readnone speculatable
				declare double @tan(double)
				spatelUnsubmitted Done Reply Inline Actions Please vary the fast-ness in these tests or add test(s) that show some variation. I'd prefer to see at least one test where we show the minimum case as we showed in one of the related patches - only the fdiv has relaxed math while the trig calls are strict. spatel: Please vary the fast-ness in these tests or add test(s) that show some variation. I'd prefer to…
				spatelUnsubmitted Not Done Reply Inline Actions Do we need to declare tan functions for these tests? spatel: Do we need to declare tan functions for these tests?

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x)
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 127115

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

test/Transforms/InstCombine/fdiv.ll

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x)ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 127115

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

test/Transforms/InstCombine/fdiv.ll

[InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x)
ClosedPublic