This is an archive of the discontinued LLVM Phabricator instance.

Differential D41659

Implementing missing trigonometric optimizations
Needs Review, Public

Authored by cs15btech11041 on Jan 2 2018, 12:53 AM.

Details

Reviewers

majnemer
craig.topper
davide
scanon
escha

Summary

Here we have implemented the following missing trigonometric optimizations.

tan(x) * cos(x) = sin(x)
sin(x) * cos(x) = sin(2*x) / 2
sin(x) / tan(x) = cos(x)
tan(x) / sin(x) = 1 / cos(x)
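
As a quick sanity check on the four rewrites listed above, here is a minimal standalone C++ sketch (not part of the patch) that measures how far each original expression is from its proposed replacement at a single input. The helper name `maxIdentityGap` is hypothetical, and the sketch assumes IEEE-754 doubles and a build without -ffast-math.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the patch): returns the largest absolute gap
// between each original expression and its proposed replacement at input x.
double maxIdentityGap(double x) {
  assert(std::isfinite(x) && "sketch expects a finite input");
  const double gaps[] = {
      std::fabs(std::tan(x) * std::cos(x) - std::sin(x)),         // tan*cos -> sin
      std::fabs(std::sin(x) * std::cos(x) - std::sin(2 * x) / 2), // sin*cos -> sin(2x)/2
      std::fabs(std::sin(x) / std::tan(x) - std::cos(x)),         // sin/tan -> cos
      std::fabs(std::tan(x) / std::sin(x) - 1 / std::cos(x)),     // tan/sin -> 1/cos
  };
  double worst = 0.0;
  for (double g : gaps)
    worst = std::fmax(worst, g);
  return worst;
}
```

At an ordinary input such as `maxIdentityGap(0.7)` the gap should be tiny, on the order of rounding error. This says nothing about inputs near the zeros of sin or the poles of tan, where the left-hand sides are not well defined.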

The missing optimization was reported on Bugzilla:

https://bugs.llvm.org/show_bug.cgi?id=35602

Diff Detail

Repository: rL LLVM

Event Timeline

cs15btech11041 created this revision. Jan 2 2018, 12:53 AM

As mentioned in llvm-dev, it looks like David Majnemer and Craig Topper might be appropriate reviewers for this patch (or can perhaps help to suggest other reviewers).

It looks like you've run clang-format on the entire lib/Transforms/InstCombine/InstructionCombining.cpp file. This makes it harder to review the changes you've made. Best practice is to only clang-format lines that you touch. I use the git-clang-format helper script for this, and it looks like clang-format-diff.py can help if you're using SVN:

https://github.com/llvm-mirror/clang/blob/master/tools/clang-format/git-clang-format
https://github.com/llvm-mirror/clang/blob/master/tools/clang-format/clang-format-diff.py

Thanks for your patch!

Usually it is easier to review smaller patches. Splitting it up into 4 patches for the 4 different cases you implemented would make it slightly easier to reason about the 4 unrelated transformations in isolation.

This may need a special guard with fast-math flags (precision, etc), no?

Meta point: It's unclear to me whether this is something profitable to implement.
Sure, you can probably take a textbook and implement all the possible identities written there, but, what's the point?

I looked and it seems other compilers (most notably, GCC) don't implement at least the first transformation you implemented (I didn't bother to check the others).
https://godbolt.org/g/kKP5PW

tl;dr: what's your motivation?

This revision now requires changes to proceed. Jan 2 2018, 3:21 AM

Something else you may want to keep in mind is that these transformations may introduce dramatic rounding errors, so they need some thought and analysis before they can go in (cc: @scanon), even under -ffast-math.
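
To make this concern concrete, the following standalone C++ sketch (an illustration, not code from the patch) evaluates two of the rewrites at inputs where the original and the replacement expression disagree outright. The names `BeforeAfter`, `sinDivTan`, `sinMulCos` and `demoEdgeCases` are hypothetical. The sketch assumes IEEE-754 doubles, a math library that follows C99 Annex F, and that the sketch itself is compiled without -ffast-math.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical pair type: value of the original expression and of its rewrite.
struct BeforeAfter {
  double before;
  double after;
};

// sin(x)/tan(x) versus the proposed replacement cos(x).
BeforeAfter sinDivTan(double x) {
  return {std::sin(x) / std::tan(x), std::cos(x)};
}

// sin(x)*cos(x) versus the proposed replacement sin(2*x)/2.
BeforeAfter sinMulCos(double x) {
  return {std::sin(x) * std::cos(x), std::sin(2 * x) / 2};
}

// Usage sketch: two inputs where the rewrite changes the result category.
void demoEdgeCases() {
  // At x = 0 the original is 0/0 (NaN), while cos(0) is exactly 1.
  BeforeAfter a = sinDivTan(0.0);
  assert(std::isnan(a.before) && a.after == 1.0);

  // At the largest finite double, 2*x overflows to infinity and sin(inf)
  // is NaN, while the original product of two bounded values stays finite.
  BeforeAfter b = sinMulCos(std::numeric_limits<double>::max());
  assert(std::isfinite(b.before) && std::isnan(b.after));
}
```

Whether differences of this kind are acceptable once fast-math flags are present on the instruction is exactly the question that needs analysis; the sketch only shows that the expressions are not interchangeable in strict IEEE arithmetic.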

davide added reviewers: scanon, escha. Jan 2 2018, 3:23 AM

spatel added a subscriber: spatel. Jan 2 2018, 7:50 AM

aprantl added a subscriber: aprantl. Jan 2 2018, 9:54 AM

aprantl added inline comments.

lib/Transforms/InstCombine/InstructionCombining.cpp
Line 1431: Please remove the \brief, it is redundant.
Line 1442: Please always use full sentences in documentation: `// Return nullptr if argument to calls are not same.`

Updates made:

1. Made the test cases more modular, as mentioned in the comments.
2. Instead of running clang-format on the entire InstructionCombining.cpp, clang-formatted only the modified/inserted code so the changes are easier to follow.
3. Improved the comments.

In D41659#965688, @asb wrote:

As mentioned in llvm-dev, it looks like David Majnemer and Craig Topper might be appropriate reviewers for this patch (or can perhaps help to suggest other reviewers).

Hi, thanks for the comment. Thanks for adding appropriate reviewers for the patch.

It looks like you've run clang-format on the entire lib/Transforms/InstCombine/InstructionCombining.cpp file. This makes it harder to review the changes you've made. Best practice is to only clang-format lines that you touch. I use the git-clang-format helper script for this, and it looks like clang-format-diff.py can help if you're using SVN:

As suggested, I have updated the diff so that clang-format is applied only to the modified/inserted code. Hopefully this makes the review easier. Thanks for the suggestion.

https://github.com/llvm-mirror/clang/blob/master/tools/clang-format/git-clang-format
https://github.com/llvm-mirror/clang/blob/master/tools/clang-format/clang-format-diff.py

In D41659#965694, @fhahn wrote:

Thanks for your patch!

Usually it is easier to review smaller patches. Splitting it up into 4 patches for the 4 different cases you implemented would make it slightly easier to reason about the 4 unrelated transformations in isolation.

Yes. Thank you for the suggestions. I have updated the diff to incorporate the changes you suggested and submitted a separate test case for each optimization in isolation. I hope this makes the patch easier to review.

In D41659#965696, @rengolin wrote:

This may need special guard with fast-math flags (precision, etc), no?

I have added a FastMathFlagGuard to the IRBuilder and copied the fast-math flags of the original division/multiplication instruction to the newly inserted trigonometric/multiplication/division instruction. As for precision, we are replacing two trigonometric functions with a single equivalent trigonometric function under fast-math flag guards. The type we specify in CreateCall is also the same as that of the division/multiplication instruction, so the result has the same type and precision. Please let me know if I did not answer your question.

In D41659#965699, @davide wrote:

Meta point: It's unclear to me whether this is something profitable to implement.
Sure, you can probably take a textbook and implement all the possible identities written there, but, what's the point?

I looked and it seems other compilers (most notably, GCC) don't implement at least the first transformation you implemented (I didn't bother to check the others).
https://godbolt.org/g/kKP5PW

tl;dr: what's your motivation?

Hi, thanks for the comments. There were two motivations for taking this up.
The first was the missing optimization reported on Bugzilla. That report mentions only the first optimization implemented here; I have implemented the other, similar cases as well.

https://bugs.llvm.org/show_bug.cgi?id=35602

Second, replacing two trigonometric operations with a single trigonometric operation, or with a mul/div instruction, would save considerable clock cycles. According to the Stack Overflow discussion linked below, trigonometric operations require almost five times the clock cycles needed by multiplication/division. We are also reducing code size: as the test cases show, the number of instructions after the transformations is <= the number of instructions before.

https://stackoverflow.com/questions/2479517/is-trigonometry-computationally-expensive

Added the right file for the sin_div_tan.ll test case (an incorrect file was added in the previous commit).

In D41659#965694, @fhahn wrote:

Thanks for your patch!

Usually it is easier to review smaller patches. Splitting it up into 4 patches for the 4 different cases you implemented would make it slightly easier to reason about the 4 unrelated transformations in isolation.

Strongly agree. I understand there's a common theme here, but it will save time making redundant comments if we start with a patch for just one of these transforms.

The structure of the patch is not correct. We don't need a trig helper function because it's impossible for an integer op like mul or udiv to have an FP operand like llvm.sin.
Please have a look at these patches for the correct way to match intrinsics and libcalls:
D41322
D41283
D41389

In D41659#966397, @cs15btech11041 wrote:
Yes. Thank you for the suggestions. I have updated the diff to incorporate the changes you suggested and submitted a separate test case for each optimization in isolation. I hope this makes the patch easier to review.

I think I wasn't too clear, sorry. I meant splitting up the whole patch into 4 different patches and submitting them as separate reviews, i.e. one review only including the code & test for tan(x)*cos(x)=sin(x), one for sin(x)*cos(x) = sin(2*x)/2 and so on.

This also makes it slightly easier to decide if each transform is beneficial on a case-by-case basis.

Revision Contents

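Path                                                           Size
lib/Transforms/InstCombine/InstCombineInternal.h               1 line
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp            15 lines
lib/Transforms/InstCombine/InstructionCombining.cpp            131 lines
test/Transforms/InstCombine/combined.ll                        24 lines
test/Transforms/InstCombine/combined_different_arguments.ll    24 lines
test/Transforms/InstCombine/sin_div_tan.ll                     18 lines
test/Transforms/InstCombine/sin_mul_cos.ll                     18 lines
test/Transforms/InstCombine/tan_div_sin.ll                     19 lines
test/Transforms/InstCombine/tan_mul_cos.ll                     19 lines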

Diff 128490

lib/Transforms/InstCombine/InstCombineInternal.h

Context not available.

   Value *SimplifyVectorOp(BinaryOperator &Inst);

+  Value *SimplifyTrigOp(Instruction &Inst);
+
   /// Given a binary operator, cast instruction, or select which has a PHI node
   /// as operand #0, see if we can fold the instruction into the PHI (which is
Context not available.

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

Context not available.
   bool Changed = SimplifyAssociativeOrCommutative(I);
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

+  if (Value *V = SimplifyTrigOp(I))
+    return replaceInstUsesWith(I, V);
+
   if (Value *V = SimplifyVectorOp(I))
     return replaceInstUsesWith(I, V);

Context not available.
   bool Changed = SimplifyAssociativeOrCommutative(I);
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

+  if (Value *V = SimplifyTrigOp(I))
+    return replaceInstUsesWith(I, V);
+
   if (Value *V = SimplifyVectorOp(I))
     return replaceInstUsesWith(I, V);

Context not available.
 Instruction *InstCombiner::visitUDiv(BinaryOperator &I) {
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

+  if (Value *V = SimplifyTrigOp(I))
+    return replaceInstUsesWith(I, V);
+
   if (Value *V = SimplifyVectorOp(I))
     return replaceInstUsesWith(I, V);

Context not available.
 Instruction *InstCombiner::visitSDiv(BinaryOperator &I) {
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

+  if (Value *V = SimplifyTrigOp(I))
+    return replaceInstUsesWith(I, V);
+
   if (Value *V = SimplifyVectorOp(I))
     return replaceInstUsesWith(I, V);

Context not available.
 Instruction *InstCombiner::visitFDiv(BinaryOperator &I) {
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);

+  if (Value *V = SimplifyTrigOp(I))
+    return replaceInstUsesWith(I, V);
+
   if (Value *V = SimplifyVectorOp(I))
     return replaceInstUsesWith(I, V);

Context not available.

lib/Transforms/InstCombine/InstructionCombining.cpp

Context not available.
   return BO;
 }

+static bool isTrigLibCall(CallInst *CI) {
+  // We can only hope to do anything useful if we can ignore things like errno
+  // and floating-point exceptions.
+  // We already checked the prototype.
+  return CI->hasFnAttr(Attribute::NoUnwind) &&
+         CI->hasFnAttr(Attribute::ReadNone);
+}
+
+/// Replaces the division or multiplication operations on optimizable trigonometric
+/// functions to equivalent trigonometric operation
+Value *InstCombiner::SimplifyTrigOp(Instruction &I) {
+
+  if (((I.getOpcode() == Instruction::FMul) ||
+       (I.getOpcode() == Instruction::Mul))) {
+    if (isa<CallInst>(I.getOperand(0)) && isa<CallInst>(I.getOperand(1))) {
+      CallInst *call1 = dyn_cast<CallInst>(I.getOperand(0));
+      CallInst *call2 = dyn_cast<CallInst>(I.getOperand(1));
+
+      // return nullptr if argument to calls are not same. eg tan(x)*tan(y) etc.
+      if (!(call1->getOperand(0) == call2->getOperand(0)))
+        return nullptr;
+
+      // tan(x)*cos(x)=sin(x);
+
+      Value *Op1 = call1->getArgOperand(1);
+      Value *Op2 = call2->getArgOperand(1);
+
+      if ((isTrigLibCall(call1) &&
+           ((Op1->getName() == "tan") || (Op1->getName() == "tanf") ||
+            (Op1->getName() == "tanf")) &&
+           (Op2->getName().startswith("llvm.cos"))) ||
+          (isTrigLibCall(call2) && (Op1->getName().startswith("llvm.cos")) &&
+           ((Op2->getName() == "tan") || (Op2->getName() == "tanf") ||
+            (Op2->getName() == "tanl")))) {
+        IRBuilder<>::FastMathFlagGuard Guard(Builder);
+        Builder.setFastMathFlags(I.getFastMathFlags());
+        Function *fun = Intrinsic::getDeclaration(I.getModule(), Intrinsic::sin,
+                                                  I.getType());
+        CallInst *SinCall = Builder.CreateCall(fun, call2->getOperand(0));
+
+        SinCall->takeName(&I);
+        SinCall->copyFastMathFlags(&I);
+
+        return SinCall;
+      }
+
+      // sin(x)*cos(x)=sin(2*x)*0.5
+
+      else if (((call1->getOperand(1)->getName().startswith("llvm.sin")) &&
+                (call2->getOperand(1)->getName().startswith("llvm.cos"))) ||
+               ((call1->getOperand(1)->getName().startswith("llvm.cos")) &&
+                (call2->getOperand(1)->getName().startswith("llvm.sin")))) {
+        IRBuilder<>::FastMathFlagGuard Guard(Builder);
+        Builder.setFastMathFlags(I.getFastMathFlags());
+        Function *fun = Intrinsic::getDeclaration(I.getModule(), Intrinsic::sin,
+                                                  I.getType());
+
+        Constant *ConstFloatTwo =
+            ConstantFP::get(Type::getDoubleTy(I.getContext()), 2.0);
+        Instruction *MulIns = dyn_cast<Instruction>(
+            Builder.CreateFMul(ConstFloatTwo, call1->getOperand(0)));
+        CallInst *SinCall = Builder.CreateCall(fun, MulIns);
+        Instruction *DivIns =
+            dyn_cast<Instruction>(Builder.CreateFDiv(SinCall, ConstFloatTwo));
+        DivIns->takeName(&I);
+        DivIns->copyFastMathFlags(&I);
+
+        return DivIns;
+      }
+    }
+  }
+
+  else if ((I.getOpcode() == Instruction::FDiv) ||
+           (I.getOpcode() == Instruction::SDiv) ||
+           (I.getOpcode() == Instruction::UDiv)) {
+    if (isa<CallInst>(I.getOperand(0)) && isa<CallInst>(I.getOperand(1))) {
+      CallInst *call1 = dyn_cast<CallInst>(I.getOperand(0));
+      CallInst *call2 = dyn_cast<CallInst>(I.getOperand(1));
+
+      // return nullptr if argument to calls are not same
+
+      if (!(call1->getOperand(0) == call2->getOperand(0)))
+        return nullptr;
+
+      Value *Op1 = call1->getArgOperand(1);
+      Value *Op2 = call2->getArgOperand(1);
+
+      // sin/tan=cos;
+      if ((Op1->getName().startswith("llvm.sin")) &&
+
+          (isTrigLibCall(call2) &&
+           ((Op2->getName() == "tan") || (Op2->getName() == "tanf") ||
+            (Op2->getName() == "tanl")))) {
+        IRBuilder<>::FastMathFlagGuard Guard(Builder);
+        Builder.setFastMathFlags(I.getFastMathFlags());
+        Function *fun = Intrinsic::getDeclaration(I.getModule(), Intrinsic::cos,
+                                                  I.getType());
+        CallInst *CosCall = Builder.CreateCall(fun, call1->getOperand(0));
+        CosCall->takeName(&I);
+        CosCall->copyFastMathFlags(&I);
+        return CosCall;
+      }
+
+      // tan/sin=1/cos;
+      else if ((Op2->getName().startswith("llvm.sin")) &&
+
+               (isTrigLibCall(call1) &&
+                ((Op1->getName() == "tan") || (Op1->getName() == "tanf") ||
+                 (Op1->getName() == "tanl")))) {
+        IRBuilder<>::FastMathFlagGuard Guard(Builder);
+        Builder.setFastMathFlags(I.getFastMathFlags());
+        Function *fun = Intrinsic::getDeclaration(I.getModule(), Intrinsic::cos,
+                                                  I.getType());
+        CallInst *CosCall = Builder.CreateCall(fun, call1->getOperand(0));
+        Constant *ConstFloatOne =
+            ConstantFP::get(Type::getDoubleTy(I.getContext()), 1.0);
+
+        Instruction *DivIns =
+            dyn_cast<Instruction>(Builder.CreateFDiv(ConstFloatOne, CosCall));
+        DivIns->takeName(&I);
+        DivIns->copyFastMathFlags(&I);
+
+        return DivIns;
+      }
+    }
+  }
+
+  return nullptr;
+}
+
 /// \brief Makes transformation of binary operation specific for vector types.
 /// \param Inst Binary operator to transform.
 /// \return Pointer to node that must replace the original binary operator, or
Context not available.

test/Transforms/InstCombine/combined.ll

This file was added.

; RUN: opt < %s -instcombine -S | FileCheck %s

declare double @tan(double)
declare double @llvm.cos.f64(double)
declare double @llvm.sin.f64(double)

define double @_Z8combinedd(double) {
  %2 = call fast double @llvm.sin.f64(double %0)
  %3 = call fast double @tan(double %0) #8
  %4 = fdiv fast double %2, %3
  %5 = call fast double @tan(double %0) #8
  %6 = call fast double @llvm.cos.f64(double %0)
  %7 = fmul fast double %5, %6
  %8 = fmul fast double %4, %7
  ret double %8

; CHECK-LABEL: @_Z8combinedd(
; CHECK-NEXT: %2 = fmul fast double %0, 2.000000e+00
; CHECK-NEXT: %3 = call fast double @llvm.sin.f64(double %2)
; CHECK-NEXT: %4 = fmul fast double %3, 5.000000e-01
; CHECK-NEXT: ret double %4
}

attributes #8 = { nounwind readnone }

test/Transforms/InstCombine/combined_different_arguments.ll

This file was added.

; RUN: opt < %s -instcombine -S | FileCheck %s

declare double @tan(double)
declare double @llvm.cos.f64(double)
declare double @llvm.sin.f64(double)

define double @_Z19different_argumentsdd(double, double) {
  %3 = call fast double @llvm.sin.f64(double %0)
  %4 = call fast double @tan(double %0) #8
  %5 = fdiv fast double %3, %4
  %6 = call fast double @tan(double %1) #8
  %7 = call fast double @llvm.cos.f64(double %1)
  %8 = fmul fast double %6, %7
  %9 = fmul fast double %5, %8
  ret double %9

; CHECK-LABEL: @_Z19different_argumentsdd(
; CHECK-NEXT: %3 = call fast double @llvm.cos.f64(double %0)
; CHECK-NEXT: %4 = call fast double @llvm.sin.f64(double %1)
; CHECK-NEXT: %5 = fmul fast double %3, %4
; CHECK-NEXT: ret double %5
}

attributes #8 = { nounwind readnone }

test/Transforms/InstCombine/sin_div_tan.ll

This file was added.

; RUN: opt < %s -instcombine -S | FileCheck %s

declare double @tan(double)
declare double @llvm.cos.f64(double)
declare double @llvm.sin.f64(double)

define double @_Z11sin_div_tand(double) {
  %2 = call fast double @llvm.sin.f64(double %0)
  %3 = call fast double @tan(double %0) #8
  %4 = fdiv fast double %2, %3
  ret double %4

; CHECK-LABEL: @_Z11sin_div_tand(
; CHECK-NEXT: %2 = call fast double @llvm.cos.f64(double %0)
; CHECK-NEXT: ret double %2
}

attributes #8 = { nounwind readnone }

test/Transforms/InstCombine/sin_mul_cos.ll

This file was added.

; RUN: opt < %s -instcombine -S | FileCheck %s

declare double @llvm.cos.f64(double)
declare double @llvm.sin.f64(double)

define double @_Z11sin_mul_cosd(double) {
  %2 = call fast double @llvm.sin.f64(double %0)
  %3 = call fast double @llvm.cos.f64(double %0)
  %4 = fmul fast double %2, %3
  ret double %4

; CHECK-LABEL: @_Z11sin_mul_cosd(
; CHECK-NEXT: %2 = fmul fast double %0, 2.000000e+00
; CHECK-NEXT: %3 = call fast double @llvm.sin.f64(double %2)
; CHECK-NEXT: %4 = fmul fast double %3, 5.000000e-01
; CHECK-NEXT: ret double %4
}

test/Transforms/InstCombine/tan_div_sin.ll

This file was added.

; RUN: opt < %s -instcombine -S | FileCheck %s

declare double @tan(double)
declare double @llvm.cos.f64(double)
declare double @llvm.sin.f64(double)

define double @_Z11tan_div_sind(double) {
  %2 = call fast double @tan(double %0) #8
  %3 = call fast double @llvm.sin.f64(double %0)
  %4 = fdiv fast double %2, %3
  ret double %4

; CHECK-LABEL: @_Z11tan_div_sind(
; CHECK-NEXT: %2 = call fast double @llvm.cos.f64(double %0)
; CHECK-NEXT: %3 = fdiv fast double 1.000000e+00, %2
; CHECK-NEXT: ret double %3
}

attributes #8 = { nounwind readnone }

test/Transforms/InstCombine/tan_mul_cos.ll

This file was added.

; RUN: opt < %s -instcombine -S | FileCheck %s

declare double @tan(double)
declare double @llvm.cos.f64(double)
declare double @llvm.sin.f64(double)

define double @_Z11tan_mul_cosd(double) {
  %2 = call fast double @tan(double %0) #8
  %3 = call fast double @llvm.cos.f64(double %0)
  %4 = fmul fast double %2, %3
  ret double %4

; CHECK-LABEL:@_Z11tan_mul_cosd(
; CHECK-NEXT: %2 = call fast double @llvm.sin.f64(double %0)
; CHECK-NEXT: ret double %2

}

attributes #8 = { nounwind readnone }