This is an archive of the discontinued LLVM Phabricator instance.

Differential D62699

[InlineCost] Add support for unary fneg.
ClosedPublic

Authored by craig.topper on May 30 2019, 1:30 PM.

Details

Reviewers

cameron.mcinally
spatel
fhahn
bjope
efriedma

Commits

rG6cda33ba3642: [InlineCost] Add support for unary fneg.
rL362732: [InlineCost] Add support for unary fneg.

Summary

This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost.

Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will now pass the fast-math flags to SimplifyFNegInst.

Unfortunately, SimplifyFNegInst does not currently do anything with the fastmath flags so I don't think this changes any behavior. Thus I don't know how to test it.

Diff Detail

Repository: rL LLVM

Event Timeline

craig.topper created this revision. May 30 2019, 1:30 PM

Herald added a project: Restricted Project. May 30 2019, 1:30 PM

Herald added subscribers: haicheng, hiraditya, kristof.beyls and 2 others.

efriedma added inline comments. May 30 2019, 2:32 PM

llvm/lib/Analysis/InlineCost.cpp
1131 (On Diff #202283): Is this actually true? fneg should be one or two native instructions for almost every combination of type/target, even ones that don't have native floating-point support.

craig.topper marked an inline comment as done. May 30 2019, 2:40 PM

craig.topper added inline comments.

llvm/lib/Analysis/InlineCost.cpp
1131 (On Diff #202283): Probably not. But it is the answer we would have gotten for fsub -0.0, %x. Should we exclude both forms of writing fneg from this?

LGTM.

I'm not that familiar with InlineCost, but I see that almost all of this is copied from visitBinaryOperator(...), so I'm pretty confident in the review.

I was able to test the increased fp cost. The included test case will report a cost of 100 before this patch and 125 after this patch. I based this on an earlier arm test case in the test file and just changed an fmul by constant to an fneg. I can pre-commit this test but wanted opinions on if this test needs changes.

FMul seems like a good first attempt. Maybe a cost of XOR by constant is better, but they're probably pretty similar anyway.

This revision is now accepted and ready to land. May 30 2019, 2:57 PM

cameron.mcinally added inline comments. May 30 2019, 3:02 PM

llvm/lib/Analysis/InlineCost.cpp
1131 (On Diff #202283): This seems like a good check to have around for a general UnaryOperator. Of course there's only one right now, but who knows in the future...

FMul seems like a good first attempt. Maybe a cost of XOR by constant is better, but they're probably pretty similar anyway.

We're talking about the difference in cost between a runtime call and a xor by a constant; those are not really similar.
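
To make the "xor by a constant" side of that comparison concrete, here is a minimal sketch of negating a float by flipping its sign bit. It assumes `float` is a 32-bit IEEE-754 type; `fnegViaXor` is a hypothetical helper name used only for illustration and is not part of LLVM.

```cpp
#include <cstdint>
#include <cstring>

// Negate a float by toggling its sign bit, with no floating-point
// arithmetic involved. Assumes a 32-bit IEEE-754 float.
float fnegViaXor(float X) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch expects a 32-bit float");
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits)); // bit-cast without aliasing issues
  Bits ^= 0x80000000u;                  // flip only the sign bit
  float Result;
  std::memcpy(&Result, &Bits, sizeof(Result));
  return Result;
}
```

Because the operation is a single integer xor, a target without floating-point hardware can still perform it inline rather than through a library call.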

llvm/lib/Analysis/InlineCost.cpp
1131 (On Diff #202283): We probably should treat "fsub -0.0, %x" the same way, yes. It probably doesn't make sense to try to predict the costs of future unary operators that don't currently exist.

In D62699#1524170, @efriedma wrote:

FMul seems like a good first attempt. Maybe a cost of XOR by constant is better, but they're probably pretty similar anyway.

We're talking about the difference in cost between a runtime call and a xor by a constant; those are not really similar.

Those are different. Why is the cost calculation so pessimistic? For soft float targets?

Yes, soft-float targets, since TTI.getFPOpCost(I.getType()) == TargetTransformInfo::TCC_Expensive is true.
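
The condition under discussion can be modeled in isolation. The sketch below is a simplified stand-in, not the LLVM implementation: `FPOpCost`, `FPInst`, and `softFloatPenalty` are hypothetical names, and the penalty value of 25 is an assumption chosen to match the 100 to 125 cost change reported earlier in this thread.

```cpp
// Hypothetical stand-ins for the LLVM types involved; not the real API.
enum class FPOpCost { Basic, Expensive };

struct FPInst {
  bool IsFloatingPoint; // result type is a floating-point type
  bool IsFNegIdiom;     // instruction is a negation such as "fsub -0.0, %x"
};

// Assumed value, consistent with the 100 -> 125 change reported above.
constexpr int CallPenalty = 25;

// Extra cost charged when a floating-point operation is expected to lower
// to a runtime library call. Negation is exempt because it can be done
// with integer bit manipulation.
int softFloatPenalty(const FPInst &I, FPOpCost TargetCost) {
  if (I.IsFloatingPoint && TargetCost == FPOpCost::Expensive && !I.IsFNegIdiom)
    return CallPenalty;
  return 0;
}
```

In this model only a non-negation floating-point operation on a target that reports expensive FP operations pays the penalty.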

craig.topper mentioned this in D62747: [InlineCost] Don't add the soft float functional call cost for the fneg idiom, fsub -0.0, %x. May 31 2019, 1:29 PM

Diffusion mentioned this in rL362304: [InlineCost] Don't add the soft float function call cost for the fneg idiom…. Jun 1 2019, 12:38 PM

craig.topper mentioned this in rG7cebf0af4076: [InlineCost] Don't add the soft float function call cost for the fneg idiom…. Jun 1 2019, 12:38 PM

Restrict to just FNeg since we have no other UnaryOperators yet.

Harbormaster completed remote builds in B32777: Diff 202570. Jun 1 2019, 3:20 PM

craig.topper requested review of this revision. Jun 1 2019, 3:21 PM

craig.topper retitled this revision from "[InlineCost] Add support for UnaryOperator" to "[InlineCost] Add support for unary fneg.".

craig.topper edited the summary of this revision.

This is a good compromise. I'm okay with this if Eli is...

It should be possible to write a testcase based on the simplification? Otherwise looks fine.

The only optimizations that are in simplifyFNegInst currently are constant folding and (fneg (fneg X))->X. The constant folding case would already have been supported by using visitUnaryInstruction. For the (fneg (fneg X)) case, will InlineCost be able to do something if one fneg is in the caller and the other is in the callee?
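
As a rough illustration of the two simplifications named above, here is a toy model of constant folding and of the (fneg (fneg X)) -> X rewrite. It does not use the LLVM API: `Value`, `Simplified`, and `simplifyFNegSketch` are hypothetical names for this sketch only, and constants are modeled as plain doubles.

```cpp
// Toy expression node: a constant, an opaque argument, or fneg(Operand).
struct Value {
  enum Kind { Constant, Argument, FNeg } K;
  double Number;        // meaningful only when K == Constant
  const Value *Operand; // meaningful only when K == FNeg
};

// Outcome of trying to simplify "fneg Op".
struct Simplified {
  enum Kind { None, FoldedConstant, ExistingValue } K;
  double Number;         // set when K == FoldedConstant
  const Value *Existing; // set when K == ExistingValue
};

Simplified simplifyFNegSketch(const Value &Op) {
  // Constant folding: fneg C -> -C.
  if (Op.K == Value::Constant)
    return {Simplified::FoldedConstant, -Op.Number, nullptr};
  // Double negation: fneg (fneg X) -> X.
  if (Op.K == Value::FNeg)
    return {Simplified::ExistingValue, 0.0, Op.Operand};
  // Nothing known about the operand, so no simplification applies.
  return {Simplified::None, 0.0, nullptr};
}
```

Only the first case produces a constant, which is the case a caller tracking known-constant values can record. The second case returns an existing non-constant value.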

Is it likely that we're ever going to be able to do anything useful with the fast-math flags?

Even if the new code behaves the same way as visitUnaryInstruction, I'd like to have some test coverage, to show this doesn't break anything.

In D62699#1528428, @efriedma wrote:

Is it likely that we're ever going to be able to do anything useful with the fast-math flags?

I don't think that decision has been made. It's just a bit-twiddle, so I'm not sure if FMFs would give us anything.

That said, if we knew nnan and nsz, we could convert a unary FNeg into a binary FNeg. I don't think that would be a win, but haven't studied it...

Add a test case that requires constant folding an fneg.

LGTM

This revision is now accepted and ready to land. Jun 6 2019, 10:40 AM

Closed by commit rL362732: [InlineCost] Add support for unary fneg. (authored by ctopper). Jun 6 2019, 11:59 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                     Size
llvm/trunk/lib/Analysis/InlineCost.cpp                   23 lines
llvm/trunk/test/Transforms/Inline/inline_constprop.ll    31 lines

Diff 203413

llvm/trunk/lib/Analysis/InlineCost.cpp

... 267 lines not shown (context: class CallAnalyzer : public InstVisitor<CallAnalyzer, bool> {) ...
   bool visitBitCast(BitCastInst &I);
   bool visitPtrToInt(PtrToIntInst &I);
   bool visitIntToPtr(IntToPtrInst &I);
   bool visitCastInst(CastInst &I);
   bool visitUnaryInstruction(UnaryInstruction &I);
   bool visitCmpInst(CmpInst &I);
   bool visitSub(BinaryOperator &I);
   bool visitBinaryOperator(BinaryOperator &I);
+  bool visitFNeg(UnaryOperator &I);
   bool visitLoad(LoadInst &I);
   bool visitStore(StoreInst &I);
   bool visitExtractValue(ExtractValueInst &I);
   bool visitInsertValue(InsertValueInst &I);
   bool visitCallBase(CallBase &Call);
   bool visitReturnInst(ReturnInst &RI);
   bool visitBranchInst(BranchInst &BI);
   bool visitSelectInst(SelectInst &SI);
... 817 lines not shown (context: bool CallAnalyzer::visitBinaryOperator(BinaryOperator &I) {) ...
   if (I.getType()->isFloatingPointTy() &&
       TTI.getFPOpCost(I.getType()) == TargetTransformInfo::TCC_Expensive &&
       !match(&I, m_FNeg(m_Value())))
     addCost(InlineConstants::CallPenalty);

   return false;
 }

+bool CallAnalyzer::visitFNeg(UnaryOperator &I) {
+  Value *Op = I.getOperand(0);
+  Constant *COp = dyn_cast<Constant>(Op);
+  if (!COp)
+    COp = SimplifiedValues.lookup(Op);
+
+  Value *SimpleV = SimplifyFNegInst(COp ? COp : Op,
+                                    cast<FPMathOperator>(I).getFastMathFlags(),
+                                    DL);
+
+  if (Constant *C = dyn_cast_or_null<Constant>(SimpleV))
+    SimplifiedValues[&I] = C;
+
+  if (SimpleV)
+    return true;
+
+  // Disable any SROA on arguments to arbitrary, unsimplified fneg.
+  disableSROA(Op);
+
+  return false;
+}
+
 bool CallAnalyzer::visitLoad(LoadInst &I) {
   Value *SROAArg;
   DenseMap<Value *, int>::iterator CostIt;
   if (lookupSROAArgAndCost(I.getPointerOperand(), SROAArg, CostIt)) {
     if (I.isSimple()) {
       accumulateSROACost(CostIt, InlineConstants::InstrCost);
       return true;
     }
... 1,104 lines not shown ...

llvm/trunk/test/Transforms/Inline/inline_constprop.ll

... 339 lines not shown ...
 bb1:
   %call = call i16 @caller7.external(i16 1)
   call void @callee7(i16 0, i16 %call)
   ret void
 }
 ; CHECK-LABEL: define void @caller7(
 ; CHECK: %call = call i16 @caller7.external(i16 1)
 ; CHECK-NEXT: ret void
+
+define float @caller8(float %y) {
+; Check that we can constant-prop through fneg instructions
+;
+; CHECK-LABEL: @caller8(
+; CHECK-NOT: call
+; CHECK: ret
+  %x = call float @callee8(float -42.0, float %y)
+  ret float %x
+}
+
+define float @callee8(float %x, float %y) {
+  %neg = fneg float %x
+  %icmp = fcmp ugt float %neg, 42.0
+  br i1 %icmp, label %bb.true, label %bb.false
+
+bb.true:
+  ; This block mustn't be counted in the inline cost.
+  %y1 = fadd float %y, 1.0
+  %y2 = fadd float %y1, 1.0
+  %y3 = fadd float %y2, 1.0
+  %y4 = fadd float %y3, 1.0
+  %y5 = fadd float %y4, 1.0
+  %y6 = fadd float %y5, 1.0
+  %y7 = fadd float %y6, 1.0
+  %y8 = fadd float %y7, 1.0
+  ret float %y8
+
+bb.false:
+  ret float %x
+}
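
To see why the test expects bb.true to be excluded from the inline cost, it may help to trace the branch condition for the constant argument that @caller8 passes. The sketch below mirrors the first three instructions of @callee8 in plain C++; `fcmpUGT`, `TakenBlock`, and `calleeBranch` are hypothetical names, and the model assumes `fcmp ugt` means "unordered or greater than".

```cpp
#include <cmath>

// Model of "fcmp ugt": true if either operand is NaN or if A > B.
bool fcmpUGT(float A, float B) {
  return std::isnan(A) || std::isnan(B) || A > B;
}

enum class TakenBlock { BBTrue, BBFalse };

// Mirrors the entry block of @callee8: negate %x, compare against 42.0,
// then branch on the result.
TakenBlock calleeBranch(float X) {
  float Neg = -X;
  return fcmpUGT(Neg, 42.0f) ? TakenBlock::BBTrue : TakenBlock::BBFalse;
}
```

With %x fixed at -42.0, the negation folds to 42.0 and the comparison is false, so only bb.false is reachable. That is the fold the analysis must perform through the fneg before it can disregard the chain of fadd instructions.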