This is an archive of the discontinued LLVM Phabricator instance.

[InlineCost] Use TTI to check if GEP is free.
Closed, Public

Authored by haicheng on Jan 13 2017, 12:58 PM.

Details

Reviewers

chandlerc
davidxl
jingyue
hfinkel
junbuml
mcrosier

Commits

rGda556345dcfd: [InlineCost] Use TTI to check if GEP is free.
rL292526: [InlineCost] Use TTI to check if GEP is free.

Summary

Currently, a GEP is considered free only if all of its indices are constant. TTI::getGEPCost() can give a more accurate, target-specific analysis. TTI is already used to cost many other instructions.
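
One way to picture the change in the summary is as two predicates: the old rule looks only at whether every index is constant, while the new rule falls back to a target query when that check fails. The sketch below is a self-contained toy and not LLVM code; ToyIndex, ToyTargetHook, isFreeOldRule, isFreeNewRule, and checkToyRules are hypothetical names, and the target hook stands in for whatever TTI::getGEPCost() would decide on a real target.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Toy stand-in for one GEP index (hypothetical, not an LLVM type).
struct ToyIndex {
  bool IsConstant; // true if the index is a compile-time constant
  int64_t Value;   // meaningful only when IsConstant is true
};

// Old rule from the summary: a GEP is free only if every index is constant.
inline bool isFreeOldRule(const std::vector<ToyIndex> &Indices) {
  for (const ToyIndex &Idx : Indices)
    if (!Idx.IsConstant)
      return false;
  return true;
}

// Hypothetical target hook: returns true if the target can fold this index
// list into an addressing mode at no extra cost.
using ToyTargetHook = std::function<bool(const std::vector<ToyIndex> &)>;

// New rule: all-constant GEPs stay free; anything else is left to the target.
inline bool isFreeNewRule(const std::vector<ToyIndex> &Indices,
                          const ToyTargetHook &TargetFoldsGEP) {
  if (isFreeOldRule(Indices))
    return true;
  return TargetFoldsGEP(Indices);
}

// Small self-check: one variable index is never free under the old rule, and
// is free under the new rule only when the (assumed) target says so.
inline void checkToyRules() {
  const std::vector<ToyIndex> OneVariable{{false, 0}};
  assert(!isFreeOldRule(OneVariable));
  assert(isFreeNewRule(OneVariable,
                       [](const std::vector<ToyIndex> &) { return true; }));
  assert(!isFreeNewRule(OneVariable,
                        [](const std::vector<ToyIndex> &) { return false; }));
}
```

With a hook that always answers "no", the new rule degenerates to the old one, which is why the patch can only make more GEPs free, never fewer.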

Diff Detail

Repository: rL LLVM

Event Timeline

haicheng updated this revision to Diff 84352. Jan 13 2017, 12:58 PM

haicheng retitled this revision to [InlineCost] Use TTI to check if GEP is free.

haicheng updated this object.

haicheng added reviewers: chandlerc, hfinkel, jingyue, junbuml, davidxl.

haicheng set the repository for this revision to rL LLVM.

haicheng added a subscriber: llvm-commits.

Herald added subscribers: eraman, mcrosier. Jan 13 2017, 12:58 PM

LGTM, but this should be further reviewed by someone more familiar with the inliner cost model.

test/Transforms/Inline/gep-cost.ll
Line 1 (mcrosier): I apologize for my lack of understanding. Assuming this isn't obvious to someone more familiar with the inliner cost model, would you mind adding a few comments to this test case explaining exactly what you're testing? I diffed the output from this test with and without your change and the only cost that was changed was for inner1. Thus, I'm trying to understand the significance of inner2.

haicheng added inline comments. Jan 18 2017, 12:53 PM

test/Transforms/Inline/gep-cost.ll
Line 1: Thank you, Chad. I will add appropriate comments. The GEP in inner2() is reg+imm+reg, which is not a legal addressing mode for AArch64, so I expect that only the ret can be simplified. The current implementation treats a GEP as foldable only if all of its indices are constant, so the GEP in inner2() is considered not foldable even without my patch. I added this test for completeness. The GEP in inner1(), however, is reg+reg, which is legal for AArch64. It can be detected as free only with my patch.
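
The reg+reg and reg+imm+reg labels used in this comment can be derived mechanically from a GEP's index list. The following is a minimal sketch under stated assumptions, not LLVM code: ToyGEPIndex, ToyAddrMode, summarize, describe, and checkReviewExamples are hypothetical names, and byte offsets are supplied by hand (64 bytes is the size of [4 x [4 x i32]] with 4-byte i32). It only names the shape of the address computation; whether a shape is legal remains a per-target decision.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one GEP index after scaling to bytes.
struct ToyGEPIndex {
  bool IsConstant;    // true if the index value is known at compile time
  int64_t ByteOffset; // bytes contributed when IsConstant is true
};

// Hypothetical summary of the address computation a GEP needs.
struct ToyAddrMode {
  int64_t Imm = 0;   // sum of all constant byte offsets
  int IndexRegs = 0; // number of variable indices, one register each
};

// Fold constant indices into one immediate; count each variable index.
inline ToyAddrMode summarize(const std::vector<ToyGEPIndex> &Indices) {
  ToyAddrMode Mode;
  for (const ToyGEPIndex &Idx : Indices) {
    if (Idx.IsConstant)
      Mode.Imm += Idx.ByteOffset;
    else
      ++Mode.IndexRegs;
  }
  return Mode;
}

// Render the mode in the notation used in the review thread.
inline std::string describe(const ToyAddrMode &Mode) {
  std::string Text = "reg"; // the base pointer always occupies a register
  if (Mode.Imm != 0)
    Text += "+imm";
  for (int I = 0; I < Mode.IndexRegs; ++I)
    Text += "+reg";
  return Text;
}

// The two GEPs from gep-cost.ll, written as index lists.
inline void checkReviewExamples() {
  // inner1: single variable index %i.
  assert(describe(summarize(std::vector<ToyGEPIndex>{{false, 0}})) ==
         "reg+reg");
  // inner2: constant index 1 (64 bytes), then variable index %i.
  assert(describe(summarize(std::vector<ToyGEPIndex>{{true, 64}, {false, 0}})) ==
         "reg+imm+reg");
}
```

Per the comment above, the first shape is legal on AArch64 and the second is not, which matches the expected NumInstructionsSimplified values of 2 and 1 in the test.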

While the approach we have of doing cost modeling here has some problems, this does fit in well with what we're doing for zero extends, sign extends, and related. So I'm happy with that side. I'll defer the code side to Chad.

mcrosier added inline comments. Jan 18 2017, 1:41 PM

test/Transforms/Inline/gep-cost.ll

Ah, I see. Suggestions below.

Here you could probably say something like:

// The GEP in inner1() is [insert legal addressing mode], which is a legal addressing mode for AArch64.  Thus, both the gep and ret can be simplified.

Here you could probably say something like:

// The GEP in inner2() is reg+imm+reg, which is not a legal addressing mode for AArch64.  Thus, only the ret can be simplified and not the gep.

Update the test case. Thank you, Chad and Chandler.

LGTM, assuming you've done the necessary performance testing for both AArch64 and X86.

This revision is now accepted and ready to land. Jan 19 2017, 8:00 AM

Closed by commit rL292526: [InlineCost] Use TTI to check if GEP is free. (authored by haicheng). Jan 19 2017, 2:39 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                  Size
lib/Analysis/InlineCost.cpp           20 lines
test/Transforms/Inline/gep-cost.ll    25 lines

Diff 84352

lib/Analysis/InlineCost.cpp

[128 lines hidden; context: class CallAnalyzer : public InstVisitor<CallAnalyzer, bool> {]
   bool isAllocaDerivedArg(Value *V);
   bool lookupSROAArgAndCost(Value *V, Value *&Arg,
                             DenseMap<Value *, int>::iterator &CostIt);
   void disableSROA(DenseMap<Value *, int>::iterator CostIt);
   void disableSROA(Value *V);
   void accumulateSROACost(DenseMap<Value *, int>::iterator CostIt,
                           int InstructionCost);
   bool isGEPOffsetConstant(GetElementPtrInst &GEP);
+  bool isGEPFree(GetElementPtrInst &GEP);
   bool accumulateGEPOffset(GEPOperator &GEP, APInt &Offset);
   bool simplifyCallSite(Function *F, CallSite CS);
   ConstantInt *stripAndComputeInBoundsConstantOffsets(Value *&V);

   /// Return true if the given argument to the function being considered for
   /// inlining has the given attribute set either at the call site or the
   /// function declaration. Primarily used to inspect call site specific
   /// attributes since these can be more precise than the ones on the callee
[181 lines hidden; context: for (gep_type_iterator GTI = gep_type_begin(GEP), GTE = gep_type_end(GEP);]
     }

     APInt TypeSize(IntPtrWidth, DL.getTypeAllocSize(GTI.getIndexedType()));
     Offset += OpC->getValue().sextOrTrunc(IntPtrWidth) * TypeSize;
   }
   return true;
 }

+/// \brief Use TTI to check whether a GEP is free.
+///
+/// Respects any simplified values known during the analysis of this callsite.
+bool CallAnalyzer::isGEPFree(GetElementPtrInst &GEP) {
+  SmallVector<Value *, 4> Indices;
+  for (User::op_iterator I = GEP.idx_begin(), E = GEP.idx_end(); I != E; ++I)
+    if (Constant *SimpleOp = SimplifiedValues.lookup(*I))
+      Indices.push_back(SimpleOp);
+    else
+      Indices.push_back(*I);
+  return TargetTransformInfo::TCC_Free ==
+         TTI.getGEPCost(GEP.getSourceElementType(), GEP.getPointerOperand(),
+                        Indices);
+}
+
 bool CallAnalyzer::visitAlloca(AllocaInst &I) {
   // Check whether inlining will turn a dynamic alloca into a static
   // alloca and handle that case.
   if (I.isArrayAllocation()) {
     Constant *Size = SimplifiedValues.lookup(I.getArraySize());
     if (auto *AllocSize = dyn_cast_or_null<ConstantInt>(Size)) {
       const DataLayout &DL = F.getParent()->getDataLayout();
       Type *Ty = I.getAllocatedType();
[49 lines hidden; context: if (I.isInBounds()) {]
     std::pair<Value *, APInt> BaseAndOffset = ConstantOffsetPtrs.lookup(Ptr);
     if (BaseAndOffset.first) {
       // Check if the offset of this GEP is constant, and if so accumulate it
       // into Offset.
       if (!accumulateGEPOffset(cast<GEPOperator>(I), BaseAndOffset.second)) {
         // Non-constant GEPs aren't folded, and disable SROA.
         if (SROACandidate)
           disableSROA(CostIt);
-        return false;
+        return isGEPFree(I);
       }

       // Add the result as a new mapping to Base + Offset.
       ConstantOffsetPtrs[&I] = BaseAndOffset;

       // Also handle SROA candidates here, we already know that the GEP is
       // all-constant indexed.
       if (SROACandidate)
[9 lines hidden; context: if (isGEPOffsetConstant(I)) {]

     // Constant GEPs are modeled as free.
     return true;
   }

   // Variable GEPs will require math and will disable SROA.
   if (SROACandidate)
     disableSROA(CostIt);
-  return false;
+  return isGEPFree(I);
 }

 bool CallAnalyzer::visitBitCast(BitCastInst &I) {
   // Propagate constants through bitcasts.
   Constant *COp = dyn_cast<Constant>(I.getOperand(0));
   if (!COp)
     COp = SimplifiedValues.lookup(I.getOperand(0));
   if (COp)
[1,176 lines hidden]

test/Transforms/Inline/gep-cost.ll

This file was added.

; RUN: opt -inline < %s -S -debug-only=inline-cost 2>&1 | FileCheck %s

target datalayout = "e-m:e-i64:64-i128:128-n32:64-S128"
target triple = "aarch64--linux-gnu"

define void @outer1([4 x i32]* %ptr1, [4 x [4 x i32]]* %ptr2, i32 %i) {
  call void @inner1([4 x i32]* %ptr1, i32 %i)
  call void @inner2([4 x [4 x i32]]* %ptr2, i32 %i)
  ret void
}
; CHECK: Analyzing call of inner1
; CHECK: NumInstructionsSimplified: 2
; CHECK: NumInstructions: 2
define void @inner1([4 x i32]* %ptr, i32 %i) {
  %G = getelementptr inbounds [4 x i32], [4 x i32]* %ptr, i32 %i
  ret void
}

; CHECK: Analyzing call of inner2
; CHECK: NumInstructionsSimplified: 1
; CHECK: NumInstructions: 2
define void @inner2([4 x [4 x i32]]* %ptr, i32 %i) {
  %G = getelementptr inbounds [4 x [4 x i32]], [4 x [4 x i32]]* %ptr, i32 1, i32 %i
  ret void
}
