This is an archive of the discontinued LLVM Phabricator instance.

[TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost
ClosedPublic

Authored by jingyue on May 17 2015, 10:14 PM.

Details

Summary

This patch updates TargetTransformInfoImplCRTPBase::getGEPCost to consider
addressing modes. It now returns TCC_Free when the GEP can be completely folded
to an addresing mode.

I started this patch as I refactored SLSR. Function isGEPFoldable looks common
and is indeed used by some WIP of mine. So I extracted that logic to getGEPCost.

Furthermore, I noticed getGEPCost wasn't directly tested anywhere. The best
testing bed seems CostModel, but its getInstructionCost method invokes
getAddressComputationCost for GEPs which provides very coarse estimation. So
this patch also makes getInstructionCost call the updated getGEPCost for GEPs.
This change inevitably breaks some tests because the cost model changes, but
nothing looks seriously wrong -- if we believe the new cost model is the right
way to go, these tests should be updated.

This patch is not perfect yet -- the comments in some tests need to be updated.
I want to know whether this is a right approach before fixing those details.

Diff Detail

Event Timeline

jingyue updated this revision to Diff 25943.May 17 2015, 10:14 PM
jingyue retitled this revision from to [TTI/CostModel] improve TTI::getGEPCost and use it in CostModel::getInstructionCost.
jingyue updated this object.
jingyue edited the test plan for this revision. (Show Details)
jingyue added reviewers: chandlerc, hfinkel.
jingyue added a subscriber: Unknown Object (MLST).
hfinkel added inline comments.May 18 2015, 12:42 AM
include/llvm/Analysis/TargetTransformInfo.h
139–140

I see no reason to change the interface here. We can localize the const_casts to the implementation, instead of forcing them into all callers that have const references/pointers.

include/llvm/Analysis/TargetTransformInfoImpl.h
394

You should call stripPointerCasts before doing the dyn_cast.

394

Should be dyn_cast_or_null b/c the cost model should not require the base pointer be known when estimating costs.

lib/Analysis/CostModel.cpp
385–386

You don't need the { } here any longer.

jingyue added inline comments.May 19 2015, 4:02 PM
include/llvm/Analysis/TargetTransformInfoImpl.h
394

Interesting -- when Ptr == nullptr, what is a good estimation? Fall back to using getAddressComputationCost?

hfinkel added inline comments.May 20 2015, 9:25 AM
include/llvm/Analysis/TargetTransformInfoImpl.h
394

No, but there is another issue here. We're currently moving to give pointer an opaque type, and so you'll need to separately require a Type* on this interface. Please add that. Once you have the Type *, you should:

  1. assert that the type is equal to the pointer element type of the provided pointer (if it is provided) -- we'll remove this once we actually get rid of pointer element types, but for now we need the asserts
  2. use that type in the code below (there is a getGEPReturnType overload that takes the element type instead of the pointer itself)
  3. Make some default assumption for HasBaseReg (I think you should assume true).

Makes sense. Thank you!

jingyue updated this revision to Diff 27278.Jun 7 2015, 10:31 PM

Some changes

  1. strip pointer casts before checking whether Ptr is a GlobalValue
  1. support Ptr == nullptr
  1. getGEPCost still takes ArrayRef<const Value *> as operands
  1. in the direction of supporting opaque pointer types, getGEPCost takes the source pointer element type as a parameter.
  1. fixed partial.ll. With the new getGEPCost, the loop size in @test1 is 13 and thus the loop can be partially unrolled. Added extra instructions so that the loop size remains 16.
jingyue added inline comments.Jun 7 2015, 10:34 PM
test/Transforms/LoopVectorize/X86/metadata-enable.ll
63

After the change to getGEPCost, the loop size is reduced. Bump the trip count so that the loops still cannot be fully unrolled.

hfinkel added inline comments.Jun 22 2015, 4:59 PM
include/llvm/Analysis/TargetTransformInfoImpl.h
397

pointer can have -> pointers have

410

I don't understand why you're doing this. Will gep_type_begin not compile with an ArrayRef<const Value *>? If so, we should just fix that.

413

That's not right. isLegalAddressingMode takes the address space as an optional last parameter. Please propagate it.

jingyue updated this revision to Diff 29527.Jul 12 2015, 8:13 PM

Two changes

  1. pass Ptr's address space to isLegalAddressingMode
  2. allow gep_type_begin to take ArrayRef<const Value *>
jingyue marked 3 inline comments as done.Jul 12 2015, 8:17 PM

Some new changes in this patch are due to "git rebase".

hfinkel accepted this revision.Jul 26 2015, 6:45 AM
hfinkel edited edge metadata.

LGTM.

include/llvm/Analysis/TargetTransformInfoImpl.h
397

a pointer can have -> pointers have

(when we make the change, there will be no 'can' about it, they'll have to have an opaque type).

This revision is now accepted and ready to land.Jul 26 2015, 6:45 AM
jingyue updated this revision to Diff 30659.Jul 26 2015, 10:14 AM
jingyue edited edge metadata.

updated comments

LGTM, thanks!

jingyue closed this revision.Jul 26 2015, 10:28 AM

Hi Jingyue and Hal,

I believe this patch has a mistake in calling isLegalAddressingMode(). Please check the inlined the comments.

Thank you,

Haicheng Wu

include/llvm/Analysis/TargetTransformInfoImpl.h
432

I think we should use *GTI here rather than PointerType::get(*GTI, AS).

For example, on AArch64, the GEP in the following IRs can be folded

%"class.boost::array.16" = type { [24 x i32] }
%arrayidx.i17 = getelementptr inbounds %"class.boost::array.16", %"class.boost::array.16"* %moves, i64 0, i32 0, i64 %conv7                                                                               
store i32 %add, i32* %arrayidx.i17, align 4, !tbaa !18

The assembly is simple as

str     w24, [x20, x8, lsl #2]

However, isLegalAddressingMode() here returns false because we pass a wrong type.

I checked the implementation of isLegalAddressingMode() in several different backends (including ARM) and several other passes (LoopStrengthReduce and CodeGenPrepare) that call isLegalAddressingMode(). It think I am correct.

jingyue added inline comments.Nov 30 2016, 3:07 PM
include/llvm/Analysis/TargetTransformInfoImpl.h
432

Thanks! I think you are right. Can you create a patch and add a unit test for this?

Thank you. I am working on this.