This is an archive of the discontinued LLVM Phabricator instance.

TTI: Add getCallInstrCost.
ClosedPublic

Authored by mzolotukhin on Mar 5 2015, 7:52 PM.

Details

Summary

TTI: Add getCallInstrCost.

This is the third of 6 patches for enabling vectorization of calls.

Diff Detail

Event Timeline

mzolotukhin updated this revision to Diff 21325.Mar 5 2015, 7:52 PM
mzolotukhin retitled this revision from to TTI: Add getCallInstrCost..
mzolotukhin updated this object.
mzolotukhin edited the test plan for this revision. (Show Details)
mzolotukhin added a subscriber: Unknown Object (MLST).
hfinkel added inline comments.Mar 5 2015, 8:54 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

That's not quite right. You still need to call getCallInstrCost with the associated vector types.

mzolotukhin added inline comments.Mar 6 2015, 3:33 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

Hi Hal,

I don't think I get it. When the types are vector, we first call getCallInstrCost for corresponding scalar types and then return ScalarCalls*ScalarCost+ScalarizationCost for not vectorizable functions and ScalarCost for vectorizable ones. If the types are scalar, we just return 10. What's missing here?

Could you please elaborate your point a little?

hfinkel added inline comments.Mar 8 2015, 5:19 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

The key problem here is that this logic does not belong inside the cost-model interface. What the cost model is supposed to provide is an interface that can be used for the vectorizer (and other clients) to ask the target, "Given this instruction on values of these types, please provide an estimate of the relative reciprocal throughput." The logic of how to transition from scalar instructions to vector instructions must live in the vectorizer, not inside the cost model. You're assuming too much about the actions that will be taken by the user of the interface by using the function-vectorization logic inside the most model to provide an estimate based on, not only the provided instruction and types, but also how the vectorizer will likely replace the queried function call with some other one.

We might not want to create external function declarations for each function we might wish to query as part of the interface, so we'll either need to allow for F to be nullptr, or change the interface.

What we're querying here is the cost of the function call, as provided, any logic related to vectorization needs to live inside the vectorizer, only the vectorizer knows how it will vectorize, and how to formulate any scalarization cost. The only scalarization costs that the TTI interface should know about, are those that will be incurred by the type legalization process in CodeGen -- those introduced at the IR level by TTI's users, need to be estimated by TTI's users based on how that scalarization is actually done.

mzolotukhin added inline comments.Mar 9 2015, 8:06 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

Ok, I see what you mean now, thank you for the explanation!

Then my plan is to leave in TTI only trivial getCallCost, which will always return 10 (in default implementation). The current logic, which takes into account scalarization costs, will be moved to the vectorizer almost unchanged. With such changes back-ends could control the vectorizer by setting costs of Call, InsertElement and ExtractElement instructions. Does it sound good?

mzolotukhin added inline comments.Mar 9 2015, 8:57 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

Actually, we use getScalarizationCost in other places in BasicTTIImple too: e.g. in getArithmeticInstrCost, getCastInstrCost, and others - do you think we need to change these places as well?

hfinkel added inline comments.Mar 9 2015, 10:33 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

Yes, that sounds like the right way to do it; thanks!

The other places in BasicTTIImpl use getScalarizationCost to model what happens during type/operation legalization in CodeGen, so that's different (and should stay).

mzolotukhin added inline comments.Mar 10 2015, 12:27 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

Hi Hal,

I looked at what we need to change to make it this way:

  1. We need to implement getScalarizationCost outside TTI, so that it can be used in LoopVectorizer, SLPVectorizer, IndVarSimplify, CodeGenPrepare and other passes that currently might use getArithInstrCost/getCastInstrCost/etc. with a vector type.
  2. We need to change all these uses with
if (Ty->isVectorTy())
  cost = getScalarizationCost(Ty) + getInstrCost(Ty)
else
  cost = getInstrCost(Ty)
  1. We need to make sure the we don't actually change costs on any platform.

All of these seem to me like a pretty massive, and probably not so elegant change - we need to change a lot of places, and we'll introduce a lot of very similar code. Also, it might be true that the vectorizer might want to determine a specific way to scalarize a value, but other users of getArithInstrCost/etc. might don't care about it that much. E.g. I don't think it's worth adding this logic to CodeGenPrepare - it'll be the same as we have now in TTI anyway (and we'll need to keep it both in and outside TTI).

Did I understand what you suggest correctly at all?

hfinkel added inline comments.Mar 10 2015, 1:24 PM
include/llvm/CodeGen/BasicTTIImpl.h
681

Did I understand what you suggest correctly at all?

No, not really ;)

We need to implement getScalarizationCost outside TTI

No, we should leave it as it.

I think we have a terminology problem here. There are two types of 'scalarization': One type is that done by CodeGen's type / operator legalization framework. This is what is modeled by getScalarizationCost, and that should stay as is. getArithInstrCost/getCastInstrCost/etc. use that logic to estimate the cost of the scalarization that will be performed by legalization, and that's fine.

As we've discussed, things are a bit different for calls. Calls don't really get 'scalarized' by CodeGen in the same sense. A call is a call, and the only meaningful 'scalarization' that happens might be in breaking up its parameters for argument passing and reconstituting its return value if those types are not legal (and, even if they are, depending on the calling convention). This is different for intrinsics, but for calls to actual functions, there is no sense in which type legalization turns one call into many calls. The vectorizer, with knowledge of certain calls, might do this (because it understands their semantics), but it does this "manually" (by inserting extractelement/insertelement instructions and then different calls). This process should be modeled by the vectorizer by getting the cost of those instructions it will actually insert. This is very different from what is done for arithmetic instructions (where we let CodeGen do the legalization).

  • Remove scalarization cost estimate (now it belongs to the vectorizer).
hfinkel added inline comments.Mar 11 2015, 12:25 AM
include/llvm/Analysis/TargetTransformInfo.h
454

Please specifically mention that F may by nullptr.

  • Expand comments for getCallInstrCost. Mention that F might be nullptr.
hfinkel added inline comments.Mar 16 2015, 10:38 PM
include/llvm/Analysis/TargetTransformInfo.h
41

Remove this (should not needed, as noted below).

455

I don't think we need TLI here given the current strategy.

  • Remove TLI argument from getCallInstrCost.
hfinkel accepted this revision.Mar 17 2015, 4:57 AM
hfinkel edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Mar 17 2015, 4:57 AM
mzolotukhin closed this revision.Mar 17 2015, 12:32 PM

Thanks, committed in r232524!