This is an archive of the discontinued LLVM Phabricator instance.

Update the vectorizer cost model for getVectorInstrCost to reflect actual costs of the operations on Power8/Power9
AbandonedPublic

Authored by nemanjai on Aug 30 2016, 2:36 AM.

Details

Summary

Integer vector extractions and insertions are accomplished with direct moves which take about 5 cycles.
Single precision floating point values just need to be aligned and converted (and inserted with a vperm in the insert case).
Double precision floating point values just need to be aligned (and inserted using an xxpermdi in the insert case).

For Power9, 32-bit values can be inserted into a vector without a vperm so do not require loading a permute mask.

This patch reflects these aspects of the operations in getVectorInstrCost.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai updated this revision to Diff 69654.Aug 30 2016, 2:36 AM
nemanjai retitled this revision from to Update the vectorizer cost model for getVectorInstrCost to reflect actual costs of the operations on Power8/Power9.
nemanjai updated this object.
nemanjai added reviewers: hfinkel, kbarton, amehsan, wschmidt.
nemanjai set the repository for this revision to rL LLVM.
nemanjai added a subscriber: llvm-commits.
nemanjai updated this revision to Diff 69681.Aug 30 2016, 7:53 AM

Updated the very confusing condition for whether the type is an integer of the right width at the right index.

Please note:
This is not necessarily meant to be in a state where it can be approved/committed. This patch implements the costs of these instructions based on latency of the instructions themselves. As such, this patch is meant to start a discussion on exactly how we want the cost model to work and what we want the returned costs to represent.

As things stand now, the relative cost of a direct move and an actual Load-hit-store hazard are not nearly representative of their actual relative costs. But I've tried to ensure that we fall through to the "LHS" code in cases where we will actually have an LHS hazard.

kbarton edited edge metadata.Sep 22 2016, 11:51 AM

Added a few comments to start the discussion of how this should be designed.

lib/Target/PowerPC/PPCTargetTransformInfo.cpp
315

This seems to be the point where all of the interesting changes are, so I'll add my comments here.

I would recommend that some of these values (i.e, DirectMoveCost be defined somewhere else, either as an enumeration or #define. I expect that these costs will change over time (with the hardware), and it would be good to have a clear and convenient mechanism to represent that as opposed to a bunch of condition checks in this function.

Off the top of my head, I can think of a few general ways to design this:

  1. Try to create a generic cost function here, that is "built" based on values defined as constants
  2. Create a separate function for each architecture, that needs to be updated with every new hardware (i.e., P8InstrCost, p9InstrCost, ...)
  3. Create a class that can be used, with subclasses for new architectures that can override the defaults when values change.

I'm sure there are other possibilities as well.

amehsan edited edge metadata.Sep 30 2016, 7:21 AM

I start with comments about the code here and then get to the more general dicussion:

1- Some hardcoded constants represent cost of an operation. For example in line 365 we have

DirectMoveCost + 1

and 1 seems to represent the cost of a swap/permute. There are other places that value of 1 has been used for this cost. We should clearly define a constant equal to one for this cost and use that constant name, like what is done for DirectMoveCost

2- IIUC, there is an implicit assumption here, that the cost of direct move on pwr8 and pwr9 is the same. Is this known for a fact?

As a more general comment:

1- Is there a reason that this function cannot be replaced by a target description file? Apart from the fact, that we need to do extra work to develop code to support defining these cost in td files, is there a fundamental reason that these cannot be defined in td files?

nemanjai abandoned this revision.Apr 3 2018, 4:34 PM