This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Put some of the TTI costmodel behind hasNeon calls.
ClosedPublic

Authored by dmgreen on Jun 17 2019, 10:57 AM.

Details

Summary

This puts some of the calls in ARMTargetTransformInfo.cpp behind hasNeon() checks, now that we have MVE coming. I have also regenerated a large number of cost model checks, which I show the differences of here (which is really the meat of this patch).

Many of the costs will not be very precise yet, there are updates for MVE coming. They look like they are mostly moving in the correct direction. Broadcasts (VDUPs) and divs both have patches to adjust the costs specifically.

Diff Detail

Event Timeline

dmgreen created this revision.Jun 17 2019, 10:57 AM

Hi Dave,

Bit of a wild idea/guess: I was wondering if you had given some thoughts if we could create some sort of abstraction for MVE and NEON. What I mean is that we're going to see a lot of

if (hasNEON())
  // do this
if (hasMVE())
  // do that

all over the place here. With long blocks and a lot of identation, readability is going to suffer a little bit. Is there something more c++y we could do here and provide different implementations whether we have NEON or MVE? Or would that not overly complicate things here, or not worth it?

I'm not sure. Any thoughts on what that would be?

No, I am not sure either, can't propose anything.
Creating helper functions, lambdas, etc. might be good enough.

Perhaps you see if you can refactor getShuffleCost a bit, avoid the big if-block by returning early.

Did you see D63448 for getShuffleCost? I'm not early return would work there, but I could try moving the ST->hasNEON() check into each of the if (Kind == TTI::SK_Reverse). Although that's a bit of duplication it would make it work similarly to getCastInstrCost.

The other alternative it to move the MVEDupTbl table in D63448 into the same block as NEONDupTbl is here, and switching which array is used on isMVE/isNeon inside the block. I'm not a big fan of that though, as it mangles the two architectures together more than I'd like and I feel like it makes things harder to understand (although would be smaller, the goal should be conceptually simpler, not smaller).

SjoerdMeijer accepted this revision.Jun 18 2019, 3:47 AM

LGTM.
Let's see how this goes.

This revision is now accepted and ready to land.Jun 18 2019, 3:47 AM
This revision was automatically updated to reflect the committed changes.