Page MenuHomePhabricator

[ARM][CostModel] Select instruction costs.
Needs ReviewPublic

Authored by samparker on Jun 18 2020, 6:29 AM.

Details

Summary

Modify the ARM getCmpSelInstrCost implementation to return something a little more sensible for select instructions when we're querying a cost kind other than throughput, or we have a scalar type. We now consider the legalization cost and also make selects +1 more expensive.

Diff Detail

Event Timeline

samparker created this revision.Jun 18 2020, 6:29 AM

I think this should be reasonable. Can you explain where the cost of 2 is coming form? I would expect IT to be roughly free in a lot of places. ARM has conditional statements for pretty much everything. And armv8.1-m has csel :) (But, we don't actually use that at the moment..)

Do you have benchmark results? Not that you have to share them here, but do they look, um, reliably better?

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
513

Simplift ?

Can you explain where the cost of 2 is coming form?

Well the IT isn't guaranteed to be free, it still takes up code size and, of course, it doesn't even exist for T1. But in my testing, a few months ago, it wasn't worth differentiating between M-profile versions - but I will double check. The cmsis numbers for the M33 look good, it doesn't effect many kernels but they're all decent improvements.

Fixed typo and also predicated the code for an M-class subtarget.

Now only +1 for the IT block when we're querying for code size. But we now +1 for i1 values too to account for (some) cost of rematerialising those values.

samparker updated this revision to Diff 278751.Fri, Jul 17, 6:21 AM

Rebased and cleaned up the logic a bit.

I think you could argue the cost of selects in many ways on ARM. A lot of them will be free (folded into another instruction), a lot will cost 1, some will cost 2 because of the IT or even higher on a T1 core. I think in the end, whatever looks best on benchmarks is probably the best way to go.

llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
588

I'm not a huge fan of isMClass. That covers quite a range, and the top end (and even mid-end) of that range will chew through IT bocks/selects like they are not even present.

Maybe use some variant of isThumb would be better? I wouldn't expect the code below to change much between a mclass and others, unless I'm missing something?

samparker updated this revision to Diff 282559.Mon, Aug 3, 3:04 AM

Yeah, makes sense, it's now using isThumb and I've cleaned up the predicate a bit too. The thumbv7 select costs tests have been updated accordingly too.