This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Implement getIntrinsicInstrCost, handle min/max intrinsics.
ClosedPublic

Authored by fhahn on Oct 22 2020, 5:44 AM.

Details

Summary

This patch adds a specialized implementation of getIntrinsicInstrCost
and add initial cost-modeling for min/max vector intrinsics.

AArch64 NEON support umin/smin/umax/smax for vectors
<8 x i8>, <16 x i8>, <4 x i16>, <8 x i16>, <2 x i32> and <4 x i32>.
Notably, it does not support vectors with i64 elements.

This change by itself should have very little impact on codegen, but in
follow-up patches I plan to teach the vectorizers to consider using
those intrinsics on platforms where it is profitable, e.g. because there
is no general 'select'-like instruction.

The current cost returned should be better for throughput, latency and size.

Diff Detail

Event Timeline

fhahn created this revision.Oct 22 2020, 5:44 AM
fhahn requested review of this revision.Oct 22 2020, 5:44 AM
fhahn edited reviewers, added: spatel; removed: jpaquette.Oct 22 2020, 5:45 AM

Yeah, this sounds useful. I know we've hit problems with min/max costs in the past (the cmp/select kind, I don't know if they were float). It would be good to see them costed more accurately.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
222

v4i16?

224

use auto instead? It's common to use CostTableLookup too, but I would guess that makes this more verbose?

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
100

What does this using do?

fhahn updated this revision to Diff 299944.Oct 22 2020, 6:13 AM

Add missing v4i16, use auto, remove unncessary using.

fhahn marked an inline comment as done.Oct 22 2020, 6:15 AM
fhahn added inline comments.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
222

Yeah I missed that one initially. Should be fixed now.

224

I think CostTableLookup is overkill, because at the moment this just uses the same cost for each supported type. Updated to use auto.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
100

IIRC there used to be some compilers that had trouble with calling BaseT::getIntrinsicInstrCost without this, but it builds fine on my system without it. Let's see if any bot complains.

I presume float versions would be useful too? Do you plan to add them?

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
222

Look like it would be worth making sure there are tests too.

224

Yeah I agree. Identical costs between sizes and opcodes don't make it very useful here.

fhahn updated this revision to Diff 299981.Oct 22 2020, 8:07 AM
fhahn marked an inline comment as done.

Add/update tests for v4i16, v2i32.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
222

Agreed, added tests for the remaining missing cases (v2i32, v4i16).

dmgreen accepted this revision.Oct 22 2020, 8:18 AM

Thanks. LGTM

This revision is now accepted and ready to land.Oct 22 2020, 8:18 AM
This revision was landed with ongoing or failed builds.Oct 23 2020, 3:44 AM
This revision was automatically updated to reflect the committed changes.