This is an archive of the discontinued LLVM Phabricator instance.

[AArch64TTI] Compute imm materialization cost for AArch64 intrinsics
ClosedPublic

Authored by fhahn on Nov 25 2019, 5:41 AM.

Details

Summary

Currently, getIntImmCost returns TCC_Free for almost all intrinsics.
For most AArch64 specific intrinsics however, it looks like integer
constants cannot be folded into most of them (at least the ones I checked).

Unless we know that we can fold integer operands with the intrinsic, we
handle more cases correctly by returning the cost to materialize the
immediate than return TCC_Free.

Diff Detail

Event Timeline

fhahn created this revision.Nov 25 2019, 5:41 AM
SjoerdMeijer added inline comments.Nov 25 2019, 11:47 AM
llvm/test/Transforms/ConstantHoisting/AArch64/const-hoist-intrinsics.ll
2

I thought tests for instruction costs are checked with these kind of opt tests:

opt < %s  -cost-model -analyze -cost-kind=...

is that not a better/more direct way of testing this?

fhahn updated this revision to Diff 230955.Nov 25 2019, 12:02 PM

Update the test to return the materialization cost for AArch64 intrinsics, not all intrinsics not explicitly matched, as intrinsics like lifetime markers or debug instructions should be free.

I am not sure what the best way is to match all AArch64 intrinsics.

fhahn marked an inline comment as done.Nov 25 2019, 12:04 PM
fhahn added inline comments.
llvm/test/Transforms/ConstantHoisting/AArch64/const-hoist-intrinsics.ll
2

I think the various cost-kinds just focus just on the instruction, not the arguments. So there seems to be no way to get it to print the cost of materializing the immediate.

Build result: pass - 60287 tests passed, 0 failed and 732 were skipped.

Log files: console-log.txt, CMakeCache.txt

I wanted to look into this to make a less hand wavy suggestion, but I got lost a little bit lost, which always happens when I look in that jungle that is TTI and TTIImpl etc., so will just ask a question instead.

But I was wondering why can't we look at the instruction, and also look at the (immediate) arguments, and depending on how we expect them to get materialized return a cost that is something like the cost_instruction + cost_immediate?

And then test for something like:

Cost Model: Found an estimated cost of 3 for instruction:   %bar.3 = call i32 @llvm.aarch64.stxr.p0i64(i64 -9223372036317904704, i64* %ptr.3)

I wanted to look into this to make a less hand wavy suggestion, but I got lost a little bit lost, which always happens when I look in that jungle that is TTI and TTIImpl etc., so will just ask a question instead.

But I was wondering why can't we look at the instruction, and also look at the (immediate) arguments, and depending on how we expect them to get materialized return a cost that is something like the cost_instruction + cost_immediate?

I think currently we just look at the instruction and ignore the operands for the cost. IIRC some passes include the cost of the arguments in the cost they use, but I think in some cases that's quite pass specific. I think the problem with including the cost of the arguments is that in order to compute the cost of an argument that's an instruction, do we also compute the cost of that instruction? How far do we recurse? For constants that's no need to recurse, but I am not sure if there is a big benefit by including their cost?

Anyways, I agree that there's a lot of potential to improve TTI, but I am not sure it should be done as part of this patch :)

ributzka added inline comments.Nov 26 2019, 3:50 PM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
191

Please convert this to an early return instead.

fhahn updated this revision to Diff 231197.Nov 27 2019, 1:11 AM

Use early exit for AArch64 intrinsics, thanks!

fhahn marked an inline comment as done.Nov 27 2019, 1:11 AM
fhahn retitled this revision from [AArch64TTI] Do not return TCC_Free for all unknown intrinsics. to [AArch64TTI] Compute imm materialization cost for AArch64 intrinsics.
SjoerdMeijer accepted this revision.Nov 27 2019, 1:21 AM

Thanks, and looks reasonable to me.

This revision is now accepted and ready to land.Nov 27 2019, 1:21 AM

Build result: pass - 60333 tests passed, 0 failed and 732 were skipped.

Log files: console-log.txt, CMakeCache.txt

This revision was automatically updated to reflect the committed changes.