Page MenuHomePhabricator

[SLP] Add cost model for `llvm.powi.*` intrinsics
ClosedPublic

Authored by n-omer on Jun 20 2022, 2:52 AM.

Details

Summary

This patch adds handling for the llvm.powi.* intrinsics in BasicTTIImplBase::getIntrinsicInstrCost() and improves vectorization.

Closes #53887.

Diff Detail

Event Timeline

n-omer created this revision.Jun 20 2022, 2:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2022, 2:52 AM
n-omer requested review of this revision.Jun 20 2022, 2:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2022, 2:52 AM

diff with context?

llvm/include/llvm/CodeGen/TargetLowering.h
2204

Maybe drop the virtual - its easy enough to readd if a target wants to override this.

Would this be better as int Exponent?

n-omer updated this revision to Diff 438314.Jun 20 2022, 2:58 AM

Add context to diff.

RKSimon added inline comments.Jun 20 2022, 3:00 AM
llvm/test/Transforms/SLPVectorizer/X86/powi-regression.ll
4

This file should probably be pre-committed with current codegen and the patch rebased to show the codegen change.

20

Drop the "local_unnamed_addr #0"

n-omer retitled this revision from [SLP] Add cost models for llvm.powi.* intrinsics to [SLP] Add cost model for `llvm.powi.*` intrinsics.Jun 20 2022, 3:00 AM
n-omer updated this revision to Diff 438378.Jun 20 2022, 6:49 AM
n-omer marked an inline comment as done.

Rebase patch and address review comments.

n-omer updated this revision to Diff 438379.Jun 20 2022, 6:50 AM

Add context to diff.

n-omer marked 2 inline comments as done.Jun 20 2022, 6:51 AM
n-omer updated this revision to Diff 438604.Jun 21 2022, 2:29 AM

Remove unused included file.

RKSimon accepted this revision.Jun 21 2022, 6:55 AM

LGTM - cheers

This revision is now accepted and ready to land.Jun 21 2022, 6:55 AM
This revision was landed with ongoing or failed builds.Jun 21 2022, 7:41 AM
This revision was automatically updated to reflect the committed changes.

Reverted due to test failures, investigating further.

n-omer updated this revision to Diff 438954.Jun 22 2022, 2:54 AM

Update failing cost model tests.

n-omer updated this revision to Diff 438963.Jun 22 2022, 3:51 AM

Add context to diff.

@craig.topper @dmgreen Are the aarch64/riscv cost changes OK? They are in 'unsupported' sections - so not sure if they have extra meaning?

I think it is fine so long as the backend can handle it:
https://godbolt.org/z/5bvEn5563
But needs to be an invalid cost if it cannot:
https://godbolt.org/z/6jK7Kvc4s

Can you change the test lines to this, which should still return an invalid cost:

%powi = call <vscale x 4 x float> @llvm.powi.nxv4f32.i32(<vscale x 4 x float> %vec, i32 %extraarg)

Maybe copy the immediate tests below so we have coverage for both constant / variable exponent values?

n-omer updated this revision to Diff 439051.Jun 22 2022, 9:15 AM

Modify AArch64 and RISCV tests.

Thanks, LGTM. Although I was wondering why the cost came out as 12, not 14 from the constant 42. It may be possible to get a slightly better costs that more accurately match the code from ExpandPowI.

llvm/include/llvm/CodeGen/BasicTTIImpl.h
1424

I think a more accurate count might be ActiveBits + PopCount - 2, at least from playing around with a few examples.

n-omer updated this revision to Diff 439322.Jun 23 2022, 3:55 AM

Improve the cost computation and update tests.

Thanks @dmgreen.

n-omer marked an inline comment as done.Jun 23 2022, 3:55 AM

Thanks. I guess that changes all the negative constant values too..

If you want a cost for negative numbers I think it would use the abs of the value in the ActiveBits+PopCount-2 computation, plus the div like you already have it.

llvm/include/llvm/CodeGen/BasicTTIImpl.h
1426

I think it can go via an APInt: RHSC->getValue().countPopulation()

1430

For negative numbers if looks like it should use ((-RHSC).ActiveBits + (-RHSC).PopCount - 2)) * Mul + Div, using the negative -RHSC in the calculations above.

n-omer updated this revision to Diff 439403.Jun 23 2022, 8:11 AM

Fix cost model.

My bad, thanks.

n-omer marked 2 inline comments as done.Jun 23 2022, 8:12 AM

Thanks. The ticket is still "closed", so I cant properly accept it again. But it LGTM.

This revision was landed with ongoing or failed builds.Jun 24 2022, 3:24 AM