This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Improve reduction cost model for scalars.
ClosedPublic

Authored by ABataev on Apr 11 2023, 12:02 PM.

Details

Summary

Instead of abstract cost of the scalar reduction ops, try to use the
cost of actual reduction operation instructions, where possible. Also,
remove the estimation of the vectorized GEPs pointers for reduced loads,
since it is already handled in the tree.

Diff Detail

Event Timeline

ABataev created this revision.Apr 11 2023, 12:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2023, 12:02 PM
ABataev requested review of this revision.Apr 11 2023, 12:02 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2023, 12:02 PM
ABataev updated this revision to Diff 512548.Apr 11 2023, 12:16 PM

Fix the check for number of uses of the reduced values.

RKSimon added inline comments.Apr 12 2023, 5:57 AM
llvm/test/Transforms/SLPVectorizer/X86/horizontal-smax.ll
79

CMP+CMOV is quick even on ancient x86 - the smax.i32 throughput cost of 1 is realistic.

The issue is the predicted smax.v16i32 reduction cost, which is currently 33 (based on expansion of costs in getMinMaxReductionCost), but realistically is closer to 12 cycles (based off some quick llvm-mca tests)

ABataev added inline comments.Apr 12 2023, 6:01 AM
llvm/test/Transforms/SLPVectorizer/X86/horizontal-smax.ll
79

Can you fix it?

RKSimon added inline comments.Apr 12 2023, 6:40 AM
llvm/test/Transforms/SLPVectorizer/X86/horizontal-smax.ll
79

I'll try to fix some of the obvious issues to unstick this patch, but a more complete fix will take more time.

RKSimon added inline comments.Apr 12 2023, 7:38 AM
llvm/test/Transforms/SLPVectorizer/X86/horizontal-smax.ll
79

Please can you rebase after rG63c3895327839ba5b57f5b99ec9e888abf976ac6 ?

RKSimon added inline comments.Apr 12 2023, 9:21 AM
llvm/test/Transforms/SLPVectorizer/X86/horizontal-minmax.ll
1127
llvm/test/Transforms/SLPVectorizer/X86/horizontal-smax.ll
38

This looks to be about right: https://gcc.godbolt.org/z/sq99696Y7

You can add additional SSE test levels if you want to be certain?

ABataev added inline comments.Apr 12 2023, 9:24 AM
llvm/test/Transforms/SLPVectorizer/X86/horizontal-smax.ll
38

You mean add some extra tests for smin/umin/umax/fmin/fmax?

RKSimon added inline comments.Apr 12 2023, 9:42 AM
llvm/test/Transforms/SLPVectorizer/X86/horizontal-smax.ll
38

No - extra SSE test levels - I've added them at rG162284b2e1a970a01144d1d8e7f8d4fd1e03c5bf

RKSimon accepted this revision.Apr 12 2023, 10:57 AM

LGTM - cheers

This revision is now accepted and ready to land.Apr 12 2023, 10:57 AM
This revision was landed with ongoing or failed builds.Apr 12 2023, 11:35 AM
This revision was automatically updated to reflect the committed changes.