This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Refine costs for i1 reductions
ClosedPublic

Authored by reames on Jun 10 2022, 9:40 AM.

Details

Summary

Our actual lowering for i1 reductions uses ctpop combined with possibly a vector negate and possibly a logic op afterwards. I believe ctpop to be low cost on all reasonable hardware.

The default costing implementation here was returning quite inconsistent costs. and/or were returning very high costs (because we seem to think moving into scalar registers is very expensive?) and others were returning lower but still too high (because of the assumed tree reduce strategy). While we should probably improve the generic costing strategy for i1 vectors, let's start by fixing the immediate problem.

Diff Detail

Event Timeline

reames created this revision.Jun 10 2022, 9:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2022, 9:40 AM
reames requested review of this revision.Jun 10 2022, 9:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2022, 9:40 AM
craig.topper accepted this revision.Jun 10 2022, 9:56 AM

LGTM

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
327

I really hope that max/min of i1 don't occur in normal scenarios.

This revision is now accepted and ready to land.Jun 10 2022, 9:56 AM
frasercrmck added inline comments.Jun 10 2022, 10:11 AM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
329

One thing I've noticed is that we do use increasingly more vmnand as the vectors become larger and have to be split. Should we be taking this into account? See e.g. an and reduce with -riscv-v-vector-bits-min=128:

<256 x i1>:

vmnot.m v8, v0
vcpop.m a0, v8

<512 x i1>:

vmand.mm        v8, v8, v10
vmand.mm        v9, v0, v9
vmnand.mm       v8, v9, v8
vcpop.m a0, v8

<1024 x i1>:

vmand.mm        v10, v10, v14
vmand.mm        v8, v8, v12
vmand.mm        v8, v8, v10
vmand.mm        v9, v9, v13
vmand.mm        v10, v0, v11
vmand.mm        v9, v10, v9
vmnand.mm       v8, v9, v8
vcpop.m a0, v8
craig.topper requested changes to this revision.Jun 10 2022, 10:13 AM

We should consider LT.first to address the comment from @frasercrmck.

This revision now requires changes to proceed.Jun 10 2022, 10:13 AM
reames added inline comments.Jun 10 2022, 10:15 AM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
329

Good catch, thanks!

reames updated this revision to Diff 435994.Jun 10 2022, 11:54 AM

Account for legalization

reames updated this revision to Diff 436005.Jun 10 2022, 12:10 PM

Rebase over tests with required legalization so that effect of legalization change is visible.

Rebase over tests with required legalization so that effect of legalization change is visible.

I think we lost the RISCVTargetTransformInfo.cpp change

reames updated this revision to Diff 436012.Jun 10 2022, 12:23 PM

Fix rebase mistakes

This revision is now accepted and ready to land.Jun 10 2022, 1:02 PM
This revision was landed with ongoing or failed builds.Jun 10 2022, 1:30 PM
This revision was automatically updated to reflect the committed changes.
Miss_Grape added inline comments.
llvm/test/Analysis/CostModel/RISCV/reduce-add.ll
7

why the costModel of v1i1 is not 5 but 2

Miss_Grape added inline comments.Aug 23 2022, 5:48 AM
llvm/test/Analysis/CostModel/RISCV/reduce-max.ll
180

llvm.vector.reduce.smax.v2i16 and llvm.vector.reduce.smax.v4i16 have the same asemble, why
llvm.vector.reduce.smax.v2i16: cost model 3
llvm.vector.reduce.smax.v4i16: cost model 4