This is an archive of the discontinued LLVM Phabricator instance.

[Cost]Canonicalize the cost for logical or/and reductions.
ClosedPublic

Authored by ABataev on Mar 4 2021, 8:55 AM.

Details

Summary

The generic cost of logical or/and reductions should be cost of bitcast
<ReduxWidth x i1> to iReduxWidth + cmp eq|ne iReduxWidth.

Diff Detail

Event Timeline

ABataev created this revision.Mar 4 2021, 8:55 AM
ABataev requested review of this revision.Mar 4 2021, 8:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2021, 8:55 AM
Herald added a subscriber: MaskRay. · View Herald Transcript
ABataev added inline comments.Mar 4 2021, 8:57 AM
llvm/test/Analysis/CostModel/SystemZ/reduce-and.ll
13

The cost model for SystemZ is not complete/correct, needs to be fixed, that's why there is a regression for it.

ABataev added inline comments.Mar 4 2021, 8:59 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
1909

SystemZ cost model does not implement vector-to-int bitcast and crashes. That's why have to use Base::getCastInstrCost() here rather than thisT()->getCastInstrCost()

ABataev retitled this revision from [Cost]Caonicalize the cost for logical or/and reductions. to [Cost]Canonicalize the cost for logical or/and reductions..Mar 4 2021, 8:59 AM
david-arm added inline comments.
llvm/include/llvm/CodeGen/BasicTTIImpl.h
1902

I'm not sure this is always true because some backends (e.g. AArch64) promote i1 to larger integers. The costs for AArch64 still look a bit odd to be honest. I tried them out manually and I observe about 8 instructions for AND reductions using <4 x i1> vectors since we have lots of bytewise moves of -1 into the vector lanes of a <4 x i32> vector.

ABataev added inline comments.Mar 4 2021, 9:44 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
1902

This is known problem, see
https://bugs.llvm.org/show_bug.cgi?id=41636
https://bugs.llvm.org/show_bug.cgi?id=41635
https://bugs.llvm.org/show_bug.cgi?id=41634

Looks like the construct is not lowered properly on some targets

david-arm added inline comments.Mar 5 2021, 12:44 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
1902

Sure, I totally agree the codegen for ARM and AArch64 is awful and I take your point. I was just wondering if this assumption was a problem:

%val = bitcast <ReduxWidth x i1> to iReduxWidth

as I don't think is true for targets that promote i1 to i32 or something like that. In the bug shown above (https://bugs.llvm.org/show_bug.cgi?id=41636) even the optimal code is still operating on vectors of i8 types.

I guess for those targets that do promote i1->iX they can come up with their own cost in the target specific getArithmeticReductionCost so maybe this isn't really a problem?

ABataev added inline comments.Mar 5 2021, 5:45 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
1902

Yes, this is the idea. This patch provides just the basic cost estimation for this particular case, in case if the target cost is different it should define its own cost for this case.

RKSimon added inline comments.Mar 7 2021, 3:33 AM
llvm/include/llvm/CodeGen/BasicTTIImpl.h
1909

How much of a task would it be to tweak the SystemZ TTI to avoid this? I worry that this kind of thing gets forgotten about and could cause other problems in the future.

ABataev updated this revision to Diff 331859.Mar 19 2021, 6:58 AM

Added basic cost estimation for vector-to-scalar cast for SystemZ target.

RKSimon accepted this revision.Mar 19 2021, 7:54 AM

LGTM - cheers.

This is known problem, see
https://bugs.llvm.org/show_bug.cgi?id=41636
https://bugs.llvm.org/show_bug.cgi?id=41635
https://bugs.llvm.org/show_bug.cgi?id=41634

Please can you ping those bugs mentioning this default cost change?

This revision is now accepted and ready to land.Mar 19 2021, 7:54 AM

LGTM - cheers.

This is known problem, see
https://bugs.llvm.org/show_bug.cgi?id=41636
https://bugs.llvm.org/show_bug.cgi?id=41635
https://bugs.llvm.org/show_bug.cgi?id=41634

Please can you ping those bugs mentioning this default cost change?

Sure, will do, thanks!