This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Legalize horizontal fmax/fmin reductions on f16 vectors
ClosedPublic

Authored by LemonBoy on Mar 3 2021, 1:14 AM.

Details

Summary

Expand the horizontal reduction during the instruction selection phase, but only if the target doesn't support the full fp16 instruction set.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49401

Diff Detail

Event Timeline

LemonBoy created this revision.Mar 3 2021, 1:14 AM
LemonBoy requested review of this revision.Mar 3 2021, 1:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2021, 1:14 AM
david-arm added inline comments.
llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll
3

Hi, instead of having two RUN lines here would it be better to just use function attributes, i.e. for test_v4f16 have a function attribute that adds "fullfp16" support? That way you only need one RUN line and can avoid all the additional FP16 check lines too, since most of them seem to be the same as the first RUN line.

llvm/test/CodeGen/AArch64/vecreduce-fmin-legalization.ll
68

It looks like you've added FP16 checks here without the additional RUN line. Similar to my comment in the fmax case perhaps you can just use function attributes instead?

fhahn added a subscriber: fhahn.Mar 3 2021, 1:32 AM
fhahn added inline comments.
llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll
3

You could also use multiple check prefixes to avoid having repeated check for both RUN lines, if they are equal, e.g. FileCheck %s --check-prefix=CHECK --check-prefix=NOFP16 ..., FileCheck %s --check-prefix=CHECK --check-prefix=FP16 ...

With that, we should only need separate FP16/NOFP16 check lines for functions where there is a difference.

LemonBoy added inline comments.Mar 3 2021, 1:36 AM
llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll
3

The main idea is to check the expanded reduction is emitted when fullfp16 is not specified, I guess the FP16 check can be removed altogether if we don't care about checking the case where the intrinsic can be lowered.

llvm/test/CodeGen/AArch64/vecreduce-fmin-legalization.ll
68

Good catch, I keep forgetting pieces when I git add.

david-arm added inline comments.Mar 3 2021, 1:38 AM
llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll
3

Hi @fhahn sure that would work too and it's a good point - I was just thinking that fewer RUN lines meant faster test suites that's all.

nikic added a subscriber: nikic.Mar 3 2021, 1:39 AM
nikic added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
10488
fhahn added inline comments.Mar 3 2021, 1:49 AM
llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll
3

I was just thinking that fewer RUN lines meant faster test suites that's all.

That's also a good point. I'm not sure how things will shake out with more tests for half; I think we should have fp16 tests both with and without fullfp16, as long we have that and can avoid must of the redundant check lines, I am happy whichever way we go :)

67

can you also throw in a test with vectors that are not directly legal, regardless of fullfp16?

LemonBoy updated this revision to Diff 327735.Mar 3 2021, 3:08 AM

Don't mark the fmin/fmax ops as legal if fullfp16 is not available.
Update test cases.

LemonBoy marked 4 inline comments as done.Mar 3 2021, 3:11 AM
LemonBoy added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
10488

Indeed, fixed.
Thanks for the suggestion!

llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll
67

That's hitting a different problem in the legalization step, I'll try to address that in a separate patch.

aemerson accepted this revision.Mar 3 2021, 2:58 PM

LGTM.

This revision is now accepted and ready to land.Mar 3 2021, 2:58 PM