Page MenuHomePhabricator

[IR] Extend llvm.vector.reduce.fadd
AbandonedPublic

Authored by junaire on Jan 17 2022, 6:42 AM.

Details

Summary

This patch extend the @llvm.vector.reduce.fadd to take another integer
argument to indicate the order to apply.

Diff Detail

Unit TestsFailed

TimeTest
1,350 msx64 debian > Clang.CodeGen/X86::avx512-reduceIntrin.c
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -ffreestanding /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/X86/avx512-reduceIntrin.c -O0 -triple=x86_64-apple-darwin -target-cpu skylake-avx512 -emit-llvm -o - -Wall -Werror | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/X86/avx512-reduceIntrin.c
1,680 msx64 debian > Clang.CodeGen/X86::avx512fp16-builtins.c
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -ffreestanding -flax-vector-conversions=none /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/X86/avx512fp16-builtins.c -triple=x86_64-unknown-unknown -target-feature +avx512fp16 -emit-llvm -o - -Wall -Werror | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/X86/avx512fp16-builtins.c
1,640 msx64 debian > Clang.CodeGen/X86::avx512vlfp16-builtins.c
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/14.0.0/include -nostdsysteminc -flax-vector-conversions=none -ffreestanding /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/X86/avx512vlfp16-builtins.c -triple=x86_64-unknown-unknown -target-feature +avx512vl -target-feature +avx512fp16 -emit-llvm -o - -Wall -Werror | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGen/X86/avx512vlfp16-builtins.c
1,340 msx64 debian > Clang.utils/update_cc_test_checks::check-globals.test
Script: -- : 'RUN: at line 1'; rm -rf /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/utils/update_cc_test_checks/Output/check-globals.test.tmp && mkdir /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/utils/update_cc_test_checks/Output/check-globals.test.tmp
600 msx64 debian > Clang.utils/update_cc_test_checks::global-hex-value-regex.test
Script: -- : 'RUN: at line 1'; rm -rf /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/utils/update_cc_test_checks/Output/global-hex-value-regex.test.tmp && mkdir /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/utils/update_cc_test_checks/Output/global-hex-value-regex.test.tmp
View Full Test Results (516 Failed)

Event Timeline

junaire created this revision.Jan 17 2022, 6:42 AM
junaire requested review of this revision.Jan 17 2022, 6:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2022, 6:42 AM
nikic added a comment.Jan 17 2022, 6:47 AM

I don't get this change. We already use reassoc FMF to allow a non-ordered reduction, what's the purpose of the new flag?

This also needs a LangRef update.

Please note that this patch is WIP.

After we add a new argument to the intrinsic, how can we update the previous lit tests automatically? I haven't tried to update all the tests by hand, cuz it looks like takes a huge amount of work and I believe there should be a way to do things more easily and I don't know.

I don't get this change. We already use reassoc FMF to allow a non-ordered reduction, what's the purpose of the new flag?

This also needs a LangRef update.

Thanks to have a look at this!
The relevant context is here: https://reviews.llvm.org/D116736

I don't get this change. We already use reassoc FMF to allow a non-ordered reduction, what's the purpose of the new flag?

Well, maybe I didn't get the reviewer's ideas right? Any suggestions are appreciated!

fhahn added a comment.Jan 17 2022, 7:00 AM

I don't get this change. We already use reassoc FMF to allow a non-ordered reduction, what's the purpose of the new flag?

This also needs a LangRef update.

reassoc implies 'any order', but in some cases it is desirable to specify a specific order, e.g. for the vector reduction builtin provided by Clang.

lebedev.ri requested changes to this revision.Jan 17 2022, 7:07 AM
lebedev.ri added a subscriber: lebedev.ri.

I don't get this change. We already use reassoc FMF to allow a non-ordered reduction, what's the purpose of the new flag?

This also needs a LangRef update.

reassoc implies 'any order', but in some cases it is desirable to specify a specific order, e.g. for the vector reduction builtin provided by Clang.

I'm not convinced about all this implied complexity.
I think clang will just have to emit expanded form of the reduction in that case.

This revision now requires changes to proceed.Jan 17 2022, 7:07 AM
nikic added a comment.Jan 17 2022, 7:08 AM

I don't get this change. We already use reassoc FMF to allow a non-ordered reduction, what's the purpose of the new flag?

This also needs a LangRef update.

reassoc implies 'any order', but in some cases it is desirable to specify a specific order, e.g. for the vector reduction builtin provided by Clang.

Are we following someone else's standard here, or can we specify the behavior? If a specific reduction order is required, I would very much expect that to be "ordered reduction".

I'm not a fan of introducing a third order here if we can avoid it. The value of having something other than ordered/unordered is not clear to me.

fhahn added a comment.Jan 17 2022, 7:53 AM

reassoc implies 'any order', but in some cases it is desirable to specify a specific order, e.g. for the vector reduction builtin provided by Clang.

Are we following someone else's standard here, or can we specify the behavior? If a specific reduction order is required, I would very much expect that to be "ordered reduction".

I'm not a fan of introducing a third order here if we can avoid it. The value of having something other than ordered/unordered is not clear to me.

I assume ordered here means sequential. The clang builtin specifies a tree-wise reduction order. The main motivation for that order is to enable faster execution and consistent results on different HW architectures. At the moment the existing intrinsic is either slow & consistent or fast & inconsistent.

I'm not convinced about all this implied complexity.
I think clang will just have to emit expanded form of the reduction in that case.

Sure it could, but it seems unfortunate that a common reduction pattern cannot be expressed with the reduction intrinsic. The main benefit of using the reduction intrinsic here is to make instruction selection substantially easier. AFAICT the main complexity will be in the backends, which will now have to be able to deal with the new order. But given that at the moment only AArch64 and RISCV, it should't be too bad?

scanon added a subscriber: scanon.Jan 17 2022, 8:06 AM

I assume ordered here means sequential. The clang builtin specifies a tree-wise reduction order.

It's worth noting that "sequential" is only desirable for matching scalar code (which is important, but isn't everything). Explicit vector code essentially always prefers a tree reduction (either even + odd or low + high, depending on arch/uarch, but either is better than sequential). Either normal tree reduction delivers faster results with smaller average errors, and they are still reproducible. Having a defined tree reduction is a great tool for explicit vectorizers.

junaire abandoned this revision.Thu, Jun 30, 11:52 PM

Abandon this since other folks now seem have better ways to solve the issue.

Herald added a project: Restricted Project. · View Herald TranscriptThu, Jun 30, 11:52 PM
Herald added a subscriber: StephenFan. · View Herald Transcript