
[RISCV] Enable interleaved access vectorization
Needs Review · Public

Authored by luke957 on Jun 29 2021, 9:31 AM.

Details

Summary

Enable interleaved access vectorization.

Diff Detail

Unit Tests: Failed

Time  Test
2,760 ms  x64 debian > libarcher.critical::critical.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c
2,800 ms  x64 debian > libarcher.critical::lock-nested.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock-nested.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock-nested.c
2,950 ms  x64 debian > libarcher.races::critical-unrelated.c
Script: -- : 'RUN: at line 13'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/critical-unrelated.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/deflake.bash /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/critical-unrelated.c
2,930 ms  x64 debian > libarcher.races::lock-nested-unrelated.c
Script: -- : 'RUN: at line 13'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-nested-unrelated.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-nested-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/deflake.bash /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-nested-unrelated.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-nested-unrelated.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-nested-unrelated.c
3,030 ms  x64 debian > libarcher.races::lock-unrelated.c
Script: -- : 'RUN: at line 13'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-unrelated.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/deflake.bash /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-unrelated.c
View Full Test Results (19 Failed)

Event Timeline

luke957 created this revision. Jun 29 2021, 9:31 AM
luke957 requested review of this revision. Jun 29 2021, 9:31 AM
Herald added a project: Restricted Project. Jun 29 2021, 9:31 AM

Please upload patches with full context using -U999999, as documented here: https://releases.llvm.org/11.0.0/docs/Phabricator.html#requesting-a-review-via-the-web-interface

Do you plan to map these to segment load/store instructions in the future?

llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll
5

Is -enable-interleaved-mem-accesses=true needed if TTI's enableInterleavedAccessVectorization() returns true?

Do you plan to map these to segment load/store instructions in the future?

Yeah, segment load/store instructions are naturally suitable for mapping these. Do we need to create a new RISCVISD?
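
To make the fit concrete, here is a minimal scalar model of what a unit-stride segment load/store with NF fields does to memory. The function names `segmentLoad` and `segmentStore` are made up for illustration; this is a sketch of the access pattern only, not of any LLVM or RISCVISD interface.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical model of a unit-stride segment load with NF fields
// (NF = 2 would correspond to a two-field segment load).
// Field f of segment i lives at mem[i * NF + f] and lands in
// element i of destination register f.
template <std::size_t NF>
std::array<std::vector<int>, NF> segmentLoad(const std::vector<int> &mem,
                                             std::size_t vl) {
  std::array<std::vector<int>, NF> regs;
  for (std::size_t f = 0; f < NF; ++f) {
    regs[f].resize(vl);
    for (std::size_t i = 0; i < vl; ++i)
      regs[f][i] = mem[i * NF + f];
  }
  return regs;
}

// The matching segment store: element i of register f is written
// back to mem[i * NF + f], re-interleaving the fields.
template <std::size_t NF>
std::vector<int> segmentStore(const std::array<std::vector<int>, NF> &regs,
                              std::size_t vl) {
  std::vector<int> mem(vl * NF);
  for (std::size_t i = 0; i < vl; ++i)
    for (std::size_t f = 0; f < NF; ++f)
      mem[i * NF + f] = regs[f][i];
  return mem;
}
```

With NF = 2 this is the same even/odd split that the vectorizer expresses for an interleave group of factor 2, which is why the mapping looks natural.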

luke957 added inline comments. Jul 24 2021, 1:52 AM
llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll
5

Right, -enable-interleaved-mem-accesses=true is no longer needed.
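
For context, a rough model of why the flag becomes redundant, assuming the vectorizer takes the target hook's answer as the default and only lets the command-line option override it when it was actually passed. `MockTTI`, `FlagOverride` and `useInterleavedAccesses` are hypothetical stand-ins written for this sketch, not LLVM classes.

```cpp
// Hypothetical stand-in for the target hook discussed above.
struct MockTTI {
  bool EnableInterleaved;
  bool enableInterleavedAccessVectorization() const {
    return EnableInterleaved;
  }
};

// Hypothetical model of -enable-interleaved-mem-accesses:
// Passed is false when the option did not appear on the command line.
struct FlagOverride {
  bool Passed;
  bool Value;
};

// Assumed decision logic: target default first, explicit flag wins.
bool useInterleavedAccesses(const MockTTI &TTI, const FlagOverride &Flag) {
  bool UseInterleaved = TTI.enableInterleavedAccessVectorization();
  if (Flag.Passed)
    UseInterleaved = Flag.Value;
  return UseInterleaved;
}
```

Under that assumption, once the RISCV hook returns true the test's RUN line gets the same behaviour without passing the option.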

If we aren't using segment load/store, what does the backend codegen for this look like?

Yeah, segment load/store instructions are naturally suitable for mapping these. Do we need to create a new RISCVISD?

I believe we need to run the InterleavedAccessPass and implement TargetLowering::LowerInterleavedLoad/Store to create IR intrinsics. That's how it is done on ARM for their vldX and vstX instructions.

If we aren't using segment load/store, what does the backend codegen for this look like?

It looks like this

%wide.vec = load <8 x i32>, <8 x i32>* %1, align 4
%strided.vec = shufflevector <8 x i32> %wide.vec, <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%strided.vec1 = shufflevector <8 x i32> %wide.vec, <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
...
%interleaved.vec = shufflevector <4 x i32> %3, <4 x i32> %4, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %7, align 4

InnerLoopVectorizer::vectorizeInterleaveGroup() will generate shufflevector instructions for interleaved accesses.
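
For readers less familiar with shufflevector masks, the IR above can be modelled in plain C++. This is only a scalar sketch of the shuffle semantics; `shuffle`, `deinterleave` and `interleave` are names invented here, and the masks are the ones from the snippet.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Model of a two-operand shufflevector: mask index m selects a[m]
// when m < a.size(), otherwise b[m - a.size()].
std::vector<int> shuffle(const std::vector<int> &a, const std::vector<int> &b,
                         const std::vector<std::size_t> &mask) {
  std::vector<int> out;
  out.reserve(mask.size());
  for (std::size_t m : mask)
    out.push_back(m < a.size() ? a[m] : b[m - a.size()]);
  return out;
}

// %strided.vec / %strided.vec1: split an 8-element wide load into its
// even and odd lanes. The second operand is poison in the IR; a zero
// vector is used here as a placeholder because no mask index selects it.
std::pair<std::vector<int>, std::vector<int>>
deinterleave(const std::vector<int> &wide) {
  const std::vector<int> placeholder(wide.size(), 0);
  return {shuffle(wide, placeholder, {0, 2, 4, 6}),
          shuffle(wide, placeholder, {1, 3, 5, 7})};
}

// %interleaved.vec: zip two 4-element vectors back into one 8-element
// vector before the wide store.
std::vector<int> interleave(const std::vector<int> &a,
                            const std::vector<int> &b) {
  return shuffle(a, b, {0, 4, 1, 5, 2, 6, 3, 7});
}
```

The deinterleaving shuffles each read a single source vector, while the interleaving shuffle reads two, which is the case the backend has to handle for the store side.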

I believe we need to run the InterleavedAccessPass and implement TargetLowering::LowerInterleavedLoad/Store to create IR intrinsics. That's how it is done on ARM for their vldX and vstX instructions.

Yeah, I think that is the right direction. Thanks. It seems I should submit a patch implementing TargetLowering::LowerInterleavedLoad/Store before this one.

I was asking what the RISCV assembly looks like. We don't have a two-input shuffle instruction, so this has to be broken down into something like 2 vrgathers and a vmerge, but I'm not sure.
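
As a way to reason about that guess, here is a scalar model of the "two gathers plus a merge" decomposition of the interleaving shuffle. It assumes the usual semantics of a register gather (out-of-range indices yield zero) and of a masked merge; the helper names are hypothetical, and this is not the code the backend actually emits.

```cpp
#include <cstddef>
#include <vector>

// Assumed gather semantics: out[i] = src[idx[i]], or 0 when the index
// is out of range.
std::vector<int> gather(const std::vector<int> &src,
                        const std::vector<std::size_t> &idx) {
  std::vector<int> out(idx.size());
  for (std::size_t i = 0; i < idx.size(); ++i)
    out[i] = idx[i] < src.size() ? src[idx[i]] : 0;
  return out;
}

// Assumed merge semantics: out[i] = mask[i] ? ifTrue[i] : ifFalse[i].
std::vector<int> merge(const std::vector<int> &ifFalse,
                       const std::vector<int> &ifTrue,
                       const std::vector<bool> &mask) {
  std::vector<int> out(mask.size());
  for (std::size_t i = 0; i < mask.size(); ++i)
    out[i] = mask[i] ? ifTrue[i] : ifFalse[i];
  return out;
}

// Interleave a and b (same length) as a0 b0 a1 b1 ...: gather each
// source with indices 0,0,1,1,..., then take odd lanes from b.
std::vector<int> interleaveViaGather(const std::vector<int> &a,
                                     const std::vector<int> &b) {
  const std::size_t n = 2 * a.size();
  std::vector<std::size_t> idx(n);
  std::vector<bool> oddLane(n);
  for (std::size_t i = 0; i < n; ++i) {
    idx[i] = i / 2;
    oddLane[i] = (i % 2) == 1;
  }
  return merge(gather(a, idx), gather(b, idx), oddLane);
}
```

The model glosses over register grouping and vector length: on real hardware both sources would have to sit in registers wide enough for the 8-element result before gathering.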