Page MenuHomePhabricator

[AMDGPU] Disable NSA for BVH instructions when appropriate
Needs ReviewPublic

Authored by critson on May 27 2021, 2:05 AM.

Details

Reviewers
foad
rampitec
Summary

Check maximum NSA size when selecting NSA or non-NSA BVH instructions.

Diff Detail

Unit TestsFailed

TimeTest
3,110 msx64 debian > libarcher.barrier::barrier.c
Script: -- : 'RUN: at line 14'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/barrier/barrier.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/barrier/Output/barrier.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/barrier/Output/barrier.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/barrier/Output/barrier.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/barrier/barrier.c
3,100 msx64 debian > libarcher.critical::critical.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c
3,380 msx64 debian > libarcher.critical::lock-nested.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock-nested.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock-nested.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock-nested.c
3,520 msx64 debian > libarcher.races::critical-unrelated.c
Script: -- : 'RUN: at line 13'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/critical-unrelated.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/deflake.bash /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/critical-unrelated.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/critical-unrelated.c
3,430 msx64 debian > libarcher.races::lock-nested-unrelated.c
Script: -- : 'RUN: at line 13'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-nested-unrelated.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-nested-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/deflake.bash /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-nested-unrelated.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/lock-nested-unrelated.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/races/lock-nested-unrelated.c
View Full Test Results (19 Failed)

Event Timeline

critson created this revision.May 27 2021, 2:05 AM
critson requested review of this revision.May 27 2021, 2:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 27 2021, 2:05 AM

Any reason to force SA?

Any reason to force SA?

This is primarily for testing stability issues with long form NSA instructions.
I am half minded that it does not need to go into the code base, but feel like the code should at least exist somewhere, such as a this review.

critson updated this revision to Diff 353298.Jun 21 2021, 1:22 AM
  • Rework this based on NSA limit functionality
critson retitled this revision from [AMDGPU] Add options to disable NSA for BVH instructions to [AMDGPU] Disable NSA for BVH instructions when appropriate.Jun 21 2021, 1:23 AM
critson edited the summary of this revision. (Show Details)
foad added inline comments.Jun 21 2021, 1:43 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4781

Maybe use a 2 by 2 by 2 array of opcodes?

critson updated this revision to Diff 353329.Jun 21 2021, 3:52 AM
  • Use array for opcode look up
critson updated this revision to Diff 361583.Sun, Jul 25, 11:59 PM

Ping.
It would be good to get this merged now the other NSA limiting features are in.

foad added inline comments.Mon, Jul 26, 7:15 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4812

Why do we have to round up to 8 or 16?

4816

Can this be undef instead of 0? Can we push the same register N times instead of creating N different registers?

llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
4259

Why does this hard code 512, when the comment says 256 or 512?

critson updated this revision to Diff 361938.Tue, Jul 27, 1:25 AM
  • Remove unnecessary padding.
  • Properly set bank size
critson marked 4 inline comments as done.Tue, Jul 27, 1:29 AM
critson added inline comments.
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4812

I cannot remember why I did this for GlobalIsel.
It certainly works without, perhaps it didn't when I first wrote this.

foad added inline comments.Tue, Jul 27, 2:19 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
4813–4814

Nit: we generally avoid explicit createGenericVirtualRegister calls. You can write: Register MergedOps = B.buildMerge(OpTy, Ops).getReg(0);

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
7414

Same question as for globalisel: do we need to round up at all here? Rounding up to 8 certainly seems odd now that we have v5, v6, v7 classes.

7416

Can we use undef instead of zero to avoid having to materialise a constant?