This modification strategy tries to understand which GEP instrucions is profitable to modify for register pressure decreasing.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Time | Test | |
---|---|---|
60,100 ms | x64 debian > Clang.CodeGenCXX::dllimport-members.cpp Script:
--
: 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc -no-opaque-pointers -no-enable-noundef-analysis -disable-llvm-passes -triple i686-windows-msvc -fms-compatibility -emit-llvm -std=c++1y -O0 -o - /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGenCXX/dllimport-members.cpp -DMSABI | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --check-prefix=MSC --check-prefix=M32 /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGenCXX/dllimport-members.cpp
| |
60,080 ms | x64 debian > Clang.Driver::emit-reproducer.c Script:
--
: 'RUN: at line 1'; rm -rf /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/Driver/Output/emit-reproducer.c.tmp && mkdir /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/test/Driver/Output/emit-reproducer.c.tmp
| |
61,540 ms | x64 debian > Clang.Driver::fsanitize.c Script:
--
: 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -target x86_64-linux-gnu -fsanitize=undefined -fsanitize-trap=undefined /var/lib/buildkite-agent/builds/llvm-project/clang/test/Driver/fsanitize.c -### 2>&1 | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/clang/test/Driver/fsanitize.c --check-prefix=CHECK-UNDEFINED-TRAP
| |
60,570 ms | x64 debian > Clang.OpenMP::target_defaultmap_codegen_01.cpp Script:
--
: 'RUN: at line 8'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc -no-opaque-pointers -no-enable-noundef-analysis -DCK1 -verify -fopenmp -fopenmp-version=50 -fopenmp-targets=powerpc64le-ibm-linux-gnu -x c++ -triple powerpc64le-unknown-unknown -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_defaultmap_codegen_01.cpp -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes -allow-deprecated-dag-overlap /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_defaultmap_codegen_01.cpp --check-prefix CK1
| |
60,170 ms | x64 debian > Clang.OpenMP::target_parallel_for_codegen_registration.cpp Script:
--
: 'RUN: at line 2'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc -no-opaque-pointers -verify -fopenmp -fopenmp-version=45 -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_parallel_for_codegen_registration.cpp -o - | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck --allow-unused-prefixes /var/lib/buildkite-agent/builds/llvm-project/clang/test/OpenMP/target_parallel_for_codegen_registration.cpp
| |
View Full Test Results (9 Failed) |
Event Timeline
At least for me, I need some context to be able to review this. What is the case which this improves in terms of codegen? And how common are such patterns? Keep in mind, I'm not terribly familiar with the pass here, so this may be pretty basic explanation. Do you have a bug with examples or analyze that lead to this change?
Sorry, I had to provide the context at the beginning.
Now clang for RISC-V doesn't use offset addressing in generated assembly. Example from Dhrystone
addiw a0, s1, 5 slli a1, a0, 0x2 add a2, s4, a1 sw s2, 0(a2) addiw a3, s1, 6 slli a3, a3, 0x2 add a3, a3, s4 sw s2, 0(a3) addiw a3, s1, 35 slli a3, a3, 0x2 add a3, a3, s4 sw a0, 0(a3)
It's inefficient because we can use offsets.
Adding this pass allows to generate the next code
addiw a4, a2, 5 slli a5, a2, 2 add a0, a0, a5 sw a3, 20(a0) sw a3, 24(a0) sw a4, 140(a0) ...
SeparateConstOffsetFromGEPPass is used to solve this problem in targets with limited addressing modes.
The changes inside pass was made because modification of all GEPs isn't profitable, seems that we need at least 2 GEPs and one value that was used for index that can be removed after modification. Otherwise we don't decrease register pressure.
This should be two patches, one changing the pass and one enabling for RISC-V.
llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | ||
---|---|---|
171 ↗ | (On Diff #436758) | Can we add a command line option to control this like AArch64 and PowerPC have? |
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll | ||
2 | This test doesn't exist in the repo. Where is the patch that adds it? |
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll | ||
---|---|---|
2 | I was told in the first comment to rebase on precommited tests. These tests are added as precommited in separate commit. Should I commit them? |
llvm/test/Transforms/SeparateConstOffsetFromGEP/RISCV/split-gep.ll | ||
---|---|---|
2 | Why is the script NOTE: Assertions have been autogenerated by utils/update_test_checks.py not in the pre-committed version? |
This should be two patches, one changing the pass and one enabling for RISC-V.
As far as I want to turn pass with enabled strategy, should I wait approve and merge of the accepting strategy and only after this create the second review? Or create series of patches as mentionedin documentation https://llvm.org/docs/Phabricator.html#creating-a-patch-series?