This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Improve PRE on load instructions
ClosedPublic

Authored by Carrot on Dec 7 2022, 2:31 PM.

Details

Summary

This patch implements the enhancement proposed by https://github.com/llvm/llvm-project/issues/59312.

Suppose we have following code

   v0 = load %addr
   br %LoadBB

LoadBB:
   v1 = load %addr
   ...

PredBB:
   ...
   br %cond, label %LoadBB, label %SuccBB

SuccBB:
   v2 = load %addr
   ...

Instruction v1 in LoadBB is partially redundant, edge (PredBB, LoadBB) is a critical edge. SuccBB is another successor of PredBB, it contains another load v2 which is identical to v1. Current GVN splits the critical edge (PredBB, LoadBB) and inserts a new load in it. A better method is move the load of v2 into PredBB, then v1 can be changed to a PHI instruction.

If there are two or more similar predecessors, like the test case in the bug entry, current GVN simply gives up because otherwise it needs to split multiple critical edges. But we can move all loads in successor blocks into predecessors.

Diff Detail

Event Timeline

Carrot created this revision.Dec 7 2022, 2:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2022, 2:31 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
Carrot requested review of this revision.Dec 7 2022, 2:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2022, 2:31 PM
nikic added a reviewer: nikic.Dec 7 2022, 11:56 PM

Seems like this patch keeps finding values in vectors, and it might be a reason of CT degradation. Maybe refactor them to be maps instead?

llvm/lib/Transforms/Scalar/GVN.cpp
931

AvailValInBlkVect ?

Carrot added a comment.Dec 8 2022, 4:41 PM

Seems like this patch keeps finding values in vectors, and it might be a reason of CT degradation. Maybe refactor them to be maps instead?

Only function ReplaceValuesPerBlockEntry searching values in vector, it is called in the final transformation step. So it is unlikely causing CT issue unless there are a lot of such optimizations actually triggered. I would like to look for CT issues in earlier time.

llvm/lib/Transforms/Scalar/GVN.cpp
931

Similar to ConstructSSAForLoadSet, this function is a member of GVNPass, but AvailValInBlkVect is private in GVNPass. So I can't directly use it here.

Carrot updated this revision to Diff 481483.Dec 8 2022, 5:27 PM

Compile time improvement.

The original code checks all predecessors of LoadBB and find possible solution for all predecessors. And later checks if the solution is profitable. This versions checks if current solution is still profitable in the process of checking predecessors, and early exit if current solution is already non profitable.

The compile time regression on ClamAV is in file libclamav_htmlnorm.c. The increased compile time is completely in GVN pass, from 4.0s to 5.3s. There is a huge function cli_html_normalise in this file, it's more than 6600 lines in generated assembly file. The statistic result of NumPRELoad increased from 1 to 10 because of this patch. In function GVNPass::runImpl we have

unsigned Iteration = 0;
while (ShouldContinue) {
  LLVM_DEBUG(dbgs() << "GVN iteration: " << Iteration << "\n");
  (void) Iteration;
  ShouldContinue = iterateOnFunction(F);
  Changed |= ShouldContinue;
  ++Iteration;
}

It means we continuously do GVN on a function until there is no more such optimization applicable. This patch enables more optimizations, potentially it may also cause more iterations on a function. In this case, the loop is executed 4 times without this patch, but 5 times with this patch. These numbers closely correlate to the increased compile time.

So the extra compile time is caused by more iterations on a huge function, and the more iterations is caused by more optimizations enabled by this patch.

nikic added a comment.Dec 16 2022, 8:03 AM

The compile time regression on ClamAV is in file libclamav_htmlnorm.c. The increased compile time is completely in GVN pass, from 4.0s to 5.3s. There is a huge function cli_html_normalise in this file, it's more than 6600 lines in generated assembly file. The statistic result of NumPRELoad increased from 1 to 10 because of this patch. In function GVNPass::runImpl we have

unsigned Iteration = 0;
while (ShouldContinue) {
  LLVM_DEBUG(dbgs() << "GVN iteration: " << Iteration << "\n");
  (void) Iteration;
  ShouldContinue = iterateOnFunction(F);
  Changed |= ShouldContinue;
  ++Iteration;
}

It means we continuously do GVN on a function until there is no more such optimization applicable. This patch enables more optimizations, potentially it may also cause more iterations on a function. In this case, the loop is executed 4 times without this patch, but 5 times with this patch. These numbers closely correlate to the increased compile time.

So the extra compile time is caused by more iterations on a huge function, and the more iterations is caused by more optimizations enabled by this patch.

Thanks for the analysis. I think this is fine then. I expect https://reviews.llvm.org/D140097 to mitigate this somewhat.

llvm/lib/Transforms/Scalar/GVN.cpp
930

nit: replaceValuesPerBlockEntry

1399

Block scan needs a cutoff.

1404

Do we need to be concerned about speculating the load? You do perform a speculation check, but I think it does not cover the speculation of this load.

1405

cast<>

2703

I am not sure this assertion is safe to remove. I think a problem with your current code is that it may try to remove a load that is part of the leader table for that block, if that block has been processed before the current one.

llvm/test/Transforms/GVN/PRE/pre-load.ll
766

Please pre-commit the test.

Carrot updated this revision to Diff 484624.Dec 21 2022, 11:01 AM
Carrot marked 4 inline comments as done.
Carrot added inline comments.
llvm/lib/Transforms/Scalar/GVN.cpp
1404

Do you mean the following code in function PerformLoadPRE ?

+    for (auto &CEP : CriticalEdgePredAndLoad)
+      if (!isSafeToSpeculativelyExecute(Load, CEP.first->getTerminator(), AC,
+                                        DT))
+        return false;

It's specifically for this purpose. I used the original Load instruction instead of CEP.second because all of them are identical.

2703

Added code into function eliminatePartiallyRedundantLoad to delete the corresponding leader table entry.

mkazantsev accepted this revision.Dec 25 2022, 8:27 PM

I think this is fine, please wait few more days if someone else has concerns.

llvm/lib/Transforms/Scalar/GVN.cpp
1840

nit: /*CriticalEdgePredAndLoad*/ nullptr

This revision is now accepted and ready to land.Dec 25 2022, 8:27 PM
Carrot updated this revision to Diff 486682.Jan 5 2023, 2:33 PM
Carrot marked an inline comment as done.

Thanks for the review! Will commit this version.

lkail added a subscriber: lkail.Jan 5 2023, 6:50 PM
This revision was landed with ongoing or failed builds.Jan 9 2023, 3:09 PM
This revision was automatically updated to reflect the committed changes.
chapuni added a subscriber: chapuni.Jan 9 2023, 5:40 PM

Seems this causes miscompilation. Investigating.

dyung added a subscriber: dyung.Jan 9 2023, 6:21 PM

@Carrot This change seems to be causing some weird unit test failures that I cannot explain. Can you take a look and revert if you need time to investigate?

https://lab.llvm.org/buildbot/#/builders/247/builds/294
https://lab.llvm.org/buildbot/#/builders/75/builds/25910

Carrot added a comment.Jan 9 2023, 7:02 PM

@dyung @chapuni thank you for the report, I have reverted it.
How did you investigate it ? I try to reproduce it according to https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild, but always got following error

...
+ BUILDBOT_MONO_REPO_PATH=
+ BUILDBOT_REVISION=llvmorg-
+ buildbot_update
+ echo @@@BUILD_STEP update llvmorg-@@@
@@@BUILD_STEP update llvmorg-@@@
+ [[ -d '' ]]
+ local DEPTH=100
+ [[ -d llvm-project ]]
+ cd llvm-project
+ git fetch origin
+ git clean -fd
+ local REV=llvmorg-
+ git checkout -f llvmorg-
error: pathspec 'llvmorg-' did not match any file(s) known to git
+ git status
On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)
+ git rev-list --pretty --max-count=1 HEAD
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
+ build_exception
+ echo

+ echo 'How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild'
How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild
+ echo
...

This change failed the https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c (-O2/-O3) on many platforms including x86-64.

I think it is easy to reproduce:

clang -O2 pr59747.c
./a.out
dyung added a comment.EditedJan 9 2023, 7:06 PM

@dyung @chapuni thank you for the report, I have reverted it.
How did you investigate it ? I try to reproduce it according to https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild, but always got following error

...
+ BUILDBOT_MONO_REPO_PATH=
+ BUILDBOT_REVISION=llvmorg-
+ buildbot_update
+ echo @@@BUILD_STEP update llvmorg-@@@
@@@BUILD_STEP update llvmorg-@@@
+ [[ -d '' ]]
+ local DEPTH=100
+ [[ -d llvm-project ]]
+ cd llvm-project
+ git fetch origin
+ git clean -fd
+ local REV=llvmorg-
+ git checkout -f llvmorg-
error: pathspec 'llvmorg-' did not match any file(s) known to git
+ git status
On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)
+ git rev-list --pretty --max-count=1 HEAD
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
+ build_exception
+ echo

+ echo 'How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild'
How to reproduce locally: https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild
+ echo
...

I can't speak for the sanitizer bot, but for llvm-clang-x86_64-gcc-ubuntu you should be able to reproduce some of the failures by running:

  • cmake ../llvm-project/llvm -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_BUILD_TYPE=Release -DCLANG_ENABLE_CLANGD=OFF -DLLVM_BUILD_RUNTIME=ON -DLLVM_BUILD_TESTS=ON -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_INCLUDE_EXAMPLES=OFF '-DLLVM_LIT_ARGS=--verbose -j48' -DLLVM_USE_LINKER=gold '-DLLVM_ENABLE_PROJECTS=compiler-rt;clang;lld;cross-project-tests;llvm;clang-tools-extra' -GNinja
  • ninja check-fuzzer

This change failed the https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/pr59747.c (-O2/-O3) on many platforms including x86-64.

I think it is easy to reproduce:

clang -O2 pr59747.c
./a.out

The 11 unexpected failures should be caused by this change.

Carrot added a comment.Jan 9 2023, 8:43 PM

@SixWeining @dyung , both reproduction work for me, thanks a lot!