Page MenuHomePhabricator

[Pipelines] Introduce DAE after ArgumentPromotion
Needs ReviewPublic

Authored by psamolysov on Jun 29 2022, 8:25 AM.

Details

Summary

The ArgumentPromotion pass uses Mem2Reg promotion at the end to cutting
down generated alloca instructions as well as meaningless stores and
this behavior can leave unused (dead) arguments. To eliminate the dead
arguments and therefore let the DeadCodeElimination remove becoming dead
inserted GEPs as well as loads and casts in the callers, the
DeadArgumentElimination pass should be run after the ArgumentPromotion
one.

Diff Detail

Unit TestsFailed

TimeTest
60,060 msx64 debian > ThreadSanitizer-x86_64.ThreadSanitizer-x86_64::restore_stack.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=thread -Wall -m64 -msse4.2 -gline-tables-only -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/../ -std=c++11 -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/../ -O1 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/restore_stack.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/tsan/X86_64Config/Output/restore_stack.cpp.tmp && not /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/tsan/X86_64Config/Output/restore_stack.cpp.tmp 2>&1 | FileCheck /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/restore_stack.cpp

Event Timeline

psamolysov created this revision.Jun 29 2022, 8:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 8:25 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
psamolysov requested review of this revision.Jun 29 2022, 8:25 AM
nikic added a comment.Jun 30 2022, 1:08 AM

There should be a PhaseOrdering test that shows that the argument is not removed in the current opimization pipeline.

llvm/lib/Passes/PassBuilderPipelines.cpp
775

I don't think we need a late module pass for this, it can be part of the normal module optimization pipeline (post module simplification).

There should be a PhaseOrdering test that shows that the argument is not removed in the current opimization pipeline.

Thank you for the suggestion. I've added the test (3b7650da725) and rebased the current patch onto main to modify the test here. Also, I've removed the example from the patch's description because it has actually been moved into the test.

llvm/lib/Passes/PassBuilderPipelines.cpp
775

I'm afraid I didn't get your comment. What I tried is to add DAE with MIWP.addModulePass but such module passes run before call graph passes.

psamolysov edited the summary of this revision. (Show Details)

Add a PhaseOrdering test to show how the patch affects dead argument elimination after argument promotion.

psamolysov planned changes to this revision.EditedJun 30 2022, 4:54 AM

A number of tests failed, I'm going to fix them ASAP.

Actualize the new PM tests.

mtrofin added inline comments.Jun 30 2022, 7:50 AM
llvm/lib/Passes/PassBuilderPipelines.cpp
731–733

are these changes due to running clang-format (i.e. the previous change didn't?)

1611

same clang-format q

1689

spurious change?

1712

was this clang-format?

fhahn added inline comments.Jun 30 2022, 7:52 AM
llvm/lib/Passes/PassBuilderPipelines.cpp
731–733
fhahn added a comment.Jun 30 2022, 7:53 AM

Do we need to retain the run of DeadArgumentEliminationPass in the original position or is a single run at the new position sufficient?

psamolysov added inline comments.Jun 30 2022, 7:57 AM
llvm/lib/Passes/PassBuilderPipelines.cpp
731–733

@mtrofin Yes, this change is the result of a run of clang-format as well as the previous one.

@fhahn Thank you for the suggestion.

1611

Should I revert the changes introduced by clang-format?

1689

This change is introduced by clang-format. Should there be two empty lines between MainFPM.addPass(MergedLoadStoreMotionPass()); and if (EnableConstraintElimination)?

1712

Yes, it was.

psamolysov added inline comments.Jun 30 2022, 8:03 AM
llvm/lib/Passes/PassBuilderPipelines.cpp
1611

If these formatting changes make sense, I can introduce them in a separate patch. Should I?

mtrofin added inline comments.Jun 30 2022, 8:05 AM
llvm/lib/Passes/PassBuilderPipelines.cpp
1611

They are fine here, I was just checking. Thanks!

Do we need to retain the run of DeadArgumentEliminationPass in the original position or is a single run at the new position sufficient?

Good point! I tried and also removed the guard that the DAE pass should run with O3 only (currently it run exactly as the pass in the original position did: with any O > 0). I have a chance to run the LLVM :: Transforms tests only, one test failed - LLVM :: Transforms\InstCombine\unused-nonnull.ll:

error: CHECK-SAME: expected string not found in input
; CHECK-SAME: (i32 [[ARGC:%.*]], i8** nocapture readnone [[ARGV:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
              ^
<stdin>:7:17: note: scanning from here
define i32 @main(i32 %argc, i8** nocapture readonly %argv) local_unnamed_addr #0 {

The difference is that the %argv argument has the readonly attribute, not readnone. I'm not sure whether this makes much sense (I guess this could because readnone can theoretically open a door for some optimizations for which readonly cannot).

I've submitted this change to let the buildbot run all the tests and see the difference.

llvm/lib/Passes/PassBuilderPipelines.cpp
775

Oh, I think I have got the idea: instead of adding the DAE with MIWP.addLateModulePass() just after adding the ArgumentPromotion in the buildInlinerPipeline, to add DAE after this line with usual addPass:

MPM.addPass(buildInlinerPipeline(Level, Phase));

This makes sense when we removing the original point of adding the DAE pass, because the buildInlinerPipeline() function is called not in every case. Actually, we just moving DAE after inlining.

Try to remove the DAE from the original point. Also, apply the suggestion from @nikic - make the DAE a part of the normal module optimization pipeline (post module simplification).

Fix Clang :: CodeGen/thinlto-distributed-newpm.ll

Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2022, 11:13 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

moving DAE after the function simplification pipeline makes sense

Do we need to retain the run of DeadArgumentEliminationPass in the original position or is a single run at the new position sufficient?

Good point! I tried and also removed the guard that the DAE pass should run with O3 only (currently it run exactly as the pass in the original position did: with any O > 0). I have a chance to run the LLVM :: Transforms tests only, one test failed - LLVM :: Transforms\InstCombine\unused-nonnull.ll:

error: CHECK-SAME: expected string not found in input
; CHECK-SAME: (i32 [[ARGC:%.*]], i8** nocapture readnone [[ARGV:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
              ^
<stdin>:7:17: note: scanning from here
define i32 @main(i32 %argc, i8** nocapture readonly %argv) local_unnamed_addr #0 {

The difference is that the %argv argument has the readonly attribute, not readnone. I'm not sure whether this makes much sense (I guess this could because readnone can theoretically open a door for some optimizations for which readonly cannot).

this would be fixed by running PostOrderFunctionAttrsPass after the function simplification pipeline which is something I've wanted to do but never got around to testing. we should be inferring more precise attributes after fully simplifying functions. the only case that might regress is recursive functions since when visiting the recursive calls we haven't computed attributes for the recursive functions yet

this test is regressing with this patch because previously DAE would replace passed arguments with poison if the argument isn't used in the function, removing the use of %argv in @main and func-attrs would mark %argv as readnone, but with this patch func-attrs runs on @main before the use of %argv is eliminated. the call to @compute is eliminated in all cases in the function simplification pipeline due to inferring the returned attribute on %x, which is why running func-attrs after the function simplification pipeline fixes this issue

@@ -759,9 +758,6 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
   if (AttributorRun & AttributorRunOption::CGSCC)
     MainCGPipeline.addPass(AttributorCGSCCPass());
 
-  // Now deduce any function attributes based in the current code.
-  MainCGPipeline.addPass(PostOrderFunctionAttrsPass());
-
   // When at O3 add argument promotion to the pass pipeline.
   // FIXME: It isn't at all clear why this should be limited to O3.
   if (Level == OptimizationLevel::O3)
@@ -781,6 +777,9 @@ PassBuilder::buildInlinerPipeline(OptimizationLevel Level,
       buildFunctionSimplificationPipeline(Level, Phase),
       PTO.EagerlyInvalidateAnalyses, EnableNoRerunSimplificationPipeline));
 
+  // Now deduce any function attributes based in the current code.
+  MainCGPipeline.addPass(PostOrderFunctionAttrsPass());
+
   MainCGPipeline.addPass(CoroSplitPass(Level != OptimizationLevel::O0));

@aeubanks Thank you for the great explanation.

I've applied your suggestion and re-uploaded the patch. The test unused-nonnull.ll has been fixed with keeping the readnone attribute.

[Pipelines] Fix the Clang :: CodeGenCoroutines/coro-elide.cpp

Now, the %_Z5task1v.Frame type doesn't contain a field of the
%_Z5task0v.Frame one.

I'd tried moving func-attrs after the function simplification pipeline and perf-wise on internal benchmarks it looks fine, but it causes some compile time regressions: https://llvm-compile-time-tracker.com/compare.php?from=d3dd6e57fe84e90cadcdc78fa71d632f6573f156&to=592b8acec55f577713a5e1fb610a36c8742c682b&stat=instructions
My guess is something to do with caching/invalidating analyses but I'm not 100% sure

(and we should definitely separate that out into its own patch first)

@aeubanks Hmm, if I correctly get your comment, I should revert this patch to the state before the proposed solution with moving the PostOrderFunctionAttrsPass at the end of the buildInlinerPipeline function regardless of the readonly instead of readnone regression. Personally along with your concern about compilation time, I have a concern about some changing in coroutines compilation, the Clang :: CodeGenCoroutines/coro-elide.cpp test demonstrates them:

// CHECK-NOT: %_Z5task1v.Frame = type {{.*}}%_Z5task0v.Frame

instead of

// CHECK: %_Z5task1v.Frame = type {{.*}}%_Z5task0v.Frame

I've reverted the latest changes because they require more investigation and, as you said, should be introduced in a separate patch.

Return the PostOrderFunctionAttrsPass pass back on its original place in the pipeline.

I think even with the readnone -> readonly change this patch should be fine, but let me run some internal benchmarks on this patch
the func-attrs change can come later