This is an archive of the discontinued LLVM Phabricator instance.

[ControlDependentDCE] Add Control Dependent DCE pass
Needs ReviewPublic

Authored by masakiarai on Sep 25 2020, 4:59 AM.

Details

Summary

Add Control Dependent DCE pass.

Add Control Dependent DCE pass.
This pass performs 'control dependent DCE', which converts a conditional
branch of BB into an unconditional branch if possible, when the basic block
BBa dominates BB and both basic blocks have conditional branches under the
same condition.

BBa:

%cmp = ...
br i1 %cmp, label %LA1, label %LA2

...

BB:

...
br i1 %cmp, label %LD1, label %LD2

The condition to apply this optimization is that the predecessors of BB is
divided into the set S1 coming from LA1 and the set S2 coming from LA2.
If S1(or S2) is empty, it rewrites the conditional branch to an
unconditional branch to LD2(or LD1).
If both S1 and S2 are not empty and BB can be duplicated, BB is duplicated
for S2, and each conditional branches in BB and the duplicated BB are
converted into unconditional branches for LD1 and LD2, respectively.

NOTE: This patch is a reconstruction of the following patch.
[Jump Threading] Convert conditional branches into unconditional branches using GVN results
https://reviews.llvm.org/D57953

This patch duplicates and holds the variables and functions related to TryThreadEdge in JumpThreading pass.
Therefore, it is necessary to create a patch that makes these duplicates into a library before committing this patch.

Diff Detail

Event Timeline

masakiarai created this revision.Sep 25 2020, 4:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2020, 4:59 AM
masakiarai requested review of this revision.Sep 25 2020, 4:59 AM
xbolva00 added a subscriber: xbolva00.

I corrected the description in alphabetical order in InitializePasses.h.

Test duplicate-conditional.ll succeeds in my local build.
I will investigate this as it may be a severe bug.

I guess I missed the initial conversation on this, apologies for that.
Can you explain in more detail why this is as standalone pass.
DCE, in general, is something I would like us to do more aggressively in the IPO setting so I'm a little worried to introduce more function passes.
Also, IPSCCP seems to perform well on one test cases already https://godbolt.org/z/sM9zr8 , maybe the second one is not far out of reach (@fhahn?)
We also have other places where we do this kind of reasoning and that could do the job, hopefully in a more holistic manner.

Considering only the DCE's feature, I think the ADCE pass is close to this pass's purpose.
The big difference between this pass and others is that this pass duplicates the basic block to isolate the control flow.
This feature can promote loop optimizations by isolating the control flow that does not enter loops out of the kernel part when there is a sequence of multiple loops in numerical applications.
However, this first version duplicates basic blocks only in the conservative case where the performance is definitely improved.
This pass needs to duplicate basic blocks more aggressively and selectively to facilitate loop optimizations.

Considering only the DCE's feature, I think the ADCE pass is close to this pass's purpose.
The big difference between this pass and others is that this pass duplicates the basic block to isolate the control flow.
This feature can promote loop optimizations by isolating the control flow that does not enter loops out of the kernel part when there is a sequence of multiple loops in numerical applications.
However, this first version duplicates basic blocks only in the conservative case where the performance is definitely improved.
This pass needs to duplicate basic blocks more aggressively and selectively to facilitate loop optimizations.

OK. This does not get across from reading the tests. I usually use the update scripts to generate check lines but whatever you prefer should include enough check lines and comments to that this becomes obvious.

That said, I still am not convinced this is best implemented in a new pass. Isn't the idea here that we have the value of a branch condition available for a subset of the incoming paths? That sounds like it would compose well with other ways through which we can know the value of a conditional for a subset of the incoming paths w/o having multiple branches with the same condition:

if (a) 
  assume(cond == 0);
if (cond) {
...
} else {
  ...
}

That said, I still am not convinced this is best implemented in a new pass. Isn't the idea here that we have the value of a branch condition available for a subset of the incoming paths? That sounds like it would compose well with other ways through which we can know the value of a conditional for a subset of the incoming paths w/o having multiple branches with the same condition:

I'm sorry that the explanation of the purpose of this pass is not clear.

By classifying the values that reach a basic block's entry, you can generally apply optimization by duplicating the basic block and deleting the conditional branch instruction.
If you use the variables referenced by the comparison instruction instead of the condition code, you will have more optimization opportunities than in this version of the implementation.
I think GVN should implement such optimizations.

The following cases may be beyond the functionality of GVN-related passes.

  Ba
 / \
X   Y
 \ /
  R
  |\
  | \
  B
 / \
T   F

In this case, by duplicating R, there is an opportunity to delete B's conditional branch instruction.
R may not be a single basic block, but a code region containing loops.
If it's better to create a version that includes this implementation before proposing, I'll update the patch with code that includes it.

I released this revision because I confirmed that this had been applied to test-suite benchmarks in the case of hundreds or more.

Test duplicate-conditional.ll succeeds in my local build.
I will investigate this as it may be a severe bug.

I fixed this bug and updated the patch.

The cause of the bug was referencing an uninitialized variable, which was in the code copied from JumpThreading pass did not adequately incorporate its update.

dexonsmith resigned from this revision.Oct 6 2020, 3:18 PM