This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Remove identical successors in switch instructions in simple cases.
AbandonedPublic

Authored by DianQK on Jul 16 2023, 1:36 AM.

Details

Summary

Resolve Inefficient codegen for eq of enum with identical variants #113506.

I implemented identical successors check for a simple case, expecting it to cover some common patterns.
I consider this to be a series of patches, balancing optimization improvements while considering compile time.

Diff Detail

Event Timeline

DianQK created this revision.Jul 16 2023, 1:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 16 2023, 1:36 AM
DianQK requested review of this revision.Jul 16 2023, 1:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 16 2023, 1:36 AM
DianQK updated this revision to Diff 540768.Jul 16 2023, 1:41 AM
This comment was removed by DianQK.
DianQK edited the summary of this revision. (Show Details)Jul 16 2023, 1:48 AM
DianQK added reviewers: nikic, StephenFan, xbolva00.
xbolva00 added inline comments.Jul 16 2023, 2:12 AM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
6810

return false when BB´s sizes are not same?

nikic added a comment.Jul 16 2023, 2:53 AM

I don't think this is the right way to approach the problem. We should instead extend the existing sinking transform to support sinking instructions from only a subset of predecessors (into a new intermediate block).

DianQK updated this revision to Diff 540772.Jul 16 2023, 3:01 AM

Update simplifySwitchIdenticalSucc.
Fix some failing test cases.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
6810

We don't need to consider the same instances of BB, as this is already the expected result. If we still replace the same instances of BB, the update for DTU will become complicated.
But this logic is actually contrary to the semantics of this function. Let me revise it.

I don't think this is the right way to approach the problem. We should instead extend the existing sinking transform to support sinking instructions from only a subset of predecessors (into a new intermediate block).

Oh, thank you. I will give it a try.

DianQK abandoned this revision.Jul 16 2023, 7:24 AM

I don't think this is the right way to approach the problem. We should instead extend the existing sinking transform to support sinking instructions from only a subset of predecessors (into a new intermediate block).

Yes, we should extend the sinking common instructions. I noticed that the current sinking operation has already performed the related transformations (a simple subset of predecessors), but it seems to trigger some optimization conflicts.
I made a simple strategy adjustment at D155404.

I will try to address this issue during my next spare time.

The simple subset of predecessors is:

define i64 @compare(i64 %a, i64 %b, i64 %c) {
start:
  switch i64 %c, label %bb0 [
  i64 1, label %bb1
  i64 2, label %bb2
  ]

bb0:                                              ; preds = %start
  %0 = add i64 %a, %b
  br label %exit

bb1:                                              ; preds = %start
  %1 = add i64 %a, %b
  br label %exit

bb2:                                              ; preds = %start
  %2 = add i64 %a, %b
  br label %exit

exit:                                             ; preds = %bb3, %bb2, %bb1, %bb0
  %result = phi i64 [ %0, %bb0 ], [ %1, %bb1 ], [ %2, %bb2 ]
  ret i64 %result
}

There is another more complex case that can be considered in the future:

define i64 @compare2(i64 %a, i64 %b, i64 %c) {
start:
  switch i64 %c, label %bb0 [
  i64 1, label %bb1
  i64 2, label %bb2
  ]

bb0:                                              ; preds = %start
  %0 = add i64 %a, %b
  br label %exit

bb1:                                              ; preds = %start
  %1 = add i64 %a, %b
  br label %exit

bb2:                                              ; preds = %start
  %2 = sub i64 %a, %b                ; here is `sub`
  br label %exit

exit:                                             ; preds = %bb3, %bb2, %bb1, %bb0
  %result = phi i64 [ %0, %bb0 ], [ %1, %bb1 ], [ %2, %bb2 ]
  ret i64 %result
}
DianQK reclaimed this revision.Jul 17 2023, 6:46 PM

I reopened this revision to present my new idea and findings. Maybe such as Discord, discourse, or GitHub issues would be a preferable discussion?

I found that the sink common instructions are not often helpful for performance, see https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L2255-L2272.
But I found I could take this problem and turn it into two optimizations.

  1. Eliminating unreachable basic blocks. But here are specific patches, such as D6697 and D6471, that prevent this behavior. I'll look into that later.
  2. Without unreachable basic blocks, it should be possible to extend the hoist common instructions.
DianQK abandoned this revision.Jul 19 2023, 7:34 AM

D155711 should be a better option.