This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Improvement the analytics through the dominating condition
ClosedPublic

Authored by Allen on Feb 17 2023, 2:11 AM.

Details

Summary

Address the dominating condition, the urem fold is benefit from the analytics improvements.
Fix https://github.com/llvm/llvm-project/issues/60546

NOTE: delete the calls in simplifyBinaryIntrinsic and foldICmpWithDominatingICmp is used to reduce compile time.

Diff Detail

Event Timeline

Allen created this revision.Feb 17 2023, 2:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 17 2023, 2:11 AM
Allen requested review of this revision.Feb 17 2023, 2:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 17 2023, 2:11 AM
Allen updated this revision to Diff 498295.Feb 17 2023, 2:23 AM
Allen edited the summary of this revision. (Show Details)

update upstream test case

erikdesjardins added inline comments.
llvm/lib/Analysis/InstructionSimplify.cpp
3918–3919

This can be:

if (Checked)
  return *Checked ? getTrue(ITy) : getFalse(ITy);

since it's also useful to know that the dominating condition is false.

I expect you'll hit this case if you add a test withj the dominating icmp changed to uge and swap the branch targets

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=45a0b812fa13ec255cae91f974540a4d805a8d79&to=787c3e2333e8d655a61ef013f7006d7675c03574&stat=instructions%3Au Some impact, but less than I expected, so I think we can do this.

This patch should also remove the isImpliedByDomCondition() calls in simplifySelectInst() and simplifyBinaryIntrinsic(), because they are subsumed by this one.

Patch title and description also need an update -- this really has very little to do with the urem fold in particular, it's a generic analysis change.

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=45a0b812fa13ec255cae91f974540a4d805a8d79&to=787c3e2333e8d655a61ef013f7006d7675c03574&stat=instructions%3Au Some impact, but less than I expected, so I think we can do this.

This patch should also remove the isImpliedByDomCondition() calls in simplifySelectInst() and simplifyBinaryIntrinsic(), because they are subsumed by this one.

  • I verified the test, If remove the isImpliedByDomCondition() calls in simplifySelectInst() , there are 5 failed cases.
LLVM :: CodeGen/Thumb2/mve-memtp-branch.ll
LLVM :: Transforms/AggressiveInstCombine/lower-table-based-cttz-basics.ll
LLVM :: Transforms/InstSimplify/select-implied.ll
LLVM :: Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll
LLVM :: Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll
  • then, If continue remove the call in simplifyBinaryIntrinsic(), there is a extra failed case.
LLVM :: Transforms/InstCombine/unrecognized_three-way-comparison.ll
  • some seems regression, and others improve, so it can't be removed simple, should take a more work to fix them base the vesion of remove 2 isImpliedByDomCondition() ?

llvm/lib/Analysis/InstructionSimplify.cpp
3918–3919
  • The following case tranform, but it in fact swap the branch before this transform, so it is same to urem_with_dominating_condition, I don't sure this case should be saved ?
define i8 @urem_with_dominating_condition_false(i8 %x, i8 %n) {
start:
  %cond = icmp uge i8 %x, %n
  br i1 %cond, label %.bb1, label %.bb0 ; Swap the branch targets
.bb0:
  %add = add i8 %x, 1
  %out = urem i8 %add, %n
  ret i8 %out
.bb1:
  ret i8 0
}

I see. It looks like this much is safe to do without causing any regressions: https://github.com/nikic/llvm-project/commit/37366dc6bc6d168f081510a603982802a6964221 The ones in simplifyBinaryIntrinsic() and foldICmpWithDominatingICmp() are completely redundant.

The one in simplifySelectInst() is not redundant, because it uses a different context instruction: We would miss cases where we have br i1 %c followed by select i1 %c, because the comparison is not in the same block as the select. This is unfortunate...

Allen abandoned this revision.Feb 21 2023, 1:09 AM

I see. It looks like this much is safe to do without causing any regressions: https://github.com/nikic/llvm-project/commit/37366dc6bc6d168f081510a603982802a6964221 The ones in simplifyBinaryIntrinsic() and foldICmpWithDominatingICmp() are completely redundant.

The one in simplifySelectInst() is not redundant, because it uses a different context instruction: We would miss cases where we have br i1 %c followed by select i1 %c, because the comparison is not in the same block as the select. This is unfortunate...

Thanks for the quick resolution

nikic added a comment.Feb 21 2023, 1:28 AM

Why was this abandoned? Just to be clear, I did not commit anything, just testing behavior.

Allen added a comment.Feb 21 2023, 2:18 AM

Why was this abandoned? Just to be clear, I did not commit anything, just testing behavior.

Sorry, I mistakenly thought this was already uploaded. I'll change it according to your suggestion ASAP :)

Allen updated this revision to Diff 499093.Feb 21 2023, 2:56 AM
Allen retitled this revision from [InstCombine] canonicalize urem as cmp+select (part 2) to [InstCombine] Improvement the analytics through the dominating condition.
Allen edited the summary of this revision. (Show Details)
Allen added a reviewer: erikdesjardins.

Address comments

  1. Add return false and its test case urem_with_dominating_condition_false
  2. Delete some calls in function simplifyBinaryIntrinsic and foldICmpWithDominatingICmp, and update its case affected
  3. update tile
Allen updated this revision to Diff 499359.EditedFeb 21 2023, 7:37 PM

Fix 4 AMDGPU backend failed cases

LLVM :: CodeGen/AMDGPU/branch-relaxation.ll
LLVM :: CodeGen/AMDGPU/collapse-endcf.ll
LLVM :: CodeGen/AMDGPU/infinite-loop.ll
LLVM :: CodeGen/AMDGPU/uniform-cfg.ll
nikic added a subscriber: arsenm.Feb 22 2023, 7:30 AM

This basically LGTM, but probably target maintainers need to check how those CodeGen tests need to be modified to preserve test intent @arsenm @dmgreen.

arsenm added inline comments.Feb 22 2023, 7:35 AM
llvm/test/CodeGen/AMDGPU/branch-relaxation.ll
510

Probably just need to add more arguments to use for conditions

llvm/test/CodeGen/AMDGPU/uniform-cfg.ll
409

Probably same for all of these

Allen added a comment.EditedFeb 22 2023, 8:04 PM

hi @arsenm, as the CHECK of these case are not updated with update_llc_test_checks.py, so it is a bit hard to update, do you mind I Precommit the CHECK regenerated with update_llc_test_checks.py for these 4 cases ?

Allen added a comment.EditedFeb 24 2023, 7:57 PM

I will precommit to update the expired check on D144771 if you are agree.

Allen updated this revision to Diff 501095.Feb 28 2023, 4:57 AM
  1. Rebase to update test after precommit
  2. modify some cases's condition to avoid being their BBs optimizing away through the domination
arsenm accepted this revision.Feb 28 2023, 5:15 AM
arsenm added inline comments.
llvm/test/CodeGen/AMDGPU/uniform-cfg.ll
846

lost a branch but might not matter here

This revision is now accepted and ready to land.Feb 28 2023, 5:15 AM
This revision was landed with ongoing or failed builds.Mar 1 2023, 1:03 AM
This revision was automatically updated to reflect the committed changes.
Allen marked an inline comment as done.
fhahn added a subscriber: fhahn.Jun 28 2023, 8:22 AM

Looks like this patch is causing a code-size regression: https://github.com/llvm/llvm-project/issues/63576

It would be great if you could take a look

Allen added a comment.Sep 3 2023, 6:45 PM

Looks like this patch is causing a code-size regression: https://github.com/llvm/llvm-project/issues/63576

It would be great if you could take a look

hi @fhahn, thanks for your reporting and sorry for delay reply. I add some comment on that issue, and it has some conflict logic after the __builtin_unreachable(), so it seems an invalid.