Address the dominating condition, the urem fold is benefit from the analytics improvements.
Fix https://github.com/llvm/llvm-project/issues/60546
Details
Diff Detail
Event Timeline
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
3918–3919 | This can be: if (Checked) return *Checked ? getTrue(ITy) : getFalse(ITy); since it's also useful to know that the dominating condition is false. I expect you'll hit this case if you add a test withj the dominating icmp changed to uge and swap the branch targets |
Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=45a0b812fa13ec255cae91f974540a4d805a8d79&to=787c3e2333e8d655a61ef013f7006d7675c03574&stat=instructions%3Au Some impact, but less than I expected, so I think we can do this.
This patch should also remove the isImpliedByDomCondition() calls in simplifySelectInst() and simplifyBinaryIntrinsic(), because they are subsumed by this one.
Patch title and description also need an update -- this really has very little to do with the urem fold in particular, it's a generic analysis change.
...and that's because I used the wrong link. The correct data is http://llvm-compile-time-tracker.com/compare.php?from=38b1a17c519d3873f413156c302cb7ed0d39ee3d&to=6cd96f83542c16374ea0e66e3b5840abeb5d0b8d&stat=instructions:u.
- I verified the test, If remove the isImpliedByDomCondition() calls in simplifySelectInst() , there are 5 failed cases.
LLVM :: CodeGen/Thumb2/mve-memtp-branch.ll LLVM :: Transforms/AggressiveInstCombine/lower-table-based-cttz-basics.ll LLVM :: Transforms/InstSimplify/select-implied.ll LLVM :: Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll LLVM :: Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll
- then, If continue remove the call in simplifyBinaryIntrinsic(), there is a extra failed case.
LLVM :: Transforms/InstCombine/unrecognized_three-way-comparison.ll
- some seems regression, and others improve, so it can't be removed simple, should take a more work to fix them base the vesion of remove 2 isImpliedByDomCondition() ?
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
3918–3919 |
define i8 @urem_with_dominating_condition_false(i8 %x, i8 %n) { start: %cond = icmp uge i8 %x, %n br i1 %cond, label %.bb1, label %.bb0 ; Swap the branch targets .bb0: %add = add i8 %x, 1 %out = urem i8 %add, %n ret i8 %out .bb1: ret i8 0 } |
I see. It looks like this much is safe to do without causing any regressions: https://github.com/nikic/llvm-project/commit/37366dc6bc6d168f081510a603982802a6964221 The ones in simplifyBinaryIntrinsic() and foldICmpWithDominatingICmp() are completely redundant.
The one in simplifySelectInst() is not redundant, because it uses a different context instruction: We would miss cases where we have br i1 %c followed by select i1 %c, because the comparison is not in the same block as the select. This is unfortunate...
Why was this abandoned? Just to be clear, I did not commit anything, just testing behavior.
Sorry, I mistakenly thought this was already uploaded. I'll change it according to your suggestion ASAP :)
Address comments
- Add return false and its test case urem_with_dominating_condition_false
- Delete some calls in function simplifyBinaryIntrinsic and foldICmpWithDominatingICmp, and update its case affected
- update tile
Fix 4 AMDGPU backend failed cases
LLVM :: CodeGen/AMDGPU/branch-relaxation.ll LLVM :: CodeGen/AMDGPU/collapse-endcf.ll LLVM :: CodeGen/AMDGPU/infinite-loop.ll LLVM :: CodeGen/AMDGPU/uniform-cfg.ll
hi @arsenm, as the CHECK of these case are not updated with update_llc_test_checks.py, so it is a bit hard to update, do you mind I Precommit the CHECK regenerated with update_llc_test_checks.py for these 4 cases ?
- Rebase to update test after precommit
- modify some cases's condition to avoid being their BBs optimizing away through the domination
llvm/test/CodeGen/AMDGPU/uniform-cfg.ll | ||
---|---|---|
846 ↗ | (On Diff #501095) | lost a branch but might not matter here |
Looks like this patch is causing a code-size regression: https://github.com/llvm/llvm-project/issues/63576
It would be great if you could take a look
hi @fhahn, thanks for your reporting and sorry for delay reply. I add some comment on that issue, and it has some conflict logic after the __builtin_unreachable(), so it seems an invalid.
This can be:
since it's also useful to know that the dominating condition is false.
I expect you'll hit this case if you add a test withj the dominating icmp changed to uge and swap the branch targets