This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG][WIP] Add support for evaluating SetCC based on knownbits
AbandonedPublic

Authored by goldstein.w.n on Apr 27 2023, 2:29 PM.

Details

Summary

In some cases folds done through ISel/DAGCombining result in SetCC
conditions that we can easily simplify using known bits.

This patch adds support for that.

NB: The medium-term goal is to get this patch in, then use comparisons
against zero for testing future improvements to isKnownNeverZero as
there are issues using cttz/ctlz for that.

Diff Detail

Event Timeline

goldstein.w.n created this revision.Apr 27 2023, 2:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2023, 2:29 PM
goldstein.w.n requested review of this revision.Apr 27 2023, 2:29 PM
goldstein.w.n added inline comments.Apr 27 2023, 2:39 PM
llvm/test/CodeGen/X86/avx512-mask-op.ll
615

Why do we emit any code here?

llvm/test/CodeGen/X86/fold-rmw-ops.ll
1359

These movb $1, %al; testb %al, %al's (here and in many other cases) are unnecessary. I assume its because SelectionDAG only has BB view, so even if we can rule out some BBs (based on known true/false br-cond), there is no pass for that. Is there anything we can/should do about that?

Also NB, we really should never emit movb $1, %al; testb %al, %al just grabbing any gpr (I guess least recently used to minimize potential latency) and do cmpb %gpr8, gpr8 then jne/je depending if we want it to be always true/false.

Fix some typos

craig.topper added inline comments.Apr 27 2023, 10:13 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
2472

I would not do this in FoldSetCC. FoldSetCC is called from getNode and I don't think we want to call to computeKnownBits hidden in that.

You you can do it in SimplifySetCC in TargetLowering.cpp I think.

craig.topper added inline comments.Apr 27 2023, 10:15 PM
llvm/test/CodeGen/X86/fold-rmw-ops.ll
1359

Hopefully most of these are just tests that should have been folded by InstCombine or other passes earlier and not really cases that originate in SelectionDAG.

craig.topper added inline comments.Apr 27 2023, 10:18 PM
llvm/test/CodeGen/X86/avx512-mask-op.ll
615

I'm guesting we started with a conditional branch and it got optimized out?

goldstein.w.n added inline comments.Apr 27 2023, 11:13 PM
llvm/test/CodeGen/X86/fold-rmw-ops.ll
1359

Do you think these regressions are something to worry about? Or acceptable as cases we would never expect to get from the middle-end.

This is very similar to a patch I tried a while ago: D86578 - one of the blockers for which was all the over-reduced test cases that it broke

This is very similar to a patch I tried a while ago: D86578 - one of the blockers for which was all the over-reduced test cases that it broke

What do you mean over-reduces?
But nikic has a point that realistic backend code shouldn't be affected by this much.
I think there are some cases that benefit like cttz/ctlz cases if zero is no poison AND target doesn't have tzcnt/lzcnt
but thats niche.

It may also be able to help with intermediate select statement introduced during lowering that are
actually constant foldable.

Maybe a different way to test isKnownNeverZero is better?

RKSimon added inline comments.Apr 29 2023, 10:09 AM
llvm/test/CodeGen/AArch64/cmp-const-max.ll
1

regenerate + commit this file and rebase to show the diffs from the patch

llvm/test/CodeGen/ARM/sub-cmp-peephole.ll
1

regenerate this file first so you show any diffs

llvm/test/CodeGen/X86/fold-rmw-ops.ll
1372

comparing against zero in all these or-with-imm tests just seems to be a copy+paste from the other logic ops in this file - maybe change it to something that isn't constant foldable (test for -ve?)

goldstein.w.n marked 3 inline comments as done.

Rebase + move impl to simplifysetcc

goldstein.w.n added inline comments.
llvm/test/CodeGen/X86/fold-rmw-ops.ll
1372

re: 'test for -ve?' hmm?

But would generally prefer to add new tests than modify existing.

RKSimon added inline comments.May 17 2023, 6:41 AM
llvm/test/CodeGen/X86/fold-rmw-ops.ll
1372

'test for -ve' === 'test for negative'

Adding additional tests would be fine.

foad added inline comments.May 18 2023, 1:50 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
4288–4289 ↗(On Diff #523267)

Just curious: does "strengthening" conditions like this actually generate better code? Is it something we do elsewhere in the compiler?

RKSimon added inline comments.May 18 2023, 2:37 AM
llvm/test/CodeGen/X86/avx512-mask-op.ll
4708

Whats going on here?

RKSimon added inline comments.May 18 2023, 9:56 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
4302 ↗(On Diff #523267)

I'm not sure this is always false?

goldstein.w.n added inline comments.May 18 2023, 9:58 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
4288–4289 ↗(On Diff #523267)

No worse in fact. This was really motivated by wanting to improve the knownbits analysis in selectiondag but having no good way to test it.

I think at the moment, since we don't have the infrastructure to fold basic blocks when icmp; br conditions are known true/false we end up with regressions.

Not sure exactly where this is going to go unless we add that infrastructure.

llvm/test/CodeGen/X86/avx512-mask-op.ll
4708

Its what you thought was a bad merge in the knownbits impl.

if (computeKnownBits(Op.getOperand(0)).One[0], Depth + 1)
  return true;

See the bug?

Should be:

if (computeKnownBits(Op.getOperand(0), Depth + 1).One[0])
  return true;

Suprised the former is accepted as an expression (not a clang warning/error). Its

if (expr_A, expr_B)

llvm/test/CodeGen/X86/fold-rmw-ops.ll
1372

As in add a new test file? This isn't a new file for the series, its just affected by the change.

Rebase (fixed bug in KnownBitsIsNeverZero)

RKSimon requested changes to this revision.May 18 2023, 10:21 AM

There's still a lot of tests in here that need adjusting so that they still test what we want them to test

llvm/test/CodeGen/X86/fold-rmw-ops.ll
1372

I'd prefer that these tests were adjusted, the icmp_eq vs 0 was just a dumb copy + paste - but if you don't want to do that, duplicating these OR tests immediately below with a icmp_sgt 0 would be OK

llvm/test/CodeGen/X86/hoist-and-by-const-from-shl-in-eqcmp-zero.ll
801

this needs adjusting

llvm/test/CodeGen/X86/or-with-overflow.ll
6

All these tests need adjusting

llvm/test/CodeGen/X86/pr16031.ll
9

not sure what to do with this test - either we try to fix it so it still matches what the original bug was about, or we delete it

llvm/test/CodeGen/X86/shrink-compare.ll
128

This is no longer a shrink-compare test

This revision now requires changes to proceed.May 18 2023, 10:21 AM
foad added inline comments.May 19 2023, 1:40 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
4288–4289 ↗(On Diff #523267)

You're adding code to the compiler to change these setcc conditions, but not actually make anything better? I do not think that is a good idea.

4302 ↗(On Diff #523267)

It's confusing but I think it's correct. From this point onwards, Res represents whether LHS and RHS are equal, irrespective of Cond.

goldstein.w.n added inline comments.May 19 2023, 8:16 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
4288–4289 ↗(On Diff #523267)

So worse might be a bit of a exageration. It's worse for some IR that the middle-end would never pass to us.

I was thinking maybe we could do this for SETCC not used by br. That case we see improvement. What I need to do I think though, do a little work seeing if any of the cases this changes are cases generated by the backend (i.e cases that wouldn't normally be cleaned up by the middle-end).

But I agree, in the current state its not submitable.

4302 ↗(On Diff #523267)

I'll make it a new variable.

llvm/test/CodeGen/X86/fold-rmw-ops.ll
1372

Ah, okay. Will do.

goldstein.w.n abandoned this revision.Jul 9 2023, 4:43 PM

Abandoning this revision. Still doesn't seem feasible to add this in the backend.