This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Remove AND in SETCC if we can prove they are unneeded
AbandonedPublic

Authored by dmgreen on Mar 2 2018, 12:53 PM.

Details

Summary

This is a possible solution for the remainder of PR35875. The idea is that if
we have a SETCC of two ANDs with the same mask, and we can calculate
that the non-mask bits of the operands of the AND are the same, we can
drop the ANDs from the SETCC.
i.e (x&C < y&C) => x<y for any s/u </>/=/etc.

Here is an attempt to prove such a thing at the IR level:
https://rise4fun.com/Alive/ha2

So that seems to be fine. Except that the code in question, the ANDs come
from type legalising a SETCC node, with values that come from anyext loads.
These don't provide upper known bits, even though they are likely to become
zext loads.

Hence all this plumbing to find and return anyext loads, treating them as
zeroext loads during computeKnownBits and adjusting the loads to zeroext
if that is useful for this case.

The left hand side of these tests are not yet checked in. As you can hopefully
see, it helps out quite a bit here. If there are better ways to solve this
problem, I'm all ears.

Diff Detail

Event Timeline

dmgreen created this revision.Mar 2 2018, 12:53 PM
dmgreen edited the summary of this revision. (Show Details)Mar 2 2018, 12:53 PM
nhaehnle removed a subscriber: nhaehnle.Mar 5 2018, 2:54 AM
samparker added inline comments.Mar 6 2018, 1:26 AM
include/llvm/CodeGen/SelectionDAG.h
1385

If you reorder the default parameters, you could reduce the number of changes needed in the backends.

lib/CodeGen/SelectionDAG/TargetLowering.cpp
2682

Is this safe? Shouldn't the load be checked for a single use or that all the uses on your known bits path?

FWIW - I haven't looked closely at this patch (it might be good independently of PR35875), but I think we can optimize the motivating case even more than what's shown here by reducing the IR:
https://bugs.llvm.org/show_bug.cgi?id=35875#c5

I was asked not to add DemandedElts to individual cases in computeKnownBits until you have test coverage - this should probably be the same for AnyToZeroExtLoads. That will be easier if you move the AnyToZeroExtLoads arg to the end to make use of the default nullptr.

Random idea - would a more general approach be for us to compute undefined bits as well as one/zero bits - either adding them to the KnownBits struct or add it separately?

Thanks all for the reviews. I'm currently hoping that the transform Sanjay suggested will take away my main motivation for this patch. Here are some responses in any case:

I was asked not to add DemandedElts to individual cases in computeKnownBits until you have test coverage - this should probably be the same for AnyToZeroExtLoads.

OK. Noted. By this you mean unit tests for computeKnownBits? Sounds like a sensible idea. I'm guessing the nature of ISel makes this difficult?

Random idea - would a more general approach be for us to compute undefined bits as well as one/zero bits - either adding them to the KnownBits struct or add it separately?

In this specific case (SETCC(AND(%x, C) , AND(%y, C)) where we know that the ~C bits of %x == %y, then the AND can be removed) - we need to know the bits are a specific value. We can't have the two sides being undef and then choosing to differently zero/signext for example. In the case I was looking, x and y were both and(xor(8bit load, -1), 255), so the top bits are all 1 on both sides, so long as we choose to zeroext. I think we need to know the bits, and know which nodes to change to force those bits to be specified. It might well be a good idea for other situations.

include/llvm/CodeGen/SelectionDAG.h
1385

Yeah, makes sense. That's probably best. I was trying to keep depth as the last argument for some reason.

lib/CodeGen/SelectionDAG/TargetLowering.cpp
2682

I don't know a huge amount about the internals of selection dag. My understanding was that if they are anyext then we cannot presume the values of the bits, but then neither can anything else. So long as we set the values of the bits (as we do here), then all other optimisations have to then use the bits as 0.

If I could have just changed the existing LD's to zeroextends, that would be simpler. But I believe the nodes are immutable.

So I don't think we have to check for single uses as the only bits we are changing are the upper bits from unspecified to 0's. Which isn't something that anything else can rely upon, without itself doing the unspecified -> set transform.

I was asked not to add DemandedElts to individual cases in computeKnownBits until you have test coverage - this should probably be the same for AnyToZeroExtLoads.

OK. Noted. By this you mean unit tests for computeKnownBits? Sounds like a sensible idea. I'm guessing the nature of ISel makes this difficult?

Most of my tests for these cases come from combines that make use of SelectionDAG::SignBitIsZero which uses computeKnownBits - converting uitofp to sitofp etc. in x86 combine-*.ll tests have some good examples.

@dmgreen Do you still need this patch or should it be abandoned?

dmgreen abandoned this revision.Sep 30 2018, 2:18 AM

Yep sure, with 52177, I no longer have a motivating case for this.