This allows the xor to be removed completely.
This might help with recomitting r341674, but seems good regardless.
Differential D51964
[InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible. craig.topper on Sep 11 2018, 11:30 PM. Authored by
Details This allows the xor to be removed completely. This might help with recomitting r341674, but seems good regardless.
Diff Detail
Event Timeline
Comment Actions Can you split off the inf-loop fix? That solves PR38915, right? Comment Actions I don’t know if it fixes that PR. This transform causes an infinite loop on the tests where both max/min operands are invertible but not constants. Test50 and test51 Comment Actions This whole patch does fix the infinite loop from PR38915, but it requires the whole patch and not just the change in InstCombineSelect.cpp Comment Actions The whole patch fixes our internal regression as well (PR38915 was the bug point reduced version of the regression). Thanks for the fix! Comment Actions I looked a bit closer at what's going on in PR38915. This patch fixes that particular case, but I'm worried that we still have the same bug lurking if we commit this without solving the underlying problem. I think these combines are fighting because we don't have a way to accurately assess 'one-use' in terms of another min/max within IsFreeToInvert(). This could be the best argument yet for integer min/max intrinsics because we're going to have to keep special-casing / adding those !X->hasNUsesOrMore(3) hacks to account for min/max transforms...or maybe this + whatever else is needed to recommit rL341674 is enough? The problem might also be solved by adjusting the min/max bailout that we have within visitICmp. We don't optimize cmps that are part of min/max because we're scared to break a min/max. There's another underlying problem that I'd like to look that I've been putting off: we have a bunch of min/max of min/max transforms in instcombine, but they really belong in instsimplify because we're just deleting ops. It's not a complete solution, but if we fix that, we're less likely to have this problem in instcombine because some patterns would be killed before we could get into trouble. Comment Actions I don't know how to solve the infinite loop bug completely yet (if anyone else does, please suggest), and given that this solves the existing bug while adding an optimization, I'll say LGTM. But please do:
define i32 @PR38915(i32 %x, i32 %y, i32 %z) { %xn = sub i32 0, %x %yn = sub i32 0, %y %c1 = icmp sgt i32 %xn, %yn %m1 = select i1 %c1, i32 %xn, i32 %yn %m1n = xor i32 %m1, -1 %c2 = icmp sgt i32 %m1n, %z %m2 = select i1 %c2, i32 %m1n, i32 %z %m2n = xor i32 %m2, -1 ret i32 %m2n } |