This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Add new combine to sub folding
AbandonedPublic

Authored by cdawson on Apr 30 2019, 6:29 AM.

Details

Summary

(X | Y) - Y --> (X | Y) ^ Y
(Y | X) - Y --> (X | Y) ^ Y

I verified the correctness using Alive:
https://rise4fun.com/Alive/czes

This transform enables these further transforms that already exist in instcombine:
(X | Y) ^ Y --> X & ~Y
(Y | X) ^ Y --> X & ~Y

As a result, the full expected transform is:
(X | Y) - Y --> X & ~Y
(Y | X) - Y --> X & ~Y

I've added tests for cases where Y is constant and where Y is non-constant (with operands in either order).

In the constant case the optimisation is a clear win as we go from 2 instructions to 1 as we can pre-compute ~Y.

I checked that the combine still appears to be profitable when Y is non-constant, by compiling for x86_64 -mpcu=btver2 where I observed that we go from generating

	movl	%ecx, %eax
	orl	%edx, %eax
	subl	%edx, %eax

to

	andnl	%ecx, %edx, %eax

Diff Detail

Event Timeline

cdawson created this revision.Apr 30 2019, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2019, 6:29 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
cdawson retitled this revision from [InstCombine] Add new combine to sub folding. to [InstCombine] Add new combine to sub folding.Apr 30 2019, 6:29 AM

Were there tests with extra uses? If not, do add.

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1535–1536

I do not understand. You don't change the inner pattern at all.
Just do:

if (match(Op0, m_c_Or(m_Specific(Op1), m_Value())))
  return BinaryOperator::CreateXor(Op0, Op1);
cdawson added inline comments.Apr 30 2019, 7:45 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1535–1536

I totally agree with the change to the return statement and the m_c_Or check.

One point I'd like clarification on is the removal of the m_OneUse. I was including that since the transformation is to facilitate a further (X | Y) ^ Y --> X & ~Y which will only happen if there are no extra uses of (X | Y). If you think we should always transform (X | Y) - Y --> (X | Y) ^ Y I'm happy to remove the m_OneUse.

lebedev.ri added inline comments.Apr 30 2019, 8:47 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1535–1536

I was including that since the transformation is to facilitate a further (X | Y) ^ Y --> X & ~Y which will only happen if there are no extra uses of (X | Y).

Because it needs to produce 2 instructions: ~Y and then X & (~Y),
and instcombine is not allowed to increase instruction count,
so unless (X | Y) was one-use, and would go away,
that transform isn't allowed to happen.

So i guess the question is: ignoring any other transforms that may or may not happen afterwards,
do we have a preference between (X | Y) - Y and (X | Y) ^ Y?
I'd guess bitop is better regardless of any further optimizations,
because LHS is already a bitop.

But if we would prefer (X | Y) - Y (then we'd need an inverse transform),
then perhaps this should instead be

if (match(Op0, m_OneUse(m_c_Or(m_Specific(Op1), m_Value(X)))))
  return BinaryOperator::CreateAnd(X, Builder.CreateNot(Op1));

directly.

spatel added inline comments.Apr 30 2019, 9:08 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1535–1536

The xor is probably better for analysis (bit-tracking, demanded-bits, etc), so we want (sub->xor) IMO.

lebedev.ri added inline comments.Apr 30 2019, 9:14 AM
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1535–1536

That was my view, too.

nikic added a subscriber: nikic.Apr 30 2019, 9:29 AM
cdawson updated this revision to Diff 197534.May 1 2019, 6:07 AM

Addressed review comments

cdawson marked 5 inline comments as done.May 1 2019, 6:08 AM
lebedev.ri added inline comments.May 1 2019, 6:12 AM
llvm/test/Transforms/InstCombine/sub.ll
1270–1313

Please precommit all the tests.

Right, no commit rights i guess?
This looks good to me. I think this needs one more test.

llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
1531–1532

This needs a newline

llvm/test/Transforms/InstCombine/sub.ll
1287

This needs a negative test, with swapped order in sub.

spatel accepted this revision.May 1 2019, 12:52 PM

LGTM too with Roman's test suggestions in place.

Another couple of minor improvements for those:

  1. I prefer to give the tests slightly meaningful names and/or copy the code comment to help explain the transform.
  2. It's nice to have a test with vector type, so we know those work/won't regress too.
This revision is now accepted and ready to land.May 1 2019, 12:52 PM

Apologies for the spam.

Since the mentioned review I discovered that the non-constant case was already handled in visitSub but the constant case wasn't due to an earlier optimisation. I have spun off a new review for this: D61517

Was this committed already?

cdawson abandoned this revision.May 7 2019, 8:53 AM

This patch has been superseded by D61517