This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y)
ClosedPublic

Authored by shchenz on Apr 7 2019, 9:45 PM.

Details

Summary
int foo(int i, int j)
{
  int res = -( i / j);
  int res2 = -i / j;
  return res + res2;
}

Currently, we get:

define dso_local i32 @foo(i32, i32) local_unnamed_addr #0 {
  %3 = sdiv i32 %0, %1
  %4 = sub nsw i32 0, %0
  %5 = sdiv i32 %4, %1
  %6 = sub i32 %5, %3
  ret i32 %6
}

With this fold we will get

define dso_local i32 @foo(i32, i32) local_unnamed_addr #0 {
  %3 = sdiv i32 %0, %1
  %4 = sdiv i32 %0, %1
  %5 = sub nsw i32 0, %4
  %6 = sub i32 %5, %3
  ret i32 %6
}

Which will then get folded into

define dso_local i32 @foo(i32, i32) local_unnamed_addr #0 {
  %3 = sdiv i32 %0, %1
  %factor = mul i32 %3, -2
  ret i32 %factor
}

So we end with just one division:
https://godbolt.org/z/gSCKQ1

Diff Detail

Event Timeline

shchenz created this revision.Apr 7 2019, 9:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2019, 9:45 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
shchenz updated this revision to Diff 194094.EditedApr 7 2019, 11:07 PM

move exact flag fixup code to a seperated patch https://reviews.llvm.org/D60396

lebedev.ri requested changes to this revision.Apr 7 2019, 11:27 PM

Looks promising.

llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1047–1048

It's not commutative like that.

----------------------------------------
Optimization: -X / Y  to  -(X / Y)
Precondition: true
  %t0 = sub nsw i8 0, %x
  %r = sdiv i8 %t0, %y
=>
  %n0 = sdiv i8 %x, %y
  %r = sub i8 0, %n0

Done: 1
Optimization is correct!

----------------------------------------
Optimization: X / -Y --> -(X / Y)
Precondition: true
  %t0 = sub nsw i8 0, %y
  %r = sdiv i8 %x, %t0
=>
  %n0 = sdiv i8 %x, %y
  %r = sub i8 0, %n0


ERROR: Domain of definedness of Target is smaller than Source's for i8 %r

Example:
%y i8 = 0xFF (255, -1)
%x i8 = 0x80 (128, -128)
%t0 i8 = 0x01 (1)
%n0 i8 = 0x80 (128, -128)
Source value: 0x80 (128, -128)
Target value: undef

(You could use https://rise4fun.com/Alive, but it appears down for the moment?)

This revision now requires changes to proceed.Apr 7 2019, 11:27 PM
shchenz marked an inline comment as done.Apr 8 2019, 2:30 AM

Will fix it later. Thanks for your comments @lebedev.ri

llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
1047–1048

hmm, seems can not canonicalize X/-Y ---> -(X/Y).
I will only focus on -X / Y to -(X / Y) for this patch.

There should be same opportunity for (X/-Y) if Y is constant. Will let it to be improved in later patch.

shchenz updated this revision to Diff 194122.Apr 8 2019, 5:39 AM

address comments.

Now only canonicalize pattern -X/Y ---> -(X/Y).

shchenz updated this revision to Diff 194126.Apr 8 2019, 6:07 AM
lebedev.ri retitled this revision from [InstCombine] canonicalize sdiv with NEG operand to [InstCombine] Canonicalize (-X s/ Y) to -(X s/ Y).Apr 8 2019, 7:25 AM
lebedev.ri edited the summary of this revision. (Show Details)
lebedev.ri set the repository for this revision to rL LLVM.
lebedev.ri edited the summary of this revision. (Show Details)

Almost there.
Since we never touch the denominator, i think we are always good re vectors.
Does this need rebasing now that the other patch has landed?

llvm/test/Transforms/InstCombine/sdiv-canonicalize.ll
7

Shouldn't this sub also be nsw?

----------------------------------------
Optimization: nsw preserved
Precondition: true
  %o0 = sub nsw i8 0, %x
  %r = sdiv i8 %o0, %y
=>
  %n0 = sdiv i8 %x, %y
  %r = sub nsw i8 0, %n0

Done: 1
Optimization is correct!

----------------------------------------
Optimization: exact preserved
Precondition: true
  %o0 = sub nsw i8 0, %x
  %r = sdiv exact i8 %o0, %y
=>
  %n0 = sdiv exact i8 %x, %y
  %r = sub i8 0, %n0

Done: 1
Optimization is correct!

----------------------------------------
Optimization: both preserved
Precondition: true
  %o0 = sub nsw i8 0, %x
  %r = sdiv exact i8 %o0, %y
=>
  %n0 = sdiv exact i8 %x, %y
  %r = sub nsw i8 0, %n0

Done: 1
Optimization is correct!
11

There is a test that will show that we propagate exact?

Given that this doesn't fall into endless loop, i guess we don't do reverse transform.
I'm just wondering, this is the direction we should be doing it? @spatel

spatel added a comment.Apr 8 2019, 2:38 PM

Given that this doesn't fall into endless loop, i guess we don't do reverse transform.
I'm just wondering, this is the direction we should be doing it? @spatel

We already have these for mul:

// -X * Y --> -(X * Y)
// X * -Y --> -(X * Y)

...so that provides some precedent for the direction to move the negation.

Is there a sibling fold for srem?

Given that this doesn't fall into endless loop, i guess we don't do reverse transform.
I'm just wondering, this is the direction we should be doing it? @spatel

We already have these for mul:

// -X * Y --> -(X * Y)
// X * -Y --> -(X * Y)

...so that provides some precedent for the direction to move the negation.

Okay, that confirmed my suspicions.

Is there a sibling fold for srem?

Yep!

----------------------------------------
Optimization: nsw preserved
Precondition: true
  %o0 = sub nsw i8 0, %x
  %r = srem i8 %o0, %y
=>
  %n0 = srem i8 %x, %y
  %r = sub nsw i8 0, %n0

Done: 1
Optimization is correct!
shchenz marked 2 inline comments as done.Apr 8 2019, 7:22 PM
shchenz added inline comments.
llvm/test/Transforms/InstCombine/sdiv-canonicalize.ll
7

I have one question here before I set sub with nsw.

  %o0 = sub nsw i8 0, %x
  %r = sdiv i8 %o0, %y
=>
  %n0 = sdiv i8 %x, %y
  %r = sub i8 0, %n0

Is this a valid transformation? I know we get a affirmative answer in https://rise4fun.com/Alive/9G4. But assume this input %x is -128, %y is -2, for the source %o0 is a position value so %r is a position value too. But for the target %n0 is 64, and %r is -64? So source %r and target %r is not equal? Can you please help to point out what's wrong here? Thanks @lebedev.ri @spatel

11

will add one.

lebedev.ri added inline comments.Apr 8 2019, 11:42 PM
llvm/test/Transforms/InstCombine/sdiv-canonicalize.ll
7

-128 is SINT_MIN for i8, therefore the sub nsw 0, -128 is UB since it overflows despite the nsw flag.
https://godbolt.org/z/sWekHw
Therefore any and every answer is correct.

shchenz updated this revision to Diff 194308.Apr 9 2019, 6:31 AM

address comments.

lebedev.ri accepted this revision.Apr 9 2019, 7:09 AM

address comments.

Ok, thank you, looks good now.

----------------------------------------
Optimization: nsw preserved
Precondition: true
  %o0 = sub nsw i8 0, %x
  %r = srem i8 %o0, %y
=>
  %n0 = srem i8 %x, %y
  %r = sub nsw i8 0, %n0

Done: 1
Optimization is correct!

@shchenz if you don't intend to immediately-ish submit a patch for that sibling pattern, could you please file a bug?

This revision is now accepted and ready to land.Apr 9 2019, 7:09 AM

address comments.

Ok, thank you, looks good now.

----------------------------------------
Optimization: nsw preserved
Precondition: true
  %o0 = sub nsw i8 0, %x
  %r = srem i8 %o0, %y
=>
  %n0 = srem i8 %x, %y
  %r = sub nsw i8 0, %n0

Done: 1
Optimization is correct!

@shchenz if you don't intend to immediately-ish submit a patch for that sibling pattern, could you please file a bug?

Sure, https://bugs.llvm.org/show_bug.cgi?id=41443 is filed.

address comments.

Ok, thank you, looks good now.

----------------------------------------
Optimization: nsw preserved
Precondition: true
  %o0 = sub nsw i8 0, %x
  %r = srem i8 %o0, %y
=>
  %n0 = srem i8 %x, %y
  %r = sub nsw i8 0, %n0

Done: 1
Optimization is correct!

@shchenz if you don't intend to immediately-ish submit a patch for that sibling pattern, could you please file a bug?

Sure, https://bugs.llvm.org/show_bug.cgi?id=41443 is filed.

Thank you.

This revision was automatically updated to reflect the committed changes.

@shchenz how did rL358017 end up committing this diff into completely different place in this function?

lebedev.ri reopened this revision.Apr 9 2019, 11:45 AM
This revision is now accepted and ready to land.Apr 9 2019, 11:45 AM
lebedev.ri requested changes to this revision.Apr 9 2019, 11:45 AM
This revision now requires changes to proceed.Apr 9 2019, 11:45 AM

@shchenz how did rL358017 end up committing this diff into completely different place in this function?

Sorry, my bad. My remote dev machine is very slow yesterday night so I did not do a make check-all test before I committed the code. And I didn't reliaze that git merges my code into a wrong place in file lib/Transforms/InstCombine/InstCombineMulDivRem.cpp which was also changed by other patch yesterday.

I will send out another patch later.

shchenz updated this revision to Diff 194445.Apr 9 2019, 10:52 PM

rebased.

Sorry for breaking down unit testing. Could you please help to do another review for this patch. Thanks a lot.

lebedev.ri accepted this revision.Apr 9 2019, 11:22 PM

LG, please do run at least the check-llvm before committing..

This revision is now accepted and ready to land.Apr 9 2019, 11:22 PM

LG, please do run at least the check-llvm before committing..

absolutely will. ^-^

This revision was automatically updated to reflect the committed changes.