This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] X *fast (C ? 1.0 : 0.0) -> C ? X : 0.0
ClosedPublic

Authored by foad on Jul 15 2019, 2:23 AM.

Details

Summary

This is implemented with more general rules:

X op (C ? P : Q) -> C ? (X op P) : (X op Q)

// if X op P and X op Q both simplify

(C ? P : Q) op Y -> C ? (P op Y) : (Q op Y)

// if P op Y and Q op Y both simplify

The tests include some other cases where these more general rules apply.

Diff Detail

Event Timeline

foad created this revision.Jul 15 2019, 2:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2019, 2:23 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
foad marked an inline comment as done.Jul 15 2019, 2:28 AM
foad added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
776

Question to reviewers: would it be better to enhance SimplifyQuery / InstrInfoQuery to know about fast math flags, so that I don't have to pass FMF around explicitly?

I think we need a few more tests here

a) vector tests
b) 'extra use' tests
c) negative tests

xbolva00 added a comment.EditedJul 15 2019, 2:41 AM

Not sure how other reviewers, but I consider the use of 'C' as confusing here, since 'C' is usually a constant. I would recommend to avoid 'C', if it is not a constant.

Edit: As I see, original code also used 'C', so feel free to ignore me :)

foad added a comment.Jul 15 2019, 2:52 AM

Not sure how other reviewers, but I consider the use of 'C' as confusing here, since 'C' is usually a constant. I would recommend to avoid 'C', if it is not a constant.

Edit: As I see, original code also used 'C', so feel free to ignore me :)

Right, and since the code uses all of A B C D E F, I think it's reasonably clear that none of them are meant to stand for anything in particular.

foad updated this revision to Diff 209784.Jul 15 2019, 2:58 AM

Add a vector test.

foad added a comment.Jul 15 2019, 3:01 AM

I think we need a few more tests here

a) vector tests
b) 'extra use' tests
c) negative tests

a) I've added one.
b) The patch doesn't introduce any new hasOneUse() tests. Do you think it should?
c) There are infinitely many expressions not optimized by this pattern :) Can you give an example of the kind of thing you're thinking of?

xbolva00 added a comment.EditedJul 15 2019, 3:20 AM

b) yes, one more test never hurts
c) one test per.. e.g.; broken precondition, missing 'fast' ?

foad added a comment.Jul 15 2019, 3:23 AM

b) yes, one more test never hurts

I meant:
b) The patch doesn't introduce any more uses of hasOneUse() in the source code. Do you think it should?

xbolva00 added inline comments.Jul 15 2019, 3:25 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
777

Can you simplify lines 782-808? Almost same code in both branches..

foad updated this revision to Diff 209787.Jul 15 2019, 3:35 AM

Add a negative test for fmul without fast math flags.

foad marked an inline comment as done.Jul 15 2019, 3:42 AM
foad added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
777

I'd prefer to keep the SimplifyFPBinOp/CreateSelect calls in separate branches, otherwise I think the code will get much more confusing to read.

As for the boilerplate code to do with fast math flags: I could either pull it all out to the top level of the function (so it would run unconditionally, even if we're not able to optimize anything); or I could try teaching SimplifyQuery about fast math flags (as mentioned above), which would simplify it a little. Do you have any preference?

foad updated this revision to Diff 209793.Jul 15 2019, 4:12 AM

Rewrite to remove duplicated code.

foad marked an inline comment as done.Jul 15 2019, 4:12 AM
xbolva00 added inline comments.Jul 15 2019, 4:20 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
777

As for the boilerplate code to do with fast math flags:

Maybe use lambda?

xbolva00 added inline comments.Jul 15 2019, 4:22 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
799

Nice :)

Looks ok to me, but I would like some of the others to chime in first.

foad marked an inline comment as done.Jul 16 2019, 1:25 AM
foad added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
793

Question to reviewers: should I make this condition LHSIsSelect && LHS->hasOneUse() (and the same for RHS on line 792)? I am tempted to do so just to be conservative, to make sure that this instcombine actually removes all uses of the original select.

lebedev.ri resigned from this revision.Jul 16 2019, 4:40 AM

(i'm not touching fp folds)

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
793

Presumably you'd want to add all the necessary variation of tests with extra uses, and see which ones need
restrictions to one-use cases.

spatel added inline comments.Jul 17 2019, 12:04 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
784–785

I know this is an existing function name, but it doesn't match my expectation based on the name. Change it to simplifyAnyBinOp() or simplifyIntOrFPBinOp() as a preliminary step?

llvm/test/Transforms/InstCombine/fmul.ll
1002

'fast' is over-specifying the constraints here (and probably all of these tests).
fmul by 1.0 doesn't need any flags; fmul by 0.0 only needs nsz and nnan.

1013

Vary the opcodes (fadd, fsub, fdiv) in these tests for better coverage?

llvm/test/Transforms/InstCombine/mul.ll
522–530

Is this test changed by this patch? If not, it shouldn't be here.

foad marked 4 inline comments as done.Jul 17 2019, 12:37 PM
foad added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
784–785

I don't think your suggestions are any better than the status quo; how is anyone supposed to know or remember the difference between simplifyBinOp and simplifyAnyBinOp? Instead, how about I overload simplifyBinOp to take FMF as an optional (last) argument?

llvm/test/Transforms/InstCombine/fmul.ll
1002

True. I could change it to nsz nnan if you think that's better. (Incidentally I think nsz ninf should also be sufficient in theory, but I won't pursue that in this patch.)

1013

fmul is fairly unique in having useful simplifications for two different constant operands, 0.0 and 1.0, but I'll see what I can do.

llvm/test/Transforms/InstCombine/mul.ll
522–530

Yes. Without this patch, this test does not get simplified at all.

spatel added inline comments.Jul 17 2019, 1:27 PM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
784–785

Ok, that's less misleading than what we have now. You can make that change without review, then rebase this patch.

llvm/test/Transforms/InstCombine/mul.ll
522–530

Ok, then please commit this test with baseline CHECK lines as another preliminary NFC step ahead of this patch.

foad marked an inline comment as done.Jul 17 2019, 10:44 PM
foad added inline comments.
llvm/test/Transforms/InstCombine/mul.ll
522–530

I don't really understand why you're suggesting to do that with this test case, but not with the other tests I added. There really is not much difference between them.

Perhaps it would be clearer if I split my original patch into two:

  1. Pass fast math flags in to the calls to simplifyBinOp.
  2. Add the X op (C ? P : Q) -> C ? (X op P) : (X op Q) combines that are really the main point of the patch.
foad marked an inline comment as done.Jul 17 2019, 11:07 PM
foad added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
784–785

See D64902. I know you said I could do it without review, but I prefer to give others a chance to comment before committing.

spatel added inline comments.Jul 24 2019, 5:17 AM
llvm/test/Transforms/InstCombine/mul.ll
522–530

Yes, please add all of the baseline tests for each part of this patch and fork this into the minimal code change patches, so we can see each piece independently.

Do we miss int version of this? @spatel @lebedev.ri

int foo(int x, int y, bool b) {

return x / (b  ? 1 : y);

}

We have sdiv x, (select b 1 y), x86-64:
cmov...
idiv...

Do we miss int version of this? @spatel @lebedev.ri

int foo(int x, int y, bool b) {
    return x / (b  ? 1 : y);
}

We have sdiv x, (select b 1 y), x86-64:
cmov...
idiv...

I'd agree that having select %b (div %x, %y), %x would be better but that does not look like a valid transform:
https://godbolt.org/z/4Fs_dB
https://rise4fun.com/Alive/xplW

Optimization: D64713-int

ERROR: Domain of definedness of Target is smaller than Source's for i32 %r

Example:
%b i1 = 0x1 (1, -1)
%y i32 = poison
%x i32 = poison
%cond i32 = 0x00000001 (1)
%t0 i32 = UB
Source value: poison
Target value: UB
$ /repositories/alive2/build-Clang-unknown/alive -root-only /tmp/test.ll 
Processing /tmp/test.ll..

----------------------------------------
Name: D64713-int
  %cond = select i1 %b, i8 1, i8 %y
  %r = sdiv i8 %x, %cond
  ret i8 %r
=>
  %cond = select i1 %b, i8 1, i8 %y
  %t0 = sdiv i8 %x, %y
  %r = select i1 %b, i8 %x, i8 %t0
  ret i8 %r

ERROR: Source is more defined than target

Example:
i1 %b = #x1 (1)
i8 %y = poison
i8 %x = poison

Source:
i8 %cond = #x01 (1)
i8 %r = poison

Target:
i8 %cond = #x01 (1)
i8 %t0 = poison
i8 %r = poison

ah, poison :/

I checked gcc codegen and gcc can fold it..

foad marked an inline comment as done.Aug 2 2019, 7:44 AM
foad added inline comments.
llvm/test/Transforms/InstCombine/mul.ll
522–530

OK, part of this is now split out into D65658.

foad updated this revision to Diff 225888.Oct 21 2019, 8:04 AM

Rebase on top of pre-committed test cases.

foad added a comment.Nov 8 2019, 6:40 AM

Ping! I think I've addressed all the earlier comments.

spatel added inline comments.Nov 8 2019, 10:57 AM
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
794–804

Is it correct that everything outside of lines 791-801 is functionally identical to before? If so, please pre-commit that part as NFC.

foad updated this revision to Diff 228563.Nov 9 2019, 1:37 AM

Rebase after committing NFC parts.

foad marked an inline comment as done.Nov 9 2019, 1:38 AM
foad added inline comments.
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
794–804

Yes, done: d162e02cee74

spatel accepted this revision.Nov 10 2019, 6:29 AM

LGTM - the title of the patch is too specific at this point; more like:
[InstCombine] hoist binop that simplifies into arms of select
?

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
794–804

Thanks - that makes it clear how the new transforms fit in.

This revision is now accepted and ready to land.Nov 10 2019, 6:29 AM
This revision was automatically updated to reflect the committed changes.

Do we miss int version of this? @spatel @lebedev.ri

int foo(int x, int y, bool b) {
    return x / (b  ? 1 : y);
}

We have sdiv x, (select b 1 y), x86-64:
cmov...
idiv...

I'd agree that having select %b (div %x, %y), %x would be better but that does not look like a valid transform:
https://godbolt.org/z/4Fs_dB
https://rise4fun.com/Alive/xplW

Optimization: D64713-int

ERROR: Domain of definedness of Target is smaller than Source's for i32 %r

Example:
%b i1 = 0x1 (1, -1)
%y i32 = poison
%x i32 = poison
%cond i32 = 0x00000001 (1)
%t0 i32 = UB
Source value: poison
Target value: UB
$ /repositories/alive2/build-Clang-unknown/alive -root-only /tmp/test.ll 
Processing /tmp/test.ll..

----------------------------------------
Name: D64713-int
  %cond = select i1 %b, i8 1, i8 %y
  %r = sdiv i8 %x, %cond
  ret i8 %r
=>
  %cond = select i1 %b, i8 1, i8 %y
  %t0 = sdiv i8 %x, %y
  %r = select i1 %b, i8 %x, i8 %t0
  ret i8 %r

ERROR: Source is more defined than target

Example:
i1 %b = #x1 (1)
i8 %y = poison
i8 %x = poison

Source:
i8 %cond = #x01 (1)
i8 %r = poison

Target:
i8 %cond = #x01 (1)
i8 %t0 = poison
i8 %r = poison

Can this be solved using freeze?