This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] Fold min/max intrinsic based on icmp of operands
ClosedPublic

Authored by nikic on Aug 13 2020, 1:17 PM.

Details

Summary

This is a reboot of D84655, now performing the inner icmp simplification query without undef folds.

It should be possible to handle the current foldMinMaxSharedOp() fold based on this, by moving the logic into icmp of min/max instead, making it more general. We can't drop the folds for constant operands, because those also allow undef, which we exclude here.

Diff Detail

Event Timeline

nikic created this revision.Aug 13 2020, 1:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2020, 1:17 PM
nikic requested review of this revision.Aug 13 2020, 1:17 PM

I'm iffy about this. I was hoping that we could either remove the explicit folds or show more wins by using the general icmp simplification.
Based on the test diffs, we're only getting assume improvements, and that doesn't seem worth the effort of full icmp simplify.
It would be nice to show a more meaningful motivating case, but I doubt we will get that until we start canonicalizing to the intrinsics.
Thoughts?

I'm iffy about this. I was hoping that we could either remove the explicit folds or show more wins by using the general icmp simplification.
Based on the test diffs, we're only getting assume improvements, and that doesn't seem worth the effort of full icmp simplify.
It would be nice to show a more meaningful motivating case, but I doubt we will get that until we start canonicalizing to the intrinsics.
Thoughts?

I only used assume tests because that seemed like the most direct way to test the change. We also get all other folds that are implemented for icmp, e.g. we recognize that

%add = add nuw i8 %x, 1
%max = call i8 @llvm.umax.i8(i8 %x, i8 %add)
ret i8 %max

is the same as ret i8 %add. I'm not sure how valuable it would be to explicitly test these, as they are already covered by the icmp tests.

spatel accepted this revision.Aug 24 2020, 12:08 PM

I'm iffy about this. I was hoping that we could either remove the explicit folds or show more wins by using the general icmp simplification.
Based on the test diffs, we're only getting assume improvements, and that doesn't seem worth the effort of full icmp simplify.
It would be nice to show a more meaningful motivating case, but I doubt we will get that until we start canonicalizing to the intrinsics.
Thoughts?

I only used assume tests because that seemed like the most direct way to test the change. We also get all other folds that are implemented for icmp, e.g. we recognize that

%add = add nuw i8 %x, 1
%max = call i8 @llvm.umax.i8(i8 %x, i8 %add)
ret i8 %max

is the same as ret i8 %add. I'm not sure how valuable it would be to explicitly test these, as they are already covered by the icmp tests.

I agree it's not worth duplicating everything, but a bit of variation in the test coverage would make the context/motivation clearer when others look back at the code change. One more test? :)
We're never going to recover tiny add-on costs like this until we rewrite instsimplify/instcombine pattern matching completely (part of the switch to MLIR?).
I don't think we can know at this point if it's worthwhile, but I can't really object, so LGTM.

This revision is now accepted and ready to land.Aug 24 2020, 12:08 PM
This revision was landed with ongoing or failed builds.Aug 26 2020, 1:03 PM
This revision was automatically updated to reflect the committed changes.