This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] Simplify saturating add/sub + icmp
ClosedPublic

Authored by nikic on Dec 15 2018, 2:56 AM.

Details

Summary

If a saturating add/sub has one constant operand, then we can determine the possible range of outputs it can produce, and simplify an icmp comparison based on that.

I'm implementing this in InstSimplify, with a similar approach to already existing code for binary operators. I previously started out by adding support for this to ConstantRange, which would be able to handle more general cases, but I couldn't figure out which pass would be responsible for the actual simplification.

Ref: https://github.com/rust-lang/rust/issues/44500

Diff Detail

Event Timeline

nikic created this revision.Dec 15 2018, 2:56 AM

The more general transforms (changing the icmp predicate and/or constant operand of the icmp) should be handled under InstCombiner::visitICmpInst().
This is a cheap analysis, so it's probably best to have this part in InstSimplify regardless of what we add to InstCombine.

The logic looks good, but see inline for an improvement related to canonical form.

lib/Analysis/InstructionSimplify.cpp
2668–2669

Assuming canonical IR isn't a valid assumption for InstSimplify because InstSimplify is used as an analysis independently of InstCombine. Even when InstSimplify is called from within InstCombine, it would theoretically be more efficient to handle non-canonical simplifications before doing other transforms in InstCombine.

Do we "internally canonicalize" a constant operand for uadd/sadd to operand 1 here in InstSimplify? If not, we might want to do that too. Also if the existing code for binops doesn't do that, it's probably worth a TODO comment.

In all cases, it's worth varying at least some of the regression tests to prove that we have those non-canonical patterns covered.

nikic updated this revision to Diff 178402.Dec 16 2018, 10:43 AM

Add handling for intrinsics in non-canonical form.

nikic marked 2 inline comments as done.Dec 16 2018, 10:52 AM
nikic added inline comments.
lib/Analysis/InstructionSimplify.cpp
2668–2669

I've implemented handling for the non-canonical forms. Regarding doing canonicalization in InstSimplify, I think it only does this if the simplify operation itself takes the instruction in unpacked form (such as SimplifyBinOp which has Opcode+LHS+RHS), not when working on explicit instructions. I'm assuming that swapping operands of actual instruction is not allowed inside InstSimplify.

spatel added inline comments.Dec 17 2018, 6:36 AM
lib/Analysis/InstructionSimplify.cpp
2681–2682

What happens if C is the signed min value here?

define i1 @ssub_icmp_op1_is_min_val(i8 %a) {
  %b = call i8 @llvm.ssub.sat.i8(i8 %a, i8 -128)
  %c = icmp sle i8 %b, -118
  ret i1 %c
}
nikic marked 2 inline comments as done.Dec 17 2018, 6:44 AM
nikic added inline comments.
lib/Analysis/InstructionSimplify.cpp
2681–2682

This will fold to ret i1 false. The computed range here is [0, SINT_MAX]. The last test is intended to check this case.

spatel accepted this revision.Dec 17 2018, 7:40 AM

LGTM

lib/Analysis/InstructionSimplify.cpp
2681–2682

Ah, sorry I missed that test.

It's not obvious to me that the result of:

%b = call i8 @llvm.ssub.sat.i8(i8 0, i8 -128)

is "127".

Is that worth noting here or in the LangRef?

This revision is now accepted and ready to land.Dec 17 2018, 7:40 AM
nikic marked an inline comment as done.Dec 17 2018, 8:40 AM
nikic added inline comments.
lib/Analysis/InstructionSimplify.cpp
2681–2682

If the semantics aren't clear, it would be good to clarify in LangRef. However, right now I'm not sure I understand where the ambiguity lies. What result would you expect for this operation?

spatel added inline comments.Dec 17 2018, 9:33 AM
lib/Analysis/InstructionSimplify.cpp
2681–2682

It's probably ok to disregard me on this; I don't have much experience actually using saturating math.
I was imagining that (0 ssub -128) could be interpreted as (0 sadd -(-128)) and the negation of the signed min val just returns the signed min val again...but that wouldn't be very useful, and if all hardware agrees that the result is calculated as shown here, then it's fine.

This revision was automatically updated to reflect the committed changes.