This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Add legalization for Integer Saturation Intrinsics.
Needs ReviewPublic

Authored by ab on Jan 14 2015, 2:04 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

I don't think there's a lot of room for variance in the implementation, but I'd like to have opinions on one particular point: the scalar expansion.

The expanded code for the intrinsics is the equivalent 2icmp+2select, roughly corresponding to:

(x > max ? max : (x < min ? min : x))

that is:

%0 = icmp slt i32 %x, %min
%1 = select i1 %0, i32 %min, i32 %x
%2 = icmp sgt i32 %1, %max
%3 = select i1 %2, i32 %max, i32 %1

With, for signed n-bit saturation:

min: -(2^(n-1))
max: 2^(n-1)-1

However, an alternative expansion could be to not reuse the intermediate result, i.e.:

%0 = icmp slt i32 %x, %min
%1 = select i1 %0, i32 %min, i32 %x
%2 = icmp sgt i32 %x, %max                  ; <--- %x instead of %1
%3 = select i1 %2, i32 %max, i32 %1

I'm not entirely convinced which is better. In practice, the latter leads to increased register pressure, and actually resulted in some regressions, just for going through the intrinsic and back to the expansion.
The current patch uses the first one (with the dependency).

Anyway, this isn't a blocking problem for this patch, since the intrinsics aren't generated. We'll see about introducing more machinery (say a target hook to only generate the intrinsics when profitable, a register-pressure-aware legalization, or a later relaxation, etc..) if it's deemed necessary.

Based on D6976.

Diff Detail

Event Timeline

ab updated this revision to Diff 18183.Jan 14 2015, 2:04 PM
ab retitled this revision from to [CodeGen] Add legalization for Integer Saturation Intrinsics..
ab updated this object.
ab edited the test plan for this revision. (Show Details)
ab added a subscriber: Unknown Object (MLST).
tkn added a subscriber: tkn.Nov 17 2016, 7:08 PM