Page MenuHomePhabricator

[Intrinsic] Unsigned Fixed Point Saturation Multiplication Intrinsic

Authored by bjope on Feb 6 2019, 12:26 PM.



Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands.

This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.

Diff Detail


Event Timeline

leonardchan created this revision.Feb 6 2019, 12:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2019, 12:26 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
Ka-Ka added a subscriber: Ka-Ka.May 20 2019, 2:02 PM

Rebase and update now that D66052 has landed?

bjope commandeered this revision.Thu, Aug 22, 12:48 PM
bjope edited reviewers, added: leonardchan; removed: bjope.

I'll try to finish the work on this patch (hence the commandeer action). I hope @leonardchan won't mind that.

Rebase turned out not to be trivial, so I'll look into that.

bjope planned changes to this revision.Thu, Aug 22, 12:50 PM

Working on a rebase to trunk.

I'll try to finish the work on this patch (hence the commandeer action). I hope @leonardchan won't mind that.

Woops. Yeah sorry I've gotten sidetracked other projects but I don't mind being the reviewer for this. Thanks for commandeering. Didn't even know you could do that on phabricator.

bjope added inline comments.Sat, Aug 31, 2:01 AM
13321 ↗(On Diff #185613)

Should say: "; %res = 4 (or 5)"

13323 ↗(On Diff #185613)

Neither of these examples will overflow as currently written so they do not really describe the effect of saturation (and the results shown are incorrect afaict).

I'm planning to change this into:

; Saturation
%res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
%res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)
2750 ↗(On Diff #185613)

I don't get this. Afaict SatMin isn't used for unsigned.
Anyway, I'm currently thinking about refactoring this part of the code, in a separate patch, before adding the umul.fix.sat support.

2780 ↗(On Diff #185613)

Afaict this should be the inverse. We overflow if HH != 0, and then we need to clamp. Now we clamp to maximum value when HH == 0.

bjope updated this revision to Diff 218208.Sat, Aug 31, 2:56 AM

Commandeered (from leonardchan).
Now being based on monorepo.
Fixed bugs in LangRef.
Fixed bugs in ExpandIntRes_MULFIX (basically reimplemented saturation for UMULFIXSAT after breaking it out into a separate if-clause (no longer implementing shift by scale, signed saturation and unsigned saturation in the same piece of code).

bjope marked 4 inline comments as done.Sat, Aug 31, 2:59 AM
RKSimon added inline comments.
6724 ↗(On Diff #218208)

Can't we now get here if !Saturating && !isOperationLegalOrCustom(ISD::MUL, VT) ?

bjope added inline comments.Sun, Sep 1, 8:55 AM
6724 ↗(On Diff #218208)

Yes, that would be wrong! I will restore this to fall through to check for SMUL_LOHI, MULHS etc.

Another idea is to disallow scale==0 for SMULFIX and UMULFIX. SelectionDAGBuilder could just select an ordinary MUL instead for smul.fix when scale is zero.
Similarly umul.fix/umul.fix.sat with scale==VTSize could be selected as MULHU instead of UMULFIX and UMULFIXSAT (getting rid of some more special cases).

6810 ↗(On Diff #218208)

This does not work for Scale==0 (and we might pass through the special handling for Scale==0 above. The same problem exists in the old code.
I'll need to look into this a bit more.

bjope planned changes to this revision.Sun, Sep 1, 9:04 AM

Some fixes are planned:

  • Fix problem with scale==0 spotted in TargetLowering::expandFixedPointMul
  • Add fixes that has been added for smul.fix.sat since this patch originally was put up for review (isTriviallyVectorizable, VectorLegalizer::Expand, DAGTypeLegalizer::WidenVectorResult, DAGCompielr::visitMULFIX).
bjope updated this revision to Diff 218331.Mon, Sep 2, 4:41 AM

Fixes to get umul.fix.sat on par with smul.fix.sat. Basically we now handle umul.fix.sat wherever we handled smul.fix.sat (and UMULFIXSAT wherever we handle SMULFIXSAT), except for ConstantFolding.

bjope marked 2 inline comments as done.Wed, Sep 4, 10:51 PM
bjope added inline comments.
6810 ↗(On Diff #218208)
bjope added a comment.Thu, Sep 5, 11:49 PM

Friendly ping.
(totally understand if tihs hasn't been on top of prio list for reviewers, and that it might be a lot to digest)

As I wrote earlier, right now this should be "on par" with smul.fix.sat when it comes to promotion/widening/legalization/scalarization etc. Also matching the amount of in-tree testing that we got for smul.fix.sat.
No outstanding comments that has not been resolved (afaict after having commandeered this revision).

rebase after D67071 has landed? (and add equivalent scale == 0 tests)

bjope added a comment.Fri, Sep 6, 6:59 AM

rebase after D67071 has landed? (and add equivalent scale == 0 tests)

Right now D67071 is based on this one. So it would be simpler to land this first.
And this patch also already includes the equivalent scale==0 zero tests: llvm/test/CodeGen/PowerPC/umulfixsat.ll

leonardchan accepted this revision.Fri, Sep 6, 11:05 AM

LGTM. Thanks for taking over this!

2984 ↗(On Diff #218331)

Nit: Add an else case with an llvm_unreachable mentioning that we returned earlier when Scale == VTSize and that the scale must be less than or equal to the width of the operands.

This revision is now accepted and ready to land.Fri, Sep 6, 11:05 AM
This revision was automatically updated to reflect the committed changes.