Page MenuHomePhabricator

[Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic
Needs ReviewPublic

Authored by leonardchan on Dec 14 2018, 2:19 PM.

Details

Summary

Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands.

This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.

Diff Detail

Event Timeline

leonardchan created this revision.Dec 14 2018, 2:19 PM

Nothing sticks out to me, so I think it looks good. Hard to tell if there are any sneaky edge cases in the lowering steps, though.

Maybe you should rebase this on top of the unsigned patch, since they're touching all the same places. Or are you waiting for it to land?

Nothing sticks out to me, so I think it looks good. Hard to tell if there are any sneaky edge cases in the lowering steps, though.

Maybe you should rebase this on top of the unsigned patch, since they're touching all the same places. Or are you waiting for it to land?

Yeah, I figure it would be better to submit these as separate patches since they're still technically independent of each other and ideally makes it easier for it to review.

@bjope @craig.topper @RKSimon Any comments on this patch?

RKSimon added inline comments.Jan 9 2019, 2:05 PM
llvm/test/CodeGen/X86/smul_fix_sat.ll
45

nounwind

leonardchan marked an inline comment as done.

Should https://reviews.llvm.org/D56987 be a parent for this? Then you'd need to rebase getExpandedFixedPointMultiplication since that has changed into converting into MUL when scale is zero (that is not valid for saturation).

Should https://reviews.llvm.org/D56987 be a parent for this? Then you'd need to rebase getExpandedFixedPointMultiplication since that has changed into converting into MUL when scale is zero (that is not valid for saturation).

Rebased

RKSimon added inline comments.Jan 30 2019, 1:08 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5534

(style) Do an early out to reduce indentation

if (!Saturating)
  return Result;
leonardchan marked an inline comment as done.

Updated and rebased

Please rebase after D55625 lands

Herald added a project: Restricted Project. · View Herald TranscriptFeb 1 2019, 5:42 AM

Updated and rebased

RKSimon added inline comments.Feb 4 2019, 10:39 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5479–5498

You've changed the logic to let non-vector cases to fall through, which leads to UNDEFs for scale == 0 cases.

leonardchan marked an inline comment as done.
bjope added inline comments.Feb 7 2019, 12:02 AM
llvm/lib/CodeGen/TargetLoweringBase.cpp
627

I'm not sure how to do this when overriding for a specific target. In our case we want it to be legal, but only when the scale is 15 (and VT is i16 or i24) or the scale is 31 (and VT is i32 or i40). Is there some easy solution for that? Setting it to legal/custom for any scale might be seen as an indication for optimizers that it is OK to introduce these operations for any scale.

This is however a general comment, also for the already pushed non-saturating versions. So it isn't anything that you need to deal with in this patch. But we might need a better solution in the long term.

ebevhan added inline comments.Feb 7 2019, 12:33 AM
llvm/lib/CodeGen/TargetLoweringBase.cpp
627

That's what the isSupportedFixedPointOperation hook in TargetLowering is for.

bjope added inline comments.Feb 7 2019, 12:40 AM
llvm/test/CodeGen/X86/smul_fix_sat.ll
3

The expansion is quite complicated (the splitting in four parts and detecting overflow etc).
Isn't there a risk that X86 is a typical target that will try to find more optimal solutions and maybe also make SMULFIXSAT legal? Then this test case might not really verify the expand code any longer?

On the other hand, these test cases are just jibberish to me anyway. I can't tell from looking at the checks that DAGTypeLegalizer::ExpandIntRes_MULFIX is doing the right thing. And it would not really help if using another target. Are there perhaps other ways to test DAGTypeLegalizer, such as unit tests?

One thing that probably can be done quite easily is to a bunch of tests using constant operands. Verifying that DAGCombiner will constant fold to the expected result after having expanded into legal operations (somehow making sure that DAGCombiner do not constant fold the SMULFIXSAT before it has been expanded, I guess someone will add such DAGCombines sooner or later). That way you might be able to get coverage for all paths through DAGTypeLegalizer::ExpandIntRes_MULFIX. Maybe this test should be in a separate test file.

bjope added inline comments.Feb 7 2019, 1:13 AM
llvm/lib/CodeGen/TargetLoweringBase.cpp
627

Ah, yes! And that is updated in this patch. Just me being blind (in combination with a some amnesia).

RKSimon added inline comments.Feb 7 2019, 4:54 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5467

Why drop the assert? Why not just add ISD::SMULFIXSAT tests?

5480–5498

I think you need something like this here (please double check my logic):

if (VT.isVector() && !isOperationLegalOrCustom(ISD::SMULO, VT) &&
    !(!Saturating && isOperationLegalOrCustom(ISD::MUL, VT)))
  return SDValue(); // unroll

And that will let you avoid the return SDValue() below by always defaulting to a scalar ISD::MUL/ISD::SMULO that legalization can handle.

RKSimon added inline comments.Feb 8 2019, 11:05 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5480–5498

I've committed rL353546 which /should/ mean that the scale==0 case is now safe to drop through.

leonardchan marked 6 inline comments as done.

Fixed mistake related to saturation during expansion. When we promote the operand and result type widths, this also changes the saturation width and affects the min/max values we compare against. This is easily solved by also shifting one of the operands after extension.

Oh I forgot to submit these inline comments

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5480–5498

I thought we still want to allow vectors to pass to calls to MUL and SMULO? Wouldn't this scalarize when we disallow vectors even if MUL and SMULO are legeal?

llvm/test/CodeGen/X86/smul_fix_sat.ll
3

Yeah I can see how these tests are hard to read. I wasn't aware of other ways this could be tested other than making sure the codegen is the same each time. I have my own scripts with different cases to verify the output is correct, but wasn't sure of any existing widely used method of "taking my IR, running it, and verify the results".

Testing with constant operands seems to produce better looking tests for non-saturating multiplication:

define i4 @func() {
; X64-LABEL: func:
; X64:       # %bb.0:
; X64-NEXT:    movb $3, %al 
; X64-NEXT:    retq
  %tmp = call i4 @llvm.smul.fix.i4( i4 3, i4 2 , i32 1)
  ret i4 %tmp
}

where we can immediately tell the result is 3, but there's still branching in the saturating case:

define i4 @func2() {
; X64-LABEL: func2:
; X64:       # %bb.0:
; X64-NEXT:    xorl %eax, %eax
; X64-NEXT:    testb %al, %al 
; X64-NEXT:    movb $127, %cl 
; X64-NEXT:    jg .LBB1_2
; X64-NEXT:  # %bb.1:
; X64-NEXT:    movb $3, %cl 
; X64-NEXT:  .LBB1_2:
; X64-NEXT:    movb $-1, %al 
; X64-NEXT:    negb %al 
; X64-NEXT:    movb $-128, %al 
; X64-NEXT:    jl .LBB1_4
; X64-NEXT:  # %bb.3:
; X64-NEXT:    movl %ecx, %eax
; X64-NEXT:  .LBB1_4:
; X64-NEXT:    retq
  %tmp = call i4 @llvm.smul.fix.sat.i4( i4 3, i4 2 , i32 1)
  ret i4 %tmp
}

so we can't get something as straightforward as with non-saturating.

leonardchan marked an inline comment as done.
leonardchan added inline comments.
llvm/test/CodeGen/X86/smul_fix_sat.ll
3

@bjope I added another test file that covers the saturation branches in ExpandIntRes_MULFIX using constant operands, although this doesn't seem to produce anything more readable than with variable operands.

bjope added inline comments.Fri, Feb 22, 4:27 PM
llvm/test/CodeGen/X86/smul_fix_sat.ll
3

Maybe it doesn't fold due to lack of constant folding for SMUL_LOHI (at least not for x86). What a pity.

I tried running the test using -mtriple=x86_64--, that at least produce code that is easier to map to the expansion.

I also tried some other targets:

-mtriple=ppc32  => looks like we get some constant folding here
-mtriple=ppc64 => asserts in llvm::SelectionDAG::transferDbgValues
-mtriple=hexagon => asserts in llvm::SelectionDAG::transferDbgValues
-mtriple=systemz => asserts in llvm::SelectionDAG::transferDbgValues
-mtriple=sparc => LLVM ERROR: Cannot select: t42: i32,i32 = addcarry t41:1, Constant:i32<0>, t90:1

(FWIW, no idea if the asserts and LLVM ERROR actually is related to your patch)

leonardchan marked an inline comment as done.

*ping*

llvm/test/CodeGen/X86/smul_fix_sat.ll
3

Updated the test to use -mtriple=x86_64-linuxand it looks a lot more readable.

bjope added inline comments.Thu, Mar 7, 8:10 AM
llvm/test/CodeGen/X86/smul_fix_sat.ll
3

Have you looked at the problem with asserts in llvm::SelectionDAG::transferDbgValues?
It happens when expanding smulfixsat, so something seems to be broken regarding the legalization (depending on target used).

leonardchan marked an inline comment as done.
leonardchan added inline comments.
llvm/test/CodeGen/X86/smul_fix_sat.ll
3
  • For addcarry, the problem seems to be that ISD::ADDCARRY is not supported on some 32 bit targets. The fix for this is just a check in expandMUL_LOHI to see if this operation is legal (https://reviews.llvm.org/D59119).
  • For llvm::SelectionDAG::transferDbgValues, this is because expandFixedPointMul returns an empty SDValue() to indicate this function failed due to some unsupported operation (most likely ISD::SMULO). I imagine the simplest solution for this is to just report_fatal_error since we do not have other operations we can use to perform saturation multiplication.