This is an archive of the discontinued LLVM Phabricator instance.

[Intrinsic] Signed Saturation Addition Intrinsic
ClosedPublic

Authored by leonardchan on Oct 9 2018, 4:30 PM.

Details

Summary

Add an intrinsic that takes 2 integers and perform saturation addition on them.

This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.

Diff Detail

Repository
rL LLVM

Event Timeline

leonardchan created this revision.Oct 9 2018, 4:30 PM
craig.topper added inline comments.Oct 9 2018, 5:32 PM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
540

This doesn't seem right. Let's say the input was an i8 SATSADD that should have returned [-128,127] with no wrap around. Let's say the type to promote to i16. This becomes two sign_extend_inreg from i8 to i16 and an an i16 SATSADD, but that will only saturate when the math overflows 16 bits. So it's not longer saturating to [-128,127]. You get can [-256,254] now.

lib/CodeGen/TargetLoweringBase.cpp
1874

What about vector types?

1886

getSetCCResultType takes the type of the input to the setcc and returns the type that should be used for its output. So MVT::i1 here should LHS.getValueType()

craig.topper retitled this revision from [Intrinsic] Signed Saturation Addition Intirnsic to [Intrinsic] Signed Saturation Addition Intrinsic.Oct 9 2018, 9:40 PM
ebevhan added inline comments.Oct 10 2018, 12:41 AM
include/llvm/CodeGen/ISDOpcodes.h
265

This is a nitpick but I think it would be better if it were named SADDSAT to match the naming of the other nodes (SADDO, SMULO for example).

include/llvm/CodeGen/TargetLowering.h
824

The thing with saturating addition (and subtraction) is that the saturation bitwidth is determined by the width of the type itself anyway, so I don't think there's a need for a hook for it.

You only need a hook if the legality needs to be predicated on something other than the result and operand type.

include/llvm/IR/Intrinsics.td
708

Same naming scheme point here, to match the overflow intrinsics it should probably be int_sadd_sat or similar.

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
540

I think a more correct result could be obtained for iN to iM by doing ANY_EXTEND to iM->LSHR by M-N->SATSADD->ASHR M-N. I'm not sure if this is the best way, though.

lib/CodeGen/TargetLoweringBase.cpp
1883

Would there be a benefit to using SADDO here if it's legal? Minimum saturation occurs if Overflow && Sum > 0 and maximum saturation if Overflow && Sum < 0.

It really depends on whether or not the target can chain the operations effectively, but I suspect that most targets won't be able to keep the overflow flag alive without materializing it.

craig.topper added inline comments.Oct 10 2018, 12:43 AM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
540

Should LSHR be SHL?

ebevhan added inline comments.Oct 10 2018, 12:46 AM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
540

Oh, whoops. Yes, it should be SHL.

leonardchan marked 9 inline comments as done.
leonardchan added inline comments.
lib/CodeGen/TargetLoweringBase.cpp
1883

I'm running into legality issues when type expansion is involved. I was able to replace this block to use SADDO but ran into an error saying the SADDO node was not expanded when legalizing types. I'm not sure if I should call another function to "force check" expansion on my newly created SADDO node, but I couldn't find any other examples that show usage of SADDO during expansion.

Alternatively, I could move the code for expanding SADDO into its own method that I could call from here.

bjope added a subscriber: bjope.Oct 11 2018, 12:22 AM
ebevhan added inline comments.Oct 11 2018, 12:54 AM
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
550

If I understand correctly, Op1Promoted and Op2Promoted are already any-extended to the wider type: "The low bits of the promoted value corresponding to the original type are exactly equal to Op." and "The extra bits contain rubbish". So there shouldn't be a need for an explicit ANY_EXTEND after.

2981

Seems like this snuck in.

lib/CodeGen/SelectionDAG/LegalizeTypes.h
418

Should be named SADDSAT.

lib/CodeGen/TargetLoweringBase.cpp
1883

Odd... It looks to me like expanding SADDO during type legalization should work fine. I was really just curious to see if it produced better code for the legal type case than expanding outright. Maybe someone else knows more about why it doesn't want to work.

If it's just a problem during type expansion you could always check if the type is legal, and if so, use SADDO. But supporting both SADDO and the full expansion seems like overkill.

leonardchan marked 6 inline comments as done.
leonardchan added inline comments.
lib/CodeGen/TargetLoweringBase.cpp
1883

I don't know what happened or what got fixed, but I'm able to use SADDO now for some reason :p. And the generated code is shorter now in the tests.

Looks good to me!

Though, I just realized that I think you need to add the new node to TargetSelectionDAG.td. Unsure if it needs to be added somewhere else as well.

bjope added inline comments.Oct 12 2018, 12:36 AM
include/llvm/IR/Intrinsics.td
707

This is also Commutative afaict.

Are we not supporting vectors at all? That needs to be checked in the IR verifier if so. Though I would really like to see this for vectors since X86 has 8 and 16 bit saturating vector addition.

lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
554

What about vector types?

bjope added inline comments.Oct 12 2018, 1:11 AM
lib/CodeGen/TargetLoweringBase.cpp
1890

Maybe I miss something, but shouldn't it be SETGE here?
Assume that we have and SADDSAT(SignedMin, SignedMin). Then we overflow, and the result it zero. But we should still saturate to SignedMin.

leonardchan marked 4 inline comments as done.

Are we not supporting vectors at all? That needs to be checked in the IR verifier if so. Though I would really like to see this for vectors since X86 has 8 and 16 bit saturating vector addition.

We can add them. Not sure if what I added is the correct way they should be expanded though. I might just not be looking hard enough, but I couldn't find other examples of intrinsics that take int vector arguments and expand them.

lib/CodeGen/TargetLoweringBase.cpp
1890

You're right. It should be.

Intrinsics with vector operands are expanded in LegalizeVectorOps.cpp. Sometimes with a vectorized sequence or in the worst case by unrolling.

Intrinsics with vector operands are expanded in LegalizeVectorOps.cpp. Sometimes with a vectorized sequence or in the worst case by unrolling.

Added the appropriate case in LegalizeVectorOps. This should eventually allow the node to be unrolled in Expand, which I think does the same thing I had before in a previous diff of this patch.

craig.topper added inline comments.Oct 12 2018, 11:29 PM
lib/CodeGen/TargetLoweringBase.cpp
1895

What if we did the selects like this? Then we don't need any ANDs and we only need one setcc.

overflow ? (sum < 0 ? satmax : satmin) : sum

craig.topper added inline comments.Oct 12 2018, 11:32 PM
lib/CodeGen/TargetLoweringBase.cpp
1862

This should be in TargetLowering.cpp with expandMUL_LOHI, expandMUL, etc. Not TargetLoweringBase.

craig.topper added inline comments.Oct 12 2018, 11:50 PM
test/CodeGen/X86/sadd_sat.ll
2 ↗(On Diff #169506)

Use -mtriple=i686-- not -march. And generate the checks using utils/update_llc_test_checks.py

craig.topper added inline comments.Oct 12 2018, 11:58 PM
test/CodeGen/X86/sadd_sat.ll
2 ↗(On Diff #169506)

It might also be better to add -mattr=cmov on the 32-bit test

leonardchan marked 5 inline comments as done.
leonardchan added inline comments.
test/CodeGen/X86/sadd_sat.ll
2 ↗(On Diff #169506)

This is a very nifty script!

craig.topper accepted this revision.Oct 15 2018, 9:56 PM
This revision is now accepted and ready to land.Oct 15 2018, 9:56 PM
This revision was automatically updated to reflect the committed changes.

When can we expect a signed subtraction intrinsic?

When can we expect a signed subtraction intrinsic?

Was going to work on that within the next few days. Right now I'm working on finishing up for a clang patch that uses these add intrinsics for fixed point addition.

I don't see this or its unsigned counterpart added to the LangRef; is that intentional, or an oversight?

I forgot to add it but will add it on my to-do list.