This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Use clamp modifier for [us]addsat/[us]subsat
ClosedPublic

Authored by arsenm on Jul 13 2020, 1:24 PM.

Details

Summary

We also have never handled this for SelectionDAG, which needs
additional work.

Diff Detail

Event Timeline

arsenm created this revision.Jul 13 2020, 1:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2020, 1:24 PM
foad added a comment.Jul 14 2020, 1:56 AM

There's an annoying asymmetry where we have signed saturating for
packed operations, but not for scalar 16-bit.

My Vega Instruction Set Architecture Reference Guide says:
"V_ADD_I16
D.i16 = S0.i16 + S1.i16.
Supports saturation (signed 16-bit integer domain)."

There's an annoying asymmetry where we have signed saturating for
packed operations, but not for scalar 16-bit.

My Vega Instruction Set Architecture Reference Guide says:
"V_ADD_I16
D.i16 = S0.i16 + S1.i16.
Supports saturation (signed 16-bit integer domain)."

Seems missing from the gfx8 and 10 manuals? I'm guessing it's really there though

foad added a comment.Jul 14 2020, 5:18 AM

There's an annoying asymmetry where we have signed saturating for
packed operations, but not for scalar 16-bit.

My Vega Instruction Set Architecture Reference Guide says:
"V_ADD_I16
D.i16 = S0.i16 + S1.i16.
Supports saturation (signed 16-bit integer domain)."

Seems missing from the gfx8 and 10 manuals? I'm guessing it's really there though

The same text is in the GFX10 "RDNA 1.0 Instruction Set Architecture Reference Guide" too, but with the instructions renamed to v_add_nc_i16 and v_add_nc_u16.

There's an annoying asymmetry where we have signed saturating for
packed operations, but not for scalar 16-bit.

My Vega Instruction Set Architecture Reference Guide says:
"V_ADD_I16
D.i16 = S0.i16 + S1.i16.
Supports saturation (signed 16-bit integer domain)."

Seems missing from the gfx8 and 10 manuals? I'm guessing it's really there though

I did find a note that it was added in gfx9, still not sure about 10

arsenm updated this revision to Diff 278041.Jul 14 2020, 6:45 PM
arsenm retitled this revision from AMDGPU/GlobalISel: Use clamp modifier for uaddsat/usubsat to AMDGPU/GlobalISel: Use clamp modifier for [us]addsat/[us]subsat.
arsenm edited the summary of this revision. (Show Details)

Use the gfx9 saturating i16/i32 for signed cases which apparently exist

arsenm updated this revision to Diff 278055.Jul 14 2020, 7:47 PM

Add comment for why DAG expansion sometimes get a better result

foad added inline comments.Jul 16 2020, 2:13 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
424

I don't think this change has any effect, does it? This whole hunk already makes loads of assumptions about which features were introduced at the same time as other features.

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
589

What does this name mean? My understanding is that gfx9 (a) added add/sub instructions with no carry-out and (b) added "i16" and "i32" add/sub instructions (which let you do signed saturation). But I'm not sure how to parse "NoCarryAddSubSat". Maybe at least add a comment?

arsenm updated this revision to Diff 278448.Jul 16 2020, 6:10 AM
arsenm marked 2 inline comments as done.
foad accepted this revision.Jul 16 2020, 8:12 AM

LGTM.

This revision is now accepted and ready to land.Jul 16 2020, 8:12 AM
arsenm closed this revision.Jul 28 2020, 8:18 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
424

If you were directly specifying subtarget target features, yes. If you fallback to the base path I would expect everything to work if not be ideal

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
589

I guess I can logically lump all of these with the no carry instructions

llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll