Page MenuHomePhabricator

[GlobalISel][AMDGPU] Legalize saturating add/subtract
Needs ReviewPublic

Authored by foad on Jan 20 2020, 8:36 AM.

Details

Summary

Add support in LegalizerHelper for lowering G_SADDSAT etc. either using
add/subtract-with-overflow or using max/min instructions.

Enable this lowering for AMDGPU so it can be tested.

Diff Detail

Event Timeline

foad created this revision.Jan 20 2020, 8:36 AM

Unit tests: pass. 62011 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

foad updated this revision to Diff 239151.Jan 20 2020, 8:58 AM

Add legalize tests for signed 16-bit operations.

Unit tests: pass. 62011 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

arsenm added inline comments.Jan 20 2020, 1:04 PM
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
3907–3910

I think this won't produce what we want with VOP3P instructions. For v3+s16, we would want G_UMIN to fewerElementsVector to v2s16, which won't be any of these. We would get vector min after lowering

4972

You can do .getReg(0) instead of ->getOperand(0).getReg()

4985–4986

Just .getReg(0)/1 works

4987–4988

Just passing -1/0 to buildConstant should work

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
357

Bot is wrong here

llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-sat.ll
3 ↗(On Diff #239151)

Should probably just use update_mir_test_checks on this

llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sat.mir
3

This should get an explicit -mcpu. Maybe a runline for with/without i16 instructions

255

Should add at least a v2s16, v3s16, v4s16, v2i32, and s64 test

arsenm added inline comments.Jan 20 2020, 1:06 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sat.mir
3

Really need at least a VI and gfx9+

arsenm added inline comments.Jan 20 2020, 1:30 PM
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
4968

I think it would also be beneficial to implement this in terms of two helper functions for the two different expansions. That way it's more flexible if a target wants to contextually use different strategies in a custom lowering

foad updated this revision to Diff 239851.Jan 23 2020, 4:23 AM
foad marked 5 inline comments as done.

Rebase and address some review comments.

foad marked 4 inline comments as done.Jan 23 2020, 4:35 AM
foad added inline comments.
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
3907–3910

Right, this requires more thought. I was hoping to reuse the exisiting isSupported logic without thinking too hard about it.

4968

Are there precedents for that kind of helper function that the target calls directly, instead of via the standard lower() action?

llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sat.mir
3

What sort of run line do I need to disable i16 instructions?

255

That would require G_SADDO (et al) to be legalizable for all those types, which it currently is not. Do I need to tackle that first, before proceeding with this patch?

Unit tests: fail. 62131 tests passed, 1 failed and 808 were skipped.

failed: libc++.std/thread/thread_mutex/thread_mutex_requirements/thread_mutex_requirements_mutex/thread_mutex_class/try_lock.pass.cpp

clang-tidy: fail. clang-tidy found 0 errors and 4 warnings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

arsenm added inline comments.Jan 23 2020, 4:55 AM
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
4968

We use Helper.lowerFMinNumMaxNum directly in AMDGPULegalizerInfo.

I'm specifically thinking we want to switch which expansion is used on AMDGPU based on whether it's scalar or vector, so we would make these legal and then use the helpers in applyMappingImpl

llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sat.mir
3

Use a target without one. I usually use a Tahiti, Fiji, and gfx900 run line

255

You can add the tests and use -global-isel-abort=0, and they just won't legalize and defer it to a later patch. Handling the vector cases at least should be as easy as adding the instructions to the fewerElementsVector switch

When you commit can you split this into one patch to add the opcode/irtranslator support and another for the legalizer changes.

Can't we also set op_sel on v_add_i32 to get saturating behavior?

foad updated this revision to Diff 251997.Mon, Mar 23, 5:03 AM

Split opcode/irtranslator stuff out into D76600.

Use separate helper functions for expanding in terms of max/min or in terms of addo/subo.

foad retitled this revision from [GlobalISel][AMDGPU] Saturating add/subtract to [GlobalISel][AMDGPU] Legalize saturating add/subtract.Mon, Mar 23, 5:04 AM
foad edited the summary of this revision. (Show Details)
foad edited the summary of this revision. (Show Details)

Can you also add some end to end IR tests make sure to stress SALU vs. VALU for a mix of scalar and vector types

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
2623–2624

By scalar vs. vector, I meant SGPR vs. VGPR. I haven't looked at whether this registers a real improvement here